Using ArrayList and LinkedList Correctly: Performance Improvements

Source: Technical Resource Center. Posted by member: 新书城 (collected and compiled). Published: 2006-7-9.

USING ARRAYLIST AND LINKEDLIST

ArrayList and LinkedList are two Collections classes used for storing lists of object references. For example, you could have an ArrayList of Strings, or a LinkedList of Integers. This tip compares the performance of ArrayList and LinkedList, and offers some suggestions about which of these classes is the right choice in a given situation.

The first key point is that an ArrayList is backed by a plain Object array. Because of that, an ArrayList is much faster than a LinkedList for random access, that is, when accessing arbitrary list elements using the get method. Note that the get method is implemented for LinkedLists, but it requires a sequential scan from the front or back of the list. This scan is slow for long lists, because its cost grows with the distance from the nearer end. For a LinkedList, there's no fast way to access the Nth element of the list.
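As a rough illustration of this point, the sketch below sums the same list in two ways. The class and method names are made up for this example, and the code uses generics (Java 5 or later), which the original tip does not. On a LinkedList, the indexed loop repeats a scan for every call to get, while the iterator loop follows one link per step.

```java
import java.util.LinkedList;
import java.util.List;

// Hypothetical demo class; the names are illustrative only.
class IndexVersusIterator {

    // Indexed traversal: on a LinkedList every get(i) walks from one end,
    // so the whole loop does work that grows roughly with N * N.
    static long sumByIndex(List<Integer> list) {
        long sum = 0;
        for (int i = 0; i < list.size(); i++) {
            sum += list.get(i);
        }
        return sum;
    }

    // Iterator traversal: each step moves to the next element directly,
    // so the loop does work proportional to N for either list type.
    static long sumByIterator(List<Integer> list) {
        long sum = 0;
        for (int value : list) {
            sum += value;
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Integer> list = new LinkedList<>();
        for (int i = 1; i <= 1000; i++) {
            list.add(i);
        }
        System.out.println(sumByIndex(list));    // 500500
        System.out.println(sumByIterator(list)); // 500500
    }
}
```

Both methods return the same sum; only the access pattern differs.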

Consider the following example. Suppose you have a large list of sorted elements, either an ArrayList or a LinkedList. Suppose too that you do a binary search on the list. The standard binary search algorithm starts by checking the search key against the value in the middle of the list. If the middle value is too high, then the upper half of the list is eliminated. However, if the middle value is too low, then the lower half of the list is ignored. This process continues until the key is found in the list, or until the lower bound of the search becomes greater than the upper bound.
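The halving procedure just described can be written by hand roughly as follows. This is a simplified sketch with a hypothetical class name, not the library's implementation: it returns -1 for a missing key, whereas Collections.binarySearch returns a negative value that encodes the insertion point.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical class name; a hand-written sketch of binary search.
class BinarySearchSketch {

    // Returns the index of key in the ascending-sorted list, or -1 if absent.
    static int search(List<Integer> sorted, int key) {
        int low = 0;
        int high = sorted.size() - 1;
        while (low <= high) {
            int mid = (low + high) >>> 1;   // middle of the current range
            int midValue = sorted.get(mid); // random access into the list
            if (midValue > key) {
                high = mid - 1;             // middle too high: drop the upper half
            } else if (midValue < key) {
                low = mid + 1;              // middle too low: drop the lower half
            } else {
                return mid;                 // key found
            }
        }
        return -1;                          // lower bound passed the upper bound
    }

    public static void main(String[] args) {
        List<Integer> sorted = Arrays.asList(2, 5, 8, 13, 21);
        System.out.println(search(sorted, 13)); // 3
        System.out.println(search(sorted, 4));  // -1
    }
}
```

Every pass through the loop calls get on an arbitrary index, which is why the cost of get dominates the running time.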

Here's a program that does a binary search on all the elements in an ArrayList or a LinkedList:

import java.util.*;

public class ListDemo1 {
    static final int N = 10000;

    static List values;

    // make List of increasing Integer values

    static {
        Integer vals[] = new Integer[N];

        Random rn = new Random();

        for (int i = 0, currval = 0; i < N; i++) {
            vals[i] = new Integer(currval);
            currval += rn.nextInt(100) + 1;
        }

        values = Arrays.asList(vals);
    }

    // iterate across a list and look up every
    // value in the list using binary search

    static long timeList(List lst) {
        long start = System.currentTimeMillis();

        for (int i = 0; i < N; i++) {

            // look up a value in the list
            // using binary search

            int indx = Collections.binarySearch(
                lst, values.get(i));

            // sanity check for result
            // of binary search

            if (indx != i) {
                System.out.println(
                    "*** error ***\n");
            }
        }

        return System.currentTimeMillis() - start;
    }

    public static void main(String args[]) {

        // do lookups in an ArrayList

        System.out.println("time for ArrayList = " +
            timeList(new ArrayList(values)));

        // do lookups in a LinkedList

        System.out.println(
            "time for LinkedList = " +
            timeList(new LinkedList(values)));
    }
}

The ListDemo1 program sets up a List of sorted Integer values. It then copies the values into an ArrayList and, for the second measurement, into a LinkedList. In each case, Collections.binarySearch is used to search for each value in the list.

When you run this program, you should see a result that looks something like this:

time for ArrayList = 31

time for LinkedList = 4640

ArrayList is about 150 times faster than LinkedList. (Your results might differ depending on your machine characteristics, but you should see a distinct difference in the result for ArrayList as compared to that for LinkedList. The same is true for the other programs in this tip.) Clearly, LinkedList is a bad choice in this situation. The binary search algorithm inherently uses random access, and LinkedList does not support fast random access. The time to do a random access in a LinkedList is proportional to the size of the list. By comparison, random access in an ArrayList has a fixed time.

You can use the RandomAccess marker interface to check whether a List supports fast random access:

void f(List lst) {
    if (lst instanceof RandomAccess) {
        // supports fast random access
    }
}

ArrayList implements the RandomAccess interface, and LinkedList does not. Note that Collections.binarySearch does take advantage of the RandomAccess property to optimize searches.
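One way to act on the RandomAccess check is to pick a traversal strategy at run time. The sketch below is only an illustration: the class and method names are hypothetical, and it uses generics (Java 5 or later).

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;
import java.util.RandomAccess;

// Hypothetical helper class; names are illustrative only.
class RandomAccessCheck {

    // Reports which traversal strategy sum() will pick for the given list.
    static String strategy(List<?> list) {
        return (list instanceof RandomAccess) ? "indexed" : "sequential";
    }

    // Sums the list, using get(i) only when random access is cheap.
    static long sum(List<Integer> list) {
        long total = 0;
        if (list instanceof RandomAccess) {
            for (int i = 0, n = list.size(); i < n; i++) {
                total += list.get(i);
            }
        } else {
            for (Iterator<Integer> it = list.iterator(); it.hasNext();) {
                total += it.next();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(strategy(new ArrayList<Integer>()));  // indexed
        System.out.println(strategy(new LinkedList<Integer>())); // sequential
    }
}
```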

Do these results prove that ArrayList is always a better choice? Not necessarily. There are many cases where LinkedList does better. Also note that there are many situations where an algorithm can be implemented efficiently for LinkedList. An example is reversing a LinkedList using Collections.reverse. The internal algorithm does this, and gets reasonable performance, by using forward and backward iterators.
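The idea of reversing with a forward and a backward iterator can be sketched as follows. This is an illustration of the technique under a hypothetical class name, not a copy of the library source: two ListIterators walk toward each other and swap elements with set, so no call to get is needed.

```java
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;
import java.util.ListIterator;

// Hypothetical class name; a sketch of iterator-based reversal.
class ReverseSketch {

    static <T> void reverse(List<T> list) {
        ListIterator<T> forward = list.listIterator();              // starts at the front
        ListIterator<T> backward = list.listIterator(list.size());  // starts at the back
        for (int i = 0, mid = list.size() / 2; i < mid; i++) {
            T front = forward.next();
            T back = backward.previous();
            forward.set(back);   // replaces the element just returned by next()
            backward.set(front); // replaces the element just returned by previous()
        }
    }

    public static void main(String[] args) {
        List<Integer> list = new LinkedList<>(Arrays.asList(1, 2, 3, 4, 5));
        reverse(list);
        System.out.println(list); // [5, 4, 3, 2, 1]
    }
}
```

Because set is not a structural modification, using two iterators on the same list at once does not trigger a ConcurrentModificationException.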

Let's look at another example. Suppose you have a list of elements, and you do a lot of element inserting and deleting to the list. In this case, LinkedList is the better choice. To demonstrate that, consider the following "worst case" scenario. In this demo, a program repeatedly inserts elements at the beginning of a list. The code looks like this:

import java.util.*;

public class ListDemo2 {
    static final int N = 50000;

    // time how long it takes to add
    // N objects to a list

    static long timeList(List lst) {
        long start = System.currentTimeMillis();

        Object obj = new Object();

        for (int i = 0; i < N; i++) {
            lst.add(0, obj);
        }

        return System.currentTimeMillis() - start;
    }

    public static void main(String args[]) {

        // do timing for ArrayList

        System.out.println(
            "time for ArrayList = " +
            timeList(new ArrayList()));

        // do timing for LinkedList

        System.out.println(
            "time for LinkedList = " +
            timeList(new LinkedList()));
    }
}

When you run this program, the result should look something like this:

time for ArrayList = 4859

time for LinkedList = 125

These results are pretty much the reverse of the previous example.

When an element is added to the beginning of an ArrayList, all of the existing elements must be pushed back, which means a lot of expensive data movement and copying. By contrast, adding an element to the beginning of a LinkedList simply means allocating an internal record for the element and then adjusting a couple of links. Adding to the beginning of a LinkedList has fixed cost, but adding to the beginning of an ArrayList has a cost that's proportional to the list size.
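To make the data movement concrete, here is a minimal sketch of an array-backed list that supports insertion at the front. The class is hypothetical and far simpler than ArrayList; its growth step is an arbitrary choice for the example. The point is the System.arraycopy call, which shifts every existing element by one slot on each insert.

```java
import java.util.Arrays;

// Hypothetical array-backed list; illustrates the shift on front insertion.
class FrontInsertSketch {
    private Object[] data = new Object[10];
    private int size;

    void addFirst(Object element) {
        if (size == data.length) {
            // grow the backing array when it is full (growth step chosen for the sketch)
            data = Arrays.copyOf(data, data.length * 3 / 2 + 1);
        }
        // shift all 'size' existing elements one position toward the end
        System.arraycopy(data, 0, data, 1, size);
        data[0] = element;
        size++;
    }

    Object get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return data[index];
    }

    int size() {
        return size;
    }

    public static void main(String[] args) {
        FrontInsertSketch list = new FrontInsertSketch();
        list.addFirst("a");
        list.addFirst("b");
        list.addFirst("c");
        System.out.println(list.get(0) + " " + list.get(2)); // c a
    }
}
```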

So far, this tip has looked at speed issues, but what about space? Let's look at some internal details of how ArrayList and LinkedList are implemented in Java 2 SDK, Standard Edition v 1.4. These details are not part of the external specification of these classes, but are illustrative of how such classes work internally.

The LinkedList class has a private internal class defined like this:

private static class Entry {
    Object element;
    Entry next;
    Entry previous;
}

Each Entry object references a list element, along with the next and previous elements in the LinkedList. In other words, it is a doubly-linked list. A LinkedList of 1000 elements will have 1000 Entry objects linked together, referencing the actual list elements. There is significant space overhead in a LinkedList structure, given all these Entry objects.
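The following sketch shows how a list built from such Entry objects might add at the front and look up by index. It is a simplified, hypothetical class (it tracks explicit first and last references and omits most operations), intended only to show that addFirst touches a fixed number of links while get has to walk the chain from the nearer end.

```java
// Hypothetical doubly-linked list; a sketch, not the JDK implementation.
class TinyLinkedList {

    private static class Entry {
        Object element;
        Entry next;
        Entry previous;

        Entry(Object element) {
            this.element = element;
        }
    }

    private Entry first;
    private Entry last;
    private int size;

    // Fixed cost: allocate one Entry and adjust a couple of links.
    void addFirst(Object element) {
        Entry entry = new Entry(element);
        entry.next = first;
        if (first != null) {
            first.previous = entry;
        } else {
            last = entry; // list was empty
        }
        first = entry;
        size++;
    }

    // Cost grows with the distance from the nearer end of the list.
    Object get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        Entry entry;
        if (index < size / 2) {
            entry = first;
            for (int i = 0; i < index; i++) {
                entry = entry.next;
            }
        } else {
            entry = last;
            for (int i = size - 1; i > index; i--) {
                entry = entry.previous;
            }
        }
        return entry.element;
    }

    public static void main(String[] args) {
        TinyLinkedList list = new TinyLinkedList();
        list.addFirst("a");
        list.addFirst("b");
        list.addFirst("c");
        System.out.println(list.get(0) + " " + list.get(1) + " " + list.get(2)); // c b a
    }
}
```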

An ArrayList has a backing Object array to store the elements. This array starts with a capacity of 10. When the array needs to grow, the new capacity is computed as:

newCapacity = (oldCapacity * 3) / 2 + 1;

Notice that the array capacity grows each time by about 50%. This means that if you have an ArrayList with a large number of elements, there will be a significant amount of space wasted at the end. This waste is intrinsic to the way ArrayList works. If there were no spare capacity, the array would have to be reallocated for each new element, and performance would suffer dramatically. Changing the growth strategy to be more aggressive (such as doubling the size at each reallocation) would result in slightly better performance, but it would waste more space.
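The effect of this growth formula can be explored with a few lines of code. The sketch below uses a hypothetical class name and simply applies the formula quoted above, starting from the default capacity of 10, to count how many reallocations are needed before a given number of elements fits.

```java
// Hypothetical class name; applies the growth formula quoted in the text.
class GrowthSketch {

    // The capacity formula described for the 1.4 implementation.
    static int grow(int oldCapacity) {
        return (oldCapacity * 3) / 2 + 1;
    }

    // Number of reallocations needed, starting at capacity 10,
    // before 'elements' items fit in the backing array.
    static int reallocationsFor(int elements) {
        int capacity = 10;
        int count = 0;
        while (capacity < elements) {
            capacity = grow(capacity);
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        int capacity = 10;
        for (int i = 0; i < 6; i++) {
            capacity = grow(capacity);
            System.out.print(capacity + " "); // 16 25 38 58 88 133
        }
        System.out.println();
        System.out.println(reallocationsFor(100)); // 6
    }
}
```

Because the capacity grows geometrically, the number of reallocations grows only logarithmically with the number of elements, which is what keeps the average cost of appending fixed.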

If you know how many elements will be in an ArrayList, you can specify the capacity to the constructor. You can also call the trimToSize method after the fact to reallocate the internal array. This gets rid of the wasted space.
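Both options might be used as in the following sketch. The class and method names are hypothetical, and the code uses generics (Java 5 or later). Note that capacity is not observable through the public API, so the program can only show that the contents are unaffected.

```java
import java.util.ArrayList;

// Hypothetical class name; shows presizing and trimming an ArrayList.
class CapacitySketch {

    // Element count known up front: presize the backing array once.
    static ArrayList<Integer> buildPresized(int n) {
        ArrayList<Integer> list = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            list.add(i);
        }
        return list;
    }

    // Element count not known up front: let the list grow, then trim.
    static ArrayList<Integer> buildThenTrim(int n) {
        ArrayList<Integer> list = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            list.add(i);
        }
        list.trimToSize(); // releases the unused reserve capacity
        return list;
    }

    public static void main(String[] args) {
        System.out.println(buildPresized(1000).size()); // 1000
        System.out.println(buildThenTrim(1000).size()); // 1000
    }
}
```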

So far, this discussion has assumed that either an ArrayList or a LinkedList is "right" for a given application. But sometimes, other choices make more sense. For example, consider the very common situation where you have a list of key/value pairs, and you would like to retrieve a value for a given key.

You could store the pairs in an N x 2 Object array. To find the right pair, you could do a sequential search on the key values. This approach works, and is a useful choice for very small lists (say 10 elements or less), but it doesn't scale to big lists.

Another approach is to sort the key/value pairs by ascending key value, store the result in a pair of ArrayLists, and then do a binary search on the keys list. This approach also works, and is very fast. Yet another approach is to not use a list structure at all, but instead use a map structure (hash table), in the form of a HashMap.
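The pair-of-ArrayLists approach could look roughly like this. The class and its methods are hypothetical, and the sketch assumes that pairs are appended in ascending key order, so that the keys list is always sorted. The index found in the keys list is used to read the matching entry from the values list.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical class; key/value lookup over two parallel sorted lists.
class SortedPairLookup {
    private final List<Integer> keys = new ArrayList<>();
    private final List<String> values = new ArrayList<>();

    // Assumption: callers append pairs in ascending key order.
    void append(int key, String value) {
        keys.add(key);
        values.add(value);
    }

    // Returns the value stored for key, or null if the key is absent.
    String lookup(int key) {
        int index = Collections.binarySearch(keys, key);
        return (index >= 0) ? values.get(index) : null;
    }

    public static void main(String[] args) {
        SortedPairLookup table = new SortedPairLookup();
        table.append(10, "ten");
        table.append(20, "twenty");
        table.append(30, "thirty");
        System.out.println(table.lookup(20)); // twenty
        System.out.println(table.lookup(15)); // null
    }
}
```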

Which is faster, a binary search on an ArrayList, or a HashMap? Here's a final example that compares these two:

import java.util.*;

public class ListDemo3 {
    static final int N = 500000;

    // Lists of keys and values

    static List keys;
    static List values;

    // fill the keys list with ascending order key
    // values and fill the values list with
    // corresponding values (-key)

    static {
        Integer keyvec[] = new Integer[N];
        Integer valuevec[] = new Integer[N];

        Random rn = new Random();

        for (int i = 0, currval = 0; i < N; i++) {
            keyvec[i] = new Integer(currval);
            valuevec[i] = new Integer(-currval);
            currval += rn.nextInt(100) + 1;
        }

        keys = Arrays.asList(keyvec);
        values = Arrays.asList(valuevec);
    }

    // fill a Map with key/value pairs

    static Map map = new HashMap();

    static {
        for (int i = 0; i < N; i++) {
            map.put(keys.get(i), values.get(i));
        }
    }

    // do binary search lookup of all keys

    static long timeList() {
        long start = System.currentTimeMillis();

        for (int i = 0; i < N; i++) {
            int indx = Collections.binarySearch(
                keys, keys.get(i));

            // sanity check of returned value
            // from binary search

            if (indx != i) {
                System.out.println(
                    "*** error ***\n");
            }
        }

        return System.currentTimeMillis() - start;
    }

    // do Map lookup of all keys

    static long timeMap() {
        long start = System.currentTimeMillis();

        for (int i = 0; i < N; i++) {
            Integer value = (Integer)map.get(
                keys.get(i));

            // sanity check of value returned
            // from map lookup

            if (value != values.get(i)) {
                System.out.println(
                    "*** error ***\n");
            }
        }

        return System.currentTimeMillis() - start;
    }

    public static void main(String args[]) {

        // do timing for List implementation

        System.out.println("List time = " +
            timeList());

        // do timing for Map implementation

        System.out.println("Map time = " +
            timeMap());
    }
}

The program sets up Lists of keys and values, and then uses two different techniques to map keys to values. One approach uses a binary search on a list, the other a hash table.

When you run the ListDemo3 program, you should get a result that looks something like this:

List time = 1000

Map time = 281

In this example, N has a value of 500000. Approximately, log2(N) - 1 comparisons are required in an average successful binary search, so each binary search lookup in the ArrayList will take about 18 comparisons. By contrast, a properly implemented hash table typically requires only 1-3 comparisons. So you should expect the hash table to be faster in this case.
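The estimate of about 18 comparisons can be checked empirically. The sketch below uses a hypothetical class and a hand-written binary search over an int array (rather than Collections.binarySearch) so that the number of probes, meaning elements examined, can be counted. It searches for every key once and reports the average next to log2(N) - 1.

```java
// Hypothetical class; counts probes made by a hand-written binary search.
class ProbeCount {
    static long probes;

    static int search(int[] sorted, int key) {
        int low = 0;
        int high = sorted.length - 1;
        while (low <= high) {
            int mid = (low + high) >>> 1;
            probes++; // one element examined per loop iteration
            if (sorted[mid] < key) {
                low = mid + 1;
            } else if (sorted[mid] > key) {
                high = mid - 1;
            } else {
                return mid;
            }
        }
        return -1;
    }

    // Average number of probes over one successful search per key.
    static double averageProbes(int n) {
        int[] sorted = new int[n];
        for (int i = 0; i < n; i++) {
            sorted[i] = 2 * i; // any ascending sequence will do
        }
        probes = 0;
        for (int i = 0; i < n; i++) {
            search(sorted, sorted[i]);
        }
        return (double) probes / n;
    }

    public static void main(String[] args) {
        int n = 500000;
        System.out.println("average probes = " + averageProbes(n));
        System.out.println("log2(N) - 1    = " + (Math.log(n) / Math.log(2) - 1));
    }
}
```

For N = 500000, both printed values should come out close to 18.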

However, binary search is still useful. For example, you might want to do a lookup in a sorted list and then find keys that are close in value to the key used for the lookup. Doing this is easy with binary search, but not practical with a hash table, because keys in a hash table are stored in apparently random order. Also, if you are concerned with worst-case performance, the binary search algorithm offers a much stronger performance guarantee than a hash table scheme. You might also consider using TreeMap for doing lookups in sorted collections of key/value pairs.
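As an illustration of finding nearby keys, the sketch below uses the negative return value of Collections.binarySearch, which encodes the insertion point as -(insertion point) - 1, to locate the largest key that does not exceed a target. The class and method names are hypothetical. The main method also shows the floorKey and ceilingKey methods of TreeMap, which were added in Java 6 and so are newer than the SDK version this tip discusses.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.TreeMap;

// Hypothetical class; nearest-key lookup in a sorted list.
class NearbyKeys {

    // Returns the largest key <= target, or null if every key is larger.
    // Assumption: sortedKeys is in ascending order.
    static Integer floorKey(List<Integer> sortedKeys, int target) {
        int index = Collections.binarySearch(sortedKeys, target);
        if (index < 0) {
            int insertionPoint = -index - 1; // where target would be inserted
            index = insertionPoint - 1;      // the key just before that point
        }
        return (index >= 0) ? sortedKeys.get(index) : null;
    }

    public static void main(String[] args) {
        List<Integer> keys = Arrays.asList(10, 20, 30, 40);
        System.out.println(floorKey(keys, 25)); // 20
        System.out.println(floorKey(keys, 5));  // null

        TreeMap<Integer, String> map = new TreeMap<>();
        for (int key : keys) {
            map.put(key, "value" + key);
        }
        System.out.println(map.floorKey(25));   // 20
        System.out.println(map.ceilingKey(25)); // 30
    }
}
```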

Let's summarize the key points presented in this tip:

Appending elements to the end of a list has a fixed averaged cost for both ArrayList and LinkedList. For ArrayList, appending typically involves setting an internal array location to the element reference, but occasionally results in the array being reallocated. For LinkedList, the cost is uniform and involves allocating an internal Entry object.
Inserting or deleting elements in the middle of an ArrayList implies that the rest of the list must be moved. Inserting or deleting elements in the middle of a LinkedList has fixed cost once the position has been reached, for example through a ListIterator.
A LinkedList does not support efficient random access.
An ArrayList has space overhead in the form of reserve capacity at the end of the list. A LinkedList has significant space overhead per element.
Sometimes a Map structure is a better choice than a List.
