
Core Java 2, Volume II

by Cay S. Horstmann and Gary Cornell

Chapter 2: Collections


 

Bulk Operations

So far, most of our examples used an iterator to traverse a collection, one element at a time. However, you can often avoid iteration by using one of the bulk operations in the library.

Suppose you want to find the intersection of two sets, the elements that two sets have in common. First, make a new set to hold the result.

Set result = new HashSet(a);
Here, you use the fact that every collection has a constructor whose parameter is another collection that holds the initialization values.

Now, use the retainAll method:

result.retainAll(b);
It retains all elements that also happen to be in b. You have formed the intersection without programming a loop.
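The two steps can be wrapped in a small helper. This is only a sketch: the class name IntersectionDemo and the method name intersection are made up for illustration, and the code uses generics, which were added to the language after this text was written.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class IntersectionDemo
{
   // Returns a new set holding the elements common to a and b.
   // Neither argument is modified.
   static <T> Set<T> intersection(Set<T> a, Set<T> b)
   {
      Set<T> result = new HashSet<T>(a); // copy a into a fresh set
      result.retainAll(b);               // keep only the elements that are also in b
      return result;
   }

   public static void main(String[] args)
   {
      Set<String> a = new HashSet<String>(Arrays.asList("Amy", "Bob", "Carl"));
      Set<String> b = new HashSet<String>(Arrays.asList("Bob", "Carl", "Dana"));
      // contains Bob and Carl; a hash set does not promise any particular order
      System.out.println(intersection(a, b));
   }
}
```

Copying a first matters: calling a.retainAll(b) directly would destroy the original contents of a.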

You can carry this idea further and apply a bulk operation to a view. For example, suppose you have a map that maps employee IDs to employee objects, and you have a set of the IDs of all employees that are to be terminated.

Map staffMap = . . .;
Set terminatedIDs = . . .;
Simply form the key set and remove all IDs of terminated employees.
staffMap.keySet().removeAll(terminatedIDs);
Because the key set is a view into the map, the keys and associated employee objects are automatically removed from the map.
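Here is a runnable sketch of the key set technique. The employee IDs and names are invented, the map holds plain strings instead of employee objects, and the helper removeTerminated is a hypothetical name. The code uses generics, which postdate this text.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class KeySetViewDemo
{
   // Removes every entry whose key appears in terminatedIDs by
   // mutating the key set view. Returns the number of entries removed.
   static <K, V> int removeTerminated(Map<K, V> staffMap, Set<K> terminatedIDs)
   {
      int before = staffMap.size();
      staffMap.keySet().removeAll(terminatedIDs);
      return before - staffMap.size();
   }

   public static void main(String[] args)
   {
      // Hypothetical IDs and names, for illustration only
      Map<String, String> staffMap = new HashMap<String, String>();
      staffMap.put("E1", "Amy Lee");
      staffMap.put("E2", "Bob Ruiz");
      staffMap.put("E3", "Carl Chen");
      Set<String> terminatedIDs = new HashSet<String>(Arrays.asList("E2", "E9"));

      System.out.println(removeTerminated(staffMap, terminatedIDs)); // 1
      System.out.println(staffMap.containsKey("E2"));                // false
      System.out.println(staffMap.containsValue("Bob Ruiz"));        // false
   }
}
```

IDs that are not in the map, such as E9 here, are simply ignored.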

By using a subrange view, you can restrict bulk operations to sublists and subsets. For example, suppose you want to add the first ten elements of a list to another container. Form a sublist to pick out the first ten:

relocated.addAll(staff.subList(0, 10));
The subrange can also be a target of a mutating operation.
staff.subList(0, 10).clear();
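The two sublist operations combine naturally into a "move" operation. The following sketch assumes a hypothetical helper relocateFirst and a resizable source list. It uses generics, which postdate this text.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SubListDemo
{
   // Moves the first n elements of staff to the end of relocated.
   // staff must be resizable, because clearing the view removes elements.
   static <T> void relocateFirst(List<T> staff, List<T> relocated, int n)
   {
      List<T> firstN = staff.subList(0, n); // a view, not a copy
      relocated.addAll(firstN);             // copy the elements out
      firstN.clear();                       // removes them from staff as well
   }

   public static void main(String[] args)
   {
      List<String> staff
         = new ArrayList<String>(Arrays.asList("Amy", "Bob", "Carl", "Dana"));
      List<String> relocated = new ArrayList<String>();
      relocateFirst(staff, relocated, 2);
      System.out.println(relocated); // [Amy, Bob]
      System.out.println(staff);     // [Carl, Dana]
   }
}
```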
Interfacing with Legacy APIs

Since large portions of the Java platform API were designed before the collections framework was created, you occasionally need to translate between traditional arrays and vectors and the more modern collections. First, consider the case where you have values in an array or vector and you want to put them into a collection. If the values are inside a Vector, simply construct your collection from the vector:

Vector values = . . .;
HashSet staff = 
   new HashSet(values);
All collection classes have a constructor that can take an arbitrary collection object. Since the Java 2 platform, the Vector class implements the List interface. 

If you have an array, you need to turn it into a collection. The Arrays.asList wrapper serves this purpose:

String[] values = . . .;
HashSet staff = new HashSet(Arrays.asList(values));
Conversely, if you need to call a method that requires a vector, you can construct a vector from any collection:
Vector values = new Vector(staff);
Obtaining an array is a bit trickier. Of course, you can use the toArray method:
Object[] values = staff.toArray();
But the result is an array of objects. Even if you know that your collection contained objects of a specific type, you cannot use a cast:
String[] values = (String[])staff.toArray(); // Error!
The array returned by the toArray method was created as an Object[] array, and you cannot change its type. Instead, you need to use a variant of the toArray method. Give it an array of length 0 of the type that you'd like. Then, the returned array is created as the same array type, and you can cast it:
String[] values = (String[])staff.toArray(new String[0]);
NOTE: You may wonder why you don't simply pass a Class object (such as String.class) to the toArray method. However, as you can see from the API notes, this method does double duty, both to fill an existing array (provided it is long enough) and to create a new array.

java.util.Collection

  • Object[] toArray(Object[] array)

  • checks if the array parameter is larger than the size of the collection. If so, it adds all elements of the collection into the array, followed by a null terminator, and it returns the array. If the length of array equals the size of the collection, then the method adds all elements of the collection to the array but does not add a null terminator. If there isn't enough room, then the method creates a new array, of the same type as the incoming array, and fills it with the elements of the collection. 
    Parameters: array the array that holds the collection elements, or whose element type is used to create a new array to hold the collection elements
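The double duty described in the API note can be observed directly. In this sketch, the helper toStringArray is a hypothetical name. Because the code uses generics, which postdate this text, the cast to String[] is no longer needed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

class ToArrayDemo
{
   // Returns the elements of c in a newly created String[] array.
   static String[] toStringArray(Collection<String> c)
   {
      // A length 0 array is too short, so toArray creates a new array
      // of the same type and the right length.
      return c.toArray(new String[0]);
   }

   public static void main(String[] args)
   {
      List<String> staff = new ArrayList<String>(Arrays.asList("Amy", "Bob"));

      String[] exact = toStringArray(staff);
      System.out.println(exact.length);         // 2

      // An array with room to spare is filled in place,
      // and a null terminator follows the last element.
      String[] big = { "x", "x", "x", "x" };
      String[] same = staff.toArray(big);
      System.out.println(same == big);          // true
      System.out.println(Arrays.toString(big)); // [Amy, Bob, null, x]
   }
}
```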
Algorithms

Generic collection interfaces have a great advantage--you only need to implement your algorithms once. For example, consider a simple algorithm to compute the maximum element in a collection. Traditionally, programmers would implement such an algorithm as a loop. Here is how you find the largest element of an array. 

if (a.length == 0) throw new NoSuchElementException();
Comparable largest = a[0];
for (int i = 1; i < a.length; i++)
   if (largest.compareTo(a[i]) < 0) largest = a[i];
   
Of course, to find the maximum of a vector, the code would be slightly different. 
if (v.size() == 0) throw new NoSuchElementException();
Comparable largest = (Comparable)v.get(0);
for (int i = 1; i < v.size(); i++)
   if (largest.compareTo((Comparable)v.get(i)) < 0)
      largest = (Comparable)v.get(i);
      
What about a linked list? You don't have random access in a linked list. But you can use an iterator.
if (l.isEmpty()) throw new NoSuchElementException();
Iterator iter = l.iterator();
Comparable largest = (Comparable)iter.next();
while (iter.hasNext())
{  Comparable next = (Comparable)iter.next();
   if (largest.compareTo(next) < 0) largest = next;
}
These loops are tedious to write, and they are just a bit error-prone. Is there an off-by-one error? Do the loops work correctly for empty containers? For containers with only one element? You don't want to test and debug this code every time, but you also don't want to implement a whole slew of methods such as these: 
Object max(Comparable[] a)
Object max(Vector v)
Object max(LinkedList l)
That's where the collection interfaces come in. Think of the minimal collection interface that you need to efficiently carry out the algorithm. Random access with get and set comes higher in the food chain than simple iteration. As you have seen in the computation of the maximum element in a linked list, random access is not required for this task. Computing the maximum can be done simply by iterating through the elements. Therefore, you can implement the max method to take any object that implements the Collection interface.
public static Object max(Collection c)
{  if (c.isEmpty()) throw new NoSuchElementException();
   Iterator iter = c.iterator();
   Comparable largest = (Comparable)iter.next();
   while (iter.hasNext())
   {  Comparable next = (Comparable)iter.next();
      if (largest.compareTo(next) < 0) largest = next;
   }
   return largest;
}
Now you can compute the maximum of a linked list, a vector, or an array, with a single method.
LinkedList l;
Vector v;
Employee[] a;
. . .
largest = max(l);
largest = max(v);
largest = max(Arrays.asList(a));
That's a powerful concept. In fact, the standard C++ library has dozens of useful algorithms, each of which operates on a generic collection. The Java library is not quite so rich, but it does contain the basics: sorting, binary search, and some utility algorithms. 

Sorting and Shuffling

Computer old-timers will sometimes reminisce about how they had to use punched cards and how they actually had to program sorting algorithms by hand. Nowadays, of course, sorting algorithms are part of the standard library for most programming languages, and the Java programming language is no exception.

The sort method in the Collections class sorts a collection that implements the List interface.

List staff = new LinkedList();
// fill collection . . .;
Collections.sort(staff);
This method assumes that the list elements implement the Comparable interface. If you want to sort the list in some other way, you can pass a Comparator object as a second parameter. (We discussed comparators in the section on sorted sets.) Here is how you can sort a list of employees by increasing salary.
Collections.sort(staff,
   new Comparator()
   {  public int compare(Object a, Object b)
      {  double salaryDifference = ((Employee)a).getSalary()
            - ((Employee)b).getSalary();
         if (salaryDifference < 0) return -1;
         if (salaryDifference > 0) return 1;
         return 0;
      }
   });
   
If you want to sort a list in descending order, then use the static convenience method Collections.reverseOrder(). It returns a comparator that returns b.compareTo(a). (The objects must implement the Comparable interface.) For example,
Collections.sort(
    staff, Collections.reverseOrder())
sorts the elements in the list staff in reverse order, according to the ordering given by the compareTo method of the element type.
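The following sketch puts the three sorting calls side by side. The Employee class shown here is a made-up minimal stand-in whose natural order is by name, and BY_SALARY is a hypothetical constant name. The code uses generics, which postdate this text, so the casts inside the comparator disappear.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class SortDemo
{
   // Hypothetical minimal Employee whose natural order is by name.
   static class Employee implements Comparable<Employee>
   {
      final String name;
      final double salary;

      Employee(String name, double salary)
      {
         this.name = name;
         this.salary = salary;
      }

      public int compareTo(Employee other) { return name.compareTo(other.name); }
      public String toString() { return name; }
   }

   // Orders employees by increasing salary.
   static final Comparator<Employee> BY_SALARY = new Comparator<Employee>()
   {
      public int compare(Employee a, Employee b)
      {
         return Double.compare(a.salary, b.salary);
      }
   };

   public static void main(String[] args)
   {
      List<Employee> staff = new ArrayList<Employee>();
      staff.add(new Employee("Carl", 70000));
      staff.add(new Employee("Amy", 50000));
      staff.add(new Employee("Bob", 60000));

      Collections.sort(staff);                             // natural order
      System.out.println(staff);                           // [Amy, Bob, Carl]
      Collections.sort(staff, Collections.reverseOrder()); // reversed natural order
      System.out.println(staff);                           // [Carl, Bob, Amy]
      Collections.sort(staff, BY_SALARY);                  // increasing salary
      System.out.println(staff);                           // [Amy, Bob, Carl]
   }
}
```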

You may wonder how the sort method sorts a list. Typically, when you look at a sorting algorithm in a book on algorithms, it is presented for arrays and uses random element access. But random access in a list can be inefficient. You can actually sort lists efficiently by using a form of merge sort (see, for example, Algorithms in C++, Parts 1-4, by Robert Sedgewick [Addison-Wesley 1998, pp. 366-369]). However, the implementation in the Java programming language does not do that. It simply dumps all elements into an array, sorts the array using a different variant of merge sort, and then copies the sorted sequence back into the list.

The merge sort algorithm used in the collections library is a bit slower than quick sort, the traditional choice for a general-purpose sorting algorithm. However, it has one major advantage: it is stable, that is, it doesn't switch equal elements. Why do you care about the order of equal elements? Here is a common scenario. Suppose you have an employee list that you already sorted by name. Now you sort by salary. What happens to employees with equal salary? With a stable sort, the ordering by name is preserved. In other words, the outcome is a list that is sorted first by salary, then by name.
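You can watch stability at work with a short experiment. This sketch uses an invented Employee class and a hypothetical method namesBySalaryThenName. It uses generics, which postdate this text.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class StableSortDemo
{
   // Hypothetical record type, for illustration only.
   static class Employee
   {
      final String name;
      final int salary;

      Employee(String name, int salary)
      {
         this.name = name;
         this.salary = salary;
      }
   }

   // Sorts a copy by name, then by salary, and returns the names.
   // Since the sort is stable, employees with equal salary
   // remain in name order after the second pass.
   static List<String> namesBySalaryThenName(List<Employee> staff)
   {
      List<Employee> copy = new ArrayList<Employee>(staff);
      Collections.sort(copy, new Comparator<Employee>()
         {
            public int compare(Employee a, Employee b)
            {
               return a.name.compareTo(b.name);
            }
         });
      Collections.sort(copy, new Comparator<Employee>()
         {
            public int compare(Employee a, Employee b)
            {
               return Integer.compare(a.salary, b.salary);
            }
         });
      List<String> names = new ArrayList<String>();
      for (Employee e : copy) names.add(e.name);
      return names;
   }

   public static void main(String[] args)
   {
      List<Employee> staff = Arrays.asList(
         new Employee("Dana", 50000), new Employee("Bob", 60000),
         new Employee("Carl", 50000), new Employee("Amy", 60000));
      System.out.println(namesBySalaryThenName(staff)); // [Carl, Dana, Amy, Bob]
   }
}
```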

Because collections need not implement all of their optional methods, all methods that receive collection parameters need to describe when it is safe to pass a collection to an algorithm. For example, you clearly cannot pass an unmodifiableList list to the sort algorithm. What kind of list can you pass? According to the documentation, the list must be modifiable but need not be resizable. 

These terms are defined as follows:

  • A list is modifiable if it supports the set method.
  • A list is resizable if it supports the add and remove operations.
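The difference between the two terms shows up when you try to sort. In this sketch, the list returned by Arrays.asList is modifiable but not resizable, and the helper canSort is a hypothetical name. The code uses generics, which postdate this text.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class ModifiableListDemo
{
   // Returns true if Collections.sort accepts the list,
   // false if the list rejects the modification.
   static boolean canSort(List<String> list)
   {
      try
      {
         Collections.sort(list);
         return true;
      }
      catch (UnsupportedOperationException e)
      {
         return false;
      }
   }

   public static void main(String[] args)
   {
      // Supports set but not add or remove: modifiable, not resizable
      List<String> fixedSize = Arrays.asList("Carl", "Amy", "Bob");
      System.out.println(canSort(fixedSize)); // true
      System.out.println(fixedSize);          // [Amy, Bob, Carl]

      // Supports neither set nor add or remove
      List<String> readOnly
         = Collections.unmodifiableList(Arrays.asList("Carl", "Amy", "Bob"));
      System.out.println(canSort(readOnly));  // false
   }
}
```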
The Collections class has an algorithm shuffle that does the opposite of sorting--it randomly permutes the order of the elements in a list. You supply the list to be shuffled and, optionally, a random number generator. For example,
ArrayList cards = . . .;
Collections.shuffle(cards);
The current implementation of the shuffle algorithm requires random access to the list elements, so it won't work too well with a large linked list.
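Supplying your own random number generator is useful when you need a repeatable shuffle, for example in a test. Here is a minimal sketch, where the method name shuffled and the seed value are arbitrary choices. It uses generics, which postdate this text.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

class SeededShuffleDemo
{
   // Returns the numbers 1 through n in an order determined by the seed.
   static List<Integer> shuffled(int n, long seed)
   {
      List<Integer> numbers = new ArrayList<Integer>(n);
      for (int i = 1; i <= n; i++) numbers.add(i);
      Collections.shuffle(numbers, new Random(seed));
      return numbers;
   }

   public static void main(String[] args)
   {
      List<Integer> first = shuffled(10, 42L);
      List<Integer> second = shuffled(10, 42L);
      System.out.println(first);
      System.out.println(first.equals(second)); // true: same seed, same order
   }
}
```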

The program in Example 2-5 fills an array list with 49 Integer objects containing the numbers 1 through 49. It then randomly shuffles the list and selects the first 6 values from the shuffled list. Finally, it sorts the selected values and prints them out.

Example 2-5: ShuffleTest.java

import java.util.*;

public class ShuffleTest
{  public static void main(String[] args)
   {  List numbers = new ArrayList(49);
      for (int i = 1; i <= 49; i++)
         numbers.add(new Integer(i));
      Collections.shuffle(numbers);
      List winningCombination = numbers.subList(0, 6);
      Collections.sort(winningCombination);
      System.out.println(winningCombination);
   }
}
java.util.Collections
  • static void sort(List elements)
  • static void sort(List elements, Comparator c)

  • sort the elements in the list, using a stable sort algorithm. The algorithm is guaranteed to run in O(n log n) time, where n is the length of the list.
    Parameters: elements the list to sort
    c the comparator to use for sorting
  • static void shuffle(List elements)
  • static void shuffle(List elements, Random r)

  • randomly shuffles the elements in the list. This algorithm runs in O(n a(n)) time, where n is the length of the list and a(n) is the average time to access an element.
    Parameters: elements the list to shuffle
    r the source of randomness for shuffling
  • static Comparator reverseOrder()

  • returns a comparator that sorts elements in the reverse order of the one given by the compareTo method of the Comparable interface.

Binary Search

To find an object in an array, you normally need to visit all elements until you find a match. However, if the array is sorted, then you can look at the middle element and check if it is larger than the element that you are trying to find. If so, you keep looking in the first half of the array; otherwise, you look in the second half. That cuts the problem in half. You keep going in the same way. For example, if the array has 1024 elements, you will locate the match (or confirm that there is none) after 10 steps, whereas a linear search would have taken you an average of 512 steps if the element is present and 1024 steps to confirm that it is not.

The binarySearch method of the Collections class implements this algorithm. Note that the collection must already be sorted, or the algorithm will return the wrong answer. To find an element, supply the collection (which must implement the List interface--more on that in the note below) and the element to be located. If the collection is not sorted by the compareTo method of the Comparable interface, then you need to supply a comparator object as well.

i = Collections.binarySearch(c, element);
i = Collections.binarySearch(c, element, comparator);
If the return value of the binarySearch method is ≥ 0, it denotes the index of the matching object. That is, c.get(i) is equal to element under the comparison order. If the value is negative, then there is no matching element. However, you can use the return value to compute the location where you should insert element into the collection to keep it sorted. The insertion location is
insertionPoint = -i - 1;
It isn't simply -i because then the value of 0 would be ambiguous. In other words, the operation
if (i < 0)
   c.add(-i - 1, element);
adds the element in the correct place.
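Here is the insertion idiom as a complete sketch. The helper insertSorted is a hypothetical name, and the code assumes a resizable list with efficient random access, such as an array list. It uses generics, which postdate this text.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class BinarySearchDemo
{
   // Inserts element into the sorted list c unless it is already present.
   // Returns the index at which the element can be found afterwards.
   static int insertSorted(List<String> c, String element)
   {
      int i = Collections.binarySearch(c, element);
      if (i >= 0) return i;        // already present at index i
      int insertionPoint = -i - 1; // decode the negative return value
      c.add(insertionPoint, element);
      return insertionPoint;
   }

   public static void main(String[] args)
   {
      List<String> c = new ArrayList<String>(Arrays.asList("Amy", "Carl", "Dana"));
      System.out.println(insertSorted(c, "Bob"));   // 1
      System.out.println(insertSorted(c, "Aaron")); // 0
      System.out.println(insertSorted(c, "Carl"));  // 3, and nothing is inserted
      System.out.println(c); // [Aaron, Amy, Bob, Carl, Dana]
   }
}
```

Inserting "Aaron" shows why the encoding is -i - 1: the insertion point 0 is reported as -1, which cannot be confused with a match at index 0.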

To be worthwhile, binary search requires random access. If you have to iterate one by one through half of a linked list to find the middle element, you have lost all advantage of the binary search. Therefore, the binarySearch algorithm reverts to a linear search if you give it a linked list.



NOTE: Unfortunately, since there is no separate interface for an ordered collection with efficient random access, the binarySearch method employs a very crude device to find out whether to carry out a binary or a linear search. It checks whether the list parameter extends the AbstractSequentialList class. If it does, then the parameter is certainly a linked list, because the abstract sequential list is a skeleton implementation of a linked list. In all other cases, the binarySearch algorithm makes the assumption that the collection supports efficient random access and proceeds with a binary search. 

java.util.Collections

  • static int binarySearch(List elements, Object key)
  • static int binarySearch(List elements, Object key, Comparator c)

  • search for a key in a sorted list, using a linear search if elements extends the AbstractSequentialList class, a binary search in all other cases. The methods are guaranteed to run in O(a(n) log n) time, where n is the length of the list and a(n) is the average time to access an element. The methods return either the index of the key in the list, or a negative value i if the key is not present in the list. In that case, the key should be inserted at index -i - 1 for the list to stay sorted.
    Parameters: elements the list to search
    key the object to find
    c the comparator used for sorting the list elements
Simple Algorithms

The Collections class contains several simple but useful algorithms. Among them is the example from the beginning of this section, finding the maximum value of a collection. Others include copying elements from one list to another, filling a container with a constant value, and reversing a list. Why supply such simple algorithms in the standard library? Surely most programmers could easily implement them with simple loops. We like the algorithms because they make life easier for the programmer reading the code. When you read a loop that was implemented by someone else, you have to decipher the original programmer's intentions. When you see a call to a method such as Collections.max, you know right away what the code does.

The following API notes describe the simple algorithms in the Collections class.

java.util.Collections

  • static Object min(Collection elements)
  • static Object max(Collection elements)
  • static Object min(Collection elements, Comparator c)
  • static Object max(Collection elements, Comparator c)

  • return the smallest or largest element in the collection.
    Parameters: elements the collection to search
    c the comparator used for sorting the elements
  • static void copy(List to, List from)

  • copies all elements from a source list to the same positions in the target list. The target list must be at least as long as the source list.
    Parameters: to the target list
    from the source list
  • static void fill(List l, Object value)

  • sets all positions of a list to the same value.
    Parameters: l the list to fill
    value the value with which to fill the list
  • static void reverse(List l)

  • reverses the order of the elements in a list. This method runs in O(n) time, where n is the length of the list.
    Parameters: l the list to reverse
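The following sketch exercises each of these methods once. The helper reversedCopy is a hypothetical name, added only to show how a simple algorithm can be combined with the copy constructor. The code uses generics, which postdate this text.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class SimpleAlgorithmsDemo
{
   // Returns a reversed copy and leaves the argument untouched.
   static <T> List<T> reversedCopy(List<T> source)
   {
      List<T> result = new ArrayList<T>(source);
      Collections.reverse(result);
      return result;
   }

   public static void main(String[] args)
   {
      List<Integer> numbers = new ArrayList<Integer>(Arrays.asList(3, 1, 4, 1, 5));
      System.out.println(Collections.min(numbers)); // 1
      System.out.println(Collections.max(numbers)); // 5

      List<Integer> reversed = reversedCopy(numbers);
      System.out.println(reversed);                 // [5, 1, 4, 1, 3]

      // copy overwrites the leading positions of the target
      // and leaves any remaining positions alone
      List<Integer> target = new ArrayList<Integer>(Arrays.asList(0, 0, 0, 0, 0, 9));
      Collections.copy(target, reversed);
      System.out.println(target);                   // [5, 1, 4, 1, 3, 9]

      Collections.fill(numbers, 7);
      System.out.println(numbers);                  // [7, 7, 7, 7, 7]
   }
}
```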
Writing Your Own Algorithms

If you write your own algorithm (or in fact, any method that has a collection as a parameter), you should work with interfaces, not concrete implementations, whenever possible. For example, suppose you want to fill a JComboBox with a set of strings. Traditionally, such a method might have been implemented like this: 

void fillComboBox(JComboBox comboBox, Vector choices)
{  for (int i = 0; i < choices.size(); i++)
      comboBox.addItem(choices.get(i));
}
However, you have now constrained the caller of your method--the caller must supply the choices in a vector. If the choices happen to be in another container, they first need to be repackaged. It is much better to accept a more general collection. 

You should ask yourself what is the most general collection interface that can do the job. In this case, you just need to visit all elements, a capability of the basic Collection interface. Here is how you can rewrite the fillComboBox method to accept collections of any kind.

void fillComboBox(
   JComboBox comboBox, Collection choices)
{  Iterator iter = choices.iterator();
   while (iter.hasNext())
      comboBox.addItem(iter.next());
}
Now, anyone can call this method with a vector or even with an array, wrapped with the Arrays.asList wrapper.

NOTE: If it is such a good idea to use collection interfaces as method parameters, why doesn't the Java library follow this rule more often? For example, the JComboBox class has two constructors: 

JComboBox(Object[] items)
JComboBox(Vector items)
The reason is simply timing. The Swing library was created before the collections library. You should expect future APIs to rely more heavily on the collections library. In particular, vectors should be on their way out because of the synchronization overhead.

If you write a method that returns a collection, you don't have to change the return type to a collection interface. The user of your method might in fact have a slight preference to receive the most concrete class possible. However, for your own convenience, you may want to return an interface instead of a class, because you can then change your mind and reimplement the method later with a different collection.

For example, let's write a method getAllItems that returns all items of a combo box. You could simply return the collection that you used to gather the items, say, an ArrayList.

ArrayList getAllItems(JComboBox comboBox)
{  ArrayList items = new ArrayList(comboBox.getItemCount());
   for (int i = 0; i < comboBox.getItemCount(); i++)
      items.add(comboBox.getItemAt(i));
   return items;
}
Or, you could change the return type to List.
List getAllItems(JComboBox comboBox)
Then, you are free to change the implementation later. For example, you may decide that you don't want to copy the elements of the combo box but simply provide a view into them. You achieve this by returning an anonymous subclass of AbstractList.
List getAllItems(final JComboBox comboBox)
{  return new AbstractList()
      {  public Object get(int i)
         {  return comboBox.getItemAt(i);
         }
         public int size()
         {  return comboBox.getItemCount();
         }
      };
}
Of course, this is an advanced technique. If you employ it, be careful to document exactly which optional operations are supported. In this case, you must advise the caller that the returned object is an unmodifiable list.
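The same technique works for any data source that can be accessed by index. To keep the sketch independent of Swing, the following example wraps an int array instead of a combo box. That substitution, and the helper name asList, are assumptions made for illustration. The code uses generics, which postdate this text.

```java
import java.util.AbstractList;
import java.util.List;

class ArrayViewDemo
{
   // Returns an unmodifiable List view of an int array.
   // No elements are copied, so later changes to the array show through.
   static List<Integer> asList(final int[] values)
   {
      return new AbstractList<Integer>()
      {
         public Integer get(int i) { return values[i]; }
         public int size() { return values.length; }
      };
   }

   public static void main(String[] args)
   {
      int[] data = { 10, 20, 30 };
      List<Integer> view = asList(data);
      data[1] = 99;
      System.out.println(view); // [10, 99, 30]

      try
      {
         view.set(0, 1); // set is an optional operation that AbstractList rejects
      }
      catch (UnsupportedOperationException e)
      {
         System.out.println("The view is unmodifiable");
      }
   }
}
```

Only get and size are implemented. Everything else, including iteration, toString, and the exception thrown by set, is inherited from AbstractList.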

Legacy Collections

In this section, we discuss the collection classes that have existed in the Java programming language since the beginning: the Hashtable class and its useful Properties subclass, the Stack subclass of Vector, and the BitSet class.

The Hashtable Class

The classic Hashtable class serves the same purpose as the HashMap and has essentially the same interface. Just like methods of the Vector class, the Hashtable methods are synchronized. If you do not require synchronization or compatibility with legacy code, you should use the HashMap instead.
 

NOTE: The name of the class is Hashtable, with a lowercase t. Under Windows, you'll get strange error messages if you use HashTable, because the Windows file system is not case sensitive but the Java compiler is.


Enumerations

The legacy collections use the Enumeration interface for traversing sequences of elements. The Enumeration interface has two methods, hasMoreElements and nextElement. These are entirely analogous to the hasNext and next methods of the Iterator interface.

For example, the elements method of the Hashtable class yields an object for enumerating the values in the table: 

Enumeration e = staff.elements();
while (e.hasMoreElements())
{  Employee employee = (Employee)e.nextElement();
   . . .
}
You will occasionally encounter a legacy method that expects an enumeration parameter. The static method Collections.enumeration yields an enumeration object that enumerates the elements in the collection. For example,
// a sequence of input streams
ArrayList streams = . . .;
// the SequenceInputStream constructor expects an enumeration
SequenceInputStream in
   = new SequenceInputStream(Collections.enumeration(streams));
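Here is a complete sketch of this idiom that you can run. It uses in-memory byte array streams in place of real files, and the helper concatenate is a hypothetical name. The code uses generics and a try statement with resources, both of which postdate this text.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class EnumerationDemo
{
   // Reads the streams one after another and returns the combined
   // bytes as text, treating each byte as one character.
   static String concatenate(List<InputStream> streams)
   {
      // Collections.enumeration adapts the list to the legacy parameter type
      try (InputStream in
         = new SequenceInputStream(Collections.enumeration(streams)))
      {
         StringBuilder result = new StringBuilder();
         int b;
         while ((b = in.read()) != -1) result.append((char) b);
         return result.toString();
      }
      catch (IOException e)
      {
         throw new UncheckedIOException(e);
      }
   }

   public static void main(String[] args)
   {
      List<InputStream> streams = new ArrayList<InputStream>();
      streams.add(new ByteArrayInputStream(
         "Hello, ".getBytes(StandardCharsets.US_ASCII)));
      streams.add(new ByteArrayInputStream(
         "world".getBytes(StandardCharsets.US_ASCII)));
      System.out.println(concatenate(streams)); // Hello, world
   }
}
```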


NOTE: In C++, it is quite common to use iterators as parameters. Fortunately, in programming for the Java platform, very few programmers use this idiom. It is much smarter to pass around the collection than to pass an iterator. The collection object is more useful. The recipients can always obtain the iterator from it when they need it, plus they have all the collection methods at their disposal. However, you will find enumerations in some legacy code since they were the only available mechanism for generic collections until the collections framework appeared in the Java 2 platform.

java.util.Enumeration

  • boolean hasMoreElements()

  • returns true if there are more elements yet to be inspected.
  • Object nextElement()

  • returns the next element to be inspected. Do not call this method if hasMoreElements() returned false.
java.util.Hashtable
  • Enumeration keys()

  • returns an enumeration object that traverses the keys of the hash table.
  • Enumeration elements()

  • returns an enumeration object that traverses the elements of the hash table. 
java.util.Vector
  • Enumeration elements()

  • returns an enumeration object that traverses the elements of the vector.
Property Sets

A property set is a map structure of a very special type. It has three particular characteristics.

  • The keys and values are strings.
  • The table can be saved to a file and loaded from a file.
  • There is a secondary table for defaults.
The Java platform class that implements a property set is called Properties.

Property sets are useful in specifying configuration options for programs. The environment variables in Unix and DOS are good examples. On a PC, your AUTOEXEC.BAT file might contain the settings:

SET PROMPT=$p$g
SET TEMP=C:\Windows\Temp
SET CLASSPATH=c:\jdk\lib;.
Here is how you would model those settings as a property set in the Java programming language.
Properties settings = new Properties();
settings.put("PROMPT", "$p$g");
settings.put("TEMP", "C:\\Windows\\Temp");
settings.put("CLASSPATH", "c:\\jdk\\lib;.");
Use the store method to save this list of properties to a file. Here, we just print the property set to the standard output. The second argument is a comment that is included in the file.
settings.store(System.out, "Environment settings");
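The sample table gives output like the following. (The actual output also starts with comment lines that hold the supplied comment and the current date, and the store method escapes colons as well as backslashes.)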
CLASSPATH=c:\\jdk\\lib;.
TEMP=C:\\Windows\\Temp
PROMPT=$p$g
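The other two characteristics of property sets, loading from a file and the secondary table for defaults, can be tried out as follows. This sketch stores to a byte array instead of a file, and the helper roundTrip and the key EDITOR are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Properties;

class PropertiesDemo
{
   // Stores the settings into a byte array and loads them back
   // into a new Properties object.
   static Properties roundTrip(Properties settings)
   {
      try
      {
         ByteArrayOutputStream out = new ByteArrayOutputStream();
         settings.store(out, "Environment settings");
         Properties loaded = new Properties();
         loaded.load(new ByteArrayInputStream(out.toByteArray()));
         return loaded;
      }
      catch (IOException e)
      {
         throw new UncheckedIOException(e);
      }
   }

   public static void main(String[] args)
   {
      Properties defaults = new Properties();
      defaults.setProperty("TEMP", "C:\\Temp");

      Properties settings = new Properties(defaults); // the secondary table
      settings.setProperty("PROMPT", "$p$g");

      System.out.println(settings.getProperty("PROMPT"));         // $p$g
      System.out.println(settings.getProperty("TEMP"));           // C:\Temp
      System.out.println(settings.getProperty("EDITOR", "none")); // none

      Properties loaded = roundTrip(settings);
      System.out.println(loaded.getProperty("PROMPT"));           // $p$g
      // store writes only the primary table, so the default is not saved
      System.out.println(loaded.getProperty("TEMP"));             // null
   }
}
```

The getProperty method consults the defaults table only when the key is missing from the primary table.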
System information
Here's another example of the ubiquity of the Properties set: information about your system is stored in a Properties object that is returned by a method of the System class. Applications have complete access to this information, but applets that are loaded from a Web page do not--a security exception is thrown if they try to access certain keys. The following code prints out the key/value pairs in the Properties object that stores the system properties.
import java.util.*;

public class SystemInfo
{  public static void main(String[] args)
   {  Properties systemProperties = System.getProperties();
      // enum is a reserved word in later versions of the language
      Enumeration names = systemProperties.propertyNames();
      while (names.hasMoreElements())
      {  String key = (String)names.nextElement();
         System.out.println(key + "="
            + systemProperties.getProperty(key));
      }
   }
}
Here is an example of what you would see when you run the program. You can see all the values stored in this Properties object. (What you would get will, of course, reflect your machine's settings):
java.specification.name=
     Java Platform API Specification
awt.toolkit=sun.awt.windows.WToolkit
java.version=1.2.1
java.awt.graphicsenv=
     sun.awt.Win32GraphicsEnvironment
user.timezone=America/Los_Angeles
java.specification.version=1.2
java.vm.vendor=Sun Microsystems Inc.
user.home=C:\WINDOWS    
java.vm.specification.version=1.0
os.arch=x86
java.awt.fonts=
java.vendor.url=http://java.sun.com/
user.region=US
file.encoding.pkg=sun.io
java.home=C:\JDK1.2.1\JRE
java.class.path=.
line.separator=
java.ext.dirs=C:\JDK1.2.1\JRE\lib\ext
java.io.tmpdir=C:\WINDOWS\TEMP\
os.name=Windows 95
java.vendor=Sun Microsystems Inc.
java.awt.printerjob=
      sun.awt.windows.WPrinterJob
java.vm.specification.vendor=
            Sun Microsystems Inc.
sun.io.unicode.encoding=UnicodeLittle
file.encoding=Cp1252
java.specification.vendor=
               Sun Microsystems Inc.
user.language=en
user.name=Cay
java.vendor.url.bug=
  http://java.sun.com/cgi-bin/bugreport.cgi
java.vm.name=Classic VM
java.class.version=46.0
java.vm.specification.name=
       Java Virtual Machine Specification
sun.boot.library.path=C:\JDK1.2.1\JRE\bin
os.version=4.10
java.vm.version=1.2.1
java.vm.info=build JDK-1.2.1-A, 
            native threads, symcjit
java.compiler=symcjit
path.separator=;
file.separator=\
user.dir=C:\temp


NOTE: For security reasons, applets can only access a small subset of these properties.




[ This page was updated: 12-Jan-2000 ]
Sun Microsystems, Inc.
Copyright © 1995-2000 Sun Microsystems, Inc.
All Rights Reserved. Terms of Use. Privacy Policy