|
Conceptually a set does not allow duplicate elements. It is a collection of unique elements. Keep in mind that the elements in set need not be stored in an ordered fashion. Uniqueness is the most important characteristics of sets. The order is not important. The collection framework represents set with the Set interface. You can find three main implementers of this interface, HashSet, LinkedHashSet and TreeSet.
The HashSet class implements Set interface and internally uses a hash-table data structure to store the elements. This data structure use a special function called as hash code of an object to decide where (at what index) the object will be stored. For instance, if you are storing strings in the collection then hashcode function could be the length of the string. Thus a string with length 3 will be stored at 3rd location and so on. Note that this is a very simple example, the actual implementation of hash code function can be quite complex. Usually the hash code functions are very efficient and work in such a way that elements are evenly distributed in the available positions within the collection.
There is another important thing to remember about the storage using hashing. Since exactly where the element is stored depends on the hash code of that element, the elements are not ordered. In simple words, the first element might not be stored at the first location and so on.
You can use this class when you want a collection with no duplicates and when you do not really need to maintain order among the elements. HashSet only guarantees that each element will be unique. However, it does not remember the order in which the elements were inserted. Therefore, when you iterate over its elements, you will not necessarily get elements in the order they were inserted. For instance,
java.util.HashSet set = new java.util.HashSet();
set.add(new Integer(4));
set.add(new Integer(6));
set.add(new Integer(8));
set.add(new Integer(2));
set.add(new Integer(2));
System.out.println(set);
After executing this code, [8, 6, 4, 2] is printed. As you can see, the elements are unique, but they are not displayed in the order they are inserted. If you want to ensure the uniqueness as well as some kind of order among elements, Java provides you with two classes, LinkedHashSet and TreeSet.
The LinkedHashSet class implements the Set interface and uses the hash-table data structure to store the elements. This class extends the HashSet class to implement its ordered version. Internally it uses a doubly linked list to maintain order with which the elements are inserted. You can use this class instead of HashSet when you need to keep the insertion order. For instance,
java.util.LinkedHashSet set = new java.util.LinkedHashSet();
set.add(new Integer(4));
set.add(new Integer(6));
set.add(new Integer(8));
set.add(new Integer(2));
set.add(new Integer(2));
System.out.println(set); // prints [4, 6, 8, 2]
After executing this code, [4, 6, 8, 2] is printed. As you can see, the elements are unique and also they are displayed in the same order they are inserted. Another interesting thing about LinkedHashSet class is that you can create an ordered set from any Set implementation. Therefore, you can create a order copy of any Set implementations as:
Set s = aSet; //instance of any class implementing Set interface
Set copy = new LinkedHashSet(s);
In the above code, aSet can be instance of any class that implements Set interface.
The TreeSet implements the SortedSet interface. It maintains uniqueness among elements. In addition it keeps elements in sorted order. This order is a natural order among elements. For instance, in case of strings, the natural order would be alphabetical ascending order.
It internally uses an instance of TreeMap (we will see later in this chapter) which in turn uses a tree as underlying data structure. This data structure helps it to maintain the elements in a sorted order. In the following example, the strings are displayed in their ascending alphabetical order.
java.util.TreeSet jungleBook = new java.util.TreeSet();
jungleBook.add("mogualy");
jungleBook.add("seeta");
jungleBook.add("bhalu");
jungleBook.add("seeta");
System.out.println(jungleBook); // prints [bhalu, mogualy, seeta]
Note even if the string “seeta” is entered twice, it appears only once in a TreeSet. If you are adding Integer objects, the elements are stored in ascending order of their value.
|
|
In the alphabetical ascending order the upper case characters appear before the lower case character. For instance, if you put “Zebra” and “rabbit” in a TreeSet instance, “Zebra” will appear before “rabbit”. |
The natural order in a TreeSet is determined by either the Comparable or Comparator interfaces implemented by the elements in the collection.
|
|
Interface java.util.Comparator defines a comparison method (int compare(Object o1, Object o2) ) that imposes the natural order on the collection of objects. This method is usually implemented in such a way that it returns 0 if two objects are equal. It returns negative number if first object is smaller than the second object. Comparators are used to control the order of certain data structures (such as TreeSet or TreeMap). Alternatively the elements in the collection can implement another interface java.lang.Comparable to impose the order. This interface imposes a natural order on the objects of a class that implements it. The elements need to implement the compareTo() method (int compareTo(Object o) ) of this interface for performing the comparison . |
By default, the TreeSet stores most of the elements in ascending order, but its use is not limited to only ascending order. Optionally, you can construct a TreeSet that allows you to specify what the natural order should be rather than relying on the default ordering defined by the elements’ class. For instance, if you wish to sort the Integer objects in descending order, you can define your own Comparator instance as:
java.util.Comparator desc_Comparator = new java.util.Comparator() {
public int compare(Object obj1, Object obj2) {
int val1 = Integer.parseInt(obj1.toString());
int val2 = Integer.parseInt(obj2.toString());
int result = - (val1 – val2); // For descending order
return result;
}
};
The anonymous inner class is defined which implements the Comparator interface by defining the compare() method. This method compares its two arguments for order. It usually returns a negative integer, zero, or a positive integer as the first argument is less than, equal to, or greater than the second. But since we want descending order, we negated the result. Now you can pass this comparator instance to the constructor of TreeSet to ask it to sort elements in descending order as:
java.util.TreeSet set = new java.util.TreeSet(desc_Comparator);
set.add(new Integer(4));
set.add(new Integer(6));
set.add(new Integer(8));
set.add(new Integer(2));
set.add(new Integer(2));
System.out.println(set); // prints [8, 6, 4, 2]
|
|
You need not know the exact implementation details of TreeSet (such as the details of implementing the Comparator interface or the details of how the class TreeSet internally maintains the order). But you need to know the difference between TreeSet and other implementers of the Set interface. |
Table 14.2 compares the three implementations of the Set interface. This comparison would be easier to remember when you take into account the underlying data structure used for storing elements.
Table 14.2 Comparing three implementations of Set interface
|
Classes implementing Set |
Underlying data structure |
Pros |
Cons |
|
HashSet |
hash-table |
1. Useful when you want to impose only uniqueness among elements. 2. Faster than other Set implementation as it does not need to maintain any order. |
1. Do not guarantee any order among elements. 2. Not thread –safe.* |
|
LinkedHashSet |
hash-table and doubly linked list |
1. Maintains the insertion order. |
1. Slower than HashSet as it maintains the insertion order. 2. Not thread –safe.* |
|
TreeSet |
tree |
1. Maintains the natural order among elements. 2. You can impose different order by passing your own Comparator. |
1. Not thread –safe.* |
|
* |
None of the Set implementers are thread safe. If you want thread safe set, you can use the utility method synchronizedSet() of Collections class. We will see this class later in this chapter.
|