Java / Set and its implementations
One of the HashSet overloaded constructors accept ArrayList as an argument to create Set out of it. Passing the AsList method view of the array to the HashSet constructor, the Set can be created.
Set<Integer> mySet = new HashSet(Arrays.asList(1,4,3,4,4));
When you put an object into a HashSet, it uses hashcode value to determine where to put the object in the set. It also compares the objects hashcode value to other object's hashcode in the hashset. But two objects having same hashcode might not be equal. If the hashcode of two objects are equal then hashset uses equal() to see if the hashcode matched objects are really equal. And if they are equal the hashset knows that the new object is duplicate of something exist in the HashSet. And the add does not happen. The add() of hashcode returns false.
- EnumSet.
- HashSet.
- LinkedHashSet.
- TreeSet.
The above classes are defined in java.util package.
Use retainAll method that retains only the elements in the set that are contained in the specified collection (optional operation).
In other words, removes from this set all of its elements that are not contained in the specified collection. If the specified collection is also a set, this operation effectively modifies this set so that its value is the intersection of the two sets.
Using removeAll method we can find the difference of two sets.
HashSet stores the object in random order. There is no guarantee that the element we inserted first in the HashSet will be printed first in the output . TreeSet maintains the elements in sorted order.
HashSet can store null object while TreeSet does not allow null object. If one try to store null object in TreeSet object , it will throw NullPointerException.
HashSet are internally backed by hashmap. While TreeSet is backed by a Navigable TreeMap.
HashSet take constant time performance for the basic operations like add, remove contains and size.While TreeSet guarantees log(n) time cost for the basic operations (add,remove,contains).
A LinkedHashSet is ordered as it maintains a doubly-linked List across all elements. When you iterate through a HashSet the order is unpredictable, while a LinkedHashSet lets you iterate through the elements in the order in which they were inserted.
LinkedHashSet is the Hashtable and linked list implementation of the Set interface with predictable iteration order.
The linked list defines the iteration ordering, which is the order in which elements were inserted into the set. Insertion order is not affected if an element is re-inserted into the set.
LinkedHashSet is the subclass of HashSet class.
HashSet does not maintain order and returns random order every time we retrieve elements.In TreeSet elements are naturally sorted but there is increased cost associated to it.
LinkedHashSet maintains the order of elements without incurring increased cost.
LinkedHashSet is introduced in JDK 1.4 while HashSet and TreeSet are introduced in JDK 1.2.
The capacity is the number of buckets in the set, and the initial capacity is the capacity at the time HashSet is created. Each bucket represent a key value pair.
The load factor is a measure of how full the Hash table is allowed to get before its capacity is automatically increased.
The default value of initial capacity and load factor is 16 and 0.75f respectively.
LinkedHashSet is an extended version of HashSet. HashSet does not maintain order while LinkedHashSet maintains insertion order. HashSet uses HashMap object internally to store its elements while LinkedHashSet uses LinkedHashMap object internally.
There are 4 constructors in LinkedHashSet class. All these constructors simply call super class constructor i.e., constructor of HashSet class.
LinkedHashSet inherits the uniqueness property from LinkedHashMap as internally LinkedHashMap gets created indirectly when LinkedHashSet is created.
LinkedHashSet.
Implementation: LinkedHashSet is backed by linked list and LinkedHashMap, TreeSet is implemented as TreeMap till Java 5 and now using NavigableMap from Java 6 onward, and HashSet is also backed by HashMap in Java.
Synchronization: HashSet, TreeSet, and LinkedHashSet are not synchronized.
Ordering of elements: HashSet does not guarantee any order, while TreeSet sorts all object based upon there natural ordering by using compareTo() method, or custom order by using compare() method Comparator passed to them. LinkedHashSet also provides ordering support to keep elements in the order they are added into Collection.
Element as null: Both HashSet and LinkedHashSet allow null element while TreeSet does not allow null elements and throws java.lang.NullPointerException when you try to add a null object.
Iterator: Iterator returned by all three Set implementations are fail-fast, which means if you modify collection once iteration begins i.e. add or delete elements without using Iterator's remove method, it will throw ConcurrentModificationException.
Performance: HashSet is the fastest for common operation, for example, add, search and remove, LinkedHashSet is close second, as it suffers a little drop in performance due to overhead of maintaining doubly linked list when an element is inserted or deleted. TreeSet is much slower than these two because it needs to perform sorting every time there is change in TreeSet.
When HashSet object is created, it creates a HashMap object internally. When an element is passed to Set, it is added as a key in the HashMap by add(Element e) method. As HashMap maintains key value pair, a value needs to be associated to the key. Java uses a Dummy value (new Object) which is called PRESENT in HashSet.
Perform remove before adding the element into the HashSet.
if(!myHashSet.add(myObject)) { myHashSet.remove(myObject); myHashSet.add(myObject); }
Java HashSet handle Hash collisions automatically, however it is important to override both the equals and the hashcode methods, as both methods are utilized by Set to differentiate duplicate or unique entries.
CopyOnWriteArraySet is a Set implementation backed up by a copy-on-write array. All mutative operations, such as add, set, and remove, are implemented by making a new copy of the array; no locking is ever required.
CopyOnWriteArraySet is best suited as read-only collection whose size is small enough to copy .CopyOnWriteArraySet is backed by CopyOnWriteArrayList.
CopyOnWriteArraySet iterators do not support the mutative remove operation.
To create a clone or copy of the Set object, HashSet internally uses shallow copy in clone() method, the elements themselves are not cloned.
There is no get method in HashSet as Set is not index based. Use the contains method to determine if the object exists or not, if exists use the Iterator to retrieve element.
Collections.emptySet() returns the empty immutable Set, not containing null.
TreeSet does NOT use hashCode. It uses either compareTo or the Comparator passed to the constructor. This is used by methods like contains to find objects in the set.