Wednesday, June 16, 2010
SortedSet vs HashSet
HashSet<T> is very good at add and search operations. Any search operation (Contains, Remove, and similar operations) are O(1). That's great. However, on the minus side, the HashSet<T> is not a sorted collection. Therefore, enumerating the elements in a sorted order forces you to copy the items to a different collection (like a List<T>) and sort the resulting list. You could construct a LINQ query to order the elements, however internally that query will likely use some form of temporary storage to create the sorted sequence. That means every sort will be an expensive operation. Sort is typically an O(n ln n) operation, Also, because the HashSet<T> does not have a sort method, you'll also have increased memory pressure and time cost to copy the elements.
SortedSet is new to .NET 4.0 System.Collections.Generic namespace. SortedSet<T> has different characteristics. The sorted set ensures that the elements in the set are always in sorted order. Every Add operation places the new element in the correct location in the set. That means Add is an O(ln n) operation. The SortedSet<T> must perform a binary search to find the correct location for the new element. The search happens on any of the search actions (Contains, Remove, etc). Those operations also have an O(ln n) performance characteristic. That sounds like the SortedSet<T> is always slower than the HashSet<T>. No one would use it if it was always slower. SortedSet<T> is much faster for iterating the set in sorted order. It's already in the correct order, so the enumeration becomes an O(n) operation.
Conclusion
SortedSet<T> will typically be faster than HashSet<T> when the majority of your operations require enumerating the set in one particular order. If, instead, most of the operations are searching, you'll find better performance using the HashSet<T>. The frequency of insert operations also has an effect on which collection would be better. The more frequently insert operations occur, the more likely HashSet<T> will be faster.
Subscribe to:
Posts (Atom)
Be A Developer That Uses AI
Developers will not be replaced by AI, they'll be replaced by developers that use AI. Generative AI tools are revolutionizing the way de...
-
I like NLog because it is probably the easiest logging framework I used. By simply copying NLog.config file to the project and set the ...
-
Recently ran into OutOfMemoryException from a .NET 3.0 WCF web service whenever the w3wp.exe reaches ~1.395 GB memory. WCF web service is ho...
-
Mutable Mutable is the most common collection type in the .NET world. These are collections such as List ; that allow reading, as...