General Data Structures
[If you haven't read my previous post on General Trees, it might be a good idea to do so before reading further.]
A Binary Tree is a refined version of the General Tree that limits the maximum number of children each node can have - two, to be exact, hence the name. Child nodes are thus referred to as the Left and Right child nodes. The BinaryTree<T> class in NGenerics provides an implementation of a Binary Tree, which, when simplified, looks something like this :
We can represent a simple binary tree visually like this :
In addition to pre-order and post-order traversal discussed in my previous post, Binary Trees also support in-order traversal. An in-order traversal visits the left node, the root node and finally the right node. When supplied with an InOrderVisitor instance, tree nodes are visited in the order (7, 4, 8, 2, 5, 1, 3, 9, 6, 10) :
As with General Trees, the BinaryTree<T> class supports breadth first traversal and has ICollection<T> semantics. Binary Search Trees Search trees attempt to address the problem of finding items quickly. While not as efficient as Hash tables with their O(1) performance, they do cut down search times significantly, especially over larger data sets. Search trees are able to do this by adding additional constraints to tree data structures. A Binary Search Tree is the simplest of search trees with the following rules in place :
- Each value in the tree is distinct from any other (unique).
- Each left child node has a value smaller than the value of its parent.
- Each right child node has a value larger than the value of its parent.
The BinarySearchTree<TKey, TValue> class in NGenerics implements the IDictionary<TKey, TValue> interface for simple lookups based on keys, and supports the usual traversal algorithms. Re-using our example tree, and changing it to comply to the rules might result in the following tree (depending on the order the items are added to the tree) :
It's easy to see how a Binary Tree can cut down search times - when using a sequential data structure (array, list), or when manually searching an unordered tree like the plain BinaryTree, we potentially need to search through every item in the collection before we find it with a worst-case performance of O(n). With Binary Search Trees however, we can cut down that time significantly (especially when the tree grows large) by comparing node values with the search value and deciding on which branch to continue searching on - enough to make searching a Binary Trees on average an O(log n) operation. To illustrate, searching for the value "3" in our sample tree will result in 4 comparisons :
Although Binary Trees are not nearly as efficient as Hashtables with regards to searching (assuming that you've implement a proper hash function), Binary Search Trees (and most other search trees) provide an enormous benefit over Hashtables in that items can be traversed in sorted order without any preprocessing. An in-order traversal of a Binary tree will give us exactly that :
Binary Trees do, however, suffer from a couple of worst-case performance scenarios where they perform as badly as O(n) in searches. Adding items to the tree in ascending order will result in the following worst-case scenario :
Adding items in descending order will result in a tree that grows in the opposite direction, with the same devastating effect. Self-balancing trees have been introduced to compensate for these scenarios - but that's a topic for a future post. If the data you're pumping into the tree is random you'll probably never hit this situation - but if you have data with some order in it a self-balancing tree might be a better fit.