After reading yesterday’s post, one of my coworkers asked me “so what exactly is a hash and how does it work”. Fair question, in my post I concentrated more on how to use the dictionary than what it was, so it does deserve a little explanation.
When you enter a Key as part of the Key/Value pair, you use something human readable like “ArcaneCode”. While good for people, it’s very inefficient for the computer. Thus the .Net Framework runs your key through a sophisticated algorithm called a “hash”.
The output of this hash algorithm is a number, and that number is what the framework uses for the true key to your data. Thus “ArcaneCode” might produce a hash of 1234. It’s very fast for the computer to find your data when it can compare numbers, hence the usefulness of a hash.
The pseudo code below shows the general process a hash table goes through when retrieving your data.
- Key is passed in.
- Key is converted to a hash.
- The collection is searched for a matching hash.
- If found, the value in the dictionary associated with the hashed key is returned.
Hashes are unique, no two strings will ever yield the same hash value. Additionally, they are case sensitive. “ArcaneCode”, “ARCANECODE”, “arcanecode” and “aRCANEcODE” all produce different hash values.
The concept of hash tables are not unique to the .Net Framework, many languages have them, it’s just .Net hides the ugliness and makes them easy to use.
Came in looking for the virtual pc / ubuntu 6.10 guide. Even having problems with text mode. Looks like I have have to go back to vmware 😦
Anyway, a hash function in a hashtable is seldom truly unique. It would be way too slow. So collisions will occur. Check out this nice intro: http://en.wikipedia.org/wiki/Hash_map
Nice blog btw 🙂