Answer to Dictionary<> fun trivia

Here is an answer to my previous post about Dictionary<>.

The answer is yes, it is possible that added instance can't be found later on. Consider this, a bit modified, piece of code:

class Program { static void Main(string[] args) { Dictionary<Tubo, object> dictionary = new Dictionary<Tubo, object>(); Tubo tubo = new Tubo(); dictionary.Add(tubo, new object()); // some code here and there tubo.Value = 1; bool keyExists = dictionary.ContainsKey(tubo); Console.WriteLine(keyExists); Console.ReadLine(); } } public class Tubo { public int Value; }

Will I get true as an output? Yes, I will. Now, let's take a look at this, further enhanced, piece of code (only class Tubo as class Program remains same):

public class Tubo { public int Value; public override bool Equals(object obj) { Tubo other = obj as Tubo; if (other == null) return false; if (!base.GetType().Equals(obj.GetType())) return false;

return this.Value.Equals(other.Value);

} public override int GetHashCode() { return Value.GetHashCode(); } }

What I did here is that I've override GetHashCode to return internal Value's one instead of hash code generated out from instance reference by default. I've also implemented Equals operator to check for the equality based on type and  value match.

What output do I get now? The output is false. But let's add keys instance comparison to the output before explaining the result.

static void Main(string[] args) { Dictionary<Tubo, object> dictionary = new Dictionary<Tubo, object>(); Tubo tubo = new Tubo(); dictionary.Add(tubo, new object()); // some code here and there tubo.Value = 1; bool keyExists = dictionary.ContainsKey(tubo); Console.WriteLine("KeyExists: " + keyExists); // get enumerator for keys Dictionary<Tubo, object>.KeyCollection.Enumerator enumerator = dictionary.Keys.GetEnumerator(); // move to the first record (there is only one key anyway) enumerator.MoveNext(); // reference the "original" key Tubo originalKey = enumerator.Current; Console.WriteLine("Equality operation:" + (tubo == originalKey)); Console.ReadLine(); }

And the output is:

KeyExists: False
Equality operation:True

Funny, isn't it?

The answer lies in the way that Dictionary works. When you add a key it firsts obtain its hash code (by calling key.GetHashCode()). Then it uses the obtained hash code to pick a bucket where the key will be stored. Note that hash code isn't a unique value – it is used only to pick the bucket to speed up the search. Once the bucket is picked or created, the key-value pair is stored there. You see where I am going? The search works in similar way. First, the hash code is obtained from key value being searched, then it is used to pick the bucket and finally the key.Equals method is used to compare the instances. Let's modify further method Main to see the root of the problem:

tic void Main(string[] args) { Dictionary<Tubo, object> dictionary = new Dictionary<Tubo, object>(); Tubo tubo = new Tubo(); // obtain original hashcode Console.WriteLine("Original hashcode: " + tubo.GetHashCode()); dictionary.Add(tubo, new object()); // some code here and there tubo.Value = 1; // obtain modified hashcode Console.WriteLine("New hashcode: " + tubo.GetHashCode()); bool keyExists = dictionary.ContainsKey(tubo); Console.WriteLine("KeyExists: " + keyExists); // get enumerator for keys Dictionary<Tubo, object>.KeyCollection.Enumerator enumerator = dictionary.Keys.GetEnumerator(); // move to the first record (there is only one key anyway) enumerator.MoveNext(); // reference the "original" key Tubo originalKey = enumerator.Current; Console.WriteLine("Equality operation:" + (tubo == originalKey)); Console.ReadLine(); }

The output is not that surprising:

Original hashcode: 0
New hashcode: 1
KeyExists: False
Equality operation:True

It is obvious that two hashcodes don't match and thus Dictionary fails to pick the proper bucket when key is used for search operation. So here is the answer to my trivia. Note also that in this case hashcodes are the same as original values. This is not a rule – it just happens in our sample. Also note that there can be more values that corresponds to same hashcode as hashcode is an int and it has a fixed range of values. This means that our sample application might actually work sometimes, when a proper int value (one that has the same hashcode as the original) is assigned to tubo.Value. This would be a hard problem to spot.

Anyway, the morale of this story is that you should override GetHashCode only if really needed and if you do so, you have to know what you are doing very well. Another lesson is that it is a good thing to know how the stuff really works behind the curtains.

2 thoughts on “Answer to Dictionary<> fun trivia

  1. Yes, hash code is used when adding to Dictionary. This is because Dictionary class is implemented as hash table.

    Note that:
    1. Types that override Equals must also override GetHashCode; otherwise, Hashtable might not work correctly. (http://msdn2.microsoft.com/en-us/library/bsc2ak47.aspx)
    2. If two objects compare as equal, the GetHashCode method for each object must return the same value. However, if two objects do not compare as equal, the GetHashCode methods for the two object do not have to return different values. (http://msdn2.microsoft.com/en-us/library/system.object.gethashcode.aspx)

    So, if you override Equals, you should also override GetHashCode.

Leave a Reply