Take this snippet of code:
var strings = new List<String>()
    { "a", "b", "b", "c", "d" };
foreach (string s in strings.Distinct())
{
    Console.WriteLine(s);
}
What do you expect when you run it?
a
b
c
d
As expected, the Distinct() extension method returned an IEnumerable<string> with the extra b removed.
Now take this snippet:
public class Foo
{
    public Foo(string bar)
    {
        this.Bar = bar;
    }
    public string Bar
    {
        get;
        set;
    }
}
And...
var foos = new List<Foo>()
    { new Foo("a"), new Foo("b"), new Foo("b"), new Foo("c"), new Foo("d") };
foreach (Foo f in foos.Distinct())
{
    Console.WriteLine(f.Bar);
}
Running this should return the exact same result, right?
a
b
b
c
d
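Alas, it does not. The reason is that Distinct() compares items with the default equality comparer for T, which calls T.Equals() and T.GetHashCode().
String overrides Equals() to compare values. A class such as Foo that does not override Equals() inherits Object.Equals(), which compares references, so two Foo instances with the same Bar are still treated as different.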
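To make the difference concrete, here is a minimal sketch. Foo mirrors the class above, and EqualsDemo is a hypothetical helper name used only for illustration.

```csharp
using System;

public class Foo
{
    public Foo(string bar) { Bar = bar; }
    public string Bar { get; set; }
}

// Hypothetical helper, not part of the original post.
public static class EqualsDemo
{
    // Two separately built strings holding the same characters.
    public static bool StringsEqual()
    {
        string x = "b";
        string y = new string('b', 1);
        return x.Equals(y); // String.Equals compares values: true
    }

    // Two separately built Foo instances with the same Bar.
    public static bool FoosEqual()
    {
        var x = new Foo("b");
        var y = new Foo("b");
        return x.Equals(y); // Object.Equals compares references: false
    }
}
```

Because Foo never overrides Equals(), the second call sees two different objects even though their Bar values match.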
This can be surprising to people at first, especially when working with common objects like a DataRow. However, Distinct() has an overload that takes an IEqualityComparer<T>.
To use this overload, you either pass in one of the premade comparer classes that come with the .NET Framework (StringComparer, for example), or you create your own comparer class and pass it in.
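As a rough illustration of the premade route, the sketch below passes StringComparer.OrdinalIgnoreCase to Distinct(). The PremadeComparerDemo class and the sample data are assumptions made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the original post.
public static class PremadeComparerDemo
{
    public static List<string> DistinctIgnoringCase()
    {
        var strings = new List<string> { "a", "B", "b", "c", "d" };

        // StringComparer.OrdinalIgnoreCase implements IEqualityComparer<string>,
        // so "B" and "b" count as the same item.
        return strings.Distinct(StringComparer.OrdinalIgnoreCase).ToList();
    }
}
```

With this comparer the five input strings collapse to four, because the two spellings of "b" are treated as duplicates.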
This was odd to me, because passing in a comparer class is very .NET 1.1 style, and I was surprised that you can't just pass in a comparison function as a lambda.
So I created an overload of Distinct() that takes a lambda as its argument.
Place these two classes in a file in your project:
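using System;
using System.Collections.Generic;
using System.Linq;

public static class Extensions
{
    // Overload of Distinct() that takes the equality test as a lambda.
    public static IEnumerable<T> Distinct<T>(this IEnumerable<T> source, Func<T, T, bool> comparer)
    {
        return source.Distinct(new DelegateComparer<T>(comparer));
    }
}
// Adapts a lambda to the IEqualityComparer<T> interface.
public class DelegateComparer<T> : IEqualityComparer<T>
{
    private Func<T, T, bool> _equals;
    public DelegateComparer(Func<T, T, bool> equals)
    {
        this._equals = equals;
    }
    public bool Equals(T a, T b)
    {
        return _equals(a, b);
    }
    public int GetHashCode(T a)
    {
        return a.GetHashCode();
    }
}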
And now we can call Distinct() and pass it a comparer as a lambda expression:
foos.Distinct((a,b) => (String.Compare(a.Bar,b.Bar) == 0))
Much better. Now the code is easy to read, and we don't have to create custom IEqualityComparer<T> classes for all of our objects.
There is one caveat: Distinct() also uses GetHashCode(). It only calls Equals() on items whose hash codes match, so two items that the lambda treats as equal must also return the same hash code. In this example, I simply overrode GetHashCode() in Foo to return Bar.GetHashCode().
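The override itself is not shown in this post, so the following is only a sketch of what it might look like, assuming Bar is never null. BarComparer is a hypothetical stand-in for a comparer that, like DelegateComparer, defers to the object's own GetHashCode().

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Foo
{
    public Foo(string bar) { Bar = bar; }
    public string Bar { get; set; }

    // Assumption: Bar is never null. Equal Bar values now hash alike.
    public override int GetHashCode()
    {
        return Bar.GetHashCode();
    }
}

// Hypothetical comparer: equality by Bar, hash code taken from the object.
public class BarComparer : IEqualityComparer<Foo>
{
    public bool Equals(Foo a, Foo b) { return a.Bar == b.Bar; }
    public int GetHashCode(Foo a) { return a.GetHashCode(); }
}

public static class HashDemo
{
    public static int CountDistinct()
    {
        var foos = new List<Foo>
            { new Foo("a"), new Foo("b"), new Foo("b"), new Foo("c"), new Foo("d") };

        // The two "b" items share a hash code, so Equals() gets a chance
        // to compare them and the duplicate is dropped.
        return foos.Distinct(new BarComparer()).Count();
    }
}
```

Without the GetHashCode() override, the two "b" instances would most likely land in different hash buckets and never be compared at all.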
However, for more flexibility, we can modify DelegateComparer to take a second lambda for the hash method.
var distinctFoos =
    foos.Distinct
    (
        (a, b) => (String.Compare(a.Bar, b.Bar) == 0),
        (a) => a.Bar.GetHashCode()
    );
Now that is a thing of beauty. The purpose of the method is easy to understand, and it is infinitely flexible.
The final version of the extension method and IEqualityComparer is below:
using System;
using System.Collections.Generic;
using System.Linq;

public static class Extensions
{
    // Equality lambda only; hashing falls back to the item's own GetHashCode().
    public static IEnumerable<T> Distinct<T>(this IEnumerable<T> source, Func<T, T, bool> comparer)
    {
        return source.Distinct(new DelegateComparer<T>(comparer));
    }
    // Equality lambda plus a matching hash lambda.
    public static IEnumerable<T> Distinct<T>(this IEnumerable<T> source, Func<T, T, bool> comparer, Func<T, int> hashMethod)
    {
        return source.Distinct(new DelegateComparer<T>(comparer, hashMethod));
    }
}
public class DelegateComparer<T> : IEqualityComparer<T>
{
    private Func<T, T, bool> _equals;
    private Func<T, int> _getHashCode;
    public DelegateComparer(Func<T, T, bool> equals)
    {
        this._equals = equals;
    }
    public DelegateComparer(Func<T, T, bool> equals, Func<T, int> getHashCode)
    {
        this._equals = equals;
        this._getHashCode = getHashCode;
    }
    public bool Equals(T a, T b)
    {
        return _equals(a, b);
    }
    public int GetHashCode(T a)
    {
        // Use the supplied hash lambda if there is one.
        if (_getHashCode != null)
            return _getHashCode(a);
        else
            return a.GetHashCode();
    }
}
Drop me a comment if you found these snippets useful.
Posted by Jonathan Holland on 1/28/2009.