Wednesday, March 12, 2008

IsBad(Enum.IsDefined) == true

There isn't much to say about Enum.IsDefined that hasn't been said before. Much of the confusion about why this method is bad stems from the fact that many people don't understand how Enum types have been implemented in .NET. So let's start by reviewing .NET's Enum implementation.
Consider the following enum:

enum Animal
{
    cat = 0,
    dog,
    bird,
    cow,
    pig,
}

...we've set cat to be equivalent to integer value 0, so dog will be 1, bird is 2, etc. .NET allows us to specify enum values explicitly (as has been done with cat), or have them be implied (as has been done with all of the other values in the enum). This is sort of nice; we can now write code that looks like:

Animal hopefullyIAmACat = (Animal)0;
Console.WriteLine(hopefullyIAmACat);

This prints:

cat

Super. We've successfully taken an integer value and converted it to our enum type.

Now for something completely different:

Animal notSoSureAboutThisAnimal = (Animal)(-1);
Animal orThisOne = (Animal)5;
Animal definitelyNotSureAboutThisAnimal = (Animal)0.0M;
Animal ohNoes = (Animal)1.1;

Console.WriteLine(notSoSureAboutThisAnimal);
Console.WriteLine(orThisOne);
Console.WriteLine(definitelyNotSureAboutThisAnimal);
Console.WriteLine(ohNoes);

To the surprise of many, this not only compiles, but runs exception free:

-1
5
cat
dog

You might be thinking this is a bug, but this is by design. To get at why such a design decision was made, we need to fully understand C#'s enum implementation. From the C# language specification:
11.1.9 Enumeration types
An enumeration type is a distinct type with named constants. Every enumeration type has an underlying type, which shall be byte, sbyte, short, ushort, int, uint, long or ulong. Enumeration types are defined through enumeration declarations (§21.1). The direct base type of every enumeration type is the class System.Enum. The direct base class of System.Enum is System.ValueType.


There are also specific sections which explain implicit and explicit conversions with respect to enumerations:
13.1.3 Implicit enumeration conversions
An implicit enumeration conversion permits the decimal-integer-literal 0 to be converted to any enum-type.

13.2.2 Explicit enumeration conversions
The explicit enumeration conversions are:
• From sbyte, byte, short, ushort, int, uint, long, ulong, char, float, double, or decimal to any enum-type.
• From any enum-type to sbyte, byte, short, ushort, int, uint, long, ulong, char, float, double, or decimal.
• From any enum-type to any other enum-type.
An explicit enumeration conversion between two types is processed by treating any participating enum-type as the underlying type of that enum-type, and then performing an implicit or explicit numeric conversion between the resulting types. [Example: Given an enum-type E with an underlying type of int, a conversion from E to byte is processed as an explicit numeric conversion (§13.2.1) from int to byte, and a conversion from byte to E is processed as an implicit numeric conversion (§13.1.2) from byte to int.]


By definition, enums can be explicitly cast to and from almost all of .NET's fundamental value types, so casting 0.0M to my Animal enum is processed as a conversion from decimal to int. The fractional part gets hacked off, resulting in a cat.
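To make the "treat the enum as its underlying type" rule concrete, here is a small sketch. The ConversionDemo class and its two helper methods are names I made up for illustration; only the Animal enum comes from the example above.

```csharp
using System;

enum Animal { cat = 0, dog, bird, cow, pig }

// Hypothetical demo class; the names below are illustrative only.
static class ConversionDemo
{
    // decimal -> Animal is processed as decimal -> int (the underlying type),
    // which drops the fractional part.
    public static Animal FromDecimal(decimal value) => (Animal)value;

    // The same conversion for double, written out in two visible steps.
    public static Animal FromDouble(double value)
    {
        int underlying = (int)value;   // explicit numeric conversion, truncates toward zero
        return (Animal)underlying;     // the int is then treated as an Animal
    }

    static void Main()
    {
        Console.WriteLine(FromDecimal(0.0M));  // cat
        Console.WriteLine(FromDecimal(1.9M));  // dog (1.9 truncates to 1)
        Console.WriteLine(FromDouble(2.7));    // bird
        Console.WriteLine(FromDouble(-1.5));   // -1, since no named constant has that value
        Console.WriteLine((int)Animal.cow);    // 3
    }
}
```

Nothing in either direction checks that the resulting number matches a named constant, which is why -1 and 5 sail through in the earlier snippet.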

This malleability raises a couple of huge questions. Brad Abrams makes this point:
It’s a known issue that adding values to enums is bad (from a breaking change perspective), WHEN someone is exhaustively switching over that enum.

Case in point: assume someone is switching over my Animal enum, and writes the following code:

switch (ohNoes)
{
    case Animal.pig:
        break;
    case Animal.dog:
        break;
    case Animal.cow:
        break;
    case Animal.cat:
        break;
    default:
        // This one MUST be a bird!
        break;
}

...of course, it won't be a bird when someone changes the enum to include an elephant entry; suddenly default maps to two values. Also, more importantly, in the above code default is a catch-all! Everything that doesn't fall into the range (negative one, for example) is going to hit the default case.
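One way to sidestep the "default must be a bird" trap is to give every known value its own case and reserve default for values the code has never heard of. This is only a sketch: SwitchDemo, Describe, and the animal noises are all invented for illustration.

```csharp
using System;

enum Animal { cat = 0, dog, bird, cow, pig }

// Hypothetical demo class; the names below are illustrative only.
static class SwitchDemo
{
    // Every known value is handled explicitly. default no longer means "bird";
    // it means "a value this code was not written to handle".
    public static string Describe(Animal animal)
    {
        switch (animal)
        {
            case Animal.cat:  return "meow";
            case Animal.dog:  return "woof";
            case Animal.bird: return "tweet";
            case Animal.cow:  return "moo";
            case Animal.pig:  return "oink";
            default:
                throw new ArgumentOutOfRangeException(
                    nameof(animal), animal, "Unhandled Animal value.");
        }
    }

    static void Main()
    {
        Console.WriteLine(Describe(Animal.bird));   // tweet

        try
        {
            Describe((Animal)(-1));
        }
        catch (ArgumentOutOfRangeException)
        {
            Console.WriteLine("rejected -1");
        }
    }
}
```

If elephant shows up later, this version fails loudly instead of quietly treating it as a bird.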

This brings us back around to Enum.IsDefined, which returns true if the supplied value is defined by the Enum. Writing some code like so is very tempting:

if (!Enum.IsDefined(typeof(Animal), 5))
    throw new InvalidEnumArgumentException();

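...but this is, again, fraught with peril. Enum.IsDefined checks the value against the Animal type as it exists at run time, not as it existed when this code was written. What if someone later adds elephant to our enum? The check will happily let it through, so the code following this check still needs to be capable of dealing with elephant or any future values that may be defined in the enum.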
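A possible alternative is to validate against the values the code actually knows how to handle. The sketch below assumes the known values are contiguous (cat through pig); ValidationDemo, IsSupported, and Validate are hypothetical names, not anything from the framework.

```csharp
using System;

enum Animal { cat = 0, dog, bird, cow, pig }

// Hypothetical demo class; the names below are illustrative only.
static class ValidationDemo
{
    // Accepts only the range this code was written for. The bounds name
    // specific members, so adding elephant to the enum later does not
    // widen what this check accepts.
    public static bool IsSupported(Animal animal)
    {
        return animal >= Animal.cat && animal <= Animal.pig;
    }

    public static void Validate(Animal animal)
    {
        if (!IsSupported(animal))
        {
            throw new ArgumentOutOfRangeException(
                nameof(animal), animal, "Unsupported Animal value.");
        }
    }

    static void Main()
    {
        Console.WriteLine(IsSupported(Animal.dog));    // True
        Console.WriteLine(IsSupported((Animal)5));     // False
        Console.WriteLine(IsSupported((Animal)(-1)));  // False
    }
}
```

The trade-off is that someone has to revisit the check when the enum grows, but that is the point: new values get in only after a human has decided the code can handle them.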

Furthermore, Enum.IsDefined is pricy. And by pricy, I mean all sorts of reflection and boxing and junk under the covers. I found this call being used in SharpMap, and removing it resulted in some very respectable performance gains in a tight loop used to parse some binary data:
I tested your (suggestion to remove Enum.IsDefined) with the method below. First with the polygons countries.shp that is in the DemoWebSite App_Data. This was 16% faster. Then with points in cities.shp. This was 85% faster, nice.

Pretty clear win: the project loses some potential versioning dilemmas, and gains some sizable performance in one of the most heavily used routines.
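If you want to get a feel for the cost on your own machine, a crude Stopwatch loop is one place to start. To be clear, this is not how the SharpMap numbers above were obtained (those came from profiling real code); it is just a rough sketch, TimingDemo and its methods are made-up names, and the timings will vary with runtime version and hardware.

```csharp
using System;
using System.Diagnostics;

enum Animal { cat = 0, dog, bird, cow, pig }

// Hypothetical demo class; the names below are illustrative only.
static class TimingDemo
{
    // Counts how many of the values 0..7 (repeated) Enum.IsDefined accepts.
    public static int CountWithIsDefined(int iterations)
    {
        int hits = 0;
        for (int i = 0; i < iterations; i++)
        {
            // The int argument is boxed to object on every call.
            if (Enum.IsDefined(typeof(Animal), i % 8)) hits++;
        }
        return hits;
    }

    // Same count using a plain comparison against the known range.
    public static int CountWithRangeCheck(int iterations)
    {
        int hits = 0;
        for (int i = 0; i < iterations; i++)
        {
            int raw = i % 8;
            if (raw >= (int)Animal.cat && raw <= (int)Animal.pig) hits++;
        }
        return hits;
    }

    static void Main()
    {
        const int iterations = 1000000;

        Stopwatch sw = Stopwatch.StartNew();
        int a = CountWithIsDefined(iterations);
        sw.Stop();
        Console.WriteLine("Enum.IsDefined: {0} ms ({1} hits)", sw.ElapsedMilliseconds, a);

        sw = Stopwatch.StartNew();
        int b = CountWithRangeCheck(iterations);
        sw.Stop();
        Console.WriteLine("Range check:    {0} ms ({1} hits)", sw.ElapsedMilliseconds, b);
    }
}
```

A loop like this ignores JIT warm-up and other noise, so treat the output as a hint rather than a measurement; a profiler run against your actual code path tells you far more.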

Moral of the story: enums are not objects that handle versioning well. I personally believe they should only be used where the enumeration is clear and not likely to be extended in the future. Using Enum.IsDefined should be avoided.

2 comments:

Anonymous said...

You mention that Enum.IsDefined is not performant (uses Reflection and Boxing). How do you know this? How do you measure the performance of this and see what is going on? The reason I ask, is that I don't want to blindly implement an alternative to Enum.IsDefined that may actually perform worse!

Also, does .NET 4.0 offer any new solutions that you are aware of?

kidjan said...

@anon,

I know this because I measured performance before and after with a profiler. All of the details are above.

You'd have to be a pretty terrible programmer to make something that performed worse. Seriously.

I don't know if .NET 4.0 offers any viable alternatives; it's been a few years since I've actively done .NET development.