August 9, 2010

Cleaning up my data access layer with enumerations, attributes, reflection and generics... and caching

Today's blog post is a follow-up to my previous post on using reflection and enum attributes to add flexibility to your DAL while reducing your code base by a helper class per enum. I won't go over everything I coded in my previous post, but you can find it at http://www.endswithsaurus.com/2010/08/cleaning-up-my-data-access-layer-with.html

Well, if you went through the coding techniques used in that post, you'll now be familiar with the steps needed to convert from an attribute value to an enum value and back, and to convert from one attribute value to another. However, you may also have noticed that, as is common with reflection, the performance sucks. This blog post is going to help you rectify that, because contrary to common belief, reflection with a couple of tricks can actually be made to perform quite well - indeed, in some cases almost as well as the rest of the .NET framework.

Some basic performance testing on my previous code made it fairly obvious that reflection for this purpose was some orders of magnitude slower than switch statements. Your average 5 value enum, each value having a database value and a display value, takes in the order of 800ms per 100,000 iterations to convert between an enum value and one of its connected string values or vice versa, or indeed to convert between the display string and the database string. Here is the helper class for attributed enumerations, reworked with caching:

using System;
using System.Collections;
using System.Reflection;

namespace Utilities.Enums
{
  public static class Enums
  {
    private static BindingFlags fieldBindings = 
      BindingFlags.Public | 
      BindingFlags.Static | 
      BindingFlags.GetField;

    private static Hashtable _fieldInfoArrayFromEnumCache = new Hashtable();
    private static FieldInfo[] FieldInfoArrayFromEnum<TEnum>()
    {
      Type t = typeof(TEnum);
      object cached = _fieldInfoArrayFromEnumCache[t];
      if (cached != null)
        return (FieldInfo[])cached;

      FieldInfo[] fi = t.GetFields(fieldBindings);
      _fieldInfoArrayFromEnumCache.Add(t, fi);
      return fi;
    }

    private static Hashtable _fieldInfoFromEnumValueCache = new Hashtable();
    private static FieldInfo FieldInfoFromEnumValue<TEnum>(TEnum value)
    {
      if (value == null) throw new ArgumentNullException("value");

      Type t = typeof(TEnum);
      object cached = _fieldInfoFromEnumValueCache[value];
      if (cached != null)
        return (FieldInfo)cached;

      FieldInfo fi = t.GetField(value.ToString(), fieldBindings);
      _fieldInfoFromEnumValueCache.Add(value, fi);
      return fi;
    }

    private static Hashtable _attributeEnumCache = new Hashtable();
    private static IAttribute AttributeFromEnumValue<TEnum, 
      TAttribute>(TEnum value)
    {
      if (value == null) 
        throw new ArgumentNullException("value");

      Type t = typeof(TEnum);
      Type a = typeof(TAttribute);
      object key = new { enumVal = value, attrType = a };
      object cached = _attributeEnumCache[key];
      if (cached != null)
        return (IAttribute)cached;

      FieldInfo fi = FieldInfoFromEnumValue<TEnum>(value);
      IAttribute attr = (IAttribute)fi.GetCustomAttributes(a, false)[0];
      _attributeEnumCache.Add(key, attr);
      return attr;
    }

    private static Hashtable _attributeFromValueCache = new Hashtable();
    private static IAttribute AttributeFromAttributeValue<TEnum, 
      TSourceAttribute, TTargetAttribute>(object sourceAttrValue)
    {
      if (sourceAttrValue == null)
        throw new ArgumentNullException("sourceAttrValue");

      Type t = typeof(TEnum);
      Type s = typeof(TSourceAttribute);
      Type d = typeof(TTargetAttribute);

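      // Include the enum type so that two enums sharing the same
      // attribute types and values cannot collide in the cache.
      object key = new { 
        enumType = t, 
        attrSourceType = s, 
        attrTargetType = d, 
        attrVal = sourceAttrValue 
      };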
      object cached = _attributeFromValueCache[key];
      if (cached != null)
        return (IAttribute)cached;

      TEnum enumVal = Parse<TEnum, TSourceAttribute>(sourceAttrValue);
      IAttribute target = 
        AttributeFromEnumValue<TEnum, TTargetAttribute>(enumVal);
      _attributeFromValueCache.Add(key, target);
      return target;
    }

    private static Hashtable _enumFromAttributeValueCache = new Hashtable();
    public static TEnum Parse<TEnum, TAttribute>(object attrValue)
    {
      if (attrValue == null) throw new ArgumentNullException("attrValue");

      Type t = typeof(TEnum);
      Type a = typeof(TAttribute);
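      // Key on both the enum type and the attribute type; keying on the enum
      // type alone would let two attribute types with equal values collide.
      object key = new { enumType = t, attrType = a, attrValue = attrValue };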
      object cached = _enumFromAttributeValueCache[key];
      if (cached != null)
        return (TEnum)cached;

      Array enumVals = t.GetEnumValues();
      foreach (TEnum enumVal in enumVals)
      {
        object attrVal = GetAttributeValue<TEnum, TAttribute>(enumVal);
        if (attrVal.Equals(attrValue))
        {
          _enumFromAttributeValueCache.Add(key, enumVal);
          return enumVal;
        }
      }

      //At this point there was no matching enum for the attribute value provided.
      throw new ArgumentOutOfRangeException("attrValue");
    }

    private static Hashtable _attributeValueFromEnumValueCache = new Hashtable();
    public static object GetAttributeValue<TEnum, TAttribute>(TEnum enumValue)
    {
      if (enumValue == null) 
        throw new ArgumentNullException("enumValue");

      Type t = typeof(TEnum);
      Type a = typeof(TAttribute);
      object key = new { 
        enumType = t, 
        attrType = a, 
        enumVal = enumValue 
      };
      object cached = _attributeValueFromEnumValueCache[key];
      if (cached != null)
        return cached;

      IAttribute attr = 
        AttributeFromEnumValue<TEnum, TAttribute>(enumValue);
      _attributeValueFromEnumValueCache.Add(key, attr.Value);
      return attr.Value;
    }

    private static Hashtable _attributeValueFromAttributeValueCache = new Hashtable();
    public static object GetAttributeValue<TEnum, 
      TSourceAttribute, TTargetAttribute>(object sourceAttrValue)
    {
      if (sourceAttrValue == null) 
        throw new ArgumentNullException("sourceAttrValue");

      Type t = typeof(TEnum);
      Type s = typeof(TSourceAttribute);
      Type d = typeof(TTargetAttribute);
      object key = new { 
        enumType = t, 
        srcAttr = s, 
        dstAttr = d, 
        srcVal = sourceAttrValue 
      };
      object cached = _attributeValueFromAttributeValueCache[key];
      if (cached != null)
        return cached;

      IAttribute attr = AttributeFromAttributeValue<TEnum, 
        TSourceAttribute, TTargetAttribute>(sourceAttrValue);
      object attrVal = attr.Value;
      _attributeValueFromAttributeValueCache.Add(key, attrVal);
      return attrVal;
    }
  }
}
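
The class above relies on a few types from the previous post that aren't repeated here: an IAttribute interface exposing a Value property, the attribute classes, and the attributed enum. As a rough sketch only, they might look something like the following. The shape of IAttribute is inferred from how the class uses it, the names DemoEnum, DisplayValue and DatabaseValue come from the usage further down, and the database strings ("1ST", "5TH") are made up for illustration. For the helper class to see them, these types would need to live in (or be imported into) its namespace. The Main method reads one attribute with plain reflection just to show the definitions hang together.

```csharp
using System;
using System.Reflection;

// Assumed shape: the helper class only ever reads attr.Value.
public interface IAttribute
{
  object Value { get; }
}

[AttributeUsage(AttributeTargets.Field)]
public sealed class DisplayValue : Attribute, IAttribute
{
  private readonly string _value;
  public DisplayValue(string value) { _value = value; }
  public object Value { get { return _value; } }
}

[AttributeUsage(AttributeTargets.Field)]
public sealed class DatabaseValue : Attribute, IAttribute
{
  private readonly string _value;
  public DatabaseValue(string value) { _value = value; }
  public object Value { get { return _value; } }
}

// Hypothetical enum; the database strings are invented for this sketch.
public enum DemoEnum
{
  [DisplayValue("First Value"), DatabaseValue("1ST")]
  First,

  [DisplayValue("Fifth Value"), DatabaseValue("5TH")]
  Fifth
}

public static class DefinitionsDemo
{
  // Reads the DatabaseValue attribute of an enum member via plain reflection.
  public static object DatabaseValueOf(DemoEnum value)
  {
    FieldInfo field = typeof(DemoEnum).GetField(value.ToString());
    IAttribute attr =
      (IAttribute)field.GetCustomAttributes(typeof(DatabaseValue), false)[0];
    return attr.Value;
  }

  public static void Main()
  {
    Console.WriteLine(DatabaseValueOf(DemoEnum.Fifth)); // prints "5TH"
  }
}
```

Assuming definitions along these lines, Enums.GetAttributeValue<DemoEnum, DatabaseValue>(DemoEnum.Fifth) should hand back the database string for that value, and Enums.Parse<DemoEnum, DisplayValue>("Fifth Value") should hand back DemoEnum.Fifth.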

We can also convert between one attribute value and another using the same GetAttributeValue method, but this time calling the generic overload Enums.GetAttributeValue<EnumType, SourceAttributeType, TargetAttributeType>(object sourceAttributeValue):

// GetAttributeValue returns object, so cast to the type you expect.
string dbValue = (string)
  Enums.GetAttributeValue<DemoEnum, DisplayValue, DatabaseValue>("Fifth Value");

As you can see by stepping through any of the methods, the first thing we do is prepare a key (when the lookup needs anything more complex than a simple value) and check whether a previous result is already cached in the corresponding hashtable, which serves as a static cache. If a value is cached, it is returned early. If not, we look up the correct value and add it to the cache before returning it.
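
The composite keys work because the compiler generates Equals and GetHashCode for anonymous types based on their property values, so a freshly built key matches one built earlier from equal parts. Here's a stripped-down sketch of that check-then-cache pattern on its own. KeyDemo, Lookup and SlowLookups are names invented for the example, and the string concatenation is only a stand-in for the expensive reflection call.

```csharp
using System;
using System.Collections;

public static class KeyDemo
{
  private static readonly Hashtable Cache = new Hashtable();

  // Counts how often the (pretend) expensive path actually runs.
  public static int SlowLookups;

  public static object Lookup(Type attrType, object attrValue)
  {
    // Two anonymous objects with the same property names, types and
    // values compare equal and hash alike, so this works as a key.
    object key = new { attrType, attrValue };

    object cached = Cache[key];
    if (cached != null)
      return cached; // cache hit: return early

    // Cache miss: do the slow work once, then remember the result.
    SlowLookups++;
    object result = attrType.Name + ":" + attrValue;
    Cache[key] = result;
    return result;
  }

  public static void Main()
  {
    object a = Lookup(typeof(string), "Fifth Value");
    object b = Lookup(typeof(string), "Fifth Value");

    Console.WriteLine(ReferenceEquals(a, b)); // True: second call hit the cache
    Console.WriteLine(SlowLookups);           // 1
  }
}
```

One thing to note about checking for null: a lookup whose result is null is never treated as cached, so it would be recomputed every time.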

Because this is a static class and each of the hashtables is a static field, the values cached by a lookup made by the first user of the application after it's loaded into the web server are available to every other user, so everyone benefits from that user's pain - not that there's that much to start with. Using this model, we see performance increase from the order of 800ms per 100,000 iterations to around 50ms per 100,000 iterations (on my machine). The large majority of the remaining overhead is caused by doing the first check and adding the result to the cache. Every stage of the lookup is cached so that every subsequent stage benefits from caching of the lower levels. Consequently, if someone has already done a lookup on an attribute of the fifth enum value, the FieldInfo for that enum value has already been cached, so a lookup for another attribute on the same value skips that step, and a repeat lookup for the same attribute just reads the value from the cached attribute.
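
Since the cache is shared between every request on the web server, it's worth thinking about what happens when two requests miss on the same key at the same moment. Hashtable is documented as safe for multiple readers with a single writer, and Add throws if the key is already present, so two simultaneous misses may end badly. This isn't what the class above does, but as one possible alternative on .NET 4, a ConcurrentDictionary with a Tuple key could look roughly like this. SharedCache and its members are names made up for the sketch.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class SharedCache
{
  // Tuple compares its components for equality, so it can serve as a
  // composite key much like the anonymous types used above.
  private static readonly ConcurrentDictionary<Tuple<Type, object>, object> Cache =
    new ConcurrentDictionary<Tuple<Type, object>, object>();

  public static int Count { get { return Cache.Count; } }

  public static object GetOrAdd(Type attrType, object attrValue, Func<object> compute)
  {
    // GetOrAdd never throws on a duplicate key. Under contention the
    // factory may run more than once, but only one result is stored.
    return Cache.GetOrAdd(Tuple.Create(attrType, attrValue), key => compute());
  }

  public static void Main()
  {
    // Hammer the same key from many threads at once.
    Parallel.For(0, 1000, i =>
    {
      object value = GetOrAdd(typeof(string), "Fifth Value", () => "5TH");
      if (!"5TH".Equals(value))
        throw new InvalidOperationException("unexpected cached value");
    });

    Console.WriteLine(Count); // 1: a single entry despite concurrent callers
  }
}
```

Because the factory can run more than once when callers race, this only suits lookups that are safe to repeat, which the reflection calls here should be.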

This is a dramatic performance increase - some 16 times faster than our original code. The large majority of the techniques introduced in the last post are still present, with some refactoring to keep the code as clean as possible.
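
Numbers like these depend heavily on the machine, so it's worth measuring on your own. The following is not the test I ran, just a minimal sketch of how you might time an uncached reflection lookup against a cached one over 100,000 iterations using Stopwatch. To keep it self-contained it uses the framework's DescriptionAttribute and a throwaway Sample enum rather than the attributes from these posts.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Diagnostics;

public enum Sample
{
  [Description("First Value")] First,
  [Description("Second Value")] Second
}

public static class Benchmark
{
  private static readonly Dictionary<Sample, string> Cache =
    new Dictionary<Sample, string>();

  // Reflects over the enum field on every call.
  public static string Uncached(Sample value)
  {
    var field = typeof(Sample).GetField(value.ToString());
    var attr = (DescriptionAttribute)
      field.GetCustomAttributes(typeof(DescriptionAttribute), false)[0];
    return attr.Description;
  }

  // Reflects once per value, then serves repeats from the dictionary.
  public static string Cached(Sample value)
  {
    string text;
    if (!Cache.TryGetValue(value, out text))
    {
      text = Uncached(value);
      Cache[value] = text;
    }
    return text;
  }

  public static long Time(Func<Sample, string> lookup, int iterations)
  {
    lookup(Sample.First); // warm-up call so JIT compilation isn't timed

    Stopwatch watch = Stopwatch.StartNew();
    for (int i = 0; i < iterations; i++)
      lookup(Sample.Second);
    watch.Stop();
    return watch.ElapsedMilliseconds;
  }

  public static void Main()
  {
    const int iterations = 100000;
    Console.WriteLine("uncached: {0} ms", Time(Uncached, iterations));
    Console.WriteLine("cached:   {0} ms", Time(Cached, iterations));
  }
}
```

Run it as a release build a few times over before drawing conclusions, since a single run can be noisy.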

Of course, like anything, you need to use these techniques with care - caching isn't a get out of jail free card. What the old method lacked in performance, it made up for with more meagre memory requirements. We've now traded those two sides of the coin. Throwing more memory at the problem has reduced the processing power needed to solve it, but in some cases the increased memory overhead may actually turn out to be detrimental to your application. You have to pick the right strategy for your application - hopefully this was the right strategy for yours, but your mileage may vary given your specific circumstances.

Happy reflecting!
