October 1, 2013

Home network security [part 1] - for mortals

There's so much to discuss when it comes to security surrounding your home computers and your home network that I can't possibly write it all in a single blog post, or even two - and there is much I need to learn along the way. There is much I have spent less time thinking about than I should have and, like you, I probably have many questions I should have found answers to before now... The biggest question on my mind right now is: what can be trusted?

It seems like nothing we're told to trust is as trustworthy as we're led to believe. Up until now, we've basically been trusting our emotional bias towards certain brands: Microsoft, Apple, Linksys, Dell, Samsung, iOS, Android et al.

So we need to evaluate a few things...

- What software can I trust?  Is Windows trustworthy?  Is Mac OS X?  Is there any operating system I can trust?
- What hardware can I trust?  Can I even trust the actual computer I'm using right now?  Can I trust my phone?
- What online services can I trust?  Is online backup actually safe, or am I taking all my local PRIVATE data and entrusting it to servers on the internet hosted by companies that may not deserve that trust?
- What can I do to protect myself?
- Are there any companies that I can truly entrust my data to in any form?

Those questions are actually a lot bigger than they appear at first glance and each deserves its own level of gravity. I won't cover them all in this blog post; instead I'll start by elaborating on the issue I mentioned in my previous post - the one relating to HTTPS, the thing that says it's safe to enter your personal information on a website.

The technologies the media leads us to put our trust in are SSL and TLS - you know, the ones that put the little padlock in your address bar and claim to be secure. I'm going to give you a basic crash course on the infrastructure that holds our online security together. Don't be scared off: I'm going to purposely gloss over the heavy technical detail because it only serves to complicate things and won't give you a clearer picture of the overall problem.

Let's say you go to your online banking website (just an example - a purchase from Tesco or Wal-mart uses exactly the same technology). When you log in, the address bar may change colour depending on the browser you use, a padlock will appear somewhere on your screen and the address will start with HTTPS instead of HTTP. These are the indicators you're led to believe keep you safe and say it's fine to shop online with your credit card details... indeed, they're the hallmarks of the security infrastructure that's been set up to protect you. Let's look at what's going on behind the scenes...

The site you are visiting has acquired what's called a digital certificate that's supposed to verify the authenticity of the computer (the web server) that's sending your computer that web page. A digital certificate is something that supposedly cannot be forged - think of it as the server's passport or identity card. It's meant to be binding and cannot be repudiated. That server is bona fide... allegedly.

Of course, bona fides are only as good as the authority that provides them. Much like a passport: if we can't trust the authority that issued it, we can't trust the passport. So how can you trust the authority? Well, in a nutshell, because we're told to - does that sound right to you? Me neither. So what happens in the internet world is what's called a "chain of trust"... the website you're visiting was given its certificate by an authority more trustworthy than it. Likewise, that authority was given its certificate by someone more trustworthy than them, and so on, all the way up to a top level authority whom we're told to trust just because someone big (say, the government, or Microsoft) says it's okay, you can trust them.
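
If you're curious, you can actually look at this chain yourself. Here's a minimal C# sketch (the host name is just an example I've picked; this is my illustration, not anything official) that connects to a site over SSL and prints each certificate in the chain the server presents, from the website's own certificate up towards the top level authority:

using System;
using System.Net.Security;
using System.Net.Sockets;
using System.Security.Cryptography.X509Certificates;

class ShowCertificateChain
{
    static void Main()
    {
        string host = "www.google.com"; // example host - any HTTPS site works

        using (TcpClient client = new TcpClient(host, 443))
        using (SslStream ssl = new SslStream(client.GetStream()))
        {
            // The SSL/TLS handshake, during which the server presents
            // its certificate.
            ssl.AuthenticateAsClient(host);

            // Rebuild and walk the chain from the site's own certificate
            // up towards the root authority.
            X509Certificate2 leaf = new X509Certificate2(ssl.RemoteCertificate);
            X509Chain chain = new X509Chain();
            chain.Build(leaf);

            foreach (X509ChainElement element in chain.ChainElements)
            {
                Console.WriteLine("Subject: " + element.Certificate.Subject);
                Console.WriteLine("Issuer:  " + element.Certificate.Issuer);
                Console.WriteLine();
            }
        }
    }
}

The last certificate printed is the top level authority - the one your computer trusts simply because it came pre-installed.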

Well, the big top level authority that a vast number of certificates are provided by is a US company called Verisign. I'm not knocking Verisign, and I'm not setting out to make these guys seem bad. They're providing a service to the best of their ability and, god love 'em, they do it pretty well. The problem is that the system is flawed - not because you can't trust them, but because they're not in a position where you should have to trust them... here's why.

Verisign is a US company, so they're bound by US law, which may or may not be comparable to the law of your country of residence. Recently there has been a spate of incidents where it has come to light that US companies have been compelled to violate laws they would otherwise be bound by (with legal impunity) so that the authorities could spy on people around the world, including their own citizens. It would be easy for a bad actor (a bad actor in this sense is anyone who has malicious intent and shouldn't be trusted) to set up a fake website and compel Verisign (or any certificate authority under their chain of trust) - through legal means, blackmail or coercion - to provide a certificate of authenticity saying that their website is the real McCoy. They can then redirect your traffic to their web server, which looks exactly like the original and still shows the padlock and the other security cues that tell you the site is safe. For all intents and purposes, it looks exactly like the original. Even if you had the technical ability to pull up the certificate and display it on your screen, you couldn't tell the difference. In fact, there would be very little to give the game away - only subtle clues that most everyday users would never notice... for instance, the IP address of the web server may suddenly appear to be located in a different country than the original, but then again, it may not.

When you enter a website address in your web browser, a few things happen [if you don't know what an IP address is, it's basically the phone number of a computer on the internet] - there's a short code sketch after the list showing the lookup step in action.

  1. You connect your computer to a trusted router - probably your home router, but could easily be the Wifi at the office, Starbucks, the airport or some other public network.
  2. You open your web browser and enter a web address in the address bar.
  3. Your computer checks in a local database called a cache to see if it already has an IP address for that website.
  4. If your computer has the IP address in its cache, we jump to step 10
  5. If your computer doesn't have the IP address in its cache, it looks up the IP addresses of a list of available DNS servers. DNS stands for Domain Name System; a DNS server is basically a phone book for looking up the IP address of the website you entered. This list of DNS servers is usually provided automatically by your Internet Service Provider to your router when you connect it to the internet, and when you connect to your wifi, your computer gets the list from the router.
  6. Your computer sends the server part of the address - the bit between the https:// and the next /, for instance www.myonlinebank.com or www.walmart.com - to the first DNS server in the list.
  7. The DNS server looks to see if it has the IP address for the server you requested; if it does, it sends the IP address back to your computer.
  8. If the DNS server didn't find an IP address, your computer asks the next one in the list, and so on until it finds one.
  9. If none of the DNS servers found an IP address your computer receives an "unknown host" response and your web browser displays an ugly message to say it couldn't find what you're looking for and you curse.
  10. If your computer has found an IP address then it sends the address you entered in the web browser to that IP address.
  11. The computer at that IP address sends back a web page signed with a certificate - the one we discussed earlier.
  12. Your web browser checks the certificate to see if it's authentic and activates all the pretty security features in your browser that tell you the page is authentic and secure.
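
If you'd like to see steps 3 to 9 in action, here's a tiny C# sketch (the host name is one of the examples above) that asks the operating system's resolver - the same cache and DNS servers just described - for a website's IP addresses:

using System;
using System.Net;

class DnsLookupDemo
{
    static void Main()
    {
        // Steps 3 to 9 above, rolled into a single call: check the local
        // cache, then ask the configured DNS servers in turn.
        IPAddress[] addresses = Dns.GetHostAddresses("www.walmart.com");

        foreach (IPAddress address in addresses)
            Console.WriteLine(address);

        // If no server finds an address, this call throws a
        // SocketException - the "unknown host" case in step 9.
    }
}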

In that process, there is a whole heap of places where an attacker can get between you and the real server in order to get at your information...


  1. The wifi on your router can be hacked and the router reprogrammed, giving an attacker access to your local network to steal data directly from your local computers via a number of attacks.
  2. Someone can obtain a bad certificate and pretend to be a legitimate website, but this requires redirecting you to their site instead of the original... this can be done by modifying the programming of your router so that all DNS queries are routed to their DNS servers, or by modifying the real DNS servers you use so they direct traffic to the attacker's website.
  3. Someone can find a way to install a rogue root certificate (a "certificate of trust") on your computer and then use certificates signed by it, so that everything still looks secure in your web browser even though the certificate wasn't signed by a trustworthy authority.
  4. Someone could compel a trusted authority, such as Verisign, to provide them with a "legitimate" certificate for a rogue website in order to obtain your information.
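
One partial defence against the last three of those is certificate pinning: rather than trusting whatever the chain of trust vouches for, your software remembers the one certificate it expects and rejects everything else. A minimal C# sketch of the idea (the thumbprint is a made-up placeholder, and www.myonlinebank.com is our running example, not a real bank):

using System;
using System.Net;
using System.Security.Cryptography.X509Certificates;

class PinnedCertificateCheck
{
    // The thumbprint (fingerprint) of the one certificate we expect.
    // Placeholder value - you'd record the real site's thumbprint here.
    const string ExpectedThumbprint = "0123456789ABCDEF0123456789ABCDEF01234567";

    static void Main()
    {
        ServicePointManager.ServerCertificateValidationCallback =
            (sender, certificate, chain, sslPolicyErrors) =>
            {
                var cert = new X509Certificate2(certificate);
                // Even a "valid" certificate issued by a rogue authority
                // fails this check - only the pinned one passes.
                return cert.Thumbprint == ExpectedThumbprint;
            };

        // Any HTTPS request made through the .NET stack now enforces the pin.
        new WebClient().DownloadString("https://www.myonlinebank.com/");
    }
}

The downside of pinning is that the pin has to be updated whenever the legitimate certificate changes, which is why it's mostly used inside dedicated apps rather than general web browsing.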


There is a less broken approach using an infrastructure called "Web of Trust", but honestly, that's not much less broken than the Chain of Trust that web browsers are configured for. I will cover that in another post.

So there you have it - that is why blindly putting your faith in HTTPS is not secure. What can you do to protect yourself against these issues? Am I saying don't use online banking or make purchases online? No, I'm not saying that at all. I am saying: be careful which sites you trust, and don't trust them just because your browser says to. If you see anything suspicious - marketing emails from companies you don't normally get email from, emails asking you to log in to update your security information, emails claiming to be from the security departments of various companies - beware. Chances are you can still make purchases from your usual online stores, and chances are your usual banking website is legitimate. If in doubt, open your web browser and type the address into the address bar yourself. Don't Google for your bank's website, and don't open it from links in an email. Be cautious.

There are some things we should also do to help mitigate certain attack approaches (what we call attack vectors):


  1. Never ever connect to a wifi access point that you can't be sure you can trust. The minute you do, all of the data you transmit can be logged by whoever controls that access point.
  2. If you must connect to a public wifi access point, find the provider of the access point and ask them its name; don't assume that all the ones you can see can be trusted. It's easy to set up a fake wifi access point in a coffee shop and start harvesting people's data. [Side note: never plug your iPhone into a charger you can't trust - there are attacks that can install malicious software on your iPhone to harvest your data as well.]
  3. Log in to your home router right now and make sure you've changed the following:


  • The router should definitely not have the default network name, so change it from Linksys or DLink or whatever it was when you got it to something else - nothing that can easily be tied to your address, location or person. Someone shouldn't be able to identify your network just by knowing you, unless you've given them the access information.
  • The router should definitely not have the default password, and in fact, if you have a router that lets you change the default username, change that too.
  • The router should not have remote administration enabled unless you absolutely need it; if it is enabled, don't use the default port for remote administration. Changing it isn't much of a hindrance - an attacker with a port scanner will still find the open port - but it's one extra step they must take, reducing the chances you'll be attacked by some kid without much of a clue.
  • The router should be using at least WPA2 security... not that this is foolproof - there are known hacks that can bypass it - but it's much safer than WEP and WPA, which any 15 year old with some readily available software can break. A shared key is a security risk; if you can configure enterprise grade security on your router, giving each user their own username and password, it's a lot safer. I'll cover configuring these better security protocols on your router in a later blog post.
  • Make sure your shared key is strong: preferably longer than 8 characters (the longer the better) and a combination of lower and uppercase letters, numbers and other symbols. There's a short sketch after this list showing one way to generate such a key.
  • If you expect anyone outside your household to use your wifi - for instance family and friends - set up a guest network and use wireless isolation to make sure every computer connected to the guest network is isolated in its own space.
  • Configure wireless access MAC authentication. The MAC address is a physical address assigned to the network card installed in your computer. On its own this is no hindrance to an attacker, as MAC addresses can be faked, but in combination with everything else it's one extra step for someone to bypass.
  • If you can, reduce the wifi signal strength so that your access point can't be connected to from outside the house. If an attacker can't hear the signal, they can't connect.
  • Find a list of DNS servers that can be trusted and configure your router to use those instead of trusting your ISP to provide them. Ideally they'll be hosted in an independent state known for strict privacy laws - such as Iceland or Switzerland.
  • The clock in your router is used to synchronize various features, some of which may be security related. If you have the ability to configure an NTP server, make sure you configure one you trust - for instance ch.pool.ntp.org in Switzerland [don't take my word for this].
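
As promised in the shared key point above, here's one way to generate a strong random key - a minimal C# sketch using the framework's cryptographic random number generator (the character set and 20 character length are just my choices; adjust them to whatever your router accepts):

using System;
using System.Security.Cryptography;
using System.Text;

class WifiKeyGenerator
{
    static void Main()
    {
        const string charset =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz" +
            "0123456789!@#$%^&*()-_=+";
        const int length = 20; // well beyond the 8 character minimum

        // Cryptographically strong randomness - don't use System.Random
        // for anything security related.
        var rng = new RNGCryptoServiceProvider();
        byte[] buffer = new byte[length];
        rng.GetBytes(buffer);

        var key = new StringBuilder();
        foreach (byte b in buffer)
            key.Append(charset[b % charset.Length]); // slight modulo bias is
                                                     // acceptable for this purpose

        Console.WriteLine(key.ToString());
    }
}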

That should keep you relatively safe for the moment... we'll cover WPA2 Enterprise security and how you can get that installed on your home router for better security in the next post.

As always, anyone that has further information that would be helpful in addition to this post, please post in the comments. I look forward to hearing from you.

September 27, 2013

An exploration of computer (in)security - for mortals...

Well, it's been a while since my last blog post, as I haven't really felt like I've had much of any relevance to contribute to the world. Lately I've been feeling a pull to contribute more and have been mulling over issues thrown into sharp relief by the revelations of the Edward Snowden saga.

By now, I'm not sure there are too many people in the world who haven't heard of this guy or what he has brought to light. For those who have no idea what I'm talking about or why you should care, here it is in a nutshell:

Edward Snowden was a member of staff at a private US company, contracted to the National Security Agency as a systems administrator. The NSA is America's version of GCHQ, the communications gathering centre behind MI5/MI6. Earlier this year, after flying to Hong Kong, he leaked a huge trove of information to Glenn Greenwald, a reporter for the UK newspaper The Guardian, regarding gob-smacking NSA abuses of the US constitution with huge global ramifications for all forms of communication. After leaking this information he fled to Russia, allegedly en route to Bolivia or Guatemala to escape the long arm of the CIA. The US revoked his passport mid-flight, leaving him trapped for a few weeks in the transit zone of Moscow's Sheremetyevo airport. After Bolivia offered Snowden political asylum, the US violated international law by grounding the Bolivian presidential plane in Spain on suspicion that Snowden was being smuggled out of Russia - something South American governments are all still furious about. The Russians eventually granted him political asylum and he was allowed to leave the airport transit zone and remain in Russia for a year while he finds alternative means to drop off the grid and escape the CIA for good.

The information released by the Guardian saw the head of the NSA, Keith B. Alexander, hauled before Congress to account for the actions of his agency, where he denied having spied on the American people - evidence later proved this was a lie. During the course of all this, it came to light that a secret court had been granting secret decisions on secret law enforcement requests for information on people - the Foreign Intelligence Surveillance Court (FISC for short) has been secretly approving all kinds of surveillance activities on the grounds that, because the court provides legal oversight, the surveillance isn't illegal - except that the court was never democratically accountable (on the grounds that it was secret) and doesn't appear to answer to anyone. The outcome of these decisions is that companies are forced to provide access to any data the US government says it wants, and the company is gagged from discussing it under penalty of... ? Who knows what the penalty is for breaking a gag order - prison, I guess, or the use of whatever trove of information the NSA has against you to discredit you, close down your company and destroy the remainder of whatever life and freedom you thought you had.

It turns out that the NSA is basically recording every internet transmission, including those of both the American people and everyone else around the world, and has the ability to read almost anything deemed "secure" by current internet security protocols. For instance, the technologies we are all sold as keeping our vital personal information secure from prying eyes - the ones that put the little padlock in your browser address bar, that you're told to trust as the gold standard of internet safety, labelled SSL or TLS and other three letter acronyms designed to make your eyes glaze over - can easily be bypassed by the US government (I'll explain the details of that revelation in a future post).

So that's the situation in a nutshell: everything you're told is "safe and secure" on the internet should be questioned - everything!  Just about everything the public is taught about internet security by the media is at best inaccurate and at worst a lie. Every transmission is recorded, most things can be read easily, and most of what's left can be decoded with a little effort. Nothing is as cut and dried as you're led to believe; nothing you do online is as safe or secure as you're led to believe and probably take for granted.

I hear you say "So what, I'm in the UK, the NSA has no legal jurisdiction here". You may have a point, but any information you transmit over the internet most likely flows through the US, or is hosted on servers based in the US or run by US companies. That means your information is fair game... and the US is not above having you extradited for things you've done that are a crime in the US but not in your home country (such as Richard O'Dwyer, whose extradition they sought for providing links on his website to copyright infringing material). And if they can't get you legally extradited, they'll quite happily kidnap you and return you to the US for trial in what they call extraordinary rendition.

Anyone that's known me well knows that I've always had a vague fascination with encryption and cryptography - vague in the sense that any 10 year old boy shown how to write secret messages is fascinated by it. For those who are unaware of what encryption and cryptography are: it's the art of concealing a message by scrambling it using a process that only the intended recipient can later reverse to reveal the original message you wrote. In the meantime, anyone else looking at it won't be able to read it, nor could they modify it without the intended recipient knowing.
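
To make that a little more concrete, here's a minimal C# sketch of the round trip - scrambling a message with a secret key, then unscrambling it with the same key. (Note that encryption alone only covers the "can't read it" part; detecting modification needs an authentication code on top, which I've left out for brevity.)

using System;
using System.Security.Cryptography;
using System.Text;

class SecretMessageDemo
{
    static void Main()
    {
        using (Aes aes = Aes.Create()) // generates a random secret key
        {
            byte[] plain = Encoding.UTF8.GetBytes("Meet me at noon.");

            // Scramble the message - without the key this is gibberish.
            byte[] scrambled;
            using (ICryptoTransform encryptor = aes.CreateEncryptor())
                scrambled = encryptor.TransformFinalBlock(plain, 0, plain.Length);

            // Only someone holding the same key can unscramble it.
            byte[] recovered;
            using (ICryptoTransform decryptor = aes.CreateDecryptor())
                recovered = decryptor.TransformFinalBlock(scrambled, 0, scrambled.Length);

            Console.WriteLine(Encoding.UTF8.GetString(recovered)); // Meet me at noon.
        }
    }
}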

I've tinkered with many cryptographic and security tools in my life but have never really taken to heart just how seriously I should take it... after all, I'm a nobody, I don't do anything interesting, at least, not in the eyes of any government. I'm not a conspiracy theorist, particularly. I don't have enough political influence to be a threat to anyone, so why should I really care?

In the past couple of weeks, I've started a journey into exploring computer and internet security in the hope that I can pull together a solution that will help everyone become not only more security conscious but actually build the basic skills to manage our own computer security more comprehensively.

I will be blogging about my discoveries in (hopefully) plain English that my kids will be able to understand growing up - I want them to be able to take charge of their own computer and internet security as that starts to become important.  So stick around and hopefully my discoveries will spur insights that are of benefit to more than just me and my family, but to everyone reading my blog as well.

I encourage everyone to contribute in the comments, this is a journey for me as much as it is for everyone else. I don't by any stretch consider myself an expert in this arena yet, there is much I don't know. We're all in this together and I hope it turns out to be as fascinating to everyone else as it is to me, after all, the outcome of this journey will hopefully help keep all of our information as private as I believe it should be - and hopefully you do too.

September 28, 2010

Nine non-development jobs professional developers would benefit from doing

Nine non-development jobs that professional software developers would benefit from doing once in their lives - not including the most obvious Software Engineer :P

  • Front line technical support - This will teach you that more often than not, business metrics [as flawed as they are from your developer point of view] drive the business, not your idealism towards your software. Your 'skewed' idealistic point of view has no weight in the call centre. This will give you an insight into just how much money large corporations waste on supporting each and every bug your software goes out with.
  • Second line technical support - Dealing with the simplest of problems that front line technical support can't figure out within the definition of their required metrics - i.e. their average call time, calls handled per shift etc.
  • Third line technical support - Dealing with calls from customers that neither front line nor second line technical support can figure out. Of the technical support jobs, this is the most fun: you only get the intriguing problems that nobody else can figure out.
  • Customer service in a call centre - Dealing with random calls regarding software, from "what does it do" to "I have no idea what I'm doing". You will have to find many ways of presenting the same information in different ways because not everyone 'gets it' the way you think they should. This will teach you patience... with a safety net, you can always put the customer on hold and freak out about how stupid they are.
  • Desk side support - Dealing with customers in person, sitting by the customer and fixing their problem for them or showing them how to fix it for themselves. The biggest lesson you will learn from this job is patience - without a safety net, you can't freak out because the customer is right there. Whether you consider the person an idiot or the smartest person you've met in your life, you have to keep an even temper and demonstrate compassion for their situation - even if underneath you're fuming.
  • Retail sales - The customer is important. This job will teach you many things: Personal interaction with unfamiliar people, body language, anticipation of customer needs and the value added upsell.
  • Integration specialist for someone else's software - Going to client sites and installing software into their production environments. This will teach you what happens in corporations once your software makes it into the wild. Just because it works on your test servers, with your test data doesn't mean it's going to work in the wild, on someone else's server, under someone else's control, with their data. It will also teach you the lengths you need to go to in order to integrate your software with other business applications.
  • Graphic Design - What's the point in being able to make great software if it looks ugly?
  • Typesetting/Copywriting - What's the point in being able to make great software if nobody can read it?

If anyone's got any other non-development jobs that would be useful for rounding out a professional software developer, please comment! :)

September 24, 2010

Why company politics will cause your project to fail

How can one work in a company where nobody wants to be accountable for their comments, responses to questions or decisions? Recently I had an experience that I want to serve as a warning to others who allow company politics to dictate how a project progresses.

There is a reason you will hear software engineers the world over bitching about business analysts and design specifications: we know what happens when they aren't done adequately. I completely understand business analysts' frustrations too, because that is one of my hats. The problem with specifications is that they need buy in from each of the stakeholders. But what should you do when the only people with buy in are the ones approving your invoices? That can be an extremely frustrating process.

I wear many hats with this client: I'm a business analyst, requirements analyst, strategist, data architect, systems architect, software architect, team lead, coder and integration specialist. That wasn't by choice, nor was it what I signed up for; it just ended up working that way, as it sometimes does. This setup does have some incredible advantages, but it can also bring incredible frustration that someone who wears only one of these hats will never see. Even though I wear all of these hats, I cannot design and build successful software for a client without the buy in of every one of the departments and staff I need information from. By buy in, I mean that this needs to be important to them. They need motivation to make sure that the information they're providing is complete, accurate, well thought out and in the best interests of the parties they may be providing information on behalf of.

Having worked with one client for a while, I've grown familiar with their company politics and with some members of their staff who are either unable or unwilling to put pen to paper for fear that someone can categorically point a finger and say "you're to blame for this!" I can understand why; to be honest, I'd hate to think that someone was always looking to point the finger at me.

Usually I'm fairly flexible with my software design strategy; I can slack off in some areas and be more rigid in others. In some areas you can afford to be, in others you cannot. The design brief and interface specification are areas where you cannot. Like most software developers, I consider understanding a whole business process, and the reasons for it, before I strategize on a design to be non-negotiable. This time, like no other, I spent time with management discussing how things are done and how they want things done going forward. The biggest mistakes I made were ignoring politics and trusting that the information provided by management allowed me to draw the correct conclusions. What I'd failed to consider is that managers who don't do the job day in, day out "forget" things - important things that can and will completely derail the project at the design stage... and by "forget" I mean that it's in their political interest not to give you the information - whether it's a pissing contest with another manager or the fear of being seen to relinquish power to someone outside their kingdom.

I drew up mockups in Photoshop to show the proposed interface design, along with detailed process documentation. My request for feedback was met with "that looks great, code it up". Being a small and fairly informal company, this has always been the way with the project since long before I came on board, so I didn't want to rock the boat too much. They've always been against documentation in favour of results. I can deal with this in some areas on a small project - i.e. when bugs are reported to me, I may not write them up any further than brief notes to serve as a reminder of what the bug is. The design documents, though, I don't skimp on. They include thought processes, things that need to be considered, thoughts on the failure of certain strategies etc. - something this company is keen to avoid at all costs. So this is something that I [with the support of the manager I report to] have always been at odds with them over - it seems he'd given up arguing with the rest of the managers about it before I came on board. Anyway, I digress - it seems quite apparent that my designs and documentation weren't given more than a hurried glance to make it look like they'd been read before the go ahead was given.

Let this be the warning: just because you've drawn up designs built on information provided by management and "signed off" by management doesn't mean that what you have in your hand is completely, or even partially, correct. When management are giving direction for the design of software they will not be using day in, day out, consult those who will be using it. Do this as soon as you've written down and understood the information provided by management. If you don't, you risk (like I did) spending time writing a feature that cannot possibly be useful in the form you've developed it - even if it's exactly what management asked for and functions exactly the way they described.

My prediction is that this project is going to fail, miserably - how can I be sure? Because in my book, it already is failing. Not because of poor management of the project itself, nor for any lack of effort by my predecessors and me to follow best practices and to meet and exceed industry standards. It's going to fail because internal politics make it impossible to implement any of the features in a manner that is useful to the users who need them to do their jobs effectively.

But who is to blame? The 'developer', of course [I use 'developer' loosely here; it's really the business analyst the client is pointing the finger at, though they don't know it - as far as they're concerned, 'developer' is a catch-all term for the vendor handling the project]. Not once has anyone put pen to paper to say "I approve that this interpretation of the facts and processes is complete and accurate", and not once has anyone put pen to paper to say "I agree that this design is an effective solution to allow my staff to complete their jobs in the most efficient manner possible." So who is accountable? Everyone has avoided that responsibility, because it's all he said/she said. Even now, with the wheels falling off the project, and even though I have gone over and above the call of duty to try to get these things rectified, there is still a refusal to change their practices.

A word of warning to the wise: you may not think that internal politics at a client affect your design process, but if you don't want to be tied up in a project destined for failure from the outset, heed the warning signs. If politics amongst management are dictated by the measurement of their dicks, get out before it affects your work and makes you look bad.

Putting best practices aside for a moment there are undocumented and intangible things that are important for a project to succeed:

  • Stakeholder buy in. Everyone must be in the same boat working as a team towards one common goal. If any one/all of the stakeholders are not on board, don't believe in the project or are otherwise not happy with the project, they are a threat to the project.
  • Stakeholder priorities. Everyone must make the software development a top priority. It cannot be put aside as "I'm too busy to read this", "I'm too busy to write that", "I'm too busy to take part in this", "I didn't want any part of this". If they don't make it their first priority, they cannot complain when it doesn't take their views into account. If they don't make this a priority, they are a threat to the project.
  • Stakeholder accountability. If a stakeholder is not willing to be accountable for their decisions, accuracy of their input or completeness of their input then they are a threat to the project.

Lessons learned:

  • Never trust any information provided until you have proven that it's true. If that requires job shadowing, then so be it.
  • Political stakeholders are next to useless unless they're "at the coal face". Get process information from those doing the job. Their manager or their manager's manager isn't good enough unless that manager also does the job. What makes the most sense to the users of the software frequently isn't anything like what makes most sense to management.
  • It doesn't matter how much stakeholders attempt to avoid accountability, get a signature on a piece of paper that corroborates that your understanding is 100% complete and accurate before you start on strategy and designs. Do the same with your designs before you move to coding. If you don't, eventually the finger of blame will be pointed at you, no matter how unjust that finger pointing may be.

September 15, 2010

Reply: What is too simple and small to refactor

This code is partly inspired by Cory Fowler's refactoring of code provided by John MacIntyre in his last blog post - "What is too simple and small to refactor". John used the inheritance model to remove a boolean parameter he didn't like; Cory's was a slight departure that introduced funcs to inject the calculations, but still stuck with the inheritance model. I like Cory's approach, but I wanted to see if I could use funcs to ditch the inheritance model entirely - mostly just to see how far I could take the concept he introduced. My base calculations are all collated into a single static class with properties for each calculation returning a Func<float, float, float>. Each instance of employee has the calculation Func<> pushed in using dependency injection. This means that an employee could have an entirely custom calculation without requiring an instance of WageBase/CalculationBase or whatever other inheritance scheme was used in the other posts.

My PayCalculations class is basically a bunch of properties that return Func<float, float, float>; this allows us to reference the calculations in a statically typed manner: PayCalculations.BasicWithoutOvertime, PayCalculations.BasicWithOvertime etc.

My employee class is very simple; one of the constructors allows the PayCalculation property to be specified at instantiation. When we need to run the calculation, we just call empInst.CalculatePay(periodHours). So here's my employee class, nothing complicated:

Disclaimer: In the interest of keeping this blog post as short as possible none of the demonstration code contains any exception handling. Exception handling can however be found in the downloadable project at http://dl.dropbox.com/u/3029830/Prototypes/Prototype%20-%20WageCalculator.zip.
public class Employee
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public float BaseRate { get; set; }

    public Func<float, float, float> PayCalculation { get; set; }

    public Employee(string firstName, 
                    string lastName, 
                    float baseRate, 
                    Func<float, float, float> payCalculation)
    {
        FirstName = firstName;
        LastName = lastName;
        BaseRate = baseRate;
        PayCalculation = payCalculation;
    }

    public float CalculatePay(float periodHours)
    {
        return PayCalculation(periodHours, BaseRate);
    }
}
Here's my PayCalculations class that provides all the basic pay calculations for the system:
public static class PayCalculations
{
    public static Func<float, float, float> BasicWithoutOvertime 
    { 
        get 
        {
            return (hours, baserate) =>
            {
                return hours * baserate;
            };
        }
    }
    public static Func<float, float, float> BasicWithOvertime
    {
        get
        {
            return (hours, baserate) =>
            {
                if (hours < 40) return hours * baserate;
                return ((hours - 40f) * 1.5f + 40f) * baserate;
            };
        }
    }
    public static Func<float, float, float> Salary
    {
        get 
        {
            return (hours, baserate) =>
            {
                return baserate;
            };
        }
    }
    public static Func<float, float, float> Contractor
    {
        get
        {
            return (hours, baserate) =>
            {
                /* Base rate */
                float subtotal = Math.Min(hours, 40) * baserate;
                hours -= Math.Min(hours, 40);
                /* Time plus a half */
                if (hours > 0) subtotal += 1.5f * Math.Min(hours, 20) * baserate;
                hours -= Math.Min(hours, 20);
                /* Double time */
                if (hours > 0) subtotal += 2.0f * Math.Min(hours, 20) * baserate;
                hours -= Math.Min(hours, 20);
                /* Double time plus a half */
                if (hours > 0) subtotal += 2.5f * hours * baserate;

                return subtotal;
            };
        }
    }
}
Finally I've got my application code that sets up a bunch of new employees and passes their pay calculations into the constructor before calculating their pay for 50 hours:
class Program
{
    static void Main()
    {
        List<Employee> Employees = new List<Employee>()
        {
            new Employee("John", "MacIntyre", 40.25f, PayCalculations.BasicWithoutOvertime), 
            new Employee("Ben", "Alabaster", 40.25f, PayCalculations.BasicWithOvertime),
            new Employee("Cory", "Fowler", 2935f, PayCalculations.Salary),
            new Employee("John", "Doe", 150f, PayCalculations.Contractor), 
            new Employee("Jane", "Doe", 0f, (hours, baserate) => 3500),
            new Employee("Joe", "Bloggs", 34.25f, (hours, baserate) => {
                return hours < 15 ? 15 * baserate : hours * baserate;
            })
        };

        float hoursWorked = 50;
        Employees.ForEach(employee => {
            Console.WriteLine("{0}, {1}: ${2:#,###.00}", 
               employee.LastName, 
               employee.FirstName, 
               employee.CalculatePay(hoursWorked));
        });
    }
}
I'm not sure I can make it any simpler with this approach; I think I've taken the concept about as far as I can in this direction. Perhaps if anyone cares to take it a step further, they can write a contributory blog post and provide a link in the comments. I'd love to see other approaches to this refactoring.

September 9, 2010

How many projects do I need in my solution?

Despite what people are saying, limiting yourself to three projects in a solution is as wrongheaded as letting things get out of hand at the other extreme with 20... it's my belief that the fewer projects you can practically have in your solution, the better, but arbitrarily limiting yourself to any specific number is foolish. Use your common sense. Here are the questions I ask myself and the rules of thumb I follow for determining how many projects I need in my solution.

  • Am I writing an application that needs to be distributed across multiple servers?
  • Am I writing an application where different departments need different front-ends for the same data/business rules?
  • Am I writing a one off application that connects to its own database and will be used by one person, or that everyone will see and use in the same way regardless of what it's being used for?

If you answered yes to either of the first two questions, then you probably need to separate your code into multiple projects or solutions. Ask yourself:

  • Does this code really need a separate project or solution from my application?
  • Is the code I want to put in the new project specific to this application or will it be useful in multiple applications? If it's not specific to this application, but it won't be useful in other applications, don't bother splitting it out.
  • Is the code I want to put in the new project dependent upon anything specific to this project? If the answer is yes, go on to the next question, if not, skip ahead to the last one.
  • Can any application specific dependencies be injected, thus removing the application dependency and therefore making it reusable in other projects? If the answer is no, then it shouldn't be its own project.
  • What is the realistic likelihood of the new project being reused? Truthfully. If it's unlikely to be reused, then it's a waste of time splitting it out.

Rules of thumb for defining the number of projects in my application solution:

  • Any code that is not specific to this application should be its own project(s), where practical in its own solution.
  • All projects should be as dependency free as is practical. In practice this won't mean zero dependencies; it's usually far more effort than it's worth to make a project completely dependency free.
  • Any projects that are coupled so tightly that you cannot use either one without the other should be a single project. If there needs to be logical separation of concerns within that project, it should be in folders not projects.
  • Any project that can be used independently of other projects in the solution should be separated and linked with a binary reference at the earliest possible opportunity.

When I start developing a new application, I potentially start with the following projects in my solution:

  • Caching Layer [Potentially not needed] - This helps eliminate expensive hits on the database. It may not be needed depending on the application/circumstances/business needs. Most of my apps have this built into the data access layer, but a couple have required a centralized cache that can be hit with multiple instances of the DAL, thus required a separate deployment and thus a separate solution. Before the caching layer reached maturity enough that it could be used separately from my application though, it existed as a project within my solution.
  • Data Access Layer [Potentially already exists] - It would be ideal if this had no dependencies on any other project in your solution, but the fact is, it needs to return business objects, which are defined in your Business Object Layer, so your DAL and BOL are coupled whether you like it or not - although the BOL isn't dependent upon the Data Access Layer, so it's a one way dependency (there's a short sketch of this one-way coupling after this list).
  • Business Object Layer [Potentially already exists] - This doesn't need any dependencies on any other project in your solution. In an ideal world, your objects are just data storage devices for in-memory data. KISS. If your objects are dumb, it separates concerns and doesn't leave any awkward edges. I'm quite likely to use my business object layer in other projects because they're generic and don't do anything. I should be able to new up a business object and load it with data without needing to even reference my DAL. These are unchanging logical containers for data regarding company specific operations. In my current company that means: Stops, Manifests, Locations, Customers, Shipment. There's no logic, it's just data. These things don't change over time. A shipment is picked up from one location, delivered to another location and a customer gets billed for the shipment - that's trucking, it won't change until they invent teleportation. This allows my application to do what needs to be done. Like the Data Access Layer, this may have already been written and tested for other projects, if so, you'd just have a binary reference rather than a project/project reference in your solution.
  • Business Logic Layer [Optional] - Business rules that are non-project specific, i.e. if rules apply across all software in your organization - which they probably don't, so you probably don't need a project for this. These will relate to unchangeable business processes/practices. Given that very few business processes/practices are immutable, and given that if you change them, you'll likely need to modify your application, you probably don't need this layer separated from your application project. In cases where the rules apply to things that don't change very often - like the laws of physics, they could feasibly be stored in a business logic layer allowing multiple applications to make use of them. Usually though, business logic is application specific and as such, doesn't require its own assembly.
  • Application Layer (or Presentation Layer) - This contains application specific code: logic, helpers, utilities and configuration.
  • Projects for non-project specific utilities that don't belong in existing libraries - These are utilities being written for first time use and proofing with this project. These can't have any dependencies on any other projects in your solution (except amongst themselves) because, once they reach maturity, you will be breaking them out into their own libraries and changing project references to binary references to remove clutter from this solution. If utilities belong in pre-existing libraries, they should be added to those libraries and tested before inclusion as a binary reference in this solution. Any chance to avoid adding a project to your current solution will save you headaches navigating around your code later, so you should remove these projects and replace them with binary references at the earliest possible opportunity.
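
To illustrate the one-way coupling mentioned in the Data Access Layer point above, here's a bare-bones sketch using the Shipment example (the class and namespace names are made up for illustration): the business object knows nothing about data access, while the DAL references it and returns it.

// Business Object Layer - dumb data container, no dependencies.
namespace Company.BusinessObjects
{
    public class Shipment
    {
        public int Id { get; set; }
        public string Origin { get; set; }
        public string Destination { get; set; }
    }
}

// Data Access Layer - depends on the BOL, never the other way round.
namespace Company.DataAccess
{
    using Company.BusinessObjects;

    public class ShipmentRepository
    {
        public Shipment GetById(int id)
        {
            // Database plumbing omitted; the point is that the return type
            // comes from the BOL, so DAL -> BOL is a one way reference.
            return new Shipment { Id = id };
        }
    }
}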

If you're really lucky, you've got a previously developed DAL and BOL and don't need any helpers that don't already exist in other libraries, so you'll only need a single project in your solution. If you're not so lucky, you'll be developing the DAL and BOL yourself. You'll probably also need a handful of non-project specific helpers that are logically unrelated to each other. These projects will sit in this solution until they've matured enough to be broken out into their own solutions, leaving a few binary references in their wake.

By the time you're finished, if you did it right you should just be left with your application layer and a bunch of binary references - a single project.

Of course, if you're writing a one off application and you're not going to have multiple applications floated over top of each of the different pieces, then there's precious little point in having any more than one project. You may as well just use folders in a single project from the outset.

August 9, 2010

Cleaning up my data access layer with enumerations, attributes, reflection and generics... and caching

Today's blog post is a follow up to my previous post on using reflection and enum attributes to add flexibility to your DAL while reducing your code base by a helper class per enum. I won't go over everything I coded in that post, but you can find it at http://www.endswithsaurus.com/2010/08/cleaning-up-my-data-access-layer-with.html

If you've gone through the coding techniques used in that post, you'll now be familiar with the steps needed to convert from an attribute value to an enum value and back, and to convert from one attribute value to another. However, you may also have noticed - as is common with reflection - that the performance sucks ***. This blog post is going to help you rectify that, because contrary to common belief, reflection can, with a couple of tricks, be made to perform quite well - indeed, in some cases almost as well as the rest of the .NET framework.

Having done some basic performance testing on my previous code, it was fairly obvious that reflection for this purpose was orders of magnitude slower than switch statements, which take on the order of 10ms per 100,000 iterations for your average 5 value enum (each value having a database value and a display value) when converting between an enum value and one of the connected string values or vice versa, or between the display string and database string. Here's the demo code for the attributed enumeration helper, now with caching added:

using System;
using System.Collections;
using System.Reflection;

namespace Utilities.Enums
{
  public static class Enums
  {
    private static BindingFlags fieldBindings = 
      BindingFlags.Public | 
      BindingFlags.Static | 
      BindingFlags.GetField;

    private static Hashtable _fieldInfoArrayFromEnumCache = new Hashtable();
    private static FieldInfo[] FieldInfoArrayFromEnum<TEnum>()
    {
      Type t = typeof(TEnum);
      object cached = _fieldInfoArrayFromEnumCache[t];
      if (cached != null)
        return (FieldInfo[])cached;

      FieldInfo[] fi = t.GetFields(fieldBindings);
      _fieldInfoArrayFromEnumCache.Add(t, fi);
      return fi;
    }

    private static Hashtable _fieldInfoFromEnumValueCache = new Hashtable();
    private static FieldInfo FieldInfoFromEnumValue<TEnum>(TEnum value)
    {
      if (value == null) throw new ArgumentNullException("value");

      Type t = typeof(TEnum);
      object cached = _fieldInfoFromEnumValueCache[value];
      if (cached != null)
        return (FieldInfo)cached;

      FieldInfo fi = t.GetField(value.ToString(), fieldBindings);
      _fieldInfoFromEnumValueCache.Add(value, fi);
      return fi;
    }

    private static Hashtable _attributeEnumCache = new Hashtable();
    private static IAttribute AttributeFromEnumValue<TEnum, 
      TAttribute>(TEnum value)
    {
      if (value == null) 
        throw new ArgumentNullException("value");

      Type t = typeof(TEnum);
      Type a = typeof(TAttribute);
      object key = new { enumVal = value, attrType = a };
      object cached = _attributeEnumCache[key];
      if (cached != null)
        return (IAttribute)cached;

      FieldInfo fi = FieldInfoFromEnumValue<TEnum>(value);
      IAttribute attr = (IAttribute)fi.GetCustomAttributes(a, false)[0];
      _attributeEnumCache.Add(key, attr);
      return attr;
    }

    private static Hashtable _attributeFromValueCache = new Hashtable();
    private static IAttribute AttributeFromAttributeValue<TEnum, 
      TSourceAttribute, TTargetAttribute>(object sourceAttrValue)
    {
      if (sourceAttrValue == null)
        throw new ArgumentNullException("sourceAttrValue");

      Type t = typeof(TEnum);
      Type s = typeof(TSourceAttribute);
      Type d = typeof(TTargetAttribute);

      object key = new { 
        attrSourceType = s, 
        attrTargetType = d, 
        attrVal = sourceAttrValue 
      };
      object cached = _attributeFromValueCache[key];
      if (cached != null)
        return (IAttribute)cached;

      TEnum enumVal = Parse<TEnum, TSourceAttribute>(sourceAttrValue);
      IAttribute target = 
        AttributeFromEnumValue<TEnum, TTargetAttribute>(enumVal);
      _attributeFromValueCache.Add(key, target);
      return target;
    }

    private static Hashtable _enumFromAttributeValueCache = new Hashtable();
    public static TEnum Parse<TEnum, TAttribute>(object attrValue)
    {
      if (attrValue == null) throw new ArgumentNullException("attrValue");

      Type t = typeof(TEnum);
      Type a = typeof(TAttribute);
      //The attribute type must be part of the key, otherwise values from
      //different attribute types on the same enum would collide in the cache.
      object key = new { enumType = t, attrType = a, attrValue = attrValue };
      object cached = _enumFromAttributeValueCache[key];
      if (cached != null)
        return (TEnum)cached;

      Array enumVals = t.GetEnumValues();
      foreach (TEnum enumVal in enumVals)
      {
        object attrVal = GetAttributeValue<TEnum, TAttribute>(enumVal);
        if (attrVal.Equals(attrValue))
        {
          _enumFromAttributeValueCache.Add(key, enumVal);
          return enumVal;
        }
      }

      //At this point there was no matching enum for the attribute value provided.
      throw new ArgumentOutOfRangeException("attrValue");
    }

    private static Hashtable _attributeValueFromEnumValueCache = new Hashtable();
    public static object GetAttributeValue<TEnum, TAttribute>(TEnum enumValue)
    {
      if (enumValue == null) 
        throw new ArgumentNullException("enumValue");

      Type t = typeof(TEnum);
      Type a = typeof(TAttribute);
      object key = new { 
        enumType = t, 
        attrType = a, 
        enumVal = enumValue 
      };
      object cached = _attributeValueFromEnumValueCache[key];
      if (cached != null)
        return cached;

      IAttribute attr = 
        AttributeFromEnumValue<TEnum, TAttribute>(enumValue);
      _attributeValueFromEnumValueCache.Add(key, attr.Value);
      return attr.Value;
    }

    private static Hashtable _attributeValueFromAttributeValueCache = new Hashtable();
    public static object GetAttributeValue<TEnum, 
      TSourceAttribute, TTargetAttribute>(object sourceAttrValue)
    {
      if (sourceAttrValue == null) 
        throw new ArgumentNullException("sourceAttrValue");

      Type t = typeof(TEnum);
      Type s = typeof(TSourceAttribute);
      Type d = typeof(TTargetAttribute);
      object key = new { 
        enumType = t, 
        srcAttr = s, 
        dstAttr = d, 
        srcVal = sourceAttrValue 
      };
      object cached = _attributeValueFromAttributeValueCache[key];
      if (cached != null)
        return cached;

      IAttribute attr = AttributeFromAttributeValue<TEnum, 
        TSourceAttribute, TTargetAttribute>(sourceAttrValue);
      object attrVal = attr.Value;
      _attributeValueFromAttributeValueCache.Add(key, attrVal);
      return attrVal;
    }
  }
}
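
That class leans on definitions carried over from the previous post. For completeness, here's a minimal sketch of what the supporting types look like - IAttribute, the DatabaseValue/DisplayValue attributes and the DemoEnum below are reconstructed to match the usage in this post rather than copied verbatim from the last one:

using System;

namespace Utilities.Enums
{
  //Common interface so the helper can read the value off any of our attributes.
  public interface IAttribute
  {
    object Value { get; }
  }

  [AttributeUsage(AttributeTargets.Field)]
  public class DatabaseValue : Attribute, IAttribute
  {
    public object Value { get; private set; }
    public DatabaseValue(object value) { Value = value; }
  }

  [AttributeUsage(AttributeTargets.Field)]
  public class DisplayValue : Attribute, IAttribute
  {
    public object Value { get; private set; }
    public DisplayValue(object value) { Value = value; }
  }

  public enum DemoEnum
  {
    [DatabaseValue("1st")][DisplayValue("First Value")]
    First,
    [DatabaseValue("5th")][DisplayValue("Fifth Value")]
    Fifth
  }
}

With those in place, a simple lookup like Enums.GetAttributeValue<DemoEnum, DisplayValue>(DemoEnum.Fifth) returns "Fifth Value".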

We can also convert between one attribute value and another using the same GetAttributeValue method, but this time calling the generic overload Enums.GetAttributeValue<EnumType, SourceAttributeType, TargetAttributeType>(object sourceAttributeValue):

string DbValue = (string)
  Enums.GetAttributeValue<DemoEnum, DisplayValue, DatabaseValue>("Fifth Value");

As you can see by stepping through any of these methods, the first thing we do is prepare a key (where the lookup needs anything more complex than a simple value) and check whether we already have a previous result cached in the corresponding Hashtable, which serves as a static cache. If we've got a value cached, it's returned early. If no value is cached, we look up the correct value and add it to the cache before returning it.

Because this is a static class and each of the Hashtables is a static field, the values cached by the lookups of the first user after the application is loaded into the web server mean that every other user benefits from that first user's pain - not that there's that much to start with. Using this model, we see performance improve from the order of 800ms per 100,000 iterations to around 50ms per 100,000 iterations (on my machine). The large majority of the remaining overhead is the initial cache check and adding results to the cache. Every stage of the lookup is cached, so every subsequent stage benefits from the caching of the lower levels. Consequently, if someone has already looked up an attribute on the fifth enum value, all of the attributes for that enum value have already been cached, and all we've got to do is get the value from the cached attribute.
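
If you want to reproduce that measurement on your own machine, something like this Stopwatch loop is all I'm doing (your numbers will differ):

using System;
using System.Diagnostics;
using Utilities.Enums;

class CacheBenchmark
{
    static void Main()
    {
        // The first call pays the reflection cost and primes the caches.
        Enums.GetAttributeValue<DemoEnum, DisplayValue>(DemoEnum.Fifth);

        Stopwatch timer = Stopwatch.StartNew();
        for (int i = 0; i < 100000; i++)
            Enums.GetAttributeValue<DemoEnum, DisplayValue>(DemoEnum.Fifth);
        timer.Stop();

        Console.WriteLine("100,000 cached lookups: {0}ms", timer.ElapsedMilliseconds);
    }
}

One caveat worth knowing: Hashtable is only safe for multiple readers with a single writer, so under heavy concurrency two threads can race to Add the same key on first hit and throw; a lock around the add would close that gap.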

This is a dramatic performance increase - some 16 times faster than our original code - and though the large majority of the techniques introduced in the last post are still present, there has been some refactoring to keep the code as clean as possible.

Of course, like anything, you need to use these techniques with care - caching isn't a get out of jail free card. What the old method lacked in performance, it made up for with more meagre memory requirements; we've now traded those two sides of the coin. Throwing more memory at the problem has reduced the need for processing power, but in some cases the increased memory overhead may actually turn out to be detrimental to your application. You have to pick the right strategy for your application - hopefully this was the right strategy for yours, but your mileage may vary given your specific circumstances.

Happy reflecting!
