Monday, October 29, 2012

State Machines Made Simple

I recently stumbled across a bug in some code where our embedded device would sometimes fail to take the correct behavior on disconnecting from our back-end servers.   This was actually the third fairly major bug in this particular code. When I dumped the code to a state machine using gliffy, it became clear we had a problem with state.  It looked something like:

Okay, okay....it wasn't that bad.  But it was very, very far from good.  Here's a (vastly) simplified overview of the flow of the code:

bool someState;
bool anotherState;
bool yetAnotherState;
while( someState )
{
    // some initial work...
}

while( !anotherState )
{
    while( ... )
    {
        // some initial work again...
    }
    // Wait here until something happens 

    // ruh roh, a disconnect, loop around...
    if( yetAnotherState)
    {
        // special blah here...
    }
}

...full version of the "driver" method was ~200 lines of code, not counting methods that loosely corresponded with the state transitions.  State was largely tracked with a motely assortment of boolean values.  The nested while loops were difficult to understand, as was the initial loop that seemingly duplicated the first inner loop.  Transitions between state were littered between the driver method and the various state methods, in part because there really was no clear home for them.

If you can clearly communicate your process with a state machine, it can be translated using a pretty typical design pattern that A) makes it very easy to understand, B) very easy to modify in the future and C) very reliable.  For instance, consider this basic (and yes, slightly silly) state machine--I was doing laundry this evening, so this was what immediately came to mind:


I obviously make no claims that this method of doing laundry is correct (I will claim my wife has drilled me into adding bleach to the water before putting the clothes in, for what it's worth).  But that's really not the point.  The point is this can translate quickly, cleanly and reliably to working code.  Start by defining all your states:

enum WashingState
{
    Start,
    AddDetergent,
    StartWash,
    AddBleach,
    FillWithClothes,
    WaitUntilComplete,
    OhNoesGoToStore
}

Then, in code, do the following:

while( not done )
{
    switch( current state )
    {
    case Start:
    case AddDetergent:
        current state = have detergent ? StartWash : OhNoesGoToStore;
        break;
    case StartWash:
        // start the wash
        current state = AddBleach;
        break;
    case AddBleach:
        current state = have bleach ? FillWithClothes : OhNoesGoToStore;
        break;
    case FillWithClothes:
        // Fill with clothes...
        current state = WaitUntilComplete;
        break;
    case WaitUntilComplete:
        // wait until done
        current state = Start;
        break;
    case OhNoesGoToStore:
        // return....nothing left to do.
        return;
    }
}

In a nutshell: your enum values correspond with states.  The switch statement handles transitioning between states, and the while loop causes the switch statement to repeat continuously, until told not to.

In some code it may be more appropriate not to have a long-running process, in which case having single methods that allow atomic transition of state is probably a better way to go--but even then, the pattern really remains the same--individual methods correspond with your states, and those methods allow atomic transitions from one state to the next.

But what I enjoy about the "while( not done ) { switch( state ) {} }" pattern--particularly when it's being applied to something like a long-running thread--is it's neat, well contained, easy to understand, and easy to extend and/or modify in the future.

A few passing comments before I close this out:
  • Ignore articles like this one that claim "there is no way to enforce state transition rules" in switch-based state machines.  This is clearly a claim desperately in search of a weird over-engineered solution; it is trivial to enforce state transition rules.
  • Avoid allowing outside forces to influence state, particularly if this is a dedicated thread.  If you must, the best option is to perform this transition immediately following the switch statement, before looping around the while, to "trump" whatever choice made by the case label.  Not pretty, but neither is life sometimes.
  • Keep It Simple Stupid. There are all sorts of fancy methods out there--hierarchical state machines, crazy inheritance, template wizardry (there is some relationship between Dr. Dobbs and a total inability to embrace simplicity), etc.  It is super awesome you can make C++ jump through hoops, but please, spend thirty seconds thinking about the last time somebody handed you a plate of completely incomprehensible code.
  • Consistency matters.  Not for you, probably, but for the next person to maintain your code.  Try to be consistent about where state transitions happen, locking (if necessary), and structure of the code.