Preventing bugs, and improving code quality with Microsoft SAL (Part 2, custom preconditions for structs & objects)

Note: At the end of part 1, I’d suggested that part 2 would be about invalid handles. This post however isn’t about invalid handles. That’ll be the next part in this series.

I very frequently come across code like this:

void dosomething( _In_ somestruct* in ) {
    //Not detected at compile time
    assert(in->some_precondition);
    }

The function, dosomething, expects/requires a specific condition. So, the author asserts it, and resolves to test every possible code path, thus verifying code correctness. When all their tests pass, they build in release mode, and the assert disappears.

Some time later, someone calls dosomething under different conditions, all of their tests pass, and they release the modified code. Users start seeing their data mysteriously corrupt itself, and their once-reliable program is crashing in ways that nobody can reproduce. After several days worth of investigation, developers thoroughly audit the code, and spot the new (and incorrect) code path. They refactor the faulty code path, add a VERY LOUD comment to dosomething, and forget about the whole thing.

Some number of months later, the same thing happens. This time, a developer has a bright idea: Let’s return an error code! Whoever wrote the code without a proper check for invalid arguments is an idiot!

The function now looks like this:

HRESULT dosomething( _In_ somestruct* in ) {
    //Not detected at compile time
    assert(in->some_precondition);
    if ( !in->some_precondition ) {
        return E_INVALIDARG;
        }
    }

We’re better off now, or at least we think so.

Now of course, everybody has to check the return code, which means more complexity, more ways to go wrong, and more code paths to test. Since there are far too many code paths to test every single one, some end user might still get to one of the untested, and also invalid, path.

Yet a third developer comes along, and thinks: What a silly C programmer. Someone will inevitably ignore the return value and the error will go unnoticed. Even worse, because I have to check the return value every single time, I can’t compose this function, and thus it’s less practical. This is an exceptional case, so we should treat it as one.

The function now looks like this:

void dosomething( _In_ somestruct* in ) {
    //Not detected at compile time
    assert(in->some_precondition);
    if ( !in->some_precondition ) {
        throw std::invalid_argument( "Function preconditions not met!" );
        }
    }

We’re a little bit better off now, at least we’re pretty sure that we are.

Now, exception safety becomes an issue. We still have to test every possible code path to make sure that none of them incorrectly call dosomething. We still can’t be confident about it.

Perhaps somebody else comes along, and writes a wrapper object, thus only “allowing” correct uses of dosomething. Now everybody has to rewrite their code use that wrapper.

Complexity has increased, this function has been refactored thrice, and we still haven’t even found a way to be code-correctness-confident, at compile-time, for all the uses of dosomething.

The cycle repeats, no end in sight.

Nearly every program stops there – they choose error codes or C++ exceptions, to carefully handle bad logic WHEN it happens. Preventing them from happening, well that becomes an issue of “professionalism” and “discipline”.

Neither “professionalism” nor “discipline”, actually prevent the mistakes of logic, but blaming the mistake on a given individual’s moral or personal failings works nicely to…well I’m not actually sure; I have noticed it is a fairly pervasive and pathological mindset in case studies of engineering disasters (Therac-25, Toyota electronic throttle malfunctions, others). But, I digress.

A few examples from Microsoft’s headers

One of Microsoft’s motivating factors in developing SAL, was the infamous blue screen. Drivers. particularly kernel-mode drivers, are extremely sensitive to misuse. A failure in the kernel can mean system instability, serious security vulnerabilities, or massive data corruption. Furthermore, the kernel APIs that drivers rely on are often very complex, interacting with many components of hardware and software – and thus very hard to get right.

Bluescreens are extremely visible, and have a tremendous impact on overall user satisfaction and trust, thus Microsoft needed a way for driver developers to validate their own code, in a very scalable and reliable manner.

Functions like ExAllocatePoolWithTag, are heavily annotated (from km/wdm.h, 8.1 SDK):

__drv_allocatesMem(Mem)
_When_((PoolType & PagedPool) != 0, _IRQL_requires_max_(APC_LEVEL))
_When_((PoolType & PagedPool) == 0, _IRQL_requires_max_(DISPATCH_LEVEL))
_When_((PoolType & NonPagedPoolMustSucceed) != 0,
       __drv_reportError("Must succeed pool allocations are forbidden. "
                         "Allocation failures cause a system crash"))
_When_((PoolType & (NonPagedPoolMustSucceed |
                    POOL_RAISE_IF_ALLOCATION_FAILURE)) == 0,
       _Post_maybenull_ _Must_inspect_result_)
_When_((PoolType & (NonPagedPoolMustSucceed |
                    POOL_RAISE_IF_ALLOCATION_FAILURE)) != 0,
       _Post_notnull_)
_Post_writable_byte_size_(NumberOfBytes)
NTKERNELAPI
PVOID
NTAPI
ExAllocatePoolWithTag (
    _In_ __drv_strictTypeMatch(__drv_typeExpr) POOL_TYPE PoolType,
    _In_ SIZE_T NumberOfBytes,
    _In_ ULONG Tag
 );

The When annotations are specifying the precise hardware conditions where Developers can call ExAllocatePoolWithTag. Calling outside the correct context will result in an immediate bluescreen, saying something similar to the IRQL_NOT_LESS_OR_EQUAL that we’ve all seen at some point.

Another interesting function is IoBuildSynchronousFsdRequest, which is interesting because the In/Out annotation is conditional based on whether the function is reading or writing data. From km/wdm.h, 8.1 SDK:


_When_(MajorFunction == IRP_MJ_WRITE, _At_(Buffer, _In_))
_When_(MajorFunction == IRP_MJ_READ, _At_(Buffer, _Out_))
_When_(MajorFunction != IRP_MJ_READ && MajorFunction != IRP_MJ_WRITE, _At_(Buffer, _Pre_null_))
_Must_inspect_result_
__drv_aliasesMem
_IRQL_requires_max_(PASSIVE_LEVEL)
NTKERNELAPI
PIRP
IoBuildSynchronousFsdRequest(
    _In_ ULONG MajorFunction,
    _In_ PDEVICE_OBJECT DeviceObject,
    PVOID Buffer,
    _In_ ULONG Length,
    _In_opt_ PLARGE_INTEGER StartingOffset,
    _In_ PKEVENT Event,
    _Out_ PIO_STATUS_BLOCK IoStatusBlock
 );

Rolling your own preconditions

I think that’s enough ugly header code. Back to contrived examples!

The code we had before was as follows:

void dosomething( _In_ somestruct* in ) {
    //Not detected at compile time
    assert(in->some_precondition);
    if ( !in->some_precondition ) {
        throw std::invalid_argument( "Function preconditions not met!" );
        }
    }

I personally don’t like exceptions that much, and in the event of an actual invalid parameter, I’d rather log some debugging info and std::terminate( ).

Here, we’ll avoid catching exceptions (I typically avoid catching them anyways), and thus the code will be cleaner. You’ll be able to see the source on GitHub, at the end of the article.

The problem code from the beginning of this post is really easy check with SAL. The annotated function is:

_Pre_satisfies_( in->flag )
void dosomething( _In_ somestruct* in ) {
    //Not detected at compile time
    assert(in-> flag);
    if ( !in->flag ) {
        throw std::invalid_argument( "Function preconditions not met!" );
        }
    }

When I compile this with Visual Studio 2013 Code Analysis, I get a warning:

C28020 <strong>The expression is not true at this call</strong>
The expression '_Param_(1)->flag' is not true at this call.

Clicking the warning highlights the callsite, which is really handy on a large codebase.

If I want to, I can even configure Visual Studio to consider the build a failure if it detects C28020 – although it’s a bit of a counter-intuitive trick to enable.

You’ll have to open the ruleset in the project properties window:

Then, find the rule that you’d like to change:

Lastly, hit save (I’m a keyboard person, so I hit CTRL + S).

Now let’s step it up just a notch.

Say the function has been in use for several years, and has grown a whole bunch of ugly hairs. It now conditionally assigns to/from a pass-by-pointer parameter. Nobody remembers why this ability was added, but the function is used a few thousand times across the whole codebase. You’re not in a good position to fix it, so you decide to make it more robust, and aggressively detect incorrect uses.

This does happen in real world scenarios (mirror: Bruce Dawson hits thousands of warnings in Chrome, Twitter). And SAL can help.

Here’s the new version of the function:

_Pre_satisfies_( in->flag )
void dosomethingelse( _In_ somestruct* const in, int* const some_number ) {
    //Not detected at compile time
    assert(in-> flag);
    if ( !in->flag ) {
        throw std::invalid_argument( "Function preconditions not met!" );
        }
    if ( in->member <= 0 ) {
        ( *some_number ) = in->member;
        }
    else {
        in->member = ( *some_number );
        }
    }

We still have the Pre_satisfies( in->flag ) precondition, but we have this other damned variable (at least it’s a const pointer!).

So, here’s what we know, in addition to the in->flag precondition:

when in->member <= 0, the parameter `some_number` is written to.
when in->member > 0, the parameter `some_number` is read from.

Thankfully, we can express this almost literally in SAL. The annotations that we need to add:

_When_( in->member > 0, _At_( some_number, _In_ ) )
_When_( in->member <= 0, _At_( some_number, _Out_ ) )

The final function looks like this:

_Pre_satisfies_( in->flag )
_When_( in->member > 0, _At_( some_number, _In_ ) )
_When_( in->member <= 0, _At_( some_number, _Out_ ) )
void dosomethingelse( _In_ somestruct* const in, int* const some_number ) {
    //Not detected at compile time
    assert(in-> flag);
    if ( !in->flag ) {
        throw std::invalid_argument( "Function preconditions not met!" );
        }
    if ( in->member <= 0 ) {
        ( *some_number ) = in->member;
        }
    else {
        in->member = ( *some_number );
        }
    }

Sure enough, when I compile this with some test code, I get C6001: Using uninitialized memory.

Final thoughts

I want to remind you that while SAL helps verify the correctness of your code, it is not a substitute for proper error/exception handling. Although a properly annotated API will make VS yell at you for ignoring return values, and passing invalid parameters, people can (and will) still (void) those return values away.
Thoroughly annotating a codebase with SAL seems to have a tremendous impact on the run time of code analysis. (It’s MUCH faster)

Resources

OSR, a good resource of information on SAL since the stone age, has a great article: SAL Annotations Don’t Hate Me Because I’m Beautiful – OSR

I’ve, of course, mirrored it: SAL Annotations_ Don’t Hate Me Because I’m Beautiful – OSR

Also: I’ve been ~~poorly~~ curating a repository of SAL example usage – at the moment it’s mostly from the Windows headers, but I’m looking for other examples.

You can find it here: https://github.com/ariccio/SALExamples

You can find the source code for this article in that repository, but here’s a direct link: https://github.com/ariccio/SALExamples/blob/master/StructMember/StructMember/structmember.cpp

~ by Alexander Riccio on April 2, 2015.

Posted in Computing, Featured Posts, Science/Innovation
Tags: Code Analysis, Formal Verification, Microsoft SAL, Source Annotation Language, Static analysis

Alexander Riccio