E.12 Explicit vtable structures to implement traits

A lot of PuTTY's code is written in a style that looks structurally rather like an object-oriented language, in spite of PuTTY being a pure C program.

For example, there's a single data type called ssh_hash, which is an abstraction of a secure hash function, and a bunch of functions called things like ssh_hash_foo that operate on objects of that type. But in fact, PuTTY supports many different hash functions, and each one has to provide its own implementation of those functions.

In C++ terms, this is rather like having a single abstract base class, and multiple concrete subclasses of it, each of which fills in all the pure virtual methods in a way that's compatible with the data fields of the subclass. The implementation is more or less the same, as well: in C, we do explicitly in the source code what the C++ compiler will be doing behind the scenes at compile time.

But perhaps a closer analogy in functional terms is the Rust concept of a ‘trait’, or the Java idea of an ‘interface’. C++ supports a multi-level hierarchy of inheritance, whereas PuTTY's system – like traits or interfaces – has only two levels, one describing a generic object of a type (e.g. a hash function) and another describing a specific implementation of that type (e.g. SHA-256).

The PuTTY code base has a standard idiom for doing this in C, as follows.

Firstly, we define two struct types for our trait. One of them describes a particular kind of implementation of that trait, and it's full of (mostly) function pointers. The other describes a specific instance of an implementation of that trait, and it will contain a pointer to a const instance of the first type. For example:

typedef struct MyAbstraction MyAbstraction;
typedef struct MyAbstractionVtable MyAbstractionVtable;

struct MyAbstractionVtable {
    MyAbstraction *(*new)(const MyAbstractionVtable *vt);
    void (*free)(MyAbstraction *);
    void (*modify)(MyAbstraction *, unsigned some_parameter);
    unsigned (*query)(MyAbstraction *, unsigned some_parameter);
};

struct MyAbstraction {
    const MyAbstractionVtable *vt;
};

Here, we imagine that MyAbstraction might be some kind of object that contains mutable state. The associated vtable structure shows what operations you can perform on a MyAbstraction: you can create one (dynamically allocated), free one you already have, or call the example methods ‘modify’ (to change the state of the object in some way) and ‘query’ (to return some value derived from the object's current state).

(In most cases, the vtable structure has a name ending in ‘vtable’. But for historical reasons a lot of the crypto primitives that use this scheme – ciphers, hash functions, public key methods and so on – instead have names ending in ‘alg’, on the basis that the primitives they implement are often referred to as ‘encryption algorithms’, ‘hash algorithms’ and so forth.)

Now, to define a concrete instance of this trait, you'd define a struct that contains a MyAbstraction field, plus any other data it might need:

typedef struct MyImplementation MyImplementation;

struct MyImplementation {
    unsigned internal_data[16];
    SomeOtherType *dynamic_subthing;

    MyAbstraction myabs;
};

Next, you'd implement all the necessary methods for that implementation of the trait, in this kind of style:

static MyAbstraction *myimpl_new(const MyAbstractionVtable *vt)
{
    MyImplementation *impl = snew(MyImplementation);
    memset(impl, 0, sizeof(*impl));
    impl->dynamic_subthing = allocate_some_other_type();
    impl->myabs.vt = vt;
    return &impl->myabs;
}

static void myimpl_free(MyAbstraction *myabs)
{
    MyImplementation *impl = container_of(myabs, MyImplementation, myabs);
    free_other_type(impl->dynamic_subthing);
    sfree(impl);
}

static void myimpl_modify(MyAbstraction *myabs, unsigned param)
{
    MyImplementation *impl = container_of(myabs, MyImplementation, myabs);
    impl->internal_data[param] += do_something_with(impl->dynamic_subthing);
}

static unsigned myimpl_query(MyAbstraction *myabs, unsigned param)
{
    MyImplementation *impl = container_of(myabs, MyImplementation, myabs);
    return impl->internal_data[param];
}
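
The methods above rely on container_of to get from the MyAbstraction pointer back to the enclosing implementation structure. As a rough illustration of the idea, here is a minimal sketch that assumes a conventional offsetof-based definition of the macro; that definition is an assumption made for this example rather than a quotation from the PuTTY headers, and Inner, Outer and the two helper functions are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed definition, for illustration only: step back from a pointer
 * to a member to the start of the structure that contains it. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for the embedded 'trait' structure. */
typedef struct Inner { int tag; } Inner;

/* Hypothetical stand-in for an implementation structure. */
typedef struct Outer {
    unsigned before[4];   /* pushes 'inner' away from offset 0 */
    Inner inner;
} Outer;

/* Recover the enclosing Outer from a pointer to its 'inner' field. */
static Outer *outer_from_inner(Inner *in)
{
    return container_of(in, Outer, inner);
}

/* Round-trip check: returns 1 if we get back to the same object. */
static int container_of_roundtrip_ok(void)
{
    Outer o = { {0}, {7} };
    Outer *recovered = outer_from_inner(&o.inner);
    assert(recovered == &o);
    return recovered->inner.tag == 7;
}
```

Because the macro subtracts the field's offset, the embedded structure does not have to be the first member of the implementation structure, which matches the layout of MyImplementation above.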

Having defined those methods, now we can define a const instance of the vtable structure containing pointers to them:

const MyAbstractionVtable MyImplementation_vt = {
    .new = myimpl_new,
    .free = myimpl_free,
    .modify = myimpl_modify,
    .query = myimpl_query,
};

In principle, this is all you need. Client code can construct a new instance of a particular implementation of MyAbstraction by digging out the new method from the vtable and calling it (with the vtable itself as a parameter), which returns a MyAbstraction * pointer that identifies a newly created instance, in which the vt field will contain a pointer to the same vtable structure you passed in. And once you have an instance object, say MyAbstraction *myabs, you can dig out one of the other method pointers from the vtable it points to, and call that, passing the object itself as a parameter.
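
To make that concrete, here is a minimal self-contained sketch of what such direct calls through the vtable might look like. It repeats the MyAbstraction definitions from above, but everything else is an assumption made for the sake of a short example: Accumulator and its acc_* methods are a hypothetical implementation, plain malloc and free stand in for snew and sfree, and container_of is given an assumed offsetof-based definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumed definition, for illustration only. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

typedef struct MyAbstraction MyAbstraction;
typedef struct MyAbstractionVtable MyAbstractionVtable;

struct MyAbstractionVtable {
    MyAbstraction *(*new)(const MyAbstractionVtable *vt);
    void (*free)(MyAbstraction *);
    void (*modify)(MyAbstraction *, unsigned some_parameter);
    unsigned (*query)(MyAbstraction *, unsigned some_parameter);
};

struct MyAbstraction {
    const MyAbstractionVtable *vt;
};

/* Hypothetical implementation: keeps a single running total. */
typedef struct Accumulator {
    unsigned total;
    MyAbstraction myabs;
} Accumulator;

static MyAbstraction *acc_new(const MyAbstractionVtable *vt)
{
    Accumulator *acc = malloc(sizeof(*acc));
    if (!acc)
        abort();
    acc->total = 0;
    acc->myabs.vt = vt;
    return &acc->myabs;
}
static void acc_free(MyAbstraction *myabs)
{ free(container_of(myabs, Accumulator, myabs)); }
static void acc_modify(MyAbstraction *myabs, unsigned param)
{ container_of(myabs, Accumulator, myabs)->total += param; }
static unsigned acc_query(MyAbstraction *myabs, unsigned param)
{ return container_of(myabs, Accumulator, myabs)->total * param; }

static const MyAbstractionVtable Accumulator_vt = {
    .new = acc_new,
    .free = acc_free,
    .modify = acc_modify,
    .query = acc_query,
};

/* Calling through the vtable by hand: each call has to mention the
 * vtable or the object twice. */
static unsigned raw_vtable_demo(void)
{
    const MyAbstractionVtable *vt = &Accumulator_vt;
    MyAbstraction *myabs = vt->new(vt);
    assert(myabs->vt == vt);          /* same vtable we passed in */
    myabs->vt->modify(myabs, 10);
    myabs->vt->modify(myabs, 5);
    unsigned output = myabs->vt->query(myabs, 2);  /* (10 + 5) * 2 */
    myabs->vt->free(myabs);
    return output;
}
```

Every call names myabs once to find the method and once more as its argument, and that repetition is what makes this style awkward at call sites.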

But in fact, we don't do that, because it looks pretty ugly at all the call sites. Instead, what we generally do in this code base is to write a set of static inline wrapper functions in the same header file that defined the MyAbstraction structure types, like this:

static inline MyAbstraction *myabs_new(const MyAbstractionVtable *vt)
{ return vt->new(vt); }
static inline void myabs_free(MyAbstraction *myabs)
{ myabs->vt->free(myabs); }
static inline void myabs_modify(MyAbstraction *myabs, unsigned param)
{ myabs->vt->modify(myabs, param); }
static inline unsigned myabs_query(MyAbstraction *myabs, unsigned param)
{ return myabs->vt->query(myabs, param); }

And now call sites can use those reasonably clean-looking wrapper functions, and shouldn't ever have to directly refer to the vt field inside any myabs object they're holding. For example, you might write something like this:

MyAbstraction *myabs = myabs_new(&MyImplementation_vt);
myabs_modify(myabs, 10);
unsigned output = myabs_query(myabs, 2);
myabs_free(myabs);

and then all this code can use a different implementation of the same abstraction just by changing which vtable pointer is passed in on the first line.
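
As a sketch of that last point, the following self-contained example runs the same client function against two different vtables. Sum_vt, Max_vt, Tally and run_client are hypothetical names invented for this illustration, malloc and free stand in for snew and sfree, and container_of is given an assumed offsetof-based definition. The two implementations share one state structure only to keep the example short; real implementations would normally each have their own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumed definition, for illustration only. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

typedef struct MyAbstraction MyAbstraction;
typedef struct MyAbstractionVtable MyAbstractionVtable;

struct MyAbstractionVtable {
    MyAbstraction *(*new)(const MyAbstractionVtable *vt);
    void (*free)(MyAbstraction *);
    void (*modify)(MyAbstraction *, unsigned some_parameter);
    unsigned (*query)(MyAbstraction *, unsigned some_parameter);
};

struct MyAbstraction {
    const MyAbstractionVtable *vt;
};

/* Wrapper functions, as described above. */
static inline MyAbstraction *myabs_new(const MyAbstractionVtable *vt)
{ return vt->new(vt); }
static inline void myabs_free(MyAbstraction *myabs)
{ myabs->vt->free(myabs); }
static inline void myabs_modify(MyAbstraction *myabs, unsigned param)
{ myabs->vt->modify(myabs, param); }
static inline unsigned myabs_query(MyAbstraction *myabs, unsigned param)
{ return myabs->vt->query(myabs, param); }

/* Hypothetical state shared by both example implementations. */
typedef struct Tally {
    unsigned value;
    MyAbstraction myabs;
} Tally;

static MyAbstraction *tally_new(const MyAbstractionVtable *vt)
{
    Tally *t = malloc(sizeof(*t));
    if (!t)
        abort();
    t->value = 0;
    t->myabs.vt = vt;
    return &t->myabs;
}
static void tally_free(MyAbstraction *myabs)
{ free(container_of(myabs, Tally, myabs)); }
static unsigned tally_query(MyAbstraction *myabs, unsigned param)
{ return container_of(myabs, Tally, myabs)->value + param; }

/* Implementation 1: modify adds its parameter to the stored value. */
static void sum_modify(MyAbstraction *myabs, unsigned param)
{ container_of(myabs, Tally, myabs)->value += param; }

/* Implementation 2: modify keeps the largest parameter seen so far. */
static void max_modify(MyAbstraction *myabs, unsigned param)
{
    Tally *t = container_of(myabs, Tally, myabs);
    if (param > t->value)
        t->value = param;
}

static const MyAbstractionVtable Sum_vt = {
    .new = tally_new, .free = tally_free,
    .modify = sum_modify, .query = tally_query,
};
static const MyAbstractionVtable Max_vt = {
    .new = tally_new, .free = tally_free,
    .modify = max_modify, .query = tally_query,
};

/* Client code: only the vtable argument selects the behaviour. */
static unsigned run_client(const MyAbstractionVtable *vt)
{
    MyAbstraction *myabs = myabs_new(vt);
    myabs_modify(myabs, 3);
    myabs_modify(myabs, 10);
    myabs_modify(myabs, 4);
    unsigned output = myabs_query(myabs, 0);
    myabs_free(myabs);
    return output;
}
```

Here run_client(&Sum_vt) yields 17 and run_client(&Max_vt) yields 10, although the body of run_client never mentions either implementation by name.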

Some things to note about this system: