Optional and name-based arguments in C

Introduction

It’s inevitable that some functions’ signatures become unwieldy. This is especially true as a project ages, and developers who were not involved in the initial vision are asked to extend and tweak existing functionality.

In a high-level language, it’s often possible to change an existing function without modifying all its consumers, for instance by adding a parameter that has a default value.

In C, however, that’s not the case (unless the function already happens to take variable arguments through stdarg.h). Once a C function’s signature changes, all consumers must be modified. Things get even more complicated if the function is part of a library’s public API.

It occurred to me that there are existing features of C that allow us to pre-emptively design functions in a way that future-proofs them in anticipation of these types of changes. I’m not advocating that all functions be designed this way. However, for the class of functions that look like candidates for future signature expansion, my proposal may provide some ideas.

Existing stepping stones

The main features we’ll be putting together for this method are:

Calling by-value

In C, all function arguments and return values are passed by value, not by reference. That means the callee receives a copy of each argument and cannot influence the caller’s copy.

Perhaps for the sake of performance, C developers generally take this rule to mean that structs must be passed as pointers so that the copying is avoided. That is not a requirement, however. It’s perfectly valid in C to pass a function a struct itself (not a pointer to a struct). The function’s signature would look like this:

int say_hello(struct person p);

The caller would call this function like so:

...
  struct person bob;
...
  say_hello(bob);
...

say_hello would receive a COPY of that struct, which means its local p will be at a different address than bob. Each member of p is set to the value of the corresponding member of bob. Pointer members are copied as pointers, so they still refer to the same underlying data.
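The copy semantics can be checked with a small sketch. The layout of struct person and the helper names birthday and age_after_call are assumptions made for illustration; the article does not define them.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical layout: the article never shows struct person's definition. */
struct person {
        const char *name;
        int age;
        const char *department;
};

/* Receives its own copy of the caller's struct (birthday is a made-up name). */
int birthday(struct person p, const struct person *callers_copy) {
        assert(&p != callers_copy);   /* the copy lives at another address */
        p.age++;                      /* changes the local copy only */
        return p.age;
}

/* Calls birthday() and returns the caller's age afterwards. */
int age_after_call(void) {
        struct person bob = {"Bob", 20, "Accounting"};
        int copy_age = birthday(bob, &bob);

        printf("copy: %d, original: %d\n", copy_age, bob.age);
        return bob.age;
}
```

Calling age_after_call() from main should print "copy: 21, original: 20", since the increment only ever touches the callee's copy.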

Initializing structs

Initializing a variable such as an int can be done like so:

int age = 20;

Similarly, initializing a struct can be done like so:

struct person bob = {"Bob", 20, "Accounting"};

The individual struct members are given, between braces, in the same order as in the struct declaration.

Missing members are allowed:

struct person bob = {"Bob", 20};

In the above case, any members not supplied will be initialized to 0 / NULL.

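It’s also possible to zero every member at once, in which case all members will be set to 0 / NULL. The portable way to write this is {0}; empty braces are only standard as of C23, although GCC and Clang accept them as an extension in earlier modes:

struct person bob = {0};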
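A short sketch of the zero-filling rule follows. The struct layout and the function names partial_person and blank_person are hypothetical. It uses {0} for the all-zero case because that form is accepted by every C standard.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout: the article never shows struct person's definition. */
struct person {
        const char *name;
        int age;
        const char *department;
};

/* Only the first two members are given; department is zero-initialized. */
struct person partial_person(void) {
        struct person p = {"Bob", 20};

        assert(p.department == NULL);
        return p;
}

/* {0} zeroes the first member explicitly and all the others implicitly. */
struct person blank_person(void) {
        struct person p = {0};

        assert(p.name == NULL && p.age == 0 && p.department == NULL);
        return p;
}
```

Some compilers warn about the missing initializers in partial_person when extra warnings are enabled; the behaviour is well defined regardless.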

Inlined initialization

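say_hello, which accepts a struct, may be called with a struct that is built right in the argument list. A bare brace list is not a valid expression in C, so the following will not compile:

say_hello({"Bob", 20, "Accounting"});

The braces must be prefixed with the parenthesized type name. This looks like a cast, but it is actually a C99 compound literal:

say_hello((struct person){"Bob", 20, "Accounting"});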
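Here is a minimal, compilable version of that call. The body of say_hello, its return value, and the struct layout are assumptions, since the article only gives the prototype.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical layout: the article never shows struct person's definition. */
struct person {
        const char *name;
        int age;
        const char *department;
};

/* Prints a greeting; returning printf's count is an arbitrary choice. */
int say_hello(struct person p) {
        assert(p.name != NULL);
        return printf("Hello, %s (%d), %s\n", p.name, p.age,
                      p.department ? p.department : "no department");
}

void greet_examples(void) {
        /* The parenthesized type name plus braces forms a compound literal. */
        say_hello((struct person){"Bob", 20, "Accounting"});

        /* Omitted trailing members are zeroed here as well. */
        say_hello((struct person){"Ann", 31});
}
```

The struct is created as an unnamed temporary at the call site, so the caller never has to declare a variable just to pass arguments.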

C99 Designated initializers

When initializing a struct, you have the option of referencing its individual members by name, instead of by position:

struct person bob = {
  .name       = "Bob",
  .age        = 20,
  .department = "Accounting"
};

Again, as with position-based initialization, members may be omitted to default them to 0 / NULL:

struct person bob = {
  .name       = "Bob"
};
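As a rough sketch, designated initializers may also be listed in any order, which is what makes them read like named arguments. The struct layout and the function accounting_member are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical layout: the article never shows struct person's definition. */
struct person {
        const char *name;
        int age;
        const char *department;
};

/* Builds a person from named members given out of declaration order. */
struct person accounting_member(const char *name) {
        struct person p = {
                .department = "Accounting",   /* declared last, listed first */
                .name       = name
                /* .age is omitted, so it becomes 0 */
        };

        assert(p.age == 0);
        return p;
}
```

Because each value is tied to a member name rather than a position, reordering or extending the struct declaration does not silently shift values into the wrong members.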

Variadic macros

While macros need no introduction to any seasoned C programmer, they’re typically declared to accept a fixed set of arguments. Variadic macros, introduced in C99, accept a variable number of arguments, which are available in the expansion as __VA_ARGS__. We’ll be using them for some final syntactic sugar, but they’re not strictly necessary.
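The sketch below shows why a variadic macro is a good fit for wrapping an initializer list. The macro PERSON, the function age_of, and the struct layout are made up for illustration.

```c
#include <assert.h>

/* Hypothetical layout: the article never shows struct person's definition. */
struct person {
        const char *name;
        int age;
        const char *department;
};

/*
 * A one-parameter macro such as
 *     #define PERSON1(init) ((struct person){init})
 * would reject PERSON1(.name = "Bob", .age = 20), because the comma makes
 * the preprocessor see two arguments. A variadic macro forwards them all.
 */
#define PERSON(...) ((struct person){__VA_ARGS__})

int age_of(struct person p) {
        return p.age;
}

void person_macro_demo(void) {
        int age = age_of(PERSON(.name = "Bob", .age = 20));

        assert(age == 20);
}
```

The macro is pure text substitution: PERSON(.name = "Bob", .age = 20) expands to the compound literal ((struct person){.name = "Bob", .age = 20}).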

Putting it all together

We’ll assume we’re writing a find function that, given a set of criteria, searches an employee directory and returns matches.

The implementation of the search and how the result is returned aren’t important. What matters is designing a signature for that function that anticipates inevitable future changes to the supplied search criteria.

/* nba.h */

#ifndef NBA_H
#define NBA_H

struct _find_args {
        char * name;
        int min_age;
        int max_age;
};
#define FindArgs(...) ((struct _find_args){__VA_ARGS__})

int find(struct _find_args fa);
#define Find(...) (find(FindArgs(__VA_ARGS__)))

#endif

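The implementation of find itself, which just prints the criteria it receives to demonstrate consuming the input, looks like so:

#include <stdio.h>
#include "nba.h"

/* Prints every criterion the caller supplied; an age of -1 means "not set". */
int find(struct _find_args fa) {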

        printf("Looking for criteria:\n");

        if (fa.name) {
                printf("\tName: [%s]\n", fa.name);
        }
        if (fa.min_age != -1) {
                printf("\tMin age: [%d]\n", fa.min_age);
        }
        if (fa.max_age != -1) {
                printf("\tMax age: [%d]\n", fa.max_age);
        }

        return(1);
}

Finally, the caller invokes it through the Find macro (notice the capitalization). Here’s a battery of sample calls:

int main(int argc, char ** argv) {

        char *n;
        char *names[] = {"Albert", "Jane", NULL};
        int i;

        /* Static input */
        Find(
                .name = "Bob",
                .min_age = -1,
                .max_age = 20
        );

        /* Dynamic input */
        n = "Mary";
        Find(
                .name = n,
                .min_age = 8,
                .max_age = -1
        );

        /* Dynamic input in loop */
        for (i=0; (n = names[i]); i++) {
                Find(
                        .name = n,
                        .min_age = -1,
                        .max_age = -1
                );
        }

        return(0);

}

When run, the output is:

$ ./nba
Looking for criteria:
        Name: [Bob]
        Max age: [20]
Looking for criteria:
        Name: [Mary]
        Min age: [8]
Looking for criteria:
        Name: [Albert]
Looking for criteria:
        Name: [Jane]

End result and viability

The end result is demonstrated in the main function above.

The pros are:

Extremely clear naming for the arguments being given to the function
No need to supply arguments that are not needed
Ability to add new members to the struct without altering existing code that consumes that function
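The last point can be sketched as follows. The department member is a hypothetical later addition, the struct is named find_args here (without the leading underscore), and find returns a count of active criteria purely so the result can be checked.

```c
#include <assert.h>
#include <stddef.h>

/* The article's criteria plus one hypothetical member added later. */
struct find_args {
        const char *name;
        int min_age;
        int max_age;
        const char *department;   /* hypothetical later addition */
};

/* Stand-in for a real search: returns how many criteria are active. */
int find(struct find_args fa) {
        int active = 0;

        if (fa.name)          active++;
        if (fa.min_age != -1) active++;
        if (fa.max_age != -1) active++;
        if (fa.department)    active++;   /* NULL for callers that predate it */
        return active;
}
#define Find(...) (find((struct find_args){__VA_ARGS__}))

void old_and_new_callers(void) {
        /* Written before department existed; compiles unchanged. */
        int old_style = Find(.name = "Bob", .min_age = -1, .max_age = 20);

        /* A newer caller opts in to the extra criterion. */
        int new_style = Find(.name = "Bob", .min_age = -1, .max_age = -1,
                             .department = "Accounting");

        assert(old_style == 2);
        assert(new_style == 2);
}
```

This compatibility is at the source level only. Adding a member changes the size of the struct, so existing callers still need to be recompiled against the new header.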

The cons are:

Passing a struct by value may carry more overhead than passing the same values as regular function parameters, depending on the struct’s size and the platform’s calling convention
Missing members defaulting to 0 / NULL is ambiguous when 0 / NULL is also a meaningful explicit value for a given member. That is why the above example explicitly sets min_age / max_age to -1 to indicate non-interest. Another approach would be to pair the member with a second member that says whether to “use” the first, for example age coupled with use_age
Variadic macros may not be available everywhere, since they require C99 (as do designated initializers and compound literals). This method still works without them, at the cost of more verbosity for the function consumer.
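One possible way around the 0 / NULL ambiguity, offered as a sketch and not as part of the method above, is to have the macro supply the defaults. C99 lets a later designated initializer override an earlier one for the same member, so the caller's values win. The struct name find_args and the counting body of find are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct find_args {
        const char *name;
        int min_age;
        int max_age;
};

/* Stand-in for a real search: returns how many criteria are active. */
int find(struct find_args fa) {
        int active = 0;

        if (fa.name)          active++;
        if (fa.min_age != -1) active++;
        if (fa.max_age != -1) active++;
        return active;
}

/* Defaults come first; the caller's designated initializers override them. */
#define Find(...) \
        (find((struct find_args){ .min_age = -1, .max_age = -1, __VA_ARGS__ }))

void defaults_demo(void) {
        int by_name = Find(.name = "Bob");     /* ages stay at -1 */
        int from_zero = Find(.min_age = 0);    /* 0 is now a real value */
        int all = Find(.name = "Mary", .min_age = 8, .max_age = 30);

        assert(by_name == 1);
        assert(from_zero == 1);
        assert(all == 3);
}
```

Two caveats apply. Callers have to use designated initializers, because a positional value would continue after max_age and overflow the struct. Some compilers also warn about overridden initializers (GCC has -Woverride-init for this), so the warning may need to be silenced where the macro is used.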

So far, the above has just been ideas and experimentation on my part. It’s not something I’ve used in any actual projects I’m working on.

If you’re a C developer I’d love to hear your thoughts on this approach.

December 9, 2010