Abusing The C Preprocessor: Writing A 4-Bit Adder

I've always wondered what could be done with the C preprocessor. The existence of projects such as Boost Preprocessor and the infamous Brainfuck interpreter are a testament to its very interesting capabilities.

However, as they say, the best way to learn is to do, and I felt like indulging in some masochism. So - onwards!

Token pasting is the main (see: only) operation in the C preprocessor. Two tokens (input strings) are added together (concatenated); while this seems rather limited, you'll see it has plenty of potential:

#include <stdio.h>

/* The token pasting abuse that makes this possible */
#define JOIN_INTERNAL( a, b ) a##b
#define JOIN( a, b ) JOIN_INTERNAL( a, b )

We use two macros here because our main JOIN macro needs to expand its arguments (which will be macros themselves!) before passing it onto JOIN_INTERNAL - otherwise, the macros will go through without being expanded, and you'll find out the painful way that doesn't work.

So, we have token pasting. We know that we can add together tokens to produce new tokens; so what if you used this little trick to make binary operators?

Say we have two tokens, representing bits: 0 and 1. If we token paste them together, we get "01". Now, if we token paste that with another token, say XOR_, we can get "XOR_01", which is a token in itself.

We know that the preprocessor is all too happy to expand these; so what if we made a truth table out of them? By defining macros which correspond to our tokens (XOR_00, XOR_01, etc), the values of these macros are substituted when we token paste XOR_, 0, and 1 together - we have created an operator!

By defining the truth table for XOR and AND, we've set up the basic building blocks that we need for our 4-bit adder:

/* Operators */
/* XOR */
#define XOR_00 0
#define XOR_10 1
#define XOR_01 1
#define XOR_11 0

#define XOR( a, b ) JOIN( XOR_, JOIN( a, b ) )

/* AND */
#define AND_00 0
#define AND_10 0
#define AND_01 0
#define AND_11 1

#define AND( a, b ) JOIN( AND_, JOIN( a, b ) )

And here it is - the core of the adder. These macros transcode the logic behind the full adder - the equations have been reproduced in the comments.

/* Full adder macros */
/* Out = (A ^ B) ^ cin */
#define FULL_ADD_OUT( a, b, cin ) \
    XOR( XOR( a, b ), cin )

/* Carry_out = (A & B) ^ (Carry_in & (A ^ B)) */
/* The standard adder uses OR for the last 'gate' - this can safely be changed
   to XOR, which has been done here to avoid defining an OR operator */
#define FULL_ADD_CARRY( a, b, cin ) \
    XOR( AND( XOR( a, b ), cin ), AND( a, b ) )

These macros basically chain together the XOR and AND operators we made before, so that we can calculate the output and Carry Out bits from our input - the A, B and Carry In bits. Because our macros are functions, it's like writing the equations in Polish notation.

To calculate an addition, we call FULL_ADD_OUT and FULL_ADD_CARRY for each output bit; the parameters are the input bits, and the Carry Out from the last bit. Basically, we chain each full adder by its Carry Out output. As for the very first full adder? We simply pass zero into the Carry In.

Now, I don't know about you, but I find it mildly difficult to view numbers in binary format. Of course, full adders operate in binary; so we must resolve this conflict somehow. Remember how we defined XOR and AND with token pasting? Same principle applies here!

/* Number -> binary conversion */
#define NUMBER_TO_BINARY_0 0, 0, 0, 0
#define NUMBER_TO_BINARY_1 0, 0, 0, 1
#define NUMBER_TO_BINARY_2 0, 0, 1, 0
#define NUMBER_TO_BINARY_3 0, 0, 1, 1
/* ... To NUMBER_TO_BINARY_15 ... */

#define NUMBER_TO_BINARY( a ) JOIN( NUMBER_TO_BINARY_, a )

/* Binary -> number conversion */
#define BINARY_TO_NUMBER_0000 0
#define BINARY_TO_NUMBER_0001 1
#define BINARY_TO_NUMBER_0010 2
#define BINARY_TO_NUMBER_0011 3
/* ... To BINARY_TO_NUMBER_1111 ... */

#define BINARY_TO_NUMBER_INTERNAL( a, b, c, d ) \
    JOIN( BINARY_TO_NUMBER_, JOIN( a, JOIN( b, JOIN( c, d ) ) ) )

/* This is necessary to expand an incoming macro into its binary
   form; otherwise, only one argument will be passed into the macro
   expecting four */
#define BINARY_TO_NUMBER( x ) BINARY_TO_NUMBER_INTERNAL( x )

NUMBER_TO_BINARY takes in a base-10 number from 0 to 15, adds it to NUMBER_TO_BINARY_, and then returns the appropriate token; this expands to the binary format that our adder uses.

BINARY_TO_NUMBER takes in four bits, adds each of them together (0, 1, 1, 0 becomes 0110), and then adds that to BINARY_TO_NUMBER_, which expands to a normal base-10 number.

Of note is that there are two macros - BINARY_TO_NUMBER_INTERNAL, and BINARY_TO_NUMBER. You see, this macro suffers from the same problem as JOIN - it needs to expand its arguments before we can use it! BINARY_TO_NUMBER simply expands the arguments and passes it to BINARY_TO_NUMBER_INTERNAL, which does the actual token pasting.

/* Four-bit addition! Can easily be extended, but gets messy for 
   obvious reasons */
#define ADD_A_B_4BIT( a1, a2, a3, a4, b1, b2, b3, b4 ) \
    FULL_ADD_OUT( a1, b1, \
        FULL_ADD_CARRY( a2, b2, \
            FULL_ADD_CARRY( a3, b3, \
                FULL_ADD_CARRY( a4, b4, 0 ) ) ) ), \
    \
    FULL_ADD_OUT( a2, b2, \
        FULL_ADD_CARRY( a3, b3, \
            FULL_ADD_CARRY( a4, b4, 0 ) ) ), \
    \
    FULL_ADD_OUT( a3, b3, \
        FULL_ADD_CARRY( a4, b4, 0 ) ), \
    \
    FULL_ADD_OUT( a4, b4, 0 ) \

/* This macro receives two NUMBER macros, which expand to the 4-bits necessary
   for each argument */
#define ADD_A_B_4BIT_NUMBERS( n1, n2 ) ADD_A_B_4BIT( n1, n2 )

And now: four-bit addition. Looks nasty? That's because it is. This macro spits out four bits, just like our NUMBER_TO_BINARY macros.

Each bit is the result of FULL_ADD_OUT; since there's no way to save the FULL_ADD_CARRY from each bit, we deal with this problem the questionable way: we manually nest the FULL_ADD_CARRY macros. The last bit has no carry, so we set it to zero; the second-last bit has the carry from the last bit; the third-last bit has the carry from the second-last bit, which is itself defined in terms of the last bit's carry; and so on, until we reach the first bit.

That's a simple operation when unpacked, resulting in the 4-bit addition of the numbers we pass in. We have to wrap ADD_A_B_4BIT in ADD_A_B_4BIT_NUMBERS so that our NUMBER macros get converted to binary, and then they're passed into ADD_A_B_4BIT. Phew!

int main()
{
    printf( "%i + %i = %i\n", 
        NUMBER_1,
        NUMBER_2,
        BINARY_TO_NUMBER( 
            ADD_A_B_4BIT_NUMBERS( 
                NUMBER_TO_BINARY( NUMBER_1 ), 
                NUMBER_TO_BINARY( NUMBER_2 ) ) ) );
}

And now, our test case. We have our standard int main() function, and a seemingly standard printf showing an addition. But wait! In lieu of actual numbers, we have NUMBER_1, NUMBER_2, and our addition macro.

NUMBER_1 and NUMBER_2 aren't defined in the actual code: we provide them when we compile the code, so we can change it at compile-time. This code is relatively simple: our two numbers are converted to binary, run through our addition macro, and the result is converted back into a number, just in time to be passed into our printf.

Time to test this sucker!

philpax@philliplarkson:$ clang full_add.c -DNUMBER_1=7 -DNUMBER_2=4
philpax@philliplarkson:$ ./a.out
7 + 4 = 11

Ain't she a beaut?

And for good measure, here's the expanded version of the int main() function after being run through our macros.

philpax@philliplarkson:$ clang -E full_add.c -DNUMBER_1=7 -DNUMBER_2=4
int main()
{
    printf( "%i + %i = %i\n",
        7,
        4,
        11 );
}

Alright; I hope you've enjoyed this brief look at the mysterious, powerful, and quite frankly insane C preprocessor. A compiling version of the full code can be found here. I tested it with GCC and Clang, both of which had no issues; I couldn't get it to work with MSVC, but I'm not particularly concerned by that failing.