Before I start coding things around code generation, I want to explain what I intend to do so you won’t get lost if you’re trying to follow me.

Up until now, I’ve implemented two fairly simple classes: the DEREncoder class and the DERDecoder class. The former is essentially a collection of static member functions that can be used to encode just about anything in DER; the latter is a simple state machine that decodes DER and passes whatever it finds to its derived classes. These two classes, together with the Details::Integer class, are intended to make up the bulk of the DER encoding and decoding run-time support library.
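To give an idea of what "a state machine that passes what it finds to its derived classes" can look like, here is a minimal sketch. It is not the actual DERDecoder: the names TlvStateMachine, Collector, feed and onValue are made up for this illustration, and it assumes single-octet tags and short-form lengths (below 128) only.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical sketch, not the real DERDecoder: consumes one octet at a time
// and hands each complete tag/contents pair to the derived class.
class TlvStateMachine
{
public :
	TlvStateMachine() : state_(expect_tag), tag_(0), remaining_(0) {}
	virtual ~TlvStateMachine() {}

	void feed(unsigned char octet)
	{
		switch (state_)
		{
		case expect_tag :
			tag_ = octet;
			state_ = expect_length;
			break;
		case expect_length :
			// assumption: long-form lengths (high bit set) are left out of this sketch
			if (octet & 0x80) throw std::runtime_error("long-form length not supported");
			remaining_ = octet;
			contents_.clear();
			if (remaining_ == 0) emit(); else state_ = expect_contents;
			break;
		case expect_contents :
			contents_.push_back(octet);
			if (--remaining_ == 0) emit();
			break;
		}
	}

protected :
	// the derived class decides what to do with each decoded value
	virtual void onValue(unsigned char tag, const std::vector< unsigned char > &contents) = 0;

private :
	enum State { expect_tag, expect_length, expect_contents };

	void emit()
	{
		state_ = expect_tag;
		onValue(tag_, contents_);
	}

	State state_;
	unsigned char tag_;
	std::size_t remaining_;
	std::vector< unsigned char > contents_;
};

// Example derived class: just records the tag and length of everything it sees
struct Collector : TlvStateMachine
{
	std::vector< unsigned char > tags;
	std::vector< std::size_t > lengths;

protected :
	virtual void onValue(unsigned char tag, const std::vector< unsigned char > &contents)
	{
		tags.push_back(tag);
		lengths.push_back(contents.size());
	}
};
```

Feeding the octets 02 01 0C 04 03 42 6F 62 to a Collector one by one results in two calls to onValue: one for the INTEGER (tag 0x02) and one for the OCTET STRING (tag 0x04).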

Each of these classes is actually a class template which you happen to be able to use without any extra arguments. This has two advantages: the first is that whatever library comes out of this can still be a header-only library. The second is that any reasonably configurable values remain configurable and we don’t use the preprocessor to configure them. That means that if the final user code needs one DER decoder with a 2K parse buffer and another with a 4K parse buffer, it’s quite easy to accommodate that (this would be much more difficult if I had used preprocessor macros for the configuration values).
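As a rough illustration of that idea (the class name ConfigurableDecoder, the parameter name and the 2048 default are assumptions made for this sketch, not the real DERDecoder interface), a defaulted template parameter lets two differently configured decoders coexist in one program:

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the real decoder: the only point here is the
// defaulted template parameter, which replaces a preprocessor macro.
template < std::size_t ParseBufferSize = 2048 >
class ConfigurableDecoder
{
public :
	// size of the parse buffer this instantiation was configured with
	std::size_t capacity() const { return buffer_.size(); }

private :
	std::array< unsigned char, ParseBufferSize > buffer_;
};

// Both configurations can be used side by side
typedef ConfigurableDecoder<> DefaultDecoder;        // 2K parse buffer
typedef ConfigurableDecoder< 4096 > LargeDecoder;    // 4K parse buffer
```

With a macro such as a hypothetical PARSE_BUFFER_SIZE, every decoder in a translation unit would share a single value; here the size is part of the type, so each instantiation carries its own.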

Using the DEREncoder class to encode something will look like a series of calls, like this:

DEREncoder<> encoder;
vector< unsigned char > target;
encoder.encodeInteger(back_inserter(target), Details::Integer(12));
encoder.encodeOctetString(back_inserter(target), name.begin(), name.end());
...

That is fine, but it gets tedious after a while. What you really want to be able to do is say message.serialize(back_inserter(target)); and not have to worry about what the structure of your message is.
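To make that a bit more concrete, here is a sketch of what such a message class could look like. Everything in it is hypothetical: SketchEncoder is a toy stand-in for the real DEREncoder (it only handles integers from 0 to 127 and lengths below 128), and Person stands for the kind of code that might be generated from a schema like Person ::= SEQUENCE { age INTEGER, name OCTET STRING }.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <stdexcept>
#include <string>
#include <vector>

// Toy stand-in for the real DEREncoder: short-form lengths (< 128) only
struct SketchEncoder
{
	template < typename OutputIterator >
	static OutputIterator encodeHeader(OutputIterator out, unsigned char tag, std::size_t length)
	{
		if (length > 127) throw std::length_error("long-form lengths are left out of this sketch");
		*out++ = tag;
		*out++ = static_cast< unsigned char >(length);
		return out;
	}

	// INTEGER (tag 0x02), restricted to 0..127 so the value fits in one octet
	template < typename OutputIterator >
	static OutputIterator encodeSmallInteger(OutputIterator out, unsigned int value)
	{
		if (value > 127) throw std::out_of_range("only 0..127 supported in this sketch");
		out = encodeHeader(out, 0x02, 1);
		*out++ = static_cast< unsigned char >(value);
		return out;
	}

	// OCTET STRING (tag 0x04)
	template < typename OutputIterator, typename InputIterator >
	static OutputIterator encodeOctetString(OutputIterator out, InputIterator first, InputIterator last)
	{
		out = encodeHeader(out, 0x04, static_cast< std::size_t >(std::distance(first, last)));
		return std::copy(first, last, out);
	}
};

// What generated code for the hypothetical Person type might look like
struct Person
{
	unsigned int age;
	std::string name;

	template < typename OutputIterator >
	OutputIterator serialize(OutputIterator out) const
	{
		// encode the fields first: the SEQUENCE header needs their total length
		std::vector< unsigned char > contents;
		SketchEncoder::encodeSmallInteger(std::back_inserter(contents), age);
		SketchEncoder::encodeOctetString(std::back_inserter(contents), name.begin(), name.end());
		out = SketchEncoder::encodeHeader(out, 0x30, contents.size()); // SEQUENCE (tag 0x30)
		return std::copy(contents.begin(), contents.end(), out);
	}
};
```

With a Person holding age 12 and name "Bob", calling serialize(back_inserter(target)) fills target with the ten octets 30 08 02 01 0C 04 03 42 6F 62, and the calling code never needs to know which fields the message contains.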

The structure of the message, when working with ASN.1, is defined in the abstract syntax of whatever it is you’re working on. In order to not have to write the code to work with the DER encoder and decoder by hand, that code can be generated from that ASN.1 specification.

Code generation has many advantages. For one thing, large parts of code are tedious repetitions of the same blocks used over and over again, but with minor tweaks along the way. Generating the code from a template with those tweaks in the places where we know they will need to go avoids a lot of copy-and-paste errors. Another advantage is that if there’s a bug in the generated code, the bug is really in the code generator. That means that if you fix the generator, you fix that bug, but you also fix the same bug in all the other generated code.

In order to be able to generate code, you need to have a specification that you can generate the code from. In the case of ASN.1, that will be the ASN.1 grammar, which is defined in X.680. In the case of our final generated code, that will be a schema written in the ASN.1 schema language.
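The "same block with minor tweaks" idea can be sketched in a few lines. This is only an illustration and not the generator I’ll be writing: generateStruct and the Fields typedef are made-up names, and a real generator would get its field list from the parsed schema rather than from hand-written pairs.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// (C++ type, field name) pairs; hypothetical input for this sketch
typedef std::vector< std::pair< std::string, std::string > > Fields;

// Emits the source text of a struct with one member and one accessor per field.
// The fixed text is the "template"; type and name are the tweaks.
std::string generateStruct(const std::string &name, const Fields &fields)
{
	std::ostringstream oss;
	oss << "struct " << name << "\n{\n";
	for (Fields::const_iterator field = fields.begin(); field != fields.end(); ++field)
	{
		// data member
		oss << "\t" << field->first << " " << field->second << "_;\n";
		// accessor for that member
		oss << "\tconst " << field->first << "& " << field->second
			<< "() const { return " << field->second << "_; }\n";
	}
	oss << "};\n";
	return oss.str();
}
```

If the accessor were wrong, say because it returned by value where a reference was intended, fixing the one line in generateStruct would fix it for every generated type at once.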

“But Ronald,” you ask: “why didn’t you think of this earlier and, instead of writing the DER encoder and decoder, generate that as well?” Well, there are two reasons for writing the encoder and the decoder by hand. The first is that the specification for DER (defined in X.690) is far less amenable to code generation than the specification for the ASN.1 schema language (defined in X.680) is, so it’s easier to write the code by hand. The second is that the specification for DER itself is a very slow-moving target. What we’ll be generating with the code generator is not an interpreter for the ASN.1 schema language (though we’ll be generating most of that as well), but the code to drive the DER encoder and decoder. The specifications those are generated from are user-defined, and therefore a fast-moving target.

Most of the code I’ll be writing for the next few commits will be one of two things: either I’ll transcribe the ASN.1 schema language grammar to use as a specification for code generation to generate an ASN.1 schema parser, or I’ll write the code to integrate that parser with a code generator (which I’ll also write). The end product of this will be an ASN.1 schema parser that will generate the code corresponding to whatever is specified in the ASN.1 schema. That generated code will support DER serialization and deserialization, and will add accessors for each of the fields in the encoded types.

To generate the parser code from the grammar specification, I’ll be using ANTLR4. You can get it from antlr.org. Just follow the instructions on the site to install it. If, like me, you’re running it on Ubuntu, make sure you have a recent version of Sun’s Java installed and make sure you use the complete jar from antlr.org with C++ code generation support.