Our work on security makes us deal with a lot of ASN.1 data, mostly
encoded as DER. We designed the Quick DER library to have a
commercial-grade tool to manipulate it... except that it is released
as open source.
This is a first On Display article, in which we report on the output of projects that fall under the InternetWide umbrella.
What is it?
While the web world leans heavily on textual formats such as XML and JSON, there has long been a standard for representing data structures for binary transmission, named ASN.1 and developed by ITU. Never heard of it? That's probably because it works so well that you never noticed it, but rest assured that you use it many times a day. Like its textual counterparts, ASN.1 helps to pack and unpack data while in transit.
Data languages are a cornerstone of open protocols. They define in a precise manner what goes out and what comes in. Given accurate descriptions of such transit data, it is possible to choose freely from various implementations that adhere to the same specification. As a result, you can communicate with peers that use completely different software from what you are using.
ASN.1 is used in many open standards, such as RFCs, and ITU itself uses it in many of its telephony and networking protocols. It defines, in a somewhat abstract manner, how structures are composed. When ASN.1 is used in software, those structures must be represented in bits in some way. Standard mappings have been defined for this, such as DER (used for certificates, Kerberos tickets and much more), BER (used for LDAP), XER (which maps to XML) and, soon to come, JER (which maps to JSON). The same ASN.1 specification can be used with any or all of these representations.
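To make the step from abstract structure to bits concrete, here is a minimal sketch in plain Python. It does not use Quick DER; the helper names der_length and der_tlv are our own, and the record being encoded is a hypothetical SEQUENCE holding an INTEGER and a UTF8String. It only shows the tag, length, value layout that DER prescribes.

```python
def der_length(n: int) -> bytes:
    """Encode a content length the DER way: the shortest definite form."""
    if n < 0x80:
        return bytes([n])                      # short form, one byte
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body    # long form: byte count, then length


def der_tlv(tag: int, content: bytes) -> bytes:
    """Wrap content in a single-byte tag and a DER length."""
    return bytes([tag]) + der_length(len(content)) + content


# Universal tags: INTEGER, UTF8String, and constructed SEQUENCE.
INTEGER, UTF8STRING, SEQUENCE = 0x02, 0x0C, 0x30

# Hypothetical record: SEQUENCE { INTEGER 5, UTF8String "hi" }
record = der_tlv(SEQUENCE,
                 der_tlv(INTEGER, b"\x05") +
                 der_tlv(UTF8STRING, "hi".encode("utf-8")))
print(record.hex())   # 30070201050c026869
```

The nine bytes read as: a SEQUENCE of length 7, holding an INTEGER of length 1 and a string of length 2.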
Quick DER is a library that focusses on the most prominent representation in Internet Standards, namely DER. It also tolerates its more general cousin BER. With the library, you can pack and unpack such structures, and end up with structure fields that contain the data passed across the wire.
One advantage of DER over formats like JSON is that it is easy to skip ahead: the structures are nested and every element announces its own length, so a part that is not needed can be jumped over in one step, without parsing its contents. Another advantage is that the format is binary, and can pass any data without escaping. You can include quotes, backslashes, and zero characters without worrying that they might be mangled while in transit. This not only simplifies the code a great deal but, as a result, also leaves fewer opportunities for bugs or misinterpretation.
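The skipping can be sketched in a few lines of plain Python. This is an illustration of the encoding, not of Quick DER's interface: read_header and nth_element are names we made up, and multi-byte tags are left out for brevity. Note how the nested first element is passed over in a single hop.

```python
def read_header(buf: bytes, pos: int = 0):
    """Return (tag, content_start, content_length) for the element at pos."""
    if pos + 2 > len(buf):
        raise ValueError("truncated header")
    tag, first = buf[pos], buf[pos + 1]
    if tag & 0x1F == 0x1F:
        raise ValueError("multi-byte tags are left out of this sketch")
    if first < 0x80:                       # short form: the byte is the length
        start, length = pos + 2, first
    else:                                  # long form: next `count` bytes hold it
        count = first & 0x7F
        if count == 0:
            raise ValueError("indefinite length is BER, not DER")
        if pos + 2 + count > len(buf):
            raise ValueError("truncated length")
        start = pos + 2 + count
        length = int.from_bytes(buf[pos + 2:start], "big")
    if start + length > len(buf):
        raise ValueError("element runs past the end of the buffer")
    return tag, start, length


def nth_element(buf: bytes, index: int):
    """Return (tag, content) of element `index`, hopping over earlier ones."""
    pos = 0
    for _ in range(index):
        _, start, length = read_header(buf, pos)
        pos = start + length               # one hop, however deeply it nests
    tag, start, length = read_header(buf, pos)
    return tag, buf[start:start + length]


# SEQUENCE { SEQUENCE { INTEGER 5 }, UTF8String "hi" }
outer = bytes.fromhex("3009" "3003020105" "0c026869")
_, start, length = read_header(outer)
body = outer[start:start + length]
print(nth_element(body, 1))   # (12, b'hi')
```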
Design Criteria
We wanted Quick DER to be secure, meaning that it parses with great care. Bugs have been known to lead to security leaks before. So, simplicity was key in the design.
We also wanted a compact library, mindful of possible applications on embedded platforms, such as the IoT applications that are discussed so often these days. Compact code and a small memory footprint were both important to us. We achieved this by a surprisingly simple choice, namely to not allocate memory in the library. The cost that resulted from this choice was rather limited: repeating structures such as lists or sets of other elements are not unpacked, but should instead be iterated over.
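The idea of iterating over a repeating structure instead of unpacking it into freshly allocated storage can be approximated in Python with memoryview slices, which refer to the original buffer rather than copying it. This is only an analogy to what a C library would do with pointers; iter_elements is a hypothetical name, and the sketch assumes single-byte tags and short-form lengths.

```python
def iter_elements(content):
    """Yield (tag, value) pairs from the content of a SEQUENCE OF or SET OF.

    Each value is a memoryview slice into the caller's buffer, so no
    element data is copied while walking the list.
    """
    view = memoryview(content)
    pos = 0
    while pos < len(view):
        if pos + 2 > len(view):
            raise ValueError("truncated header")
        tag, length = view[pos], view[pos + 1]
        if length >= 0x80:
            raise ValueError("long-form lengths are left out of this sketch")
        end = pos + 2 + length
        if end > len(view):
            raise ValueError("element runs past the end of the buffer")
        yield tag, view[pos + 2:end]
        pos = end


# Content of a SEQUENCE OF INTEGER holding 1, 2 and 3.
numbers = bytes.fromhex("020101" "020102" "020103")
for tag, value in iter_elements(numbers):
    print(tag, bytes(value).hex())
```

The caller decides what to do with each element as it passes by, which is why the walk itself needs no storage beyond a position in the buffer.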
We wanted a highly reliable interface, simple and consistent in use. We
believe we succeeded by invoking der_pack() and der_unpack() operations
with a simple command sequence that can be set up in the calling code as a
constant bytecode array.
And we wanted to provide programmer convenience. Though some luxury automation is missing from the core, such as decoding integers, we added it as add-on utility functions that are only one call away. The general part, however, is really convenient to use: every unpacker output (or packer input) is represented as a pointer/length combination that holds a value as it is encapsulated in the DER format. Note that this is the best format to retain the ability to carry any binary data, including quotes, escapes and zero characters, just as the wire format does.
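To show what decoding integers involves once you hold the raw value bytes, here is a small Python sketch. These are not Quick DER's utility functions; der_int_decode and der_int_encode are names of our own. A DER INTEGER is a big-endian two's complement number in the fewest possible bytes, and the decoder below rejects anything longer than that.

```python
def der_int_decode(value: bytes) -> int:
    """Turn the content bytes of a DER INTEGER into a Python int."""
    if len(value) == 0:
        raise ValueError("an INTEGER needs at least one content byte")
    if len(value) > 1:
        # DER insists on the shortest form: no redundant leading 0x00 or 0xFF.
        if (value[0] == 0x00 and value[1] < 0x80) or \
           (value[0] == 0xFF and value[1] >= 0x80):
            raise ValueError("INTEGER is not minimally encoded")
    return int.from_bytes(value, "big", signed=True)


def der_int_encode(n: int) -> bytes:
    """Produce the minimal two's complement content bytes for n."""
    magnitude = n if n >= 0 else ~n
    length = magnitude.bit_length() // 8 + 1   # leaves room for the sign bit
    return n.to_bytes(length, "big", signed=True)


print(der_int_decode(b"\x00\x80"))   # 128
print(der_int_encode(-129).hex())    # ff7f
```

The leading zero byte in the first example is not padding: without it, the top bit would mark the number as negative.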
Compiling Syntax Specifications
Given an ASN.1 specification, the accompanying tool asn2quickder can be
used to translate it to a header file for use in C, as well as wrapper
classes in Python. Both support an access pattern that is native to the
language and adhere to the structures described in ASN.1. Structures
(called SEQUENCE in ASN.1) have field names, which can be used to dive
into the structure: C uses overlay struct definitions for this purpose,
and Python has class members to achieve the same.
Without this compiler, based on the generic asn1ate parser library,
the Quick DER tool would not be half as useful. With it, you can write
your own ASN.1 specifications and translate them to bytecode and structures
that allow simple packing and unpacking of DER encoded data.
We didn't stop at just supplying this tool. There are so many standards making good use of ASN.1 that we decided to set up pre-compilation of their specifications. This also helps us to make slight adaptations if a format is too difficult to process (ASN.1 is a big topic) or plain wrong (as we have indeed found in a few standards by using this tool). The intended result is that anyone who installs Quick DER is immediately rewarded with include files for the most commonly used standard formats.
If you want to add a standard to this library, please send us a pull request, and we will gladly add it. We can do a lot of good by sharing code, but even more when we additionally share these descriptors that allow straightforward handling of standard data.
Usage Ideas
DER is a very good facility for specifying your protocols. This is what Kerberos does, for example. It is also used for data formats, such as X.509 or PKIX certificates. It has never been so easy to parse a certificate and pull out, say, the name of its owner. This is the sort of use that we have in mind when we line up with existing protocols, and sometimes expand them with new ideas. This is why we thought it worthwhile to create the Quick DER software package.
Not all protocols and not all data formats follow DER encoding rules, of course. TLS, for example, has elected to define its own formats and has many corners in which it finds yet other ways of formatting options and extensions. GnuPG also has its own data formats. In both cases, the use cases would have been well suited to an ASN.1 description, but another choice was made.
In terms of specification complexity, the definition of a local encoding is rather costly. All implementations must go to the bit level, as does the specification, rather than start from a record level that assumes that the data is somehow magically transported. This is just how things work, but it could be a lesson to us all. Quick DER was created so not even the most open-source minded person would feel a need to invent yet another wire format, cause yet another stream of decoding bugs, and learn yet again the same old lessons. The choice, however, is yours.
Getting Started
If you feel like diving in, please head for these documents first:
The interface to C is the most developed, though you should feel happy with the Python wrappers as well.