By Hannes Mehnert - 2017-02-27
This article gives some technical background and empirical evidence on how we reduced the lines of code in Mirage3: it has about 25% fewer lines of code than Mirage2 while providing more features.
Mirage has done a fair amount of code generation since its initial release, extending target-agnostic unikernels into target-specific virtual machine images (or Unix binaries). Until Mirage 2.7, this generation relied heavily on string concatenation. Since the Mirage 2.7.0 release (February 2016), it is based on functoria, "a DSL to describe a set of modules and functors, their types and how to apply them in order to produce a complete application". The code generated by Mirage3 is less complex than that of Mirage2 and contains up to 45% fewer lines of code.
Generated code with intricate control flow and automatically generated identifier names is difficult for a human to understand - which matters when the generated code is incorrect and needs to be debugged (or when the compiler chokes on it with an error message pointing into the middle of it). It is also a burden on the developer: generated code should not be part of the version control system, so the build system needs yet another step. If the code generator is buggy, or not easily extensible for new features, developers may be tempted to modify the generated code by hand - which turns into a release nightmare, since a set of patches has to be maintained on top of the generated code while the code generator is developed alongside. Generating code is best avoided - maybe there is a feature in the programming language which solves the boilerplate problem without a code generator.
Having said this, there's nothing wrong with LISP macros or MetaOCaml.
Mirage uses code generation to complete backend-agnostic unikernels with the required boilerplate to compile for a specific backend - by selecting the network device driver, the console, the network stack, and other devices - taking user-supplied configuration arguments into account. In Mirage, the OCaml TCP/IP stack accepts any network device which implements the Mirage_net.S module type.
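To make this concrete, here is a minimal sketch (not taken from the article) of a backend-agnostic unikernel: the functor argument is constrained only by the Mirage_net.S signature, so any of the mirage-net-* implementations can be plugged in later.

```ocaml
(* unikernel.ml - a minimal sketch, not from the article; only the
   Mirage_net.S signature is assumed, everything else is illustrative *)
module Main (N : Mirage_net.S) = struct
  let start net =
    (* mac is part of Mirage_net.S; which driver backs [net] is decided
       at configuration time, not here *)
    Logs.info (fun m -> m "MAC address: %s" (Macaddr.to_string (N.mac net)));
    Lwt.return_unit
end
```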
At the end of the day, some mechanism needs to be in place which links the mirage-net-solo5 library when compiling for Solo5 (or mirage-net-xen when compiling for Xen, mirage-net-unix for Unix, mirage-net-macosx for MacOSX). This could be left to each unikernel developer, but then the same boilerplate would be repeated everywhere and would need to be updated whenever a new backend becomes available (Mirage2 knew about Xen, Unix, and MacOSX; Mirage3 adds Solo5 and Qubes). Instead, the mirage tool generates this boilerplate: it knows about all supported devices, and which library a unikernel has to link for a device depending on the target and command line arguments. That's not exactly the ideal solution, but it works well enough for us right now (more or less). A single place - the mirage tool - needs to be extended whenever a new backend becomes available.
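The corresponding configuration might look roughly as follows - a sketch, assuming the Mirage3 configuration combinators (foreign, network, default_network, register); the concrete mirage-net-* package is only selected once a target is passed to mirage configure:

```ocaml
(* config.ml - a sketch of a Mirage3 configuration; the exact API of
   foreign/register has varied slightly between releases *)
open Mirage

let main = foreign "Unikernel.Main" (network @-> job)

(* default_network is resolved to mirage-net-unix, -xen, -solo5, ...
   depending on the target given to `mirage configure` *)
let () = register "example" [ main $ default_network ]
```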
connect
Devices may depend on each other, e.g. a TCP stack requires a monotonic clock and a random number generator, which influences the initialisation order. Mirage generates the device initialisation startup code based on the configuration and data dependencies (which hopefully form an acyclic graph). Mirage2 allowed initialisation errors to be handled (the type of connect used to be unit -> [ `Ok of t | `Error of error ] io), but calls to connect were automatically generated, and the error handler always spat out an error message and exited. Because the error type was generic, Mirage2 didn't know how to properly print it, and instead failed with some incomprehensible error message. Pretty printing errors is solved in Mirage3 by our rework of errors, which now use the result type, are extensible, and can be pretty printed. Calls to connect are automatically generated, and handling errors gracefully is out of scope for a unikernel -- where should it get the other two network devices promised at configuration time from, if they're not present on the (virtual) PCI bus?
The solution we discussed and implemented (also in functoria) was to always fail hard (i.e. crash) in connect : unit -> t. This led to a series of patches for all implementors of connect; many of these patches removed control flow complexity (and resulted in less complex test cases, see e.g. mirage-net-unix or tcpip). Lots of common boilerplate (like or_error, which throws an exception if connect errored) could be removed.
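As a rough illustration, here is a sketch of that kind of boilerplate; only the name or_error comes from the article, the rest is an assumption about what such Mirage2-era wrappers looked like:

```ocaml
open Lwt.Infix

(* a sketch of Mirage2-era boilerplate; only the name or_error comes
   from the article, everything else is illustrative *)
let or_error name fn t =
  fn t >>= function
  | `Error _ -> Lwt.fail_with ("could not connect " ^ name)
  | `Ok dev  -> Lwt.return dev

(* Mirage2: generated code wrapped every device initialisation, e.g.
     or_error "net" Netif.connect id >>= fun net -> ...
   Mirage3: connect itself fails hard, so the wrapper disappears:
     Netif.connect id >>= fun net -> ... *)
```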
Comparing the generated main.ml between Mirage 2.9.1 and 3.0.0 for various unikernels, on both Unix and Xen, shows code reductions of up to 45% (the diffs are here).
The workflow to build a unikernel used to be mirage configure followed by make. During the configure phase, a Makefile was generated with the right build and link commands (depending on the configuration target and other parameters). Mirage2 installed opam packages and system packages as a side effect during configuration. This led to several headaches: you needed to have the target-specific libraries installed while you were configuring (you couldn't even test the configuration for Xen if you didn't have the Xen headers and support libraries installed). Reconfiguration spawned yet another opam process (which, even if it installs nothing because everything required is already present, takes some time since the solver has to evaluate the universe) - unless the --no-opam option was passed to mirage configure.
A second issue with the Mirage2 approach was that dependent packages were listed in the unikernel's config.ml and passed as strings to opam. When version constraints were included, this caused either shell (calling out to opam) or make (embedding the packages in the Makefile) or both to choke. Being able to express version constraints for dependencies in config.ml was one of the most wanted features for Mirage3. It is crucial for further development (to continue allowing API breakage and removing legacy): a unikernel author, and the mirage tool, can now attach versioned dependencies to device interfaces. Instead of a garbled error message from mirage trying to compile a unikernel where the libraries don't fit the generated code, opam will inform which updates are necessary.
In a first iteration (functoria), instead of manually executing opam processes, an opam package file was generated by mirage at configuration time for the given target. This allowed version constraints to be expressed in each config.ml file (via the package function). This change also separated the configuration phase, the dependency installation phase, and the build phase - which included delayed invocations of pkg-config to pass parameters to ld. A mess, especially if your goal is to generate Makefiles which run on both GNU make and BSD make.
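A sketch of what such a constraint looks like in config.ml; the optional labels of package (such as ~min) are assumptions here and have varied between mirage releases:

```ocaml
(* config.ml fragment - a sketch; labels such as ~min are assumptions *)
open Mirage

let packages = [
  package ~min:"3.0.0" "tcpip";  (* lower bound enforced by opam *)
  package "logs";
]

let main = foreign ~packages "Unikernel.Main" (network @-> job)
```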
A second approach (functoria) dug a bit deeper down the rabbit hole and removed the complex selection and adjustment of strings used to output the Makefile, by implementing this logic in OCaml (and calling out to ocamlbuild and ld). Removing an unneeded layer of code generation results in code that is easier to read and understand, is smaller, and comes with stronger guarantees: more potential errors are caught at compile time, instead of ending up in (possibly ill-formed) generated Makefiles. Bos is a concise library for interacting with basic operating system services, and solves common issues in that area - such as proper escaping of arguments - once and for all.
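For illustration, a small sketch of the style of process invocation Bos enables; the ld invocation here is illustrative, not the actual command mirage runs:

```ocaml
(* a sketch using Bos; the concrete ld arguments are illustrative *)
open Bos

let link output objs =
  (* Cmd.t values are structured, so arguments never pass through a
     shell and need no manual quoting *)
  let cmd = Cmd.(v "ld" % "-o" % output %% of_list objs) in
  OS.Cmd.run cmd
```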
In Mirage3, instead of a single configure_makefile function which generated the entire Makefile, the build and link logic is separated into functions, and only a simplistic Makefile is generated. It invokes mirage build to build the unikernel, and expects all dependent libraries to be installed (e.g. using make depend, which invokes opam) -- no need to delay pkg-config calls anymore.
This solution certainly involves less complex string concatenation, and mirage now has a clearer phase distinction: configure, depend, compile & link. (This workflow (still) lacks a provisioning step - e.g. private key material, if provided as a static binary blob, needs to be present during compilation at the moment - but one can easily be added later.) There are drawbacks: the mirage utility is now needed during compilation and linking, and needs to preserve command line arguments between the configuration and build phases. Maybe the build step should move into the opam file, but then we would need to ensure unique opam package names, and we would need to communicate to the user where the binary got built and installed.
The first commit to mirage is from 2004; back then, opam was an infant. Mirage2 ensured that a not-too-ancient version of OCaml was installed (functoria contained a similar piece of code). Mirage3 relies on opam to require a certain OCaml version (at the moment 4.03).
Mirage and functoria were developed while some support libraries were not yet available - worth mentioning are bos (mentioned above), fpath, logs, and astring. Parts of those libraries were embedded in functoria, and have now been replaced by the libraries themselves. (See mirage#703 and functoria#84 in case you want to know the details.)
Functoria support for OCaml <4.02 has been dropped, and astring is now in use. Support for OCaml <4.01 has been dropped from Mirage.
Some C bits and pieces, namely str, bignum, and libgcc.a, are no longer linked into every unikernel. This is documented in mirage#544 and mirage#663.
The overall statistics of Mirage3 look promising: more libraries, more contributors, less code, uniform error treatment, and unified logging support. Individual unikernels contain slightly less boilerplate code (as shown by these unified diffs).
The binary sizes of the above-mentioned examples (mirage-skeleton, nqsb, Canopy, pinata) differ only slightly (in the range of kilobytes) between Mirage2 and Mirage3, on both Unix and Xen. We are working on a performance harness to evaluate OCaml's flambda intermediate representation and dead code elimination; these should decrease binary size and improve performance.