By Hannes Mehnert - 2019-03-05
We are happy to announce our MirageOS 3.5.0 release. We did not announce the releases after 3.0.0 very well, so this post summarizes the changes in the MirageOS ecosystem over the past two years. MirageOS consists of over 100 opam packages, many of which are reused in other OCaml projects and deployments without MirageOS. These opam packages are maintained and developed further by many developers.
On the OCaml tooling side, we have made several major changes since MirageOS 3.0.0:
pin-depends allows you to depend on a development branch of any opam package for your unikernel,
mirage command-line utility now emits lower and upper bounds of opam packages, allowing uncompromising deprecation of packages,
safe-string is enabled by default. Strings are immutable now!!,
result package, which has been incorporated into Pervasives since OCaml 4.03.0.
The 3.5.0 release contains several API improvements of different MirageOS interfaces - if you're developing your own MirageOS unikernels, you may want to read this post to adjust to the new APIs.
type t constrained to unit as of 2.0.0;
ETHIF module type to the clearer ETHERNET. As of 2.0.0 it also contains keep-alive support, complies with recent TCP/IP layering rework (see below), and IPv4 now supports reassembly and fragmentation;
We improved the key-value store API, and added a read-write store. There is also ongoing work which implements the read-write interface using irmin, a branchable persistent storage that can communicate via the git protocol. Motivations for these changes were the development of CalDAV, but also the development of wodan, a flash-friendly, safe and flexible filesystem. The goal is to EOL the mirage-fs interface in favour of the key-value store.
Major API improvements (in this PR, since 2.0.0):
key is now a path (list of segments) instead of a
value type is now a
list : t -> key -> (string * [`Value | `Dictionary], error) result io was added
get : t -> key -> (value, error) result io is now provided (used to be named read and requiring an
last_modified : t -> key -> (int * int64, error) result io and digest : t -> key -> (string, error) result io have been introduced
RW for read-write key-value stores extends RO with three functions
There is now a non-persistent in-memory implementation of a read-write key-value store available. Other implementations (such as crunch, mirage-kv-unix, mirage-fs, tar) have been adapted, as well as clients of mirage-kv (dns, cohttp, tls).
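To make the path-based keys more concrete, here is a minimal sketch of a read-only store modelled as a tree. It is not the mirage-kv implementation: the node type, the error variants, the synchronous (non-io) results, and the choice to have list return a list of entries are all assumptions made for illustration.

```ocaml
(* Hypothetical, simplified model of a read-only key-value store whose keys
   are paths. Names and types are illustrative, not the mirage-kv API. *)
type key = string list  (* a path as a list of segments *)

type error =
  [ `Not_found of key | `Dictionary_expected of key | `Value_expected of key ]

type node = Value of string | Dictionary of (string * node) list

(* Walk the tree one path segment at a time. *)
let rec find node key =
  match node, key with
  | n, [] -> Some n
  | Dictionary entries, seg :: rest ->
    (match List.assoc_opt seg entries with
     | Some child -> find child rest
     | None -> None)
  | Value _, _ :: _ -> None

(* get returns the value stored at a key, or an error. *)
let get store key : (string, error) result =
  match find store key with
  | Some (Value v) -> Ok v
  | Some (Dictionary _) -> Error (`Value_expected key)
  | None -> Error (`Not_found key)

(* list returns the names directly below a key, tagged by kind. *)
let list store key : ((string * [ `Value | `Dictionary ]) list, error) result =
  match find store key with
  | Some (Dictionary entries) ->
    Ok (List.map
          (fun (name, n) ->
             name, (match n with Value _ -> `Value | Dictionary _ -> `Dictionary))
          entries)
  | Some (Value _) -> Error (`Dictionary_expected key)
  | None -> Error (`Not_found key)

(* Example store with one nested dictionary. *)
let store =
  Dictionary [ "etc", Dictionary [ "motd", Value "hello" ]; "README", Value "docs" ]
```

With this model, get store ["etc"; "motd"] yields Ok "hello", while asking for a value at ["etc"] reports that a value was expected but a dictionary was found.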
The IPv4 implementation now has support for fragment reassembly. Each incoming IPv4 fragment is checked for the "more fragments" and "offset" fields. If these are non-zero, the fragment is processed by the fragment cache, which uses a least recently used data structure with a maximum size of 256 kB of content, shared by all incoming fragments. If there is any overlap in fragments, the entire packet is dropped (avoiding security issues). Fragments may arrive out of order. The code is heavily unit-tested. Each IPv4 packet may consist of at most 16 fragments (to minimise CPU DoS with lots of small fragments), and the timeout between the first and last fragment is 10 seconds.
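The reassembly rules described above (out-of-order arrival, drop on overlap, at most 16 fragments) can be sketched as follows. This is a simplified illustration, not the code in the IPv4 implementation: the fragment record and the result variants are hypothetical, offsets are in bytes rather than the 8-byte units used on the wire, and the LRU cache and the 10 second timeout are left out.

```ocaml
(* One received fragment: byte offset into the original payload, the
   "more fragments" flag, and the fragment's data. Hypothetical type. *)
type fragment = { off : int; more : bool; data : string }

let max_fragments = 16  (* per-packet limit mentioned in the text *)

(* Try to reassemble a packet from the fragments collected so far.
   Returns `Incomplete while pieces are missing, `Drop on overlap or when
   there are too many fragments, and `Done payload on success. *)
let reassemble frags =
  if List.length frags > max_fragments then `Drop
  else
    (* Fragments may arrive out of order, so sort by offset first. *)
    let sorted = List.sort (fun a b -> compare a.off b.off) frags in
    let rec go expected acc = function
      | [] -> `Incomplete                            (* final fragment not seen *)
      | f :: _ when f.off < expected -> `Drop        (* overlaps earlier data *)
      | f :: _ when f.off > expected -> `Incomplete  (* hole before this one *)
      | f :: rest ->
        let acc = f.data :: acc in
        let next = f.off + String.length f.data in
        if f.more then go next acc rest
        else if rest = [] then `Done (String.concat "" (List.rev acc))
        else `Drop                                   (* data past the final fragment *)
    in
    go 0 [] sorted
```

Dropping the whole packet on any overlap keeps the logic short and avoids having to decide which of two conflicting byte ranges to trust.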
The layering and allocation discipline has been revised.
ethernet (now encapsulating and decapsulating Ethernet) and
arp (the address resolution protocol) are separate opam packages, and no longer part of
At the lowest layer, mirage-net is the network device. This interface is implemented by our different backends (xen, solo5, unix, macos, and vnetif). Some backends require buffers to be page-aligned when they are passed to the host system. This was previously not really ensured: while the abstract type
page_aligned_buffer was required,
writev) took the abstract
buffer type (always constrained to
Cstruct.t by mirage-net-lwt). The
mtu (maximum transmission unit) used to be an optional
connect argument to the Ethernet layer, but now it is a function which needs to be provided by mirage-net.
The Mirage_net.write function now has a signature that is explicit about ownership and lifetime:
val write : t -> size:int -> (buffer -> int) -> (unit, error) result io.
It requires a requested
size argument to be passed, and a fill function which is called with an allocated buffer, that satisfies the backend demands. The
fill function is supposed to write to the buffer, and return the length of the frame to be sent out. It cannot fail (who would handle such an error anyway?), and it does not run in the IO monad. The
fill function should not save any references to the buffer, since this is the network device's memory, and may be reused. The
writev function has been removed.
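The ownership rule can be illustrated with a small mock device. This is only a sketch under several assumptions: Bytes stands in for Cstruct.t, the device record and the `Invalid_length error are hypothetical, and the result is returned directly instead of in the IO monad.

```ocaml
(* Hypothetical stand-in for a network device: it owns one reusable buffer
   and records the frames it "sends". *)
type device = { mtu : int; scratch : Bytes.t; mutable sent : string list }

type error = [ `Invalid_length ]

let create ~mtu = { mtu; scratch = Bytes.create mtu; sent = [] }

(* write dev ~size fill: hand the device-owned buffer to fill, then send
   the first len bytes, where len is the value fill returned. *)
let write dev ~size fill : (unit, error) result =
  if size < 0 || size > dev.mtu then Error `Invalid_length
  else begin
    Bytes.fill dev.scratch 0 size '\000';  (* clear the requested region *)
    let len = fill dev.scratch in
    if len < 0 || len > size then Error `Invalid_length
    else begin
      (* Copy out the frame; the scratch buffer is reused by the next write,
         which is why fill must not keep a reference to it. *)
      dev.sent <- Bytes.sub_string dev.scratch 0 len :: dev.sent;
      Ok ()
    end
  end

let dev = create ~mtu:1500

(* The caller asks for 5 bytes and reports that it used all of them. *)
let result =
  write dev ~size:5 (fun buf ->
      Bytes.blit_string "hello" 0 buf 0 5;
      5)
```

Because the callback receives memory that the device will reuse, any data the caller wants to keep has to be copied out before fill returns.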
The Ethernet layer does encapsulation and decapsulation now. Its
write function has the following signature:
val write: t -> ?src:macaddr -> macaddr -> Ethernet.proto -> ?size:int -> (buffer -> int) -> (unit, error) result io.
It fills in the Ethernet header with the given source address (defaults to the device's own MAC address) and destination address, and Ethernet protocol. The
size argument is optional, and defaults to the MTU. The
buffer that is passed to the
fill function is usable from offset 0 on. The Ethernet header is not visible at higher layers.
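A rough model of this encapsulation is shown below. Everything in it is an assumption for illustration: the buffer record imitates a Cstruct.t view, net_write is a hypothetical lower layer that returns the frame as a string, addresses and the protocol are raw byte strings, and all arguments are labelled for readability, unlike the real signature.

```ocaml
(* A tiny view type standing in for Cstruct.t: a window into shared bytes. *)
type buffer = { bytes : Bytes.t; off : int; len : int }

(* Move the start of the view forward by n bytes. *)
let shift b n = { b with off = b.off + n; len = b.len - n }

(* Write string s at position pos, relative to the start of the view. *)
let set_string b pos s =
  Bytes.blit_string s 0 b.bytes (b.off + pos) (String.length s)

let header_size = 14  (* destination MAC, source MAC, 2-byte protocol *)

(* Hypothetical lower layer: allocate size bytes, run fill, return the frame. *)
let net_write ~size fill =
  let b = { bytes = Bytes.make size '\000'; off = 0; len = size } in
  let n = fill b in
  Bytes.sub_string b.bytes 0 n

(* Ethernet-style write: reserve room for the header, write it, and give the
   upper layer a view that starts right after it. *)
let eth_write ~src ~dst ~proto ~size fill =
  net_write ~size:(header_size + size) (fun b ->
      set_string b 0 dst;
      set_string b 6 src;
      set_string b 12 proto;
      header_size + fill (shift b header_size))

(* The upper layer writes at offset 0 of its own view and never sees the header. *)
let frame =
  eth_write ~src:"\x02\x00\x00\x00\x00\x01" ~dst:"\xff\xff\xff\xff\xff\xff"
    ~proto:"\x08\x00" ~size:4
    (fun b -> set_string b 0 "ping"; 4)
```

The shifted view is what lets each layer treat offset 0 as the start of its own data while all layers still write into one buffer owned by the device.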
The IP layer also embeds a revised
val write: t -> ?fragment:bool -> ?ttl:int -> ?src:ipaddr -> ipaddr -> Ip.proto -> ?size:int -> (buffer -> int) -> buffer list -> (unit, error) result io.
This is similar to the Ethernet signature - it writes the IPv4 header and sends a packet. It also supports fragmentation (including setting the do-not-fragment bit for path MTU discovery) -- whenever the payload is too big for a single frame, it is sent as multiple fragmented IPv4 packets. Additionally, setting the time-to-live is now supported, meaning we now can implement traceroute!
The API used to include two functions,
write, where only buffers allocated by the former should be used in the latter. This has been combined into a single function that takes a fill function and a list of payloads. This change is for maximum flexibility: a higher layer can either construct its header and payload, and pass it to
write as payload argument (the
buffer list), which is then copied into the buffer(s) allocated by the network device, or the upper layer can provide the callback
fill function to assemble its data into the buffer allocated by the network device, to avoid copying. Of course, both can be used - the outgoing packet contains the IPv4 header, and possibly the buffer until the offset returned by
fill, and afterwards the payload.
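The resulting packet layout (header, then the bytes written by fill, then the payload list) can be modelled in a few lines. This sketch is an assumption-heavy simplification: ip_write is a hypothetical function, the header is a placeholder string, Bytes and strings replace Cstruct.t, and fragmentation is not modelled.

```ocaml
(* Hypothetical model of the combined write: the packet is the header,
   then whatever fill wrote, then the payload buffers in order. *)
let ip_write ~header ~size fill payloads =
  let scratch = Bytes.make size '\000' in
  let filled = fill scratch in  (* number of bytes written by fill *)
  String.concat "" (header :: Bytes.sub_string scratch 0 filled :: payloads)

(* Style 1: everything is assembled in place by fill (no copying of payloads). *)
let only_fill =
  ip_write ~header:"[ip]" ~size:8
    (fun b -> Bytes.blit_string "udp+data" 0 b 0 8; 8)
    []

(* Style 2: everything is passed as payload buffers and copied. *)
let only_payload = ip_write ~header:"[ip]" ~size:0 (fun _ -> 0) [ "udp"; "+data" ]

(* Style 3: the upper-layer header comes from fill, the data from the payload list. *)
let both =
  ip_write ~header:"[ip]" ~size:3
    (fun b -> Bytes.blit_string "udp" 0 b 0 3; 3)
    [ "+data" ]
```

All three calls produce the same bytes; they differ only in who assembles the data and how much copying takes place.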
The TCP implementation has preliminary keepalive support.
ukvm target was renamed to
solo5-hvt is the monitoring process
The semantics of arguments passed to a MirageOS unikernel used to vary between different backends, now they're the same everywhere: all arguments are concatenated using the whitespace character as separator, and split on the whitespace character again by parse-argv. To pass a whitespace character in an argument, the whitespace now needs to be escaped:
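As an illustration of splitting with escaped whitespace, here is a small sketch. It is not the parse-argv code: split_args is a hypothetical function, only the space character is treated as a separator, and the use of a backslash as the escape character is an assumption of this sketch.

```ocaml
(* Split on spaces, treating backslash-space as a literal space.
   The backslash escape is an assumption made for this sketch. *)
let split_args s =
  let buf = Buffer.create 16 in
  let args = ref [] in
  (* Finish the current argument, if any characters were collected. *)
  let flush () =
    if Buffer.length buf > 0 then begin
      args := Buffer.contents buf :: !args;
      Buffer.clear buf
    end
  in
  let n = String.length s in
  let rec go i =
    if i >= n then flush ()
    else if s.[i] = '\\' && i + 1 < n && s.[i + 1] = ' ' then begin
      Buffer.add_char buf ' ';  (* escaped space stays inside the argument *)
      go (i + 2)
    end else if s.[i] = ' ' then begin
      flush ();                 (* unescaped space ends the argument *)
      go (i + 1)
    end else begin
      Buffer.add_char buf s.[i];
      go (i + 1)
    end
  in
  go 0;
  List.rev !args

(* Two arguments; the first contains a space. *)
let example = split_args "--name=foo\\ bar --verbose"
```

Here the escaped space stays part of the first argument, while the unescaped space separates it from the second.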
We are working on further changes which revise the
mirage internal build system to dune. At the moment it uses
make. The goal of this change is to make MirageOS more developer-friendly. On the horizon we have MirageOS unikernel monorepos, incremental builds, pain-free cross-compilation, documentation generation, ...