By Hannes Mehnert - 2019-03-05
We are happy to announce our MirageOS 3.5.0 release. We didn't announce post 3.0.0 releases too well -- that's why this post tries to summarize the changes in the MirageOS ecosystem over the past two years. MirageOS consists of over 100 opam packages, lots of which are reused in other OCaml projects and deployments without MirageOS. These opam packages are maintained and developed further by lots of developers.
On the OCaml tooling side, since MirageOS 3.0.0 we made several major changes:
- pin-depends in config.ml. pin-depends allows you to depend on a development branch of any opam package for your unikernel,
- the mirage command-line utility now emits lower and upper bounds of opam packages, allowing uncompromising deprecation of packages,
- safe-string is enabled by default. Strings are immutable now!
- result package, which has been incorporated into Pervasives since OCaml 4.03.0.
The 3.5.0 release contains several API improvements of different MirageOS interfaces. If you're developing your own MirageOS unikernels, you may want to read this post to adjust to the new APIs.
- type t constrained to unit as of 2.0.0;
- ETHIF module type to the clearer ETHERNET. As of 2.0.0 it also contains keep-alive support, complies with recent TCP/IP layering rework (see below), and IPv4 now supports reassembly and fragmentation;
We improved the key-value store API, and added a read-write store. There is also ongoing work which implements the read-write interface using irmin, a branchable persistent storage that can communicate via the git protocol. Motivations for these changes were the development of CalDAV, but also the development of wodan, a flash-friendly, safe and flexible filesystem. The goal is to EOL the mirage-fs interface in favour of the key-value store.
Major API improvements (in this PR, since 2.0.0):
- key is now a path (list of segments) instead of a string
- value type is now a string
- list : t -> key -> (string * [`Value|`Dictionary], error) result io was added
- get : t -> key -> (value, error) result io is now provided (used to be named read and requiring an offset and length parameter)
- last_modified : t -> key -> (int * int64, error) result io and digest : t -> key -> (string, error) result io have been introduced
- size was removed.
- RW for read-write key-value stores extends RO with three functions set, remove, and batch
There is now a non-persistent in-memory implementation of a read-write key-value store available. Other implementations (such as crunch, mirage-kv-unix, mirage-fs, tar) have been adapted, as well as clients of mirage-kv (dns, cohttp, tls).
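To make the shape of the revised interface concrete, here is a minimal sketch of an in-memory store with path keys and string values. It is not the mirage-kv code: the module name Kv_sketch is hypothetical, the functions are synchronous rather than in the io monad, only a `Not_found error is modelled, and batch, last_modified and digest are left out.

```ocaml
(* Simplified, synchronous model of a read-write key-value store.
   All names here are illustrative assumptions, not the mirage-kv API. *)
module Kv_sketch = struct
  type key = string list   (* a path: list of segments *)
  type value = string

  module M = Map.Make (struct
    type t = key
    let compare = compare
  end)

  type t = { mutable store : value M.t }

  let create () = { store = M.empty }

  (* [get] returns the whole value; there is no offset or length. *)
  let get t key =
    match M.find_opt key t.store with
    | Some v -> Ok v
    | None -> Error (`Not_found key)

  let set t key value =
    t.store <- M.add key value t.store;
    Ok ()

  let remove t key =
    t.store <- M.remove key t.store;
    Ok ()

  (* [strip prefix k] is the remainder of [k] after [prefix], if any. *)
  let rec strip prefix k =
    match prefix, k with
    | [], rest -> Some rest
    | p :: ps, s :: ss when String.equal p s -> strip ps ss
    | _ -> None

  (* [list] reports the direct children of [key], tagged as a value
     or as a dictionary (a path with further segments below it). *)
  let list t key =
    let add name kind acc =
      if List.mem_assoc name acc then acc else (name, kind) :: acc
    in
    let entries =
      M.fold
        (fun k _ acc ->
          match strip key k with
          | Some [ name ] -> add name `Value acc
          | Some (name :: _ :: _) -> add name `Dictionary acc
          | Some [] | None -> acc)
        t.store []
    in
    Ok (List.rev entries)
end
```

With keys ["a"; "b"] and ["a"; "c"; "d"] stored, listing ["a"] in this sketch reports "b" as a value and "c" as a dictionary.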
The IPv4 implementation now has support for fragment reassembly. Each incoming IPv4 fragment is checked for the "more fragments" and "offset" fields. If either is non-zero, the fragment is processed by the fragment cache, which uses a least recently used (LRU) data structure holding at most 256kB of content, shared by all incoming fragments. If there is any overlap in fragments, the entire packet is dropped (avoiding security issues). Fragments may arrive out of order. The code is heavily unit-tested. Each IPv4 packet may consist of at most 16 fragments (to minimise CPU DoS with lots of small fragments), and the timeout between the first and last fragment is 10 seconds.
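The reassembly rules can be illustrated with a small sketch. This is not the tcpip implementation: the fragment record and the outcome type are hypothetical, offsets are plain byte offsets (real IPv4 counts in units of 8 bytes), and the LRU cache, the 256kB bound and the 10 second timeout are omitted.

```ocaml
(* Hypothetical model of one received fragment. *)
type fragment = { offset : int; more : bool; payload : string }

type outcome =
  | Complete of string   (* all fragments present, payload reassembled *)
  | Incomplete           (* still waiting for more fragments *)
  | Dropped              (* overlap, too many fragments, or malformed *)

let max_fragments = 16

let reassemble frags =
  if List.length frags > max_fragments then Dropped
  else
    (* Fragments may arrive out of order, so sort by offset first. *)
    let sorted = List.sort (fun a b -> compare a.offset b.offset) frags in
    let rec go expected acc = function
      | [] -> Incomplete
      | f :: rest ->
        if f.offset < expected then Dropped          (* overlap *)
        else if f.offset > expected then Incomplete  (* hole *)
        else
          let acc = f.payload :: acc in
          let next = expected + String.length f.payload in
          if f.more then go next acc rest
          else if rest = [] then Complete (String.concat "" (List.rev acc))
          else Dropped                               (* data after the last fragment *)
    in
    go 0 [] sorted
```

Treating any overlap as a reason to drop the whole packet keeps the logic simple and avoids having to decide which of two conflicting byte ranges to trust.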
The layering and allocation discipline has been revised. ethernet (now encapsulating and decapsulating Ethernet) and arp (the address resolution protocol) are separate opam packages, and no longer part of tcpip.
At the lowest layer, mirage-net is the network device. This interface is implemented by our different backends (xen, solo5, unix, macos, and vnetif). Some backends require buffers to be page-aligned when they are passed to the host system. This was previously not really ensured: while the abstract type page_aligned_buffer was required, write (and writev) took the abstract buffer type (always constrained to Cstruct.t by mirage-net-lwt). The mtu (maximum transmission unit) used to be an optional connect argument to the Ethernet layer, but now it is a function which needs to be provided by mirage-net.
The Mirage_net.write function now has a signature that is explicit about ownership and lifetime: val write : t -> size:int -> (buffer -> int) -> (unit, error) result io.
It requires a requested size argument to be passed, and a fill function which is called with an allocated buffer that satisfies the backend's demands. The fill function is supposed to write to the buffer, and return the length of the frame to be sent out. It can neither error (who should handle such an error anyway?), nor is it in the IO monad. The fill function should not save any references to the buffer, since this is the network device's memory, and may be reused. The writev function has been removed.
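The ownership rule can be modelled with a toy device, sketched below under several assumptions: it uses Bytes from the standard library instead of Cstruct.t, it is synchronous, the device record and both error tags are hypothetical, and fill sees the whole device buffer rather than a view of the requested size.

```ocaml
(* Toy device: owns one reusable buffer and records the frames it "sent". *)
type device = { buf : Bytes.t; mutable sent : string list }

let create ~capacity = { buf = Bytes.create capacity; sent = [] }

(* [write dev ~size fill] lends the device-owned buffer to [fill], which
   writes the frame and returns its length. Only the first [len] bytes
   are sent. [fill] must not keep a reference to the buffer, because the
   next [write] reuses the same memory. *)
let write dev ~size fill =
  if size > Bytes.length dev.buf then Error `Too_large
  else begin
    let len = fill dev.buf in
    if len < 0 || len > size then Error `Invalid_length
    else begin
      dev.sent <- Bytes.sub_string dev.buf 0 len :: dev.sent;
      Ok ()
    end
  end
```

A caller would pass something like (fun b -> Bytes.blit_string "hello" 0 b 0 5; 5): the closure writes into the lent buffer and reports how many bytes are valid.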
The Ethernet layer does encapsulation and decapsulation now. Its write function has the following signature:
val write: t -> ?src:macaddr -> macaddr -> Ethernet.proto -> ?size:int -> (buffer -> int) -> (unit, error) result io
It fills in the Ethernet header with the given source address (defaults to the device's own MAC address) and destination address, and Ethernet protocol. The size argument is optional, and defaults to the MTU. The buffer that is passed to the fill function is usable from offset 0 on. The Ethernet header is not visible at higher layers.
The IP layer also embeds a revised write signature:
val write: t -> ?fragment:bool -> ?ttl:int -> ?src:ipaddr -> ipaddr -> Ip.proto -> ?size:int -> (buffer -> int) -> buffer list -> (unit, error) result io
This is similar to the Ethernet signature - it writes the IPv4 header and sends a packet. It also supports fragmentation (including setting the do-not-fragment bit for path MTU discovery) -- whenever the payload is too big for a single frame, it is sent as multiple fragmented IPv4 packets. Additionally, setting the time-to-live is now supported, meaning we now can implement traceroute!
The API used to include two functions, allocate_frame and write, where only buffers allocated by the former should be used in the latter. This has been combined into a single function that takes a fill function and a list of payloads. This change is for maximum flexibility: a higher layer can either construct its header and payload, and pass it to write as payload argument (the buffer list), which is then copied into the buffer(s) allocated by the network device, or the upper layer can provide the callback fill function to assemble its data into the buffer allocated by the network device, to avoid copying. Of course, both can be used - the outgoing packet contains the IPv4 header, and possibly the buffer until the offset returned by fill, and afterwards the payload.
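The resulting layout of the bytes that follow the IPv4 header can be sketched as follows. The function assemble is a hypothetical illustration, not part of tcpip: it uses Bytes and strings instead of Cstruct.t, leaves out the IPv4 header itself, and ignores fragmentation.

```ocaml
(* [assemble ~size fill payloads] models the data after the IPv4 header:
   first the bytes written in place by [fill] (up to the offset it
   returns), then the payload buffers, copied in order. *)
let assemble ~size fill payloads =
  let buf = Bytes.make size '\000' in
  let filled = fill buf in
  if filled < 0 || filled > size then invalid_arg "assemble: bad fill length";
  String.concat "" (Bytes.sub_string buf 0 filled :: payloads)
```

Passing (fun _ -> 0) as fill corresponds to the payload-only style, an empty payload list to the fill-only style, and supplying both places the filled bytes before the payloads.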
The TCP implementation has preliminary keepalive support.
- ukvm target was renamed to hvt, where solo5-hvt is the monitoring process
The default random device from the OCaml standard library is now properly seeded using mirage-entropy. In the future, we plan to make the fortuna RNG the default random number generator.
The semantics of arguments passed to a MirageOS unikernel used to vary between different backends, now they're the same everywhere: all arguments are concatenated using the whitespace character as separator, and split on the whitespace character again by parse-argv. To pass a whitespace character in an argument, the whitespace now needs to be escaped: --hello=foo\\ bar.
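The splitting rule can be sketched as below. This is not the parse-argv code: split_args is a hypothetical helper, and it assumes that the string reaching the unikernel carries a single backslash in front of the escaped space.

```ocaml
(* Split a command line on unescaped spaces. A backslash makes the
   following character literal, so "\ " stays inside the argument. *)
let split_args s =
  let buf = Buffer.create 16 in
  let args = ref [] in
  let flush () =
    if Buffer.length buf > 0 then begin
      args := Buffer.contents buf :: !args;
      Buffer.clear buf
    end
  in
  let n = String.length s in
  let rec go i =
    if i >= n then flush ()
    else
      match s.[i] with
      | '\\' when i + 1 < n ->
        Buffer.add_char buf s.[i + 1];
        go (i + 2)
      | ' ' ->
        flush ();
        go (i + 1)
      | c ->
        Buffer.add_char buf c;
        go (i + 1)
  in
  go 0;
  List.rev !args
```

In this sketch, the input --hello=foo\ bar --x=1 yields two arguments, the first of which contains a space.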
You may also want to read the MirageOS 3.2.0 announcement and the MirageOS 3.3.0 announcement.
We are working on further changes which revise the mirage internal build system to dune. At the moment it uses ocamlbuild, ocamlfind, pkg-config, and make. The goal of this change is to make MirageOS more developer-friendly. On the horizon we have MirageOS unikernel monorepos, incremental builds, pain-free cross-compilation, documentation generation, ...
Several other MirageOS ecosystem improvements are on the schedule for 2019, including an irmin 2.0 release, a seccomp target for Solo5, as well as easier deployment and multiple interfaces in Solo5.