Weekly Meeting: 2013-04-16

Agenda

Anil Madhavapeddy: overview of release goals
Dave Scott: libvirt and other status
Vincent Bernardoff: Mirari and testing
Thomas Gazagnaire: Irminsule, the storage substrate for Mirage
Amir Chaudhry: xen.org incubation status and next steps

Attendees: Thomas Gazagnaire, Vincent Bernardoff, Jonathan Ludlam, David Sheets, David Scott, Anil Madhavapeddy, Amir Chaudhry, Prashanth Mundkur

Minutes

Overview of release goals (Anil)

Aim of this meeting is to clarify release goals for 1.0. We're aiming for the OSCON 2013 talk in July, so release plan should have a stable output by June so that we can document and stabilise. Three concrete applications:

Need stub-domains for Xenstore and Xenconsoled. Device drivers (mainly netfront and blkfront) should compile on both Xen and Linux to let us swap between front-ends and back-ends as either stub-domain or dom0.
Self-hosting website and DNS server. Already working well now, but the TCP stack could use some attention, as well as an SSL prototype.
Proof of concept of distributed system (ideally can be demo'd for OSCON). Some strawman distributed system to start with; current thinking is something like OCamlot or Signpost. This would be a nice "Hello world!" demo as it requires actors, a persistent job queue. For example, i.e spawning new VMs in response to load, as evidenced by self-scaling web-server.

David Sheets (working on OCamlot), asks about the place for Mirage in OCamllot. For example would it be the dispatcher running as a Mirage kernel on "the cloud" and the workers would be running as ordinary VMs? This would demonstrate that the distributed system actually works. Anil: can discuss the details of this later (e.g what OS are the workers using), but that's correct in principle. We need to split the OS module into a controller that is responsible for resource provisioning, and the worker application itself. Vincent is already working on this on the UNIX backend.

Libvirt and other cloud APIs (Dave)

Dave: Libvirt manages VMs on a physical host and is hypervisor-agnostic. OCaml bindings exist for this via Richard Jones at Red Hat, and Dave Scott has been fixing up the Xen bindings. We can't use it to work with EC2 or proprietary clouds (have to use EC2 API directly there), but we could write something that sits between if really motivated. libvirt could work with KVM and other machines, which is a big plus when we have other hypervisor backends such as KVM.

Anil: Need to investigate mapping to cloud APIs; e.g Mirage app splits into the controller and the app that does worker as a separate VM (on Xen), or a separate process (on UNIX), or even uses capability APIs such as Capsicum on FreeBSD to separate the two.

No current fork API on Xen: that is, a VM cannot self-provision more resources, hence the need for this controller/worker split. This is a general issue on the cloud, and there are several ad-hoc PV interfaces (e.g. balloon driver). We need to think about how the OS library interface for Mirage will do in this space.

Decision: Initial cloud APIs to target are libvirt (will cover 90% of FOSS users as it works with xm) and EC2 (interesting demos and for self-hosting website).

Dave: Need to make sure that libvirt works on Ubuntu and Debian.

Jon: asks whether configuration is still a compile-time thing. Can also do key exchange thing at compile time, which should be cool and avoids dynamic secret exchange. Anil: yes, and closely tied into Mirari (see next).

#### Mirari (Vincent)

Mirari splits workflow into config stage (mirari configure) which installs OPAM dependencies, the build phase which invokes obuild, and mirari run to execute the result. The configure and build phases work fine, but we're missing mirari run to interface with libvirt and EC2.

Need to think about the FRP model for configuring and keeping track of dependencies. It would make an awesome demo (and useful in general!) to reconfigure server variables and recompile/launch precisely what is needed. Question asked if anyone has used FRP for anything serious. Silence ensues. Followed by some sniggering. Followed by serious looks. Mort's name is mentioned.

Prashanth: Does Mirari monitor VMs across multiple hosts? Anil: Eventually yes, but not in 1.0. It keeps track of local VMs for now on the same host and Mirari v2 will deal with tracking versions across hosts, or on EC2. Just libvirt for now will get us a long way in terms of usability.

Jon: we need to define some goal applications for Mirage 2.0 to make all these design decisions clearer. Anil agrees, and suggests:

Project Windsor, a dom0 free system that essentially turns xapi into a communicating set of unikernels.
For Mirage website, e.g automatically reconfiguring itself from individual to two VMs and a load balancer, and then more VMs and DNS balancing as load increases. This is the concept of self-scaling. How do we do this in the libraries? Anil would love to demo a self-scaling webserver at OSCON: Mort, Anil and Malte have played with this a couple of years ago, but its not practical at the VM level without fast boots (which we have now) and careful dependency tracking across resources and libraries. The latency and precision we can bring is an order of magnitude better than the current state-of-the-art, and (thanks to Dave's work on xenopsd), we have the control stacks to glue them all together.

Jon: XenServer pool-level support for these makes sense since there is already a cluster-wide API to deal with moving across hosts. Anil: yes, but the XAPI distributed db is a potential significant slowdown.

Irminsule (Thomas)

We need a storage interface (persistence) and control interface in Mirage.
Branch-consistency is the lowest level we can work at without embedding any policy in the database. Want to have a DB that grows immutably, and then we just have a garbage collection problem to deal with ("just").

Goal is to have a Mirage-friendly DB with 3 scenarios:

VM storage: expose an NFS library and eventually PV blkfront that can expose fake git directory to VM. Not stuck at the block-level but real filesystem-level packs.
Native Mirage filestore system. Needs to be lock-free as they have to run all over the public cloud. Irminsule designed for this. Could use Sqlite with data constraints but... no. Desire a more rigorous, low-level DB and build SQL on top as necessary.
For the RWO book, people making comments have lost stuff as Javascript API requires people to be online to report issues. We can compile Irminsule down into JS and then comments can be made locally and then sync/merged upstream.

Jon: asked if you need the whole blob/tree in your browser to do the stuff for RWO book? Anil/Thomas: you need the graph but not the whole thing as we're not stuck to only needing the git protocol. The low-level protocol actually supports lazy graph traversal fine. Anil: we can also rebase the trunk using a sliding window scheme.

Jon: Targeting Xenstore may be worth considering as its a close fit to Irminsule. Thomas points out that very close mappings between existing transaction semantics and Xenstore (Thomas worked on both). Anil: we can also turn Xenstore into a cluster-wide persistent store like this, removing a lot of XAPI complexity.

Prashanth: what if I want something from the subdirectory rather than root? Thomas: Git doesn't allow this but in his implementation its possible by delving at a lower level. Sheets: Git patches have been proposed but rejected that permit this in git for submodules.

Xenproject update (Amir)

Xen.org has become a Linux Foundation project (called Xen project) and there have been some high profile announcements so far which explicitly mention Mirage OS so we can expect more attention being paid to the work. There will also be blog posts and news about Mirage being put up on Xen Project's site and openmirage.org.

Anil: we all need to be looking over the installation instructions and checking they work ok. They need refreshing and tweaking as even small missing steps cause confusion.

Any other business

Jon wanted get Mirage set up as a test VM for XAPI. It does a great job (starts and stops in no time), which is part of motivation to get start- suspend working properly. Anil: fuzz-testing in general is a good thing and we can use this in all the libraries. Balraj created a TCP loopback VM that uses two VIFs in single VM (see mirage-skeleton/tcp).

Plan for next week's call: Discuss the mirari FRP integration in more detail. Invite Daniel Bunzli to advise.