By Jeremy Yallop - 2014-07-15
One of the most frequent questions about MirageOS from developers is "do I really need to write all my code in OCaml"? There are, of course, very good reasons to build the core system in pure OCaml: the module system permits reusing algorithmic abstractions at scale, and OCaml's static type checking makes it possible to enforce lightweight invariants across interfaces. However, it's ultimately necessary to support interfacing to existing code, and this blog post will describe what we're doing to make this possible this without sacrificing the security benefits afforded by unikernels.
A MirageOS application works by abstracting the logic of the
application from the details of platform that it is compiled for.
The mirage
CLI tool parses a configuration file that represents the
desired hardware target, which can be a Unix binary or a specialized
Xen guest OS. Our foreign function interface design elaborates on
these design principles by separating the description of the C
foreign functions from how we link to that code. For instance, a
Unix unikernel could use the normal ld.so
to connect to a shared
library, while in Xen we would need to interface to that C library
through some other mechanism (for instance, a separate VM could be
spawned to run the untrusted OpenSSL code). If you're curious about
how this works, this blog post is for you!
ocaml-ctypes ("ctypes" for short) is a library for gluing together OCaml code and C code without writing any C. This post introduces the ctypes library with a couple of simple examples, and outlines how OCaml's module system makes it possible to write high-level bindings to C that are independent of any particular linking mechanism.
Binding a C function using ctypes involves two steps.
For example, here's a binding to C's puts
function, which prints a string to
standard output and returns the number of characters written:
let puts = foreign "puts" (string @-> returning int)
After the call to foreign
the bound function is available to OCaml
immediately. Here's a call to puts
from the interactive top level:
# puts "Hello, world";;
Hello, world
- : int = 13
Now that we've had a taste of ctypes, let's look at a more realistic example: a program that defines bindings to the expat XML parsing library, then uses them to display the structure of an XML document.
We'll start by describing the types used by expat. Since ctypes represents C types as OCaml values, each of the types we need becomes a value binding in our OCaml program. The parser object involves an incomplete (abstract) struct definition and a typedef for a pointer to a struct:
struct xml_ParserStruct;
typedef xml_ParserStruct *xml_Parser;
In ctypes these become calls to the structure
and ptr
functions:
let parser_struct : [`XML_ParserStruct] structure typ = structure "xml_ParserStruct"
let xml_Parser = ptr parser_struct
Next, we'll use the type representations to bind some functions. The
XML_ParserCreate
and
XML_ParserFree
functions construct and destroy parser objects. As with puts
, each
function binding involves a simple call to foreign
:
let parser_create = foreign "XML_ParserCreate"
(ptr void @-> returning xml_Parser)
let parser_free = foreign "XML_ParserFree"
(xml_Parser @-> returning void)
Expat operates primarily through callbacks: when start and end elements are encountered the parser invokes user-registered functions, passing the tag names and attributes (along with a piece of user data):
typedef void (*start_handler)(void *, char *, char **);
typedef void (*end_handler)(void *, char *);
In ctypes function pointer types are built using the funptr
function:
let start_handler =
funptr (ptr void @-> string @-> ptr string @-> returning void)
let end_handler =
funptr (ptr void @-> string @-> returning void)
We can use the start_handler
and end_handler
type representations to bind
XML_SetElementHandler
, the callback-registration function:
let set_element_handler = foreign "XML_SetElementHandler"
(xml_Parser @-> start_handler @-> end_handler @-> returning void)
The type that OCaml infers for set_element_handler
reveals that the function
accepts regular OCaml functions as arguments, since the argument types are
normal OCaml function types:
val set_element_handler :
[ `XML_ParserStruct ] structure ptr ->
(unit ptr -> string -> string ptr -> unit) ->
(unit ptr -> string -> unit) -> unit
There's one remaining function to bind, then we're ready to use the
library. The
XML_Parse
function performs the actual parsing, invoking the callbacks when tags
are encountered:
let parse = foreign "XML_Parse"
(xml_Parser @-> string @-> int @-> int @-> returning int)
As before, all the functions that we've bound are available for use
immediately. We'll start by using them to define a more idiomatic OCaml entry
point to the library. The parse_string
function accepts the start and end
callbacks as labelled arguments, along with a string to parse:
let parse_string ~start_handler ~end_handler s =
let p = parser_create null in
let () = set_element_handler p start_handler end_handler in
let _ = parse p s (String.length s) 1 in
parser_free p
Using parse_string
we can write a program that prints out the names of each
element in an XML document, indented according to nesting depth:
let depth = ref 0
let start_handler _ name _ =
Printf.printf "%*s%s\\n" (!depth * 3) "" name;
incr depth
let end_handler _ _ =
decr depth
let () =
parse_string ~start_handler ~end_handler (In_channel.input_all stdin)
The full source of the program is available on github.
Here's the program in action:
$ ocamlfind opt -thread -package core,ctypes.foreign expat_example.ml \\
-linkpkg -cclib -lexpat -o expat_example
$ wget -q https://mirage.io/blog/atom.xml -O /dev/stdout \\
| ./expat_example
feed
id
title
subtitle
rights
updated
link
link
contributor
email
uri
name
[...]
Since this is just a high-level overview we've passed over a number of details. The interested reader can find a more comprehensive introduction to using ctypes in Chapter 19: Foreign Function Interface of Real World OCaml.
Up to this point we've been using a single function, foreign
, to
make C functions available to OCaml. Although foreign
is simple to
use, there's quite a lot going on behind the scenes. The two
arguments to foreign
are used to dynamically construct an OCaml
function value that wraps the C function: the name is used to resolve
the code for the C function, and the type representation is used to
construct a call frame appropriate to the C types invovled and to the
underlying platform.
The dynamic nature of foreign
that makes it convenient for
interactive use, also makes it unsuitable for some environments.
There are three main drawbacks:
Binding functions dynamically involves a certain loss of safety:
since C libraries typically don't maintain information about the
types of the functions they contain, there's no way to check whether
the type representation passed to foreign
matches the actual type of
the C function.
Dynamically constructing calls introduces a certain interpretative
overhead. In mitigation, this overhead is much less than might be supposed,
since much of the work can be done when the function is bound rather than
when the call is made, and foreign
has been used to bind C functions in
performance-sensitive applications without problems.
The implementation of foreign
uses a low-level library, libffi,
to deal with calling conventions across platforms. While libffi is mature
and widely supported, it's not appropriate for use in every environment.
For example, introducing such a (relatively) large and complex library into
Mirage would compromise many of the benefits of writing the rest of the
system in OCaml.
Happily, there's a solution at hand. As the introduction hints, foreign
is
one of a number of binding strategies, and OCaml's module system makes it easy
to defer the choice of which strategy to use when writing the actual code.
Placing the expat
bindings in a functor (parameterised module) makes it
possible to abstract over the linking strategy:
module Bindings(F : FOREIGN) =
struct
let parser_create = F.foreign "XML_ParserCreate"
(ptr void @-> returning xml_Parser)
let parser_free = F.foreign "XML_ParserFree"
(xml_Parser @-> returning void)
let set_element_handler = F.foreign "XML_SetElementHandler"
(xml_Parser @-> start_handler @-> end_handler @-> returning void)
let parse = F.foreign "XML_Parse"
(xml_Parser @-> string @-> int @-> int @-> returning int)
end
The Bindings
module accepts a single parameter of type FOREIGN
, which
encodes the binding strategy to use. Instantiating Bindings
with a module
containing the foreign
function used above recovers the
dynamically-constructed bindings that we've been using so far. However, there
are now other possibilities available. In particular, we can instantiate
Bindings
with code generators that output code to expose the bound functions
to OCaml. The actual instantiation is hidden behind a couple of convenient
functions, write_c
and write_ml
, which accept Bindings
as a parameter
and write to a formatter:
Cstubs.write_c formatter ~prefix:"expat" ~bindings:(module Bindings)
Cstubs.write_ml formatter ~prefix:"expat" ~bindings:(module Bindings)
Generating code in this way eliminates the concerns associated with constructing calls dynamically:
The C compiler checks the types of the generated calls against the C headers (the API), so the safety concerns associated with linking directly against the C library binaries (the ABI) don't apply.
There's no interpretative overhead, since the generated code is (statically) compiled.
The dependency on libffi disappears altogether.
How easy is it in practice to switch between dynamic and static binding strategies? It turns out that it's quite straightforward, even for code that was originally written without parameterisation. Bindings written using early releases of ctypes used the dynamic strategy exclusively, since dynamic binding was then the only option available. The commit logs for projects that switched over to static generation and linking (e.g. ocaml-lz4 and async-ssl) when it became available show that moving to the new approach involved only straightforward and localised changes.
Generating code is safer than constructing calls dynamically, since it
allows the C compiler to check the types of function calls against
declarations. However, there are some safety problems that even C's
type checking doesn't detect. For instance, the following call is
type correct (given suitable definitions of p
and q
), but is
likely to misbehave at run time:
memcpy(p, q, SIZE_MAX)
In contrast, code written purely in OCaml detects and prevents attempts to write beyond the bounds of allocated objects:
# StringLabels.blit ~src ~dst ~src_pos:0 ~dst_pos:0 ~len:max_int;;
Exception: Invalid_argument "String.blit".
It seems a shame to weaken OCaml's safety guarantees by linking in C code that can potentially write to any region of memory, but what is the alternative?
One possibility is to use privilege separation to separate
trusted OCaml code from untrusted C functions. The modular design of
ctypes means that privilege separation can be treated as one more
linking strategy: we can run C code in an entirely separate process
(or for Mirage/Xen, in a separate virtual machine), and instantiate
Bindings
with a strategy that forwards calls to the process using
standard inter-process communication. The remote calling strategy is
not supported in the current release of ctypes, but
it's scheduled for a future version. As with the switch from dynamic
to static bindings, we anticipate that updating existing bindings to
use cross-process calls will be straightforward.
This introductory post should give you a sense of the power of the unikernel approach in Mirage. By turning the FFI into just another library (for the C interface description) and protocol (for the linkage model), we can use code generation to map application logic onto the privilege model most suitable for the target hardware platform. This starts with Unix processes, continues onto Xen paravirtualization, and could even extend into CHERI fine-grained compartmentalization.
Although ctypes is a fairly new library, it's already in use in a number of projects across a variety of domains: graphics, multimedia, compression, cryptography, security, geospatial data, communication, and many others. Further resources (documentation, forums, etc.) are available via the home page.