Robigalia has been without a logo for over a year. It’s been in the back of my mind that any self-respecting open source project needs a logo. I took some design classes in high school and don’t have a budget, so I made one myself!
Here’s the end result:
I had some requirements when designing the logo. First, it should be simple enough that it looks good as a favicon (smaller than 64x64). It should also look good in monochrome or in restricted color palettes. The simpler, the better.
When I was brainstorming ideas for the logo, I got hung up trying to incorporate the idea of wheat, crop disease, agriculture, or Roman themes, owing to the project’s name. Eventually I remembered the Elian script and started doodling various words in it. A full word was too complicated, so I focused on a particular cluster of letters that fit together well: "rbg". This is also a shorthand for "Robigalia", which works out well!
What you see above is an SVG (created in Inkscape) inspired by those three letters in the Elian script. For symmetry, the actual letters are "rbp", but it doesn’t bother me that it doesn’t use the Elian script exactly. The lines and the whitespace between them are all the same width, which helps the SVG scale down well to small resolutions. The cyan color on the dark background reminds me of Tron. I also think it kinda looks like a sideways 8-bit jellyfish. eternaleye said it looks like a push-button schematic.
I think it’s kinda neat, and look forward to seeing some awesome desktop themes using it once we have a windowing system!
Today marks the first anniversary of Robigalia’s public debut, which happened much to my dismay at the time. It’s hard to express how much anxiety having something posted to a large forum like HN can cause when the work isn’t yet ready for public consumption. Not a great way to spend your birthday! With that in mind, the project has been relatively quiet until recently. I largely suspended code work on Robigalia while I was interning at Data61 on the seL4 verification team. That didn’t stop me thinking about it, though. A lot of the conceptual design is basically done.
I’ve been working on a "book", "Robigalia: An Operating System for the Modern Era" which documents the design and theory behind Robigalia. This has been really helpful for teasing out design issues before spending a lot of time implementing things. We’ve also been (idly) looking at modeling various portions of the system formally, for use with model checking tools to show that at least the model is sound. As time goes on, portions of the book will be serialized and posted on the blog. The book will become the core of my undergraduate thesis. One such post on timekeeping will be coming out shortly, pending expert review.
Implementation progress hasn’t completely stalled either. I’ve been chipping away at ACPI handling and device drivers. This is slowed massively by me teaching and training other undergraduates at my university how operating systems work, with the hope of getting them to be productive contributors to Robigalia. While I was at Data61, a student there implemented NetBSD’s rump hypercall interface. This will allow us to use NetBSD’s drivers in the short run. This is a huge help, given the wide variety of hardware that a modern OS needs to support!
In terms of inspiration, I’ve been hugely inspired by the Houyhnhnm computing essays, as well as KeyKOS, EROS, and Coyotos. As more of our design is made public, their influence will become very apparent. Our goal of a POSIX userspace hasn’t been forgotten. It’s still there. But making native application support as good as it can be is our top priority. Midori has also been a huge inspiration here.
seL4 has changed in the meantime too. The realtime kernel has been released, as has multicore support and an x64 port. A RISC-V port is also soon to come. We now target what will be known as the "stage" branch of seL4, which contains features destined to be verified, but which haven’t been yet.
The future is bright for Robigalia. While we work on the core, if you’re interested in helping the project, a good way to do that is to give feedback in the IRC channel when we’re brainstorming or analyzing designs. The absolute best way, however, is to contribute to rips, libpnet, tokio, rust-crypto, rustls, rayon, or even Rust itself. Bettering the Rust ecosystem is the best way to better Robigalia, in the short run. If you’re interested in devices, you might take a look at starting a pure-Rust USB stack or Vulkan drivers for some openly documented platform like AMD or Intel. This can largely be done in userspace on Linux, for the first phases.
Our plan is to have an initial release of the book and a working Robigalia system on April 25, 2017. See you then!
This past week I traveled to the first international seL4 workshop, thanks to funding provided by HRL. While there, I gave a presentation about some design patterns used in Robigalia. While slides aren’t currently available for most of these, I did take some notes. As/if slides are released, I’ll add links to this post.
High-assurance software for autonomous ground systems
This was a presentation by Aleksey Nogin from HRL about the work they are doing as part of the "ground team" of DARPA’s HACMS program. This included some of the various components they are using and the overall system structure.
seL4: All the cool new things
Gernot talked about the current roadmap for seL4 and what recent work has been done. In particular, he pointed out that the RT kernel is currently available and that multicore and x64 are in master. He also announced that, in the near future, there will be an explicit "stage" branch, which will eventually be merged into master as verification is completed for new pieces. The first thing added will be RT. The seL4 roadmap has been updated to reflect current reality.
Design patterns in Robigalia
This was my talk. The slides are available. I talked about some design patterns that we’ve come up with for robust dynamic systems: space banks, bushels, extended virtual message registers, and briefly mentioned persistence and drivers.
There will be more available on this in the coming weeks.
Remote attestation on seL4
Oak Rattanavipanon, from UC Irvine, talked about the work they have been doing to get remote attestation on seL4. Their approach right now considers only remote adversaries. They modify the kernel to work with the Sabre Lite’s secure boot. From this, they have some code which runs (currently) in the root thread to provide a hybrid attestation scheme. Using the hardware root of trust provided by the Sabre Lite’s secure boot implementation, they use the guarantees provided by seL4 to ensure confidentiality of the attestation key and integrity of the attestation code.
On top of this, they implement a simple network protocol wherein a remote verifier can request a MAC of a region of memory on the remote device. If the MAC matches what the verifier expects, then the verifier can trust that the software on the device hasn’t been subverted. There were a bunch of questions about the actual network protocol they use. Currently, they use a nonce to prevent replay, but the MAC is otherwise transported in cleartext with no other authentication.
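The challenge-response flow described above can be sketched in a few lines. This is a minimal illustration only: `toy_mac` below is a stand-in keyed checksum invented for this sketch (it is not cryptographically secure, and the real system's MAC algorithm and message format aren't in my notes), but it shows how the fresh nonce ties each response to one challenge.

```rust
// Toy sketch of a nonce-based attestation exchange. NOT a real MAC:
// `toy_mac` is an FNV-1a-style checksum seeded with the key and nonce,
// used here purely to illustrate the protocol shape.
fn toy_mac(key: u64, nonce: u64, region: &[u8]) -> u64 {
    let mut h = 0xcbf29ce484222325u64 ^ key ^ nonce.rotate_left(32);
    for &b in region {
        h ^= b as u64;
        h = h.wrapping_mul(0x100000001b3);
    }
    h
}

/// The prover (remote device) answers a challenge with a MAC over its memory.
fn prover_respond(key: u64, nonce: u64, memory: &[u8]) -> u64 {
    toy_mac(key, nonce, memory)
}

/// The verifier recomputes the MAC over the memory image it expects.
/// A fresh nonce per challenge prevents replaying an old response.
fn verifier_check(key: u64, nonce: u64, expected_memory: &[u8], response: u64) -> bool {
    toy_mac(key, nonce, expected_memory) == response
}
```

Note that because the response travels in cleartext with no other authentication, everything rests on the key staying confidential, which is exactly the property seL4 is used to guarantee here.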
I recall that they plan on releasing their code as open source, but I don’t have that written in my notes, so that memory may be faulty.
Data Distribution for seL4
Andrea Sorbini, from Real Time Innovations, talked about work they have been doing to bring one of their products, "RTI Connext DDS Micro", a Data Distribution Service (DDS) middleware, to seL4. They are doing this as part of a DARPA SBIR project. The tagline used to introduce the project was "What good is formal verification if you don’t have friends to talk to?". DDS is a pub/sub based peer-to-peer communications system. It provides structured message transport and has a variety of quality-of-service configuration options such as transport reliability, message history, delivery deadlines, durability, liveliness, and resource limits. The seL4 implementation supports most of these properties, and it was created from the DO-178C certifiable subset of the product, which requires static configuration of resources and communication paths. They are investigating using formal verification in cooperation with Galois to show memory safety of their receive path.
The primary challenge they ran into was a lack of an OS personality to target. They ended up building a libsel4osapi which provides interfaces for a clock, network, UDP, serial communication, timers, and a process abstraction. They did not enjoy doing this, commenting that "minimalism is no excuse for poor developer experience". They had quite a few suggestions for what would make their experience much more pleasant, many of which came up during the "gaps" conversation.
Debugging seL4 with GDB
Chris Guikema, from DornerWorks, talked about the work they have been doing to enable more robust and advanced GDB debugging, especially of CAmkES components. They have taken the x86 GDB server that Data61 developed and ported it to ARM, adding some features along the way. Parts of their code are available online to start using. I recall seeing a link to their GDB server in the slides, but can’t find it now.
Virtualization on seL4
Ihor Kuz, from Data61, gave an introduction to virtualization theory, including the basics of Popek-Goldberg virtualizability and the distinction between privileged and sensitive instructions. He gave an overview of virtualization in practice on x86 and ARM, as well as how some of the seL4 abstractions of these mechanisms work.
At the end of the day, we wrapped up with a discussion of gaps in seL4. Gernot went over a document that they created listing what they perceive as the gaps. There was general agreement among the audience with the analysis. I requested access to x86 MSRs, but no one else had any additions. There was then a brief conversation about one of the most pressing issues: drivers. The current idea is to have tables on the wiki containing links to open source drivers and a basic evaluation of them. Someone also floated the idea of a tool which boots Linux on a board, scans what devices are available, and puts together a report indicating which have drivers available for seL4 and where they can be found.
After working at Data61 with the verification team, and after many conversations with Gerwin Klein, I have a plan for how I want to achieve the goal of Trustworthy Rust.
Introduction to formal verification
Formal verification is the use of mathematical techniques to analyze software. In general, one wants to show that a particular program implements some specification ("functional correctness"). But there are other things you can show too! For example, seL4 shows high-level security properties like integrity and confidentiality. Actually doing this is quite hard, but it is relatively well understood these days. You need to model the program, then show that the model satisfies the spec. Hopefully, the model corresponds to reality. In practice, many verification techniques have serious problems with reality-correspondence.
As one semi-famous example, binary search was "proven correct" many times, but still had buggy implementations. What was the failure here? The model used mathematical integers, but the reality is finite, wrapping words. This is a complete embarrassment that any principled use of formal methods should avoid. If your model doesn’t correspond to reality, you need a tedious, manual, error-prone validation of the implementation. We can do better!
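The classic failure in that example was computing the midpoint as `(low + high) / 2`, which overflows fixed-width integers for large indices even though it is fine over mathematical integers. A minimal sketch of a binary search that avoids the gap:

```rust
/// Binary search over a sorted slice, returning the index of `target`
/// if present. The midpoint is computed as `low + (high - low) / 2`;
/// the textbook `(low + high) / 2` can overflow on fixed-width integers,
/// which is exactly the model/reality mismatch described above.
fn binary_search(xs: &[u64], target: u64) -> Option<usize> {
    let mut low = 0usize;
    let mut high = xs.len(); // exclusive upper bound
    while low < high {
        let mid = low + (high - low) / 2; // cannot overflow
        if xs[mid] < target {
            low = mid + 1;
        } else if xs[mid] > target {
            high = mid;
        } else {
            return Some(mid);
        }
    }
    None
}
```

A model whose integers wrap like the machine's would have caught this; a model over unbounded integers silently cannot.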
Phase 1: A semantics for Rust
The first phase is modeling the semantics of Rust programs. There are quite a few partial approaches to this so far. This is not to disparage those efforts; I hope they continue and succeed, because more work in this space can only be good!
On the one hand, you want a semantics so you can prove things like soundness of the language’s type system, or memory safety of fragments of the language. On the other hand, sometimes you want to actually prove properties about specific programs in the language. The former can sometimes use a simplified model, depending on the type system and guarantees it wants to make. The latter needs to capture all details of the language that are relevant to the program under consideration, and is generally much more formalization-intensive.
I have my own in-progress semantics for MIR that I’m working on between Robigalia sprints. In general, I think MIR is the best short-term approach for initial verification of Rust programs. In the long term, someone will need to write a full semantics for actual Rust, and write either a verified compiler or a proof-producing translator from it into MIR. This will be very challenging, as Rust’s type system is rather complex. I suspect the RustBelt folks will have the best start on this.
At this point, implementations of Rust (eg, rustc) could use the semantics to resolve confusion and as an ultimate arbiter for deciding the correctness of a program transformation (eg, an optimization). In the long run, the semantics could serve as the normative specification of the whole language.
Phase 2: program logics
Given a semantics for MIR, one can define things like Hoare logics over MIR programs. The benefit of having a semantics at this point is that the program logics can be shown sound, which is rather important. Soundness of a program logic means that properties proved in the logic are also true of the program. Ticki’s and RustProof’s work, mentioned above, cannot be shown sound without a semantics for MIR.
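To make this concrete: a Hoare triple $\{P\}\,C\,\{Q\}$ claims that if program $C$ starts in a state satisfying precondition $P$ and terminates, the final state satisfies postcondition $Q$. Soundness is then the statement that derivability in the logic implies semantic validity, where validity is defined in terms of the MIR semantics. A sketch:

```latex
% Soundness: a derivable triple is semantically valid.
\vdash \{P\}\; C \;\{Q\}
  \;\implies\;
\models \{P\}\; C \;\{Q\}

% Example triple for a single increment statement:
\{\, x = n \,\}\;\; x := x + 1 \;\;\{\, x = n + 1 \,\}
```

Without a semantics for MIR, the right-hand side of the implication has no meaning, which is why soundness cannot even be stated for logics built in its absence.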
You really want properties in the program logic to be nicely composable, so that proving results about, for example, portions of libcore can be slotted easily into proofs about larger programs. At this point, one could start writing specifications and proving functional correctness of small programs or library functions.
My plan for this right now is to use a separation logic and a semantic Hoare logic combined with something like AutoCorres to make verification easier. I hope to lean very heavily on the extensive libraries and infrastructure that Data61’s TS group has created. It will be a lot of work getting this working, but the down-the-line productivity benefits could be huge.
Phase 3: verified compilation
Given a proven-correct MIR program, we need some way to actually execute that program, in a way that the correctness proofs still apply. And ideally, we would like to retain most of the efficiency of writing in Rust instead of some higher level language like Haskell.
My plan for this right now is to write a MIR-to-"CakeML" translator, and use CakeML’s verified compiler to do all the heavy lifting of compiling to machine code. At first it would use their garbage collector for memory allocation. But with a verified allocator, potentially written in Rust, the GC should be avoidable.
In the very long run, it’d be good to connect to something like CompCert or even VeLLVM. These have their own unique challenges. For particular high-value programs, it might even be worthwhile to do something like seL4 and prove correspondence of compiler output and source program. I don’t know how feasible this will be, but it’s worth investigating.
Once we’ve written our program, shown that it satisfies the desired specification, proved that the specification has the properties we depend on, and produced an implementation that formally corresponds to that program, we can start relying on these programs for the most critical of software. At this point, our software is truly trustworthy.
This is my long-term vision, and a lot of sweat and elbow grease needs to go into making it a reality. But my hope is that the inherent strengths of Rust will make it easier and cheaper to build trustworthy software, compared to traditional methods used in C.
Thanks Alex Elsayed, James McGlashan, Gerwin Klein, and ticki, for early feedback on this post!
As of today, we’re now using Docker images and xargo instead of running our own GitLab runners on cmr’s desktop. There’s actually still a custom GitLab runner, but it’s purely an optimization (my machine is faster and has a good network connection). When it’s not available, GitLab shared runners will be used instead. Additionally, CI will check that all commits are signed, using some custom scripts. The scripts can ignore certain commits (we have a fair amount of unsigned history) and handle revoked keys specially. All of this amounts to a more robust, easier-to-maintain system, with quick feedback on merge requests and less configuration needed for people to get up and running with hacking on Robigalia.