Running the kernel in library mode


By Jonathan Corbet
April 8, 2015

Once upon a time, the only way to run the Linux kernel was as the primary operating system on a handy piece of hardware. Since then, though, other modes of operation have become possible: the kernel can, for example, be run as the guest of another kernel through virtualization, or as a user-space process with the user-mode Linux (UML) port. One mode that has not been supported is running the kernel as a library that can be called from within an application program, but that situation appears to be about to change thanks to a patch set which has just made its first appearance on the linux-kernel list.

This patch set, posted by Hajime Tazaki, goes by the name LibOS; it was presented (slides are available on SlideShare) at the recent Netdev 0.1 conference. LibOS is structured as if it were a new architecture port; it can be found under arch/lib in the kernel tree. But this port, when built, does not result in a bootable kernel; instead, it creates a shared library that can then be loaded into a running process.

One might wonder why this mode of operation would be useful. Though it is not limited to this particular use, the main focus of LibOS at the moment is to make the Linux network stack available to user-space applications. User-space network stacks are not unheard of in the Linux world; they have shown up in certain performance-sensitive settings for some years now. With LibOS, it is not necessary to write (or port) a new network stack to run in a Linux process; the kernel's network stack is now available to use directly.

Needless to say, one does not just make the network stack callable from user space without doing a bit of work. To make this mode possible, the LibOS developers have created a whole set of stub functions to replace various kernel functions used by the networking code. Indeed, the bulk of the patch set consists of thousands of lines of stub functions. They do things like replacing the slab allocator with a simple version based on malloc() and, for the most part, shorting out the filesystem layer entirely. When that is done, what's left is the networking stack with almost enough scaffolding to let it run standalone within a process's address space.
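
To make the idea of stubbing out a kernel service concrete, here is a minimal sketch of a slab-cache interface backed directly by malloc(). It is an illustration under stated assumptions, not the LibOS code: the names stub_cache, stub_cache_create(), stub_cache_alloc(), stub_cache_free(), and stub_cache_destroy() are hypothetical, and the stubs in the actual patch set may look quite different.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a kernel slab cache: it keeps only what a
 * malloc()-backed stub needs. None of these names come from LibOS. */
struct stub_cache {
    const char *name;   /* label, useful for debugging only */
    size_t obj_size;    /* size of every object handed out */
    size_t live;        /* objects currently allocated (leak check) */
};

/* Create a cache for objects of obj_size bytes; returns NULL on failure. */
struct stub_cache *stub_cache_create(const char *name, size_t obj_size)
{
    struct stub_cache *c;

    assert(obj_size > 0);
    c = malloc(sizeof(*c));
    if (!c)
        return NULL;
    c->name = name;
    c->obj_size = obj_size;
    c->live = 0;
    return c;
}

/* Allocate one object: no per-CPU lists, no slabs, just malloc(). */
void *stub_cache_alloc(struct stub_cache *c)
{
    void *obj = malloc(c->obj_size);

    if (obj)
        c->live++;
    return obj;
}

/* Return one object to the cache; a NULL object is ignored. */
void stub_cache_free(struct stub_cache *c, void *obj)
{
    if (!obj)
        return;
    free(obj);
    c->live--;
}

/* Tear the cache down; all objects must have been freed first. */
void stub_cache_destroy(struct stub_cache *c)
{
    assert(c->live == 0);
    free(c);
}
```

Code that expects a per-type object cache can run against a stub of this kind unchanged, at the cost of losing the real allocator's performance characteristics. The live counter is an addition made for this sketch so that leaks show up when the cache is destroyed.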

"Almost enough" because a few tasks are still left to the calling application. For example, there is no stub implementation of schedule(); instead, the calling code must provide one during the initialization process. The idea here is that the running application may want to exert some control over how the management of processes (most likely implemented as POSIX threads) will be done.

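The arrangement described above, where the application hands the library a scheduling function at initialization time, can be sketched with a small callback table. This is a sketch under stated assumptions rather than the LibOS interface: host_ops, lib_instance, lib_init(), lib_yield(), and counting_schedule() are all hypothetical names invented for the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical table of services that the host application must supply. */
struct host_ops {
    void (*schedule)(void *ctx);  /* called when library code needs to yield */
    void *ctx;                    /* opaque pointer passed back to the host */
};

/* Hypothetical per-instance state; each instance carries its own callbacks. */
struct lib_instance {
    struct host_ops ops;
    int initialized;
};

/* Initialization fails with -1 unless the host provides a schedule callback. */
int lib_init(struct lib_instance *inst, const struct host_ops *ops)
{
    if (!inst || !ops || !ops->schedule)
        return -1;
    inst->ops = *ops;
    inst->initialized = 1;
    return 0;
}

/* Where kernel code would call schedule(), dispatch to the host instead. */
void lib_yield(struct lib_instance *inst)
{
    assert(inst->initialized);
    inst->ops.schedule(inst->ops.ctx);
}

/* Example host-side scheduler: it only counts how often it was asked to
 * yield. A real host might switch between POSIX threads here. */
void counting_schedule(void *ctx)
{
    int *count = ctx;

    (*count)++;
}
```

Because the callback table lives in the instance rather than in a global, two instances in the same process could be given different scheduling policies, which fits the idea of leaving process management to the application.
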
There are currently two projects using the LibOS framework. Networking in user space (NUSE) finishes the job of providing a running user-space network stack. With NUSE, one can set up arbitrary networking topologies, interface to other user-space mechanisms like DPDK for fast transmission and reception of packets, and more. The NS-3 system, instead, is a simulation framework used to run tests on network protocols and implementations. It can run network-oriented applications on top of the LibOS network stack using LD_PRELOAD tricks to redirect calls to the networking system calls.

There are a number of interesting things that can be done with these tools. Users running networking in user space for performance reasons could consider using LibOS, though the kernel's stack has not been optimized for performance in that setting. Somebody wanting to run an experimental protocol like MPTCP in production could use LibOS (built with a suitably patched kernel) to get that feature without touching the network stack used by the rest of the system. There are also a lot of opportunities for running debugging tools with a network stack that is running in user mode.

While the LibOS work has been focused on the network stack as the first objective, there is nothing in its design that limits it to networking. If one wanted to, say, isolate the virtual filesystem layer instead, it would mostly be a matter of coming up with the additional stub functions needed.

A question that might come to mind is: how does this differ from the user-mode Linux port that has been in the kernel for many years? Indeed, UML maintainer Richard Weinberger wondered exactly that. There appear to be a few differences. UML is meant to run as a standalone application in its own right, while LibOS runs as a library called by some other application. One can even have several LibOS instances running simultaneously within the same application. Beyond that, the idea of isolating a single subsystem for use within an application is not a part of the design of UML. After looking more deeply at the LibOS code, Richard agreed that it brought some interesting things to the table.

One possible area of concern is the maintenance of all of the stub functions. There are a lot of them, and they will need to be updated whenever the corresponding "real" version is changed in the kernel. Few maintainers are likely to think that they have to update LibOS when they are making changes to their own subsystems. As a result, it seems likely that LibOS will be broken much of the time.

That, in turn, means that maintenance concerns may be one of the chief obstacles LibOS must overcome before it can be considered for merging into the mainline kernel. If LibOS is often broken, developers will hesitate to use it. If LibOS breakage leads to complaints against subsystem maintainers working on their own code, they may respond by calling for its removal. Avoiding these pitfalls may require finding some way to automate the creation of these stub functions. Creating a library-mode version of the kernel may turn out to have been the easy part when one considers what is required to make that work maintainable in the long run.


Running the kernel in library mode

Posted Apr 9, 2015 6:43 UTC (Thu) by mathieu_lacage (guest, #3967)

Dear editor,

1) the ns-3 DCE component that can be used to instantiate multiple libos instances within a single process does not use LD_PRELOAD tricks. Instead, it relies on either the dlmopen function (implemented with an ad hoc ELF loader that is binary compatible with the glibc loader) or a piece of code that plays tricks with the ELF binaries.

2) I have seen this statement a lot of times: "LibOS will be broken much of the time" and I have been unable to dispel that myth yet. In practice, my experience has been that it is not the case and it seems to boil down to the fact that the internal interfaces that are plugged in appear to be much more stable than feared by most kernel developers (I shared that fear at some point a couple of years ago). Or maybe I have a different appreciation for what "most of the time" means. In practice, it appears that a couple of hours of work once every 2 to 3 months is enough to maintain this code.

Now, I would not want the above to be interpreted as a justification for not merging this code since I feel it would be a terrific addition to the kernel but I felt compelled to correct what I perceive as a misconception.

[thanks again for this terrific resource that I have been subscribed to for ... gasp ... 8 years now !?]

Running the kernel in library mode

Posted Apr 9, 2015 9:32 UTC (Thu) by dunlapg (guest, #57764)

Sounds a lot like the "rump kernel" work that the NetBSD guys have been doing:

http://rumpkernel.org/

Running the kernel in library mode

Posted Apr 9, 2015 16:57 UTC (Thu) by justincormack (subscriber, #70439)

Indeed, there has been quite a lot of contact. The rump kernel project has sorted out how to keep its code in sync with the rest of the tree: it is used heavily in the test suite, for example, so it is clear when it is not working.

Running the kernel in library mode

Posted Apr 9, 2015 10:54 UTC (Thu) by SLi (subscriber, #53131)

I wonder if the result is sufficiently normal user-space code that one could, for example, run it (or the binary it's linked to) under Valgrind, or compile it with some kind of instrumentation (for example, AddressSanitizer or some kind of branch instrumentation for profile-guided fuzzing)?

Running the kernel in library mode

Posted Apr 9, 2015 17:01 UTC (Thu) by justincormack (subscriber, #70439)

Yes, that should be possible. I am working on doing this kind of thing with the NetBSD rump kernel, which has a similar architecture.

The next step

Posted Apr 9, 2015 11:40 UTC (Thu) by epa (subscriber, #39769)

Now we need only compile LibOS to Javascript to run it in the browser, and Linux development will be complete.

The next step

Posted Apr 9, 2015 12:43 UTC (Thu) by pr1268 (subscriber, #24648)

Didn't Fabrice Bellard already do that?

The next step

Posted Apr 9, 2015 14:45 UTC (Thu) by epa (subscriber, #39769)

That's an x86 emulator - which obviously has been possible for ages. It is not compiling Linux directly to Javascript.

Running the kernel in library mode

Posted Apr 10, 2015 16:00 UTC (Fri) by tom.prince (guest, #70680)

It seems like this is something that would be useful for writing a test-suite for the kernel.

Running the kernel in library mode

Posted Apr 11, 2015 12:56 UTC (Sat) by gdt (subscriber, #6284)

Does this mean that internal Linux functions now become part of the user-visible API (via LibOS rather than via system calls), and thus have to be stable?

Running the kernel in library mode

Posted Apr 12, 2015 2:55 UTC (Sun) by thehajime (guest, #88408)

exactly. your application theoretically can call functions without system calls.

Running the kernel in library mode

Posted Apr 16, 2015 0:41 UTC (Thu) by scientes (guest, #83068)

and time() already works that way.

Running the kernel in library mode

Posted Apr 16, 2015 1:57 UTC (Thu) by thehajime (guest, #88408)

We are currently extending this idea to other system calls, such as socket(2).

Running the kernel in library mode

Posted Apr 16, 2015 2:10 UTC (Thu) by viro (subscriber, #7872)

... and get screwed as soon as kernel internals change. What, does anybody expect that we'll accept the stability obligations on anything other than syscalls? If anyone tries to argue that *and* arch/libos maintainers don't tell them to piss off convincingly enough, well, git rm arch/libos will solve the entire problem just fine, TYVM...

Running the kernel in library mode

Posted Apr 16, 2015 3:06 UTC (Thu) by thehajime (guest, #88408)

As mathieu_lacage mentioned, keeping up with kernel-internal changes is not that bad:

> In practice, it appears that a couple of hours of work once every 2 to 3 months is enough to maintain this code.

Running the kernel in library mode

Posted Apr 11, 2015 20:52 UTC (Sat) by robbe (guest, #16131)

I guess this is only useful for applications that are GPL-2 compatible? NS-3 is GPL-2. NUSE, true to GitHub form, has no license...