The Muen Separation Kernel is the world’s first Open Source microkernel that has been formally proven to contain no runtime errors at the source code level. It is developed in Switzerland by the Institute for Internet Technologies and Applications (ITA) at the University of Applied Sciences Rapperswil (HSR). Muen was designed specifically to meet the challenging requirements of high-assurance systems on the Intel x86/64 platform. To ensure Muen is suitable for highly critical systems and advanced national security platforms, HSR closely cooperates with the high-security specialist secunet Security Networks AG in Germany.
The webpage contains instructions for building the kernel yourself, for installing it in a virtual machine, and for running it on real hardware.
…for using Ada, instead of giving birth to another unsafe C-based system.
The kernel is written in SPARK, so in addition to all the safety checks Ada does, the build system proves at compile time that there won’t be runtime exceptions.
And the plan is to go beyond that and run proofs about domain-specific runtime behaviour, too.
Disclosure: There are a number of commits in my name in there.
Great! I used SPARK Pro and GNAT Pro for about five years at a previous job. I was very impressed with the maturity and effectiveness of the tools and concepts. For us, it delivered what it promised. It took our team over a year to learn to use it well, but after that everything went pretty smoothly.
Noooo, not another microkernel! I thought people had realized by now that a pure microkernel design is flawed.
From a security perspective and from a stability perspective, it may make sense. But from a performance point of view, it does not.
Except for QNX, Sybase, RTEMS, OKL4, PikeOS.
But what do those guys know? Linus says microkernels are no good.
Sybase? uh?
Typo. Should have been Symbian.
By that same argument a monolithic kernel is flawed, it trades stability and security for performance. So with that in mind, if your needs are specifically security and stability first and foremost, why would you *not* go for a microkernel?
a) choosing security and stability over performance is desirable in a number of situations
b) how do you know they haven’t solved the performance problem?
If it’s a pure microkernel, it’s going to suffer from the problem all microkernels have: message-passing overhead between the subsystems.
Pure microkernels aren’t great, neither are pure monolithic kernels. The most successful still are hybrid systems with compromises made all over the place.
I was hoping to see more cool stuff coming from exokernel research, etc., but with the current lame fragmented closed systems that ARM has allowed, it seems like any really cool new OS research has sort of stalled.
These guys aren’t trying to build a performant system, they’re trying to build a secure one.
You could just as equally suggest that monolithic kernels aren’t as performant as a system with absolutely no address space separation or multitasking, so we’re all better off using DOS for everything.
Solved problem. L4, QNX, MINIX 3, etc. (modern uKernels) only have to buffer and/or copy messages between processes in corner cases – the majority of the time there is no copy involved at all, only a context switch that bypasses the scheduler. The receiver reads the message(s) from the same registers that the sender wrote them to. Worst case (it’s too big) it spills into a buffer that is mapped into the receiver’s address space. IPC in modern uKernels can be done with very little overhead, at least low enough that IPC is no longer a dominant bottleneck.
What I dislike is that most “hybrid” kernels (XNU in OSX, NT, etc.) simply abandon IPC as a runtime concept and “compile it away” in all of the performance sensitive parts of the system (i.e. where it probably matters most).
Much of the source code may be structured as if there are separate servers doing IPC, but you lose all the runtime advantages because the address space becomes polluted – everything is reading/writing to shared memory, and if something goes wrong you no longer have a truly isolated state to recover a server from… You also tend to need shared and/or global locks in some places because you compiled away IPC – if you were really doing IPC those locks would not be needed (or at least would be implicit).
It works, it has advantages, and some of the downsides can be addressed in other ways, but it isn’t really interesting from an academic point of view. It’s just a nice way to organize what is ostensibly a monolithic kernel at runtime – I just think the real benefits of a uKernel are only to be found at runtime…
L4, QNX, etc. manage to avoid throwing out the baby with the bath water – they don’t do message passing in the strictest sense (most of the time), but they still maintain the runtime advantages of truly isolated address spaces and the benefits it brings.
I read the paper on this, and from the sound of it this is quite a different kind of uKernel though…
http://muen.codelabs.ch/muen-report.pdf
It doesn’t even have IPC… It is actually based on shared memory. All the kernel does is allow wiring processes together by explicitly defining what parts of memory they can “share”.
Interesting to be sure, but it is surely a different kind of beast from what is usually considered a uKernel. The paper mentions a similarity to VxWorks, implying that this is aimed at embedded development – it is more of a hypervisor than a uKernel.
What bugs me is that Intel & AMD haven’t done the obvious and provided a simple and fast way to pass messages directly across a virtual memory boundary. Given the size of even an L1 cache in a modern CPU, why the heck can’t I have some way to directly map a 4k page of that cache into one or more processes and use it as a ring buffer or FIFO?
Because it is stupid?
This argument has been going on for DECADES, by stupid people who need money to keep their stupid cash donors convinced of the “benefits” and “possibilities” without anything going anywhere and no research to show for it.
In short, we need money to write more papers and buy chicken.
This is how I look at it:
You cannot define a system of logic that is not capable of self contradiction.
So first of all, you cannot have a formalism in Computer Science that will yield a perfect program.
It will NEVER happen.
This problem was solved and put to rest by men that are far more brilliant than those born today with corn syrup, antidepressants and Ritalin stringing through their brains with the attention span of a teetse fly.
Secondly from a pragmatic sense, you have two, and ONLY two camps to choose from in this debate if we are talking about Von Neumann machines:
1) Subsequent problems with errors in OS kernel execution are something that the hardware must protect against, and so must the formalized logic of the execution of the kernel.
or this camp, which is where I am in:
2) OS kernel problems with errors in execution are a tools and engineering problem. The OS should remain as simple for the hardware as possible.
It should be clear why #2 is superior in cost, engineering and portability to the simplest hardware designs from pacemakers to the worlds largest super computers.
They all run one operating system, one MONOLITHIC OS KERNEL, and it is LINUX.
(Those that do not, are quaint.)
You will NEVER produce a Microkernel that can straddle cost, performance and engineering as a result of #2.
And no hardware manufacturer would dare risk their balance sheets betting against Goedel’s theorem.
End of Discussion and stay away from MicroKernels on Von Neumann machines.
-Hack
Well thanks for that fascinating digression into your mental health.
I don’t see how letting the user-space directly map L1 cache for IPC has anything to do with the rest of your diatribe.
As others have pointed out, there are numerous examples of successful microkernel OSes. Also, you’ve jumbled the “provably correct” issue and the micro- vs. monolithic- kernel debates.
I’m reminded of that scene in Good Will Hunting:
Side note: It’s tsetse fly.
Since we’re bringing out our college class cheat sheets, I’ll declare this to be an example of the Fallacy of the Excluded Middle. And, in fact, I can point out at least one other option. The NX bit / Data Execution Prevention is implemented both in hardware and software.
That’s a demonstrably false position. Consider page 42 of UBM’s 2013 Embedded Market Study http://www.eetimes.com/document.asp?doc_id=1263083 (or http://e.ubmelectronics.com/2013EmbeddedStudy/index.html if you want the original, which is free but requires registration.) And that’s not counting all the little embedded devices that have no actual OS.
You are right about supercomputers, of course. Still, the fact that Linux has a 95%+ share in that space doesn’t mean it’s completely perfect and there’s no reason to ever try anything else again. It’s GOOD, yes. It’s not the Answer to Life, the Universe, and Everything.
Odd; I would think that a microkernel would be BETTER for point #2. Find a security flaw in a module? Fix it! On the fly without a reboot!
Except for Intel. And AMD. And ARM, DEC, Sun, IBM, Motorola…
http://en.wikipedia.org/wiki/NX_bit#Hardware_background
Even if I buy your premise (I don’t) that Gödel’s incompleteness theorem implies that there’s no point to trying to formally prove correctness of software, I fail to see how that has anything to do with microkernel vs. monolithic. Your argument goes something like “Relativity proves we can’t go faster than light, therefore steak is better than hamburger.”
-James
Considering vander was asking about a hardware feature, I find your post both asinine and completely misdirected…
I’m just a web developer who reads way too much and is fascinated by the history of kernel design. What little I really know on the subject comes from reading; I never did anything amounting to real kernel design, so my opinion on such matters is really worth squat – I just contribute where I think something I read is relevant.
Vander has actually done kernel design. From the tone and content of your post I doubt you could say the same… Linux is great. I like it, but it doesn’t represent the final solution to system design for the rest of time…
Grow the fuck up.
I think given the breadth of devices LINUX currently runs on, it is self evident that, of course, it IS one solution for just about any problem you can think of.
Despite your comment about growing up, you are not going to change reality.
It is a fact that LINUX exists in all realms and works equally well.
No way are you going to get a correct, nor a complete solution to a formalized kernel of any kind.
I don’t care what it is, Micro or Monolithic.
That being said, Monolithic is currently and will forever reign supreme on von Neumann machines, not just from a scalability standpoint, which is SELF EVIDENT, but also from a cost one as well.
I would also like to point out that complete also implies secure.
Not going to happen, even LINUX isn’t secure and never will be in those terms. (Formalized.)
-Hack
You have yet to point out where there is a claim of completeness.
My limited understanding is that this is avoided because of a desire by Intel/AMD to keep the L1/L2 cache architecturally hidden, giving them the freedom to change its size, structure, mapping behavior, etc., even remove it entirely, without impacting the ISA significantly.
I’ve seen mention of similar ideas in the past. From what I have read, one solution seems to be if you do things “just so”, and make sure your page is properly marked as shared between processes, AND you take advantage of some of the newer virtualization primitives to tag the TLB entries…
The theory being that although you cannot access the cache directly, you can achieve behavior where the TLB entry for that page is tagged to multiple processes’ address spaces, and remains coherent across context switches. In other words you are still dealing with going through the MMU, but you can sort of “soft pin” the page into the L1, at least in the sense that a context switch between two processes that share that page won’t flush the page from cache and the overhead of the address translation is completely avoided. So the page stays in the cache as long as some other outside influence doesn’t flush it out…
I don’t think anyone is really doing this kind of thing much outside of hypervisors though, and even then I have no knowledge of anyone actually implementing it fully. I remember reading a very interesting thread on the LKML about this, but I can’t seem to find it now…
I wonder, is the cache in x86 systems tagged by the virtual address?
On PowerPC it is tagged by the physical address, though this is a problem on “normal” context switches, for MP it is “HW” support.
Regarding mapping a page, I’d say if you map the same virtual address in two processes, you can rely on the cache controller and you should have no performance loss. You copy your data into the page and it also goes into the cache.
Sandy Bridge at least, and AMD’s Bulldozer, are virtually indexed and physically tagged (VIPT). I suspect that this is probably the norm for all of the last few generations of x86 chips from both Intel and AMD (I have no idea about VIA).
It certainly isn’t a requirement of the x86 ISA though, as the cache is architecturally hidden (you can pretty much do whatever you want, even not have a cache at all).
Many ARM designs are virtually indexed and virtually tagged (VIVT), which is probably the only other method still being employed in modern cpu caches. PIPT caches are more or less by definition too slow nowadays.
It’s all about tradeoffs…
THX,
I am not sure what “virtually indexed” means. I have to read up on this.
NT does at least still do internal IPC kernel calls, as a separation mechanism between layers.
But a stray pointer can bring the whole thing down anyway, as it is the same process anyway.
Fair enough. I guess my point is that IPC means “Inter-Process Communication” – if you use it inside the same process address space it becomes a glorified function call. That’s fine from an organizational perspective and does yield you modularity, but it isn’t really IPC anymore because there is no longer any process isolation.
I do not see why IPC and process isolation must stick together. Even within the same address space (aka process) using message passing serves to get a better design (and even w/o the overhead of copying data).
You’re talking about a claim from the 90s. Microkernels certainly did have performance issues with single core 32-bit x86 architectures. This is where most of the testing was done to “prove” this claim. Modern architectures have a register file, which makes context switches cheaper, and many, many more cores, which makes multithreading kind of important. Monolithic kernel designs find it hard to distribute kernel-based drivers across cores.
I’m not saying Microkernels are definitely (comparatively) faster now, but that the conditions are very different, and the tests need to be redone under modern architectures.
It’s not just a microkernel, it’s a separation kernel.
The techniques are similar (and so claiming it’s a microkernel, as happens on the website isn’t wrong), but the system you end up with is quite different from your ordinary microkernel design.
The project is _all_ about security.
For example, it’s built around a fixed process scheduler to reduce side channels that are based on CPU use. From a performance point of view, you just Don’t Do That[tm].
My uni lecturer has been involved in seL4 http://ssrg.nicta.com.au/projects/seL4/ which also claims to be the first design to use formal methods to verify security. Maybe they just don’t know about each other, but it appears one of them has to be wrong.
Your lecturer’s claim is that they are the first to use formal methods to verify.
This project’s claim is that they are the first to GET verified.
They are different claims.
The important qualifier in the Muen statement is “Open Source” – there are a number of formally verified kernels out there, but none are Open Source.
Good catch.
have been drinking spiked hot cocoa while they wrote their formalisms of correctness.
http://en.wikipedia.org/wiki/G%C3%B6del%27s_incompleten…
-Hackus
That only applies if they claimed to be both complete and correct. They can very well claim to be correct as long as they don’t claim completeness.
What you may be looking for is the halting problem.
SPARK 2014 (which Muen doesn’t use yet) attempts to attack the halting problem with loop variants. By proving that all loops terminate at some point, you can show that all blocks of code terminate eventually.
In case of an operating system kernel, you will still have an endless outer loop, since you don’t want the entire system to terminate after a well-defined number of steps.
And it’s not even the first language that eliminates the halting problem. For example, DTrace’s script language is also safe since it only allows loops with predefined iteration boundaries and forward jumps.
Neither language is Turing complete (in case of SPARK 2014: if you choose to use loop variants), but many solutions to problems don’t need this kind of power, so you can still write meaningful code within this constraint.
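To illustrate the idea with a rough sketch – a hypothetical C analogue using a runtime assert, whereas the actual SPARK 2014 annotation is a Loop_Variant pragma that the prover discharges statically – a loop variant is just a non-negative measure that strictly decreases on every iteration, which is enough to guarantee termination:

#include <assert.h>

/* Euclid's algorithm: the variant is b, which is non-negative and
   strictly decreases on every iteration, so the loop must terminate. */
unsigned gcd(unsigned a, unsigned b) {
    while (b != 0) {
        unsigned old_b = b;   /* value of the variant before this iteration */
        unsigned r = a % b;
        a = b;
        b = r;
        assert(b < old_b);    /* the variant strictly decreased */
    }
    return a;
}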
The halting problem doesn’t prevent specific classes of failures being proven/disproven.
I was only meaning to point out that even the halting problem isn’t as universal as pop computer science likes to make it.
Pop computer science? Is Alan Turing pop computer science?
It is a matter of mathematical proof that there can be no method of checking whether an arbitrary program will halt.
Not even SPARK 2014 can catch all non-terminating loops. It can either be too restrictive and only limit you to a very small subset, or produce too many false positives.
The halting problem IS universal. People just keep getting tripped up over the idea that it is a general result and thus doesn’t prevent very specific cases from being proven.
No, but many interpretations of “X is impossible because of Y” where Y is some classic CS theorem are.
You certainly can’t build a turing machine emulator in those languages. So yes, arbitrary programs are out.
It is “too restrictive”. The idea is that it’s still good enough for some meaningful operations – also, loop variants aren’t enforced everywhere by the language (but only where you annotate them): as said, the outer loop of an operating system kernel’s scheduler will run “endlessly” with a non-deterministic exit (shutdown event).
But being able to show that everything else does return, so at some point, the processor ends up in the main loop again, is already very useful.
DTrace enforces that scripts terminate eventually with no exceptions, but it’s a very special-purpose language. It’s still very useful to its users, so the “very small subset” is good enough.
It’s designed to be used on production systems, messing with the kernel, so I’d rather have one level of safety more rather than expressiveness I barely need. It’s a good trade-off.
(and all these constraints keep hardware bugs out of the question. Trying to model these will be so complex as to be useless…)
kwan_e,
“The halting problem doesn’t prevent specific classes of failures being proven/disproven.”
Many people learned about the halting problem in absolute terms and don’t seem to get this. I’d say the over-generalization of the halting problem’s conclusions over all algorithms classes makes it one of the most abused concepts of CS.
The set of all algorithms:
1. contain self contradictions.
2. contain no self contradictions.
The halting problem only proves there’s no solution for those in #1. By extension, it proves there’s no solution for the most general combined set of #1 and #2. However it says nothing about the feasibility of halting solutions for #2.
Q: Just for fun, does this function halt or not halt?
int F(int x) {
    return x == 0 ? x : F(x - 1);
}
A1: Mathematician – It only halts for non-negative inputs. Negative inputs will loop infinitely as x approaches minus infinity.
A2: Computer scientist – It always halts due to the eventual stack overflow on a real machine.
A3: Computer scientist – It always halts due to the eventual integer wrap around on a real machine.
A4: Physicist – It always halts because running it infinitely would require more energy than exists in the universe.
In order for Microkernels to work well, lots of stuff has to be built in hardware that is not needed in a Monolithic kernel.
Partition states or context switching and message passing could be done in hardware, for example, to make Microkernels much more efficient.
That IS, if researchers could ever agree on exactly how a Microkernel should work and which parts should have hardware built for them. THEN Intel and AMD or ARM etc could do a cost analysis and determine if a market exists for them.
(HINT: One doesn’t exist. )
So why hasn’t a Microkernel evolved to the state of LINUX?
One problem is that researchers cannot even agree on a final design plan for a Microkernel. I would also like to point out that this research has been a technological cul-de-sac since the 80’s.
For the sake of the argument, and to save all of those PhD’s from irrelevance, say Microkernel design was finished and a plan was presented to Intel, AMD and ARM engineers for large scale deployment of such hardware.
The PLAN would require large portions of silicon to be added to today’s machines, which do not have what I would call excessive NANNY STATE kernels running on them (i.e. MONOLITHIC kernels).
Microkernels (and the ideas around them) would STILL run slower than MONOLITHIC kernels and consume more power per unit of work than a Monolithic kernel requires of computer hardware.
In short, we would have a collapse of the computer industry, everyone would go broke, and we would have to build thousands of new Nuclear power plants to make it all work.
Finally, they would all melt down and well, that would be that.
-Hackus
PS: For the PhD’s sending me hate mail (undoubtedly using a Monolithic Kernel to send it), please remember I do not control your department funding, I do not write your NSF grant papers and I am sure your research will turn out something…
One of these days. (Wink Wink Wink).
There is absolutely no basis for this statement.
I would argue that QNX has, in fact, evolved to the state of Linux. Maybe even better. But they’re closed source. Of course, the kernel design is not the only reason Linux has been successful. Consider that *BSD also has a monolithic kernel, yet has not seen anywhere near the level of uptake as Linux based OSes.
Again, no actual basis in fact. Yes, IPC introduces some overhead that can impact speed (and therefore performance per watt.) How large that impact is depends on a lot of factors. Monolithic kernels have their own share of issues to resolve. There are trade-offs to any design decision.
-James
A simple cache is enough. At least in the environments I know.
My company’s RTOS together with the TCP/IP stack could be seen as an MP µKernel, and it outperforms Linux on a 1.5 GHz PowerPC.
Of course, this is no general proof, only personal experience. But it shows that even the “old” Power architecture is good enough for a µKernel.
Sure it could. Same for the context switch when entering a monolithic (btw. -lithic => from the stone age) kernel.
Has this been done for Linux or BSD?
IMHO, the main motive for Linux in the beginning was to have an alternative to M$.
Now it is so widely spread that big companies see their benefit from it, either because it is “free” or because they have the sources at hand.
For a “new” µKernel which wants to be a replacement for Linux on mainframes or even on the desktop, there is no longer a herd of idealistic programmers.
But if you turn away from mainframes and the desktop towards embedded, there are a lot of micro kernels.
Some, like PikeOS (which has roots in L4), started as hypervisors. Some, like QNX (or better, Neutrino), have been around for years.
Both use message passing as IPC.
I would upvote you if not for that immature “M$”…
They do not claim “correctness” in the mathematical sense. They simply guarantee exactly what is in the quote in the parent article:
That is not at all the same thing as proving correctness… Read the article before judging, because if you look into the track record of the SPARK language and you consider the kernel’s design (extremely high reliance on hardware-supplied isolation features, extremely small code footprint, a plethora of self-imposed restrictions by the language, etc.), proving the simple statement above does not sound all that far fetched to me.
Keep in mind this kernel is written in a language that self-imposes the following restrictions:
No access types (pointers)
No recursion
No exceptions
No goto
No anonymous and aliased types
No functions with side-effects
No dynamic array bounds
No dynamic memory allocation
No dynamic dispatch
It’s basically a static system – address space usage and all allocations are done through static configuration, not code. It isn’t all that hard to validate because the language is so restrictive that virtually any form of dynamic behavior by code at runtime is impossible.
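To give a rough feel for that style, here is a hypothetical C sketch (not Muen code, just an illustration of “everything decided at build time”): all storage is statically sized, there is no heap, no recursion, and loop bounds are compile-time constants.

#define MAX_SUBJECTS 4        /* fixed at build time, never discovered at runtime */

struct subject_state {
    unsigned id;
    unsigned ticks_used;
};

/* All storage is statically allocated; no malloc anywhere. */
static struct subject_state subjects[MAX_SUBJECTS];

void account_tick(unsigned active_id) {
    /* The loop bound is a compile-time constant; no recursion, no pointers handed out. */
    for (unsigned i = 0; i < MAX_SUBJECTS; i++) {
        if (subjects[i].id == active_id) {
            subjects[i].ticks_used++;
            return;
        }
    }
}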
Useful for some things? Maybe so. Competition for other more conventional kernel designs? Um, no. It’s really apples and oranges – this is for building absolutely bulletproof embedded systems; it’s too weird for anything else…
You have the same restrictions when writing compliant MISRA C code, which is required for environments where human lives are at risk.
True. I saw this and thought you might find it interesting since you brought up MISRA C:
http://www.spark-2014.org/entries/detail/misra-c-2012-vs-spark-2014…
I’m not talking about the microkernel design but the programming language. Linux uses dirty tricks in C and in some cases assembler to optimize as much as possible. Can you match that performance in a restricted language like SPARK?
The first time I looked at the Muen code, I wondered “where is this thing actually doing something?”
The answer is that it pushes lots of things to compile time and hardware. The scheduler runs a fixed schedule. Memory mappings are compiled in. The Ada runtime checks can be disabled since SPARK makes sure the code never runs into them. Isolation is provided by VT-x.
All that makes Muen a really special kernel for really special purposes, which is why that sort of kernel has a special name: Separation Kernel.
But there’s a place for it. The academic work in that field goes back to the late 60s, and there’s still nothing better for situations where you _really_ _really_ want to make sure that data doesn’t bleed from one process (“subject”) to another – except running each process on its own processor with its own RAM, which is what Separation Kernels seek to emulate.
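To give a sense of what a fixed schedule means in practice, here is a purely hypothetical illustration (not Muen’s actual configuration format): the schedule is a constant table walked in a loop, so which subject runs, and for how long, is decided entirely at build time and leaks nothing about what the subjects are doing.

/* Hypothetical fixed cyclic schedule: each slot names a subject and a
   length in timer ticks. The table is constant; there are no runtime priorities. */
struct slot { unsigned subject_id; unsigned ticks; };

static const struct slot major_frame[] = {
    { 0, 20 },   /* subject 0 runs for 20 ticks */
    { 1, 10 },
    { 0, 20 },
    { 2, 10 },
};

static unsigned next_slot(unsigned current) {
    /* Wrap around at the end of the major frame. */
    return (current + 1) % (sizeof(major_frame) / sizeof(major_frame[0]));
}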
Performance isn’t the be-all and end-all measurement, you know. Are you driving a Ferrari F40 when you go off-road? Why not? It has better performance than a Land Rover.
Most safe system programming languages can match C without issues. It is always a matter of what compiler is used and what type of optimizations are available on the backend.
Many people talk about bound checked arrays and other stuff as performance bottlenecks in safe system programming languages.
Yes, it is true that they can hinder performance.
However, what those people fail to acknowledge when pushing their C agenda, is that those safety mechanisms can be turned off in the few hotspots where those extra ms really matter.
While it may seem the same thing, it is a matter of defaults.
A safe systems programming language requires that the programmer states “here be dragons, enter at your own peril” via explicit unsafe constructs.
Forget about turning them off. To make C as safe as a language like Ada, they’ll have to write the same bounds checking code everywhere – and there will always be mistakes – whereas you get that for free in Ada. Once that bounds checking code is in C, they’ll be just as slow, not to mention it is conceivable that a compiler can better optimize its own bounds checking code than any hand coded bounds checking.
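To make that concrete, a small hedged C sketch of what hand-written bounds checking looks like: every access site has to remember to do this, whereas an Ada compiler emits (and often optimizes away) the equivalent check on every indexing operation automatically.

#include <stdbool.h>
#include <stddef.h>

/* Manual bounds check: every forgotten check is a potential out-of-bounds access. */
bool safe_read(const int *buf, size_t len, size_t index, int *out) {
    if (index >= len) {
        return false;   /* the "Constraint_Error" case, handled by hand */
    }
    *out = buf[index];
    return true;
}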
moondevil,
You are right. Sometimes even C falls short compared to other languages and most of us aren’t even aware of it. The C optimizer’s ability to perform optimizations is severely curtailed due to “pointer aliasing”.
For instance:
int a;

int power(int *b) {
    int r = 1;
    /* If b happens to point at a, keeping a in a register here would
       change the result – the compiler must assume it might. */
    while (*b > 0) {
        r *= a;
        (*b)--;
    }
    return r;
}
Although the example is rather contrived, it illustrates a subtle problem for the compiler. We would like the compiler to optimize the code by storing a and *b in registers, however by doing so we’re unable to guarantee correct semantics if b points to a. This is a _very_ simple case, but in large C projects the compiler can be forced to give up a lot of potential optimizations.
This is exactly why languages without pointer aliasing can optimize even better than C. The “restrict” keyword was introduced into the C language to let the programmer promise that a pointer is not aliased, so the compiler can perform these optimizations anyway – with undefined behaviour if two restrict-qualified pointers do alias.
http://en.wikipedia.org/wiki/Restrict
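A small, hypothetical illustration of the difference: with restrict, the compiler may keep *n and the array values in registers across the loop, because the programmer has promised that the pointers never refer to the same memory.

/* restrict promises the compiler that dst, src and n never alias, so it can
   hoist the load of *n out of the loop and keep values in registers.
   Without restrict, it would have to assume a store through dst might
   change *n and re-read it on every iteration. */
void scale(int *restrict dst, const int *restrict src, const int *restrict n) {
    for (int i = 0; i < *n; i++) {
        dst[i] = 2 * src[i];
    }
}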
Wonderful discussion! I hack on small OS kernels for a living, so time to sign up and join the bashing!
Compare file I/O performance between Linux and a micro kernel, and you’ll see that Linux wins clearly. Compare IPC on a microkernel to DBUS on Linux, and then the micro kernel wins. But ruling the desktop is not the goal here.
The goal of projects like Muen is to achieve fault isolation of the userland stuff they run with the minimum amount of code running in hypervisor mode. You have to choose wisely between performance and safety, and then you can get what you have asked for. Separation/micro kernels are niche operating systems. And there are niche markets willing to pay this price.
With hardware virtualization, you can easily get the benefits of both worlds: put Linux into a VM, then AMD/Intel will take care that this Linux runs as fast as possible on top of your micro kernel. Basically, HV on cheap x86 systems revived micro kernel hacking in the last eight years.
On the other hand, Linux adopted micro kernel concepts over the time as well: Lots of the approaches found in the real-time patches have been in use on micro kernels for 30+ years, like threaded interrupt handlers or preemptible device drivers.
stwcx,
Welcome to osnews!
Thank you for seeing both sides of the issue! Obviously both approaches can have pros/cons; one has to be somewhat disingenuous or naive to be completely one-sided over it. I agree with you that hardware virtualization extensions are very encouraging for microkernels. CPU bound tasks show very little overhead since they don’t context switch often. So the focus would be on IO bound tasks. Here IBM shows file IO overhead within KVM at around 11% (give or take depending on workload).
pic.dhe.ibm.com/infocenter/lnxinfo/v3r0m0/topic/liaav/LPCKVMSSPV2.1.pdf
If an entire OS running in a VM is acceptable for things like amazon’s elastic cloud, web hosting, data center server consolidation, etc, then it logically stands to reason that isolation overhead isn’t so bad as to render the technology commercially nonviable; indeed it’s already proving quite popular due to its advantages despite the slight overhead. The same argument could also favor microkernels.
“Again, no actual basis in fact. Yes, IPC introduces some overhead that can impact speed (and therefore performance per watt.) How large that impact is depends on a lot of factors. Monolithic kernels have their own share of issues to resolve. There are trade-offs to any design decision.”
-James
I agree.
Microkernels have gigantic elephant sized tradeoffs, while Monolithic kernels have Mymaridae sized tradeoffs.
-Hackus
hackus,
Your posts to this article sound very preachy to me. They keep saying things you want us to take as fact, but you haven’t really shown any evidence to support your assertions. I wouldn’t mind so much if you produced _something_ tangible, but these claims that microkernels are as horrible as you say they are, even on modern hardware, it’s beginning to sound like dogma. If I am mistaken and you do have some factual information on which your view is formulated, then I apologize. However I will ask that you please post it here for the benefit of all to discuss.
I do expect microkernels to have some performance overhead when designed as a straight substitute for a macrokernel, but on modern hardware using the latest isolation techniques, overhead should be minimal. My preferred design for a microkernel would be based on async IO with zero-copy message queues that amortize syscalls across a greater number of operations. This could reduce the overhead to practically NIL and even end up beating the conventional macrokernels that require one syscall/operation.
To put it in clearer terms:
CPU cycles =
A) userspace cycles/unit work +
B) context switch cycles/”syscall” +
C) kernel cycles/unit work
“syscall” is in quotes because I’m not defining the mechanism nor the number of context switches involved, my point is to show that the overhead, whatever it is, can be amortized across multiple operations.
Example 1:
microkernel
A = 1000 cycles/task
B = 1000 cycles/syscall
C = 1000 cycles/task
macrokernel
A = 1000 cycles/task
B = 50 cycles/syscall
C = 1000 cycles/task
These numbers are merely illustrative, note that I’m giving macrokernels a large advantage by giving the microkernel a 20x syscall performance penalty for additional context switching. So under these circumstances the microkernel will require 3000 cycles to get the task completed, 46% MORE than the macrokernel’s 2050 cycles.
Example 2:
microkernel
A = 1000 cycles/task * 100
B = 1000 cycles/syscall
C = 1000 cycles/task * 100
macrokernel
A = 1000 cycles/task * 100
B = 50 cycles/syscall
C = 1000 cycles/task * 100
So by taking advantage of the amortization of syscalls over 100 tasks, the microkernel needs only 2010 cycles per task and the macrokernel needs 2000.5 cycles per task, a mere .5% advantage. Note that the amortized microkernel uses FEWER cycles per task than the original macrokernel above!
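Just as a sanity check of the arithmetic, a throwaway snippet with the same purely illustrative numbers:

#include <stdio.h>

/* Per-task cost when the per-"syscall" overhead b is amortized over n tasks. */
static double per_task(double a, double b, double c, double n) {
    return a + b / n + c;
}

int main(void) {
    printf("micro, 1 task:    %.1f\n", per_task(1000, 1000, 1000, 1));   /* 3000.0 */
    printf("macro, 1 task:    %.1f\n", per_task(1000,   50, 1000, 1));   /* 2050.0 */
    printf("micro, 100 tasks: %.1f\n", per_task(1000, 1000, 1000, 100)); /* 2010.0 */
    printf("macro, 100 tasks: %.1f\n", per_task(1000,   50, 1000, 100)); /* 2000.5 */
    return 0;
}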
I also think there are other potential optimizations as well. Zero-copy memory-mapped IPC could theoretically eliminate the microkernel IPC syscalls entirely, relying entirely on fixed overhead scheduling instead, thereby avoiding almost all syscall related overhead. I concede this is quite a different (read “incompatible”) approach from the kernel interfaces we’re traditionally used to, but it’d still have very real potential to offer scalability in extremely demanding applications all the while offering benefits of secure isolation as well. We should not let labels like “microkernel” overshadow merit and should welcome innovation regardless of where it comes from.
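As a minimal sketch of what such a zero-copy channel could look like (hypothetical, single producer / single consumer, over a region both sides have mapped; the names and sizes are made up), the fast path needs no syscall at all, only atomic index updates:

#include <stdatomic.h>
#include <stdbool.h>
#include <string.h>

#define RING_SLOTS 64
#define MSG_SIZE   48

/* Lives in memory mapped into both the sender's and the receiver's address space. */
struct ring {
    _Atomic unsigned head;               /* written only by the producer */
    _Atomic unsigned tail;               /* written only by the consumer */
    char slots[RING_SLOTS][MSG_SIZE];
};

bool ring_send(struct ring *r, const char msg[MSG_SIZE]) {
    unsigned head = atomic_load_explicit(&r->head, memory_order_relaxed);
    unsigned tail = atomic_load_explicit(&r->tail, memory_order_acquire);
    if (head - tail == RING_SLOTS)
        return false;                    /* ring is full */
    memcpy(r->slots[head % RING_SLOTS], msg, MSG_SIZE);
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return true;
}

bool ring_recv(struct ring *r, char msg[MSG_SIZE]) {
    unsigned tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    unsigned head = atomic_load_explicit(&r->head, memory_order_acquire);
    if (head == tail)
        return false;                    /* ring is empty */
    memcpy(msg, r->slots[tail % RING_SLOTS], MSG_SIZE);
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return true;
}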
Additionally as I mentioned in another thread, there are a few micro-kernels being used in production nowadays, some of them by well known companies.
It is not like we are discussing some OS design stuck in university labs.