“Darwin is the Unix-derived core that provides the underlying foundation for Mac OS X. At Darwin’s heart is the XNU kernel; a Mach 3.0-based microkernel that has been modified to include portions of FreeBSD for performance reasons. Let’s take a trip down into the core of OS X to learn more about the foundation that gives us one of the best user experiences in computing.”
What Is Darwin (and How it Powers Mac OS X)
28 Comments
-
2005-09-29 1:11 pm Anonymous
Not exactly: Mach is to Mac OS X as it was to MkLinux.
To implement Mach 3.0 in Rhapsody/Mac OS X Server Developer Preview, Apple is said to have made use of the work its earlier MkLinux team had done:
http://www.kernelthread.com/mac/oshistory/10.html
I by no means am trying to start a flame war, and I am not a troll.
I’m a Unix server guy who owns two Macs and is also interested in working with Darwin. Currently I’m working on OpenSolaris, but I have been waiting for some of the Mac threading issues to be worked out.
Has anyone heard any *constructive* information with respect to what is being done to improve the threading on Darwin?
I know that the issues don’t really matter when using Darwin as a workstation. Personally I think OS X works great as a desktop OS, and I use it every day.
I would love to jump in and help, but I’m not a kernel developer, and I don’t think I have the skills to make a difference at this level. I’m more of a tools developer and documentation person.
-
2005-09-29 3:58 pm BryanFeeney
How is it you can simultaneously be waiting for the threading issues to be worked out, and own two Macs?
When you say work with Darwin, what do you mean other than kernel level stuff?
How can you use Mac OS X “every day”, if you’re currently “working on OpenSolaris”?
I’m very confused.
Anyway, to answer your question, Apple does appear to be aware of their kernel’s concurrency issues. They did away with their own version of the big kernel lock in 10.4 (see http://arstechnica.com/reviews/os/macosx-10.4.ars/4), and they’ve given themselves 18 months to work on 10.5 (Leopard), which would suggest they’re working on something substantial. Whether they’ll use this to work on threading is, as yet, unknown. Currently Apple uses a threading solution that’s inherited from the BSDs, going a Linux or Windows route would involve a lot of detailed technical work.
In any event, this is only an issue for major database-backed websites. For mid-to-professional level file-servers, mail-servers and general Intranets, Mac OS X.4 is perfectly okay.
-
2005-09-29 4:11 pm Adurbe
Darwin is available on x86, so he could ‘easily’ replace OpenSolaris with Darwin.
With regard to threading, from what I have read it’s only really SQL threads it falls over on at the moment. It holds its own in other departments.
-
2005-09-29 4:31 pm japail
There are no magic SQL threads in XNU. Any largely parallel endeavor (processes and threads) that you intend to scale over multiple processors and that calls into the kernel frequently is going to be hampered by locking granularity. If your threads spend most of their time doing computations then that’s obviously not a problem. If the parallelism of your threads is unimportant then it’s obviously not a problem. As limited in breadth and seemingly confused in areas as the AnandTech investigation into XNU performance was, the fact that they reported problems with MySQL does not mean that any performance bottlenecks exposed are limited to SQL. I have no idea why you’d even think that.
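A rough way to put numbers on the locking-granularity point is Amdahl's law: time spent behind a single coarse kernel lock behaves like the serial fraction of the workload. The sketch below is only an illustration; the function name and the two fractions are made up for the example and are not measurements of XNU.

```python
def max_speedup(serial_fraction: float, cpus: int) -> float:
    """Amdahl's law: upper bound on speedup when a fraction of the
    work is serialized (here: time spent holding one coarse kernel lock)."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0 and 1")
    if cpus < 1:
        raise ValueError("cpus must be at least 1")
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cpus)


if __name__ == "__main__":
    # Hypothetical workloads: share of run time spent inside the locked kernel path.
    for label, fraction in [("compute-bound", 0.02), ("syscall-heavy", 0.50)]:
        for cpus in (1, 2, 4):
            bound = max_speedup(fraction, cpus)
            print(f"{label:14s} {cpus} CPUs -> at most {bound:.2f}x")
```

Under these assumed fractions, a workload that spends half its time behind the lock can gain at most 1.6x on four processors, while a mostly computational one stays close to linear scaling.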
-
2005-09-29 4:34 pm cwdrake
Bryan…
Why is it not possible that he is working on OpenSolaris AND using Mac OS X????
I don’t understand what is confusing you.
-
2005-09-29 4:46 pm japail
> How is it you can simultaneously be waiting for the
> threading issues to be worked out, and own two Macs?
You buy two Macs. There, that was easy. Now you want to use Darwin for a server operating system (maybe and maybe not with the Macs) for some reason. Ah, but its parallelism is insufficient for your needs, so there you find yourself waiting.
> When you say work with Darwin, what do you mean other
> than kernel level stuff?
It sounds like he wants to either use Darwin for a server, or wishes to participate in a development effort centered around Darwin for constructing a server operating system.
> How can you use Mac OS X “every day”, if you’re
> currently “working on OpenSolaris”?
It’s almost as if you can own and use more than one computer, or use any one computer for more than one thing every day.
> Anyway, to answer your question, Apple does appear to
> be aware of their kernel’s concurrency issues.
Of course they are; that doesn’t really answer his question, though. He’s curious if there’s any useful information on the intention of Apple to refine the granularity of their locking so as to improve parallelism. I think he can just take it as a given that Apple will continue to improve the locking in XNU as it becomes beneficial for their sales strategies to do so. It’s a complicated, time-consuming process as any of the Linux or BSD developers will be glad to tell you.
> Currently Apple uses a threading solution that’s
> inherited from the BSDs, going a Linux or Windows
> route would involve a lot of detailed technical work.
There’s not a lot that’s appreciably different in the abstract from the former; the BSDs just started later and had less granularity available at the time.
> In any event, this is only an issue for major
> database-backed websites.
It’s an issue for anything that wants performance to scale with the number of processors available and spends a lot of time calling into the kernel. If performance isn’t important, or your performance-critical code is computational in nature, then it’s obviously not a big deal.
-
2005-09-29 6:57 pm BryanFeeney
> You buy two Macs. There, that was easy. Now you want to use Darwin for a server operating system (maybe and maybe not with the Macs) for some reason. Ah, but its parallelism is insufficient for your needs, so there you find yourself waiting.
I’ve not seen anyone using OpenDarwin (the only other Darwin system that one can deploy) for anything other than experimentation. I can’t see how a serious task would exist, (one whose reliance on threading would cause him to stall) that would work better on OpenDarwin than Mac OS X. Further, if Mac OS X is not the draw, I can’t see why he would not switch to Linux or Windows 2003 to achieve this serious task that relies on multi-threading.
You could say he wanted to experiment, but then why bother waiting for a major improvement in multi-threading?
Basically, the point of those three questions is that I thought the author was, in fact, trolling. The threading thing is a bug-bear that’s been thrown up recently and is perfect material to get people going during a debate involving Apple (my apologies to the author if I was wrong).
However despite this I did answer his question, for general benefit. You may say:
> that doesn’t really answer his question, though. He’s curious if there’s any useful information on the intention of Apple to refine the granularity of their locking
but in my original comment I said
> They [Apple] did away with their own version of the big kernel lock in 10.4 (see http://arstechnica.com/reviews/os/macosx-10.4.ars/4), and they’ve given themselves 18 months to work on 10.5 (Leopard), which would suggest they’re working on something substantial
I went on to mention that switching from M-to-N threads (which is what Apple has) to 1-to-1 threads would be quite an invasive change, and therefore there’s no guarantee it will happen, despite Apple’s obvious awareness of the issue.
In short, I told him as much as anyone knows about Apple’s work on improving concurrency.
Incidentally, the BSDs added threading at the same time as or even before Linux; the difference is that Ingo Molnar, Ulrich Drepper and others at Red Hat completely re-engineered Linux’s threading architecture during the development of the 2.6 kernel. The BSDs have not followed Linux’s lead in this. To be fair, there are concerns with the way Linux does things, so this is not a slight on the BSD developers’ part.
For people who are wondering about M-to-N versus 1-to-1, Apple and BSD have threads in userland, with a userland scheduler as well as the process scheduler. This offers security at the cost of code duplication (two schedulers) and additional system calls. Linux places threads in the kernel and just uses the process scheduler for them. This offers speed, but some people wonder if there might be some risk with having them inside the kernel. And before anyone else says so, that’s quite a simplified summary.
> In any event, this is only an issue for major
> database-backed websites.
> It’s an issue for anything that wants performance to scale with the number of processors available and spends a lot of time calling into the kernel.
The ability to scale time spent on system calls with the number of processes and processor cores has been much improved by the replacement of a single big kernel lock with more granular locks in 10.4 (as I mentioned in my original post). As regards threading, most Unix servers and utilities use fork(), and for mid-to-professional users, the fall-off caused by the poor threading system in 10.4 is not a major issue. It is for Enterprises, which is why I didn’t include them.
Mac OS X running on SMP systems will still offer benefits through multi-tasking, and concurrency achieved through forking. The systems which hurt the most from poor multi-threading are some databases (MySQL being the main one) and Java; in short, the things I didn’t mention in my list of acceptable uses for Mac OS X Server.
-
2005-09-29 7:51 pm Matt Giacomini
“Basically, the point of those three questions is that I thought the author was, in fact, trolling. The threading thing is a bug-bear that’s been thrown up recently and is perfect material to get people going during a debate involving Apple (my apologies to the author if I was wrong).”
My intent was not to bash Apple. You do bring up an interesting point though. People are bringing up bugs/issues, like the threading issue I mentioned, more often. Maybe on message boards like this and others, people are bringing it up to try to bash Apple, but I think there is also another (more positive) reason that these things are being brought up more often:
Most of the people I work with are server guys. I have worked on and off doing UNIX administration for the last 11 years. Up until two years ago the only time anyone (in my UNIX circles) would bring up a Mac was to make a joke about it, but that has changed now. Apple’s new direction has interested many of the old UNIX guys I know. Many of us now own Macs, and we have had a lot of positive conversations about our experiences with OS X. Of course, being server guys, we try to run everything on it that we work with on a daily basis and subject it to the same tests that we subject AIX, Solaris, Linux, etc. to. In this process we have found issues, the threading issue included, but I don’t take this as a bad thing. OS X is gaining interest from people who are beyond the niche market that OS X was built for, and I personally think that is a great thing. Why is our giving feedback on issues that are important to us considered trolling?
“I can’t see why he would not switch to Linux or Windows 2003 to achieve this serious task that relies on multi-threading.”
Well, sure, currently that is our only option. Should I just use another operating system and keep my mouth shut? If the Darwin community is not interested in my input or help, then fine… My side point in my first post, which I didn’t state very directly, is that I think having a more functional Darwin installation would bring more people like myself to the community. I think the Darwin community could use more server-side people getting involved in the issues that more directly concern us. As stated on many other message boards, beyond a few Apple developers I don’t think that Apple directly cares much about building a functional standalone Darwin, or building a community that includes people like me. Too bad, because I would be interested.
Hopefully the Open Darwin people are successful even with the *IMO* limited support they get from Apple.
-
2005-09-29 7:54 pm japail
> I’ve not seen anyone using OpenDarwin (the only other
> Darwin system that one can deploy) for anything other
> than experimentation. I can’t see how a serious task
> would exist, (one whose reliance on threading would
> cause him to stall) that would work better on
> OpenDarwin than Mac OS X. Further, if Mac OS X is not
> the draw, I can’t see why he would not switch to
> Linux or Windows 2003 to achieve this serious task that
> relies on multi-threading.
People like to use different operating systems. That’s why people read OSAlert. It really doesn’t matter if he uses OS X or if he uses Darwin by itself, because they both suffer from the same performance limitations in this regard. If he wants to participate in a community centered around Darwin, how is that any different than one centered around OpenSolaris or NetBSD? He might be trolling, and he might not. Even if he is, his views may represent someone’s. As for Linux or Windows, why would he switch to that rather than OpenSolaris?
Instead of becoming defensive about how incredible his position seems to you (when it leads to no contradictions, despite how that seems to puzzle you), you can address his questions, mod him down because you think he’s trolling, or ignore him.
> I went on to mention that switching from M-to-N
> threads (which is what Apple has) to 1-to-1 threads
> would be quite an invasive change, and therefore
> there’s no guarantee it will happen, despite Apple’s
> obvious awareness of the issue.
You’re confusing user-land thread scheduling with kernel lock granularity. Mach threads are all preemptive, and are precisely what POSIX threads are implemented in terms of in OS X. On top of this, Cocoa threads are implemented as one-to-one threads, and Carbon green threads are implemented as well. When any of these Mach threads call into the BSD subsystem (which is where their scalability problem arises) they must acquire locks, which prevent other Mach threads from entering these code sections and result in the publicized scalability differences. The cooperative threading of Carbon green threads is not the question.
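The difference between one coarse lock and finer-grained locks can be shown with a toy model. This is a sketch in Python, not XNU code: `CoarseKernel`, `FineKernel`, and the subsystem names are hypothetical, and real kernels use far more elaborate locking than one mutex per subsystem.

```python
import threading


class CoarseKernel:
    """Toy model: every subsystem shares one big lock."""

    def __init__(self, subsystems):
        big_lock = threading.Lock()
        self.locks = {name: big_lock for name in subsystems}


class FineKernel:
    """Toy model: each subsystem has its own lock."""

    def __init__(self, subsystems):
        self.locks = {name: threading.Lock() for name in subsystems}


def can_enter_while_busy(kernel, busy, wanted):
    """Return True if `wanted` can be entered while `busy` is held."""
    with kernel.locks[busy]:
        # Non-blocking attempt: fails immediately if the lock is already taken.
        got_it = kernel.locks[wanted].acquire(blocking=False)
        if got_it:
            kernel.locks[wanted].release()
        return got_it


if __name__ == "__main__":
    names = ["network", "filesystem"]
    # One shared lock: filesystem work must wait for the network work.
    print(can_enter_while_busy(CoarseKernel(names), "network", "filesystem"))  # False
    # Separate locks: the two subsystems can be entered independently.
    print(can_enter_while_busy(FineKernel(names), "network", "filesystem"))    # True
```

In the coarse model any caller, whether a pthread or a separate process, is shut out of every subsystem while one of them is busy, which is why the mapping of user threads to kernel threads does not change the outcome.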
> This offers speed, but some people wonder if there
> might be some risk with having them inside the
> kernel.
This is pure fantasy.
> As regards threading, most Unix servers and utilities
> use fork(), and for mid-to-professional users, the
> fall-off caused by the poor threading system in 10.4
> is not a major issue. It is for Enterprises, which is
> why I didn’t include them.
The locking granularity affects the parallelism of all BSD subsystem accesses regardless of whether they come from pthreads or processes. fork is actually more expensive than on other unices, for other reasons that are also not important to this discussion.
> The systems which hurt the most from poor
> multi-threading are some databases (MySQL being the
> main one) and Java;
No.
-
2005-09-29 8:53 pm BryanFeeney
With regard to the choice of operating systems, the original poster said he couldn’t use Darwin until it supported better threading, while simultaneously saying that he had two Macs which he used daily. This seemed to be a contradiction to me. I was basically a bit suspicious of the comment because of the way it was phrased. It seems I was wrong, and I do apologise to Matt for saying so.
With regard to confusing kernel locking and multi-threading, I approached the topic of concurrency in general, instead of threads specifically, in order to
a) indicate the current state of concurrency on Mac OS X
b) extrapolate from this where concurrency might go in future versions
If you read my original response, I used the locking example to indicate that Apple were paying attention to concurrency, and then posited that this attention might result in better threading in a subsequent version.
I freely admit to knowing only the basics of Mac OS X’s threading. I had heard (on kerneltrap.org, I think) that they were using an M-to-N model inherited from the BSD days, but if I’m wrong, then I’m wrong. I had also read in John Siracusa’s (104 page!) Tiger review that Mac OS X had significantly improved the granularity of kernel locking in 10.4, but again, maybe this doesn’t cover threads. I do wonder if there would be as much locking required if threads were implemented in kernel space instead of userland, but again, I don’t work with BSD-style systems much.
Incidentally, what Mac OS X server software uses threads more than databases and Java? Mac OS X Server uses Apache 1.3 by default, which forks, as does most Unix software. Also, if threading and forking are both equally affected by locking issues, how come people are complaining only about threads and haven’t, for example, complained about adverse performance in Apache and other Unix software?
-
2005-09-30 3:30 am japail
There are two distinct but related things here. One is multiprocessing, which brings genuine concurrency, and the other is the implementation of user-level threading models as a means of having multiple process-like entities that share address spaces. The locking granularity of the kernel will affect the scalability of all concurrency, whether it be normal or lightweight processes that call into the kernel. The effort for multiprocessing in Mach itself was put into place very many years ago, but the various modern free BSD systems started much later at making parts of their kernel safe to schedule on multiple processors. This is why XNU, which is derived from Mach and pieces of various BSDs, has had to essentially duplicate the same efforts of the other free BSD derivatives in increasing the granularity of the BSD kernel code. This is a complicated process and it takes time and care, and can be difficult to maintain in general.
The other matter is the manner in which user-level threads are mapped to kernel entities that can be scheduled concurrently.
You have your pure user-level N:1 threads, where the threading library schedules multiple threads within one process. This doesn’t require any special kernel support, but it also doesn’t provide any concurrency for executing the threads in a process. Thread creation and switching is really fast, but there are caveats with regard to calling into the kernel unless you add the requisite complexity to intercept and manage syscalls. This is how thread libraries on BSD, Carbon, and Linux (before linuxthreads) typically worked.
You can have 1:1 threads, where each user-level thread is mapped to an entity that the kernel can schedule concurrently. Thread creation and switching can often be most expensive. This is the model many systems use when they don’t just provide the previous implementation strategy, and there are scalability concerns with using a lot of threads in this manner. That’s partly why thread pooling was popular when Java didn’t have nio, because mapping a lot of Java threads directly to 1:1 threads could seriously impede scalability. A naive implementation (like say linuxthreads or NPTL) of threading with clone or rfork_thread, or XNU’s implementation of pthreads on Mach threads, will provide such a model.
Then there are N:M systems, which attempt to minimize the cost associated with creating and switching threads (found in 1:1 systems) while providing concurrency (so as to not suffer the caveats of N:1 systems). This can be accomplished in a number of ways, and you can find a lot of interesting papers to read about various implementations and their performance properties. Many commercial unices offer M:N systems (such as Solaris), and they have been the development goal of FreeBSD (KSE) and NetBSD (schedular activations). The implementation of an N:M system is much more complicated (multiple schedulars that optimally should communicate), but can have performance advantages depending on the nature of the work involved. It doesn’t improve security, it’s not inferior to a 1:1 model, and it’s certainly not the source of scalability problems with XNU (because it doesn’t use it for one) and it didn’t obtain it from other BSDs.
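Of the three models, the pure user-level N:1 case is the easiest to sketch. The example below uses Python generators as stand-ins for user-level threads and a round-robin loop as the library scheduler; `worker` and `run_round_robin` are hypothetical names, and real thread libraries switch stacks rather than resume generators.

```python
from collections import deque


def worker(name, steps, log):
    """A user-level 'thread': a generator that yields to give up the CPU."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # cooperative switch point


def run_round_robin(threads):
    """N:1 scheduler: many user-level threads, one kernel-scheduled entity."""
    ready = deque(threads)
    while ready:
        current = ready.popleft()
        try:
            next(current)          # run until the thread yields
            ready.append(current)  # still alive, so requeue it
        except StopIteration:
            pass                   # thread finished, drop it


if __name__ == "__main__":
    log = []
    run_round_robin([worker("a", 2, log), worker("b", 3, log)])
    print(log)  # ['a:0', 'b:0', 'a:1', 'b:1', 'b:2']
```

Switching here costs no more than a function call, but everything runs on a single processor, and a worker that blocked in a system call would stall all the others, which is the caveat noted for N:1 libraries.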
NPTL in Linux had a lot of purposes, though it wasn’t to alter the mapping of user and kernel threads. LinuxThreads had various limitations (hard limits), conformancy problems, and performance problems stemming from implementation strategies (manager threads for instance).
Forking is not a replacement for threading; if you’re not using a shared address space, why you’re using multiple threads is beyond me. Since the problem with XNU threads isn’t that they’re scheduled in userland (since they aren’t), this affords you nothing.
Mach isn’t an especially high-performance kernel to begin with, and secondly the conversion of the BSD subsystem into a thread-safe entity is an ongoing process. Since most threads for the OS X desktop user will either not be performance-critical or will be compute-bound rather than syscall-bound this matters rather little to them, and Apple will improve as is necessary. As a normal server operating system, though, this poses a number of problems if you’re performance-bound, because you’ll be spending a lot of time in the BSD subsection for networking, disk I/O, and so forth. This is true whether you use threads or not.
-
2005-09-30 3:36 am japail
Those should read ‘scheduler’ and ‘conformance.’ I should either proof-read before posting (rather than after) or stop consuming idiot flakes before posting. My apologies.
-
2005-09-30 10:43 am BryanFeeney
Interesting post, thanks for the feedback.
Incidentally, Solaris switched to a 1:1 threading model in version 9 (see section 3 in http://www.cs.ucl.ac.uk/students/N.Pontikos/blog/java_concurrency/0…)
Further, the Linux kernel managed to substantially decrease thread creation and destruction time during the rollout of their 1:1 model. Ingo Molnar explains his rationale for using 1:1 instead of M:N or KSAs here: http://marc.theaimsgroup.com/?l=linux-kernel&m=103284879216107&w=2
His position is that while M:N is theoretically better, in practice implementation complexity tends to hold it back.
Also, a surprising number of Unix tools (even KDE, from what I’ve heard) use fork() and named pipes or shared memory to achieve concurrency instead of threads, despite threads’ ease of use. Historically this was because threading on Unix systems didn’t perform as well. For example, the traditional Unix daemon waits for a socket to connect, and then forks: the child does the work, while the parent goes back to waiting for another socket.
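The accept-then-fork pattern of the traditional daemon can be sketched in a few lines. This is a minimal, Unix-only illustration: `handle` and `serve_one` are hypothetical names, the greeting is made up, and a real daemon would loop on accept and reap its children via SIGCHLD.

```python
import os
import socket


def handle(conn):
    """Child's job: a placeholder handler that just sends a greeting."""
    conn.sendall(b"hello from child\n")


def serve_one(listener):
    """Accept one connection, fork, and let the child do the work.

    Returns the child's PID to the parent, which is then free to
    go back to accept() for the next client.
    """
    conn, _addr = listener.accept()
    pid = os.fork()
    if pid == 0:
        # Child: it does not need the listening socket.
        listener.close()
        try:
            handle(conn)
        finally:
            conn.close()
            os._exit(0)  # leave without running the parent's cleanup code
    # Parent: the child owns the connection now.
    conn.close()
    return pid


if __name__ == "__main__":
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(5)
    client = socket.create_connection(listener.getsockname())
    child = serve_one(listener)
    print(client.recv(64))
    os.waitpid(child, 0)  # reap the child so it does not linger as a zombie
    client.close()
    listener.close()
```

Each client gets its own address space, so a crash in one handler cannot corrupt another, at the price of a fork per connection.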
-
2005-09-29 6:18 pm Matt Giacomini
> How can you use Mac OS X “every day”, if you’re currently “working on OpenSolaris”?
I love working with many operating systems.
– I use Windows XP for my day job (just to VPN into my company and then telnet to the UNIX servers that I work on). My company uses a VPN product that only works on Windows.
– I bought my wife one of the Macs for her personal use.
– I do photography/video for fun and use my 2nd Mac for video and photo editing.
– And when I can find the time (usually on weekends) I use my last machine for looking at new OpenSolaris builds, or checking out a new Linux distro.
Thank god my wife is also a developer or we would probably be divorced by now.
“In any event, this is only an issue for major database-backed websites. For mid-to-professional level file-servers, mail-servers and general Intranets, Mac OS X.4 is perfectly okay.”
My primary job is to support our company’s database and application servers; I also do some Java development. So my interest when working with *nix is generally to test out and play with back-end services: OracleDB, OracleAS, MySQL, PostgreSQL, JBoss, Tomcat, Resin, and Apache.
I think that in general OS X works great, but some of the back-end services I have tested on it (mainly Apache and MySQL) suffer under heavy load. So, as a guy interested in operating systems and back-end services, I thought I would check around and see what other people had heard with respect to these issues.
Thanks for the link
-
2005-09-30 3:46 am
If you need further assistance Mac users are standing by to convert you to our way of thinking.
Ok, fun aside, here’s Steve Jobs demoing NeXTSTEP (requires QT):
http://www.esm.psu.edu/Faculty/Gray/graphics/movies/jobs_NS30_demo_…
There’s an interesting thread ongoing on the OD’s Discuss mailing list about the stagnating status of OpenDarwin as an independent distribution:
“Apple’s Darwin commitment”:
http://www.opendarwin.org/pipermail/discuss/2005-September/date.htm…
My, from reading that thread, I take home two things:
– Jordan Hubbard is seriously dedicated to supporting Darwin, and the community is in the process of setting up a nice build system
– “Dipl. Ing.” (FH?) Markus Hitter is an obnoxious troll.
“There’s an interesting thread ongoing on the OD’s Discuss mailing list about the stagnating status of OpenDarwin as an independent distribution:”
Followed your link out and looked at the thread. I have seen it all many times before concerning Operating System xxxx, (put your own values in here). You see much the same regarding any company trying to market an OS based on open source code. I saw many of the same comments about Xandros in its early days.
Let’s face it, companies trying to make a profit have to give people an incentive to pay for their product. If they didn’t, they would soon be out of business. In this respect Apple is no different than any Linux vendor charging for their distro. The free version will usually be as far behind their commercial version as they can get away with.
Anyway, thanks for the link. It was interesting reading the comments even if I had seen them before. Some things never seem to change.
There is virtually no demand for OpenDarwin, thus no community, thus stagnation. But if something that no one wants is stagnating, it’s not a problem.