“The Linux 2.6.23 kernel comes with a modular scheduler core and a Completely Fair Scheduler, which is implemented as a scheduling module. In this article, get acquainted with the major features of the CFS, see how it works, and look ahead to some of the expected changes for the 2.6.24 release.”
I’m glad this article explained some of the differences between RSDL and CFS. There is always a lot of political talk about the two schedulers but not a lot of technical talk. I give credit to CK for showing the developers the need for, and the benefit of, a new scheduler, even though his scheduler was not ultimately included in the mainline kernel. Until CFS was merged I had been using RSDL, but I must say that I am thoroughly impressed with CFS and the direction it has taken with group scheduling and the modularization of scheduling policies.
If I have 4 high-priority processes running and another guy has one lousy, nonsense app consuming CPU time, then I get only 50% of the CPU power and he gets the other 50%. I won’t call that *fair* at all!
I’d call it FCS (F**** Communist Scheduler). Yeah, I know what communism is; I lived in the USSR…
You didn’t get it. If you have high-priority processes, they will have a negative nice value, which works no matter which scheduler you use. CFS is only “fair” among processes with the same nice value. It’s actually like communism: all processes are equal, but some are more equal than others.
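(If you’ve never poked at nice values: here is a minimal sketch of doing it from Python. The value -5 is just for illustration, and negative values normally require root.)

    import os

    # A lower nice value means higher priority; the usual range is -20..19.
    # pid 0 refers to the calling process; negative values need root.
    os.setpriority(os.PRIO_PROCESS, 0, -5)
    print(os.getpriority(os.PRIO_PROCESS, 0))  # -> -5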
You obviously did not read properly… You are referring to an aspect of “CFS group scheduling” which can be used to create groups that share the CPU exactly fairly, but it’s 1) flexible (some groups can be given higher priority) and 2) not the default behaviour (it’s something to be enabled as required).
The specific example given was akin to that of a public machine: consider a university with lab servers being shared by students. CFS group scheduling allows admins to make it so that a particular student cannot take up more than his fair allocation of the machine he is on.
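Roughly, once group scheduling is enabled, the admin setup could look like the sketch below. This is a minimal illustration only; the mount point /cgroup, the group names, and the weights are all made up, and the exact mechanism depends on how the kernel was built.

    import os

    CGROUP_ROOT = "/cgroup"  # assumed mount point of the cpu cgroup hierarchy

    def make_group(name, shares=1024):
        # Each directory under the cpu hierarchy is a scheduling group;
        # cpu.shares is its relative weight (the default is 1024).
        path = os.path.join(CGROUP_ROOT, name)
        os.makedirs(path, exist_ok=True)
        with open(os.path.join(path, "cpu.shares"), "w") as f:
            f.write(str(shares))

    # Two equally weighted groups: under contention each gets half the CPU,
    # regardless of how many processes either group is running.
    make_group("studentA")
    make_group("studentB")

A student’s processes would then be attached to a group by writing their PIDs into that group’s tasks file.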
If I have 4 high-priority processes running and another guy has one lousy, nonsense app consuming CPU time, then I get only 50% of the CPU power and he gets the other 50%. I won’t call that *fair* at all!
Do you often have several people logged in to your home PC? Besides, the feature you just described has to be enabled separately, and yes, it is indeed very fair on any shared computer system. Even if you have no need for such a feature (most home users don’t), it will be incredibly valuable to, for example, universities.
Besides, I can’t help but wonder how you would define “fair” if not as every user getting an equal amount of everything. For example, some user might be running a process which you consider unimportant, but it could very well be very important to that person. Then it’s only fair that you can’t rob that user of all the CPU time, and it ensures the other user can’t do the same to you either.
As for home users, this will mostly just improve the interactivity of all running apps, especially compared to the older scheduler when you have lots of demanding apps running. In that case it is fair towards processes: each one gets an equal share of CPU time, and the more processes are running, the less each one gets.
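(The arithmetic of that last point, as a toy calculation and nothing more:)

    # Per-process fairness among CPU-bound tasks: N runnable hogs
    # each end up with roughly 100/N percent of one CPU.
    for n in (1, 2, 4, 48):
        print(f"{n:2d} processes -> {100.0 / n:5.1f}% each")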
Some interesting performance comparisons in the PDF document below:
http://people.freebsd.org/~kris/scaling/7.0%20Preview.pdf
I like the bit where it says:
“The new CFS scheduler in (Linux) 2.6.23 is Completely Fair…to FreeBSD”
Heh… A good “dig” at Linux there…
The FreeBSD devs have been doing some great stuff with 7.0 – really looking forward to the release!
It’s nice to see that the FreeBSD community is finally starting to put out some benchmarks, something they’ve avoided doing for the past 6-7 years.
For instance, I recall Matt Olander promising to produce benchmarks that would back up the claim that FreeBSD 6.0 outperformed Linux with regard to raw data throughput. Those benchmarks were never published.
Usually such benchmarks are quite nonsensical. The one and only valid purpose for a benchmark is to help the developer. Otherwise it’s just flamebait for the masses.
As I pointed out before [1], the FreeBSD team dropped the ball in this document.
Let me quote myself:
“The Linux 2.6.22 CFS numbers seem to mirror what I’m feeling on my Opteron/Clovertown workstation(s). Close, but no cigar.
-However-, the BSD team screwed up big time with the 2.6.23 kernel numbers.
In order to test the 2.6.23 kernel they used the September 28 pre-Fedora 8 Rawhide kernel, which was compiled with a large number of debug options.
The BSD team should consider redoing the 2.6.23 results with a production kernel. For the time being I’d suggest they remove the invalid results from the PDF.
Incorporating invalid results into an official document just makes the BSD team look unprofessional.”
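(For what it’s worth, spotting such a build takes only a few lines; here’s a rough sketch that lists the enabled debug options in a kernel config, assuming the config is installed under /boot the way Fedora does it.)

    import os

    # The path is an assumption; most distros ship the build config in /boot.
    config = "/boot/config-" + os.uname().release

    with open(config) as f:
        enabled = [line.strip() for line in f
                   if line.startswith("CONFIG_DEBUG") and line.strip().endswith("=y")]

    print("\n".join(enabled))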
– Gilboa
[1] http://www.osnews.com/permalink?281880
Gilboa said:
*Rimshot!*
…?
http://en.wikipedia.org/wiki/Rimshot
Rimshots are often used just after the punchline of a joke is delivered, for emphasis, during a show. But the term can be used to denote emphasis of a point in other contexts as well.
… I did look it up.
I just couldn’t, umm, see the connection.
Granted, English is my third language, so I may be missing some hidden cultural difference…
– Gilboa
Then you can interpret it as “Good post”.
Actually, I do seem to recall this exercise exposing some sort of problem with glibc’s malloc, which I think got fixed in glibc 2.7 (released Oct 23, 2007) or something like that. So, like with Mindcraft, someone took a lemon and made lemonade.
Oh… OK, got it.
Thanks for taking the time to clear it up.
– Gilboa
The developer of the FreeBSD SCHED_ULE scheduler didn’t run this benchmark just for the comparison. He prepared it with the help of some Linux people, and along the way they discovered a performance bug in Linux. So in the end it seems there is a problem in Linux with debug options _and_ without them.
Didn’t that turn out to be a problem with glibc malloc?
I didn’t say that there was no bug in CFS. (Hence the term: “Close, but no cigar.”)
I -was- saying that using a debug-riddled pre-release kernel running on a highly unstable platform is bad testing methodology.
Whether or not you get lucky and get something meaningful out of it (read: the CFS/malloc bugfix) is irrelevant.
– Gilboa
Some interesting performance comparisons in the PDF document below:
http://people.freebsd.org/~kris/scaling/7.0%20Preview.pdf
I would like to see how the ULE scheduler fares compared to the Linux 2.6.23 release and the current rc of 2.6.24. The benchmark only compares a git version of 2.6.23 and the 2.6.22 release, which doesn’t use the CFS scheduler. CFS was released with kernel 2.6.23, so I don’t think the comparison says much. Plus there are many improvements taking place for CFS in the upcoming 2.6.24 release.
CFS was released with kernel 2.6.23, so I don’t think the comparison says much. Plus there are many improvements taking place for CFS in the upcoming 2.6.24 release.
And FreeBSD 7.0 is not even released yet. Do you think the only place where improvements are being made is the Linux kernel?
“My next release will beat your ass for sure” is a lame excuse not to test and release a comparison ASAP.
And FreeBSD 7.0 is not even released yet. Do you think the only place where improvements are being made is the Linux kernel?
“My next release will beat your ass for sure” is a lame excuse not to test and release a comparison ASAP.
Isn’t this exactly what the FreeBSD developers are doing by comparing an unreleased product to Linux? If you’re going to compare bleeding-edge kernels, then compare bleeding-edge kernels, not ones from a release ago. I believe the benchmark was current at the time it was created, but a lot has changed in a small time frame, CFS being the main change, which, if you haven’t noticed, is exactly what the article is talking about. CFS wasn’t even released in a stable kernel until after the benchmark was created. I just don’t think the benchmark is very relevant to the discussion, and it isn’t representative of reality today.
The given example is not an example of fairness but the reverse: 48 processes are given the same total time as 4 processes.
This is the sort of topic where people could go on arguing ad infinitum while establishing nothing. Define the parameters of the problem first. Is the goal to be fair to processes or to users? Depending upon what you decide there, the answer pretty much falls out on its own.
I suppose it depends upon the context of the situation. For my purposes as the admin of XDMCP servers serving out desktops, being fair to people is best for me. However, I/O scheduling is at least as important as processor scheduling, if not more so, for my workloads. But processor scheduling tends to get more press.
Well, if a scheduler wants to be fair to users, then a user with a heavy computation must not be treated unfairly compared to a user with a lighter computation. Therefore, it’s not fair to equalize a user with 48 processes to a user with 4 processes.
Well, if a scheduler wants to be fair to users, then a user with a heavy computation must not be treated unfairly compared to a user with a lighter computation. Therefore, it’s not fair to equalize a user with 48 processes to a user with 4 processes.
Yes, it is. The scheduler has no way of knowing which user has more mission-critical processes running, and each user’s processes are important to them, i.e. you have no right to say “My processes are more important than yours, so my processes run first!”. Besides, the scheduler also has no way of knowing what the “heavy computation” (taken from your example) is used for: it could be used, for example, to calculate some molecular patterns, or it could just as well be some pretty fancy and demanding screensaver. Would it then be fair that a user ran a very fancy and demanding screensaver with lots of heavy computation and robbed your lighter processes of their CPU time?
It is completely fair to the groups that the processes are in. This example was for a multi-user system where the admin has set up two groups to receive equal processor time. The only judgment that can be made here is about the fairness of the admin.
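(To put numbers on the 48-vs-4 case, here is a toy calculation of per-process share under both policies. Pure illustration, not CFS code; the user names and counts are made up.)

    def per_process_share(groups, fair_to_users):
        # groups maps each user to their number of runnable processes
        total = sum(groups.values())
        for user, n in groups.items():
            group_share = 100.0 / len(groups) if fair_to_users else 100.0 * n / total
            print(f"{user}: {n:2d} procs -> {group_share / n:5.2f}% CPU per process")

    load = {"userA": 48, "userB": 4}
    per_process_share(load, fair_to_users=True)   # 1.04% vs 12.50% per process
    per_process_share(load, fair_to_users=False)  # ~1.92% per process for both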