This is the heart of the conflict: Rust (and many other modern, safe languages) use LLVM for its relative simplicity, but LLVM does not support either native or cross-compilation to many less popular (read: niche) architectures. Package managers are increasingly finding that one of their oldest assumptions can be easily violated, and they’re not happy about that.
But here’s the problem: it’s a bad assumption. The fact that it’s the default represents an unmitigated security, reliability, and reproducibility disaster.
I’m sure this will go down well.
I skimmed through and wrote a quick response, then browsed the footnotes and had another think about the overall essay. It’s a good, thorough, and thought-provoking essay. I would say it might benefit from a review after discussion, because some (a lot) of the negatives are in a lot of ways opportunities if you take the right approach.
Yes, there is no such thing as a secure system.
That’s a rabbit hole and becomes a swamp as you read on.
To do portability well, it helps to do it from the beginning. This needs a portability approach which accommodates different compilers and build environments, different quirks and features, tool and OS versioning, and sometimes substitution code. Abstract early and abstract often. The quicker you abstract, the quicker you will avoid snags, and adding new platforms or maintaining existing ones is a lot easier. You can also make use of warnings, so if a vital element changes you can be alerted to it. Version numbers and other portability-relevant flags are your friend.
This can be a difficult one, as different people support or finance support for different parts of the overall software system. What looks like the cost of supporting an obscure piece of hardware may be relevant to someone who is paying for your kernel development, so be careful which branch you are sawing off.
When doing high-performance graphics development, my view of Direct3D was unprintable, yet all the abstractions were there to drop in Direct3D even though I never used it in practice, or 3DFX Glide (restricted to full-screen use), or any future graphics API. Ditto all the other code which needed abstracting. This was mostly interface stuff, but also memory and file handling. I’m sure if there was anything else which needed abstracting I would have done that too. I tended to avoid anything which would create lock-in, and would either code around it or have graceful fallbacks or substitutions.
I created my own simple memory and file serialising code which was abstracted and used all over the place. Serialising is very easy to get wrong: one slip in the wrong place and you are in trouble, which is another mark in favour of simple code. It did what it did and nothing else.
I think it’s worth separating the security and reliability issues. To some degree the general angst the author has was replicated with the Windows XP codebase, hence Service Pack 3 and Microsoft putting effort into making Visual Studio safer. Then there was the refactoring with Vista, which became Windows 7, then more work on improving reliability. It’s a crude comparison, but there is likely some useful dialogue to be had.
As the author notes, maintainer turnover rate and/or unmanaged-code ratio is a symptom which can indicate problems with your entire portability/coding/build/support approach, so it is worth dealing with those issues.
Yes there is no such thing as a secure system.
There is such a thing as a system as close to secure as humanly possible. That is what the seL4 work is about.
https://ts.data61.csiro.au/projects/TS/compiler-correctness.pml.html
Do note seL4 is using C. And notice something here: the seL4 validation process, at its most rigorous, does not trust the compiler. That’s right: you build your program with the compiler, then use a decompiler to check that the resulting binary is conforming. The number of GCC ARM bugs the seL4 process has found is massive. There is nothing like the seL4 process for LLVM.
The big point the author makes, that it’s almost impossible to prove two builds on two different systems are the same, is in fact true, be it Rust or C. The seL4 compiler-correctness work shows what is really required: not only a full compiler, but also a full decompiler and a mathematical model of the platform, to check that the built code is functionally equal.
Have people forgotten the Meltdown/Spectre affair, where x86/ARM/POWER CPU design faults resulted in machine code not doing what the compiler/source code said it should? Interesting point: the Meltdown and Spectre behavior was in the Intel and AMD documentation describing how x86 worked before either problem surfaced. Yet the Intel/AMD documentation has never been turned into a mathematical model that can be used to validate resulting binaries.
It’s not just weird platforms that have problems here with making correctly behaving applications. There is a layer down that Rust is not dealing with. The seL4 validation process removes from seL4’s C source code all the same faults Rust does, and it’s not a surface-level validation.
Given the seL4 example, there is no real reason why Rust could not go Rust to C to machine code. Same with all the other safe languages. Of course, I understand why Rust and the other so-called safe languages will not do this: once you start having Rust on the weird platforms, it becomes clear how much of Rust’s issue prevention depends on the x86 design.
C, by not claiming to be safe, does not have to care as much that things are broken on weird platforms, and so is able to push undefined behaviors back to the person building the program. Safe languages really do need defined hardware behaviors, like it or not. Defining hardware behavior in a validatable way is hard, painful, and complex, as the seL4 validation process shows, and it does not get any simpler with Rust instead of C.
LLVM is not used just for its relative simplicity; if anything, it’s actually a very complex toolchain. LLVM is used because it is a far more efficient approach to language/compiler design: it decouples all the complexity of the bare metal and offers a universal abstract target for the compiler/interpreter of your language.
I don’t understand the writer’s thesis. If LLVM doesn’t support a specific architecture, how exactly is that a security/reliability/reproducibility disaster?
I think there is a misunderstanding between portability and universality, which are not synonyms.
javiercero1,
The author was referring to poorly maintained and poorly supported GCC targets and projects that implicitly assume the existence of a C compiler that works correctly. Many projects take this assumption for granted even though it’s not always true.
Solving this is tough. Projects like Rust benefit from LLVM, and personally I think LLVM is a better foundation to build on than GCC. And I absolutely think Rust is a better language for robust code. However, niche platform users are getting annoyed that their platform isn’t supported, and I cannot fault them for their frustration.
FSF, specifically Stallman, was vehemently against the idea of a “modular” GCC: https://lwn.net/Articles/582242/
Their concern was private plugins getting mixed with the OSS code. So there was no proper separation between layers, and hence no counterpart to clangd or similar tools. (clangd keeps a synchronized copy of parsed code, thus allowing refactoring, code completion, error detection etc in visual studio code and other IDEs).
After all the effort that went into making LLVM on par with, and for some scenarios better than, GCC, I don’t think it would be easy to turn back. At best other architectures might look into writing LLVM backends, which is not actually too difficult (for those with compiler-writing experience).
LLVM is a much better platform to write languages/tools/etc. than GCC, by design/policy of both projects. In that light it’s a bit surprising that GCC has more backends; this probably has to do with community inertia and the higher bar to enter and stay in LLVM (how many of those niche GCC backends are properly tested/maintained?). I expect this trend to continue: LLVM will very slowly catch up to the point where everybody agrees that a platform needs an LLVM backend to be considered healthy/alive.
I agree.
A monoculture is not generally preferable. But LLVM will most likely be *the* compiler. Even Microsoft is supporting it in Visual Studio (proper, not “code”):
https://docs.microsoft.com/en-us/cpp/build/clang-support-msbuild?view=msvc-160
Stallman is wrong because he is a software communist. While those theories work best in the software world, since software is so easy to share and duplicate, the work that goes into software is just as hard as anything else… and “software communism,” as I term it, provides the least incentive and the most detriment to the originator of the software of almost all free licensing schemes.
While there are points Stallman raises against software licensed as LLVM is… in practice they don’t occur, or don’t occur significantly relative to the increase in investment in development of the software compared to more restrictively licensed software.
In fact enforcing the restrictions of “software communism” often engulfs a developer to the point that they no longer develop…
Stallman isn’t a communist. He’s on record as saying he has no problem with people making a profit, and Red Hat have done very well out of his “communism”.
If RMS is a “software communist,” then Intel and AMD (who cross-license their architectures), ARM (who license their IP), Sun and IBM (who open sourced SPARC and POWER, respectively) and SiFive et al. (who develop the open source RISC-V) are “hardware communists”. They aren’t, because “communist” and “communism” are words with specific meanings, which emphatically have nothing to do with “a word I use to describe something I fear because I can’t understand or refuse to understand it.”
I believe on this very site, Thom opined a few years ago that “Stallman was right,” so if you still adhere to the view that he is a communist, you might want to stop reading this site.
I’m not sure that is entirely true. Last time I checked (about two years ago), LLVM/Clang generated larger, slower, and less optimized code than GCC. Compile times were faster, but GCC’s optimizer is much better.
LLVM/clang is also a beast to compile. While this is not an issue for most people, I was not able to compile LLVM on an old computer because the build kept crashing due to low memory. GCC, on the other hand, compiles perfectly. It could be that LLVM is poorly supported on some older and memory constrained platforms because it’s close to impossible to compile natively.
The fact is that LLVM has gained popularity not because it is better than GCC, but because of the permissive license. Apple, as well as some of the BSDs, adopted LLVM/Clang when GCC moved to the GPLv3; it had nothing to do with GCC being better or worse. Companies love to say how pro open source they are, but really they are just looking forward to all that free labor from the community.
Things are changing rapidly on the LLVM performance front. For example,
Chrome: https://chromium.googlesource.com/chromium/src/+/0e94f26e8/docs/clang.md
Firefox: https://developer.mozilla.org/en-US/docs/Mozilla/Developer_guide/Build_Instructions/Compiling_Firefox_With_Clang_On_Linux
and many other projects either support llvm or use it as the primary.
The initial reasons were faster compilation (no longer true thanks to lots of optimizations) and much better diagnostics (still true, it can actually identify template mistakes). But the generated code is almost as fast as gcc, or even faster in many cases.
This was not even a license thing (at least for Chrome and Mozilla).
Every platform starts off niche too… The unix world was once dominated by m68k and vax. ARM used to be a very small niche, but is now gaining traction – this probably wouldn’t have happened if it weren’t for existing strong support by open source tools.
Without portable code, we end up tied to one architecture.
I was thinking the same. There are plenty of good and interesting points in the article, but sometimes some chaotic mess is important to prevent stagnation.
The problem is this is simply untrue. A few examples with very good (and well-tested) OS and compiler support are SuperH/SH (well supported by GCC, and by OpenBSD, NetBSD, Linux, and probably others), UltraSPARC (well supported by GCC and Solaris Studio, and by Linux, OpenBSD, NetBSD, FreeBSD, Solaris, and definitely many others), and 68K (still supported by GCC and the Freescale toolchain, and by Linux and NetBSD). You can get systems made with these processors fairly easily (though not so much new ones for UltraSPARC anymore), and many OSes (particularly OpenBSD) are self-hosted and well tested on these architectures.
There is support. For some it’s good support. It’s LLVM that is lacking here.
FlyingJester,
Not to dismiss GCC’s achievements at portability; they deserve a ton of credit. But LLVM tends to be better suited for inclusion in other projects. It’s just so powerful and flexible that it’s being used successfully to bridge very diverse technologies.
https://emscripten.org/
I completely understand the frustration for those whose platform isn’t supported. It’s a matter of resources, and in many cases the hardware isn’t even readily available anymore. I think the most practical solution would be for niche platform owners to contribute to LLVM’s portability effort, but I can sympathize that many of them are loyal GCC users and want nothing to do with LLVM. It’s one of those things where there’s no easy answer.
Wouldn’t an LLVM-IR → C-code backend solve the problem?
You would have to compile the C-code with the native compiler of the platform in a second stage.
smashIt,
I don’t see why that wouldn’t work in principle, although if the author is to be believed, the native C toolchains do not always behave correctly. Multithreaded code is notoriously complex, and this translation might lead to subtle race conditions on architectures that use different memory barriers. x86 does not have the same barriers as ARM, etc. And there may be differences in the OS that cause new bugs, like the times at which the kernel might return EINTR.
These aren’t really LLVM specific problems, but I’m just trying to think of reasons why a trivial port might not work perfectly as expected.
I’m well aware of the wishy-washy nature of C (what were they thinking?)
But I believe these problems could be solved with a few compiler flags, or a custom script that fixes little problems for the native compiler.
LLVM now supports M68K:
https://github.com/search?p=1&q=org%3Allvm+m68k&type=Commits