NVIDIA’s Wayland support is finally coming together, albeit long overdue, with DMA-BUF passing support and now patches pending against XWayland to support OpenGL and Vulkan hardware acceleration with their proprietary driver.
Pending patches to the X.Org Server’s XWayland code, paired with a yet-to-be-released proprietary driver update, finally allow hardware-accelerated rendering with XWayland.
NVIDIA is really holding Wayland back, so it’s good to see progress on this front.
Even better would be if they finally just open-sourced their drivers, like AMD and Intel did long ago.
Even better would be to switch to AMD and benefit from their open-source drivers…
Sure, but I’m just not going to use Wayland until it works with Nvidia.
I read through the comments under the first article. It’s the usual spread, from rah-rah cheerleading, to people pointing out WDDM is better than anything Linux has done, to a Linux zealot moaning that breaking the ABI is normal and the blame is on the IHV.
While open source drivers would be nice, a perpetually breaking ABI, iffy roadmaps, and never-ending office politics are really off-putting. There’s a choice of drivers from an admittedly small pool of IHVs, but if the issue is that important to someone, the simple answer is to buy something other than Nvidia. I personally don’t like Nvidia because of their fanboi culture and their history of sacrificing visual fidelity to cheat, but I don’t go on about it every time they do or don’t release a driver or product. Most people simply don’t care, just as most people don’t care whether a driver is open or closed source. Even among those who do, only a fraction of a percent will actually look at the source, and even fewer will make use of it.
My laptop’s iGPU is an Intel. My eGPU is an AMD card. Sadly, because my eGPU adapter is only rated for 2×75 watts, an upgrade of my old AMD graphics card is going to have to be Nvidia to make any sense, because of the performance-per-watt ratio. I tried really hard to justify an AMD with a 150 watt TDP but the performance gap was too large. That said, I’ve looked again and an AMD RX 5600 XT may be possible; it is slightly better than the Nvidia GTX 1660 Super, which is itself significantly better than the graphics card I had previously considered. But then the Nvidia is a lot cheaper. On balance I’d be happy with a lower-spec Radeon RX 5500 XT 8GB card (TDP 130 watts), as the performance/price is okay, plus 8GB is helpful when you’re running an eGPU over a 1x/2x bus.
It may surprise people, but running an eGPU over a 1x/2x bus is surprisingly good. Even if a graphics card only reaches 70%-90% of the performance it would have in a built-in PCIe slot, this is a million times better than the iGPU, and almost anything will have OpenCL, which will do for providing a boost to art software. A lot of games I might play are actually playable in the 60-70 FPS range with a card like this, but I don’t play games. An 8GB card is useful as it will minimise potential data transfers over the bus, although 4GB-6GB is adequate.
https://www.phoronix.com/scan.php?page=news_item&px=NVIDIA-2021-Driver-CVEs
There are reasons you want kernel-level drivers peer reviewed at the source level if possible. Do note that some of the issues found in Nvidia’s closed source drivers were removed from the open source drivers when they changed over to DMA-BUF. Yes, that was 2013, so these are known defects 7 years old that Nvidia has only just fixed, putting a CVE on them so it appears current instead of the true fact that they have had their head in the sand, not attending the different security discussions at the yearly X11 conference and not migrating away from functions obsoleted by Linux or Windows.
One of the reasons the change to DMA-BUF was made with DRI 1.3 is that it moves the permission system onto file handles, which the Linux system can handle natively.
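To illustrate what “file handles the Linux system can handle natively” means in practice: a dma-buf is just a file descriptor, so processes hand buffers to each other with the same generic fd-passing mechanism any Unix program can use, and the kernel applies its normal per-process permission checks to that descriptor. A minimal sketch of the sending side (plain POSIX SCM_RIGHTS, not taken from any actual compositor; error handling trimmed):

```c
/* Minimal sketch (not from any real compositor): handing a buffer file
 * descriptor, such as a dma-buf fd, to another process over a Unix
 * domain socket with SCM_RIGHTS. This generic fd-passing mechanism is
 * what lets the kernel apply its normal per-process permission checks
 * to shared graphics buffers. Error handling trimmed for brevity. */
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Send one file descriptor over an already-connected AF_UNIX socket. */
int send_fd(int sock, int fd_to_pass)
{
    char dummy = 'x';                              /* must carry at least one data byte */
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };

    union {                                        /* correctly aligned control buffer */
        struct cmsghdr hdr;
        char buf[CMSG_SPACE(sizeof(int))];
    } ctrl;
    memset(&ctrl, 0, sizeof ctrl);

    struct msghdr msg = {
        .msg_iov = &iov, .msg_iovlen = 1,
        .msg_control = ctrl.buf, .msg_controllen = sizeof ctrl.buf,
    };

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type  = SCM_RIGHTS;                 /* "this message carries file descriptors" */
    cmsg->cmsg_len   = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd_to_pass, sizeof(int));

    return sendmsg(sock, &msg, 0) < 0 ? -1 : 0;
}
```

The receiving process does the mirror recvmsg() and ends up with its own descriptor referring to the same buffer, which is exactly the kind of object the kernel’s existing security machinery already knows how to police.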
Microsoft’s WDDM signing process failed to prevent this problem. Yes, the backwards compatibility of WDDM is one of the causes of this CVE set: a new version of a WDDM driver could still use older deprecated functions that were known to be insecure. There is a need to chop off backwards compatibility on new drivers so they cannot use functions with known security flaws; if that had happened we might not have these CVEs today, because the driver would have broken as Nvidia’s developers tried to build it. WDDM ABI/API stability is a double-edged sword: on one side it makes driver development simpler because fewer code changes are required; the other side is that it makes security flaws far more common, because flawed code paths keep working instead of breaking and forcing a rewrite/fix. The same goes for Windows driver ABI stability across the board.
Yes, the Linux kernel has been cutting Nvidia off from old functions, but that still does not stop them from screwing up internally: their driver core comes from Windows, and since they were getting away with this stuff over there, the problem spread.
The idea of a single unified driver shared between platforms in a lot of ways does not work once you wake up to how different the security models are.
The reality here is that the Linux developers of Wayland compositors have been telling Nvidia they had design problems since 2013.
Remember, the big selling point by Nvidia for EGLStreams was that video card driver makers would be able to share code between platforms.
HollyB, like it or not, ABI breakage is part of life if you are trying to make a secure platform.
@oiaohm
That’s a political distinction, not a technical one. My reaction to Nvidia taking ages is a big “so what?” I have a Windows 10 iGPU driver which will never be updated. Linux ABI breakages happen for political reasons such as “open source wah wah”. Wayland is also dragging its feet. All of the above is why I honestly cannot be bothered with Linux discussion. The Windows 10 ATI driver for my old eGPU card will never see another update either. As for it working on Linux, the answer is yes it does, but primary and secondary displays are basically unusable because I get a blank desktop. Why? I have no idea and don’t care enough to find out.
If you dig into WDDM driver signing there are things they do and things they don’t do. One of them is they don’t provide a full security audit. That’s the IHV’s job. Take it up with Microsoft or NVidia.
Don’t moan to me about whether they turn up to a conference or not. Most are talking shops. Wait until the proceedings are published or someone who does go makes their material accessible.
Most of my work was at the API or creative level. I really honestly don’t want to get into political discussions over kernels because it’s a tar pit.
HollyB, it’s not political reasons. It’s for technical reasons that the Linux kernel-level ABI changes.
“If you dig into WDDM driver signing there are things they do and things they don’t do. One of them is they don’t provide a full security audit. That’s the IHV’s job.”
This shows you being an idiot who does not know the technical side here. Full source of all parts is required to perform a proper security audit. The IHV cannot do it alone, and neither can the OS vendor, when you have a non-microkernel design. If WDDM driver signing were doing a full audit, it would require either Microsoft handing over full source code to the IHVs (Nvidia/AMD/Intel…) or the IHVs handing over full source to Microsoft.
Look at seL4 and the work on the Linux kernel memory model (yes, a full mathematically auditable model of the Linux kernel memory operations of all the mainline open source parts). There are things in the Linux kernel memory model covering how particular CPUs will at times perform memory operations in a different order to the machine code, and other wacky things. How do you get this information to audit? Now the IHVs need the CPU vendors’ data as well.
Yes, the Linux kernel asking for open source is not something political; there is a technical limitation where you need everyone’s source and information to produce full models, to work out whether something has goofed something up or not. Of course, when running these models, the more legacy functions you keep, the more processing time it takes.
@oiaohm
Oh, I’m an idiot now am I? There’s a lot of people on here going around calling other people idiots. Not a good look. Ok… I’ll leave you to your tar pit.
HollyB, you were being an idiot saying Microsoft does not do something when the reason is that it is technically impossible. It is not that they don’t do it; it is that they cannot do it, and it is very important to know that, otherwise you start calling things political that are in fact no-choice options once you have made other choices. I will list what Microsoft could do, and it is a tar pit of technical limitations leading to very hard choices.
Microsoft could choose not to allow new drivers to use old deprecated functions: if you flag your driver as able to use the newest DirectX features and then go and use an old deprecated kernel function that is known to be insecure, the kernel refuses to load it. Please do not say this cannot be done; the way Linux kernel 5.9 breaks sections of the Nvidia driver implements exactly this, and proves it has very low overhead. The idea of this kind of filtering appears in a particular Unix from before Linux existed, that’s right, Xenix, so this is a case of Microsoft throwing the baby out with the bathwater.
If you want to use an old driver, the above functionality could still allow the old driver to work. Please note Microsoft has proven that they don’t maintain all the old support equally well, so you will be running extra security risk and extra stability problems using old drivers with Windows. Think about it: if you don’t activate Windows, Microsoft displays a nice little watermark about using a non-activated version of Windows. There is no reason why they could not do the same with drivers, so you know when a particular system has its security pants down due to unmaintained drivers.
“Wait until the proceedings are published or someone who does go makes their material accessible.”
The X11 conference publishes its material every year as well; 2013 was when the use of DMA-BUF was decided on, due to security issues, in the security part of the conference. When you as a driver developer are being asked to attend something, you should at least be reading over the material.
https://www.phoronix.com/scan.php?page=news_item&px=NVIDIA-2017-Device-Mem-API
The reality is we know Nvidia did read the material. We know Nvidia argued against the change and went as far as putting forward an implementation that would not change the status quo and so would not fix the security problem. The Wayland problem with Nvidia has been lots of kicking and screaming on Nvidia’s side, not wanting to fix a known security problem and instead coming up with joke ways of not fixing it for over 7 years. This has also had the side effect that different areas of the Wayland protocol have not been able to be declared absolutely stable, because Wayland depends on passing graphics buffers around, so if the buffer-passing method is going to be something other than a file descriptor, extra parts may be needed in the protocol to deal with it, and so on.
Yes, you are right that this security stuff is a tar pit of technical limitations and choices; most of the choices are not political, but they have knock-on effects.
If you want to audit the complete OS core, your choice is source code or bust. That is the way it is.
If you want to support closed source binary blobs, the correct choice is not absolute ABI stability.
There is an absolute need, from a technical security point of view, for ABI/API breakage so that new drivers cannot use deprecated kernel functionality. The same in fact applies in userspace. The question is how you implement this.
So it’s not how you have been making it out at all, HollyB; you are very short on technical knowledge of the security side and the limitations it causes.
@oiaohm
An apology would have been better. I’m not surprised Nvidia told Linux to take a hike for so long. I’m not listening so you’re clearly writing this for yourself or an imagined audience so you can “win”. Seriously, I have better things to do.
HollyB
**“open source wah wah”.
Notice something here: you were the one insulting first. That is acting like an idiot, so I am calling you one.
**An apology would have been better
HollyB, yes, it would have been, if you had read the post of yours I answered, pointing out that your claim of it being political was incorrect. You have been an idiot all the way along here, one who has not worked out that they have been insulting and is now asking for an apology. So it’s not me who owes you an apology; it’s you who owes an apology for being an idiot on the topic, oversimplifying the problem and then making out that complaining about this stuff is acting like a baby.
There is a stack of technical limitations on what can and cannot be done. When you make particular choices for security, this forces you down particular routes. Those wanting all the source are wanting the means to audit.
Also, you say it’s a Linux zealot saying ABI breakage is normal. This is a completely bogus point of view, because it’s not just the Linux people who class ABI breakage as normal for security. The NSA extends government versions of Windows to block the loading of old drivers, so creating ABI breakage intentionally, and they are not the only party doing this for security either. Security experts at the highest level class ABI breakage as required to maintain the most secure systems. Calling them zealots means you get to ignore the high-level security experts telling you that ABI breakage needs to happen. So this is you being an idiot, using an insulting term to allow you to bury your head in the sand.
Basically, you are being a total idiot on this topic, HollyB, yet you jumped in boots and all with insults. What, other than an idiot, would do this? If you don’t know the topic fully, avoid insults. It is very important: in some cases there are things that need to be done, and ABI breakage, like it or not, is one of those things. Of course, saying that the Linux way of ABI breakage is a little too brutal would be OK.
oiaohm,
Just stop it. You’re coming across as a bully. Osnews can benefit from your insight, not your condescension. It’s ok to disagree without attacking your peers.
Alfman, really, you tell me to pull my head in but you did not tell HollyB to. There is a nasty side to me; HollyB was the one following the bully’s playbook.
I live by a concept that is scary to a lot of people: if you start treating people poorly with insults, don’t expect me to be nice. Treat others how you expect to be treated; if you treat others poorly, expect to be treated poorly.
The funny thing is that those who use underhanded bullying like HollyB cannot take it when they are insulted themselves. If you cannot take it, HollyB, don’t do it.
oiaohm
She didn’t bully you. You called her an idiot like 7 times, you do the math. I’d be remiss if I didn’t say something. As a child I went to a 7 day summer camp. One of the kids was being bullied and I didn’t do anything about it. He cried and left the camp early. Now that I have kids I think more about that. His parents sent him there to have fun and I think about how sad it must have been for them to see their child hurt by kids that were mean to him for no reason. I regret not doing something. I’ll admit it’s not always easy and sometimes we get caught up in the heat of the moment, but being friendly and respectful can make a big difference between a community that is fun and welcoming versus one that is toxic. I hope you would agree that when it comes to inconsequential opinions, it’s worth weighing on the side of respectful disagreement rather than vindictive character attacks. Anyways, just keep it in mind.
Peace
@HollyB
I don’t know enough about the corporate or political situation to comment on the bulk of your post regarding Nvidia and AMD, but thanks for the eGPU feedback; it’s quite valuable.
I’ve been thinking an eGPU might be a nice way for me to access the capabilities I need from newer software. I’m always dismayed when vendors drop support for hardware so soon. My home desktop is only 7 years old and still very good because it was high end when configured, yet the latest CAD software I use and even the latest Blender betas no longer support my graphics card, which operates perfectly for all other purposes, so I’m reluctant to change the system just to obtain the latest from one or two pieces of software.
If you have an ordinary desktop with a spare PCI slot just drop in a new graphics card. Thunderbolt eGPU cases are pricey.
I have a Thinkpad T520 with integrated Intel HD 3000 graphics. It has an ExpressCard slot but no Thunderbolt. The GDC Beast with the ExpressCard slot adapter gives me options. (On eBay for £30+ or Banggood.) There is better around but it’s cheap. The only things you need to watch are to place it on its own power socket, as it can be twitchy, and to make or buy a case. You will need your own power supply. I have a Dell 220 watt power adapter, which is recommended for this, but you can use an ATX power supply, which will also support cards needing more than 150 watts.
If you use an eGPU card as a loopback to a laptop’s display, performance will fall off a cliff as the bus is soaked up in both directions. It’s best to close the lid on the laptop if it isn’t already docked and run a desktop display off the graphics card’s port.
This is one of my favourite youtube channels and he has a video demonstrating this adapter.
https://www.youtube.com/watch?v=bP_8EYQ-2RA
HollyB,
Interesting video. I would have assumed that people looking for eGPUs are looking for Thunderbolt 3 solutions, but maybe not. If a mini PCIe socket is present, it is usually already used by a wifi adapter, no? The youtube review didn’t really show a finished product, just a PCIe slot with wires hanging everywhere. Did you build a rig to clean it up?
Here’s a review of a thunderbolt eGPU from 2020. He suggests that it worked well in many games on an external display, but that performance was disappointing rendering back to his laptop’s display, which is how he planned on using it. So that aspect seems to line up with your experience.
http://www.youtube.com/watch?v=37MdvN1XQ5o
“I bought an eGPU in 2020: My experience so far”
It actually looks like most eGPUs are targeting Mac users for some reason. I’m guessing there are a lot of Mac pro users who are looking for higher-end GPUs than are available in the MBP lineup, and who want to help remove the heat caused by small form factors.
http://www.youtube.com/watch?v=c-bNq8CJju4
@Alfman
The youtube review was just the basics. Yes, you have a data lead going to the adapter plus a power lead plus graphics out. It’s a bit of a mess but you can buy or make a case. As a custom solution, someone produced a 3D printer file for an Nvidia 1060. A mini-ITX case or similar might do, especially if you go the power supply route. I have no pressing need to use my eGPU so it sits in the cupboard. As and when I have a reason I will look into a case and a new graphics card, but this isn’t a personal priority. If I went this far I would probably also solder a new shielded round cable for the ExpressCard adapter. If my laptop had a Thunderbolt 2 or 3 port and the commercial eGPU cases weren’t so expensive I would have bought one of those instead.
This adapter has three options: ExpressCard, Mini PCIe, and NGFF M.2. ExpressCard is the easiest if you have the slot (I do). Mini PCIe and M.2 can look clunky or require case modifications to allow the cable to exit the laptop.
Thunderbolt 1 is slow. Thunderbolt 2 is better. Thunderbolt 3 is ideal. I don’t know whether he has Thunderbolt 2 or 3, and can’t remember what the performance is for this interface.
Yes, the main market seemed to be Macs. It’s just a market which took off, I daresay because Thunderbolt was standard on Macs. I understand it was a hobbyist thing, then some of the branded vendors released products and it took off. It’s a guess, but I suspect most Windows users wanting performance would be more into games and be running a desktop.
In the video you linked to he gets it wrong. The CPU has nothing to do with this. It’s simply a bus traffic issue.
Here’s the 3D printer case project:
https://www.youtube.com/watch?v=nLmP1UIUPW4
HollyB, TB3 is 5GB/s; depending on the game/application, that can be half the transfer rate you need once you allow for the overhead. PCIe 3.0 x16 = 16GB/s is technically overkill for most applications, but with lots of games, if you put a card into a PCIe 3.0 x8 slot, that is 8GB/s and there are performance issues.
At 12-14GB/s, current GPUs running any application technically tap out, and you need the remaining 2GB/s of the 16GB/s of a PCIe 3.0 x16 slot to transfer the screen back. Yes, a PCIe 4.0 x8 slot instead of an x16 slot, with a GPU supporting PCIe 4.0, provides all the transfer you need; this is why an AMD B550-level board can technically support dual GPUs at full usage.
The numbers fairly much say that 5GB/s (40Gbps), which is roughly 4GB/s once you add in the Thunderbolt overheads, falls short by a long shot for rendering back to the laptop screen; you are going to run out of bandwidth in most cases. An external GPU with an external monitor really needs to be a given with these Thunderbolt-connected setups.
Thunderbolt 3 is roughly a PCIe 3.0 x4. For heavy compute operations like cryptocurrency mining this is fine for a GPU; heck, you can even get away with a PCIe 3.0 x1 in this case. For a GPU that is going to output directly to a monitor this is also fairly fine, but you are short on bandwidth and a small number of applications will have trouble. Rendering on the GPU and sending the result back to the screen in the laptop is mostly not workable, as there is just not the bandwidth. (There are a few rugged laptops where it is workable because they have an HDMI input to the monitor part of the laptop, so you end up with two cables: Thunderbolt/USB 4 to the GPU, then HDMI back from the GPU to the laptop.)
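To put rough numbers on that comparison, here is a back-of-the-envelope calculation using the nominal link rates quoted above (a sketch only; real throughput is lower once protocol overhead is counted, e.g. the roughly 4GB/s usable figure for TB3):

```c
/* Back-of-the-envelope comparison of the link rates discussed above.
 * Nominal numbers only; real-world throughput is lower once protocol
 * overhead is included (hence the ~4GB/s usable figure quoted for TB3). */
#include <stdio.h>

int main(void)
{
    /* ~1 GB/s per PCIe 3.0 lane (nominal), 5 GB/s nominal for TB3 (40 Gbit/s). */
    struct { const char *name; double gbs; } links[] = {
        { "PCIe 3.0 x16",                      16.0 },
        { "PCIe 3.0 x8 (typical laptop dGPU)",  8.0 },
        { "PCIe 3.0 x4",                        4.0 },
        { "Thunderbolt 3 (nominal)",            5.0 },
    };

    for (size_t i = 0; i < sizeof links / sizeof links[0]; i++)
        printf("%-36s %5.1f GB/s  (%3.0f%% of x16)\n",
               links[i].name, links[i].gbs, 100.0 * links[i].gbs / 16.0);

    /* Rendering back to the laptop panel eats extra return bandwidth:
     * e.g. an uncompressed 4K stream at 60 Hz, 4 bytes per pixel. */
    double readback_gbs = 3840.0 * 2160.0 * 4.0 * 60.0 / 1e9;
    printf("Uncompressed 4K60 read-back stream: ~%.1f GB/s\n", readback_gbs);

    return 0;
}
```

That read-back figure of roughly 2GB/s is where the “remaining 2GB/s to transfer the screen back” above comes from, and it is a big slice of a 4-5GB/s Thunderbolt link.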
Something that is a big overlooked problem here: most game developers don’t test below PCIe 3.0 x8 for transfer speed. An external Thunderbolt or USB 4.0 GPU gets half of what the programs were validated with in lots of cases. So here be dragons.
This “here be dragons” also explains why games screw up so badly on lots of laptops: the extra discrete Nvidia graphics in many laptops is only connected by 8 lanes of PCIe 3.0, and it is also sending its output back along that link to the Intel GPU that is hooked up to the screen. Yes, that is more bandwidth than Thunderbolt 3, and it is still a problem child because it is bandwidth-short. The sweet spot is about 3-4 times the transfer rate Thunderbolt 3 has.
cpcf,
Yeah, even if the hardware still works for you, software dropping support can be the bigger problem. I wouldn’t really consider an eGPU for myself over a desktop machine, but it makes me curious anyway and I found some benchmarks. Here are some user benchmarks for the 3080 and 2080 Ti:
https://egpu.io/forums/laptop-computing/initial-rtx-30-series-egpu-benchmarks-look-promising-for-bottleneck-concerns/
So an approximate range of 5%-30% overhead, with most results coming in at roughly 10%.
I suspect it is going to depend on exactly what you are doing. If the GPU is rendering the whole scene and then just transferring the rendered frames, then the Thunderbolt 3 overhead should be minimal; however, the more the host bus is involved, the greater I’d expect the bottleneck to be. I don’t really know too much about Blender: does the CPU load ramp up, or is it always GPU limited?
I’ve been experimenting with CUDA and processing GPU kernels with data from host RAM (because host RAM >> GPU RAM). It would be interesting to see how it ran over TB3. Alas, I don’t have this setup to benchmark, but I would predict that my case would be a worst-case scenario for an eGPU. In theory I might even benefit from the PCIe 4.0 that the newest GPUs support, but my rig is limited to PCIe 3.0 x16 for now.
Theoretical bandwidths…
PCIe 4.0 x16 = 32 GB/s
PCIe 3.0 x16 = 16 GB/s
TB3 = 5 GB/s
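For anyone curious, the kind of microbenchmark I’d use to gauge how much a narrower link would hurt a host-RAM-heavy workload is just a timed host-to-device copy from pinned memory. A rough sketch using the CUDA runtime C API (buffer size and names are arbitrary, error handling trimmed; compile with nvcc or a C compiler linked against libcudart):

```c
/* Rough sketch: measure host->device copy bandwidth from pinned host RAM,
 * which is the path a host-RAM-heavy CUDA workload hammers and the part
 * a TB3 eGPU link would throttle. Error handling trimmed for brevity. */
#include <stdio.h>
#include <cuda_runtime.h>

int main(void)
{
    const size_t bytes = 256UL * 1024 * 1024;   /* 256 MiB test buffer (arbitrary) */
    void *host_buf, *dev_buf;

    cudaMallocHost(&host_buf, bytes);           /* pinned (page-locked) host memory */
    cudaMalloc(&dev_buf, bytes);

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start, 0);
    cudaMemcpy(dev_buf, host_buf, bytes, cudaMemcpyHostToDevice);
    cudaEventRecord(stop, 0);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    printf("Host->device: %.2f GB/s\n", (bytes / 1e9) / (ms / 1e3));

    cudaFree(dev_buf);
    cudaFreeHost(host_buf);
    return 0;
}
```

On a PCIe 3.0 x16 slot this should report somewhere around the 12-13 GB/s mark; over TB3 it would presumably top out at a fraction of that, which is what makes me think my workload is close to the worst case for an eGPU.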
@Alfman
Those benchmarks are nowhere near as bad as I had expected, given I’m primarily looking for compatibility over performance.
cpcf,
By all means then give it a go
What are you doing with blender & CAD? I’ve touched blender before but I never quite got the hang of it. I wanted to like it but the software just seems awkward to use. I used to use the moray modeler with povray but this was many eons ago, haha.
I realise my thoughts are not optimal, I suppose I’m looking for solutions to hedge my bets and get a little bit each way.
I can’t let go of good hardware when I get it working, once I’ve got it ticking over sweetly I’m very reluctant to change. I suppose my hardware attitude is the equivalent of people running Win XP on new hardware!