There are increasing rumors that Alpha might be brought back to life. The Inq sets the big ‘if’ aside and explores the possibilities: “What if there really is a will to get Alpha back into the changed market? What sort of chip would it have to be to have a good chance of success, if any?”
But it will be running with an Intel logo slapped on it. They hired most of the team responsible for the Alpha. Whatever’s left at HP wasn’t responsible for the design of it. The “new” Alpha will show up in PowerMacs in 2007.
I loved Alpha..
What have you been smoking?
Way before their time. Too bad they did such a crappy job selling it.
Jeff: They hired most of the team responsible for the Alpha… then stuck them on Itanium.
(Which brings me to my next point. What sort of chip would it have to be? Anything that isn’t Itanium.)
I would love to see an Alpha based Linux offering. I don’t think Solaris and UltraSparc would stand much of a chance if that happened.
Alpha – the only truly interesting instruction set out there, and the one actually capable of capitalizing on the benefits of the RISC principle. And it smashed all of its competitors on the performance front. Look, x86 is gross. x86-64 is garbage on top of a garbage heap. PowerPC is ugly as hell if you look at the details. ARM – can’t be implemented at the GHz level. HP-PA – an L1 cache with some processing capability. Itanium – crap in every respect. Why on earth did they kill Alpha in the first place? The Top500 performance list, at least, would still look very different if they hadn’t. And most important of all, it would have kept customers and given HP a lot of its reputation back.
Why all that fawning about Alpha anyway?
Yes, Alphas were designed by particularly talented engineers who introduced lots of new ideas first. But those ideas (and the engineers) have long since transferred to x86 implementations.
The Alpha’s instruction set itself is just another cache-busting 32-bit-fixed RISC job.
What made Alpha so much “better” than other RISC processors (e.g. PARC, PowerPC, etc.)?
Personally, I like SuperH.
Any chance you might expand a bit on how exactly the Alpha instruction set is meant to be so much better than PowerPC and ARM?
They’re all just very slight variations on the same theme: load/store architecture, simple addressing, lots of registers, three-operand format, fixed 32-bit encoding.
Besides, the ARM instruction set could very well be implemented as an out-of-order design with a long pipeline running at >2GHz.
With Alpha-like features and performance?
… but some others went to AMD. How do you think it was possible for AMD to beat Intel over the last 4-5 years?
The Athlon (and now AMD64) _has_ Alpha’s DNA in it, and that’s why they perform so well.
Why would anyone invest such a considerable sum to bring back Alpha? There doesn’t appear to be a vast market for it.
Considering the dollars invested in some huge failures in recent years, I can’t believe anyone is going to rush into this one.
When you get past the “wouldn’t it be cool” portion of this idea, the rest of it makes very little sense.
A few years back, there was active support for Alpha from Linux vendors like RedHat and SuSE, and even the BSDs. At that time, I was in college and installed RedHat on a handful of “obsolete” AlphaServers to get away from horrific Tru64 licensing/support costs. It worked really well for things that were open source and could be recompiled. There was even a Tru64 compat library to run Tru64 binaries. I miss the Alpha!
When I first started doing FPGA CPU design, the Alpha was already quite dead. I seriously thought about co-opting the ISA as a salute to good engineering, but then thought better of it: any of Compaq, HP, or Intel could sue me.
Still, for those who want an Alpha at any price and any performance, you could build a prototype 64-bit core that would only run at about 100 MIPS (a 2-cycle 32-bit datapath), but it would also only cost about $2 of FPGA resources. If you want more performance, you can scale that as many times as you’d like (up to a few times in a bigger FPGA) or get the design turned into an ASIC, which could probably run at about 1 GHz.
For the memory interface I’d use RLDRAM to get the memory wall down to 2.5 ns random access over main memory, so no data cache is needed. The design would be 4- or 8-way threaded like the Niagara or Raza cores. Instead, I used a simpler ISA and stayed at 32 bits.
Still, I think any outsider who touched this project would get burned. I live just a few streets from the Hudson team, so that’s too close for comfort.
The Alpha instruction set was specifically designed to avoid synchronization points and contended resources such as flag bits. This was one of the features of the ISA that permitted implementations to excel in clock speed. The handling of system-level instructions was very elegant too, allowing it to remain small in silicon while still serving the needs of past and current operating systems.
I’m not going to digress into the zillions of memory addressing modes of the 386 and the pile of instruction sets this architecture contains (8086 with segments, 80286 with segments, 386 with segments, TLBs, 48-bit extensions, and much more unused crap besides). However, a lot of people who hold up PowerPC as the panacea compared to the 386 forget to mention how inadequate it is at the system level.
There’s actually nothing really wrong with the Itanium except that it wasn’t compatible with the x86 architecture. And I’m not sure how much these Alpha engineers had to do with the Itanium. The Itanium actually started with HP, as the Merced.
But we’ll see. Maybe Apple is what Intel needed to become innovative again. I think Windows was holding Intel back in some ways.
I don’t know… Frankly, what’s so wrong with the PPC ISA versus Alpha’s ISA? OK, the PPC ISA is a bit more complex, but it isn’t bad.
And the Alpha didn’t have an interesting vector unit; one variant of the Alpha had one, but it was never produced and has little chance of ever seeing the light of day.
I would love to see an Alpha based Linux offering. I don’t think Solaris and UltraSparc would stand much of a chance if that happened.
I can see your point with the UltraSparc, but with Solaris being open now, why not just port it to the Alpha? Choice is always nice.
The Alpha instruction set was specifically designed to avoid synchronization points and contended resources such as flag bits.
PowerPC doesn’t have flag bits. ARM has them but makes good use of them for conditional execution of instructions (thus avoiding branches).
And in any case, renaming can eliminate any false dependencies on flags.
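To make the “avoiding branches” point concrete, here is a minimal C sketch (my own illustration, not anything from the posts above): the branchless max() is the shape a compiler aims for when ARM-style conditional execution or a conditional move is available, so a hard-to-predict branch never reaches the pipeline.

#include <stdio.h>

/* Branchy version: a mispredicted branch costs a pipeline flush. */
static int max_branchy(int a, int b)
{
    if (a > b)
        return a;
    return b;
}

/* Branchless version: the comparison result selects the value directly,
 * roughly what predicated or conditional-move instructions let the
 * compiler emit instead of a branch. */
static int max_branchless(int a, int b)
{
    int take_a = -(a > b);               /* all-ones mask if a > b, else 0 */
    return (a & take_a) | (b & ~take_a);
}

int main(void)
{
    printf("%d %d\n", max_branchy(3, 7), max_branchless(3, 7));
    return 0;
}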
The handling of system level instructions was very elegant
What, the microcode feature? And you’re saying those system level instructions make a significant difference in either speed or size?
memory addressing modes of the 386
Seems you’ve quickly run out of arguments. I wasn’t talking about the 386, but about your baseless claim that the Alpha instruction set is much better than that of other RISC architectures.
But while we’re at it: there’s no point defending the multitude of x86 memory models, but powerful addressing modes have actually become an advantage. That’s because they save code size, and because transistor budgets have improved faster than memory speed.
PowerPC doesn’t have flag bits.
I meant it doesn’t have implicit flag bits like x86. What it does have is eight condition register fields, along with compare and test instructions to explicitly set them and branch instructions to act on them.
<PowerPC doesn’t have flag bits. ARM has them but makes good use of them for conditional execution of instructions (thus avoiding branches).>
Last time I looked at it in the H&P book, it did: not the single set that x86 or 68k has, but IIRC 8 banks of flags, so each opcode could choose, with a 4th operand, which flag bank it was working with.
I always think of PPC as semi-RISC: mostly RISC, but with some CISC stuff still hanging in there and way too many opcodes, not what Cocke had in mind.
Alpha was far closer to the RISC spirit, maybe too much so.
Wasn’t the original Athlon bus the EV6 (or something), based off of the Alpha bus?
Okay, people have started with the “x86 is garbage” talk,
so now I ask: why is it garbage?
Please, I actually want to know why people have a problem with x86.
And no fluffy arguments based on something that doesn’t affect ANYBODY. It may be a crude hack, but if it makes no actual difference, does it really matter?
So let’s hear it: why is x86 garbage, and why does that reason matter?
I think if you had some CPU design experience and a budget to design a clean-slate CPU today, in light of today’s silicon and memory performance and without the boondoggle of having to support all the MS software, nobody in their right mind would choose to make it look anything like an x86.
Now, the Linux/*nix/BSD/embedded markets are far less of a problem and have proved to be mostly ISA-agnostic; in fact, the embedded market alone is far larger than the PC desktop market in CPU numbers, though not in dollars or high performance.
The design of the x86 makes it almost impossible to move forward. The future direction is threaded architectures to reduce the memory-wall effect, i.e. hundreds (going on many hundreds) of wasted cycles can be reduced 4 or 8 ways to something more manageable. And there is a DRAM technology that can reduce latencies down to 20 ns, rather than the 100 ns usually found in DDR, but it requires the CPU to be 8-way threaded to get the best out of it (RLDRAM from Micron).
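A back-of-envelope sketch of that claim, assuming a 1 ns cycle and 20 compute cycles between misses (my own illustrative numbers, not figures from the post): a simple round-robin model shows how 4- or 8-way threading plus lower-latency DRAM recovers most of the wasted cycles.

#include <stdio.h>

/* Each thread alternates 'work' compute cycles with one miss of 'stall'
 * cycles; the core runs one thread at a time and switches on a miss.
 * Steady-state fraction of cycles spent doing useful work: */
static double utilization(double stall, double work, int threads)
{
    double u = threads * work / (work + stall);
    return u > 1.0 ? 1.0 : u;
}

int main(void)
{
    int t[] = {1, 4, 8};
    for (int i = 0; i < 3; i++)
        printf("%d threads: DDR-like 100-cycle miss %.2f, RLDRAM-like 20-cycle miss %.2f\n",
               t[i], utilization(100.0, 20.0, t[i]), utilization(20.0, 20.0, t[i]));
    return 0;
}

With one thread the core does useful work only 17% of the time against a 100-cycle miss; with 8 threads and a 20-cycle miss it stays fully busy, which is the effect being described.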
OK, without deeper details on how a RISC architecture is _supposed_ to run/work, let’s imagine this:
a 3 GHz x86 is worth what, exactly?
Think about it…
This document contains descriptions of the Alpha and 80x86, and some strengths and weaknesses of each:
http://jbayko.sasktelwebsite.net/cpu.html
If Alpha comes back, then Sun Microsystems’ Niagara and Rock processors will be faster than Alpha!
Well, you can try asking Intel’s legal department
about the Alpha instruction set.
From what I know it is open; yep, you could be walking in a legal gray area (towards black).
Nothing stops me from creating an open core project.
Let’s call it OmegaCPU for a start.
“Create project result:
Project ‘omega’ successfully created.
Your request will be approved or rejected in 1 week. You will be informed by email.”
http://www.opencores.org/
“I think if you had some CPU design experience and a budget to design a clean-slate CPU today…”
What would such a desktop CPU look like? RISC, CISC, VLIW, EPIC? A 16-bit compact ISA or a 32-bit ISA? Vector units?
At least here in Germany. I mean normal PCs, not expensive workstations. They only ran Linux though, and at that time a “Linux desktop environment” meant fvwm. I guess they didn’t sell well…
However, they also had strong points:
– they totally smoked the x86s at heavy number crunching.
– they were less power-hungry and less noisy.
– they were cheap: only a little more expensive than standard PCs.
I almost bought one…
Itanium is the new Alpha. Like it or not, as far as design goes, it is at least as extraordinary as Alpha. So stop the nostalgia and move on…
No one cares if Alpha was/is better. It was better when it was on the market and it still failed. If a product cannot be successfully marketed, there is no point in its development. Billions were poured into Alpha and billions were wasted; Intel only took it on due to contractual issues. Intel already blew even more billions on Itanic; they have learned their lesson and are sticking to x86.
Techies do not like to be reminded that better often does not matter, I suspect a grump will request my comment be reviewed.
I would love to see an Alpha based Linux offering. I don’t think Solaris and UltraSparc would stand much of a chance if that happened.
Slashdot used to use Alpha servers running Linux. Didn’t crush UltraSparc or even Intel at the time.
Due to FPGA constraints I stick to rigid RISC and KISS principles, but I do get some performance by using N-way threading and by replicating several processors around each MMU.
In case any Athlon fans didn’t notice, the chief architect of that CPU, Atiq Raza, is long gone and is doing the same thing I am doing, except he uses the MIPS ISA and is doing an ASIC design: 8 cores, 4-way threaded, a lot like Niagara. Not surprising, since he had something to do with that too. Both CPUs avoid promising ridiculous gobs of FPU that nobody really uses, and both are 100-person efforts. They do promise continuous threaded computing with a lot less impact when branches and cache misses occur. They are not intended for folks who want one fast single-threaded CPU.
Since I have only 1% of their budgets I have to use an FPGA and only get 1/3 of ASIC performance, but at least it can be turned into an ASIC later. My per-CPU cost, though, is only $1 per 100 MIPS, but you have to buy into pervasive threading à la Transputer. I think today’s 64-bit Opterons maybe deliver 10x more CPU power, but at 100x-500x the cost. My XP2400 runs about 5x faster at maybe 20x the cost, but I can place multiple cores into a fairly small FPGA. FPU is out of the picture for now, though.
I also use a simple design that makes it easy to get 3 reads and 1 write from the local SRAM every instruction, and it also drastically simplifies the instruction decode, which uses 1 to 4 16-bit codes with only 2 formats, usually 3-register or a load/store. The datapath and ISA are a 32-bit design but use a 2-cycle 16-bit ALU. My taken-branch cost is either 1 cycle (near) or maybe 4 cycles (far), so no predictor is used. There is some I-cache, and the D-cache is replaced by 8-way banked RLDRAM.
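For readers who have never written a decoder, here is roughly what a two-format, 16-bit-parcel decode can look like in C. The field widths and the opcode split below are hypothetical, chosen only to illustrate the idea; they are not the poster’s actual encoding.

#include <stdint.h>
#include <stdio.h>

typedef struct {
    unsigned op;      /* 4-bit major opcode (hypothetical)           */
    unsigned rd;      /* destination register                        */
    unsigned ra, rb;  /* source registers (3-register format)        */
    unsigned imm;     /* small offset (load/store format)            */
    int is_ldst;      /* which of the two formats this parcel uses   */
} decoded_t;

static decoded_t decode16(uint16_t parcel)
{
    decoded_t d = {0};
    d.op = parcel >> 12;              /* top 4 bits pick the opcode          */
    d.is_ldst = (d.op >= 0x8);        /* assumption: high opcodes are ld/st  */
    d.rd = (parcel >> 8) & 0xF;
    d.ra = (parcel >> 4) & 0xF;       /* base (ld/st) or first source        */
    if (d.is_ldst)
        d.imm = parcel & 0xF;         /* 4-bit displacement                  */
    else
        d.rb = parcel & 0xF;          /* second source register              */
    return d;
}

int main(void)
{
    decoded_t d = decode16(0x1234);   /* decodes as a 3-register op here */
    printf("op=%u rd=%u ra=%u rb=%u ldst=%d\n", d.op, d.rd, d.ra, d.rb, d.is_ldst);
    return 0;
}

With only two formats and the register fields always in the same bit positions, the register-file read ports can be driven straight from the parcel bits, which is the kind of decode simplification being described.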
The only thing I add on top of that that others don’t is pervasive process and message support, as the Transputer had, plus user-level memory allocation in hardware, so new and delete are hardware-managed. The other thing I use is RLDRAM, which effectively gives me 2.5 ns DRAM access after 20 ns of latency (on paper), while I hear that AMD’s direct memory connect gives 100 ns to DDR, covered though by gobs of cache.
As for VLIW, EPIC, superscalar, OoO, vector, etc., I couldn’t care less about building single-threaded CPUs; they simply don’t deliver MIPS in proportion to transistors, clock frequency, or power used. Massively distributed slower cores have their issues too, as some of the Cell fans are finding out. If you don’t want to learn how to develop concurrent programs CSP-style, then you are stuck praying at the Intel/AMD/IBM church for incrementally less and less of the same.
It all comes down to who is in control of the parallelization, the CPU or the programmer. I don’t buy the CPU approach of trying to extract it automatically.
just my 2cycles
nice ploy and a good joke.
There is really nothing wrong with Itanium… except for performance on real-world applications.
The effort required to make it vanish was actually quite considerable, and it was not exerted by the customers – it was more of a boardroom play in the monopoly arena.
Your comments are interesting; do you have more info about what you are doing?
What’s the likelihood of systems based on your FPGA actually being constructed? And then paired up with another FPGA design, like that chip that renders 3D using real-time ray tracing?
Would patents keep something like this from ever seeing the light of day in the US?
There is still NO native OS that will run on it. It was a marvel in its time. It could be still. However, there is no way to run it at the speeds it could run at.
Very funny, Alp, calling Itanium the new Alpha. If you mean that it will die soon too, that’s true; now, as for the elegance of the design…
And the Itanium relying too much on non-existent compilers that could optimise well for VLIW (OK, it works on *some* Fortran kernels, but what about the rest??) is NOT a good idea!
Yeah, I post on comp.sys.transputer or comp.arch.fpga, trying to keep up a monthly update of announcements. I have a paper for a Sep conference on parallel computing; otherwise it will go out earlier on the web, whenever they make up their minds.
Funny you should mention that 3D ray tracer; that’s been brought to me before by someone else who helped me out with the architecture thinking. It does look like nVidia and ATI may be way over the top, as well as the CPU guys. Sometimes working on a shoestring helps clear the air and makes you think about how to do things in a different way.
My burden is doubled because I must also do the C compiler, which adds in occam and some Verilog to allow the processor to be programmed in the par language I’d like to use. GCC and/or Pthreads would hardly be any good. Inmos had the same burden with occam.
FPGA processors are quite doable today, but the rest of the crowd are all doing 120 MHz or much slower basic RISCs, about as simple as you can make them, and they suffer from the real limits of an FPGA at making logic decisions in line with the instruction decode every cycle. ASICs do 3-5x better than FPGAs, and the full-custom VLSI that Intel, AMD, and IBM use gets another 2-3x on top. But FPGA memories are quite fast. The 4-way, 2-cycle threading I use allows the FPGA to make its branch-type decisions spread over 8 pipe stages instead of just 1, so it just runs and runs, and stalls are relatively small and limited to each thread. I have an FPGA board (Xilinx Spartan-3 starter, sp3 200) and that’s good enough to prototype with. The processor element (not quite finished) has gone through place & route up to 300 MHz (high-end Virtex-II Pro) and compares with an XP300, if that existed. The MMU is still at the conceptual design stage but has gone through C simulations.
The threading approach that I, Raza, and Niagara use helps bust through the usual complexity of CPU design and replaces it with another, more manageable, type of complexity. No one, AFAIK, is trying to do process and message support in hardware today à la Transputer, and certainly no one is thinking of bringing in the MMU to help with the management of user objects.
The long-term goal is to bring back the TRAM idea: each mini credit-card module has an FPGA and DRAM with a few or many CPUs inside, plus some other function on the side such as a NIC, XVGA, PCIe, etc. One then combines these cards to build a system with as many features as you want, connected by serial links. Other folks, especially in Germany, are doing some amazing FPGA boards.
Patents I am not too worried about; everything has already been done before, I just put it back together again like Humpty Dumpty.
I can be reached offline at the USA address.
There’s actually nothing really wrong with the Itanium
How about VLIW and EPIC?
Very large instruction words, with 41-bit instructions packed three to a 128-bit bundle in the Itanium case, unsurprisingly require very large instruction caches to hold them.
Explicitly parallel instructions sound good in theory, but ignore the fact that memory latencies in today’s cache hierarchies are quite unpredictable, so that a dynamic instruction scheduler can do a much better job than a static one.
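The instruction-cache pressure is easy to put numbers on. IA-64 packs three 41-bit instructions plus a 5-bit template into each 128-bit bundle; the average x86 instruction length used below (about 3.5 bytes) is a rough assumption on my part.

#include <stdio.h>

int main(void)
{
    double ia64 = 128.0 / 3.0;   /* bits per instruction, bundle overhead included */
    double risc = 32.0;          /* fixed 32-bit RISC encoding                     */
    double x86  = 3.5 * 8.0;     /* assumed average x86 instruction length         */

    printf("IA-64: %.1f bits/insn (%.0f%% of a fixed 32-bit RISC)\n",
           ia64, 100.0 * ia64 / risc);
    printf("x86 (assumed average): %.1f bits/insn\n", x86);
    return 0;
}

So the same number of instructions needs roughly a third more I-cache than on a 32-bit RISC, and about half again as much as average x86 code, before counting the no-ops the compiler has to insert to fill bundles.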
Interesting read.
@Itanium: for those wondering what is wrong with Itanium, Wikipedia mentions some things in its Itanium article.
So what is the best design Intel could have gone with, if not VLIW/EPIC?
That is the quote of the day:
“There is really nothing wrong with Itanium… except for performance on real-world applications.”
On my machine I use real-world applications and NOT
some tweaked benchmarks.
If you’ve got any ideas, please add them to the tracker or the core discussion.
“This is automated message from http://www.opencores.org site.
Your request for project ‘Omega CPU – Alpha like clone’ was approved.”
http://www.opencores.org/projects.cgi/web/omega/overview
So what is the best design Intel could have gone with, if not VLIW/EPIC?
Considering Itanium was Intel’s third (!) attempt to escape the original stop-gap measure that was the x86 architecture, you’ve got to suspect that nothing that couldn’t execute existing x86 instructions as fast as existing processors would have had a chance.
While x86 is ugly and legacy-laden, it’s not bad enough that you can get such big improvements that the market will simply abandon it.
What could Intel have done? Something more like AMD64, obviously. Except the 64-bit instruction set didn’t have to be yet another extension of the existing one.
E.g. it could have been a clean and compact RISC design like ARM Thumb-2. In compatibility mode a frontend could have translated IA32 instructions into those, which is basically what Pentiums and Athlons do anyway. Itanium tried the same idea, but was hampered by its lack of out-of-order execution.
What every report about new chips seems to fail at is addressing the cost/performance ratio.
I have a machine with two EM64T processors here as my working desktop, which cost 2500 EUR and is 30% faster at calculating stresses of machine parts than 2 processors of our HP-UX Superdome (which has 32 processors and cost an order of magnitude more per processor than my machine).
So I can conclude: the x86-64 design is faster AND cheaper at the same time than whatever processors the HP Superdome has, and this while running code that was written for big-iron machines.
Of course it makes no sense to compare chip speeds by GHz or some other single figure, but it also makes no sense to buy an Itanium or Alpha processor at a higher price if it clearly is NOT superior.
Heck, all those so-called high-performance chips cost 5 times the money while delivering (at best, depending on the application) 2 times the speed, and most deliver even less.
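Putting rough numbers on that comparison (the Superdome per-CPU price is only stated as “an order of magnitude more”, so the 10x factor and the two-CPU framing below are assumptions on my part):

#include <stdio.h>

int main(void)
{
    double x86_price = 2500.0;                    /* EUR for the 2-socket EM64T box       */
    double x86_perf  = 1.30;                      /* 30% faster than the 2 Superdome CPUs */
    double hp_cpu    = (x86_price / 2.0) * 10.0;  /* assumed 10x the per-CPU price        */
    double hp_price  = 2.0 * hp_cpu;              /* only the 2 CPUs used in the test     */
    double hp_perf   = 1.00;

    printf("x86-64 box: %.2f perf per kEUR\n", x86_perf / (x86_price / 1000.0));
    printf("Superdome : %.2f perf per kEUR (chassis and the other 30 CPUs ignored)\n",
           hp_perf / (hp_price / 1000.0));
    return 0;
}

Even with the chassis left out of the big-iron side, that works out to roughly an order of magnitude worse price/performance, which is the commenter’s point.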
This is what I like about OpenCores.org: they will approve almost anything. Where on the page is there a permission statement from Intel, HP, or Compaq?
They allowed nnArm for a few days before pulling that.
Remember Lexra (a MIPS ISA clone) and picoChips (an ARM ISA clone): those guys had serious VC money and businesses for several years, and customers for their unlicensed clones, but they were eventually taken down.
An Alpha clone will be lucky to get 40 MHz, but an Alpha-inspired design might get much closer to 80 MHz, about what FPGAs can do for a simple 64-bit datapath.
Anyway, good luck to you; you will learn a lot along the way.