According to heise.de and news.com, AMD finalized its CPU roadmap this Friday. Opteron is scheduled for April, Athlon 64 for September. The Barton core will debut on February 10th in the Athlon XP 3000+. Barton increases the L2 cache from 256 kByte (Thoroughbred-B) to 512 kByte, and will probably run only with FSB333. An Athlon XP 3200+ will probably appear by the middle of the year.
Looks like Intel won the speed (not performance) race, but it’s far from winning the price/performance race.
It is interesting to note in the article that some of the speed gains in this chip are not from an increase in clock speed, but an increased cache and bus speed. That is a strong benefit of the “XP Rating” AMD uses.
It will be interesting to see how their XP 3000+ chip compares to the Intel P4 3.06 GHz part, especially in multithreaded situations with Intel’s hyperthreading functionality.
Why not add 32 more GP registers?
Considering that x86 currently only really has 6 (eax, ebx, ecx, edx, esi, edi), 10 more would be mighty welcome. They might consider 32 to be overkill, though it would be a dream to program assembly with.
Registers are added in x86-64.
Only 16 new registers are added because that’s how many are easy to add using the x86 opcode encoding.
Not only is it hard to get to more registers, the way it is done is a really ugly hack. There is only room in the instruction itself for the 8 registers of the x86 (only 6 of which are generally usable). To enable the use of more, AMD uses the horrid old x86 instruction prefix mechanism. That is, you can prefix an instruction with one or more modifiers, and AMD adds a bunch that mean things along the lines of “Add 8 to the first register reference in this instruction” and so on, for all combinations one might need.
Quite horrid, I’d say, adding a full 16 new prefixes to the pile of crap that is the x86 instruction encoding.
There is just one prefix for distinguishing r8-r15 from rax et al., and another prefix to access the higher 32 bits instead of the lower 32 bits of any 64-bit register while doing 32-bit stuff on them. That is just 2 prefixes. They could have used a third one for 16 more registers (or just 8 more, if they don’t want to use both prefixes for a given opcode).
They could have even destroyed some of the usual x86 mnemonic->opcode mappings in 64-bit mode for a prefix-less extra register addition. The real reason is diminishing returns: 16 registers are quite sufficient for an aggressive register-renaming OOE core.
Actually I don’t think I am mistaken. I believe they use several prefixes to allow instructions using more than one register to use registers from different “blocks”. I will have to read up on it though, so I will back down for the time being.
How I wish AMD had gone a more incompatible way anyway, scrapping the x86 instruction encoding and just supporting the actually useful part of the instruction set in long mode. It would have been fairly easy to translate binaries once, so compatibility would have been quite decent. What a good world to live in it would have been.
Why didn’t it add more GP registers?
There is a new prefix called the REX prefix. It contains 4 bits which extend the meaning of the following instruction.
REX.W operand width (when set means 64 bit operand)
The following bits, when set, add 8 to the number of the register being referenced.
REX.R modifies the “reg” field in ModRM, supplying its bit 3
REX.X modifies the “index” in SIB
REX.B either modifies “r/m” in ModRM or “base” in SIB or modifies “reg” field for accessing GPR, control or debug registers.
The prefix byte values are borrowed from the x86 instruction set: opcodes 0x40-0x4F, which are the single-byte INC/DEC instructions outside 64-bit mode.
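To make the bit layout above concrete, here is a minimal Python sketch of how a decoder might split a REX byte and apply REX.R and REX.B to a ModRM byte. The function names decode_rex and modrm_registers are made up for illustration, and the sketch assumes the register-direct form (mod = 11) with no SIB byte, so REX.X is decoded but not used.

```python
def decode_rex(byte):
    """Split a REX prefix byte (0x40-0x4F) into its W, R, X and B bits."""
    if not 0x40 <= byte <= 0x4F:
        raise ValueError("not a REX prefix")
    return {
        "W": (byte >> 3) & 1,  # 1 = 64-bit operand size
        "R": (byte >> 2) & 1,  # bit 3 of ModRM.reg
        "X": (byte >> 1) & 1,  # bit 3 of SIB.index
        "B": byte & 1,         # bit 3 of ModRM.r/m or SIB.base
    }


def modrm_registers(rex_byte, modrm_byte):
    """Return (reg, rm) register numbers 0-15 for a register-direct ModRM.

    Hypothetical helper: it ignores memory operands and the SIB byte.
    """
    rex = decode_rex(rex_byte)
    reg = (modrm_byte >> 3) & 0b111  # 3-bit field from the instruction
    rm = modrm_byte & 0b111          # 3-bit field from the instruction
    # Each REX bit contributes the fourth (high) bit, i.e. "adds 8".
    return (rex["R"] << 3) | reg, (rex["B"] << 3) | rm


# Byte sequence 4D 89 C8: REX=0x4D, opcode 0x89 (MOV r/m, reg), ModRM=0xC8
print(decode_rex(0x4D))
print(modrm_registers(0x4D, 0xC8))  # reg field selects r9, r/m field selects r8
```

With the same opcode and ModRM byte but REX=0x48 (only W set), the two fields stay at 1 and 0, which are rcx and rax. So a single prefix byte is enough to pick any of the 16 registers for each operand.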