Poring through Intel's PDF files, I was struck first by a marked absence of anything to do with 32-bit compatibility. That's strange, since this was initially such a big marketing point for Intel (see May 10th column).
But the company has recently changed its tune. The new marketing pitch is that 32-bit code will run best on whichever 32-bit processor is Intel's fastest at the time. That certainly makes sense, since Intel has extended its 32-bit road map with a series of robust chips stretching well past 2000. Now, Intel can position IA-64 as a true workhorse for workstations and servers. Still, the absence of detail on 32-bit conversion tends to give credence to the belief that it will be done in hardware.
More clearly delineated are all the arcane permutations of the IA-64 instruction formats. Each IA-64 instruction is categorized into one of six instruction types: A (Integer ALU), I (Non-ALU Integer), M (Memory), F (Floating-Point), B (Branch), or L+X (Extended). Three instructions are grouped together in a 128-bit-wide instruction bundle for simultaneous execution.
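For readers who like to see the bits, here is a minimal sketch of how three instructions can share one 128-bit bundle. The layout it assumes (a 5-bit template field followed by three 41-bit instruction slots) comes from the published IA-64 bundle format rather than from anything quoted in this column, and the function names are made up purely for illustration.

```python
TEMPLATE_BITS = 5                     # assumed: template field in the low bits
SLOT_BITS = 41                        # assumed: width of each instruction slot
SLOT_MASK = (1 << SLOT_BITS) - 1
TEMPLATE_MASK = (1 << TEMPLATE_BITS) - 1


def pack_bundle(template, slot0, slot1, slot2):
    """Pack a template and three instruction slots into one 128-bit integer."""
    if not 0 <= template <= TEMPLATE_MASK:
        raise ValueError("template must fit in 5 bits")
    bundle = template
    for position, slot in enumerate((slot0, slot1, slot2)):
        if not 0 <= slot <= SLOT_MASK:
            raise ValueError("each slot must fit in 41 bits")
        # Slot 0 sits just above the template, slot 2 in the top bits.
        bundle |= slot << (TEMPLATE_BITS + position * SLOT_BITS)
    return bundle


def unpack_bundle(bundle):
    """Split a 128-bit bundle back into (template, (slot0, slot1, slot2))."""
    template = bundle & TEMPLATE_MASK
    slots = tuple(
        (bundle >> (TEMPLATE_BITS + position * SLOT_BITS)) & SLOT_MASK
        for position in range(3)
    )
    return template, slots
```

The arithmetic is the point: 5 + 3 x 41 = 128, so the three slots and the template fill the bundle exactly. In the real encoding the template tells the processor which kind of execution unit each slot is destined for; this sketch treats it as an opaque number.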
On the multimedia front, IA-64's instructions are semantically compatible with the existing MMX and Streaming SIMD instruction technology. The latter are the opcodes informally dubbed MMX2, which appear in the Pentium III processor. Intel says that early on it hoped 3-D and streaming audio and video would be the "killer applications" driving PC sales to new heights. Those apps haven't really caught fire, though the MMX instructions live on.
Chasing Alpha
In a bid to put its chip on an equal footing with RISC competitors such as Alpha (which are much beloved -- sometimes unjustly -- by the technically savvy), Intel has long hyped Merced's floating-point performance. In that vein, Merced is equipped with 128 floating-point registers. Of these, 96 are rotating (not stacked) and can be used to modulo-schedule loops compactly. IA-64 also has parallel floating-point instructions that can operate on two 32-bit single-precision numbers at once.
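To make "rotating" a little less abstract, here is a toy model of the idea, not Intel's implementation. It assumes registers f0 through f31 are fixed, f32 through f127 rotate, and each rotation decrements a rename base so that a value written to f32 in one loop iteration shows up as f33 in the next. The class and function names are hypothetical, and real hardware performs the rotation as part of a loop branch rather than through an explicit call.

```python
class RotatingFloatFile:
    """Toy register file: f0-f31 static, f32-f127 rotating (assumed split)."""

    STATIC = 32
    ROTATING = 96

    def __init__(self):
        self.physical = [0.0] * (self.STATIC + self.ROTATING)
        self.rename_base = 0  # adjusted on every rotation

    def _index(self, reg):
        if not 0 <= reg < self.STATIC + self.ROTATING:
            raise IndexError("no such register")
        if reg < self.STATIC:
            return reg  # static registers are never renamed
        offset = (reg - self.STATIC + self.rename_base) % self.ROTATING
        return self.STATIC + offset

    def read(self, reg):
        return self.physical[self._index(reg)]

    def write(self, reg, value):
        self.physical[self._index(reg)] = value

    def rotate(self):
        # After this, the old f32 is visible as f33, the old f33 as f34, ...
        self.rename_base = (self.rename_base - 1) % self.ROTATING


def pipelined_double(values):
    """Two-stage software pipeline: 'load' in one iteration, use in the next."""
    regs = RotatingFloatFile()
    out = []
    for i in range(len(values) + 1):      # one extra pass drains the pipeline
        if i < len(values):
            regs.write(32, values[i])     # stage 1: new value lands in f32
        if i >= 1:
            out.append(2 * regs.read(33)) # stage 2: last pass's value is now f33
        regs.rotate()
    return out
```

Calling `pipelined_double([1.0, 2.5, 4.0])` doubles each element while overlapping the two stages. The loop body always names the same two registers, which is why modulo-scheduled loops can stay compact instead of being unrolled with a fresh register name per iteration.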
Intel's manuals also provide perhaps more detail than anyone should have to absorb on the complexities of speculation and predication -- the two VLIW-like techniques at the heart of IA-64. (Here, I have to add the usual disclaimer, lest I get a phone call from Intel: According to Intel, Merced is not a VLIW architecture. It is an EPIC design. However, it does incorporate VLIW concepts, the company says.)
Predication removes branches from code by executing the instructions on both sides of a branch at the same time. The results of instructions that wouldn't have been executed during a real-world sequential run through the code are then thrown out. Speculation masks memory latency by moving load instructions forward so that they are initiated as early as possible in the program flow.
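Both ideas can be mimicked in a few lines of ordinary code. What follows is a rough sketch of the concepts, not a model of how Merced does it. The function names are invented for illustration, and the "deferred fault" token used for the speculative load is my simplification of how a too-early load can be kept from raising an error the original program would never have seen.

```python
def branchy_abs_diff(a, b):
    """Conventional version: the processor must guess which way the branch goes."""
    if a > b:
        return a - b
    return b - a


def predicated_abs_diff(a, b):
    """Predicated version: compute both arms, commit only the one whose predicate is true."""
    p_true = a > b            # one compare yields a predicate and its complement
    p_false = not p_true
    candidates = [
        (p_true, a - b),      # arm guarded by p_true
        (p_false, b - a),     # arm guarded by p_false
    ]
    committed = [value for predicate, value in candidates if predicate]
    return committed[0]       # exactly one predicate is true, so one result survives


DEFERRED = object()           # hypothetical token marking a load that would have faulted


def speculative_load(memory, address):
    """A hoisted load never raises; a bad address just yields the token."""
    return memory.get(address, DEFERRED)


def guarded_increment(memory, address, wanted):
    """Start the load before knowing whether its result is needed."""
    value = speculative_load(memory, address)  # hoisted above the branch
    if not wanted:
        return 0              # result ignored, and no fault was ever raised
    if value is DEFERRED:     # check at the point where the load originally sat
        raise KeyError(address)
    return value + 1
```

The trade-off shows up even in the toy version: predication spends work on an arm whose result is discarded in exchange for having no branch to mispredict, and speculation starts the slow memory access early at the cost of a check later on.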