www.Tutorialsforu.info

Free Tutorials Cave

Cluster Nodes - Page 3


2.1.3 Intel IA-64 Processors

IA-64 is a 64-bit processor architecture developed jointly by Intel and Hewlett-Packard for processors such as Itanium and Itanium 2. The goal of Itanium is to usher in a post-RISC era of architecture using a very long instruction word (VLIW) design. Unlike previous Intel x86 processors, the Itanium is not geared towards high-performance execution of the IA-32 (x86) instruction set.

Architecture

A key feature of the IA-64 is its 64-bit instruction set architecture, which applies a processor design approach known as EPIC (Explicit Parallel Instruction Computing). Another key feature is that it is fully compatible with the IA-32 instruction set.

In a mainstream design, a complex decoder examines instructions as they flow through the pipeline and determines which can be executed in parallel across different execution units. This ability to extract instruction-level parallelism (ILP) from the instruction stream is essential to good performance in a modern CPU. However, predicting which code can and cannot be split up this way is a complex task. Consider an IF statement: even when the calculations on either side are independent of one another, the THEN branch needs the result of the IF condition to know whether it should proceed at all. In these cases the circuitry on the CPU typically "guesses" what the condition will be. If the guess is wrong, the performance penalty is significant, because the wrong result has to be discarded and the CPU must wait for the right one.

The IA-64 relies on the compiler for this task. The compiler examines the code and makes, ahead of time, the decisions that would otherwise be made at run time on the chip itself. Once it has decided which path to take, it gathers up the instructions and stores them in VLIW form in the program.
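The compiler-side grouping described above can be illustrated with a toy scheduler. This is only a sketch under stated assumptions: the `Instr` type, the `group_independent` function and the group width of 3 are hypothetical names and values chosen for illustration, and the greedy in-order strategy is far simpler than what a real IA-64 compiler does.

```python
from typing import NamedTuple


class Instr(NamedTuple):
    dest: str      # register written by the instruction
    srcs: tuple    # registers read by the instruction
    text: str      # human-readable form, for display only


def group_independent(instrs, width=3):
    """Greedy, in-order grouping of instructions that can issue together.

    A new group starts when the next instruction reads or rewrites a
    register written earlier in the current group, or when the group
    already holds `width` instructions (an assumed limit).
    """
    groups, current, written = [], [], set()
    for ins in instrs:
        depends = ins.dest in written or any(s in written for s in ins.srcs)
        if current and (depends or len(current) == width):
            groups.append(current)
            current, written = [], set()
        current.append(ins)
        written.add(ins.dest)
    if current:
        groups.append(current)
    return groups


program = [
    Instr("r1", ("r10", "r11"), "r1 = r10 + r11"),
    Instr("r2", ("r12", "r13"), "r2 = r12 * r13"),
    Instr("r3", ("r1", "r2"), "r3 = r1 - r2"),   # needs r1 and r2
    Instr("r4", ("r14",), "r4 = r14 + 1"),
]

groups = group_independent(program)
for number, group in enumerate(groups):
    print(number, [ins.text for ins in group])
```

The first two instructions share a group because neither reads the other's result; the third depends on both and therefore opens a new group. The point of the sketch is that this analysis happens once, at compile time, rather than in decoder hardware on every execution.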

This strategy of moving the task from the CPU to the compiler is one of the major advantages of the IA-64. Offloading the whole prediction task to the compiler greatly reduces the complexity of the circuitry, since prediction logic can be very complicated. Furthermore, the compiler can spend more time examining the code, which the chip itself cannot do because it has to complete the task as quickly as possible. The Itanium architecture provides mechanisms such as instruction templates, branch hints and cache hints to enable the compiler to communicate compile-time information to the processor. It also allows compiled code to manage the processor hardware using run-time information. These compiler-to-processor communication mechanisms are vital in minimizing the performance penalties associated with branches and cache misses.

The disadvantage of this, however, is that a program's run-time behaviour is sometimes not obvious from the code. It also makes the VLIW strategy heavily dependent on the quality of the compilers, so there is a trade-off between reducing microprocessor complexity and increasing compiler complexity.

Registers

This section briefly reviews some of the registers available in IA-64. The IA-64 includes 128 64-bit integer registers and 128 82-bit floating point registers. Besides the sheer number of registers, the IA-64 also adds a register rotation mechanism, controlled by the Register Stack Engine, which allows the processor to rotate in a set of new registers to accommodate new function parameters or temporaries.

General registers

A set of 128 (64-bit) general registers provides the resource for all integer and integer multimedia computation. These are numbered GR0 through GR127. Each general register has 64 bits of normal data storage plus an additional bit, called the NaT bit, to track deferred speculative exceptions. The general registers are partitioned into two sets: GR0 to GR31 are termed static general registers, while GR32 to GR127 are called stacked general registers. GR8 to GR31 contain the IA-32 integer, segment selector and segment descriptor registers.
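The role of the NaT bit in tracking deferred speculative exceptions can be modelled in a few lines. This is a minimal sketch, not a description of the hardware: the `GR` class and the functions `speculative_load`, `add` and `check` are hypothetical names, and a missing dictionary key stands in for a load that would have faulted.

```python
from dataclasses import dataclass

MASK64 = (1 << 64) - 1


@dataclass(frozen=True)
class GR:
    """Toy general register: 64 bits of data plus the NaT bit."""
    value: int = 0
    nat: bool = False


def speculative_load(memory, addr):
    """Load early; on a would-be fault, set NaT instead of raising."""
    if addr not in memory:
        return GR(0, nat=True)          # exception is deferred
    return GR(memory[addr] & MASK64)


def add(a, b):
    """The NaT bit propagates through dependent computation."""
    return GR((a.value + b.value) & MASK64, a.nat or b.nat)


def check(reg):
    """True if the speculation succeeded; otherwise recovery code is needed."""
    return not reg.nat


memory = {0x1000: 41}
ok = add(speculative_load(memory, 0x1000), GR(1))
bad = add(speculative_load(memory, 0x2000), GR(1))
print(ok, check(ok))
print(bad, check(bad))
```

Deferring the exception lets a compiler hoist a load above the branch that guards it: if the guarded path is never taken, the poisoned value is simply never checked.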

Floating point registers

There are 128 (82-bit) floating point registers. These are numbered FR0 to FR127 and are likewise partitioned into two subsets: FR0 to FR31 are called static floating point registers, while FR32 to FR127 are called rotating floating point registers. Floating point registers FR8 to FR31 contain the IA-32 floating point and multimedia registers while executing IA-32 instructions.
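The split between static and rotating floating point registers can be pictured as a renaming step applied only to FR32 to FR127. The sketch below is a simplified model: the function name `physical_fr`, the parameter `rrb` (a rename base offset) and the modular formula are illustrative assumptions, not taken from this article.

```python
FIRST_ROTATING = 32     # FR0..FR31 are static
NUM_ROTATING = 96       # FR32..FR127 rotate


def physical_fr(logical, rrb):
    """Map a logical FR number to a physical one under rename base `rrb`."""
    if not 0 <= logical <= 127:
        raise ValueError("floating point registers are FR0..FR127")
    if logical < FIRST_ROTATING:
        return logical                                  # static: no renaming
    offset = (logical - FIRST_ROTATING + rrb) % NUM_ROTATING
    return FIRST_ROTATING + offset


# With rrb = 0 the mapping is the identity.
print(physical_fr(32, 0))
# After the base moves down by one (95 is -1 modulo 96), the value
# written as FR32 is now visible under the name FR33.
print(physical_fr(33, 95))
```

In this model, changing the base once per loop iteration gives each iteration a fresh set of register names without copying any data, which is what makes rotation useful for overlapping loop iterations.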

Register Stack Configuration register

The RSC register is a 64 bit register used to control the operation of the Register Stack engine (RSE). Instructions that modify RSC can never set the privilege level field to a more privileged level than the currently executing process.

Predicate registers

A set of 64 (1-bit) predicate registers is used to hold the results of compare instructions. These are numbered PR0 to PR63 and are used for conditional execution of instructions. They are further partitioned into two subsets: static predicate registers (PR0 to PR15) and rotating predicate registers (PR16 to PR63).
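Conditional execution through predicate registers can be sketched as follows. This is a toy model under stated assumptions: the helpers `compare_eq` and `predicated_move`, the choice of PR6 and PR7, and the register name "r8" are all hypothetical. The sketch shows a compare writing a predicate and its complement, after which both arms of an IF are issued and only the arm whose predicate is set takes effect.

```python
def compare_eq(pr, p_true, p_false, a, b):
    """Compare: write the result to one predicate, its complement to another."""
    result = (a == b)
    pr[p_true] = result
    pr[p_false] = not result


def predicated_move(pr, qp, regs, dest, value):
    """The move takes effect only if its qualifying predicate is set."""
    if pr[qp]:
        regs[dest] = value


def equal_flag(a, b):
    """Branch-free version of: r8 = 1 if a == b else 0."""
    pr = [False] * 64       # PR0..PR63
    regs = {}
    compare_eq(pr, 6, 7, a, b)
    predicated_move(pr, 6, regs, "r8", 1)   # THEN arm, guarded by PR6
    predicated_move(pr, 7, regs, "r8", 0)   # ELSE arm, guarded by PR7
    return regs["r8"]


print(equal_flag(3, 3))
print(equal_flag(3, 4))
```

Because neither arm is reached through a jump, there is nothing for the processor to guess, which is how predication avoids the misprediction penalty for short conditionals.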

Branch registers

A set of 8 (64-bit) branch registers, numbered BR0 to BR7, is used to hold branch information, such as the target addresses of indirect branches.

Instruction set

The architecture provides a CISC-like complement of instructions, with explicit instructions for both floating point and multimedia operations. The Itanium supports several bundle mappings to allow more ways of mixing instructions, and it balances serial and parallel execution modes. Room has also been left in the initial bundle encodings so that additional mappings can be added in future versions of IA-64.
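The idea of a bundle whose encoding selects a mapping can be made concrete with a small packing sketch. The layout used here (a 128-bit bundle holding a 5-bit template in the low bits followed by three 41-bit instruction slots) is not stated in this article; it is the commonly documented IA-64 bundle format and is treated as an assumption. The function names and the sample template and slot values are hypothetical.

```python
TEMPLATE_BITS = 5
SLOT_BITS = 41
TEMPLATE_MASK = (1 << TEMPLATE_BITS) - 1
SLOT_MASK = (1 << SLOT_BITS) - 1


def pack_bundle(template, slots):
    """Pack a template and three instruction slots into one integer."""
    if len(slots) != 3:
        raise ValueError("a bundle holds exactly three slots")
    if template & ~TEMPLATE_MASK:
        raise ValueError("template does not fit in 5 bits")
    bundle = template
    for index, slot in enumerate(slots):
        if slot & ~SLOT_MASK:
            raise ValueError("slot does not fit in 41 bits")
        bundle |= slot << (TEMPLATE_BITS + index * SLOT_BITS)
    return bundle


def unpack_bundle(bundle):
    """Split a bundle back into its template and three slots."""
    template = bundle & TEMPLATE_MASK
    slots = tuple(
        (bundle >> (TEMPLATE_BITS + index * SLOT_BITS)) & SLOT_MASK
        for index in range(3)
    )
    return template, slots


bundle = pack_bundle(0x10, (0x123, 0x456, 0x789))
print(hex(bundle))
print(unpack_bundle(bundle))
```

A 5-bit field allows 32 distinct template values, so an architecture that initially defines fewer than 32 mappings leaves encodings free for later additions.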

Despite the extensive capabilities of the IA-64 instruction set, it is notoriously difficult to program directly. Intel discourages assembly programming on Itanium and instead urges the use of the Intel C++ compiler, which has platform-specific heuristics.



 
