ALU/CPU

ALU/CPU Registers

IA-64 provides a large register set:

128 64-bit general-purpose (integer) registers.

128 82-bit floating-point registers.

64 1-bit predicate registers.

Eight 64-bit branch registers.

Eight 64-bit kernel registers.

One current frame marker (CFM) register, 38 bits wide.

One 64-bit Instruction pointer.

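128 1-bit NaT (Not a Thing) bits, one per general-purpose register.

For the floating-point registers, the same condition is marked by a special register value, NaTVal, rather than by a separate bit.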

 

Addressable Memory

IA-64 memory is byte addressable, and addresses are 64 bits wide. IA-64 also supports the 32-bit pointers used by IA-32 code.

Byte Ordering

IA-64 can be configured to access data in either big-endian or little-endian byte order. Instruction fetches, however, are always little-endian.
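To make the two byte orders concrete, the following minimal sketch serializes one 64-bit value both ways. The value itself is arbitrary and chosen only for illustration; nothing here is IA-64 specific beyond the 8-byte width.

```python
# An arbitrary 64-bit value, chosen so that every byte is distinct.
value = 0x1122334455667788

# Little-endian: least significant byte is stored at the lowest address.
little = value.to_bytes(8, byteorder="little")

# Big-endian: most significant byte is stored at the lowest address.
big = value.to_bytes(8, byteorder="big")

print(little.hex())  # 8877665544332211
print(big.hex())     # 1122334455667788
```

Reading the bytes back with the wrong byte order yields a different number, which is why the data byte order has to be agreed on while instruction fetches stay fixed at little-endian.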

Integer Representation

Integers are represented as unsigned or two's complement values.
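As a small illustration, the same 64-bit pattern can be read as either an unsigned or a two's complement number. The helper names below are hypothetical and exist only for this sketch.

```python
MASK64 = (1 << 64) - 1  # all 64 bits set


def to_bits(x: int) -> int:
    """Return the 64-bit pattern that holds x (wraps modulo 2**64)."""
    return x & MASK64


def as_signed(bits: int) -> int:
    """Interpret a 64-bit pattern as a two's complement integer."""
    bits &= MASK64
    # If bit 63 (the sign bit) is set, the value is bits - 2**64.
    return bits - (1 << 64) if bits >> 63 else bits


pattern = to_bits(-1)
print(hex(pattern))        # 0xffffffffffffffff
print(as_signed(pattern))  # -1
print(as_signed(1 << 63))  # -9223372036854775808
```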

Floating Point Representation

Floating-point register values occupy 82 bits: 1 bit for the sign, 17 bits for the exponent, and 64 bits for the significand (mantissa). The format is compatible with the IEEE 754 standard.
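The 1 + 17 + 64 split can be sketched as follows. This is a minimal illustration, assuming the fields are packed from most to least significant in the order sign, exponent, significand, that the exponent bias is 65535 (0xFFFF), and that the significand carries an explicit integer bit. Special encodings (infinities, NaNs, NaTVal) and denormals are deliberately ignored, and the function names are hypothetical.

```python
from fractions import Fraction

EXP_BIAS = 0xFFFF  # assumed bias for the 17-bit exponent


def split_fp82(bits: int):
    """Split an 82-bit register image into (sign, exponent, significand)."""
    significand = bits & ((1 << 64) - 1)  # bits 63..0
    exponent = (bits >> 64) & 0x1FFFF     # bits 80..64
    sign = (bits >> 81) & 1               # bit 81
    return sign, exponent, significand


def fp82_value(bits: int) -> Fraction:
    """Exact value of a normal number; special encodings are not handled."""
    sign, exponent, significand = split_fp82(bits)
    # The significand has an explicit integer bit, so scale by 2**63.
    magnitude = Fraction(significand, 1 << 63) * Fraction(2) ** (exponent - EXP_BIAS)
    return -magnitude if sign else magnitude


one = (0xFFFF << 64) | (1 << 63)
minus_two_and_a_half = (1 << 81) | (0x10000 << 64) | (0xA << 60)

print(fp82_value(one))                   # 1
print(fp82_value(minus_two_and_a_half))  # -5/2
```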

 

Instruction Types & Supported Data Type

M Memory/Move Operations

I Complex Integer/Multimedia Operations

A Simple Integer/Logic/Multimedia Operations

F Floating Point Operations (Normal/SIMD)

B Branch Operations

 

Instruction Classes

Logical operations (e.g. and)

Arithmetic operations (e.g. add)

Compare operations

Shift operations

Multimedia operations (e.g. padd)

Branches

Loop controlling branches

Floating Point operations (e.g. fma)

SIMD Floating Point operations (e.g. fpma)

Memory operations

Move operations

Cache Management operations

 

Instruction Format

The bundle is the 'packaging entity':

3 x 41-bit instruction slots

5 bits for the template (determines the mapping of instruction slots to execution units)

Typical templates: MFI or MIB

The template also encodes the bundle break (stop bit "S")

A bundle is 16 bytes (128 bits):

Basic unit for expressing parallelism

The unit that the Instruction Pointer points to

The unit you branch to

The number of instructions actually executed in parallel may be less than, equal to, or more than one bundle

Bundle layout (128 bits): Slot 1 (41 bits) | Slot 2 (41 bits) | Slot 3 (41 bits) | T (template, 5 bits)
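The bundle arithmetic (3 x 41 + 5 = 128 bits) can be sketched as a pack/unpack pair. This is a minimal illustration, assuming the template occupies the five least significant bits with the three slots following in ascending bit order, and that the 16 bytes are read little-endian as instruction fetches are. The function names and the sample values are hypothetical.

```python
SLOT_BITS = 41
TEMPLATE_BITS = 5
SLOT_MASK = (1 << SLOT_BITS) - 1
TEMPLATE_MASK = (1 << TEMPLATE_BITS) - 1


def pack_bundle(template: int, slots) -> bytes:
    """Pack a 5-bit template and three 41-bit slots into 16 bytes."""
    if len(slots) != 3:
        raise ValueError("a bundle holds exactly three slots")
    word = template & TEMPLATE_MASK
    for index, slot in enumerate(slots):
        word |= (slot & SLOT_MASK) << (TEMPLATE_BITS + index * SLOT_BITS)
    return word.to_bytes(16, byteorder="little")


def unpack_bundle(raw: bytes):
    """Split a 16-byte bundle into (template, (slot, slot, slot))."""
    if len(raw) != 16:
        raise ValueError("a bundle is exactly 16 bytes")
    word = int.from_bytes(raw, byteorder="little")
    template = word & TEMPLATE_MASK
    slots = tuple(
        (word >> (TEMPLATE_BITS + index * SLOT_BITS)) & SLOT_MASK
        for index in range(3)
    )
    return template, slots


# Arbitrary illustrative values, not real instruction encodings.
raw = pack_bundle(0x0C, (0x123, 0x1FFFFFFFFFF, 0x456))
template, slots = unpack_bundle(raw)
print(len(raw), hex(template), [hex(s) for s in slots])
```

Because the instruction pointer and branch targets address whole bundles, a decoder of this shape always starts from a 16-byte aligned unit and never from an individual slot.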

 

There is no single 'unique' instruction format; typical examples:

(p20) ld4 r15=[r30],r8

Load a 4-byte integer from the address in r30, then post-increment r30 by the stride in r8

(p4) fma.d.s0 f35=f32,f33,f127

U = X * Y + Z

(p2) add r15=r3,r49,1

C = A + B + 1
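The effect of the qualifying predicate in parentheses can be sketched by emulating two of the examples above. This is a simplified model, not an emulator: registers are plain Python lists, memory is a bytes object, the load is assumed to be little-endian and zero-extended, and all function names are hypothetical. An instruction whose predicate is false has no effect.

```python
MASK64 = (1 << 64) - 1


def add_plus_one(regs, preds, qp, r1, r2, r3):
    """Model of: (qp) add r1=r2,r3,1"""
    if preds[qp]:
        regs[r1] = (regs[r2] + regs[r3] + 1) & MASK64


def ld4_post_increment(regs, preds, mem, qp, r1, r3, r2):
    """Model of: (qp) ld4 r1=[r3],r2"""
    if preds[qp]:
        addr = regs[r3]
        # Load 4 bytes and zero-extend to 64 bits.
        regs[r1] = int.from_bytes(mem[addr:addr + 4], "little")
        # Post-increment the base register by the stride register.
        regs[r3] = (addr + regs[r2]) & MASK64


regs = [0] * 128
preds = [False] * 64
mem = bytes(range(16))  # toy memory: byte i holds the value i

# (p2) add r15=r3,r49,1 with p2 true
preds[2] = True
regs[3], regs[49] = 10, 20
add_plus_one(regs, preds, qp=2, r1=15, r2=3, r3=49)
after_add = regs[15]  # 31

# (p20) ld4 r15=[r30],r8 with p20 false: nothing changes
regs[30], regs[8] = 4, 8
ld4_post_increment(regs, preds, mem, qp=20, r1=15, r3=30, r2=8)
after_skipped_load = (regs[15], regs[30])  # (31, 4)

# Same instruction with p20 true
preds[20] = True
ld4_post_increment(regs, preds, mem, qp=20, r1=15, r3=30, r2=8)
print(after_add, after_skipped_load, hex(regs[15]), regs[30])
```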

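FMA format (41 bits):

Field:  Opcode++ | R4 | R3 | R2 | R1 | qp
Bits:   7        | 7  | 7  | 7  | 7  | 6

Add format (41 bits):

Field:  Opcode | Flags | R3 | R2 | R1 | qp
Bits:   7      | 7     | 7  | 7  | 7  | 6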
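The FMA field layout listed above (Opcode++, R4, R3, R2, R1, qp with widths 7, 7, 7, 7, 7, 6) adds up to one 41-bit slot. The sketch below extracts and reassembles those fields, assuming qp sits in the least significant bits and the remaining fields follow in ascending order. The function names and the sample field values are hypothetical and do not represent a real instruction encoding.

```python
# (field name, width in bits), listed from least to most significant.
FMA_FIELDS = [
    ("qp", 6),
    ("r1", 7),
    ("r2", 7),
    ("r3", 7),
    ("r4", 7),
    ("opcode", 7),  # the "Opcode++" column: opcode plus extension bits
]


def decode_slot(slot: int) -> dict:
    """Split a 41-bit slot into its named fields."""
    fields = {}
    shift = 0
    for name, width in FMA_FIELDS:
        fields[name] = (slot >> shift) & ((1 << width) - 1)
        shift += width
    return fields


def encode_slot(fields: dict) -> int:
    """Inverse of decode_slot: pack named fields into a 41-bit slot."""
    slot = 0
    shift = 0
    for name, width in FMA_FIELDS:
        slot |= (fields[name] & ((1 << width) - 1)) << shift
        shift += width
    return slot


total_bits = sum(width for _, width in FMA_FIELDS)  # 41

# Illustrative field values only.
sample = {"qp": 4, "r1": 35, "r2": 127, "r3": 32, "r4": 33, "opcode": 0x40}
slot = encode_slot(sample)
print(total_bits, decode_slot(slot) == sample)  # 41 True
```

The Add format has the same shape: replacing the R4 entry with a 7-bit flags field gives the second table, so the same two helpers would work with a different field list.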