Enhanced Bus Efficiency
The PCI-X bus incorporates the following technologies to improve the bus efficiency:
- Attribute Phase: PCI-X includes a new transaction phase called the attribute phase that uses a 36-bit attribute field to describe bus transactions in more detail than conventional PCI allows. The attribute phase includes the following enhancements:
  - Relaxed ordering: If the device driver or the controlling software sets this bit, the transaction is permitted to pass previously posted transactions from other devices. Relaxed ordering is especially important in applications such as audio or video streaming, where a delay in information would cause a noticeable interruption.
  - Non-Cache-Coherent Transactions: Cache coherency refers to maintaining a consistent view of memory between the processors and the I/O subsystem during a transaction. On the PCI bus, whenever a device reads from or writes to main memory, the processor has to perform a snoop operation to make sure that the data does not exist in the cache memory. These snoop cycles limit system performance by adding traffic. PCI-X allows non-cache-coherent transactions by means of a "don't snoop" bit. If a device driver or software sets this bit, the PCI-X device informs the system cache controllers that no query is needed.
  - Transaction Byte Count: In the PCI protocol, the bridge fetches a default number of cache lines (one or two) for every data request because it has no way of knowing how much data will be requested. With PCI-X, the bridge knows exactly how much data to fetch because the byte count is included in the attribute field. Each PCI-X transaction in a sequence identifies the total number of bytes remaining to be read or written in its associated sequence. This enables more efficient buffer management schemes in the bridge as well as more efficient utilization of the bus and other system resources.
  - Sequence number: The sequence number uniquely identifies transactions that are part of the same sequence. The sequence number is used to increase efficiency in the buffer management algorithms.
- Split Transaction Support: The conventional PCI protocol supports delayed transactions. With a delayed transaction, the device requesting data must poll the target to determine when the request has been completed. With a split transaction in PCI-X, the device requesting data sends a signal to the target. The target device informs the requester that it has accepted the request, so the requester is free to process other jobs, thus increasing efficiency.
- Optimized Wait States: PCI-X eliminates the use of wait states, except for initial target latency. When a PCI-X device does not have data to transfer, it removes itself from the bus so that another device can use the bus bandwidth. This provides more efficient use of bus and memory resources.
- Standard Block Size Movements: With PCI-X, adapters and bridges are permitted to disconnect transactions only on naturally aligned 128-byte boundaries. This encourages longer bursts and enables more efficient use of cache-line-based resources such as the processor bus and main memory.
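The byte-count and block-size rules described above can be illustrated with a small sketch. It assumes the most fragmented case, in which a transfer is disconnected at every naturally aligned 128-byte boundary it crosses, and shows how each resulting piece would carry the number of bytes still remaining in its sequence. The function name `split_sequence` and the tuple layout are hypothetical choices made for illustration; they are not taken from the PCI-X specification.

```python
from typing import List, Tuple

# Disconnects are only permitted on naturally aligned 128-byte boundaries,
# as stated in the text above.
BOUNDARY = 128


def split_sequence(start_addr: int, byte_count: int) -> List[Tuple[int, int, int]]:
    """Split a transfer at every 128-byte boundary it crosses.

    Returns a list of (address, length, bytes_remaining) tuples, where
    bytes_remaining is the byte count still outstanding in the sequence
    when that piece starts. This models the worst case; a real device
    would normally burst across many boundaries without disconnecting.
    """
    pieces = []
    addr, remaining = start_addr, byte_count
    while remaining > 0:
        # First aligned boundary strictly above the current address.
        next_boundary = (addr // BOUNDARY + 1) * BOUNDARY
        length = min(remaining, next_boundary - addr)
        pieces.append((addr, length, remaining))
        addr += length
        remaining -= length
    return pieces


if __name__ == "__main__":
    for address, length, left in split_sequence(100, 300):
        print(f"addr={address:4d} len={length:3d} remaining={left:3d}")
```

Because every piece carries the remaining byte count, a bridge in this model always knows how much data is still to come and never has to guess at a default number of cache lines.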
The operating mode and the frequency of the PCI-X bus depend on the type and number of adapters installed on the bus. A PCI-X system automatically adjusts the bus frequency to match the frequency of the slowest adapter on that bus segment. PCI-X supports up to 256 bus segments, and each segment is initialized separately so that different operating frequencies can be used. Also, as with conventional PCI, system designers can optimize a PCI-X system for particular I/O bandwidth needs. An important point about the PCI-X bus is that even if it operates as a conventional PCI bus, it still provides a significant performance enhancement.
2.2.5 InfiniBand Bus
To meet the increasing I/O demands of the computer industry, major technology leaders including Compaq, Dell, HP, IBM, Intel, Microsoft and Sun co-developed the InfiniBand architecture (IBA) and released it in 2000. The InfiniBand architecture was developed as a means to connect servers with remote storage, networking devices and other servers, as well as for use inside servers for interprocessor communications. The InfiniBand architecture offers many advantages over the existing PCI architecture and other I/O architectures, which include:
- Provides bandwidths that are an order of magnitude greater than existing I/O capabilities.
- Provides improved connection flexibility and scalability, as storage and I/O are separated from processor and memory.
- Offloads communications processing from the OS and CPU, thus eliminating traditional communications overhead.
- Supports simultaneous device communication, rather than requiring a device to wait for other devices to complete their communication.
- Provides support for up to 64,000 addressable devices and for Internet Protocol version 6 (IPv6), for effective communications between IBA fabrics and the Internet or intranets.
InfiniBand architecture
InfiniBand is a point-to-point, switched I/O fabric architecture. Each end point, or node, can vary from an inexpensive single SCSI chip or Ethernet adapter to a complex host system. Point-to-point means that each communication link extends between only two devices. The devices at each end of a link have full and exclusive access to the communication path. Switches come into play when traffic must go beyond a single point-to-point link and traverse the network. By adding switches, multiple points can be interconnected to create a fabric. As more switches are added to a network, the aggregated bandwidth of the fabric increases. By adding multiple paths between devices, switches also provide a greater level of redundancy. There are five primary components that make up an InfiniBand fabric:
- Host Channel Adapter (HCA): An HCA is an interface that resides within a server and communicates directly with the server's memory and processor as well as the IBA fabric. The HCA guarantees delivery of data, performs advanced memory access and can recover from transmission errors. HCAs can communicate with a target channel adapter or a switch. An HCA can be a PCI-to-InfiniBand card or it can be integrated on a system motherboard.
- Target Channel Adapter (TCA): A TCA enables I/O devices, such as disk or tape storage, to be located within the network independent of a host computer. The TCA includes an I/O controller that is specific to its particular device's protocol. TCAs can communicate with an HCA or a switch.
- Switch: The switch allows many HCAs and TCAs to connect to it and handles network traffic. The switch looks at the local route header on each packet of data that passes through it and forwards the packet to the appropriate location. A group of switches is referred to as a fabric. By handling network traffic, the switch also frees up servers and other devices.
- Router: A router forwards data packets from a local network (called a subnet) to other external subnets. The router reads the global route header and forwards packets based on the IPv6 network layer address. The router rebuilds each packet with the proper local address header as it passes it to the new subnet.
- Subnet Manager: The subnet manager is an application responsible for configuring the local subnet and ensuring its continued operation. Configuration responsibilities include managing switch and router setups and reconfiguring the subnet if a link goes down or a new one is added.
InfiniBand layers
The InfiniBand architecture comprises four primary layers that describe communication devices and methodology. These layers are briefly described below:
- Physical Layer: The InfiniBand physical layer defines its electrical and mechanical characteristics, including cables, connectors and hot-swap characteristics. Connectors include fiber, copper and backplane connectors. Three link speeds are specified: 1X, 4X and 12X. The speed is a function of the number of pins or wires within each cable. A 1X link cable has four wires, two for each direction of communication (read and write). The 4X link has four times as many pins and wires, and the 12X link has twelve times as many pins and wires, as a 1X link cable. The bandwidth of a 1X InfiniBand link is 2.5 Gb/s, which yields an actual raw data bandwidth of 2 Gb/s because 8b/10b encoding is used on all transmissions, resulting in a 20% overhead. Because all links are bidirectional, the aggregate bandwidth can be doubled. Many InfiniBand products have multiple ports, further increasing I/O bandwidth.
- Link Layer: The link layer is central to the InfiniBand architecture and covers packet layout, point-to-point link instructions, switching within a local subnet and data integrity. There are two types of packets, management and data. Management packets handle link configuration and maintenance. Data packets carry up to 4 kilobytes of transaction payload. Packet forwarding and switching within a local subnet is also part of the link layer's responsibilities. Every device in a local subnet has a local ID (LID). Packets are forwarded to the appropriate LID by reading the local route header found in each packet. Virtual lanes are also part of the link layer. A virtual lane is a unique logical communication link that shares a single physical link. Each link can have up to 15 virtual lanes and a management lane. As a packet travels through the subnet, it can be assigned a priority or service level. Higher-priority packets are sent down special virtual lanes ahead of other packets.
- Network Layer: The network layer is responsible for routing packets from one subnet to another. The global route header located within a packet includes an IPv6 address for the source and destination of each packet. Using a router, packets are forwarded through different subnets. For single-subnet environments, the network layer information is not used.
- Transport Layer: The transport layer handles the order of packet delivery as well as partitioning, multiplexing and transport services that determine reliable connections.
The InfiniBand bus represents a significant improvement in reliability, availability and serviceability over the PCI bus. The basic InfiniBand link comprises only 4 signal wires, compared to more than 100 on a PCI bus. It can also accommodate multiple ports for each I/O unit. InfiniBand also includes a failover mechanism that allows the network to heal itself if a link fails. Further, it removes the I/O from the server, breaking the one-to-one relation between the server and the I/O elements. Thus, if an I/O device fails, communication simply fails over to another redundant I/O device. This is unlike the PCI bus, and it saves time and resources and helps keep the server online.
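As a rough check on the physical-layer figures quoted in this section (2.5 Gb/s signalling on a 1X link, 8b/10b encoding, bidirectional links), the arithmetic can be sketched as follows. The sketch assumes that the 4X and 12X widths scale the 1X signalling rate linearly with the number of lanes; the function and key names are illustrative only.

```python
# Signalling rate of a single 1X link in Gb/s, as quoted in the text.
BASE_SIGNAL_GBPS = 2.5

# 8b/10b encoding carries 8 data bits in every 10 transmitted bits.
DATA_BITS, CODED_BITS = 8, 10

# Link widths named in the text.
LINK_WIDTHS = (1, 4, 12)


def link_bandwidth(width: int) -> dict:
    """Return signalling, data and aggregate data rates (Gb/s) for a link.

    Assumes the rate scales linearly with the link width (1X, 4X, 12X).
    """
    if width not in LINK_WIDTHS:
        raise ValueError(f"unsupported link width: {width}X")
    signal = BASE_SIGNAL_GBPS * width
    data = signal * DATA_BITS / CODED_BITS  # 20% lost to encoding
    return {
        "signal_gbps": signal,
        "data_gbps": data,
        # Links are bidirectional, so both directions can be counted.
        "aggregate_data_gbps": 2 * data,
    }


if __name__ == "__main__":
    for w in LINK_WIDTHS:
        bw = link_bandwidth(w)
        print(f"{w:2d}X: signal {bw['signal_gbps']} Gb/s, "
              f"data {bw['data_gbps']} Gb/s per direction, "
              f"{bw['aggregate_data_gbps']} Gb/s aggregate")
```

Under these assumptions a 1X link carries 2 Gb/s of data in each direction, which matches the figure given in the text, and the wider links scale from there.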




