Benefits and Advantages of PCI-e 4.0
Since the release of Peripheral Component Interconnect Express (PCI-e) in 2002, the open standard has been developed collaboratively by companies with an interest in pushing bandwidth to new levels [1]. It is the common interface for graphics cards, hard disk and SSD storage, network cards, and Wi-Fi adapters. As technology advances, PCI-e pushes forward to maintain the pace demanded by newer tasks and technologies.
What are the advantages of PCI-e 4.0?
PCI-e 4.0 is the next iteration of this high-speed interface, and the primary difference between it and its predecessor is throughput. The specification is measured in gigatransfers per second (GT/s) for the signaling rate and gigabytes per second (GB/s) for data throughput, although the specifics are not as important as what they ultimately mean to users: PCI-e 4.0 doubles the speed of PCI-e 3.0, from 8 GT/s to 16 GT/s per lane.
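The doubling can be sketched with a short calculation using the published per-lane rates. Both generations use 128b/130b encoding, so usable bandwidth is the raw rate scaled by 128/130 (the figures here are from the specification; the script itself is just an illustration):

```python
# Per-lane throughput for PCI-e 3.0 vs 4.0, from the spec's signaling rates.
GENERATIONS = {
    "PCI-e 3.0": 8.0,   # GT/s per lane
    "PCI-e 4.0": 16.0,  # GT/s per lane
}
ENCODING_EFFICIENCY = 128 / 130  # 128b/130b line code, same for both gens

for name, gt_per_s in GENERATIONS.items():
    # 1 GT/s carries roughly 1 Gbit/s of raw signal; divide by 8 for GB/s.
    gb_per_s = gt_per_s * ENCODING_EFFICIENCY / 8
    print(f"{name}: {gt_per_s} GT/s -> ~{gb_per_s:.2f} GB/s per lane")
```

The output works out to roughly 0.98 GB/s per lane for 3.0 and 1.97 GB/s per lane for 4.0, i.e. exactly double.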
Aside from the significant speed boost, the new standard gives manufacturers better tools to assess variations in system performance, allowing them to improve signal integrity and address problems related to signal degradation. The upshot is that the latest evolution of the standard makes for more robust systems.
It is also worth noting that there are no backward-compatibility issues. The transition to version 4.0 is eased by the fact that PCI-e 3.0 devices still work, although they will not enjoy the benefits of devices designed for the newer version.
Why upgrade? Is it worth it?
There are several reasons to upgrade to the latest standard, although the degree of benefit is somewhat dependent on the tasks and workloads that are being processed by the hardware. It may be a useful exercise to break it down by component.
Of the components to receive a boost in speed, the most significant is storage. As a common example, moving data between storage and volatile RAM is one of the most resource-intensive tasks in database transactions. It is well established that faster storage leads to better database performance, so for high-volume databases the choice of storage is an important one. To this point, the options for high-speed storage may depend on support for the most recent PCI-e version. Manufacturers Gigabyte and Corsair, for example, have launched NVMe SSDs that take full advantage of the new technology and report significant boosts in both read and write performance [2].
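On Linux, you can check the link speed an NVMe drive actually negotiated via sysfs. The sketch below assumes the standard `/sys/class/nvme` layout and the kernel's `current_link_speed`/`current_link_width` attributes; the helper name is my own, and paths may vary by kernel version:

```python
# Sketch: report the negotiated PCI-e link of each NVMe controller (Linux).
from pathlib import Path

def nvme_link_info(sys_root="/sys/class/nvme"):
    """Map each NVMe controller name to its negotiated PCI-e speed and width."""
    results = {}
    base = Path(sys_root)
    if not base.exists():  # e.g. non-Linux systems
        return results
    for ctrl in sorted(base.glob("nvme*")):
        dev = (ctrl / "device").resolve()  # underlying PCI device directory
        try:
            speed = (dev / "current_link_speed").read_text().strip()
            width = (dev / "current_link_width").read_text().strip()
        except OSError:
            continue  # attribute missing or unreadable
        results[ctrl.name] = f"{speed}, x{width}"
    return results

for name, link in nvme_link_info().items():
    print(name, "->", link)  # a PCI-e 4.0 SSD should report a 16.0 GT/s link
```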
Organizations involved in Big Data certainly understand the importance of throughput, since efficient data movement is one of their bottlenecks. Real-time workloads that process large volumes of data are very demanding on all resources, especially networking and storage. At the same time, it is important to recognize that other domains also place high demands on data storage and retrieval.
Deep Learning systems, for example, benefit significantly from faster storage because they can work through training sets of ever-increasing size more quickly. For those who host their Deep Learning platforms on-premises in particular, this performance boost will be very welcome.
Deep Learning platforms gain a new degree of vertical scalability when they adopt the newer standard because it frees up the limited supply of PCI-e lanes. The number of lanes depends on the chipset and varies between machines. Each lane is an independent full-duplex connection that transfers data in parallel; the nomenclature "x16" indicates a 16-lane configuration [1]. While the new standard does not increase the number of lanes, it does reduce the demand on each one. In Deep Learning systems, this is relevant to the choice of video cards and their GPUs.
Historically, video cards have not fully utilized the bandwidth of a PCI-e 3.0 x8 link, although performance testing by TechPowerUp shows that this is changing [3]. Consequently, as video cards become more bandwidth-intensive, it makes sense to use the x16 configuration. With 16 PCI-e lanes consumed by a single graphics card, the limitation becomes more obvious.
It is important to note at this point that the same pool of PCI-e lanes is also used to move data between all of the system's components. This means that GPUs compete with storage devices, network adapters, and connected USB devices. The performance testing cited above was based on games running at high framerates, but the consideration is just as relevant for Deep Learning projects. For IO-intensive applications in systems with multiple video cards and GPUs, the PCI-e 3.0 lanes can become saturated, resulting in an unwanted bottleneck.
The limited number of PCI-e lanes can easily be overcome by scaling horizontally, adding new machines as requirements change and demands increase. This expandability is indeed one of the benefits of having an on-premises solution. However, with the introduction of PCI-e 4.0 and the extra bandwidth to work with, the point of saturation is put off considerably.
Figure 1: Motherboard with PCI-e slots labeled (Credit: CCBoot)
Video cards such as AMD's Radeon Pro W5700 support the PCI-e 4.0 interface, which affords them flexibility because an x8 configuration will not saturate as quickly. With only eight lanes in use, the rest remain free for other devices. Generally speaking, AMD GPUs are not the preferred choice for Deep Learning platforms, but the benefit will extend to all devices as the 4.0 standard becomes more commonplace.
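The lane arithmetic behind this trade-off is easy to verify. Under the spec's per-lane rates (and 128b/130b encoding for both generations), a 4.0 x8 link delivers the same usable bandwidth as a 3.0 x16 link; the helper function here is illustrative, not from any library:

```python
# Usable bandwidth for common lane counts under PCI-e 3.0 and 4.0.
EFF = 128 / 130  # 128b/130b encoding efficiency

def link_bandwidth_gbs(gt_per_lane, lanes):
    """Usable bandwidth of a link in GB/s."""
    return gt_per_lane * EFF / 8 * lanes

for lanes in (4, 8, 16):
    gen3 = link_bandwidth_gbs(8.0, lanes)   # PCI-e 3.0: 8 GT/s per lane
    gen4 = link_bandwidth_gbs(16.0, lanes)  # PCI-e 4.0: 16 GT/s per lane
    print(f"x{lanes:<2}: PCI-e 3.0 ~{gen3:5.2f} GB/s | PCI-e 4.0 ~{gen4:5.2f} GB/s")
```

A 4.0 x8 link and a 3.0 x16 link both come out to roughly 15.75 GB/s, which is why a 4.0 card can run at x8 and leave the remaining lanes to storage and network devices.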
Ethernet and Wi-Fi Adapters, Onboard Sound, and USB
Other components in the system that transfer data serially include network adapters, onboard sound, and USB devices. Manufacturers have been slower to transition these devices to PCI-e 4.0 compared with storage and video cards, yet together they remain an important part of the overall picture. Again, the reason is competition for PCI-e lanes and throughput. As multiple devices consume bandwidth, availability decreases and bottlenecks can appear.
A PCI-e 3.0 x8 link can support a 40-gigabit Ethernet connection. With the demand for data, and the traffic it generates, both increasing, the requirement for speed grows proportionally. For data-intensive applications in particular, PCI-e 4.0 enhances the capability to support Data Mining, Big Data, Deep Learning, and, more generally, Artificial Intelligence applications.
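A quick headroom check makes the point concrete. Using the spec's per-lane rates and 128b/130b encoding (the function is a sketch of my own, not a library call), a 3.0 x8 link clears 40GbE with modest margin, while a 4.0 x8 link leaves room for 100GbE:

```python
# Headroom check: usable Gbit/s of an x8 link vs. common Ethernet speeds.
EFF = 128 / 130  # 128b/130b encoding overhead, same for gen 3 and gen 4

def link_gbit(gt_per_lane, lanes):
    # 1 GT/s carries roughly 1 Gbit/s of raw signal per lane.
    return gt_per_lane * EFF * lanes

gen3_x8 = link_gbit(8.0, 8)    # ~63 Gbit/s usable
gen4_x8 = link_gbit(16.0, 8)   # ~126 Gbit/s usable

print(f"PCI-e 3.0 x8: ~{gen3_x8:.0f} Gbit/s usable (fits 40GbE)")
print(f"PCI-e 4.0 x8: ~{gen4_x8:.0f} Gbit/s usable (fits 100GbE)")
```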
PCI-e 4.0 is the next evolution of the high-speed interface used to transfer data between components. This includes critical IO-intensive systems such as storage, video, and network adapters. As tasks and organizations become more data-centric, demands for data increase and there is constant pressure to keep up with technology and stay ahead of competitors.
The increased throughput of PCI-e 4.0 is significant and can alleviate some of the growing pains of expanding services and capabilities. In particular, it allows better vertical scaling and thus reduces the total cost of expansion. Importantly, transitioning to the most recent version does not exclude the use of current assets designed for the previous iteration, 3.0.
In summary, the path forward is clear, and the question is when, rather than if, workloads will demand the additional throughput provided by PCI-e 4.0. As with previous versions of the interface, it is only a matter of time before PCI-e 3.0 is relegated to legacy applications.
[1] PCMag, “Definition of PCI Express,” PCMag. [Online]. Available: https://www.pcmag.com/encyclopedia/term/pci-express.
[2] S. Dent, “Gigabyte’s next-gen SSD shows the incredible potential of PCIe 4.0,” Engadget, 31-May-2019. [Online]. Available: https://www.engadget.com/2019-05-31-gigabyte-aorus-nvme-pcie4-ssd.html.
[3] TechPowerUp, “NVIDIA GeForce RTX 2080 Ti PCI-Express Scaling,” TechPowerUp, 24-Sep-2018. [Online]. Available: https://www.techpowerup.com/review/nvidia-geforce-rtx-2080-ti-pci-express-scaling/.
[4] CCBoot, Motherboard with PCI-e Slots Labeled.