Dell EMC DSS 8440 GPU Server Review

June 21, 2021 By Lorena Mejia

Our latest review is a GPU server: the 4U Dell EMC DSS 8440. This is a shared accelerator platform ideal for machine learning training and inference applications, supporting up to 10x Nvidia double-wide GPU cards or up to 16x single-wide GPUs for a multi-tenant, multi-workload environment. Of course, you don’t have to share this system with anyone, and it could be your own exceptionally high-performance computing environment. We won’t judge.

What’s it good for?

We were beginning to wonder if Dell still made GPU servers… kidding, of course they do! This is Dell’s first 4U system that supports up to 10x Nvidia V100 GPUs backed by dual Intel Xeon Scalable processors. It’s designed for machine learning training and inference, both of which are needed to support that next predictive analytics system. Everybody has to start somewhere, even machines. They too have to train for a while before they can put their predictive inference to work based on that training. The inference can’t happen without the training. There’s more to it than that, but we’re really just looking at the hardware.

With a high speed I/O connection plus a large memory footprint coupled with Nvidia Tesla or Quadro RTX GPUs, this system is also great for modeling and simulation. As with all of these GPU servers, the key is placing them with enough air flow to ensure predictable performance is not impacted by thermal constraints. In other words, we want to squeeze every drop of performance out of those GPUs, ensuring they don’t fail from heat exhaustion. To do that, Dell placed a grid of 12x 60mm cooling fans right at the front of the system and positioned the GPUs just behind those fans for optimal cooling. Storage is in the very back with 10x 2.5-inch drive bays. The motherboard, CPUs and memory occupy the center of the chassis.

Front of System

The perforated metal panel takes up the whole front of the system, but you can see the 12x fans. If you scaled this up it might look like a bank of Marshall stacks from a hair band concert in the 80s. Hearing loss was a problem, but good times!

On the right is a little control panel including the Power ON button with power indicator LED, a system health indicator LED, and a System Identification Button.

Rear of System

Now the rear of the system is where it gets interesting. With no storage up front, storage is positioned in back on the right in a bank of 10 drive bays. Six bays accept either NVMe or SATA drives, two are SATA only, and two are dedicated NVMe, all supported natively. If you require SAS drives, you will need a discrete controller. Above, below, and to the right of the drive cage are vents for air flow. Below that are 4x 2400W PSUs offering 2+2 redundancy. Above that in the middle are 2x RJ45 ports and 2x Small Form-factor Pluggable Plus (SFP+) ports. Next to those are the PCIe slots, which can be used to support significant I/O.
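To put the 2+2 PSU arrangement in perspective, here is a rough sketch of the power budget. The supply count and wattage come from the description above; the per-component draws are assumed nameplate figures (250W per double-wide GPU, 150W per CPU) and ignore fans, drives, memory, and PSU efficiency, so treat the result as a ballpark and not a sizing guide.

```python
def capacity_with_redundancy(psu_count, psu_watts, redundant_psus):
    """Watts still available when the redundant supplies are held in reserve."""
    if not 0 <= redundant_psus < psu_count:
        raise ValueError("redundant_psus must be between 0 and psu_count - 1")
    return (psu_count - redundant_psus) * psu_watts


def estimated_load(gpu_count, gpu_watts, cpu_count, cpu_watts, other_watts=0):
    """Sum of nameplate draws; other_watts is a catch-all for everything else."""
    return gpu_count * gpu_watts + cpu_count * cpu_watts + other_watts


# 4x 2400W supplies with 2+2 redundancy (from the review).
budget = capacity_with_redundancy(psu_count=4, psu_watts=2400, redundant_psus=2)

# Assumed nameplate draws, not measurements from this system.
load = estimated_load(gpu_count=10, gpu_watts=250, cpu_count=2, cpu_watts=150)

print(f"Budget: {budget}W, estimated load: {load}W, headroom: {budget - load}W")
```

Under those assumptions a fully loaded 10-GPU configuration still fits within the two supplies that remain after a double failure, which is presumably the point of specifying 2400W units.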

GPUs Inside the System

Cracking the lid on the Dell EMC DSS 8440, you can see the GPUs lined up along the front of the system, just like on Dell’s other GPU server, the C4140, although that one is 1U and supports 4x GPUs. You could stack four C4140s for up to 16x GPUs in 4U, but you would also need more CPUs and memory, adding to the cost. In addition to 10x full-width V100 or T100 16 or 32GB GPUs, or Quadro RTX GPUs, you could also install up to 16x Tesla T4 GPUs for a superfast distributed environment.

If you use T4 GPUs, some of them will be housed in the butterfly module, the hulking metal bracket at the rear that supports risers 1 and 2. For memory-intensive applications, the RTX 6000 or 8000 are a great choice, while the V100 and T100 GPUs are better suited for double-precision workloads.

Graphcore’s C2 IPU card is designed specifically to address the next generation of machine learning, both in training and inference. IPU stands for Intelligence Processing Unit, and these Graphcore cards feature not one but two IPUs, offering up to 200 TFLOPS of parallel processing capability and over 2416 individual processing cores. They are compatible with both Gen3 and Gen4 PCI Express and consume up to 315W each with passive cooling.

You can install up to 8 cards networked together using IPU-Link high-bandwidth interconnect cables, which connect on top of the cards, for up to 1.6 PetaFLOPS of compute power! These Graphcore IPU cards may only be compatible with Intel’s next-gen Ice Lake processors, which will feature more cores and faster memory. There is still not much information on this option, but with all these Lake-based code names you would think Intel was based in Minnesota or something.
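The 1.6 PetaFLOPS figure follows directly from the per-card number quoted above. A quick check, using only the values stated in this review (peak figures, not measured throughput):

```python
CARDS = 8               # maximum C2 cards in the chassis
IPUS_PER_CARD = 2       # each C2 card carries two IPUs
TFLOPS_PER_CARD = 200   # peak figure quoted per card


def aggregate_pflops(cards, tflops_per_card):
    """Peak compute across all cards, converted from TFLOPS to PFLOPS."""
    return cards * tflops_per_card / 1000


total_ipus = CARDS * IPUS_PER_CARD
total_pflops = aggregate_pflops(CARDS, TFLOPS_PER_CARD)
print(f"{total_ipus} IPUs, {total_pflops} PFLOPS peak")
```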

Processors and Memory

Just behind those GPUs are the CPUs and the associated memory module slots, which reside on a completely different board from the GPU PCIe switchboard. That switchboard also has 4x PLX switches to expand the PCIe lanes. The Dell EMC DSS 8440 is compatible with both first- and second-generation Intel Xeon Scalable processors, with second-generation processors offering support for data-centric persistent memory modules, like Intel Optane. However, there is no mention of actual support for those persistent memory modules on this system. At least not yet.

Our system came with two second-generation Intel Xeon Gold 6248 processors with 20 cores each, 192GB of memory in 12 slots, plus 4x Nvidia Tesla V100 16GB GPUs. So, still a lot of room for expansion. Processors with up to 24 cores each can be installed, and each processor serves up to 12x memory module slots, with two DIMMs per memory channel. With both processors installed, there are 24x active memory module slots. There’s a potential for up to 768GB of memory using 32GB memory modules in all slots.

Single- and dual-rank registered DIMMs (RDIMMs) are supported. Since this is a GPU server, air flow is important, so if you’re not populating all memory module slots, DIMM blanks are required to maintain internal temperatures. Memory speeds of up to 2933MHz are supported.
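The memory numbers above are easy to sanity check. The sketch below assumes six memory channels per processor, which is what 12 slots at two DIMMs per channel implies, and it assumes the review system’s 192GB was built from equal-size modules.

```python
def slots_per_cpu(channels, dimms_per_channel):
    """Memory module slots served by one processor."""
    return channels * dimms_per_channel


def max_memory_gb(cpus, slots_each, module_gb):
    """Capacity with every slot filled by the same size module."""
    return cpus * slots_each * module_gb


# Assumption: six channels per CPU, inferred from 12 slots at 2 DIMMs per channel.
slots = slots_per_cpu(channels=6, dimms_per_channel=2)
total_slots = 2 * slots
ceiling_gb = max_memory_gb(cpus=2, slots_each=slots, module_gb=32)

# Review system: 192GB across 12 populated slots, assuming equal-size modules.
review_module_gb = 192 // 12

print(f"{total_slots} slots, {ceiling_gb}GB with 32GB modules")
print(f"Review system: 12x {review_module_gb}GB modules, 12 slots still open")
```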


iDRAC9 with Lifecycle Controller is used on all the new Dell PowerEdge servers for remote and at-chassis management of the system. It’s also compatible with the Redfish API for open systems management, making it easy to integrate with a wider range of data center management options. And then there’s also the Nvidia GPU Cloud (NGC) registry, providing other software stacks if you will be using Nvidia cards. Not to mention Graphcore’s Poplar software stack for both training and inference if you will be using Graphcore’s IPU.
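For a feel of what Redfish integration looks like, here is a minimal sketch using only the Python standard library. The resource path is the one iDRAC commonly uses for the host system, and the property names come from the standard Redfish ComputerSystem schema, but treat both as assumptions to confirm against your own controller. The host name and credentials are hypothetical, and we have not run this against the review unit.

```python
import base64
import json
import ssl
import urllib.request

# Assumed iDRAC resource path for the host; confirm on your own controller.
SYSTEM_PATH = "/redfish/v1/Systems/System.Embedded.1"


def build_request(host, user, password):
    """Build an HTTP Basic authenticated GET request for the system resource."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        f"https://{host}{SYSTEM_PATH}",
        headers={"Authorization": f"Basic {token}", "Accept": "application/json"},
    )


def summarize_system(payload):
    """Pick a few standard ComputerSystem properties out of a Redfish response."""
    return {
        "model": payload.get("Model"),
        "power_state": payload.get("PowerState"),
        "health": payload.get("Status", {}).get("Health"),
        "memory_gib": payload.get("MemorySummary", {}).get("TotalSystemMemoryGiB"),
        "cpu_count": payload.get("ProcessorSummary", {}).get("Count"),
    }


def fetch_system(host, user, password, verify_tls=True):
    """Query the controller and return the summarized system record."""
    context = ssl.create_default_context()
    if not verify_tls:
        # Lab use only: skips certificate checks for self-signed certificates.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    request = build_request(host, user, password)
    with urllib.request.urlopen(request, context=context, timeout=10) as resp:
        return summarize_system(json.load(resp))
```

Calling `fetch_system("idrac.example.com", "root", "password")` with your own (here hypothetical) address and credentials should return a small dictionary you can feed into whatever monitoring tool you already use.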

Along with the integrated S140 storage controller, the DSS 8440 also supports the PERC H730P. You will need the H730P if you want to install SAS drives or if you just need a little more control over your SATA drives. This is an outstanding controller even though it has been superseded by the H740P. Not sure why they didn’t go with that, but whatever. It’s a great option for streaming digital media and database applications, and offers 2GB of non-volatile cache memory. The H730P is also a great choice for hybrid storage and supports the most popular RAID levels (RAID 0, 1, 5, 6, 50, 60). Non-RAID pass-through is also an option. It’s also easily managed using the integrated Dell Remote Access Controller (iDRAC) with Lifecycle Controller.
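If you are weighing those RAID levels, the trade-off is mostly usable capacity versus redundancy. The sketch below applies the textbook formulas for equal-size drives; the minimum drive counts, the two-span default, and the 8x 2TB example are illustrative assumptions, not H730P specifications.

```python
# level: (minimum drives per span, drives lost to redundancy per span, spanned?)
RAID_RULES = {
    0: (1, 0, False),
    1: (2, 1, False),
    5: (3, 1, False),
    6: (4, 2, False),
    50: (3, 1, True),
    60: (4, 2, True),
}


def usable_capacity_tb(level, drives, drive_tb, spans=2):
    """Usable capacity in TB for equal-size drives, using textbook formulas."""
    if level not in RAID_RULES:
        raise ValueError(f"unsupported RAID level: {level}")
    min_per_span, lost_per_span, spanned = RAID_RULES[level]
    span_count = spans if spanned else 1
    if drives % span_count != 0:
        raise ValueError("drives must divide evenly across spans")
    per_span = drives // span_count
    if per_span < min_per_span:
        raise ValueError("not enough drives for this RAID level")
    if level == 1 and drives != 2:
        raise ValueError("RAID 1 is modeled here as a two-drive mirror")
    return (per_span - lost_per_span) * span_count * drive_tb


# Hypothetical example: eight 2TB drives behind the controller.
for level in (0, 5, 6, 50, 60):
    print(f"RAID {level}: {usable_capacity_tb(level, drives=8, drive_tb=2)}TB usable")
```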


According to Dell, the Dell EMC DSS 8440 delivers up to 25% more accelerators, plus 10% more Tensor FLOPS in a single 4U chassis, compared to earlier versions. This is a mid-tier system that is certified to work with the Nvidia GPU Cloud (NGC) registry, where you can find pre-defined and pre-tested machine learning software stacks for immediate download, making it easy to rapidly deploy.

If you're looking for one of these babies, look no further than IT Creations. We have the parts and components on hand, including those Nvidia Tesla V100, Quadro RTX, or T4 accelerators. We can also send it out for next-day delivery! With our stellar reputation, we guarantee the experience will make you reconsider your go-to IT hardware company.