GPU Accelerator – NVIDIA Tesla M60
Today NVIDIA is announcing the next generation of their Virtual Desktop Infrastructure (VDI) GPU technology, GRID 2.0. This is the first full-point update of their VDI technology since the company launched GRID 1.0 back in 2012, and comes as GRID 1.x vGPU capabilities are finally widely available in the latest versions of VMware’s and Citrix’s respective hypervisors. GRID 2.0 in turn builds on what NVIDIA has accomplished so far with GRID, further expanding the number of concurrent users and performance of GRID while also introducing some new features that didn’t make the cut for GRID 1.0.
Furthermore, launching alongside GRID 2.0 are new Maxwell based Tesla cards. While the launch of these cards is something of a low-key event – NVIDIA is opting to focus on GRID as opposed to the hardware – some of the new GRID 2.0 functionality goes hand-in-hand with the new hardware, so this is where we’ll start.
Tesla M60 & Tesla M6
When NVIDIA launched the first version of the GRID (née VGX) ecosystem in 2012, they launched a pair of cards alongside it, the K1 and K2. Based around NVIDIA’s Kepler GK107 and GK104 GPUs respectively, these cards have been the backbone of GRID for the last three years. However with NVIDIA launching their Maxwell architecture in 2014 it was only a matter of time until these cards were replaced with their higher performing successors, and that time is now.
NVIDIA Tesla/GRID GPU Specification Comparison
| | Tesla M60 | Tesla M6 | GRID K2 | GRID K1 |
|---|---|---|---|---|
| CUDA Cores | 4096 (2x 2048) | 1536 | 3072 (2x 1536) | 768 (4x 192) |
| VRAM | 16GB GDDR5 (2x 8GB) | 8GB GDDR5 | 8GB GDDR5 (2x 4GB) | 16GB DDR3 (4x 4GB) |
| Concurrent Users | 2-32 | 1-16 | ? | ? |
| H.264 1080p30 Streams | 36 | 18 | ? | ? |
| Form Factor | Dual Slot PCIe | MXM | Dual Slot PCIe | Dual Slot PCIe |
| TDP | 225W-300W | 75W-100W | 225W | 130W |
| GPU | 2x GM204 | GM204 | 2x GK104 | 4x GK107 |
Launching for GRID 2.0 are the Tesla M60 and Tesla M6. In a departure from the GRID K series, both of the Tesla M series cards are based on the same GPU, NVIDIA’s GM204. The difference between the two cards is now the number of GPUs on a card and the overall form factor.
The larger of the two cards, Tesla M60, is a dual-GM204 solution featuring two fully enabled GM204 GPUs and 16GB of GDDR5 (8GB per GPU). The M60 is a full size, dual slot card similar to the previous generation GRID cards, and is rated for between 225W and 300W depending on the performance and cooling configuration (both passive and active are available). NVIDIA has rated the M60 for up to 32 concurrent vGPU users, or 16 users per GM204 GPU. Notably this is also the first time we’ve seen a dual GM204 card, as one was never released in the consumer market.
Meanwhile the Tesla M6 marks a new form factor for an NVIDIA GRID or Tesla product, coming in an MXM form factor. This card packs a single, partially enabled GM204 GPU with 12 of 16 SMMs (1536 CUDA cores) enabled, paired with 8GB of GDDR5. With only a single GPU it’s essentially rated for half as much work as M60, topping out at 16 users. However in turn it only draws 75W to 100W of power depending on configuration, and more importantly the MXM form factor makes the card suitable for installation in high-density blade servers, something that could not be done with the PCIe cards. Otherwise the card’s specifications are very similar to NVIDIA’s consumer GeForce GTX 980M, and we wouldn’t be surprised if this was the GTX 980M repurposed for server use.
Since the hardware is not the focus of today’s announcement NVIDIA isn’t releasing much more in the way of information, but there are two quick points we want to touch on. First, the company is also rating these cards by the number of 1080p30 H.264 streams they can encode, presumably for the video encoder market. The Tesla M60 is rated for 36 streams and the Tesla M6 for 18 streams. Second, these cards are being released under the Tesla brand and not the GRID brand. Tesla has previously been reserved for pure compute cards (e.g. Tesla K80), but since VDI is just another GPU application at this point – i.e. there are no appreciable hardware differences between a VDI card and a pure compute card – NVIDIA would appear to be converging all of their server cards under the Tesla brand.
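As a quick sanity check on those ratings, the per-GPU figures can be worked out from the card-level numbers in the table. The sketch below is ours rather than anything NVIDIA has published: the `Card` class and `per_gpu` helper are hypothetical names, and it assumes each rating divides evenly across the GPUs on a card.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Card:
    """Card-level figures taken from the specification table above."""
    name: str
    gpus: int
    cuda_cores: int
    max_users: int
    h264_1080p30_streams: int


CARDS = [
    Card("Tesla M60", gpus=2, cuda_cores=4096, max_users=32, h264_1080p30_streams=36),
    Card("Tesla M6", gpus=1, cuda_cores=1536, max_users=16, h264_1080p30_streams=18),
]


def per_gpu(card: Card) -> dict:
    """Divide the card-level ratings by the GPU count (assumes an even split)."""
    return {
        "cuda_cores": card.cuda_cores // card.gpus,
        "max_users": card.max_users // card.gpus,
        "streams": card.h264_1080p30_streams // card.gpus,
    }


if __name__ == "__main__":
    for card in CARDS:
        print(card.name, per_gpu(card))
```

Both cards come out at 16 users and 18 streams per GPU, even though the M6’s GM204 has a quarter of its CUDA cores disabled.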
GRID 2.0
Getting back to NVIDIA’s principal announcement then, let’s talk about GRID 2.0. The release of the Tesla cards means that GRID 2.0 offers a slew of performance and density improvements over GRID 1.0 thanks to the newer hardware. From a performance standpoint NVIDIA is advertising the new Tesla cards as offering 2x the performance of their GRID K-series cards, allowing for either per-user performance to be doubled, or for the number of concurrent users to be doubled. The performance argument is essentially about improving performance at the high-end where GPUs are already allocated on a 1:1 basis. Improving concurrent user counts, meanwhile, ultimately brings down the number of cards required, and thereby the overall cost of servers to support a VDI operation.
The fact that NVIDIA now has MXM form factor cards for blade servers also plays a role here, as it improves on the physical density of VDI hosting. Blade servers offer tremendous hardware density, and bringing a VDI capable GPU into that environment allows for similar improvements in GPU-accelerated VDI density.
Meanwhile alongside the hardware-borne improvements, GRID 2.0 also brings new functionality to the GRID ecosystem for current GRID K-series users. NVIDIA tells us that the new software supports twice the number of concurrent users as GRID 1.0, raising the limit to 128 users per server. And while performance will scale down accordingly, it will allow for greater user density with very light workloads.
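To put the density figures in perspective, here is a rough sizing sketch. It is only a back-of-the-envelope illustration: the function names are hypothetical, it assumes every user runs at the maximum-density rating (16 users per GPU), and it assumes the 128-user software limit applies equally to servers fitted with the new Tesla cards, which NVIDIA has not spelled out. Whether a given chassis can physically host that many cards is a separate question.

```python
import math

# Figures quoted in the article; real deployments depend on the vGPU profile chosen.
MAX_USERS_PER_SERVER = 128
MAX_USERS_PER_CARD = {"Tesla M60": 32, "Tesla M6": 16}


def cards_needed(users: int, card: str) -> int:
    """Minimum number of cards to host `users` at the card's maximum rating."""
    return math.ceil(users / MAX_USERS_PER_CARD[card])


def servers_needed(users: int) -> int:
    """Minimum number of servers under the 128-users-per-server software limit."""
    return math.ceil(users / MAX_USERS_PER_SERVER)


def cards_per_full_server(card: str) -> int:
    """Cards required to reach the per-server user limit with one card type."""
    return math.ceil(MAX_USERS_PER_SERVER / MAX_USERS_PER_CARD[card])


if __name__ == "__main__":
    for name in MAX_USERS_PER_CARD:
        print(name, "cards for 100 users:", cards_needed(100, name))
        print(name, "cards to fill a server:", cards_per_full_server(name))
    print("servers for 300 users:", servers_needed(300))
```

Under these assumptions, hitting the 128-user ceiling would take four Tesla M60 cards or eight Tesla M6 modules in a single server.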
More significant for current GRID users, I suspect, will be that GRID vGPU environments now support CUDA. Previously CUDA support was not available within the vGPU, requiring a 1:1 (pass-through) environment to access it. NVIDIA has been pushing hard for more than half a decade to get GPU compute acceleration (via CUDA) inside professional software packages, and a lack of CUDA vGPU support meant that those programs couldn’t be fully accelerated within a vGPU environment. This change improves that situation, essentially allowing CUDA users to finally be fully virtualized and run concurrently with each other on a single GPU. However it should be noted that there are some limitations here, with NVIDIA noting that CUDA vGPU support requires using the GRID 2.0 “8GB profile.”
GRID 2.0 also improves on guest OS support for multiple operating systems. New to GRID 2.0, NVIDIA now supports Linux guests, joining the company’s existing Windows support. On the subject of Linux use cases NVIDIA specifically mentions oil & gas users, so we suspect that this is as much for compute/CUDA users as it is for graphical users. Meanwhile GRID 2.0 also introduces formal support for Windows 10 on a “tech preview” basis, allowing Microsoft’s latest OS to be virtualized while retaining the full functionality of GRID.
Last, GRID 2.0 also introduces support for 4K monitors. Previously GRID’s maximum resolution was WQXGA (2560×1600), so this lifts the limit to support newer, higher resolution monitors. Overall the new limit for GRID 2.0 is four 4K monitors per VM.
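For a sense of scale, the jump in resolution can be expressed in raw pixel counts. This short sketch assumes “4K” means UHD (3840×2160), which NVIDIA has not specified here; the `pixels` helper is a hypothetical name.

```python
WQXGA = (2560, 1600)   # previous GRID maximum, per the article
UHD_4K = (3840, 2160)  # assumption: "4K" taken to mean UHD
MAX_MONITORS_PER_VM = 4


def pixels(resolution: tuple) -> int:
    """Total pixel count for a (width, height) resolution."""
    width, height = resolution
    return width * height


if __name__ == "__main__":
    print("WQXGA pixels:", pixels(WQXGA))
    print("4K pixels:", pixels(UHD_4K))
    print("4K vs WQXGA:", pixels(UHD_4K) / pixels(WQXGA))
    print("Four 4K monitors:", MAX_MONITORS_PER_VM * pixels(UHD_4K))
```

On that assumption a single 4K display carries roughly twice the pixels of a WQXGA display, and a fully loaded VM would be driving a little over 33 million pixels.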