Jeremy Main, Senior Solutions Architect, GRID
GRID Technical Session: vGPU Top 10 PoC Survival Tips
Most Common Mistakes During POCs
Not defining PoC success criteria with stakeholders
Define measurable metrics
Use actual applications and data
Don’t use GPU-centric benchmarks to simulate multiple users
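As a minimal sketch of what "measurable metrics" could look like once agreed with stakeholders, the success criteria can be written down as explicit targets and checked against PoC measurements. The metric names and target values below are hypothetical placeholders, not figures from this session.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Criterion:
    """One PoC success criterion with a numeric target."""
    name: str
    target: float
    higher_is_better: bool

    def passed(self, measured: float) -> bool:
        # A criterion passes when the measurement meets or beats the target.
        if self.higher_is_better:
            return measured >= self.target
        return measured <= self.target


# Hypothetical criteria: replace with the metrics agreed with stakeholders.
CRITERIA = [
    Criterion("model_open_seconds", 30.0, higher_is_better=False),
    Criterion("viewport_fps", 25.0, higher_is_better=True),
    Criterion("users_per_host", 16.0, higher_is_better=True),
]


def evaluate(measured: Dict[str, float]) -> Dict[str, bool]:
    """Return pass/fail per criterion for one set of PoC measurements."""
    return {c.name: c.passed(measured[c.name]) for c in CRITERIA}


result = evaluate(
    {"model_open_seconds": 22.0, "viewport_fps": 31.0, "users_per_host": 12.0}
)
```

The measurements should come from the actual applications and data, in line with the tips above, rather than from a synthetic GPU benchmark.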
Most Common Mistakes During POCs
Attempt to add PoC into existing IT infrastructure
Use an isolated and controlled environment
Retain PoC environment for tuning and troubleshooting after deployment
Set up a gateway for license server access if required
Most Common Mistakes During POCs
Not understanding application resource requirements
During typical user workloads, what is the performance-limiting factor?
Is the application CPU- or memory-bound?
Is it GPU frame buffer- or rendering-bound?
Check with Perfmon on existing workstations, using the “NVIDIA_GPU” counters
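One way to turn collected counter samples into an answer to the questions above is sketched below. It assumes the utilization samples have already been exported as percentages per resource. The resource names and the 90% threshold are illustrative assumptions, not Perfmon counter names or NVIDIA guidance.

```python
from statistics import mean
from typing import Dict, List


def limiting_factors(
    samples: Dict[str, List[float]], threshold: float = 90.0
) -> List[str]:
    """Return resources whose average utilization (percent) reaches the threshold.

    `samples` maps a resource label to utilization percentages captured
    during a typical user workload. The threshold is a hypothetical cutoff.
    """
    averages = {name: mean(values) for name, values in samples.items()}
    return sorted(name for name, avg in averages.items() if avg >= threshold)


# Hypothetical samples from one workstation session.
workload = {
    "cpu": [95.0, 98.0, 92.0],
    "memory": [60.0, 62.0, 61.0],
    "gpu": [40.0, 55.0, 35.0],
    "frame_buffer": [97.0, 99.0, 96.0],
}
bottlenecks = limiting_factors(workload)
```

In this made-up example the session is limited by the CPU and the frame buffer rather than by GPU rendering, which would point toward faster CPUs and a larger vGPU profile.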
Most Common Mistakes During POCs
Not using all available sources of information
NVIDIA deployment guides, application sizing guides
Citrix and VMware reviewer guides and best practices
Most Common Mistakes During POCs
Attempting to use servers that are not GRID certified
There are many versions of GRID / Tesla cards
Not every card works in every server
NVIDIA GRID™ Certified Platforms
UCS C240 M3, M4
UCS C460 M4
PowerEdge R720, R730, T620, T630
PowerEdge C4130, VRTX
PowerEdge C8220X GPU Sled
Precision R7610, Rack 9710
Celsius C620, R940, M740
Primergy CX400M1, RX2540M1, RX350S8, TX300S8
ProLiant WS460c Gen8
ProLiant DL380p Gen8 and Gen9, DL580 Gen 8
ProLiant SL250s Gen 8, SL270s Gen 8 SE
iDataPlex dx360 M4
NeXtScale nx360 M4, M5
Flex System
ThinkStation D30
System x3650M4/M5, x3850X6, x3950X6
For more information on GRID-enabled servers, visit www.nvidia.com/buygrid
Most Common Mistakes During POCs
Optimal CPUs for the workload are not used
Most CAD applications are largely single-threaded
Focus on higher CPU frequency, not number of cores
Most Common Mistakes During POCs
BIOS power profile is set incorrectly
Set power profile to “Maximum Performance”
Ensure CPUs can reach their highest clock speeds
Most Common Mistakes During POCs
Servers don’t have enough memory
Memory overcommit does not work with vGPU
4GB : Power User, Entry Level Engineering
8GB : Mid-range Engineering, Video
16GB : Advanced Engineering
32GB : CAD/CAM
64GB : Digital Mock Up
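Because memory cannot be overcommitted with vGPU, host RAM puts a hard ceiling on VM density. The sketch below uses the per-profile sizes listed above. The host RAM figure and the hypervisor reserve are hypothetical values to replace with your own.

```python
# Per-VM memory in GB, taken from the profile list above.
PROFILE_GB = {
    "power_user": 4,
    "mid_range_engineering": 8,
    "advanced_engineering": 16,
    "cad_cam": 32,
    "digital_mock_up": 64,
}


def max_vms(host_ram_gb: int, profile: str, hypervisor_reserve_gb: int = 16) -> int:
    """Return how many VMs of one profile fit in host RAM without overcommit.

    `hypervisor_reserve_gb` is an assumed allowance for the hypervisor itself.
    """
    usable_gb = host_ram_gb - hypervisor_reserve_gb
    if usable_gb <= 0:
        return 0
    return usable_gb // PROFILE_GB[profile]


advanced = max_vms(256, "advanced_engineering")  # (256 - 16) // 16
cad_cam = max_vms(256, "cad_cam")                # (256 - 16) // 32
```

With these assumed numbers, a 256 GB host holds 15 Advanced Engineering VMs but only 7 CAD/CAM VMs, so memory can cap density before the GPUs do.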
Most Common Mistakes During POCs
Insufficient storage IOPS
Workstation-class users expect SSD performance, since they already use SSDs locally
Most Common Mistakes During POCs
Inadequate network environment for VDI
Don’t use a legacy network adapter type in the VM; prefer VMXNET3
Confirm the network’s ability to deliver enough bandwidth
“iperf” can simulate single and parallel TCP/UDP network streams to confirm that the required bandwidth is available
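A small helper for assembling iperf client invocations for the single-stream, parallel-stream, and UDP cases is sketched below. It assumes an iperf server is already listening on the target host and uses the common client flags `-c`, `-t`, `-P`, `-u`, and `-b`. Check them against the iperf version you have installed. The address is a placeholder.

```python
from typing import List, Optional


def iperf_client_command(
    server: str,
    streams: int = 1,
    seconds: int = 10,
    udp_bandwidth: Optional[str] = None,
) -> List[str]:
    """Build an iperf client command line as an argument list.

    streams        number of parallel streams (-P)
    seconds        test duration (-t)
    udp_bandwidth  if set (e.g. "50M"), run a UDP test at that target rate
    """
    if streams < 1:
        raise ValueError("streams must be at least 1")
    cmd = ["iperf", "-c", server, "-t", str(seconds), "-P", str(streams)]
    if udp_bandwidth is not None:
        cmd += ["-u", "-b", udp_bandwidth]
    return cmd


# Placeholder address: substitute the host running the iperf server.
tcp_parallel = iperf_client_command("192.0.2.10", streams=4)
udp_single = iperf_client_command("192.0.2.10", udp_bandwidth="50M")
```

The returned list can be passed to `subprocess.run` from a client on the same network segment as the end users, so the result reflects the path the remoting protocol will actually take.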
Most Common Mistakes During POCs
Not enough vCPUs assigned to a VM
Assign at least 4 vCPUs to a vGPU-enabled VM
Two vCPUs for application
One vCPU for OS and system-calls
One vCPU for remoting protocol compression
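The breakdown above adds up to four vCPUs per VM, which makes it easy to estimate how heavily a host's cores would be shared. The sketch below is illustrative only: the VM count and physical core count are hypothetical, and no acceptable ratio is implied by the session.

```python
# vCPU breakdown per vGPU-enabled VM, as listed above.
VCPU_BREAKDOWN = {
    "application": 2,
    "os_and_system_calls": 1,
    "remoting_protocol_compression": 1,
}

VCPUS_PER_VM = sum(VCPU_BREAKDOWN.values())


def vcpu_to_core_ratio(vm_count: int, physical_cores: int) -> float:
    """Return total assigned vCPUs divided by physical cores on the host."""
    if physical_cores < 1:
        raise ValueError("physical_cores must be at least 1")
    return vm_count * VCPUS_PER_VM / physical_cores


# Hypothetical host: 16 VMs on 32 physical cores.
ratio = vcpu_to_core_ratio(16, 32)
```

Here 16 VMs request 64 vCPUs on 32 cores, a ratio of 2.0. Whether that is tolerable depends on how busy each application thread is, which is what the PoC measurements should show.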
Most Common Mistakes During POCs
Not optimizing virtual machine base image
Eliminate OS-level performance inhibitors
Citrix : “TargetOSOptimizer” tool
VMware : “VMwareOSOptimizationTool”
Resources on www.nvidia.com/grid
White papers
Application guides
Deployment guides
Success stories
GRID 2.0 Datasheet and FAQ
Videos
Blogs