Hardware Considerations When Starting an AI Project
As AI and Deep Learning enthusiasts experiment with tools like Hugging Face, Google Colab, Jupyter notebooks, and other cloud notebook services, the hardware is all taken care of. But when running these AI tasks locally, any old system won’t cut it. The question now is: what do you need?
It may not be obvious at first, but the best AI hardware depends on the type of operations you plan to run, which in turn depends on dataset file type, size, and so on. It is easy to over-engineer a system, but more performance never hurt anyone, and extra headroom makes it easier to scale your model as problems become more challenging. With diligent consideration you can avoid unnecessary costs, choose the hardware that matters, and find a solution that’s optimized for both your needs and budget.
When building an AI system from scratch, it’s also important to choose an efficient power supply unit and ample fast storage (plus a large HD for cold storage if desired).
What CPUs for Training AI?
Wait a second, isn’t deep learning all about the GPU? After all, the “ImageNet Moment” of deep learning demonstrated that training a convolutional neural network like AlexNet is much more efficient on GPUs than on CPUs.
However, a GPU is more like an accelerator while the CPU is the brain that directs. The CPU plays a pivotal role in managing the overall computational workload, and its specifications can significantly impact the performance of your AI training tasks.
Our recommendations would be from these platforms:
When selecting a CPU, you should consider its Clock Speed, Core Count, Cache Size, and PCIe lanes:
- Processing Power (Clock Speed vs Core Count) - The trade-off between clock speed and core count is a big deciding factor for most configurations. More cores generally help, but high clock speeds are also important to consider. More often than not, clock speed and core count are negatively correlated: CPUs with more cores tend to run at lower clock speeds. Striking a balance, with enough cores running at a reasonably high clock speed, is best.
- Cache Size - The cache of a CPU serves as a small, high-speed memory located directly on the CPU chip, enabling quick access to frequently used data and instructions. A larger cache allows the CPU to store more data and instructions closer to the processing cores, minimizing the need to fetch information from the slower RAM. This leads to faster execution times and heightens computational efficiency, especially crucial when dealing with the iterative nature of training deep neural networks.
- PCIe Lanes - The last CPU consideration is the amount of I/O you can connect to the platform. Each processor generation usually has a set number of PCIe lanes, with more recent processors like AMD EPYC 9004 and 5th Gen Intel Xeon typically offering more lanes or faster lanes than older generations. Make sure you have enough PCIe lanes to connect your PCIe devices, including storage, networking, and most importantly GPUs. More is always good!
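To make the PCIe lane point concrete, here is a minimal sketch of a lane budget check. The lane widths per device type and the 128-lane CPU figure are illustrative assumptions, not recommendations from this article; check the spec sheets for your actual CPU, motherboard, and devices.

```python
# Illustrative lane widths per device type (assumed values).
# Real devices and motherboard slot wiring vary.
TYPICAL_LANES = {"gpu": 16, "nvme_ssd": 4, "network_card": 8}


def pcie_lanes_needed(devices):
    """Sum the lanes requested by a dict of {device_type: count}."""
    return sum(TYPICAL_LANES[kind] * count for kind, count in devices.items())


def fits_lane_budget(devices, cpu_lanes):
    """Return (fits, lanes_needed) for a planned build."""
    needed = pcie_lanes_needed(devices)
    return needed <= cpu_lanes, needed


# Hypothetical build: four GPUs, two NVMe drives, one network card.
planned_build = {"gpu": 4, "nvme_ssd": 2, "network_card": 1}

# cpu_lanes=128 is a placeholder; substitute your CPU's actual lane count.
fits, needed = fits_lane_budget(planned_build, cpu_lanes=128)
print(f"Lanes needed: {needed}, fits in budget: {fits}")
```

A check like this is only a first pass, since motherboards may share or split lanes between slots, but it can flag a build that clearly asks for more I/O than the platform offers.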
How Much RAM for AI Training?
How much RAM should be used in an AI computer? Compared to CPUs and GPUs, RAM is likely to account for a far smaller proportion of your overall system budget. Having plenty of RAM can significantly improve your experience, and prototyping on a machine with enough RAM can have a significant impact on conserving your most valuable resource: time.
Being able to load large datasets into memory without worrying about running out obviates the need for programming clever workarounds. It can also free up brain cycles for concentrating on the more interesting (and fun) problems that you’re actually interested in solving. That should be the focus rather than implementing memory workarounds like chunking. This is even more important for projects requiring significant tinkering with data pre-processing.
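For readers who have not run into it, this is roughly what a chunking workaround looks like. It is a minimal sketch using only the Python standard library; the file name, column name, and helper functions (`iter_chunks`, `chunked_mean`) are hypothetical and exist only for illustration.

```python
import csv
import os
import tempfile
from itertools import islice


def iter_chunks(path, chunk_size):
    """Yield lists of at most chunk_size rows from a CSV file."""
    with open(path, newline="") as f:
        reader = csv.DictReader(f)
        while True:
            chunk = list(islice(reader, chunk_size))
            if not chunk:
                return
            yield chunk


def chunked_mean(path, column, chunk_size=1000):
    """Compute a column mean while holding only one chunk in memory."""
    total, count = 0.0, 0
    for chunk in iter_chunks(path, chunk_size):
        total += sum(float(row[column]) for row in chunk)
        count += len(chunk)
    return total / count if count else float("nan")


# Build a tiny demo file so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "samples.csv")
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["value"])
        writer.writerows([[i] for i in range(10)])
    demo_mean = chunked_mean(path, "value", chunk_size=3)

print(demo_mean)
```

Even this simple statistic needs bookkeeping across chunks. With enough RAM, the same result is a single call on an in-memory array, which is the time saving described above.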
A good rule of thumb is to buy at least as much RAM as there is GPU memory in the system. That’s solid advice for image-processing workflows with big GPUs, but for workflows where the GPUs are slightly less important, you may opt for twice as much RAM as GPU memory. However, if a project does not depend on large datasets (for example, training is mostly done in simulation rather than on datasets), you don’t need as much memory.
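The rule of thumb above can be sketched as a small calculation. This assumes "GPU memory in a system" means the total VRAM across all GPUs; the function name and the example configuration are hypothetical.

```python
def recommended_ram_gb(vram_per_gpu_gb, num_gpus, ram_to_vram_ratio=1.0):
    """Suggest a minimum system RAM size from total GPU memory.

    ram_to_vram_ratio=1.0 matches "at least as much RAM as GPU memory";
    2.0 matches the "twice as much" variant for less GPU-centric workflows.
    """
    total_vram_gb = vram_per_gpu_gb * num_gpus
    return total_vram_gb * ram_to_vram_ratio


# Example: a workstation with two 48 GB GPUs (assumed configuration).
baseline = recommended_ram_gb(48, 2)
data_heavy = recommended_ram_gb(48, 2, ram_to_vram_ratio=2.0)
print(f"Baseline: {baseline:.0f} GB, data-heavy: {data_heavy:.0f} GB")
```

Treat the result as a floor rather than a target, and round up to the nearest configuration your motherboard supports.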
What GPUs for Training AI?
In the landscape of AI training, Graphics Processing Units (GPUs) have emerged as indispensable components, revolutionizing the speed and efficiency of complex computational tasks, particularly in deep learning. Unlike traditional CPUs, GPUs are designed to handle parallel processing with thousands of cores, making them exceptionally well-suited for the matrix and vector operations inherent in neural network training. Look for these key specifications in your next GPU:
- GPU Cores - On NVIDIA GPUs these are CUDA Cores, and on AMD GPUs they are Stream Processors. For this specification, more is always better, since the number of cores is a key indicator of how much parallel processing power a GPU has for handling multiple tasks simultaneously, which is crucial for AI workloads, deep learning tasks, and matrix operations.
- Memory Size and Bandwidth - Consider the GPU's memory bandwidth, as it directly impacts how quickly the GPU can access and manipulate data. Additionally, pay attention to the size of Video Random Access Memory (VRAM). Large models and datasets demand sufficient VRAM to avoid performance bottlenecks. Just like RAM, opt for a GPU with ample VRAM to accommodate the size of your AI models and datasets.
- Scalability - If your workload can benefit from multi-GPU calculations, check how well your chosen GPU scales across multiple cards. NVIDIA’s DGX and HGX platforms with SXM GPUs deliver extreme bandwidth and interconnectivity between all GPUs, letting them act as one huge GPU for crunching complex AI workloads. Other professional GPUs like RTX 6000 Ada and L40S have a standard dual-slot form factor that lets users harness up to four GPUs in a workstation, something the much thicker RTX 4090 does not allow.
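To get a rough feel for how much VRAM a model might need, here is a back-of-the-envelope sketch. It assumes 32-bit (4-byte) parameters and an optimizer that keeps two extra values per parameter (as Adam-style optimizers do), and it deliberately ignores activations, batch size, and framework overhead, so real usage will be higher. The function names are hypothetical.

```python
def training_memory_gb(num_params, bytes_per_param=4, optimizer_states_per_param=2):
    """Rough lower bound on training memory: weights + gradients + optimizer state.

    Assumes every value is stored at bytes_per_param precision.
    Activations and framework overhead are NOT included.
    """
    weights = num_params * bytes_per_param
    gradients = num_params * bytes_per_param
    optimizer = num_params * bytes_per_param * optimizer_states_per_param
    return (weights + gradients + optimizer) / 1e9


def fits_on_gpu(num_params, vram_gb):
    """Check whether the rough estimate fits in a single GPU's VRAM."""
    return training_memory_gb(num_params) <= vram_gb


# Example: a 1-billion-parameter model on a 24 GB GPU,
# and a 7-billion-parameter model on a 48 GB GPU.
small_model_gb = training_memory_gb(1_000_000_000)
large_model_gb = training_memory_gb(7_000_000_000)
print(f"1B params: ~{small_model_gb:.0f} GB, fits in 24 GB: {fits_on_gpu(1_000_000_000, 24)}")
print(f"7B params: ~{large_model_gb:.0f} GB, fits in 48 GB: {fits_on_gpu(7_000_000_000, 48)}")
```

When the estimate exceeds a single card's VRAM, that is a signal to look at multi-GPU scaling, lower-precision training, or a smaller model.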
For our GPU recommendations:
- For large scale enterprise data center - multiple servers of NVIDIA H100
- For small scale data center - multiple NVIDIA L40S or RTX 6000 Ada
- For high performance workstations - multiple RTX 6000 Ada
- For enthusiast level performance and tinkering - One RTX 4090 will suffice.
Concluding Thoughts for Your Artificial Intelligence Hardware
The journey of AI development and training requires more than just knowledge; it demands the right hardware to turn your aspirations into reality. Popular AI models like ChatGPT utilize hundreds of NVIDIA DGX systems, and automated driving data centers require ample GPU compute to train their models.
At SabrePC, we understand that each AI project is unique, and the hardware powering it should reflect that uniqueness. Our commitment is to empower your AI ambitions by providing not just components, but tailored solutions that align perfectly with your requirements. Whether you're configuring a high-performance server for massive-scale AI training or crafting a workstation for innovative research, SabrePC stands ready to be your trusted partner.
With an array of cutting-edge CPUs and GPUs at your disposal, along with our expertise in configuring AI-centric systems, SabrePC ensures that your hardware is not just a beefy computer but a tool for innovation. If you have any questions about hardware or deep learning solutions, contact us today!