Introduction
Bring the data center to your desk. NVIDIA Grace CPUs and Blackwell GPUs have long been reserved for enterprise deployments and the data center. No more! NVIDIA DGX Spark puts the DGX operating system and the NVIDIA Grace Blackwell Superchip in the hands of developers, engineers, and enthusiasts.
Purchase your very own NVIDIA DGX Spark from SabrePC! Alongside the NVIDIA Founder's Edition, we also work with other brands to offer their renditions of the DGX Spark, including Gigabyte, ASUS, MSI, and more.
What Can The DGX Spark Do?
The NVIDIA DGX Spark is a development platform powered by the NVIDIA GB10 Grace Blackwell Superchip, delivering one petaFLOP of AI performance in a portable NUC-style form factor. It comes preinstalled with NVIDIA's AI software stack and features 128GB of unified memory, making it ideal for AI developers.
- AI/ML Development: Fine-tuning LLMs & running local Generative AI
- Edge AI: Using it for robotics, autonomous drones, smart manufacturing, and more
- Media & Creativity: Local AI video generation, Stable Diffusion, and AI-assisted production
- Education & Research: Universities can use DGX Spark for hands-on AI curricula and research.
The NVIDIA DGX Spark is a highly portable AI workstation you can bring with you anywhere. Believe it or not, the power brick is almost the same size as the system itself. Bring DGX Spark with you to the studio, the coffee shop, or the lab, and run projects wherever you go. This deployment flexibility is a huge reason why many teams love this system.
The NVIDIA DGX Spark sports an all-gold chassis with an organic mesh for the front and rear, modeled after the original DGX-1 shipped back in 2016. It’s a small 6” x 6” x 2” box sporting 4 Type-C ports (one for power), HDMI out, and networking via Ethernet and QSFP.
DGX Spark Specifications
| Specification | DGX Spark |
| --- | --- |
| GPU | Blackwell GPU |
| CPU | 20 Arm cores: 10 Cortex-X925 & 10 Cortex-A725 |
| CUDA Cores | 6,144 |
| Tensor Performance | 1,000 AI TOPS |
| FP32 Performance | 31 TFLOPS |
| System Memory | 128GB LPDDR5 Unified Memory |
| Memory Interface | 256-bit |
| Memory Bandwidth | 273 GB/s |
| Storage | 4TB NVMe M.2 |
| Networking | 10GbE & 2-port ConnectX-7 @ 200Gbps |
| TDP & Power Supply | 140W TDP with 240W power brick |
| Display | 1x HDMI 2.1a |
- NVIDIA GB10 Grace Blackwell Superchip: Up to 1 petaFLOP of AI performance at FP4 precision and 31 TFLOPS at FP32.
- 128GB of unified CPU-GPU memory: Unified memory removes the data-transfer bottleneck between CPU and GPU, so developers can prototype, fine-tune, and run inference locally without shuttling data between separate memory pools.
- NVIDIA ConnectX networking for clustering, plus NVIDIA NVLink-C2C linking the Grace CPU and Blackwell GPU at 5x PCIe bandwidth. Developers can connect multiple DGX Spark units for increased compute performance.
Interestingly enough, the DGX Spark's GPU matches the RTX 5070's CUDA core count (6,144) and FP32 performance (31 TFLOPS).
NVIDIA DGX Spark AI Performance
Instead of relying on cloud computing and API calls, DGX Spark delivers surprisingly strong performance in fine-tuning, image generation, data science, and inference. NVIDIA has published raw throughput benchmarks for the DGX Spark.
DGX Spark for Local LLM Inference
The DGX Spark delivers impressive token generation speed across a wide variety of models. Unlike currently accessible GPUs, such as the RTX PRO 6000 with 96GB of VRAM, the DGX Spark offers 128GB of unified CPU-GPU memory. This allows you to run larger models locally, even if it means slightly slower performance. In our professional opinion, anything above 20 tokens per second is highly usable.
| Model | Precision | Backend | Prompt processing throughput (tokens/sec) | Token generation throughput (tokens/sec) |
| --- | --- | --- | --- | --- |
| Qwen3 14B | NVFP4 | TRT-LLM | 5,928.95 | 22.71 |
| GPT-OSS-20B | MXFP4 | llama.cpp | 3,670.42 | 82.74 |
| GPT-OSS-120B | MXFP4 | llama.cpp | 1,725.47 | 55.37 |
| Llama 3.1 8B | NVFP4 | TRT-LLM | 10,256.90 | 38.65 |
| Qwen2.5-VL-7B-Instruct | NVFP4 | TRT-LLM | 65,831.77 | 41.71 |
| Qwen3 235B (on dual DGX Spark) | NVFP4 | TRT-LLM | 23,477.03 | 11.73 |
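To see what these two throughput figures mean in practice, here is a quick back-of-envelope latency estimate using the Qwen3 14B numbers from the table above. The prompt and generation lengths (2,048 and 512 tokens) are illustrative assumptions, not part of NVIDIA's benchmark:

```python
# Rough end-to-end latency for a single request: prefill time (prompt
# processing) plus decode time (token generation), using the Qwen3 14B
# NVFP4/TRT-LLM figures from the table above.

def request_latency(prompt_tokens, gen_tokens, pp_tps, tg_tps):
    """Seconds to ingest the prompt plus generate the response."""
    return prompt_tokens / pp_tps + gen_tokens / tg_tps

prefill = 2048 / 5928.95   # ~0.35 s to process a 2,048-token prompt
decode = 512 / 22.71       # ~22.5 s to generate 512 tokens
total = request_latency(2048, 512, 5928.95, 22.71)
print(f"prefill {prefill:.2f}s, decode {decode:.2f}s, total {total:.2f}s")
```

As the numbers show, generation time dominates, which is why the token generation column is the one to watch for interactive use.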
DGX Spark for Fine-Tuning
Fine-tuning LLMs is an extremely common AI developer task. With the portability of DGX Spark, developers can leverage powerful compute in a dense, on-the-go workstation. This system puts custom AI models in the hands of everyone—from enthusiasts to developers to businesses.
| Model | Method | Backend | Configuration | Peak tokens/sec |
| --- | --- | --- | --- | --- |
| Llama 3.2 3B | Full fine-tuning | PyTorch | Sequence length: 2048 | 82,739.20 |
| Llama 3.1 8B | LoRA | PyTorch | Sequence length: 2048 | 53,657.60 |
| Llama 3.3 70B | QLoRA | PyTorch | Sequence length: 2048 | 5,079.04 |
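A rough memory estimate helps explain why each model in the table above is paired with its particular method. The bytes-per-parameter figures below are common rules of thumb (BF16 weights and gradients plus FP32 Adam state for trained parameters, ~0.5 bytes/param for 4-bit quantized frozen weights), not measured values, and the 1% adapter fraction is an assumption:

```python
# Back-of-envelope memory estimates for the three fine-tuning setups.
GB = 1024**3

def full_ft_gb(params):
    # ~16 bytes per trained param: BF16 weights + grads, FP32 Adam moments
    # and master weights.
    return params * 16 / GB

def lora_gb(params, adapter_frac=0.01):
    # Frozen BF16 base weights; full optimizer state only for the adapter.
    return (params * 2 + params * adapter_frac * 16) / GB

def qlora_gb(params, adapter_frac=0.01):
    # 4-bit quantized frozen base weights (~0.5 byte/param) + adapter state.
    return (params * 0.5 + params * adapter_frac * 16) / GB

print(f"Llama 3.2 3B, full fine-tuning: ~{full_ft_gb(3e9):.0f} GB")
print(f"Llama 3.1 8B, LoRA:             ~{lora_gb(8e9):.0f} GB")
print(f"Llama 3.3 70B, QLoRA:           ~{qlora_gb(70e9):.0f} GB")
```

All three estimates land comfortably inside the DGX Spark's 128GB of unified memory, which is exactly what makes a 70B QLoRA run feasible on a desktop-sized box.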
DGX Spark for Image Generation
The DGX Spark's large GPU memory and powerful compute enable you to work with higher-resolution images and higher-precision models for better image quality. FP4 data format support allows fast image generation, even at those high resolutions.
| Model | Precision | Backend | Configuration | Images/min |
| --- | --- | --- | --- | --- |
| Flux.1 12B Schnell | FP4 | TensorRT | Resolution: 1024×1024 | 23 |
| SDXL 1.0 | BF16 | TensorRT | Resolution: 1024×1024 | 7 |
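For a more intuitive read on the table above, the images-per-minute figures convert directly into per-image latency:

```python
# Convert the benchmark's images/min figures into seconds per image.
def sec_per_image(images_per_min):
    return 60 / images_per_min

print(f"Flux.1 12B Schnell (FP4): ~{sec_per_image(23):.1f} s/image")
print(f"SDXL 1.0 (BF16):          ~{sec_per_image(7):.1f} s/image")
```

That works out to roughly 2.6 seconds per 1024×1024 Flux image versus about 8.6 seconds for SDXL, fast enough for interactive iteration on prompts.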
FAQ about NVIDIA DGX Spark
What makes the DGX Spark different from other AI development systems?
DGX Spark brings NVIDIA’s enterprise-grade DGX architecture into a compact, developer-friendly form factor. Powered by the Grace Blackwell Superchip, it delivers up to 1 petaFLOP of AI performance in a small, portable, personal data center for AI developers and researchers.
Who is the DGX Spark designed for?
DGX Spark is built for AI developers, engineers, researchers, and enthusiasts who want local, high-performance compute without relying solely on cloud resources. It’s perfect for model prototyping, fine-tuning, and on-device inference.
What software comes preinstalled on the DGX Spark?
DGX Spark ships with NVIDIA’s full AI software stack, including CUDA, cuDNN, TensorRT, and preconfigured Docker containers fully compatible with popular frameworks like PyTorch, TensorFlow, and JAX. DGX Spark runs the DGX OS, giving users direct access to the same environment used in NVIDIA’s data center systems.
Can I cluster multiple DGX Spark units together?
Yes. With built-in ConnectX-7 networking (200 Gbps), you can connect multiple DGX Sparks for parallel processing, small-scale cluster simulations, or distributed model training.
How does the DGX Spark handle thermal and power efficiency?
Despite its compact 6” x 6” chassis, DGX Spark is engineered with an optimized cooling solution and an efficient 140W TDP design. The Grace CPU and Blackwell GPU work in tandem for high performance per watt, allowing silent and reliable operation even under heavy AI workloads.
Where can I purchase the DGX Spark?
DGX Spark is available through SabrePC, offering NVIDIA Founder’s Edition models as well as partner versions from Gigabyte, ASUS, MSI, and more. Visit SabrePC’s product page to explore configurations and availability.
Conclusion
The NVIDIA DGX Spark represents a new era of accessible AI compute. For the first time, developers, researchers, and creators can harness the power of the Grace Blackwell Superchip, previously reserved for data centers, right from their desks. Whether you are fine-tuning models, experimenting with generative AI, or building intelligent edge applications, DGX Spark delivers uncompromising performance in a compact, portable form factor.
With unified memory, enterprise-grade networking, and NVIDIA’s robust AI software ecosystem, the DGX Spark is not just another development box. It is a portable personal AI powerhouse built for innovation and AI democratization.
Bring the data center to your desk and take control of your AI workflow today. Get your DGX Spark from SabrePC and start building.
