There has been a lot of excitement and anticipation this year for the official release of the NVIDIA DGX Spark™. First announced at NVIDIA GTC 2025 in March, the DGX Spark is a small form factor desktop computer that I think will change the CUDA ecosystem dramatically, especially for data scientists and AI researchers. It’s a big milestone: NVIDIA CEO Jensen Huang even hand-delivered one of the first units to Elon Musk at SpaceX, much like he did with the original DGX-1 to OpenAI back in 2016. In this article, we’re going to take a quick tour of the novel hardware capabilities and why they are interesting to a data scientist or AI developer working in Python.

 

NVIDIA DGX Spark

Why Is the NVIDIA DGX Spark Exciting?

The DGX Spark is a unique combination of features:

    • An extremely small-footprint desktop system that fits within a 6” x 6” x 2” (150 mm x 150 mm x 50 mm) envelope.

    • A heterogeneous ARM CPU with 10 high-performance Cortex-X925 cores and 10 power-efficient Cortex-A725 cores.

    • A Blackwell generation NVIDIA GPU with 6144 CUDA Cores, 5th generation Tensor Cores, and 4th generation RT cores.

    • A unified memory architecture with 128 GB memory simultaneously accessible from the CPU and GPU.

    • A high-speed networking chip providing both a traditional 10-gigabit Ethernet port (RJ-45 jack) and two QSFP Ethernet ports that share 200 Gbps of bandwidth between them, with remote direct memory access (RDMA) support.

We have never had a GPU with this much memory and it also runs all the CUDA-compatible software we’re familiar with. Even better, this is a full computer, with a fast CPU that can collaborate directly with the GPU in hybrid calculations via that unified memory. Previously, you would have needed a rackmounted server in a data center to get this capability, but now it fits on your desk.

The ability of the CPU and the GPU to collaborate more closely also impacts the two traditional guidelines for when CUDA is effective at solving a parallel computing problem:

    1. The working data set should fit within the GPU memory (not the CPU memory).

    2. Once the data is on the GPU, it should stay there and not be moved back to the CPU unless absolutely necessary.

The unified memory on the DGX Spark changes both of these guidelines. The CPU memory is the same as the GPU memory and is much larger than any other discrete GPU available in a desktop. That means much larger datasets and bigger models can be run locally than would be possible otherwise. Additionally, moving data between CPU and GPU does not require any data copies, because the memory space is the same. It is now practical for a calculation to bounce between the CPU and the GPU, or use both CPU and GPU at the same time, depending on what is needed. The GPU will always be the best option for data parallel tasks, but sometimes it is easier (or necessary) to assign some work to the CPU, and that’s where it is helpful to have 20 CPU cores to lean on.
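To sketch what this hybrid style can look like in Python, here is a hedged example that routes array work through CuPy's managed-memory allocator when a CUDA device is present and falls back to NumPy otherwise. The `malloc_managed` allocator is real CuPy API; whether it actually avoids copies depends on the platform, so treat this as a sketch of the pattern, not a benchmark.

```python
import numpy as np

try:
    import cupy as cp
    cp.cuda.runtime.getDeviceCount()  # raises if no CUDA device is present
    # On unified-memory systems like the DGX Spark, allocating through
    # cudaMallocManaged lets the same buffer be touched by CPU and GPU.
    cp.cuda.set_allocator(cp.cuda.MemoryPool(cp.cuda.malloc_managed).malloc)
    xp = cp
except Exception:  # no GPU stack available: run the same code on the CPU
    xp = np

def standardize(a):
    """Column-standardize a 2-D array on whichever backend is active."""
    a = xp.asarray(a)
    return (a - a.mean(axis=0)) / a.std(axis=0)

data = np.random.default_rng(0).normal(size=(1000, 8))
result = standardize(data)
backend = "cupy" if xp is not np else "numpy"
```

The same `standardize` function runs unchanged on either backend; on a unified-memory system the interesting part is that the managed buffer never needs an explicit host-to-device copy.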

From a user perspective, the big draw of the DGX Spark is the ability to run larger AI models and analyze larger data sets locally, where it can be easier to do development and to get permission to access internal data sources and services. For IT buyers, the DGX Spark is small and inexpensive enough to allocate to individual employees, giving each a guaranteed resource allocation at a fixed cost. There’s no need to figure out how to split a larger server between users or to worry about cost controls and tracking for cloud GPU spending. For the right kinds of situations, systems like the DGX Spark can be a real win for both users and IT.

Getting Started with Python on the DGX Spark

As mentioned above, the DGX Spark uses an ARM-based CPU and comes preinstalled with Linux, specifically a customized version of the Ubuntu 24.04 LTS series. Note that Linux is the only supported operating system for the DGX Spark. When you unbox the DGX Spark, you can either:

    1. Plug a keyboard, mouse, and monitor directly into the system and set up your user account and networking credentials.

    2. Go through a first-time headless setup process where the DGX Spark temporarily becomes a WiFi hotspot you connect to directly with a web browser to configure your user account and networking credentials.

The second option is how you can use the DGX Spark as a “sidecar computer” to your primary laptop or desktop system. I suspect most users will opt for this arrangement, but both options are possible. Note that if you use the DGX Spark as a sidecar computer, you will need an SSH client to connect to it, along with familiarity with the ssh and scp commands and general Linux command-line tools once you log in.

Ubuntu includes a system version of Python (3.12), but if you want to easily switch between different versions of Python, as well as different versions of other packages, we recommend installing a conda-based distribution like Anaconda. You can get the full Anaconda distribution or the minimalist Miniconda distribution at anaconda.com/download. Once you get to the installer download page, select the Linux 64-bit ARM64 Installer. (You may see linux-aarch64 mentioned in various places. This is the conda platform name, and is just the more formal name for ARM64.) Copy the installer to the DGX Spark, run it, and it will create a base environment ready for you to use.

Once installed, you can use conda to create environments and install packages, just like you would on any other system. For example, you can create a Python 3.13 environment with NumPy and SciPy this way:

conda create -n spark_test python=3.13 numpy scipy

And then you can activate the environment when you need it with:

conda activate spark_test

Anaconda has a large selection of packages for ARM64, so you should be able to find your favorite CPU-based software in the repository on day one. (Note that the GPU package selection will be limited for a few months. We’ll talk about this in the GPU section.)

Working with Heterogeneous CPU Cores

With 20 ARM cores, the DGX Spark could also be the fastest computer you own for running CPU code. After you log into the DGX Spark, you can inspect the CPU configuration with the lscpu command. There you’ll see the 10 Cortex-X925 (“performance”) cores listed with a peak clock rate of 4 GHz, along with the 10 Cortex-A725 (“efficiency”) cores listed with a peak clock rate of 2.8 GHz. If you start Python and ask it how many CPU cores you have, it will count both kinds of cores and report 20:

$ python
Python 3.13.7 | packaged by Anaconda, Inc. | (main, Sep 9 2025, 20:01:03) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.cpu_count()
20
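The interpreter counts both core types together. To tell the performance and efficiency cores apart programmatically, one option on Linux is to read each core's maximum frequency from sysfs. A hedged sketch follows; the sysfs paths are Linux-specific and may be absent in containers or on other systems, and the example frequencies in the docstring are simply the 4 GHz and 2.8 GHz figures quoted above expressed in kHz.

```python
from collections import Counter
from pathlib import Path

def core_speed_groups():
    """Group logical CPUs by their reported maximum frequency (kHz).

    On a DGX Spark this would plausibly look like
    Counter({4000000: 10, 2800000: 10}); where cpufreq data is
    unavailable, it returns an empty Counter.
    """
    groups = Counter()
    pattern = "cpu[0-9]*/cpufreq/cpuinfo_max_freq"
    for path in Path("/sys/devices/system/cpu").glob(pattern):
        try:
            groups[int(path.read_text().strip())] += 1
        except (OSError, ValueError):
            pass  # core offline or unreadable entry: skip it
    return groups

print(core_speed_groups())
```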

This means that Python packages that use the core count to determine how many worker threads or processes to start (like Dask and others) will default to this number. This is usually fine, because the efficiency cores can do useful work, even if they are slower than the performance cores. Note that because of the speed difference between the cores, you will want to ensure there is some form of dynamic scheduling in your application that can load balance between the different core types. If dynamic scheduling is not available in your particular situation, then it may be advantageous to limit your application to 10 threads or processes (for example with the OMP_NUM_THREADS environment variable). Otherwise you might find the efficiency cores holding up the entire program after the performance cores finish their equal share of the work.

Overall, using the CPU cores on the DGX Spark is just like any other system. Your preferred parallel Python execution techniques, like multiprocessing pools, Dask workers, OpenMP, and others will just work as you expect, and let you take full advantage of this computer for workloads that aren’t running on the GPU.

The Potential for Big GPU Memory

Of course, the primary use case for the DGX Spark is running bigger AI models. Huge foundation models get a lot of attention in the media, but are essentially inaccessible for self-hosted applications on a single computer. On the other end of the spectrum, there’s a lot of research in expanding the capabilities of small “on-device” models that fit in 16 GB of memory or less. The DGX Spark is part of an emerging “middle scale” hardware target for AI that can efficiently run models that fit into ~50-100 GB of memory. As model technology improves, this scale has the greatest potential to bring practical coding assistants to your desktop, without having to rely on cloud-based services.
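A quick back-of-the-envelope check makes this "middle scale" concrete: weight memory is roughly the parameter count times the bytes per parameter, plus overhead for activations and KV cache. A hedged sketch, where the 20% overhead multiplier is an illustrative assumption rather than a measurement:

```python
def weight_memory_gb(params_billion, bits_per_param, overhead=1.2):
    """Rough memory estimate for a model's weights plus runtime overhead.

    params_billion: parameter count in billions
    bits_per_param: 16 for fp16/bf16, 8 or 4 for common quantizations
    overhead: multiplier for activations/KV cache (illustrative guess)
    """
    weight_bytes = params_billion * 1e9 * bits_per_param / 8
    return weight_bytes * overhead / 1e9

# A 70B-parameter model under this estimate:
for bits in (16, 8, 4):
    print(bits, round(weight_memory_gb(70, bits)))  # 168, 84, 42 GB
```

By this estimate, a 70B-parameter model overflows 128 GB at fp16 but fits comfortably at 8-bit or 4-bit quantization, which is exactly the range the DGX Spark's memory targets.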

The other huge benefit of the DGX Spark is of course its CUDA support. CUDA is a first-class platform for virtually all AI-related Python packages, and support for CUDA has grown organically thanks to a large install base of CUDA-capable devices stretching back more than a decade. Common inference packages like llama.cpp, Ollama, vLLM, and others just need their CPU code to be recompiled for ARM64, and they are ready to go on the DGX Spark with existing CUDA support.

As an extreme example, we were able to get the 480-billion-parameter version of the Qwen3-coder model to run with help from the quantization tutorial provided by the Unsloth project. This meant compiling a bleeding-edge version of llama.cpp, which required no changes to build on an ARM platform, highlighting how easy it is to move AI development from an x86 system to the DGX Spark. A model this large is not practical for regular use, but models with tens of billions of parameters are very easy to use on a system like this. As devices with 128 GB of GPU memory become more popular, I expect we’ll see more models released in the 50-100 billion parameter range to take advantage of this emerging GPU category.

Anaconda today supports several popular AI packages on x86 CUDA, and we are working to add ARM builds of these same packages (along with growing our coverage of popular AI packages) to our distribution. Look for that expanded CUDA support on ARM to come out in Q1 of 2026.

The Future of Hybrid Computing

I’m fascinated by the DGX Spark because of what it represents: seamless CPU and GPU collaboration on very large AI, data analysis, and simulation tasks. Previously, the technologies required to make this possible were confined to the data center, but the DGX Spark brings them to your desktop with the full support of the CUDA software stack you are already using. I believe broader availability of unified memory to more CUDA developers will inspire more use cases for CUDA in situations where it was not practical before. With 128 GB of memory, one can imagine doing local development and fine tuning on models with tens of billions of parameters or analyzing a data frame with a billion rows directly on your local system. In future blog posts, we’ll dive into these use cases in more detail, to show you what becomes possible with this large amount of memory.

Local AI and data science are powerful paradigms with advantages that are sometimes overlooked as so many workloads move to the cloud. The DGX Spark is carving out a new computer category that is different from laptop, desktop, or server. When used without a monitor, it can function as an “AI sidecar” that sits next to your laptop, providing the horsepower required to tackle larger compute tasks without compromising the weight and battery life of the laptop you carry around. This is an unexpected realization from my time with the DGX Spark: this computer makes me want to downsize my laptop and offload my heavier compute tasks to the DGX Spark instead.

I’m also looking forward to seeing the DGX Spark in the hands of more open-source developers in the PyData ecosystem. Unifying GPU and CPU memory opens up myriad innovation possibilities and also requires changing some assumptions (e.g., a NumPy array on the CPU does not always need to be copied to the GPU to pass to a GPU compute kernel, and so on). This will take some time to work through, so I expect we’ll see this system motivate a variety of projects to improve their utilization of unified memory in CUDA over time.

Finally, we’re excited at Anaconda to grow our support of the DGX Spark platform, as well as NVIDIA’s related data center Grace Hopper and Grace Blackwell systems. We have full ARM support on the CPU ready today, and will be working to bring more CUDA packages over to our ARM package repository over the coming months. 

Stay Informed

Stay tuned for updates on availability of more CUDA + ARM packages by signing up for our newsletter. For now, read the latest announcement about Anaconda + NVIDIA’s strategic partnership and find out how to standardize, secure, and scale AI initiatives with the Anaconda AI Platform.

For more details on the DGX Spark launch, including complete technical specifications and partner system availability, check out NVIDIA’s official press release.