For decades, computational chemistry has faced a tug-of-war between accuracy and speed. Ab initio methods like density functional theory (DFT) provide high fidelity but are computationally expensive, limiting researchers to systems of a few hundred atoms. Conversely, classical force fields are fast but often lack the chemical accuracy required for complex bond-breaking or transition-state analysis.
Machine learning interatomic potentials (MLIPs) have emerged as the bridge, offering quantum accuracy at classical speeds. However, the software ecosystem is a new bottleneck. While the MLIP models themselves run on GPUs, the surrounding simulation infrastructure often relies on legacy CPU-centric code.
NVIDIA ALCHEMI (AI Lab for Chemistry and Materials Innovation) helps to address these challenges by accelerating chemicals and materials discovery with AI. We have previously announced two components of the ALCHEMI portfolio:
- ALCHEMI NIM microservices: Scalable, cloud-ready microservices for AI-accelerated batched atomistic simulations in chemistry and materials science
- ALCHEMI Toolkit-Ops: A set of foundational GPU kernels designed to accelerate the calculations behind simulations, such as neighbor lists, dispersion corrections, and electrostatics
Today, we are introducing the NVIDIA ALCHEMI Toolkit, a collection of GPU-accelerated simulation building blocks that incorporates and expands on ALCHEMI Toolkit-Ops. ALCHEMI Toolkit is designed to manage the data flow between accelerated chemistry and materials domain-specific kernels and deep learning models. ALCHEMI Toolkit extends beyond individual models and kernels to provide a modular, PyTorch-native structure for researchers and developers to compose custom simulation workflows.
Figure 1 shows the ALCHEMI architectural stack and product features supported in this initial release of ALCHEMI Toolkit, including expanded functionality in Toolkit-Ops. This release includes capabilities for geometry relaxation and molecular dynamics, and the supporting pipeline infrastructure for combining multiple simulation workflows.
Figure 1. NVIDIA ALCHEMI Toolkit is a collection of GPU-accelerated simulation building blocks to enable large-scale, batched simulations with AI
ALCHEMI Toolkit is not just a collection of scripts. It’s designed to enable researchers and developers to build custom, performant atomistic simulation workflows with ease.
Expanding ALCHEMI Toolkit-Ops
ALCHEMI Toolkit leverages the capabilities of Toolkit-Ops to handle the underlying calculations of the simulations. The previous release included several key operations:
- Neighbor list constructions
- DFT-D3 dispersion corrections
- Long-range electrostatic interactions
This release broadens the scope of common operations addressed to include:
- Batched dynamics kernels
- JAX support (for v0.2.0 release features)
Integration with the atomistic simulation ecosystem
ALCHEMI Toolkit is designed to integrate seamlessly with the broader atomistic simulation ecosystem. We’re excited to announce the following integrations with leading platforms in the chemistry and materials science community.
Orbital
Orbital develops advanced AI foundation models used to accelerate the discovery of novel cooling systems for data centers and sustainable materials. Orbital has integrated ALCHEMI Toolkit into their new OrbMolv2 model to drastically reduce the time required for inference. The new model will leverage ALCHEMI Toolkit components such as PME electrostatics for periodic Coulomb interactions and the MTK integrator for batched constant-pressure molecular dynamics. The existing Orb models already leverage Toolkit-Ops for GPU-accelerated graph construction, providing a ~1.7x acceleration for large systems and ~33x for batched smaller systems with TorchSim support.
Materials Graph Library (MatGL)
MatGL is an open source framework for state-of-the-art graph-based MLIPs. ALCHEMI Toolkit is integrating with the MatGL TensorNet model to significantly accelerate materials simulation and property prediction workflows. By leveraging ALCHEMI Toolkit GPU-native kernels and batching infrastructure, MatGL users can achieve higher computational efficiency and lower memory consumption for simulations at scale.
Matlantis
Matlantis enables rapid materials discovery by combining universal MLIPs with high-performance cloud computing. Matlantis is actively exploring the ALCHEMI Toolkit and identifying where its composable dynamics can deliver the greatest value for industrial materials simulation customers. This builds on its proven integration of ALCHEMI Toolkit-Ops—including Warp-optimized neighbor list construction and DFT-D3 dispersion corrections—which significantly reduces computational overhead of atomistic interactions with speedups of up to 10x.
Furthermore, by evaluating specific components within ALCHEMI Toolkit, this collaboration has the potential to enable Matlantis to move beyond single-structure optimization to high-throughput, parallel relaxation of millions of molecular configurations. Ultimately, this integration aims to further power small-scale research and industrial-scale materials design, accelerating chemical evaluation with unparalleled GPU efficiency.
How to get started with ALCHEMI Toolkit
This section covers the system requirements and installation steps needed to get started with ALCHEMI Toolkit.
System and package requirements
- Python ≥3.11, <3.14
- PyTorch ≥2.8
- CUDA Toolkit 12+, NVIDIA driver 470.57.02+
- Operating System: Linux (primary), macOS
- NVIDIA GPU (RTX 20xx or newer), CUDA Compute Capability ≥ 7.0
- Minimum 4 GB RAM (16 GB recommended for large systems)
Installation
Use the following code to install ALCHEMI Toolkit:
For more information, reference the NVIDIA/nvalchemi-toolkit GitHub repo and the ALCHEMI Toolkit documentation.
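Before installing, it can help to confirm that an environment meets the requirements listed above. The following is a minimal sketch of such a check, not part of ALCHEMI Toolkit: the helper names (`parse_version`, `python_ok`, `torch_ok`, `compute_capability_ok`, `check_environment`) are made up for illustration, and only the version bounds come from the requirements list. The PyTorch checks are skipped when PyTorch is not installed.

```python
import importlib.util
import sys


def parse_version(text):
    """Turn a version string such as '2.8.0+cu128' into a tuple like (2, 8, 0)."""
    numbers = []
    for piece in text.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        numbers.append(int(digits))
    return tuple(numbers)


def python_ok(version_info=sys.version_info):
    """Python >=3.11 and <3.14, as stated in the requirements."""
    return (3, 11) <= tuple(version_info[:2]) < (3, 14)


def torch_ok(version_text):
    """PyTorch >=2.8, as stated in the requirements."""
    return parse_version(version_text)[:2] >= (2, 8)


def compute_capability_ok(major, minor):
    """CUDA Compute Capability >= 7.0, as stated in the requirements."""
    return (major, minor) >= (7, 0)


def check_environment():
    """Return a dict of True/False results; None means 'could not be checked'."""
    report = {"python": python_ok(), "torch": None, "gpu": None}
    if importlib.util.find_spec("torch") is None:
        return report  # PyTorch is not installed in this environment
    import torch

    report["torch"] = torch_ok(torch.__version__)
    if torch.cuda.is_available():
        report["gpu"] = compute_capability_ok(*torch.cuda.get_device_capability(0))
    return report


if __name__ == "__main__":
    for name, status in check_environment().items():
        print(f"{name}: {status}")
```

The CUDA Toolkit and driver versions are not checked here; those are easier to read from `nvidia-smi` output.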
This section dives into four core ALCHEMI Toolkit features: customizable batched simulation workflows, build-your-own dynamics classes, model wrappers, and advanced data management. These features provide researchers and developers with the tools and flexibility needed to create bespoke end-to-end workflows that maximize efficiency and performance on NVIDIA GPUs.
Customizable batched simulation workflows
The distinctive feature of the NVIDIA ALCHEMI Toolkit is the GPU-native batched dynamics engine. No single MLIP model is perfect for every chemical environment, especially when dealing with nonlocal, long-range interactions.
ALCHEMI Toolkit enables researchers to combine modular chemistry and materials science domain-specific kernels and models into customized simulation workflows. This architecture supports the development of specialized compute workflows and running virtual laboratories with millions of concurrent atomic interactions without the latency of traditional software stacks.
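The idea of combining a model with physics-based terms can be illustrated without any particular library. The sketch below is a pure-Python toy, not the ALCHEMI Toolkit API: `HarmonicBonds` stands in for a short-range MLIP, `CoulombCorrection` stands in for a long-range correction (in units where the Coulomb constant is 1), and `SumCalculator` adds their energies and forces. All three class names are hypothetical.

```python
import math


def pair_terms(positions):
    """Yield (i, j, displacement r_j - r_i, distance) for each unique pair."""
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            d = [b - a for a, b in zip(positions[i], positions[j])]
            yield i, j, d, math.sqrt(sum(c * c for c in d))


def apply_pair_force(forces, i, j, d, scale):
    """Add scale * d to atom j and the opposite force to atom i."""
    for axis in range(3):
        forces[j][axis] += scale * d[axis]
        forces[i][axis] -= scale * d[axis]


class HarmonicBonds:
    """Stand-in for a short-range model: E = 0.5 * k * (r - r0)^2 per pair."""

    def __init__(self, k=1.0, r0=1.0):
        self.k, self.r0 = k, r0

    def __call__(self, positions):
        energy, forces = 0.0, [[0.0] * 3 for _ in positions]
        for i, j, d, r in pair_terms(positions):
            energy += 0.5 * self.k * (r - self.r0) ** 2
            # dE/dr = k (r - r0), so the force on j is -dE/dr * d / r
            apply_pair_force(forces, i, j, d, -self.k * (r - self.r0) / r)
        return energy, forces


class CoulombCorrection:
    """Stand-in for a long-range term: E = q_i * q_j / r per pair."""

    def __init__(self, charges):
        self.charges = charges

    def __call__(self, positions):
        energy, forces = 0.0, [[0.0] * 3 for _ in positions]
        for i, j, d, r in pair_terms(positions):
            qq = self.charges[i] * self.charges[j]
            energy += qq / r
            # dE/dr = -qq / r^2, so the force on j is qq / r^3 * d
            apply_pair_force(forces, i, j, d, qq / r**3)
        return energy, forces


class SumCalculator:
    """Compose any number of calculators by summing energies and forces."""

    def __init__(self, *terms):
        self.terms = terms

    def __call__(self, positions):
        total_energy = 0.0
        total_forces = [[0.0] * 3 for _ in positions]
        for term in self.terms:
            energy, forces = term(positions)
            total_energy += energy
            for atom, force in enumerate(forces):
                for axis in range(3):
                    total_forces[atom][axis] += force[axis]
        return total_energy, total_forces


calculator = SumCalculator(HarmonicBonds(k=1.0, r0=1.0), CoulombCorrection([1.0, -1.0]))
energy, forces = calculator([(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)])
print(energy, forces)
```

Because every term returns the same `(energy, forces)` pair, swapping the stand-in model for a real one or adding a further correction does not change the calling code.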
Capabilities
- Composable calculators combining MLIPs with physics-based corrections
- High-performance wrappers (MACE, TensorNet, AIMNet2)
API example
The following example constructs the data, sets up the MLIP, and configures a FIRE2 geometry optimization that is then used as a starting point for velocity Verlet (microcanonical) dynamics:
You can run and scale the simulation pipelines in one of two ways: on a single GPU or across multiple CPUs and GPUs.
Run and scale the pipeline on a single GPU: The FusedStage class is formed by “adding” two or more dynamics objects together. This enables wrapping the end-to-end workflow in torch.compile and sharing CUDA stream contexts.
With this approach, you can build simulation workflows in which each sample in the batch moves on to the next step as soon as it converges, making efficient use of your GPU.
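The per-sample handoff described here can be sketched in plain Python. This is a toy illustration of the scheduling idea only and does not use `FusedStage` or any ALCHEMI class: each sample is a one-dimensional particle in a harmonic well, relaxation is plain gradient descent rather than FIRE2, and the function name `run_fused` and all of its parameters are hypothetical.

```python
def force(x, k=1.0):
    """Force on a unit-mass particle in a harmonic well centered at 0."""
    return -k * x


def run_fused(x0_batch, tol=1e-3, lr=0.2, dt=0.05, md_steps=100, v0=1.0,
              max_iters=10_000):
    """Relax each sample, then run velocity Verlet on it as soon as it converges."""
    n = len(x0_batch)
    x = list(x0_batch)
    v = [0.0] * n
    stage = ["relax"] * n          # per-sample stage flag
    md_done = [0] * n              # MD steps completed per sample
    handoff_iter = [None] * n      # iteration at which each sample left relaxation

    for iteration in range(max_iters):
        if all(s == "done" for s in stage):
            break
        for i in range(n):
            if stage[i] == "relax":
                f = force(x[i])
                if abs(f) < tol:
                    # Converged: hand this sample to the dynamics stage now,
                    # without waiting for the rest of the batch.
                    stage[i], v[i], handoff_iter[i] = "md", v0, iteration
                else:
                    x[i] += lr * f
            elif stage[i] == "md":
                # One velocity Verlet step (unit mass).
                v_half = v[i] + 0.5 * dt * force(x[i])
                x[i] += dt * v_half
                v[i] = v_half + 0.5 * dt * force(x[i])
                md_done[i] += 1
                if md_done[i] == md_steps:
                    stage[i] = "done"
    return x, v, handoff_iter, stage


x, v, handoff_iter, stage = run_fused([0.1, 10.0])
print(handoff_iter, stage)
```

The sample that starts closer to the minimum begins its dynamics earlier, so no sample idles while others are still relaxing. On a GPU the inner loop over samples would be a masked, batched tensor update rather than a Python loop.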
Run and scale the pipeline across multiple CPUs and GPUs: The second approach is to distribute the pipeline across multiple CPUs/GPUs. Using the pipe operator on two dynamics classes will then distribute the FIRE2 optimization onto one GPU, and the velocity Verlet integration on another.
While this example is deliberately simplified for illustrative purposes, the abstraction lets you scale a pipeline up to multiple GPUs on a node and out to multiple nodes, accommodating arbitrarily large datasets and numbers of ranks.
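To make the two composition styles concrete, the sketch below shows how operator overloading can express them in Python. It is an assumption-laden toy rather than the ALCHEMI implementation: the classes `Stage`, `Fused`, and `Pipeline` are hypothetical, `+` is used for fusing and `|` is assumed to be the pipe operator, and worker threads with queues stand in for separate devices.

```python
import queue
import threading

_STOP = object()  # sentinel that tells a worker there is no more input


class Stage:
    """A named processing step; fn maps one item to one item."""

    def __init__(self, name, fn):
        self.name, self.fn = name, fn

    def __add__(self, other):
        return Fused([self, other])       # run back to back in one worker

    def __or__(self, other):
        return Pipeline([self, other])    # run in separate workers

    def run(self, items):
        return [self.fn(item) for item in items]


class Fused(Stage):
    """Several stages applied one after another inside a single worker."""

    def __init__(self, stages):
        self.stages = stages
        self.name = "+".join(s.name for s in stages)

    def fn(self, item):
        for stage in self.stages:
            item = stage.fn(item)
        return item


class Pipeline:
    """Each stage gets its own worker thread; queues stream items between them."""

    def __init__(self, stages):
        self.stages = list(stages)

    def __or__(self, other):
        return Pipeline(self.stages + [other])

    def run(self, items):
        queues = [queue.Queue() for _ in range(len(self.stages) + 1)]

        def worker(stage, inbox, outbox):
            while True:
                item = inbox.get()
                if item is _STOP:
                    outbox.put(_STOP)
                    return
                outbox.put(stage.fn(item))

        threads = [
            threading.Thread(target=worker, args=(s, queues[i], queues[i + 1]),
                             daemon=True)
            for i, s in enumerate(self.stages)
        ]
        for t in threads:
            t.start()
        for item in items:
            queues[0].put(item)
        queues[0].put(_STOP)

        results = []
        while True:
            out = queues[-1].get()
            if out is _STOP:
                break
            results.append(out)
        for t in threads:
            t.join()
        return results


# Toy stages: "relax" snaps a coordinate to the nearest integer minimum,
# "md" displaces it by a fixed amount.
relax = Stage("relax", lambda x: float(round(x)))
md = Stage("md", lambda x: x + 0.25)

print((relax + md).run([0.4, 1.6, 2.9]))
print((relax | md).run([0.4, 1.6, 2.9]))
```

Both forms give the same results. The difference is where the work runs: the fused form keeps everything in one worker, while the piped form lets the second stage start on early items while the first stage is still processing later ones.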
The following example configures eight GPUs to run geometry optimization, which pipelines the results to run Langevin dynamics on another eight GPUs:
Build-your-own dynamics classes
ALCHEMI Toolkit offers a modular architecture to build and customize dynamics classes from the ground up. This approach enables the community to integrate new sampling methods or thermodynamic ensembles into the ALCHEMI environment while maintaining direct access to underlying kernels. This transforms dynamics into a fully customizable environment where users can construct specialized dynamics classes from scratch.
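One common way to structure such customizable dynamics is a base class with overridable hooks. The following pure-Python sketch illustrates that pattern under stated assumptions: `Dynamics`, `VelocityVerlet`, and `DampedVerlet` are hypothetical names and do not reflect the ALCHEMI class hierarchy, the particles are one-dimensional, and the custom variant simply adds a friction factor to velocity Verlet.

```python
import math


class Dynamics:
    """Base class: subclasses override the two hooks around the force update."""

    def __init__(self, force_fn, dt):
        self.force_fn, self.dt = force_fn, dt

    def pre_update(self, state, forces):
        pass

    def post_update(self, state, forces):
        pass

    def step(self, state):
        # Forces are recomputed twice per step for clarity; a real
        # implementation would cache them between steps.
        self.pre_update(state, self.force_fn(state["x"]))
        self.post_update(state, self.force_fn(state["x"]))

    def run(self, state, n_steps):
        for _ in range(n_steps):
            self.step(state)
        return state


class VelocityVerlet(Dynamics):
    """Half kick and drift before the force update, half kick after it."""

    def pre_update(self, state, forces):
        for i, f in enumerate(forces):
            state["v"][i] += 0.5 * self.dt * f / state["m"][i]
            state["x"][i] += self.dt * state["v"][i]

    def post_update(self, state, forces):
        for i, f in enumerate(forces):
            state["v"][i] += 0.5 * self.dt * f / state["m"][i]


class DampedVerlet(VelocityVerlet):
    """User-defined variant: velocity Verlet followed by a friction factor."""

    def __init__(self, force_fn, dt, gamma):
        super().__init__(force_fn, dt)
        self.gamma = gamma

    def post_update(self, state, forces):
        super().post_update(state, forces)
        scale = math.exp(-self.gamma * self.dt)
        state["v"] = [scale * v for v in state["v"]]


def harmonic_force(xs):
    return [-x for x in xs]


def total_energy(state):
    kinetic = sum(0.5 * m * v * v for m, v in zip(state["m"], state["v"]))
    potential = sum(0.5 * x * x for x in state["x"])
    return kinetic + potential


def new_state():
    return {"x": [1.0, -0.5], "v": [0.0, 0.0], "m": [1.0, 2.0]}


conserved = VelocityVerlet(harmonic_force, dt=0.01).run(new_state(), 1000)
damped = DampedVerlet(harmonic_force, dt=0.01, gamma=0.5).run(new_state(), 1000)
print(total_energy(new_state()), total_energy(conserved), total_energy(damped))
```

The plain integrator keeps the total energy close to its starting value, while the damped variant drains it. Only `post_update` had to be overridden to get the new behavior.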
Capabilities
- Specialized GPU-first trajectory analysis tools
- Integrated and customizable dynamics kernels (Velocity Verlet, NPT, Langevin thermostats)
- FIRE and FIRE2 optimizers
API example
Model wrappers
With ALCHEMI Toolkit, you can use your own pretrained models with accelerated physics components. It provides the essential infrastructure for importing your own models into the pipeline, ensuring that proprietary or domain-specific architectures can leverage GPU-native orchestration. This abstracts the complexity of different model types, providing a standardized path to move from a standalone model to a production-ready, high-throughput simulation.
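The value of a standardized wrapper is that downstream code sees one interface regardless of what the underlying model returns. The sketch below illustrates this with the adapter pattern in plain Python. The class names `NativeForcesWrapper` and `EnergyOnlyWrapper` and the `compute` method are hypothetical and are not the ALCHEMI wrapper API. A Lennard-Jones function stands in for a pretrained model, and forces for the energy-only case are obtained by central finite differences.

```python
import math


def lj_energy(positions, epsilon=1.0, sigma=1.0):
    """Lennard-Jones energy summed over unique pairs (stand-in for a model)."""
    energy = 0.0
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            r = math.dist(positions[i], positions[j])
            sr6 = (sigma / r) ** 6
            energy += 4.0 * epsilon * (sr6 * sr6 - sr6)
    return energy


class NativeForcesWrapper:
    """Adapter for a model that already returns (energy, forces)."""

    def __init__(self, model_fn):
        self.model_fn = model_fn

    def compute(self, positions):
        energy, forces = self.model_fn(positions)
        return {"energy": energy, "forces": [list(row) for row in forces]}


class EnergyOnlyWrapper:
    """Adapter for an energy-only model; forces via central finite differences."""

    def __init__(self, energy_fn, step=1e-5):
        self.energy_fn, self.step = energy_fn, step

    def compute(self, positions):
        forces = []
        for atom in range(len(positions)):
            row = []
            for axis in range(3):
                plus = [list(p) for p in positions]
                minus = [list(p) for p in positions]
                plus[atom][axis] += self.step
                minus[atom][axis] -= self.step
                # F = -dE/dx, approximated with a symmetric difference
                gradient = (self.energy_fn(plus) - self.energy_fn(minus)) / (2 * self.step)
                row.append(-gradient)
            forces.append(row)
        return {"energy": self.energy_fn(positions), "forces": forces}


dimer = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
result = EnergyOnlyWrapper(lj_energy).compute(dimer)
print(result["energy"], result["forces"])
```

A simulation loop written against `compute` works with either wrapper. In a GPU setting the finite-difference loop would be replaced by automatic differentiation, which is one reason a PyTorch-native design is convenient.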
Capabilities
- MLIP support (MACE, TensorNet, AIMNet2)
- Composable calculators
- Standardized model configuration
API example
Advanced data management
Traditionally, the “memory tax” of moving data between the CPU and GPU is a significant bottleneck in AI-driven discovery. ALCHEMI Toolkit acts as the specialized orchestrator for scientific data, providing the infrastructure required to build custom ingestion pipelines to move information from standard research files into optimized GPU tensors.
This lets discovery scale, making industrial-scale simulations accessible through familiar interfaces. By standardizing how atomic information is represented and loaded, ALCHEMI Toolkit keeps data resident on the device so that the entire simulation stays on the GPU. This enables batched simulations that improve GPU utilization and eliminates host-device communication overhead.
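A standardized batch representation typically concatenates structures of different sizes into flat arrays plus an index that records which atom belongs to which structure. The sketch below shows that layout in plain Python. It is an illustration of the general technique, not the `AtomicData` or batch objects themselves: the function names `collate` and `unbatch` and the field names `numbers`, `positions`, `batch`, and `ptr` are assumptions.

```python
def collate(structures):
    """Concatenate variable-size structures into one flat batch."""
    numbers, positions, batch, ptr = [], [], [], [0]
    for index, structure in enumerate(structures):
        n_atoms = len(structure["numbers"])
        if n_atoms != len(structure["positions"]):
            raise ValueError(f"structure {index}: numbers and positions differ in length")
        numbers.extend(structure["numbers"])
        positions.extend(tuple(p) for p in structure["positions"])
        batch.extend([index] * n_atoms)   # structure index of every atom
        ptr.append(ptr[-1] + n_atoms)     # start offset of the next structure
    return {"numbers": numbers, "positions": positions, "batch": batch, "ptr": ptr}


def unbatch(batched, index):
    """Recover one structure from the flat batch using the offsets."""
    start, stop = batched["ptr"][index], batched["ptr"][index + 1]
    return {
        "numbers": batched["numbers"][start:stop],
        "positions": batched["positions"][start:stop],
    }


water = {
    "numbers": [8, 1, 1],
    "positions": [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)],
}
hydrogen = {
    "numbers": [1, 1],
    "positions": [(0.0, 0.0, 0.0), (0.74, 0.0, 0.0)],
}

batched = collate([water, hydrogen])
print(batched["batch"], batched["ptr"])
print(unbatch(batched, 1))
```

With this layout no padding is needed, and per-structure reductions such as total energy become segment sums over `batch`. On a GPU the flat lists would be device tensors created once and reused across simulation steps.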
Capabilities
- High-performance data loaders
- ASE and Pymatgen interface
- AtomicData and batch objects
API example
Get started building molecular workflows with ALCHEMI Toolkit
ALCHEMI Toolkit provides researchers and developers with the low-level primitives and high-level abstractions needed to build end-to-end, GPU-native molecular workflows. Moving critical bottlenecks—such as neighbor list construction, structural relaxation, and integration steps—into the PyTorch ecosystem eliminates the host-to-device memory transfer overhead that has traditionally throttled MLIP-driven simulations.
Whether you’re composing hybrid ML or physics potentials or scaling batched molecular dynamics, ALCHEMI Toolkit exposes the necessary API hooks to manage complex tensorized states without sacrificing performance.
To accelerate your chemistry and materials science simulations and explore building your own custom workflows, visit the NVIDIA/nvalchemi-toolkit GitHub repo and ALCHEMI Toolkit documentation. As we continue to expand the library of supported operations and architectures, we encourage you to clone the repository, explore the provided Jupyter notebooks, and begin integrating these GPU-accelerated workflows into your own discovery pipelines.
Acknowledgments
We’d like to thank James Gin, Tim Duignan, and Vaidas Šimkus of Orbital; Professor Shyue Ping Ong of MatGL; and Susumu Ohno, Ryuhei Okuno, and Jethro Tan of Matlantis for working with us to adopt NVIDIA ALCHEMI Toolkit into their platforms. We would also like to thank Nikita Fedik, Roman Zubatyuk, Atul Thakur, and Logan Ward for their contributions to this post.