In the past decade, quantum computers have progressed significantly and could one day be used to undermine current cybersecurity practices. If run on a quantum computer, for example, an algorithm discovered by the theoretical computer scientist Peter Shor could crack common encryption schemes, including the Rivest-Shamir-Adleman (RSA) encryption algorithm.
Post-quantum cryptography (PQC) is the response to this threat. This approach uses cryptographic algorithms based on mathematically hard problems that are secure against both traditional and quantum attacks.
Although quantum computers powerful enough to break today’s cryptosystems do not yet exist, PQC is not only relevant in the future. In “harvest now, decrypt later” attacks, an adversary hoards data sent today with the intention of decrypting it later on a sufficiently powerful quantum computer, which makes PQC a necessary security measure today. The National Institute of Standards and Technology (NIST) and other governing agencies around the world are setting new standards for security that require the use of PQC algorithms.
This post introduces the new NVIDIA cuPQC SDK, designed for developers to ease the transition from current cryptosystems to PQC protocols. It provides an easy, flexible, and GPU-accelerated implementation of NIST-approved PQC operations.
GPU-accelerated post-quantum cryptography
To provide security against quantum attacks, PQC algorithms need larger key sizes and more complex mathematical structures than traditional cryptographic algorithms. Fortunately, many of the mathematical operations required for PQC can be parallelized and implemented at high speeds on GPU hardware. Figure 1 shows an example of how a batched PQC key encapsulation mechanism (KEM) enables GPU acceleration. By batching key encapsulation, User 0 can establish secure communication channels with many other users in parallel with GPUs.
Figure 1. Standard batched key encapsulation mechanisms can be accelerated and parallelized with GPUs
Applications within telecommunications, financial services, and cloud infrastructure management require high-throughput cryptographic operations, leading to demanding hardware requirements. GPUs can meet these requirements by parallelizing, and therefore accelerating, cryptographic computations.
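To make the batching idea in Figure 1 concrete, here is a minimal Python sketch of the KEM data flow (key generation, encapsulation, decapsulation). It is an illustration under loud assumptions: the function names `keygen`, `encaps`, `decaps`, and `batched_encaps` are hypothetical and are not the cuPQC API, and the toy scheme is a classical Diffie-Hellman-style construction that is not post-quantum secure and not ML-KEM. The only point is that each encapsulation in a batch is independent of the others, which is the property a GPU can exploit.

```python
import hashlib
import secrets

# Toy group parameters. ASSUMPTION: a classical Diffie-Hellman-style KEM,
# used only to show the data flow. It is NOT post-quantum and NOT ML-KEM.
P = 2**255 - 19  # a well-known prime, used here as a toy modulus
G = 2


def keygen():
    """Return a (public_key, secret_key) pair for one user."""
    sk = secrets.randbelow(P - 2) + 1
    pk = pow(G, sk, P)
    return pk, sk


def encaps(pk):
    """Encapsulate against one public key: return (ciphertext, shared_secret)."""
    r = secrets.randbelow(P - 2) + 1
    ct = pow(G, r, P)
    ss = hashlib.sha256(pow(pk, r, P).to_bytes(32, "big")).digest()
    return ct, ss


def decaps(sk, ct):
    """Recover the shared secret from a ciphertext using the secret key."""
    return hashlib.sha256(pow(ct, sk, P).to_bytes(32, "big")).digest()


def batched_encaps(public_keys):
    """Encapsulate against many public keys.

    Each item is independent of the others, so a GPU implementation could
    assign one encapsulation per thread or thread block. This loop is a
    sequential stand-in for that parallel batch.
    """
    return [encaps(pk) for pk in public_keys]


# Users 1..8 each publish a public key; User 0 encapsulates against all of them.
users = [keygen() for _ in range(8)]
batch = batched_encaps([pk for pk, _ in users])

# Each user decapsulates its own ciphertext and should derive the same secret
# that User 0 obtained for that channel.
matches = [decaps(sk, ct) == ss for (_, sk), (ct, ss) in zip(users, batch)]
print(all(matches))
```

Because no encapsulation depends on the result of another, the batch size can grow without changing the per-item logic; only the scheduling changes when the work moves to a GPU.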
GPU-accelerated PQC is also critical for fundamental cryptographic research and testing new PQC use cases. Research tasks can involve running high-performance network simulations, discovering optimal PQC runtime configurations faster, and performing better security assessments. GPU acceleration therefore lowers the barriers to building performant PQC applications and accelerates PQC research.
In addition to high performance, PQC implementations need to have crypto-agility, so they can adapt to evolving security threats. Traditional solutions usually require a tradeoff between crypto-agility and performance. GPUs provide both. Using NVIDIA cuPQC, you can seamlessly add and switch schemes while maintaining high performance.
Accelerating transport layer security
Transport layer security (TLS) is a crucial security protocol used in internet communications, often requiring data center servers to handle tens of thousands of TLS handshake operations per second. This computational demand can hinder the practical utilization of TLS, a problem that is only exacerbated when introducing complex PQC calculations.
To address this challenge, cuPQC supports high-throughput PQC TLS applications. Using a single NVIDIA H100 SXM5 GPU, cuPQC achieves throughputs of up to 13.3 million key generations, 9.3 million encapsulations, and 8 million decapsulations per second for the batched NIST-approved PQC algorithm known as ML-KEM-768, a standard cryptography protocol to establish a shared key between two parties. These throughputs are 143x, 99x, and 84x higher, respectively, than those of a state-of-the-art Intel Raptor Lake i7-13700K CPU (Figure 2).
Figure 2. Speedups for ML-KEM-768 operations on an NVIDIA H100 SXM5 GPU compared to a single CPU
cuPQC can also accelerate the NIST standardized digital signature algorithm ML-DSA-65, which is used for validating the authenticity and integrity of digital messages. Batched ML-DSA-65 achieves throughputs of 6.5 million key generations, 1 million signatures, and 5.7 million verifications per second when running on an NVIDIA H100 GPU. These performance results should help to remove barriers to the adoption of PQC.
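The ML-KEM-768 figures quoted above can be turned into two derived quantities with simple arithmetic: the CPU throughput implied by each speedup, and the amortized GPU time per operation. The sketch below assumes that each speedup is the plain ratio of GPU throughput to CPU throughput, which the post does not state explicitly. The amortized time is an average over a large batch, not the latency of a single operation.

```python
# Throughputs and speedups as quoted for batched ML-KEM-768 on one H100 SXM5.
GPU_OPS_PER_SEC = {"keygen": 13.3e6, "encaps": 9.3e6, "decaps": 8.0e6}
SPEEDUP_OVER_CPU = {"keygen": 143, "encaps": 99, "decaps": 84}


def implied_cpu_throughput(op):
    """CPU ops/s implied by the quoted speedup.

    ASSUMPTION: speedup = GPU throughput / CPU throughput.
    """
    return GPU_OPS_PER_SEC[op] / SPEEDUP_OVER_CPU[op]


def amortized_ns_per_op(ops_per_sec):
    """Average time per operation in nanoseconds at a given throughput."""
    return 1e9 / ops_per_sec


for op, gpu_rate in GPU_OPS_PER_SEC.items():
    cpu_rate = implied_cpu_throughput(op)
    print(
        f"{op}: implied CPU ~{cpu_rate:,.0f} ops/s, "
        f"GPU amortized ~{amortized_ns_per_op(gpu_rate):.1f} ns/op"
    )
```

Under that assumption, the implied CPU baseline is roughly 93,000 to 95,000 operations per second for all three operations, and the amortized GPU cost is about 75 ns per key generation, 108 ns per encapsulation, and 125 ns per decapsulation.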
According to Hart Montgomery, CTO of Hyperledger, part of the Linux Foundation, which has been working to promote the transition to PQC algorithms, “cuPQC’s safe and high-performance algorithms make transitioning to post-quantum cryptography achievable for enterprises with high-throughput security applications.”
Enhancing GPU application security
In GPU-accelerated applications, performing cryptographic processing on the GPU eliminates the need to transfer messages between the host and the device. This improves efficiency, reduces latency, and lets the application benefit from accelerated cryptographic operations (Figure 3).
Figure 3. cuPQC enables PQC operations on the GPU, avoiding message transfers between host and device while remaining fortified against side channel attacks
cuPQC has undergone comprehensive side-channel security reviews and is fortified against threats exploiting timing data or device micro-architecture specifics. Extensive testing of cuPQC has verified its protection against advanced attack techniques such as the Kyberslash attack. This proactive and continual support for security testing ensures that cuPQC remains robust and resilient in the face of evolving security threats.
These features have encouraged cuPQC integration with other cybersecurity frameworks. Douglas Stebila, professor at University of Waterloo and founder of Open Quantum Safe, an open-source project that aims to support the transition to PQC, said that the integration of cuPQC with LibOQS will “help researchers to explore new frontiers in cryptographic applications which are enabled by cuPQC’s speed and functionality.”
Get started with NVIDIA cuPQC
Adopting PQC is necessary for enterprises to remain secure now and in the future. The transition to PQC must bolster security while ensuring that cryptographic protocols remain practical and cost effective. The NVIDIA cuPQC SDK enables security developers to easily start building and testing flexible GPU-accelerated PQC applications that are both secure and practical.
To start building GPU-accelerated PQC applications, download NVIDIA cuPQC. For more details about functionality, check out the cuPQC documentation.
Learn more about NVIDIA Quantum Computing.