KAIST startup Panmnesia (the name means “the power to remember absolutely everything one thinks, feels, encounters, and experiences”) claims to have developed a new approach to boosting GPU memory.
The company’s breakthrough enables the addition of terabyte-scale memory using cost-effective storage media such as NAND-based SSDs while maintaining reasonable performance levels.
There’s a catch, however: the technology relies on the relatively new Compute Express Link (CXL) standard, which has yet to be proven in large-scale applications and requires specialized hardware integration.
Technical challenges remain
CXL is an open standard interconnect designed to efficiently connect CPUs, GPUs, memory, and other accelerators. It lets these components share a coherent pool of memory, meaning each device sees a consistent view of the same data without explicit copying or movement, which reduces latency and improves performance.
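To make the coherency point concrete, here is a deliberately simplified Python model of the difference between copy-based access and a coherently shared pool. All class and variable names are hypothetical illustrations, not a real CXL or GPU API: a copy-based device must stage data into its own buffer before use, while a device on a shared coherent pool reads the same backing store the CPU writes.

```python
# Toy model: copy-based vs. coherently shared memory access.
# Everything here is a hypothetical sketch, not a real CXL API.

class CopyBasedGPU:
    """Device that must copy host data into a local buffer before use."""
    def __init__(self):
        self.local_buffer = {}
        self.copies_made = 0

    def load(self, host_memory, key):
        self.local_buffer[key] = host_memory[key]  # explicit data movement
        self.copies_made += 1

    def read(self, key):
        return self.local_buffer[key]


class CoherentGPU:
    """Device that reads a shared pool directly -- no staging copy."""
    def __init__(self, shared_pool):
        self.shared_pool = shared_pool
        self.copies_made = 0

    def read(self, key):
        return self.shared_pool[key]  # same backing store the CPU writes


host = {"weights": [0.1, 0.2, 0.3]}

gpu_a = CopyBasedGPU()
gpu_a.load(host, "weights")        # one copy just to make the data visible

gpu_b = CoherentGPU(shared_pool=host)
host["weights"] = [0.4, 0.5]       # CPU update is immediately visible to gpu_b
print(gpu_b.read("weights"), gpu_b.copies_made)  # no copy was ever made
```

The toy model obviously ignores caching, interconnect latency, and coherence protocols; it only illustrates why removing the staging copy matters.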
Because CXL is not a synchronous protocol like JEDEC’s DDR standard, it can support different types of storage media without requiring strict timing or latency synchronization. Panmnesia says early testing has shown its CXL-GPU solution outperforming traditional GPU memory-expansion methods by more than threefold.
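The expansion idea described above can be sketched as a two-tier allocator: a small, fast device-memory tier backed by a much larger tier on cheaper media. The sizes, names, and placement policy below are illustrative assumptions, not Panmnesia's design.

```python
# Toy two-tier allocator: small fast "VRAM" tier plus a much larger,
# slower "CXL-attached SSD" tier. A conceptual sketch only.

class TieredMemory:
    def __init__(self, vram_bytes, cxl_bytes):
        self.capacity = {"vram": vram_bytes, "cxl": cxl_bytes}
        self.used = {"vram": 0, "cxl": 0}
        self.placement = {}  # allocation id -> tier it landed in

    def alloc(self, alloc_id, nbytes):
        # Prefer fast device memory; spill to the CXL tier when VRAM is full.
        for tier in ("vram", "cxl"):
            if self.used[tier] + nbytes <= self.capacity[tier]:
                self.used[tier] += nbytes
                self.placement[alloc_id] = tier
                return tier
        raise MemoryError("both tiers exhausted")


GiB = 1 << 30
mem = TieredMemory(vram_bytes=16 * GiB, cxl_bytes=1024 * GiB)

print(mem.alloc("activations", 12 * GiB))    # fits in VRAM -> "vram"
print(mem.alloc("model_weights", 40 * GiB))  # exceeds remaining VRAM -> "cxl"
```

The point of the sketch is that allocations larger than device memory still succeed, landing in the terabyte-scale tier instead of failing outright, which is the failure mode today's large models hit on fixed-VRAM GPUs.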
For its prototype, Panmnesia connected the CXL endpoint (which includes terabytes of memory) to its CXL-GPU via two MCIO (Mini Cool Edge I/O) cables. These high-speed cables support both PCIe and CXL standards, enabling efficient communication between the GPU and memory.
Adopting these standards, however, is not always straightforward. GPU cards may need additional PCIe/CXL-compatible slots, and significant technical challenges remain, particularly around integrating CXL logic and subsystems into current GPU designs. Folding a new standard like CXL into existing hardware means ensuring compatibility with current architectures and developing new components, such as CXL-capable slots and controllers, which can be complex and resource-intensive.
Panmnesia’s new CXL-GPU prototype promises unprecedented memory expansion for GPUs, but its reliance on the emerging CXL standard and the need for specialized hardware could create barriers to immediate widespread adoption. Despite these obstacles, the benefits are clear, especially for large-scale deep learning models that often exceed the memory capacity of current GPUs.