September 24, 2025

SK hynix builds its own 30PB storage system

In-house storage built on open-source tools such as 'Kubernetes'... IT operating cost savings expected

 

SK hynix has built a 30PB-class high-capacity storage cluster in-house using Kubernetes. The infrastructure expansion for internal IT operations is in line with the company-wide AI transformation. (Source: SK hynix, Reporter Lee Seok-jin)

SK hynix has built its own high-performance storage system with a capacity of 30PB (petabytes; 1PB is 10 to the 15th power bytes, close to 2 to the 50th power) using open-source software, in order to operate its in-house generative AI system. The build-out significantly expands the company's data storage capacity and coincides with the introduction of 'Gaia,' a generative AI platform specialized for semiconductor work.

Notably, the company built this storage itself with open-source software rather than buying a commercial product. Until a few years ago, SK hynix relied on storage from global vendors. By bringing the system in-house, the company expects to significantly reduce IT operating costs.

Shin Ho-seung, Technical Leader at SK hynix, said at the 'Kubernetes Community Day' held at the Baekbeom Memorial Hall in Yongsan on the 16th, "Until four years ago we used Dell's ECS storage product, but because of budget constraints at the time we decided to build our own storage using open source." He added, "Currently, 13,000 employees are processing 800,000 queries a day through dozens of AI services linked to this storage."

 

◇ Storage deployment… Focus on resolving bottlenecks

The storage is designed around the S3 (Simple Storage Service) architecture. First introduced by Amazon, S3 stores unstructured data as objects. Data is filed away in discrete units, like books on a shelf, and retrieved by URL (web address) whenever needed. Because of this design, S3 has long had a reputation for being "convenient but slow."
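The object model described above can be pictured with a toy in-memory store. This is a sketch for intuition only: the `ObjectStore` class, the bucket and key names, and the endpoint `storage.example.internal` are all hypothetical, and a real S3-compatible service adds authentication, metadata, and durability that are left out here.

```python
from dataclasses import dataclass, field
from urllib.parse import quote


@dataclass
class ObjectStore:
    """Toy model of S3-style object storage (hypothetical, in-memory only)."""

    endpoint: str
    # Flat namespace: (bucket, key) -> bytes. There are no real directories;
    # slashes inside a key are simply part of its name.
    _objects: dict = field(default_factory=dict)

    def put(self, bucket: str, key: str, data: bytes) -> str:
        """Store a whole object and return the URL that addresses it."""
        self._objects[(bucket, key)] = data
        return self.url(bucket, key)

    def get(self, bucket: str, key: str) -> bytes:
        """Fetch a whole object by bucket and key."""
        return self._objects[(bucket, key)]

    def url(self, bucket: str, key: str) -> str:
        """Path-style address: https://<endpoint>/<bucket>/<key>."""
        return f"https://{self.endpoint}/{quote(bucket)}/{quote(key)}"


if __name__ == "__main__":
    store = ObjectStore(endpoint="storage.example.internal")
    link = store.put("demo-bucket", "reports/2025/summary.txt", b"hello")
    print(link)
    print(store.get("demo-bucket", "reports/2025/summary.txt"))
```

Each object is written and read as a whole under a single key, which is what makes the model simple to use and, without further tuning, slower than a local file system for small random reads.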

SK hynix overturned this conventional wisdom. By applying 'Kubernetes,' an open-source platform that runs workloads in small units called containers, it reduced data bottlenecks. As a result, it achieved performance reaching 95% of what is possible when using physical servers directly. Shin explained, "S3 can become a storage repository for analysis that is faster than traditional HDFS, not just a simple storage warehouse."

The storage runs in a 'cloud-native' environment. In the past, the standard approach was to manage each physical server individually. SK hynix adopted Kubernetes to manage hundreds of servers as if they were a single computer. In a Kubernetes environment, capacity is easier to scale, and recovery is fast even when system failures occur.

The open-source 'MinIO' was adopted as the storage engine. MinIO is an S3-compatible engine for distributed object storage that stores unstructured data quickly. Unlike other engines, MinIO runs as a single program, eliminating unnecessary intermediate steps. It also provides drivers that connect directly to disks, which helps avoid bottlenecks.

Network design is also a critical factor in storage performance. SK hynix implemented the 'spine-leaf' architecture, now the standard for hyperscale data centers. Network equipment is deployed in two layers, the spine and the leaf. Servers attach to leaf switches, and every leaf connects to every spine, so even with thousands of servers installed, traffic between any two leaves follows the same short path and speed remains consistent.
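The "same short path" property can be checked on a small model. The sketch below builds a hypothetical two-tier fabric (the switch and server names and the sizes are made up for illustration) and counts hops with a breadth-first search; it says nothing about SK hynix's actual topology or equipment.

```python
from collections import deque


def build_spine_leaf(spines: int, leaves: int, servers_per_leaf: int) -> dict:
    """Return an adjacency map for a two-tier spine-leaf fabric."""
    graph = {}

    def link(a: str, b: str) -> None:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)

    for leaf_id in range(leaves):
        leaf = f"leaf{leaf_id}"
        # Every leaf connects to every spine (full mesh between the tiers).
        for spine_id in range(spines):
            link(leaf, f"spine{spine_id}")
        # Servers attach only to their own leaf switch.
        for host_id in range(servers_per_leaf):
            link(leaf, f"srv{leaf_id}-{host_id}")
    return graph


def hops(graph: dict, src: str, dst: str) -> int:
    """Breadth-first search: number of links on the shortest path."""
    seen, queue = {src}, deque([(src, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == dst:
            return dist
        for nxt in graph[node] - seen:
            seen.add(nxt)
            queue.append((nxt, dist + 1))
    raise ValueError(f"no path from {src} to {dst}")


if __name__ == "__main__":
    fabric = build_spine_leaf(spines=2, leaves=4, servers_per_leaf=3)
    print(hops(fabric, "srv0-0", "srv3-2"))  # servers under different leaves
    print(hops(fabric, "srv0-0", "srv0-1"))  # servers under the same leaf
```

In this model any two servers under different leaves are always four links apart (server, leaf, spine, leaf, server), no matter how many leaves are added, which is the consistency the architecture is chosen for.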

To streamline data transmission, SK hynix also applied BGP for routing and eBPF, a technology that runs small programs inside the operating system kernel. BGP (Border Gateway Protocol) is the standard protocol for exchanging routes between networks; here it helps select the shortest path between each node (server) and switch (network device), even inside the server room. eBPF processes network packets (transmission units) early, inside the Linux kernel, to prevent bottlenecks.
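To give a feel for how BGP chooses between routes, the sketch below applies only the first two tie-breakers of the standard decision process (highest local preference, then shortest AS path) to a few made-up routes. The `Route` type, the AS numbers, the next-hop names, and the prefix are hypothetical; real implementations apply several further steps, and the article does not describe SK hynix's actual routing policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str
    next_hop: str
    as_path: tuple          # AS numbers the route has traversed
    local_pref: int = 100   # assumed default; higher is preferred


def best_routes(candidates: list) -> list:
    """Keep the routes that survive two simplified BGP tie-breakers.

    1. Highest LOCAL_PREF wins.
    2. Among those, the shortest AS_PATH wins.
    Later steps (origin, MED, and so on) are deliberately omitted.
    If several routes remain, a fabric with multipath enabled could
    spread traffic across all of them.
    """
    if not candidates:
        return []
    top_pref = max(r.local_pref for r in candidates)
    preferred = [r for r in candidates if r.local_pref == top_pref]
    shortest = min(len(r.as_path) for r in preferred)
    return [r for r in preferred if len(r.as_path) == shortest]


if __name__ == "__main__":
    routes = [
        Route("10.0.3.0/24", "spine0", (65000, 65103)),
        Route("10.0.3.0/24", "spine1", (65001, 65103)),
        Route("10.0.3.0/24", "spine0", (65000, 65102, 65001, 65103)),
    ]
    for r in best_routes(routes):
        print(r.next_hop, r.as_path)
```

Here the two direct routes through the spines tie and the longer detour is dropped, which matches the "shortest path" behaviour described above.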

Optimization was also carried out at the operating system level. Solid-state drives (SSDs) generally have to erase old data before writing new data, so performance degrades sharply once a drive is more than about 80% full. SK hynix minimized this degradation by applying the latest Linux kernel to process data in chunks.
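The 80% figure lends itself to a simple operational guard: flag any drive whose utilization crosses that line. The sketch below is only an illustration of such a check, not SK hynix's tooling; the function names are hypothetical, and the threshold simply reuses the figure quoted above.

```python
import shutil

# Assumption: 0.80 mirrors the fill level cited in the article.
# The right value depends on the drive model and workload.
FILL_THRESHOLD = 0.80


def fill_ratio(used_bytes: int, total_bytes: int) -> float:
    """Fraction of the drive's capacity that is in use."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return used_bytes / total_bytes


def over_threshold(used_bytes: int, total_bytes: int,
                   threshold: float = FILL_THRESHOLD) -> bool:
    """True when utilization is strictly above the threshold."""
    return fill_ratio(used_bytes, total_bytes) > threshold


if __name__ == "__main__":
    # shutil.disk_usage returns (total, used, free) in bytes.
    usage = shutil.disk_usage("/")
    ratio = fill_ratio(usage.used, usage.total)
    status = "above" if over_threshold(usage.used, usage.total) else "within"
    print(f"/ is {ratio:.0%} full, {status} the {FILL_THRESHOLD:.0%} guard")
```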

 

 

◇ Beyond 30PB to the 1EiB level

SK hynix plans to expand beyond 30PB and build a much larger infrastructure. The goal is exabyte-class storage (1EiB, or 2 to the 60th power bytes). That is more than 30 times the current scale, enough to hold hundreds of millions of movies at a few gigabytes each.
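As a rough check on these scale figures, the ratio can be worked out directly. The sketch below reads the current cluster as 30 decimal petabytes (and, for comparison, 30 binary pebibytes) against a 1EiB target; the 2 GB-per-movie size is an illustrative assumption, not a figure from SK hynix.

```python
# Back-of-the-envelope check of the capacity figures quoted in the article.
PB = 10**15   # decimal petabyte, in bytes
PIB = 2**50   # binary pebibyte, in bytes
EIB = 2**60   # binary exbibyte, in bytes

CURRENT_BYTES = 30 * PB   # today's cluster, read as decimal petabytes
TARGET_BYTES = 1 * EIB    # stated goal

# Assumption: one movie is about 2 GB. Illustrative only.
MOVIE_BYTES = 2 * 10**9


def scale_factor(target: int, current: int) -> float:
    """How many times larger the target capacity is than the current one."""
    return target / current


def movies_that_fit(capacity: int, movie_bytes: int = MOVIE_BYTES) -> int:
    """Whole movies of the assumed size that fit in the given capacity."""
    return capacity // movie_bytes


if __name__ == "__main__":
    print(f"growth vs 30 PB : {scale_factor(TARGET_BYTES, CURRENT_BYTES):.1f}x")
    print(f"growth vs 30 PiB: {scale_factor(TARGET_BYTES, 30 * PIB):.1f}x")
    print(f"movies at 2 GB  : {movies_that_fit(TARGET_BYTES):,}")
```

Under either reading of "30PB" the target is between 34 and 39 times the current capacity, and at the assumed movie size the count lands in the hundreds of millions.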


. . . For more details, see [Original Article]

Source: The Elec (http://www.thelec.kr), a specialized media outlet for electronic components

 

 

 

 


 
