With the arrival of the big data era, resource disaggregation has attracted significant attention thanks to its excellent scale-out capability, cost efficiency, and transparent elasticity. Many industry prototypes and academic simulation/emulation-based studies explore a wide spectrum of approaches to realizing memory disaggregation and put significant effort into making it practical. However, memory disaggregation has not been fully realized to date due to several fundamental challenges. In this project, we provide a large memory system built on the world's first CXL solution framework, which achieves outstanding performance in big data applications such as machine learning, in-memory databases, and real-world graph analytics. Our CXL solution opens up a new direction for memory disaggregation, offering both direct accessibility and high performance.
Yet, most storage systems and operating system (OS) kernels rely on conventional rule-based strategies. This makes sense given the tight latency budget for runtime decisions, but such strategies are usually greedy, and their heuristic algorithms often settle for sub-optimal solutions. We are exploring machine learning algorithms that make system-level decisions, such as I/O request pattern prediction and hot/cold data management, while being aware of diverse memory and storage device-level characteristics. In addition, we are designing and implementing hardware acceleration architectures within memory and storage subsystems to enable inference and training at runtime with minimal overhead.
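To illustrate the kind of lightweight decision the project targets, the sketch below classifies logical blocks as hot or cold using an exponentially decayed access counter. This is a minimal, commonly used heuristic, not the project's actual model; the class name, decay factor, and threshold are illustrative assumptions.

```python
from collections import defaultdict

class HotColdTracker:
    """Classify logical blocks as hot or cold using an exponentially
    decayed access count (illustrative heuristic; decay factor and
    threshold values are assumptions, not measured parameters)."""

    def __init__(self, decay=0.5, threshold=2.0):
        self.decay = decay          # weight applied to history each epoch
        self.threshold = threshold  # score at or above which a block is "hot"
        self.scores = defaultdict(float)

    def record_access(self, lba):
        self.scores[lba] += 1.0

    def end_epoch(self):
        # Age all counters so stale popularity fades over time.
        for lba in self.scores:
            self.scores[lba] *= self.decay

    def is_hot(self, lba):
        return self.scores[lba] >= self.threshold


tracker = HotColdTracker()
for _ in range(4):
    tracker.record_access(100)   # block 100 is accessed repeatedly
tracker.record_access(200)       # block 200 is touched only once
tracker.end_epoch()
print(tracker.is_hot(100), tracker.is_hot(200))  # True False
```

A learned policy would replace the fixed threshold with a model trained on device-level access traces, but the decayed-counter feature itself is a typical model input.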
New memory technologies, including PRAM and Ultra-Low Latency Memory, define a memory hierarchy different from the conventional one. To date, most academic proposals related to new memory subsystems and storage systems, such as persistence control, firmware algorithms, and cross-layer optimizations, have been evaluated via simulation and/or emulation. In this project, we build a set of ACTUAL hardware and software resources from the ground up. Currently, we have secured multi-core processor IPs, PRAM and NVM controllers, and the corresponding datapaths, all of which are capable of executing Linux. In addition, we are exploring new territory by integrating such new memory into diverse computing system domains, including domain-specific accelerators, AI accelerators, machine learning platforms, and a fully hardware-automated FPGA storage subsystem.
Heterogeneous computing is widely applied to most data processing and big analytics applications by incorporating many dissimilar processors, such as general-purpose graphics processing units (GPGPUs), many-integrated-core (MIC) processors, and field-programmable gate array (FPGA) based coprocessors. However, it nowadays faces many challenges stemming from different programming interfaces and data movement models. We are researching energy-efficient heterogeneous computing with diverse types of FPGA devices (Xilinx and Altera) and a thousand coprocessors in the Kandemir machine (NVIDIA GPGPUs and Xeon Phi). The main goal of this research is i) to remove data movement by aggressively integrating memory with hardware accelerators and ii) to enable low-power hardware acceleration with hardware/software cross-optimizations.
We are researching next-generation non-volatile memory (NVM) systems as a memory extension or as a NAND flash alternative storage medium. Specifically, this project includes 1) characterizing the challenges of emerging NVMs such as Resistive RAM (RRAM), Phase-change RAM (PCRAM), and Magnetic RAM (STT-RAM), 2) building system-level prototypes, 3) exploring killer applications that exploit these emerging NVMs, and 4) architecting new platforms with byte-addressable NVMs.
Modern SSDs can be plagued by enormous performance variations depending on whether the underlying architectural complexities and NVM management overheads can be hidden. Designing a smart NVM controller is key to hiding these architectural complexities and reducing internal firmware overheads. In this project, we present a set of novel storage optimizations, including various concurrency methods, I/O scheduling algorithms, and garbage collection avoidance mechanisms.
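One firmware decision that drives the variations described above is victim selection for garbage collection. The sketch below shows the classic greedy policy (reclaim the block with the fewest valid pages, minimizing copy-back traffic); it is a simplified textbook baseline, not the project's own mechanism, and real controllers also weigh wear leveling and hot/cold separation.

```python
def pick_gc_victim(blocks):
    """Greedy GC victim selection: erase the block with the fewest
    valid pages, since those pages are all that must be copied out
    before the erase. `blocks` maps block id -> valid-page count
    (a simplified model that ignores wear leveling)."""
    return min(blocks, key=blocks.get)


# Block 7 holds only 1 valid page, so reclaiming it copies the
# least data for the same amount of freed space.
blocks = {3: 60, 5: 12, 7: 1, 9: 64}
victim = pick_gc_victim(blocks)
print(victim)  # 7
```

Avoidance mechanisms aim to keep such reclamation off the critical path entirely, e.g. by deferring it to idle windows, but the cost model above is the quantity they reason about.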
Drawing parallels to the rise of general-purpose graphics processing units (GPGPUs) as accelerators for specific high performance computing (HPC) workloads, there is a rise in the use of non-volatile memory (NVM) as an accelerator for I/O-intensive scientific applications. In addition, flash drives and NVM technologies are beginning to replace disks at major data centers such as those of Amazon, Facebook, and Dropbox. In this work, we 1) show how to efficiently manage flash drives and emerging NVM technologies as I/O accelerators in HPC and datacenter systems, 2) redesign the current memory/storage hierarchy and HPC storage stack from scratch with emerging NVM, and 3) develop novel and efficient hardware/software cooperative techniques that are aware of system-level characteristics as well as the complexities of the underlying NVM technologies.
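A representative way NVM accelerates I/O in such systems is as a staging (burst-buffer) tier: writes land in fast NVM and drain to the slower parallel file system in the background. The sketch below models only that control flow; the class name, capacity, and two-tier lists are illustrative assumptions, not an actual deployment.

```python
class BurstBuffer:
    """Minimal model of an NVM staging tier: writes are absorbed by
    fast NVM and drained to the backend (e.g., a parallel file
    system) asynchronously. Capacities and names are illustrative."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.staged = []     # writes absorbed by the NVM tier
        self.backend = []    # writes that reached the slow tier

    def write(self, data):
        if len(self.staged) < self.capacity:
            self.staged.append(data)   # fast path: absorb the burst
        else:
            self.backend.append(data)  # tier full: fall through

    def drain(self):
        # Background drain, e.g., during the compute phase of a job.
        self.backend.extend(self.staged)
        self.staged.clear()


bb = BurstBuffer(capacity=2)
for block in ["a", "b", "c"]:
    bb.write(block)   # "a", "b" hit NVM; "c" overflows to the backend
bb.drain()
print(len(bb.staged), len(bb.backend))  # 0 3
```

The design point being explored is exactly this trade-off: how much burst capacity the NVM tier needs so that application checkpoints never block on the backend.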
Exploiting internal parallelism over hundreds of NAND flash memory chips is becoming a key design issue in high-speed SSDs. The main goal of this memory-level parallelism project is to take full advantage of both system-level and memory-level parallelism, such that an SSD can offer short latency at full bandwidth. In this project, we are exploring internal SSD/NVM architectures across a full design space spanning system- and memory-level organizations, with a variety of parameters such as a standard queue, multiple buses, chips, and diverse advanced flash operations.
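The parallelism in question comes from how physical page addresses are striped across the SSD's internal geometry. A minimal sketch of one such mapping, channel-first and then way, is given below; the geometry values (8 channels, 4 ways, 256 pages per chip) are illustrative assumptions, not a specific device's layout.

```python
def map_ppa(page, channels=8, ways=4):
    """Map a physical page number to (channel, way, page-in-chip),
    striping consecutive pages across channels first, then ways, so
    that sequential I/O exercises every bus and chip in parallel.
    Geometry values are illustrative, not a real device's layout."""
    channel = page % channels
    way = (page // channels) % ways
    chip_page = page // (channels * ways)
    return channel, way, chip_page


# Consecutive pages fan out over all 8 channels before any channel
# is reused, which is what lets the drive sustain full bandwidth.
print([map_ppa(p)[0] for p in range(9)])  # [0, 1, 2, 3, 4, 5, 6, 7, 0]
```

Changing the striping order (way-first, plane-aware, etc.) is one axis of the design space this project sweeps, since it decides which request mixes can proceed concurrently.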
To explore the impact of diverse NVM technologies on modern computer architecture and systems, fast, high-fidelity, and accurate NVM simulation/emulation research tools are required. Unfortunately, modeling NVM technologies across their broad variety is a non-trivial research area, as there are multiple design parameters and unprecedented device-level considerations. In this project, we are developing several research frameworks, including open-source simulation models, FPGA-based NVM emulators, and hardware validation prototypes. In addition to offering valuable research vehicles, we also propose a hardware-software co-design environment that will allow application, algorithm, and system developers to influence the direction of future architectures, thereby satisfying the demands of diverse computing areas.
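At their core, such simulation models expose per-technology timing parameters and combine them with workload characteristics. The toy analytical model below shows that shape for the read/write latency asymmetry of several NVM technologies; the nanosecond values are illustrative placeholders, not measured device data, and real simulators add queueing, bank conflicts, and wear effects.

```python
# Toy analytical latency model: each technology is a (read_ns,
# write_ns) pair, and a workload is scored by its read fraction.
# The numbers are illustrative placeholders, NOT measured data.
NVM_PARAMS = {
    "PCRAM":   (120, 1000),
    "RRAM":    (100, 500),
    "STT-RAM": (30, 60),
}

def mean_access_ns(tech, read_fraction):
    """Average access latency for a workload with the given read
    fraction on the named technology."""
    read_ns, write_ns = NVM_PARAMS[tech]
    return read_fraction * read_ns + (1 - read_fraction) * write_ns


# A 50%-read workload: write asymmetry dominates on PCRAM/RRAM,
# while STT-RAM stays close to DRAM-like latency.
print(mean_access_ns("STT-RAM", 0.5))  # 45.0
print(mean_access_ns("PCRAM", 0.5))    # 560.0
```

Even this simple model makes the design-parameter sensitivity visible, which is what the FPGA emulators and hardware prototypes then validate against real devices.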
Storage applications leveraging SSD technology are being widely deployed in diverse computing systems. These applications accelerate system performance by exploiting several SSD-specific characteristics. However, modern SSDs have undergone a dramatic technology and architecture shift in the past few years, which makes widely held assumptions and expectations about them highly questionable. The main goal of this project is to question popular assumptions and expectations regarding SSDs through extensive experimental analysis. This project uses the two types of SSD that are most popular across many market segments: 1) PCI Express-based SSDs and 2) mass-storage SSDs. It also offers insightful analyses to system-level kernel and architecture designers.
- IPAD (2004~2005)
A forerunner of high-end portable media players, which supports processing and managing images and playing entertainment content such as music, flash, and digital movies as a standalone device. IPAD anticipated hand-held smart devices such as Apple's iPad, yet our IPAD was developed four years before the first-generation iPad. It provides a method to upload images processed on the X25 embedded platform directly to a web blog through wireless networks, so users never need to connect the device to a PC or laptop.
- Code Wizard (2005)
An intuitive drag-and-drop programming tool that enables people who do not know how to program robotics inventions to easily develop their own robots. The Code Wizard project provides programmable robot suites and a convenient mechanism to control them. These robot suites consist of several peripheral devices, such as interactive servo motors and touch sensors.
- CLASS-MATE (2005)
An object-oriented paradigm-based educational game framework whose goal is to develop a humanoid, built with the Class-mate library, that battles other humanoids. Developers who are not familiar with OOP can improve their programming skills and learn OOP features such as polymorphism and inheritance-based design as part of gameplay. The purpose of the Class-mate project is very similar to Java's Robocode project. However, unlike Java, the C++ runtime has no VM through which diverse user-programmed objects can be linked. Class-mate therefore leverages COM-based dynamically linkable object methods and provides a framework for coding and playing robots.