Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp012n49t440b
Title: Data Access Optimization in Accelerator-Oriented Heterogeneous Architecture through Decoupling and Memory Hierarchy Specialization
Authors: Ham, Tae Jun
Advisors: Martonosi, Margaret R
Aragon, Juan L
Contributors: Electrical Engineering Department
Keywords: Accelerators
Decoupled Architecture
Memory Hierarchy
Subjects: Computer engineering
Computer science
Electrical engineering
Issue Date: 2018
Publisher: Princeton, NJ : Princeton University
Abstract: For the past fifty years, Moore's Law and Dennard Scaling have played important roles in improving both the performance and the energy efficiency of computer systems. Unfortunately, they are not likely to continue, and computers no longer benefit from technology scaling as much as they did in the past. Recently, specialized hardware accelerators have emerged as a promising alternative to general-purpose computing for their potential to achieve orders-of-magnitude speedups and energy efficiency improvements on compute-intensive applications. However, achieving the full potential of accelerators on data-intensive applications remains a challenge, since the bottlenecks of such applications lie not in computation but in data movement. This is particularly problematic because data accesses account for a large part of today's important workloads in data analytics and scientific computing. To address this limitation, this thesis presents hardware and software techniques that can be used to design a system that effectively accelerates data-intensive workloads. Specifically, this thesis addresses the two most important aspects of accelerating such workloads: hiding memory latency and reducing memory bandwidth consumption. First, this thesis attacks the memory latency challenge in accelerator-oriented systems by proposing the Decoupled Supply-Compute (DeSC) framework, which provides latency tolerance to accelerators without programmer effort. DeSC uses hardware specialization and compiler support to enable a specialized core to work as a high-performance decoupled data supplier, which supplies data to accelerators ahead of time so that memory latency is not exposed to them. Second, this thesis presents a way to attack the memory bandwidth challenge for accelerators through the use of a customized memory hierarchy and data access optimizations.
Specifically, this thesis focuses on graph analytics and presents Graphicionado, a specialized accelerator that effectively accelerates memory bandwidth-bound graph analytics and demonstrates that even such applications can benefit from customized hardware designs. In summary, this thesis investigates the memory wall challenge in the era of specialization and presents data access optimizations that enable data-intensive workloads to benefit from specialized, heterogeneous systems without being limited by data accesses. As demand for data-intensive computing continues to grow exponentially, the techniques presented in this thesis can serve as useful tools for accelerating such important workloads.
URI: http://arks.princeton.edu/ark:/88435/dsp012n49t440b
Alternate format: The Mudd Manuscript Library retains one bound copy of each dissertation. Search for these copies in the library's main catalog: catalog.princeton.edu
Type of Material: Academic dissertations (Ph.D.)
Language: en
Appears in Collections: Electrical Engineering

Files in This Item:
File: Ham_princeton_0181D_12558.pdf
Size: 4.49 MB
Format: Adobe PDF


Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.