Towards Multi-Fabric 3D Integration Architectures

The main aim of this thesis is to examine, at the architectural level, the advantages of 3D stacking applied to microprocessors and related integrated microprocessor systems. Over successive generations, microprocessors have targeted lower power consumption, higher performance, smaller form factor and greater integration. 3D integration is an emerging technology that can provide improvements in all of these areas. Under conventional process scaling, the signal delay time (RC) is expected to increase with each technology node, mostly because of the increasing resistance of the wires. The situation is exacerbated by the steady increase in interconnect length and in the number of interconnect layers used. Thus, for microprocessor systems in particular, it is most important to focus on using 3D to reduce wiring. 3D systems can be divided into two basic categories based on the type of layers stacked to form the 3D entity. The first broadly covers stacking cache, main memory or devices with similar functions onto a high-performance logic device. This type of stacking is usually referred to as “logic + memory” stacking. The second category involves splitting a logic area between two or more layers and is usually referred to as “logic + logic” stacking. This thesis commences with an introduction to 3D ICs and continues by demonstrating ways to improve memory organization. It then proceeds with a unique way of “logic + memory” stacking that provides interesting opportunities for FPGA implementations. Such opportunities may best be exploited with the use of DSP blocks within FPGAs. In this context, a novel DSP block to enhance FPGA performance follows. The thesis continues with a novel type of link especially useful for 3D integration and concludes with a modular “logic + logic” 3D stacked multi-processor platform. More specifically, the first chapter consists of an introduction to 3D ICs.
The second chapter presents a systematic technique to reduce the silicon area required for AVS-enhanced ISEs without compromising I/O bandwidth. The technique combines a search for the lowest-cost memory system organization, a data layout phase (formulated as LICCA, a problem akin to graph coloring), and the use of input and output alignment layers placed between the memory system and the ISE logic. Optimizing the memory subsystem with this approach reduces the silicon area by around 36% while maintaining the same data bandwidth as a multi-port memory and without degrading clock frequency. In the next chapter we propose a methodology to generate data accumulation architectures that achieve, to our knowledge, the most efficient use of available memory bandwidth. Such architectures require the minimum number of cycles to complete a number of computations while maintaining the same maximum rate of computation completion as known state-of-the-art implementations. The next chapter proposes the stacking of DRAM on top of an FPGA

Multi-Ported GC-eDRAM Bitcell with Dynamic Port Configuration and Refresh Mechanism

Adam Shmuel Teman, Robert Giterman

Embedded memories occupy an increasingly dominant part of the area and power budgets of modern systems-on-chips (SoCs). Multi-ported embedded memories, commonly used by media SoCs and graphical processing units, occupy even more area and consume higher pow ...

MDPI, 2024
