Georgia Institute of Technology

  • White Papers // Oct 2014

    CAMEO: A Two-Level Memory Organization with Capacity of Main Memory and Flexibility of Hardware-Managed Cache

    In this paper, the authors analyze the trade-offs in architecting stacked DRAM either as part of main memory or as a hardware-managed cache. Using stacked DRAM as part of main memory increases the effective capacity, but obtaining high performance from such a system requires Operating System (OS) support to migrate...

    Provided By Georgia Institute of Technology

  • White Papers // Apr 2014

    Improving MapReduce Performance in a Heterogeneous Cloud: A Measurement Study

    Hybrid clouds, geo-distributed clouds, and continuous upgrades of computing, storage, and networking resources have driven datacenters to evolve towards heterogeneous clusters. Unfortunately, most MapReduce implementations are designed for homogeneous computing environments and perform poorly in heterogeneous clusters. Although a fair amount of research effort has been dedicated to improving...

    Provided By Georgia Institute of Technology

  • White Papers // Mar 2014

    Software-Based Techniques for Reducing the Vulnerability of GPU Applications

    As highly-parallel accelerators such as graphics processing units become more important in high-performance computing, so does the need to ensure their reliable operation. In response, research has been directed at characterizing and understanding the hardware vulnerability of GPU micro-architecture structures, as well as at detecting and correcting...

    Provided By Georgia Institute of Technology

  • White Papers // Mar 2014

    A Measure of Heterogeneity in Multi-Agent Systems

    Heterogeneous multi-agent systems have previously been studied and deployed to solve a number of different tasks. Despite this, users still lack a basic understanding of just what "heterogeneity" really is. For example, what makes one team of agents more heterogeneous than another? In this paper, the authors address this...

    Provided By Georgia Institute of Technology

  • White Papers // Mar 2014

    Road-Network Aware Trajectory Clustering: Integrating Locality, Flow and Density

    Mining trajectory data has been gaining significant interest in recent years. However, existing approaches to trajectory clustering are mainly based on density and Euclidean distance measures. The authors argue that when the utility of spatial clustering of mobile object trajectories is targeted at road-network aware location-based applications, density and Euclidean...

    Provided By Georgia Institute of Technology

  • White Papers // Feb 2014

    Algorithmic Time, Energy, and Power on Candidate HPC Compute Building Blocks

    The authors conducted a micro-benchmarking study of the time, energy, and power of computation and memory access on several existing platforms. These platforms represent candidate compute-node building blocks of future high-performance computing systems. Their analysis uses the "energy roofline" model, developed in prior work, which they extend in two ways....

    Provided By Georgia Institute of Technology
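
    As a rough illustration of the "energy roofline" idea mentioned above (not the authors' exact model), the sketch below combines per-flop and per-byte energy costs with a constant-power term over a time roofline; every constant and function name here is a hypothetical placeholder.

      # Hedged sketch of an energy-roofline style estimate (illustrative constants only).

      def time_roofline(flops, bytes_moved, peak_flops, peak_bw):
          """Execution time bounded by either compute throughput or memory bandwidth."""
          return max(flops / peak_flops, bytes_moved / peak_bw)

      def energy_estimate(flops, bytes_moved, e_flop, e_byte, p_const, peak_flops, peak_bw):
          """Energy = dynamic compute + dynamic memory traffic + constant power * time."""
          t = time_roofline(flops, bytes_moved, peak_flops, peak_bw)
          return flops * e_flop + bytes_moved * e_byte + p_const * t

      # Example with made-up numbers: 1 GFLOP of work and 0.5 GB of traffic on a
      # 100 GFLOP/s, 25 GB/s node with 50 pJ/flop, 500 pJ/byte, 20 W constant power.
      joules = energy_estimate(1e9, 0.5e9, 50e-12, 500e-12, 20.0, 100e9, 25e9)
      print(f"estimated energy: {joules:.3f} J")   # about 0.7 J for these numbers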

  • White Papers // Feb 2014

    Manifold: A Parallel Simulation Framework for Multicore Systems

    In this paper the authors present manifold, an open-source parallel simulation framework for multi-core architectures. It consists of a parallel simulation kernel, a set of micro-architecture components, and an integrated library of power, thermal, reliability, and energy models. Using the components as building blocks, users can assemble multi-core architecture simulation...

    Provided By Georgia Institute of Technology

  • White Papers // Jan 2014

    Methodical Approximate Hardware Design and Reuse

    Design and reuse of approximate hardware components - digital circuits that may produce inaccurate results - can potentially lead to significant performance and energy improvements. Many emerging error-resilient applications can exploit such designs provided approximation is applied in a controlled manner. This paper provides the design abstractions and semantics for...

    Provided By Georgia Institute of Technology

  • White Papers // Jan 2014

    An Optimized 3D-Stacked Memory Architecture by Exploiting Excessive, High-Density TSV Bandwidth

    Memory bandwidth has become a major performance bottleneck as more and more cores are integrated onto a single die, demanding more and more data from the system memory. Several prior studies have demonstrated that this memory bandwidth problem can be addressed by employing a 3D-stacked memory architecture, which provides a...

    Provided By Georgia Institute of Technology

  • White Papers // Jan 2014

    Designing 3D Test Wrappers for Prebond and Postbond Test of 3D Embedded Cores

    3D integration is a promising new technology for tightly integrating multiple active silicon layers into a single chip stack. Both the integration of heterogeneous tiers and the partitioning of functional units across tiers leads to significant improvements in functionality, area, performance, and power consumption. Managing the complexity of 3D design...

    Provided By Georgia Institute of Technology

  • White Papers // Jan 2014

    Smart Refresh: An Enhanced Memory Controller Design for Reducing Energy in Conventional and 3D Die-Stacked DRAMs

    DRAMs require periodic refresh for preserving data stored in them. The refresh interval for DRAMs depends on the vendor and the design technology they use. For each refresh in a DRAM row, the stored information in each cell is read out and then written back to itself as each DRAM...

    Provided By Georgia Institute of Technology
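
    As a minimal sketch of the per-row bookkeeping idea behind such refresh-reduction schemes (the retention window, row count, and data structures below are assumptions, not the paper's design), a controller can skip the scheduled refresh of any row that a normal access has restored recently:

      # Skip scheduled refreshes for rows touched within the retention window,
      # since a read/write already restored the row's charge. Illustrative only.

      RETENTION_MS = 64.0                  # assumed per-row retention window
      NUM_ROWS = 8
      last_restore_ms = [0.0] * NUM_ROWS   # last refresh or access time per row

      def on_access(row, now_ms):
          """A normal read or write implicitly restores the row."""
          last_restore_ms[row] = now_ms

      def rows_needing_refresh(now_ms):
          """Only rows whose retention window is about to expire are refreshed."""
          return [r for r in range(NUM_ROWS)
                  if now_ms - last_restore_ms[r] >= RETENTION_MS]

      def do_refresh(rows, now_ms):
          for r in rows:
              last_restore_ms[r] = now_ms  # a refresh also resets the timer

      on_access(3, now_ms=60.0)
      print(rows_needing_refresh(now_ms=64.0))   # row 3 is skipped; it was just accessed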

  • White Papers // Jan 2014

    Noise-Direct: A Technique for Power Supply Noise Aware Floorplanning Using Microarchitecture Profiling

    In this paper, the authors propose Noise-Direct, a design methodology for power integrity aware floorplanning, using microarchitectural feedback to guide module placement. Stringent power constraints have led microprocessor designers to incorporate aggressive power saving techniques such as clock-gating that place a significant burden on the power delivery network. While the...

    Provided By Georgia Institute of Technology

  • White Papers // Jan 2014

    DRAM Decay: Using Decay Counters to Reduce Energy Consumption in DRAMs

    Dynamic Random Access Memories (DRAMs) require periodic refresh for preserving data stored in them. The refresh interval for DRAMs depends on the vendor and the design technology they use. For each refresh in a DRAM row, the stored information in each cell is read out and then written back to...

    Provided By Georgia Institute of Technology

  • White Papers // Jan 2014

    Asymmetry Aware Scheduling Algorithms for Asymmetric Multiprocessors

    Multiprocessor architectures are becoming popular in both desktop and mobile processors. Asymmetric architectures in particular show promise in saving energy and power. However, how to design applications and how to schedule them on asymmetric multiprocessors are still challenging problems. In this paper, the authors evaluate the performance of applications in...

    Provided By Georgia Institute of Technology

  • White Papers // Jan 2014

    Hardware Support for Real-Time Embedded Multiprocessor System-on-a-Chip Memory Management

    The aggressive evolution of the semiconductor industry, with smaller process geometries, higher densities, and greater chip complexity, has provided design engineers the means to create complex, high-performance Systems-on-Chip (SoC) designs. Such SoC designs typically have more than one processor and a large amount of memory, all on the same chip. Dealing with the global...

    Provided By Georgia Institute of Technology

  • White Papers // Sep 2013

    ClusterWatch: Flexible, Lightweight Monitoring for High-end GPGPU Clusters

    The ClusterWatch middleware provides runtime flexibility in what system-level metrics are monitored, how frequently such monitoring is done, and how metrics are combined to obtain reliable information about the current behavior of GPGPU clusters. Interesting attributes of ClusterWatch are the ease with which different metrics can be added to the...

    Provided By Georgia Institute of Technology

  • White Papers // Aug 2013

    An Infrastructure for Automating Large-scale Performance Studies and Data Processing

    The cloud has enabled the computing model to shift from traditional data centers to publicly shared computing infrastructure; yet, applications leveraging this new computing model can experience performance and scalability issues, which arise from the hidden complexities of the cloud. The most reliable path for better understanding these complexities is...

    Provided By Georgia Institute of Technology

  • White Papers // Aug 2013

    Oncilla: A GAS Runtime for Efficient Resource Allocation and Data Movement in Accelerated Clusters

    Accelerated and in-core implementations of big data applications typically require large amounts of host and accelerator memory as well as efficient mechanisms for transferring data to and from accelerators in heterogeneous clusters. Scheduling for heterogeneous CPU and GPU clusters has been investigated in depth in the High-Performance Computing (HPC) and...

    Provided By Georgia Institute of Technology

  • White Papers // Aug 2013

    Secure Outsourced Garbled Circuit Evaluation for Mobile Devices

    Garbled circuits provide a powerful tool for jointly evaluating functions while preserving the privacy of each user's inputs. While recent research has made the use of this primitive more practical, such solutions generally assume that participants are symmetrically provisioned with massive computing resources. In reality, most people on the planet...

    Provided By Georgia Institute of Technology

  • White Papers // Jul 2013

    Personal Clouds: Sharing and Integrating Networked Resources to Enhance End User Experiences

    End user experiences on mobile devices with their rich sets of sensors are constrained by limited device battery lives and restricted form factors, as well as by the 'scope' of the data available locally. The 'Personal Cloud' distributed software abstractions address these issues by enhancing the capabilities of a mobile...

    Provided By Georgia Institute of Technology

  • White Papers // Jul 2013

    On Symmetric Encryption with Distinguishable Decryption Failures

    The authors propose to relax the assumption that decryption failures are indistinguishable in security models for symmetric encryption. Their main purpose is to build models that better reflect the reality of cryptographic implementations, and to surface the security issues that arise from doing so. They systematically explore the consequences of...

    Provided By Georgia Institute of Technology

  • White Papers // Jun 2013

    Take This Personally: Pollution Attacks on Personalized Services

    Modern web services routinely personalize content to appeal to the specific interests, viewpoints, and contexts of individual users. Ideally, personalization allows sites to highlight information uniquely relevant to each of their users, thereby increasing user satisfaction - and, eventually, the service's bottom line. Unfortunately, as the authors demonstrate in this...

    Provided By Georgia Institute of Technology

  • White Papers // Jun 2013

    An Automated Approach to Create, Store, and Analyze Large-scale Experimental Data in Clouds

    The flexibility and scalability of computing clouds make them an attractive application migration target; yet, the cloud remains a black-box for the most part. In particular, their opacity impedes the efficient but necessary testing and tuning prior to moving new applications into the cloud. A natural and presumably unbiased approach...

    Provided By Georgia Institute of Technology

  • White Papers // Jun 2013

    A Study of the Effect of Partitioning on Parallel Simulation of Multicore Systems

    There has been little research that studies the effect of partitioning on parallel simulation of multicore systems. This paper presents the authors' study of this important problem in the context of a null-message-based synchronization algorithm for parallel multicore simulation. It focuses on coarse-grain parallel simulation where each core and its...

    Provided By Georgia Institute of Technology

  • White Papers // Jun 2013

    FlexIO: Location-Flexible Execution of in Situ Data Analytics for Large Scale Scientific Applications

    Increasingly severe I/O bottlenecks on high-end computing machines are prompting scientists to process simulation output data while simulations are running and before placing data on disk - "in situ" and/or "in transit". There are several options in placing in-situ data analytics along the I/O path: on compute nodes, on staging nodes...

    Provided By Georgia Institute of Technology

  • White Papers // Jun 2013

    I/O Containers: Managing the Data Analytics and Visualization Pipelines of High End Codes

    Lack of I/O scalability is known to cause measurable slowdowns for large-scale scientific applications running on high end machines. This is prompting researchers to devise 'I/O staging' methods in which outputs are processed via online analysis and visualization methods to support desired science outcomes. Organized as online workflows and carried...

    Provided By Georgia Institute of Technology

  • White Papers // Jun 2013

    Cloud Manufacturing: Drivers, Current Status, and Future Trends

    Cloud Manufacturing (CM) refers to a customer-centric manufacturing model that exploits on-demand access to a shared collection of diversified and distributed manufacturing resources to form temporary, reconfigurable production lines which enhance efficiency, reduce product lifecycle costs, and allow for optimal resource loading in response to variable-demand customer generated tasking. The...

    Provided By Georgia Institute of Technology

  • White Papers // May 2013

    Resolution-Aware Network Coded Storage

    In this paper, the authors show that coding can be used in Storage Area Networks (SANs) to improve various Quality of Service metrics under normal SAN operating conditions, without requiring additional storage space. For their analysis, they develop a model which captures modern characteristics such as constrained I/O access bandwidth...

    Provided By Georgia Institute of Technology

  • White Papers // May 2013

    Variations in Performance Measurements of Multi-Core Processors: A Study of n-Tier Applications

    The prevalence of multi-core processors has raised the question of whether applications can use the increasing number of cores efficiently in order to provide predictable Quality of Service (QoS). In this paper, the authors study the horizontal scalability of n-tier application performance within a Multi-Core Processor (MCP). Through extensive measurements...

    Provided By Georgia Institute of Technology

  • White Papers // May 2013

    Secure Cloud Storage Service with An Efficient DOKS Protocol

    Storage services based on public clouds provide customers with elastic storage and on-demand accessibility. However, moving data to remote cloud storage also raises privacy concerns. Cryptographic cloud storage and search over encrypted data have attracted attention from both industry and academia. In this paper, the authors present a new approach...

    Provided By Georgia Institute of Technology

  • White Papers // May 2013

    Software-Controlled Transparent Management of Heterogeneous Memory Resources in Virtualized Systems

    This paper presents a software-controlled technique for managing the heterogeneous memory resources of next generation multicore platforms with fast 3D die-stacked memory and additional slow off-chip memory. Implemented for virtualized server systems, the technique detects the 'Hot' pages critical to program performance in order to then maintain them in the...

    Provided By Georgia Institute of Technology

  • White Papers // May 2013

    Efficient Trajectory Cover Search for Moving Object Trajectories

    Given a set of query locations and a set of query keywords, a Trajectory Cover (CT) query over a repository of mobile trajectories returns a minimal set of trajectories that maximally cover the query keywords and are also spatially close to the query locations. Processing CT queries over mobile trajectories...

    Provided By Georgia Institute of Technology
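
    One plausible way to picture the query semantics described above is a greedy selection that repeatedly picks the trajectory adding the most uncovered query keywords, breaking ties by spatial proximity to the query locations. This is only an illustrative heuristic, not the authors' algorithm, and all names below are hypothetical.

      # Illustrative greedy heuristic for a cover-style trajectory query.
      import math

      def min_dist_to_queries(points, query_locs):
          """Smallest Euclidean distance from any trajectory point to any query location."""
          return min(math.hypot(p[0] - q[0], p[1] - q[1])
                     for p in points for q in query_locs)

      def greedy_trajectory_cover(trajectories, query_keywords, query_locs):
          """trajectories: list of (points, keywords) tuples; returns chosen indices."""
          uncovered, chosen = set(query_keywords), []
          while uncovered:
              best = None
              for idx, (points, kws) in enumerate(trajectories):
                  if idx in chosen:
                      continue
                  gain = len(uncovered & set(kws))
                  if gain == 0:
                      continue
                  score = (gain, -min_dist_to_queries(points, query_locs))
                  if best is None or score > best[0]:
                      best = (score, idx)
              if best is None:              # remaining keywords cannot be covered
                  break
              chosen.append(best[1])
              uncovered -= set(trajectories[best[1]][1])
          return chosen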

  • White Papers // May 2013

    Residency-Aware Virtual Machine Communication Optimization: Design Choices and Techniques

    Network I/O workloads are dominating in many data centers and cloud computing environments today. One way to improve inter-Virtual Machine (VM) communication efficiency is to support co-resident VM communication by using shared memory based approaches and to resort to the traditional TCP/IP for inter-VM communications between VMs that are...

    Provided By Georgia Institute of Technology
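
    The residency-aware idea in this entry can be pictured as a simple transport switch: use a shared-memory path when both VMs sit on the same physical host, and fall back to TCP/IP otherwise. The host lookup and send functions below are hypothetical stand-ins, not the authors' implementation.

      # Toy residency-aware transport selection (illustrative only).
      vm_to_host = {"vm-a": "host-1", "vm-b": "host-1", "vm-c": "host-2"}

      def send_shared_memory(dst_vm, payload):
          print(f"[shm] {len(payload)} bytes -> {dst_vm}")

      def send_tcp(dst_vm, payload):
          print(f"[tcp] {len(payload)} bytes -> {dst_vm}")

      def send(src_vm, dst_vm, payload):
          """Pick the transport based on whether the two VMs are co-resident."""
          src_host, dst_host = vm_to_host.get(src_vm), vm_to_host.get(dst_vm)
          if src_host is not None and src_host == dst_host:
              send_shared_memory(dst_vm, payload)   # same physical host
          else:
              send_tcp(dst_vm, payload)             # cross-host: regular TCP/IP

      send("vm-a", "vm-b", b"hello")   # co-resident -> shared memory
      send("vm-a", "vm-c", b"hello")   # different hosts -> TCP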

  • White Papers // May 2013

    Performance Overhead Among Three Hypervisors: An Experimental Study using Hadoop Benchmarks

    Hypervisors are widely used in cloud environments and their impact on application performance has been a topic of significant research and practical interest. The authors conduct experimental measurements of several benchmarks using Hadoop MapReduce to evaluate and compare the performance impact of three popular hypervisors: a commercial hypervisor, Xen, and...

    Provided By Georgia Institute of Technology

  • White Papers // Apr 2013

    Residency Aware Inter-VM Communication in Virtualized Cloud: Performance Measurement and Analysis

    A known problem for virtualized cloud data centers is the inter-VM communication inefficiency for data transfer between co-resident VMs. Several engineering efforts have been made on building a shared memory based channel between co-resident VMs. The implementations differ in terms of whether user/program transparency, OS kernel transparency or VMM transparency...

    Provided By Georgia Institute of Technology

  • White Papers // Apr 2013

    SLIM: A Scalable Location-Sensitive Information Monitoring Service

    Location-sensitive information monitoring services are a centerpiece of the technology for disseminating content-rich information from massive data streams to mobile users. The key challenges for such monitoring services are characterized by the combination of spatial and non-spatial attributes being monitored and the wide spectrum of update rates. A typical example...

    Provided By Georgia Institute of Technology

  • White Papers // Apr 2013

    Measuring SSL Indicators on Mobile Browsers: Extended Life, or End of the Road?

    Mobile browsers are increasingly being relied upon to perform security sensitive operations. Like their desktop counterparts, these applications can enable SSL/TLS to provide strong security guarantees for communications over the web. However, the drastic reduction in screen size and the accompanying reorganization of screen real estate significantly changes the use...

    Provided By Georgia Institute of Technology

  • White Papers // Apr 2013

    Cloud-Based Information Technology Framework for Data Driven Intelligent Transportation Systems

    The authors present a novel cloud-based IT framework, CloudTrack, for data driven intelligent transportation systems. They describe how the proposed framework can be leveraged for real-time fresh food supply tracking and monitoring. CloudTrack allows efficient storage, processing and analysis of real-time location and sensor data collected from fresh food...

    Provided By Georgia Institute of Technology

  • White Papers // Mar 2013

    Optimizing Parallel Simulation of Multicore Systems Using Domain-Specific Knowledge

    In this paper the authors present two optimization techniques for the basic null-message algorithm in the context of parallel simulation of multicore computer architectures. Unlike the general, application-independent optimization methods, these are application-specific optimizations that make use of system properties of the simulation application. They demonstrate in two aspects that...

    Provided By Georgia Institute of Technology

  • White Papers // Dec 2012

    Computing Infrastructure for Big Data Processing

    With computing systems transforming from single-processor devices to ubiquitous networked devices and datacenter-scale computing in the cloud, parallelism has become ubiquitous at many levels. At the micro level, parallelism is being explored from the underlying circuits to pipelining and instruction-level parallelism on multi-core or many-core...

    Provided By Georgia Institute of Technology

  • White Papers // Nov 2011

    A Power Capping Controller for Multicore Processors

    In this paper, the authors present an online controller for tracking power-budgets in multicore processors using dynamic voltage-frequency scaling. The proposed control law comprises an integral controller whose gain is adjusted online based on the derivative of the power-frequency relationship. The control law is designed to achieve rapid settling time,...

    Provided By Georgia Institute of Technology
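
    The control law summarized above can be pictured roughly as follows: an integral step on the power error whose gain is rescaled by an online estimate of the power-frequency slope dP/df. The power model, constants, and clamping range below are illustrative assumptions, not the paper's controller.

      # Rough sketch of an adaptive-gain integral power-capping loop (illustrative only).

      def measure_power(freq_ghz):
          """Stand-in power model: power grows roughly with frequency squared."""
          return 5.0 + 8.0 * freq_ghz ** 2          # watts

      def power_cap_loop(budget_w, freq_ghz=3.0, steps=20):
          prev_f, prev_p = freq_ghz, measure_power(freq_ghz)
          for _ in range(steps):
              p = measure_power(freq_ghz)
              # Online estimate of the power-frequency slope dP/df.
              dpdf = (p - prev_p) / (freq_ghz - prev_f) if freq_ghz != prev_f else 30.0
              dpdf = max(dpdf, 1.0)                  # keep the adaptive gain well-behaved
              prev_f, prev_p = freq_ghz, p
              error = budget_w - p                   # negative when over budget
              freq_ghz += error / dpdf               # integral step with adaptive gain
              freq_ghz = min(max(freq_ghz, 0.8), 3.5)  # clamp to the DVFS range
          return freq_ghz, measure_power(freq_ghz)

      print(power_cap_loop(budget_w=50.0))           # settles near the 50 W budget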

  • White Papers // Jun 2012

    Challenges and Opportunities in Consolidation at High Resource Utilization: Non-monotonic Response Time Variations in n-Tier Applications

    A central goal of cloud computing is high resource utilization through hardware sharing; however, utilization often remains modest in practice due to the challenges in predicting consolidated application performance accurately. The authors present a thorough experimental study of consolidated n-tier application performance at high utilization to address this issue through...

    Provided By Georgia Institute of Technology

  • White Papers // Mar 2012

    Xerxes: Distributed Load Generator for Cloud-scale Experimentation

    With the growing acceptance of cloud computing as a viable computing paradigm, a number of research and real-life dynamic cloud-scale resource allocation and management systems have been developed over the last few years. An important problem facing system developers is the evaluation of such systems at scale. In this paper,...

    Provided By Georgia Institute of Technology

  • White Papers // Jul 2011

    ResourceExchange: Latency-Aware Scheduling in Virtualized Environments with High Performance Fabrics

    Virtualized infrastructures have seen strong acceptance in data center systems and applications, but have not yet seen adoption for latency-sensitive codes, which require I/O to arrive predictably or response times to be generated within certain timeliness guarantees. Examples of such applications include certain classes of parallel HPC codes, server systems...

    Provided By Georgia Institute of Technology

  • White Papers // Feb 2011

    HyperTransport Over Ethernet - A Scalable, Commodity Standard for Resource Sharing in the Data Center

    Future data center configurations are driven by Total Cost of Ownership (TCO) for specific performance capabilities. Low-latency interconnects are central to performance, while the use of commodity interconnects is central to cost. This paper reports on an effort to combine a very high-performance commodity interconnect, HyperTransport (HT), with a high-volume...

    Provided By Georgia Institute of Technology

  • White Papers // Sep 2012

    A Fast and Transparent Communication Protocol for Co-Resident Virtual Machines

    Network I/O workloads are dominating in most of the cloud data centers today. One way to improve inter-VM communication efficiency is to support co-resident VM communication using a faster communication protocol than the traditional TCP/IP commonly used for inter-VM communications, regardless of whether VMs are located on the same physical host...

    Provided By Georgia Institute of Technology

  • White Papers // May 2008

    Double-DIP: Augmenting DIP with Adaptive Promotion Policies to Manage Shared L2 Caches

    In this paper, the authors study how the Dynamic Insertion Policy (DIP) cache mechanism behaves in a multi-core shared-cache environment. Based on their observations, they explore a new direction in the design space of caches called the promotion policy. In a conventional LRU-based cache, a hit causes the line to...

    Provided By Georgia Institute of Technology
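
    To make the "promotion policy" dimension concrete, here is a small sketch of an LRU recency stack in which a hit promotes a line by a configurable number of positions rather than jumping straight to MRU. The single-position default is an assumption for illustration, not the paper's exact policy.

      # Cache set whose hit behavior is governed by a promotion policy: promote_by=None
      # models conventional LRU (hit moves the line to MRU); a small integer promotes
      # the line only a few positions. Illustrative sketch only.
      from collections import deque

      class PromotionSet:
          def __init__(self, ways, promote_by=None):
              self.ways = ways
              self.promote_by = promote_by      # None -> classic move-to-MRU
              self.stack = deque()              # index 0 = MRU, rightmost = LRU

          def access(self, tag):
              if tag in self.stack:             # hit: apply the promotion policy
                  pos = self.stack.index(tag)
                  self.stack.remove(tag)
                  new_pos = 0 if self.promote_by is None else max(0, pos - self.promote_by)
                  self.stack.insert(new_pos, tag)
                  return True
              if len(self.stack) == self.ways:  # miss in a full set: evict the LRU line
                  self.stack.pop()
              self.stack.appendleft(tag)        # insert the new line at MRU
              return False

      s = PromotionSet(ways=4, promote_by=1)
      for t in ["a", "b", "c", "d"]:
          s.access(t)
      s.access("a")                             # "a" moves up one position, not to MRU
      print(list(s.stack))                      # ['d', 'c', 'a', 'b']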

  • White Papers // Oct 2012

    Performance Impact of Virtual Machine Placement in a Datacenter

    Virtualization technology adoption continues to grow in the enterprise segments across all types of workloads such as web hosting, data centers and even desktop computing. The emergence of Chip Multi-Processors (CMP) and the continuous increase in the number of cores in today's CMP architecture provide more hardware parallelism in a...

    Provided By Georgia Institute of Technology

  • White Papers // May 2012

    Extrapolation Pitfalls When Evaluating Limited Endurance Memory

    Many new non-volatile memory technologies have been considered as future scalable alternatives to DRAM. Memory technologies such as MRAM, FeRAM, and PCM have emerged as the most viable alternatives. But these memories have limited wear endurance. Practically realizable main memory systems employing these memory technologies are possible only if the...

    Provided By Georgia Institute of Technology
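
    A common back-of-envelope calculation behind such endurance studies is the best-case lifetime under ideal wear leveling: capacity times per-cell write endurance divided by sustained write traffic. The numbers below are placeholders for illustration, not the paper's data.

      # Best-case lifetime of a limited-endurance main memory under ideal wear leveling.

      def ideal_lifetime_years(capacity_bytes, endurance_writes, write_bytes_per_sec):
          total_write_budget = capacity_bytes * endurance_writes   # bytes that can ever be written
          seconds = total_write_budget / write_bytes_per_sec
          return seconds / (365 * 24 * 3600)

      # Example: 32 GB of memory, 10^8 writes per cell, 1 GB/s of sustained writes.
      years = ideal_lifetime_years(32 * 2**30, 1e8, 1 * 2**30)
      print(f"best-case lifetime: {years:.1f} years")   # real lifetimes are far lower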

  • White Papers // May 2012

    Accelerating Multi-threaded Application Simulation Through Barrier-Interval Time-Parallelism

    In the last decade, the microprocessor industry has undergone a dramatic change, ushering in the new era of multi-/manycore processors. As new designs incorporate increasing core counts, simulation technology has not matched pace, resulting in simulation times that increasingly dominate the design cycle. Complexities associated with the execution of code...

    Provided By Georgia Institute of Technology

  • White Papers // Mar 2011

    Energy Efficient Phase Change Memory Based Main Memory for Future High Performance Systems

    Phase Change Memory (PCM) has recently attracted a lot of attention as a scalable alternative to DRAM for main memory systems. As the need for high-density memory increases, DRAM has proven to be less attractive from the point of view of scaling and energy consumption. PCM-only memories suffer from latency...

    Provided By Georgia Institute of Technology

  • White Papers // Jan 2009

    Length Adaptive Processors: A Solution for the Energy/Performance Dilemma in Embedded Systems

    Embedded-handheld devices are the predominant computing platform today. These devices are required to perform complex tasks yet run on batteries. Some architects use ASICs to combat this energy-performance dilemma. Even though they are efficient in solving this problem, ASICs are very inflexible. Thus, it is necessary for a general purpose...

    Provided By Georgia Institute of Technology

  • White Papers // Oct 2012

    Designing Configurable, Modifiable and Reusable Components for Simulation of Multicore Systems

    A simulation system for modern multicore architectures is composed of various component models. For such a system to be useful for research purposes, modifiability is a key quality attribute. Users, when building a simulation model, need to have the capability to adjust various aspects of a component, or even replace...

    Provided By Georgia Institute of Technology

  • White Papers // Jan 2013

    High-Speed Formal Verification of Heterogeneous Coherence Hierarchies

    As more heterogeneous architecture solutions continue to emerge, coherence solutions tailored for these architectures will become mandatory. Coherence hierarchies will likely continue to be prevalent in future large-scale shared memory architectures. However, past experience has shown that hierarchical coherence protocol design is a non-trivial problem, especially when considering the verification...

    Provided By Georgia Institute of Technology

  • White Papers // Sep 2010

    Design and Analysis of 3D-MAPS: A Many-Core 3D Processor with Stacked Memory

    The potential of 3D IC stacking has been examined by researchers for many years. Only recently has the increasing cost of continuing process technology shrinks and the incredible memory-bandwidth demand of multi- and many-core systems brought 3D technology to the forefront of commercial interest. Many universities and companies are actively...

    Provided By Georgia Institute of Technology

  • White Papers // Apr 2009

    High Performance Nonblocking Switch Design in 3D Die-Stacking Technology

    Die stacking is a promising new technology that enables integration of devices in the third dimension. It allows the stacking of multiple active layers directly on top of one another with short, dense die-to-die vias providing communication. The authors have shown significant benefits at all design targets, from stacking memory...

    Provided By Georgia Institute of Technology

  • White Papers // May 2008

    Total Recall: A Debugging Framework for GPUs

    GPUs have transformed from simple fixed-function processors to powerful, programmable stream processors and are continuing to evolve. Programming these massively parallel GPUs, however, is very different from programming a sequential CPU. Lack of native support for debugging coupled with the parallelism in the GPU makes program development for the GPU...

    Provided By Georgia Institute of Technology

  • White Papers // May 2008

    IdlePower: Application-Aware Management of Processor Idle States

    Power has become a first-class design constraint in modern processor design. To reduce the power density caused by aggressive, speculative execution seen in previous processor generations, computer architects have turned to a multi-core design strategy with each core substantially simplified. Additionally, different power-saving features have been proposed and integrated...

    Provided By Georgia Institute of Technology

  • White Papers // Apr 2008

    Helper Transactions: Enabling Thread-Level Speculation via A Transactional Memory System

    As multi-core processors become readily available in the market, how to exploit parallelization opportunities to unleash the performance potential has become the utmost concern. Thread-Level Speculation (TLS) has been studied as one such enabling technique for automatically extracting possibly non-conflicting threads from a program for execution. On the other hand,...

    Provided By Georgia Institute of Technology

  • White Papers // Dec 2007

    A Unified Methodology for Power Supply Noise Reduction in Modern Microarchitecture Design

    In this paper, the authors present a novel design methodology to combat the ever-aggravating high frequency power supply noise (di/dt) in modern microprocessors. Their methodology integrates micro-architectural profiling for noise-aware floor-planning, dynamic runtime noise control to prevent unsustainable noise emergencies, as well as decap allocation; all to produce a design...

    Provided By Georgia Institute of Technology

  • White Papers // Jun 2012

    Can Multi-Level Cell PCM Be Reliable and Usable? Analyzing the Impact of Resistance Drift

    There are several emerging memory technologies looming on the horizon to compensate for the physical scaling challenges of DRAM. Phase Change Memory (PCM) is one such candidate proposed for being part of the main memory in computing systems. One salient feature of PCM is its Multi-Level Cell (MLC) property which...

    Provided By Georgia Institute of Technology
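
    Resistance drift in PCM is commonly modeled as a power law, R(t) = R0 * (t / t0) ** nu. The tiny sketch below shows how drift can push a multi-level cell's resistance across a fixed read threshold over time; the level resistance, threshold, and drift exponent are illustrative assumptions.

      # Power-law resistance drift applied to one MLC resistance level (illustrative).

      def drifted_resistance(r0_ohm, t_sec, t0_sec=1.0, nu=0.05):
          return r0_ohm * (t_sec / t0_sec) ** nu

      r0 = 50e3          # cell programmed to 50 kOhm
      threshold = 80e3   # read threshold separating this level from the next one

      for t in (1, 1e3, 1e6, 1e9):                        # seconds after programming
          r = drifted_resistance(r0, t)
          flag = "MISREAD" if r > threshold else "ok"
          print(f"t = {t:9.0e} s  R = {r / 1e3:6.1f} kOhm  {flag}")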

  • White Papers // Feb 2012

    Global Built-In Self-Repair for 3D Memories with Redundancy Sharing and Parallel Testing

    3D integration is a promising technology that provides high memory bandwidth, reduced power, shortened latency, and smaller form factor. Among many issues in 3D IC design and production, testing remains one of the major challenges. This paper introduces a new design-for-test technique called 3D-GESP, an efficient Built-In-Self-Repair (BISR) algorithm to...

    Provided By Georgia Institute of Technology

  • White Papers // Aug 2011

    Ally: OS-Transparent Packet Inspection Using Sequestered Cores

    In this paper, the authors present Ally, a server platform architecture that supports compute-intensive management services on multi-core processors. Ally introduces simple hardware mechanisms to sequester cores to run a separate software environment dedicated to management tasks, including packet processing software appliances with efficient mechanisms to safely and transparently intercept...

    Provided By Georgia Institute of Technology

  • White Papers // Apr 2011

    Heterogeneous Die Stacking of SRAM Row Cache and 3D DRAM: An Empirical Design Evaluation

    As DRAM scaling becomes more challenging and its energy efficiency becomes a growing concern for data center operation, an alternative approach - stacking DRAM dies with Through-Silicon Vias (TSVs) using 3-D integration technology - is being undertaken by industry to address these looming issues. Furthermore, 3-D technology also enables heterogeneous die...

    Provided By Georgia Institute of Technology

  • White Papers // Dec 2007

    Virtual Exclusion: An Architectural Approach to Reducing Leakage Energy in Caches for Multiprocessor Systems

    In this paper, the authors propose virtual exclusion, an architectural technique to reduce leakage energy in the L2 caches for cache-coherent multiprocessor systems. This technique leverages two previously proposed circuit techniques - gated Vdd and drowsy cache - and proposes a low cost, easily implementable scheme for cache-coherent multiprocessor systems. The...

    Provided By Georgia Institute of Technology

  • White Papers // Jul 2007

    Hierarchical Means: Single Number Benchmarking with Workload Cluster Analysis

    Benchmark suite scores are typically calculated by averaging the performance of each individual workload. The scores are inherently affected by the distribution of workloads. Given that the applications of a benchmark suite are typically contributed by many consortium members, workload redundancy becomes inevitable. In particular, the merger of benchmarks can significantly...

    Provided By Georgia Institute of Technology
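
    The hierarchical-mean idea in the entry above can be illustrated by averaging within each workload cluster first and then across clusters, so a cluster of near-duplicate workloads cannot dominate the suite score. The clusters and speedup numbers below are made up for illustration.

      # Geometric mean within each workload cluster, then across clusters (illustrative).
      from math import prod

      def gmean(xs):
          return prod(xs) ** (1.0 / len(xs))

      clusters = {
          "memory-bound": [1.10, 1.12, 1.09, 1.11],   # four near-duplicate workloads
          "compute-bound": [1.60],
          "branch-heavy": [1.30, 1.28],
      }

      flat = gmean([s for scores in clusters.values() for s in scores])
      hierarchical = gmean([gmean(scores) for scores in clusters.values()])

      print(f"flat geometric mean:         {flat:.3f}")   # skewed by the redundant cluster
      print(f"hierarchical (cluster) mean: {hierarchical:.3f}")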

  • White Papers // Feb 2007

    Secure Processing On-Chip

    While embedded computing is becoming more pervasive and invisible, the ways users communicate and operate on data with these devices are becoming more vulnerable to malicious exploits. Providing security in embedded systems is an urgent need, and there are many challenges on both the software and hardware sides that require further...

    Provided By Georgia Institute of Technology

  • White Papers // Oct 2011

    Using Active NVRAM for Cloud I/O

    A well-known problem for large scale cloud applications is how to scale their I/O performance. While next generation storage class memories like phase change memory and Memristors offer potential for high I/O bandwidths, if left unchecked, the raw volumes and rates of I/O already present in current cloud applications can...

    Provided By Georgia Institute of Technology
