Association for Computing Machinery

  • White Papers // Jun 2009

    An SLA-Based Resource Virtualization Approach For On-demand Service Provision

    Cloud computing is a newly emerged research infrastructure that builds on the latest achievements of diverse research areas, such as Grid computing, Service-oriented computing, business processes and virtualization. In this paper, the authors present an architecture for SLA-based resource virtualization that provides an extensive solution for executing user applications in Clouds....

    Provided By Association for Computing Machinery

  • White Papers // Jun 2009

    Flow-Sensitive Semantics for Dynamic Information Flow Policies

    Dynamic information flow policies, such as declassification, are essential for practically useful information flow control systems. However, most systems proposed to date that handle dynamic information flow policies suffer from a common drawback. They build on semantic models of security which are inherently flow insensitive, which means that many simple...

    Provided By Association for Computing Machinery

  • White Papers // Jun 2009

    Catch Me if You Can: Permissive Yet Secure Error Handling

    Program errors are a source of information leaks. Tracking these leaks is hard because error propagation breaks out of program structure. Programming languages often feature exception constructs to provide some structure to error handling: for example, the try...catch blocks in Java and Caml. Mainstream information-flow security compilers such as Jif...

    Provided By Association for Computing Machinery

  • White Papers // Jun 2009

    ActionScript Bytecode Verification With Co-Logic Programming

    A prototype security policy verification system for ActionScript binaries is presented, whose implementation leverages recent advances in co-logic programming. The authors' experience with co-logic programming indicates that it is an extremely useful paradigm for elegantly expressing algorithms that lie at the heart of model-checking technologies. This results in an unusually...

    Provided By Association for Computing Machinery

  • White Papers // Jun 2009

    Encoding Information Flow in AURA

    Two of the main ways to protect security-sensitive resources in computer systems are to enforce access-control policies and information-flow policies. In this paper, the authors show how to enforce information-flow policies in AURA, which is a programming language for access control. When augmented with this mechanism for enforcing information-flow policies,...

    Provided By Association for Computing Machinery

  • White Papers // Jun 2009

    Evaluating the Cost-Benefit of Using Cloud Computing to Extend the Capacity of Clusters

    This paper investigates the benefits that organisations can reap by using "Cloud Computing" providers to augment the computing capacity of their local infrastructure. It evaluates the cost of six scheduling strategies used by an organisation that operates a cluster managed by virtual machine technology and seeks to utilise resources from...

    Provided By Association for Computing Machinery

  • White Papers // Jun 2009

    Resource Co-Allocation for Large-Scale Distributed Environments

    Advances in the development of large scale distributed computing systems such as Grids and Computing Clouds have intensified the need for developing scheduling algorithms capable of allocating multiple resources simultaneously. In principle, the required resources may be allocated by sequentially scheduling each resource individually. However, such a solution can be...

    Provided By Association for Computing Machinery

  • White Papers // Jun 2009

    Privacy Aware Data Sharing: Balancing the Usability and Privacy of Datasets

    Existing models of privacy assume that the set of data to be held confidential is immutable. Unfortunately, that is often not the case. The need for privacy is balanced against the need to use the data, and the benefits that will accrue from the use of the data. The authors...

    Provided By Association for Computing Machinery

  • White Papers // Jun 2009

    Performance Enhancement With Speculative Execution Based Parallelism for Processing Large-Scale XML-Based Application Data

    This paper presents the design and implementation of a toolkit for processing large-scale XML datasets that utilizes the capabilities for parallelism that are available in the emerging multi-core architectures. Multi-core processors are expected to be widely available in research clusters and scientific desktops, and it is critical to harness the...

    Provided By Association for Computing Machinery

  • White Papers // Jun 2009

    Towards Faster Activity Search Using Embedding-Based Subsequence Matching

    A time series is simply a sequence of vectors, where every vector corresponds to a measurement/observation at a specific time, and the next vector in the sequence corresponds to the measurement at the next time step. Measurements can be defined depending on the specific type of activity that the authors...

    Provided By Association for Computing Machinery

  • White Papers // Jun 2009

    A Scheme for On-Site Service Provision in Pervasive Assistive Environments

    Remote healthcare monitoring and on-demand provision of support attract a lot of interest due to the ability to provide assistance to the elderly and patients when needed; thus on one side the hospitals demand less personnel to be engaged in monitoring patients, whereas on the other side the patient does...

    Provided By Association for Computing Machinery

  • White Papers // Jun 2009

    DataStager: Scalable Data Staging Services for Petascale Applications

    Known challenges for petascale machines are that the costs of I/O for high performance applications can be substantial, especially for output tasks like checkpointing, and noise from I/O actions can inject undesirable delays into the runtimes of such codes on individual compute nodes. This paper introduces the flexible 'DataStager'...

    Provided By Association for Computing Machinery

  • White Papers // Jun 2009

    Exploring Data Reliability Tradeoffs in Replicated Storage Systems

    This paper explores the feasibility of a cost-efficient storage architecture that offers the reliability and access performance characteristics of a high-end system. This architecture exploits two opportunities: First, scavenging idle storage from LAN-connected desktops not only offers a low-cost storage space, but also high I/O throughput by aggregating the I/O...

    Provided By Association for Computing Machinery

  • White Papers // Jun 2009

    Optimizing Memory System Performance for Data Center Applications Via Parameter Value Prediction

    The current scenario with respect to data center applications is constantly evolving, with applications being introduced and servers being regularly updated. Data centers require processor cycles of numerous machines and the application programmer plays an important role. The application programmer provides information on the physical abilities and configuration of servers....

    Provided By Association for Computing Machinery

  • White Papers // Jun 2009

    Virtualization Polling Engine (VPE): Using Dedicated CPU Cores to Accelerate I/O Virtualization

    Virtual Machine (VM) technologies are making rapid progress and VM performance is approaching that of native hardware in many aspects. Achieving high performance for I/O virtualization remains a challenge, however, especially for high speed networking devices such as 10 Gigabit Ethernet (10 GbE) NICs. Traditional software-based approaches to I/O virtualization...

    Provided By Association for Computing Machinery

  • White Papers // Jun 2009

    Creating Artificial Global History to Improve Branch Prediction Accuracy

    Modern processors require highly accurate branch prediction for good performance. As such, a number of branch predictors have been proposed with varying size and complexity. This work identifies techniques to improve the accuracy of most predictors. It is especially effective with smaller, simpler predictors, allowing those predictors to be competitive...

    Provided By Association for Computing Machinery

  • White Papers // Jun 2009

    Are Evolutionary Rule Learning Algorithms Appropriate for Malware Detection?

    In this paper, the authors evaluate the performance of ten well-known evolutionary and non-evolutionary rule learning algorithms. The comparative study is performed on a real-world classification problem of detecting malicious executables. The executable dataset, used in this study, consists of a total of 189 attributes which are statically extracted from...

    Provided By Association for Computing Machinery

  • White Papers // Jun 2009

    TransMetric: Architecture Independent Workload Characterization for Transactional Memory Benchmarks

    Transactional Memory (TM) is a parallel programming model that uses transactions for synchronization. It has emerged as a parallel programming paradigm for multi-core processors, yet there is no standardized set of metrics with which to describe their behavior. In this paper, the authors propose a set of transaction-oriented...

    Provided By Association for Computing Machinery

  • White Papers // Jun 2009

    Tuned and Wildly Asynchronous Stencil Kernels for Hybrid CPU/GPU Systems

    The authors describe heterogeneous multi-CPU and multi-GPU implementations of Jacobi's iterative method for the 2-D Poisson equation on a structured grid, in both single- and double-precision. Properly tuned, their best implementation achieves 98% of the empirical streaming GPU bandwidth (66% of peak) on an NVIDIA C1060, and 78% on a...

    Provided By Association for Computing Machinery

  • White Papers // Jun 2009

    P-Code: A New RAID-6 Code with Optimal Properties

    RAID-6 significantly outperforms the other RAID levels in disk-failure tolerance due to its ability to tolerate any two concurrent disk failures in a disk array. The underlying parity array codes have a significant impact on RAID-6's performance. In this paper, the authors propose a new XOR-based RAID-6 code, called the...

    Provided By Association for Computing Machinery

  • White Papers // Jun 2009

    Compiling Structural Types on the JVM

    This paper describes Scala's compilation technique of structural types for the JVM. The technique uses Java reflection and polymorphic inline caches. Performance measurements of this technique are presented and analyzed. Further measurements compare Scala's reflective technique with the "Generative" technique used by Whiteoak to compile structural types. The paper ends...

    Provided By Association for Computing Machinery

  • White Papers // Jun 2009

    Designing Multi-Socket Systems Using Silicon Photonics

    Future single-board multi-socket systems may be unable to deliver the needed memory bandwidth electrically due to power limitations, which will hurt their ability to drive performance improvements. Energy efficient off-chip silicon photonics could be used to deliver the needed bandwidth, and it could be extended on-chip to create a relatively...

    Provided By Association for Computing Machinery

  • White Papers // Jun 2009

    Scratch as a Cache: Rethinking HPC Center Scratch Storage

    To sustain emerging data-intensive scientific applications, High Performance Computing (HPC) centers invest a notable fraction of their operating budget on a specialized fast storage system, scratch space, which is designed for storing the data of currently running and soon-to-run HPC jobs. Instead, it is often used as a standard file...

    Provided By Association for Computing Machinery

  • White Papers // Jun 2009

    SniffMob: Inferring Human Contact Patterns Using Wireless Devices

    The size of existing data sets regarding human mobility and person-to-person contact has been limited by the labor-intensive nature of the data collection techniques employed. In this paper, the authors propose a practical data collection system which is automatic and transparent to the user, requires only installing new software, and...

    Provided By Association for Computing Machinery

  • White Papers // Jun 2009

    BugCache for Inspections: Hit or Miss?

    Inspection is a highly effective but costly technique for quality control. Most companies do not have the resources to inspect all the code; thus accurate defect prediction can help focus available inspection resources. BugCache is a simple, elegant, award-winning prediction scheme that "caches" files that are likely to contain defects....

    Provided By Association for Computing Machinery

  • White Papers // Jun 2009

    Optimal Communications Systems and Network Design for Cargo Monitoring

    In the United States there is an emerging trend to ship goods by rail directly from ports to inland intermodal traffic terminals. However, for this trend to succeed shippers must have "Visibility" into rail shipments. In this paper the authors seek to provide visibility into shipments through optimal placement of...

    Provided By Association for Computing Machinery

  • White Papers // Jun 2009

    Balancing TCP Buffer Vs Parallel Streams in Application Level Throughput Optimization

    The end-to-end performance of TCP over wide-area networks may be a major bottleneck for large-scale network-based applications. Two practical ways of increasing TCP performance at the application layer are using multiple parallel streams and tuning the buffer size. Tuning the buffer size can lead to a significant increase in the throughput...

    Provided By Association for Computing Machinery

  • White Papers // Jun 2009

    Evaluation of an Information Service for Enhanced Multiaccess Media Delivery

    Multimedia delivery in mobile multiaccess network environments is gaining traction as a key area within the future Internet research domain. When network heterogeneity is coupled with the proliferation of multiaccess capabilities of mobile handheld devices, one can expect new avenues for the development of novel services and applications. In particular,...

    Provided By Association for Computing Machinery

  • White Papers // Jun 2009

    A Multi-Frequency MAC Specially Designed for Wireless Sensor Network Applications

    Multi-frequency media access control has been well understood in general wireless ad hoc networks, while in wireless sensor networks, researchers still focus on single frequency solutions. In wireless sensor networks, each device is typically equipped with a single radio transceiver and applications adopt much smaller packet sizes compared to those...

    Provided By Association for Computing Machinery

  • White Papers // Jun 2009

    Distributed Vision With Smart Pixels

    The authors study a problem related to computer vision: How can a field of sensors compute higher-level properties of observed objects deterministically in sub-linear time, without accessing a central authority? This issue is not only important for real-time processing of images, but lies at the very heart of understanding how...

    Provided By Association for Computing Machinery

  • White Papers // Jun 2009

    The Ruby Intermediate Language

    Ruby is a popular, dynamic scripting language that aims to "Feel natural to programmers" and give users the "Freedom to choose" among many different ways of doing the same thing. While this arguably makes programming in Ruby easier, it makes it hard to build analysis and transformation tools that operate...

    Provided By Association for Computing Machinery

  • White Papers // Jun 2009

    Cloud Computing: An Overview

    Google, Yahoo, Amazon, and others have built large, purpose-built architectures to support their applications and taught the rest of the world how to do massively scalable architectures to support compute, storage, and application services. Cloud computing is about moving services, computation and/or data - for cost and business advantage -...

    Provided By Association for Computing Machinery

  • White Papers // Jun 2009

    Evaluating the Utility of Content Delivery Networks

    Content Delivery Networks (CDNs) balance costs and quality in services related to content delivery. This has urged many Web entrepreneurs to make contracts with CDNs. In the paper, a wide range of techniques has been developed, implemented and standardized for improving the performance of CDNs. The ultimate goal of all...

    Provided By Association for Computing Machinery

  • White Papers // Jun 2009

    On the Treeness of Internet Latency and Bandwidth

    Existing empirical studies of Internet structure and path properties indicate that the Internet is tree-like. This work quantifies the degree to which at least two important Internet measures - latency and bandwidth - approximate tree metrics. This paper evaluates the ability to model end-to-end measures using tree embeddings by actually...

    Provided By Association for Computing Machinery

  • White Papers // Jun 2009

    Partial Memoization of Concurrency and Communication

    Memoization is a well-known optimization technique used to eliminate redundant calls for pure functions. If a call to a function f with argument v yields result r, a subsequent call to f with v can be immediately reduced to r without the need to re-evaluate f's body. Understanding memoization in...

    Provided By Association for Computing Machinery

  • White Papers // Jun 2009

    InstantLeap: Fast Neighbor Discovery in P2P VoD Streaming

    A fundamental challenge in Peer-To-Peer (P2P) Video-on-Demand (VoD) streaming is to quickly locate new supplying peers whenever a VCR command is issued, in order to achieve smooth viewing experiences. For most existing commercial systems which resort to tracking servers for such neighbor discovery, the increasing scale of P2P VoD systems...

    Provided By Association for Computing Machinery

  • White Papers // Jun 2009

    An Empirical Evaluation of VoIP Playout Buffer Dimensioning in Skype, Google Talk, and MSN Messenger

    VoIP playout buffer dimensioning has long been a challenging optimization problem, as the buffer size must maintain a balance between conversational interactivity and speech quality. The conversational quality may be affected by a number of factors, some of which may change over time. Although a great deal of research effort...

    Provided By Association for Computing Machinery

  • White Papers // Jun 2009

    Application of IEC 80001 in Avoiding Pitfalls of Wireless LAN System Design

    Every Wireless LAN (WLAN) network provider has white papers describing best practices for configuration of their equipment in hopes of ensuring the WLAN performance meets customer expectations. Why is it then that underperforming networks are still installed? This paper suggests the primary reason as: the WLAN is installed without a...

    Provided By Association for Computing Machinery

  • White Papers // Jun 2009

    A Semantic Web Based Framework for Social Network Access Control

    The existence of on-line social networks that include person specific information creates interesting opportunities for various applications ranging from marketing to community organization. On the other hand, security and privacy concerns need to be addressed for creating such applications. Improving social network access control systems appears as the first step...

    Provided By Association for Computing Machinery

  • White Papers // Jun 2009

    Safety in Discretionary Access Control for Logic-based Publish-Subscribe Systems

    Publish-subscribe (Pub-sub) systems are useful for many applications, including pervasive environments. In the latter context, however, great care must be taken to preserve the privacy of sensitive information, such as users' location and activities. Traditional access control schemes provide at best a partial solution, since they do not capture potential...

    Provided By Association for Computing Machinery

  • White Papers // Jun 2011

    Memory Power Management via Dynamic Voltage/Frequency Scaling

    Energy efficiency and energy-proportional computing have become a central focus in enterprise server architecture. As thermal and electrical constraints limit system power, and datacenter operators become more conscious of energy costs, energy efficiency becomes important across the whole system. There are many proposals to scale energy at the data-center and...

    Provided By Association for Computing Machinery

  • White Papers // Dec 2010

    Storage-Based Intrusion Detection

    Storage-based intrusion detection consists of storage systems watching for and identifying data access patterns characteristic of system intrusions. Storage systems can spot several common intruder actions, such as adding backdoors, inserting Trojan horses, and tampering with audit logs. For example, examination of 18 real intrusion tools reveals that most can...

    Provided By Association for Computing Machinery

  • White Papers // Aug 2013

    Cache Conscious Star-Join in MapReduce Environments

    With the popularity of big data and cloud computing, data warehouse systems based on the data-parallel framework MapReduce are widely used. Column store is the default data placement in these systems. Traditionally, star join is a core operation in the data warehouse. However, little related work studies star join in column...

    Provided By Association for Computing Machinery

  • White Papers // Aug 2013

    Toward Intersection Filter-Based Optimization for Joins in MapReduce

    MapReduce has become an attractive and dominant model for processing large-scale datasets. However, this model is not designed to directly support operations with multiple inputs, such as joins. Many studies on join algorithms including Bloom join in MapReduce have been conducted but they still have too much non-joining data generated and...

    Provided By Association for Computing Machinery

  • White Papers // Aug 2013

    i2MapReduce: Incremental Iterative MapReduce

    Iterative computations are widely used in cloud intelligence applications, such as the well-known PageRank algorithm in web search engines, gradient descent algorithm for optimization, and many other iterative algorithms for applications including recommender systems and link prediction. Cloud intelligence applications often perform iterative computations (e.g., PageRank) on constantly changing data...

    Provided By Association for Computing Machinery

  • White Papers // May 2008

    Scalable and Fault-Tolerant Network-on-Chip Design Using the Quartered Recursive Diagonal Torus Topology

    Network-on-Chip (NoC) is an effective approach to connect and manage the communication between the variety of design elements and intellectual property blocks required in large and complex system-on-chips. In this paper, the authors propose a new NoC architecture, referred to as the Quartered Recursive Diagonal Torus (QRDT), which is constructed by...

    Provided By Association for Computing Machinery

  • White Papers // Feb 2012

    A Torus-Based Hierarchical Optical-Electronic Network-on-Chip for Multiprocessor System-on-Chip

    Network-on-Chips (NoCs) are emerging as a key on-chip communication architecture for Multi-Processor System-on-Chips (MPSoCs). Optical communication technologies are introduced to NoCs in order to empower ultra-high bandwidth with low power consumption. However, in existing optical NoCs, communication locality is poorly supported, and the importance of floorplanning is overlooked. These significantly...

    Provided By Association for Computing Machinery

  • White Papers // May 2006

    A Design Methodology for Application-Specific Networks-on-Chip

    As an effective way to reduce cost, improve reliability, and produce versatile products, System-on-Chip (SoC) not only implements function units, but also emphasizes cooperation among function units to improve performance and reduce cost. With the help of HW/SW codesign, System-on-Chip (SoC) can effectively reduce cost, improve reliability, and produce versatile...

    Provided By Association for Computing Machinery

  • White Papers // Jun 2010

    Crosstalk Noise and Bit Error Rate Analysis for Optical Network-on-Chip

    Crosstalk noise is an intrinsic characteristic of photonic devices used by Optical Network-on-Chips (ONoCs) as well as a potential issue. For the first time, this paper analyzed and modeled the crosstalk noise, Signal-to-Noise Ratio (SNR), and Bit Error Rate (BER) of optical routers and ONoCs. The analytical models for crosstalk...

    Provided By Association for Computing Machinery

  • White Papers // Oct 2009

    A Case Study of On-Chip Sensor Network in Multiprocessor System-on-Chip

    Multi-Processor System-on-Chip (MPSoC) is becoming a favorite choice to satisfy the ever-growing performance demanded by applications. On one hand, shrinking feature size allows for more and better functions on MPSoC. On the other hand, it also makes MPSoC more susceptible to various reliability threats, such as high temperature and Power/Ground...

    Provided By Association for Computing Machinery

  • White Papers // Oct 2009

    An Efficient Technique for Analysis of Minimal Buffer Requirements of Synchronous Dataflow Graphs with Model Checking

    Synchronous DataFlow (SDF) is a widely-used model of computation for digital signal processing and multimedia applications, which are typically implemented on memory constrained hardware platforms. SDF can be statically analyzed and scheduled, and the memory requirement for correct execution can be predicted at compile time. In this paper, the authors...

    Provided By Association for Computing Machinery

  • White Papers // Oct 2008

    ODOR: A Microresonator-Based High-Performance Low-Cost Router for Optical Networks-on-Chip

    The performance of system-on-chip is determined not only by the performance of its functional units, but also by how efficiently they cooperate with one another. It is the on-chip communication architecture which determines the cooperation efficiency. Network-on-Chip (NoC) is introduced to improve communication bandwidth and power efficiency. However, traditional metallic...

    Provided By Association for Computing Machinery

  • White Papers // Apr 2014

    Palette: Enabling Scalable Analytics for Big-Memory, Multicore Machines

    Hadoop and its variants have been widely used for processing large scale analytics tasks in a cluster environment. However, use of a commodity cluster for analytics tasks needs to be reconsidered based on two key observations: in recent years, large memory, multicore machines have become more affordable; and recent studies...

    Provided By Association for Computing Machinery

  • White Papers // Jun 2011

    SkimpyStash: RAM Space Skimpy Key-Value Store on Flash-Based Storage

    A broad range of server-side applications need an underlying, often persistent, key-value store to function. Examples include state maintenance in Internet applications like online multi-player gaming and inline storage deduplication (as described in Section 3). A high throughput persistent key-value store can help to improve the performance of such applications....

    Provided By Association for Computing Machinery

  • White Papers // Jul 2012

    XML Query-Update Independence Analysis Revisited

    XML transformations can be resource-costly in particular when applied to very large XML documents and document sets. Those transformations usually involve lots of XPath queries and may not need to be entirely re-executed following an update of the input document. In this paper, a given query is said to be...

    Provided By Association for Computing Machinery

  • White Papers // Sep 2010

    From Templates to Schemas: Bridging the Gap Between Free Editing and Safe Data Processing

    In this paper the authors present tools that provide an easy way to edit XML content directly on the web, with the usual benefit of valid XML content. These tools make it possible to create content targeted for lightweight web applications. Their approach uses the XTiger template language, the AXEL...

    Provided By Association for Computing Machinery

  • White Papers // Jun 2010

    A Low-Cost Global Network for Data Collection and Query

    In this paper the authors present two complementary research ideas and a prototype framework based on these ideas. The first idea is that by using semantic URIs, XQuery and XML data models, and HTTP responses with embedded URIs, it should be possible to easily construct rich web services over SMS for...

    Provided By Association for Computing Machinery

  • White Papers // Sep 2009

    On the Analysis of Queries with Counting Constraints

    The authors study the analysis problem of XPath expressions with counting constraints. Such expressions are commonly used in document transformations or programs in which they select portions of documents subject to transformations. They explore how recent results on the static analysis of navigational aspects of XPath can be extended to...

    Provided By Association for Computing Machinery

  • White Papers // Apr 2012

    Studying Hardware and Software Trade-Offs for a Real-Life Web 2.0 Workload

    Designing data centers for web 2.0 social networking applications is a major challenge because of the large number of users, the large scale of the data centers, the distributed application base, and the cost sensitivity of a data center facility. Optimizing the data center for performance per dollar is far...

    Provided By Association for Computing Machinery

  • White Papers // Jun 2006

    The Exigency of Benchmark and Compiler Drift: Designing Tomorrow's Processors with Yesterday's Tools

    Due to the amount of time required to design a new processor, one set of benchmark programs may be used during the design phase while another may be the standard when the design is finally delivered. Using one benchmark suite to design a processor while using a different, presumably more...

    Provided By Association for Computing Machinery

  • White Papers // Jun 2006

    Accurate Memory Data Flow Modeling in Statistical Simulation

    Microprocessor design is a very complex and time-consuming activity. One of the primary reasons is the huge design space that needs to be explored in order to identify the optimal design given a number of constraints. Simulations are usually used to explore these huge design spaces, however, they are fairly...

    Provided By Association for Computing Machinery

  • White Papers // Jun 2011

    Optimizing the Datacenter for Data-Centric Workloads

    The amount of data produced on the internet is growing rapidly. Along with data explosion comes the trend towards more and more diverse data, including rich media such as audio and video. Data explosion and diversity lead to the emergence of data-centric workloads to manipulate, manage and analyze the vast...

    Provided By Association for Computing Machinery

  • White Papers // Jun 2010

    Modeling Critical Sections in Amdahl's Law and its Implications for Multicore Design

    In this paper, the authors present a fundamental law for parallel performance: it shows that parallel performance is not only limited by sequential code (as suggested by Amdahl's law) but is also fundamentally limited by synchronization through critical sections. Extending Amdahl's software model to include critical sections, they derive the...

    Provided By Association for Computing Machinery

  • White Papers // Apr 2013

    Criticality Stacks: Identifying Critical Threads in Parallel Programs Using Synchronization Behavior

    Analyzing multi-threaded programs is quite challenging, but is necessary to obtain good multicore performance while saving energy. Due to synchronization, certain threads make others wait, because they hold a lock or have yet to reach a barrier. The authors call these critical threads, i.e., threads whose performance is determinative of...

    Provided By Association for Computing Machinery

  • White Papers // Oct 2007

    Using HPM-Sampling to Drive Dynamic Compilation

    All high-performance production JVMs employ an adaptive strategy for program execution. Methods are first executed unoptimized and then an online profiling mechanism is used to find a subset of methods that should be optimized during the same execution. This paper empirically evaluates the design space of several profilers for initiating...

    Provided By Association for Computing Machinery

  • White Papers // Oct 2012

    Exploring Multi-Threaded Java Application Performance on Multicore Hardware

    While there have been many studies of how to schedule applications to take advantage of increasing numbers of cores in modern-day multicore processors, few have focused on multi-threaded managed language applications which are prevalent from the embedded to the server domain. Managed languages complicate performance studies because they have additional...

    Provided By Association for Computing Machinery

  • White Papers // Oct 2013

    Bottle Graphs: Visualizing Scalability Bottlenecks in Multi-Threaded Applications

    Understanding and analyzing multi-threaded program performance and scalability is far from trivial, which severely complicates parallel software development and optimization. In this paper, the authors present bottle graphs, a powerful analysis tool that visualizes multi-threaded program performance with regard to both per-thread parallelism and execution time. Each thread is represented...

    Provided By Association for Computing Machinery

  • White Papers // Sep 2012

    Power-Aware Multi-Core Simulation for Early Design Stage Hardware/Software Co-Optimization

    With limited increases in clock frequency because of power constraints, improving next-generation processor performance has become a real challenge. One increasingly attractive way to improve performance within a given power and energy budget is to optimize the system for a specific workload - a paradigm that is broadly adopted for...

    Provided By Association for Computing Machinery

  • White Papers // Nov 2011

    Sniper: Exploring the Level of Abstraction for Scalable and Accurate Parallel Multi-Core Simulation

    Two major trends in high-performance computing, namely, larger numbers of cores and the growing size of on-chip cache memory, are creating significant challenges for evaluating the design space of future processor architectures. Fast and scalable simulations are therefore needed to allow for sufficient exploration of large multi-core systems within a...

    Provided By Association for Computing Machinery

  • White Papers // Mar 2009

    Memory-Level Parallelism Aware Fetch Policies for Simultaneous Multithreading Processors

    A thread executing on a Simultaneous Multi-Threading (SMT) processor that experiences a long latency load will eventually stall while holding execution resources. Existing long-latency load aware SMT fetch policies limit the amount of resources allocated by a stalled thread by identifying long-latency loads and preventing the thread from fetching more...

    Provided By Association for Computing Machinery

  • White Papers // Jan 2011

    Fine-Grained DVFS Using On-Chip Regulators

    Limit studies on Dynamic Voltage and Frequency Scaling (DVFS) provide apparently contradictory conclusions. On the one hand, early limit studies report that DVFS is effective at large timescales (on the order of million(s) of cycles) with large scaling overheads (on the order of tens of microseconds), and they conclude that...

    Provided By Association for Computing Machinery

  • White Papers // May 2006

    Energy-Efficient Embedded Software Implementation on Multiprocessor System-on-Chip with Multiple Voltages

    Performance guarantees and energy efficiency are becoming increasingly important for the implementation of embedded software. Traditionally, the Worst-Case Execution Time (WCET) is considered in order to provide a performance guarantee; however, this often leads to overdesigning the system. This paper develops energy-driven completion ratio guaranteed scheduling techniques for the implementation of embedded software...

    Provided By Association for Computing Machinery

  • White Papers // Oct 2006

    The Pipeline Decomposition Tree: An Analysis Tool for Multiprocessor Implementation of Image Processing Applications

    Modern embedded systems for image processing involve increasingly complex levels of functionality under real-time and resource related constraints. As this complexity increases, the application of single-chip multiprocessor technology is attractive. To address the challenges of mapping image processing applications onto embedded multiprocessor platforms, this paper presents a novel data structure...

    Provided By Association for Computing Machinery

  • White Papers // May 2007

    Beyond Single-Appearance Schedules: Efficient DSP Software Synthesis Using Nested Procedure Calls

    Synthesis of Digital Signal-Processing (DSP) software from dataflow-based formal models is an effective approach for tackling the complexity of modern DSP applications. In this paper, an efficient method is proposed for applying subroutine call instantiation of module functionality when synthesizing embedded software from a dataflow specification. The technique is based...

    Provided By Association for Computing Machinery

  • White Papers // Dec 2006

    Improving SDRAM Access Energy Efficiency for Low-Power Embedded Systems

    DRAM (Dynamic Random Access Memory) energy consumption in low-power embedded systems can be very high, exceeding that of the data cache or even that of the processor. This paper presents and evaluates a scheme for reducing the energy consumption of SDRAM (Synchronous DRAM) memory access by a combination of techniques...

    Provided By Association for Computing Machinery

  • White Papers // Jun 2008

    Automated Hardware-Independent Scenario Identification

    Scenario-based design exploits the time-varying execution behavior of applications by dynamically adapting the system on which they run. This is a particularly interesting design methodology for media applications with soft real-time constraints such as decoders: frames can be classified into scenarios based on their decode complexity, and the system can...

    Provided By Association for Computing Machinery

  • White Papers // Jun 2007

    Java Object Header Elimination for Reduced Memory Consumption in 64-Bit Virtual Machines

    Memory performance is an important design issue for contemporary computer systems given the huge processor-memory speed gap. This paper proposes a space-efficient Java object model for reducing the memory consumption of 64-bit Java virtual machines. The authors completely eliminate the object header through Typed Virtual Addressing (TVA) or implicit typing....

    Provided By Association for Computing Machinery

  • White Papers // Oct 2006

    A Performance Counter Architecture for Computing Accurate CPI Components

    A common way of representing processor performance is to use Cycles Per Instruction (CPI) 'stacks', which break performance into a baseline CPI plus a number of individual miss event CPI components. CPI stacks can be very helpful in gaining insight into the behavior of an application on a given microprocessor;...

    Provided By Association for Computing Machinery

  • White Papers // Mar 2009

    Per-Thread Cycle Accounting in SMT Processors

    In this paper, the authors propose a cycle accounting architecture for Simultaneous Multi-Threading (SMT) processors that estimates the execution times for each of the threads had they been executed alone, while they are running simultaneously on the SMT processor. This is done by accounting each cycle to either a base,...

    Provided By Association for Computing Machinery

  • White Papers // Mar 2012

    Iterative Optimization for the Data Center

    Iterative optimization is a simple but powerful approach that searches for the best possible combination of compiler optimizations for a given workload. However, each program, if not each data set, potentially favors a different combination. As a result, iterative optimization is plagued by several practical issues that prevent it from...

    Provided By Association for Computing Machinery