Association for Computing Machinery

  • White Papers // Jan 2015

    Compiler Management of Communication and Parallelism for Quantum Computation

    Quantum Computing (QC) offers huge promise to accelerate a range of computationally intensive benchmarks. Quantum computing is limited, however, by the challenges of decoherence: i.e., a quantum state can only be maintained for short windows of time before it decoheres. While quantum error correction codes can protect against decoherence, fast...

    Provided By Association for Computing Machinery

  • White Papers // Jan 2015

    A Symbolic Execution Algorithm for Constraint-Based Testing of Database Programs

    In so-called constraint-based testing, symbolic execution is a common technique used as a part of the process to generate test data for imperative programs. Databases are ubiquitous in software and testing of programs manipulating databases is thus essential to enhance the reliability of software. In this paper, the authors propose...

    Provided By Association for Computing Machinery

  • White Papers // Jan 2015

    CQIC: Revisiting Cross-Layer Congestion Control for Cellular Networks

    With the advent of high-speed cellular access and the overwhelming popularity of smartphones, a large percentage of today's Internet content is being delivered via cellular links. Due to the nature of long-range wireless signal propagation, the capacity of the last hop cellular link can vary by orders of magnitude within...

    Provided By Association for Computing Machinery

  • White Papers // Dec 2014

    GRAPHITE: An Extensible Graph Traversal Framework for Relational Database Management Systems

    Graph traversals are a basic but fundamental ingredient for a variety of graph algorithms and graph-oriented queries. To achieve the best possible query performance, they need to be implemented at the core of a database management system that aims at storing, manipulating, and querying graph data. Increasingly, modern business applications...

    Provided By Association for Computing Machinery

  • White Papers // Dec 2014

    Uncovering Network Tarpits with Degreaser

    Network tarpits, whereby a single host or appliance can masquerade as many fake hosts on a network and slow network scanners, are a form of defensive cyber-deception. In this paper, the authors develop degreaser, an efficient fingerprinting tool to remotely detect tarpits. In addition to validating their tool in a...

    Provided By Association for Computing Machinery

  • White Papers // Nov 2014

    Design-Theoretic Encoding of Deterministic Hypotheses as Constraints and Correlations Into U-Relational Databases

    In view of the paradigm shift that makes science ever more data-driven, in this paper the authors consider deterministic scientific hypotheses as uncertain data. In the form of mathematical equations, hypotheses symmetrically relate aspects of the studied phenomena. For computing predictions, however, deterministic hypotheses are used asymmetrically as functions. They...

    Provided By Association for Computing Machinery

  • White Papers // Nov 2014

    Patterns in the Chaos - A Study of Performance Variation and Predictability in Public IaaS Clouds

    Benchmarking the performance of public cloud providers is a common research topic. Previous work has already extensively evaluated the performance of different cloud platforms for different use cases, and under different constraints and experiment setups. In this paper, the authors present a principled, large-scale literature review to collect and codify...

    Provided By Association for Computing Machinery

  • White Papers // Nov 2014

    Computing Multi-Relational Sufficient Statistics for Large Databases

    Databases contain information about which relationships do and do not hold among entities. To make this information accessible for statistical analysis requires computing sufficient statistics that combine information from different database tables. Such statistics may involve any number of positive and negative relationships. With a naive enumeration approach, computing sufficient...

    Provided By Association for Computing Machinery

  • White Papers // Nov 2014

    Analysis of SSL Certificate Reissues and Revocations in the Wake of Heartbleed

    Central to the secure operation of a Public Key Infrastructure (PKI) is the ability to revoke certificates. While much of users' security rests on this process taking place quickly, in practice, revocation typically requires a human to decide to reissue a new certificate and revoke the old one. Thus, having...

    Provided By Association for Computing Machinery

  • White Papers // Nov 2014

    PixelVault: Using GPUs for Securing Cryptographic Operations

    Protecting the confidentiality of cryptographic keys in the event of partial or full system compromise is crucial for containing the impact of attacks. The Heartbleed vulnerability of April 2014, which allowed the remote leakage of secret keys from HTTPS web servers, is an indicative example. In this paper, the authors...

    Provided By Association for Computing Machinery

  • White Papers // Nov 2014

    Faces in the Distorting Mirror: Revisiting Photo-based Social Authentication

    In an effort to hinder attackers from compromising user accounts, Facebook launched a form of two-factor authentication called Social Authentication (SA), where users are required to identify photos of their friends to complete a log-in attempt. Recent research, however, demonstrated that attackers can bypass the mechanism by employing face recognition...

    Provided By Association for Computing Machinery

  • White Papers // Nov 2014

    Handcrafted Fraud and Extortion: Manual Account Hijacking in the Wild

    Online accounts are inherently valuable resources - both for the data they contain and the reputation they accrue over time. Unsurprisingly, this value drives criminals to steal, or hijack, such accounts. In this paper, the authors focus on manual account hijacking - account hijacking performed manually by humans instead of...

    Provided By Association for Computing Machinery

  • White Papers // Nov 2014

    Search + Seizure: The Effectiveness of Interventions on SEO Campaigns

    Black hat Search Engine Optimization (SEO), the practice of abusively manipulating search results, is an enticing method to acquire targeted user traffic. In turn, a range of interventions - from modifying search results to seizing domains - are used to combat this activity. In this paper, the authors examine the...

    Provided By Association for Computing Machinery

  • White Papers // Nov 2014

    Challenges in Inferring Internet Interdomain Congestion

    The authors introduce and demonstrate the utility of a method to localize and quantify inter-domain congestion in the Internet. Their Time Sequence Latency Probes (TSLP) method depends on two facts: Internet traffic patterns are typically diurnal, and queues increase packet delay through a router during periods of adjacent link congestion....

    Provided By Association for Computing Machinery

  • White Papers // Nov 2014

    Characterizing Large-Scale Click Fraud in ZeroAccess

    Click fraud is a scam that hits a criminal sweet spot by both tapping into the vast wealth of online advertising and exploiting that ecosystem's complex structure to obfuscate the flow of money to its perpetrators. In this paper, the authors illuminate the intricate nature of this activity through the...

    Provided By Association for Computing Machinery

  • White Papers // Nov 2014

    On The Security of Mobile Cockpit Information Systems

    Recent trends in aviation have led many general aviation pilots to adopt the use of iPads (or other tablets) in the cockpit. While initially used to display static charts and documents, uses have expanded to include live data such as weather and traffic information that is used to make flight...

    Provided By Association for Computing Machinery

  • White Papers // Oct 2014

    SoftMoW: Recursive and Reconfigurable Cellular WAN Architecture

    The current LTE network architecture is organized into very large regions, each having a core network and a radio access network. The core network contains an Internet edge comprised of Packet data network GateWays (PGWs). The radio network consists of only base stations. There are minimal interactions among regions other...

    Provided By Association for Computing Machinery

  • White Papers // Oct 2014

    PRAN: Programmable Radio Access Networks

    With the continued exponential growth of mobile traffic and the rise of diverse applications, the current LTE Radio Access Network (RAN) architecture of cellular operators faces mounting challenges. Current RAN suffers from insufficient radio resource coordination, inefficient infrastructure utilization, and inflexible data paths. The authors present the high level design...

    Provided By Association for Computing Machinery

  • White Papers // Oct 2014

    Sweet Little Lies: Fake Topologies for Flexible Routing

    Link-state routing protocols (e.g., OSPF and IS-IS) are widely used because they are scalable, robust, and based on simple abstractions. Unfortunately, these protocols are also relatively inflexible, since they direct all traffic over shortest paths. In contrast, Software Defined Networking (SDN) offers fine-grained control over routing, at the expense of...

    Provided By Association for Computing Machinery

  • White Papers // Oct 2014

    Tango: Simplifying SDN Control with Automatic Switch Property Inference, Abstraction, and Optimization

    A major benefit of Software-Defined Networking (SDN) over traditional networking is simpler and easier control of network devices. The diversity of SDN switch implementation properties, which include both diverse switch hardware capabilities and diverse control-plane software behaviors, however, can make it difficult to understand and/or to control the switches in...

    Provided By Association for Computing Machinery

  • White Papers // Oct 2014

    Blender: Upgrading Tenant-based Data Center Networking

    In this paper, the authors present Blender, a framework that enables network operators to improve tenant performance by tailoring the network's behavior to tenant needs. Tenants may upgrade their provisioned portion of the network with specific features, such as multi-path routing, isolation, and failure recovery, without modifying hosted application code....

    Provided By Association for Computing Machinery

  • White Papers // Oct 2014

    Automatic Custom Instruction Identification in Memory Streaming Algorithms

    Application-Specific Instruction Set Processors (ASIPs) extend the instruction set of a general purpose processor by dedicated Custom Instructions (CIs). In the last decade, reconfigurable processors advanced this concept towards runtime reconfiguration to increase the efficiency and adaptivity. Compiler support for automatic identification and implementation of ASIP CIs exists commercially and...

    Provided By Association for Computing Machinery

  • White Papers // Oct 2014

    System-Level Memory Optimization for High-Level Synthesis of Component-Based SoCs

    The design of specialized accelerators is essential to the success of many modern systems-on-chip. Electronic system-level design methodologies and high-level synthesis tools are critical for the efficient design and optimization of an accelerator. Still, these methodologies and tools offer only limited support for the optimization of the memory structures, which...

    Provided By Association for Computing Machinery

  • White Papers // Oct 2014

    Measurement and Analysis of OSN Ad Auctions

    Advertising is ubiquitous on the web; numerous ad networks serve billions of ads daily via keyword or search term auctions. Recently, Online Social Networks (OSNs) such as Facebook have created site-specific ad services that differ from traditional ad networks by letting advertisers bid on users rather than keywords. With Facebook's...

    Provided By Association for Computing Machinery

  • White Papers // Oct 2014

    Beyond CPM and CPC: Determining the Value of Users on OSNs

    Not all of the over one billion users of Online Social Networks (OSNs) are equally valuable to the OSNs. The current business model of monetizing advertisements targeted to users does not appear to be based on any visible grouping of the users. The primary metrics remain CPM (Cost Per Mille...

    Provided By Association for Computing Machinery

  • White Papers // Oct 2014

    Stream-Oriented Network Traffic Capture and Analysis for High-Speed Networks

    Intrusion detection, traffic classification, and other network monitoring applications need to analyze the captured traffic beyond the network layer to allow for connection-oriented analysis, and achieve resilience to evasion attempts based on TCP segmentation. Existing network traffic capture frameworks, however, provide applications with raw packets and leave complex operations like...

    Provided By Association for Computing Machinery

  • White Papers // Sep 2014

    Poster - iSync: A High Performance and Scalable Data Synchronization Protocol for Named Data Networking

    In this paper, the authors present a high performance synchronization protocol for Named Data Networking (NDN). The protocol, called iSync, uses a two-level Invertible Bloom Filter (IBF) structure to support efficient data reconciliation. Multiple differences can be found by subtracting a remote IBF from a local IBF, and therefore, from...

    Provided By Association for Computing Machinery

  • White Papers // Sep 2014

    Consumer-Producer API for Named Data Networking

    As a new architecture, NDN requires a new API. Today's socket API cannot be reused for NDN communication because its foundational concept is a point-to-point virtual channel that does not exist in NDN. This paper presents a new network programming interface to NDN communication protocols and architectural modules. This new API...

    Provided By Association for Computing Machinery

  • White Papers // Sep 2014

    On the Role of Routing in Named Data Networking

    A unique feature of Named Data Networking (NDN) is that its forwarding plane can detect and recover from network faults on its own, enabling each NDN router to handle network failures locally without relying on global routing convergence. This new feature prompts the authors to re-examine the role of routing...

    Provided By Association for Computing Machinery

  • White Papers // Sep 2014

    VIP: A Framework for Joint Dynamic Forwarding and Caching in Named Data Networks

    Emerging information-centric networking architectures seek to optimally utilize both bandwidth and storage for efficient content distribution. This highlights the need for joint design of traffic engineering and caching strategies, in order to optimize network performance in view of both current traffic loads and future traffic demands. The authors present a...

    Provided By Association for Computing Machinery

  • White Papers // Sep 2014

    Kite: A Mobility Support Scheme for NDN

    Named Data Networking (NDN) natively supports the mobility of data consumers through its data-centric design and stateful forwarding plane. However, the mobility support for data producers remains open in the original proposal. In this paper, the authors introduce Kite, a design of mobility support for NDN. Kite leverages the state...

    Provided By Association for Computing Machinery

  • White Papers // Sep 2014

    Security and QoS Tradeoff Recommendation System (SQT-RS) for Dynamic Assessing CPRM-based Systems

    Context-based Parametric Relationship Models (CPRM) define complex dependencies between different types of parameters. In particular, security and QoS relationships that may occur at different levels of abstraction are easily identified using CPRM. However, the growing number of parameters and relationships, typically due to the heterogeneous scenarios of future networks, increase...

    Provided By Association for Computing Machinery

  • White Papers // Sep 2014

    Automatic Verification of Interactions in Asynchronous Systems with Unbounded Buffers

    Asynchronous communication requires message queues to store the messages that are yet to be consumed. Verification of interactions in asynchronously communicating systems is challenging since the sizes of these queues can grow arbitrarily large during execution. In fact, behavioral models for asynchronously communicating systems typically have infinite state spaces,...

    Provided By Association for Computing Machinery

  • White Papers // Sep 2014

    A Bi-Objective Cost Model for Database Queries in a Multi-Cloud Environment

    Cost models are broadly used in query processing to drive the query optimization process, accurately predict the query execution time, schedule database query tasks, apply admission control and derive resource requirements to name a few applications. The main role of cost models is to produce the time needed to run...

    Provided By Association for Computing Machinery

  • White Papers // Sep 2014

    Proximity-Based Wireless Access Control through Considerate Jamming

    As diverse types of wireless devices emerge, it becomes difficult to apply the existing wireless security measures to them without effort. Those devices lack conventional user interfaces or they are too resource-constrained to process the security protocols. Meanwhile, many of them are used within a geographical boundary to access the...

    Provided By Association for Computing Machinery

  • White Papers // Sep 2014

    Rethink Energy Accounting with Cooperative Game Theory

    Energy accounting determines how much a software principal contributes to the total system energy consumption. It is the foundation for evaluating software and for operating system based energy management. While various energy accounting policies have been tried, there is no known way to evaluate them directly simply because it is...

    Provided By Association for Computing Machinery

  • White Papers // Sep 2014

    Combating Inter-cell Interference in 802.11ac-based Multi-user MIMO Networks

    In an 802.11ac-based MU-MIMO network comprised of multiple cells, inter-cell interference allows only a single AP to serve its clients at the same time, significantly limiting the network capacity. In this paper, the authors overcome this limitation by letting the APs and clients in interfering cells coordinately cancel the inter-cell...

    Provided By Association for Computing Machinery

  • White Papers // Sep 2014

    Enfold: Downclocking OFDM in WiFi

    Dynamic Voltage and Frequency Scaling (DVFS) has long been used as a technique to save power in a variety of computing domains but typically not in communications devices. A fundamental limit that prevents decreasing the clock frequency is the Nyquist (-Shannon) sampling theorem, which states that the sampling rate must...

    Provided By Association for Computing Machinery

  • White Papers // Sep 2014

    Cutting the Cord: a Robust Wireless Facilities Network for Data Centers

    Today's network control and management traffic are limited by their reliance on existing data networks. Fate sharing in this context is highly undesirable, since control traffic has very different availability and traffic delivery requirements. In this paper, the authors explore the feasibility of building a dedicated wireless facilities network for...

    Provided By Association for Computing Machinery

  • White Papers // Sep 2014

    Overheard ACK with Token Passing: An Optimization to 802.11 MAC Protocol

    Distributed Coordination Function (DCF) is defined in the IEEE 802.11 standard, which is widely used in practice. Despite its wide use, it has several limitations. Because of the idle and collision times, it suffers from poor channel utilization. Besides, the control packets, particularly ACKnowledgement (ACK), consume a non-trivial amount of bandwidth....

    Provided By Association for Computing Machinery

  • White Papers // Jun 2011

    Cost-Effectively Offering Private Buffers in SoCs and CMPs

    High performance SoCs and CMPs integrate multiple cores and hardware accelerators such as network interface devices and speech recognition engines. Cores make use of SRAM organized as a cache. Accelerators make use of SRAM as special-purpose storage such as FIFOs, scratchpad memory, or other forms of private buffers. Dedicated private...

    Provided By Association for Computing Machinery

  • White Papers // Nov 2009

    Leveraging 3D PCRAM Technologies to Reduce Checkpoint Overhead for Future Exascale Systems

    The scalability of future Massively Parallel Processing (MPP) systems is being severely challenged by high failure rates. Current Hard Disk Drive (HDD) checkpointing results in overhead of 25% or more at the petascale. With a direct correlation between checkpoint frequencies and node counts, novel techniques that can take more frequent...

    Provided By Association for Computing Machinery

  • White Papers // Mar 2011

    Games as Motivation in Computer Design Courses: I/O is the Key

    The design of computer games can be a powerful motivator as students learn about computer architecture and design. Students in classes where computer designs are developed and implemented (usually on Field Programmable Gate Arrays (FPGAs)) seem much more highly motivated if their computer design can be used for something visual...

    Provided By Association for Computing Machinery

  • White Papers // Feb 2013

    Multi-user Dynamic Proofs of Data Possession using Trusted Hardware

    In storage outsourcing services, clients store their data on a potentially untrusted server, which has more computational power and storage capacity than the individual clients. In this model, security properties such as integrity, authenticity, and freshness of stored data ought to be provided, while minimizing computational costs at the client,...

    Provided By Association for Computing Machinery

  • White Papers // Jun 2011

    Flexible Aggregate Similarity Search

    Flexible Aggregate Nearest Neighbor similarity search (FANN) extends the Aggregate Nearest Neighbor similarity search (ANN) with added flexibility that is useful in many applications. In this paper, the authors presented a comprehensive study on the FANN problem, by designing exact and approximation methods that work well in low to high...

    Provided By Association for Computing Machinery

  • White Papers // Dec 2013

    Quantifying the Relationship between the Power Delivery Network and Architectural Policies in a 3D-Stacked Memory Device

    Many of the pins on a modern chip are used for power delivery. If fewer pins were used to supply the same current, the wires and pins used for power delivery would have to carry larger currents over longer distances. This results in an "IR-drop" problem, where some of the...

    Provided By Association for Computing Machinery

  • White Papers // Sep 2012

    Optimizing Datacenter Power with Memory System Levers for Guaranteed Quality-of-Service

    Co-location of applications is a proven technique to improve hardware utilization. Recent advances in virtualization have made co-location of independent applications on shared hardware a common scenario in datacenters. Co-location, while maintaining Quality-of-Service (QoS) for each application is a complex problem that is fast gaining relevance for these datacenters. The...

    Provided By Association for Computing Machinery

  • White Papers // Jan 2013

    Layout-Oblivious Compiler Optimization for Matrix Computations

    Most scientific computations serve to apply mathematical operations to a set of preconceived data structures, e.g., matrices, vectors, and grids. In this paper, the authors use a number of widely used matrix computations from the LINPACK library to demonstrate that complex internal organizations of data structures can severely degrade the...

    Provided By Association for Computing Machinery

  • White Papers // Feb 2009

    Comparability Graph Coloring for Optimizing Utilization of Stream Register Files in Stream Processors

    A stream processor executes an application that has been decomposed into a sequence of kernels that operate on streams of data elements. During the execution of a kernel, all streams accessed must be communicated through the SRF (Stream Register File), a non-bypassing software-managed on-chip memory. Therefore, optimizing utilization of the...

    Provided By Association for Computing Machinery

  • White Papers // Oct 2006

    Minimizing Bank Selection Instructions for Partitioned Memory Architectures

    Bank switching is a technique that increases the code and data memory in microcontrollers without extending the address buses. Given a program in which variables have been assigned to data banks, the authors present a novel optimization technique that minimizes the overhead of bank switching through cost-effective placement of bank...

    Provided By Association for Computing Machinery

  • White Papers // Oct 2009

    Optimal Loop Parallelization for Maximizing Iteration-Level Parallelism

    In this paper, the authors solve the open problem of extracting the maximal number of iterations from a loop that can be executed in parallel on Chip Multi-Processors (CMPs). Their algorithm solves it optimally by migrating the weights of parallelism-inhibiting dependences on dependence cycles in two phases. They model dependence...

    Provided By Association for Computing Machinery

  • White Papers // Nov 2013

    Load-Balanced Pipeline Parallelism

    Accelerating a single thread in current parallel systems remains a challenging problem, because sequential threads do not naturally take advantage of the additional cores. Recent work shows that automatic extraction of pipeline parallelism is an effective way to speed up single thread execution. However, two problems remain challenging - load...

    Provided By Association for Computing Machinery

  • White Papers // Jun 2007

    Scratchpad Allocation for Data Aggregates in Superperfect Graphs

    Existing methods place data or code in scratchpad memory (SPM) by relying on heuristics, resorting to integer programming, or mapping the task to a graph coloring problem. In this paper, the SPM allocation problem is formulated as an interval coloring problem. The key observation is that in many...

    Provided By Association for Computing Machinery

  • White Papers // Jun 2008

    Optimizing Scientific Application Loops on Stream Processors

    In this paper, the authors describe a graph coloring compiler framework to allocate on-chip SRF (Stream Register File) storage for optimizing scientific applications on stream processors. Their framework consists of first applying enabling optimizations such as loop unrolling to expose stream reuse and opportunities for maximizing parallelism, i.e., overlapping kernel...

    Provided By Association for Computing Machinery

  • White Papers // Jun 2012

    WCET-Aware Data Selection and Allocation for Scratchpad Memory

    In embedded systems, SPM (ScratchPad Memory) is an attractive alternative to cache memory due to its lower energy consumption and higher predictability of program execution. This paper studies the problem of placing variables of a program into an SPM such that its WCET (Worst-Case Execution Time) is minimized. The authors...

    Provided By Association for Computing Machinery

  • White Papers // Oct 2008

    Exploiting Loop-Dependent Stream Reuse for Stream Processors

    The memory access limits the performance of stream processors. By exploiting the reuse of data held in the Stream Register File (SRF), an on-chip storage, the number of memory accesses can be reduced. In current stream compilers reuse is only attempted for simple stream references, those whose start and end...

    Provided By Association for Computing Machinery

  • White Papers // Sep 2009

    Compiler-Directed Scratchpad Memory Management via Graph Coloring

    ScratchPad Memory (SPM), a fast on-chip SRAM managed by software, is widely used in embedded systems. This paper introduces a general-purpose compiler approach, called memory coloring, to assign static data aggregates such as arrays and structs in a program to an SPM. The novelty of this approach lies in partitioning...

    Provided By Association for Computing Machinery

  • White Papers // Aug 2007

    Minimal Placement of Bank Selection Instructions for Partitioned Memory Architectures

    The authors have devised an algorithm for minimal placement of bank selections in partitioned memory architectures. This algorithm is parameterizable for a chosen metric such as speed, space or energy. Bank switching is a technique that increases the code and data memory in microcontrollers without extending the address buses. Given...

    Provided By Association for Computing Machinery

  • White Papers // Oct 2012

    Embedded Reconfigurable Architectures

    In current-day embedded systems design, one is faced with cut-throat competition to deliver new functionalities in increasingly shorter time frames. This is now achieved by incorporating processor cores into embedded systems through (re)programmability. However, this is not always beneficial for the performance or energy consumption. Therefore, adaptable embedded systems have...

    Provided By Association for Computing Machinery

  • White Papers // Jun 2011

    Juggle: Proactive Load Balancing on Multicore Computers

    The authors investigate proactive dynamic load balancing on multicore systems, in which threads are continually migrated to reduce the impact of processor/thread mismatches to enhance the flexibility of the SPMD-style programming model, and enable SPMD applications to run efficiently in multi-programmed environments. They present Juggle, a practical decentralized, user-space implementation...

    Provided By Association for Computing Machinery

  • White Papers // Aug 2010

    Runtime Parallelization of Legacy Code on a Transactional Memory System

    In this paper, the authors propose a new runtime parallelization technique, based on a dynamic optimization framework, to automatically parallelize single-threaded legacy programs. It heavily leverages the optimistic concurrency of transactional memory. This paper addresses a number of challenges posed by this type of parallelization and quantifies the trade-offs of...

    Provided By Association for Computing Machinery

  • White Papers // Oct 2010

    Improving Scratchpad Allocation with Demand-Driven Data Tiling

    Existing ScratchPad Memory (SPM) allocation algorithms for arrays, whether they rely on heuristics or resort to Integer Linear Programming (ILP) techniques, typically assume that every array is small enough to fit directly into the SPM. As a result, some arrays have to be spilled entirely to the off-chip memory in...

    Provided By Association for Computing Machinery

  • White Papers // Oct 2011

    An Efficient Heuristic for Instruction Scheduling on Clustered VLIW Processors

    Clustering is a well-known technique for improving the scalability of classical VLIW processors. A clustered VLIW processor consists of multiple clusters, each of which has its own register file and functional units. In this paper, the authors present a novel phase coupled priority-based heuristic for scheduling a set of instructions...

    Provided By Association for Computing Machinery

  • White Papers // Oct 2010

    Optimal WCET-Aware Code Selection for Scratchpad Memory

    In this paper, the authors studied the problem of minimizing the worst-case execution time of a loop nest executed on a processor that uses SPM to replace the instruction cache. They proposed the first polynomial-time algorithm for selecting the code of a non-nested loop such that the worst-case execution time...

    Provided By Association for Computing Machinery

  • White Papers // Jan 2012

    SESAM/Par4All: A Tool for Joint Exploration of MPSoC Architectures and Dynamic Dataflow Code Generation

    Due to the increasing complexity of new multiprocessor systems on chip, flexible and accurate simulators become a necessity for exploring the vast design space solution. In a streaming execution model, only a well-balanced pipeline can lead to an efficient implementation. However, with dynamic applications, each stage is prone to execution...

    Provided By Association for Computing Machinery

  • White Papers // Jun 2012

    Memory Management for Many-Core Processors with Software Configurable Locality Policies

    As processors evolve towards higher core counts, architects will develop more sophisticated memory systems to satisfy the cores' increasing thirst for memory bandwidth. Early many-core processor designs suggest that future memory systems will likely include multiple controllers and distributed cache coherence protocols. Many-core processors that expose memory locality policies to...

    Provided By Association for Computing Machinery

  • White Papers // Jun 2013

    Tessellation: Refactoring the OS Around Explicit Resource Containers With Continuous Adaptation

    Adaptive Resource-Centric Computing (ARCC) enables a simultaneous mix of high-throughput parallel, real-time, and interactive applications through automatic discovery of the correct mix of resource assignments necessary to achieve application requirements. This approach, embodied in the Tessellation manycore operating system, distributes resources to QoS domains called cells. Tessellation separates global decisions...

    Provided By Association for Computing Machinery

  • White Papers // Nov 2008

    Comparative Evaluation of Memory Models for Chip Multiprocessors

    There are two competing models for the on-chip memory in Chip Multi-Processor (CMP) systems: hardware-managed coherent caches and software-managed streaming memory. This paper performs a direct comparison of the two models under the same set of assumptions about technology, area, and computational capabilities. The goal is to quantify how and...

    Provided By Association for Computing Machinery

  • White Papers // Apr 2014

    Reconciling High Server Utilization and Sub-Millisecond Quality-of-Service

    The simplest strategy to guarantee good Quality-of-Service (QoS) for a latency-sensitive workload with sub-millisecond latency in a shared cluster environment is to never run other workloads concurrently with it on the same server. Unfortunately, this inevitably leads to low server utilization, reducing both the capability and cost effectiveness of the...

    Provided By Association for Computing Machinery

  • White Papers // Jun 2013

    ZSim: Fast and Accurate Microarchitectural Simulation of Thousand-Core Systems

    Architectural simulation is time-consuming, and the trend towards hundreds of cores is making sequential simulation even slower. Existing parallel simulation techniques either scale poorly due to excessive synchronization, or sacrifice accuracy by allowing event reordering and using simplistic contention models. As a result, most researchers use sequential simulators and model...

    Provided By Association for Computing Machinery

  • White Papers // Jun 2013

    Convolution Engine: Balancing Efficiency & Flexibility in Specialized Computing

    In this paper, the authors focus on the trade-off between flexibility and efficiency in specialized computing. They observe that specialized units achieve most of their efficiency gains by tuning data storage and compute structures and their connectivity to the data-flow and data-locality patterns in the kernels. Hence, by identifying key...

    Provided By Association for Computing Machinery

  • White Papers // Oct 2012

    A Case of System-Level Hardware/Software Co-Design and Co-Verification of a Commodity Multi-Processor System with Custom Hardware

    In this paper, the authors present an interesting system-level co-design and co-verification case study for a non-trivial design where multiple high-performing x86 processors and custom hardware were connected through a coherent interconnection fabric. In functional verification of such a system, they used a processor Bus Functional Model (BFM) to combine...

    Provided By Association for Computing Machinery

  • White Papers // May 2009

    Pleiad: A Cross-Environment Middleware Providing Efficient Multithreading on Clusters

    The engagement of cluster and grid computing, two popular trends of today's high performance computation, has formed an imperative need for efficient utilization of the afforded resources. In this paper, the authors present the concept, design and implementation of the Pleiad platform. Having its origin in the proposition of Distributed...

    Provided By Association for Computing Machinery

  • White Papers // Feb 2009

    Large-Scale Wire-Speed Packet Classification on FPGAs

    Multi-field packet classification is a key enabling function of a variety of network applications, such as firewall processing, Quality of Service (QoS) differentiation, traffic billing, and other value-added services. Although a plethora of research has been done in this area, wire-speed packet classification while supporting large rule sets remains...

    Provided By Association for Computing Machinery

  • White Papers // Nov 2008

    Compact Architecture for High-Throughput Regular Expression Matching on FPGA

    In this paper, the authors present a novel architecture for high-speed and high-capacity Regular Expression Matching (REM) on FPGA. The proposed REM architecture, based on Regular Expression-Nondeterministic Finite Automaton (RE-NFA), efficiently constructs Regular Expression Matching Engines (REME) of arbitrary regular patterns and character classes in a uniform structure, utilizing both...

    Provided By Association for Computing Machinery

  • White Papers // Jul 2009

    Reduction Techniques for Synchronous Dataflow Graphs

    The Synchronous DataFlow (SDF) model of computation is popular for modeling the timing behavior of real-time embedded hardware and software systems and applications. It is an essential ingredient of several automated design-flows and design-space exploration tools. The model can be analyzed for throughput and latency properties. Although the SDF model...

    Provided By Association for Computing Machinery

  • White Papers // Jul 2009

    A Parameterized Compositional Multi-Dimensional Multiple-Choice Knapsack Heuristic for CMP Run-Time Management

    Modern embedded systems typically contain Chip Multi-Processors (CMPs) and support a variety of applications. Applications may run concurrently and can be started and stopped over time. Each application may typically have multiple feasible configurations, trading off quality aspects with resource usage for various types of resources. Overall system quality needs...

    Provided By Association for Computing Machinery

  • White Papers // Oct 2008

    SPaC: A Symbolic Pareto Calculator

    The compositional computation of Pareto points in multi-dimensional optimization problems is an important means to efficiently explore the optimization space. This paper presents a symbolic Pareto calculator, SPaC, for the algebraic computation of multi-dimensional trade-offs. SPaC uses BDDs as a representation for solution sets and operations on them. The tool...

    Provided By Association for Computing Machinery

  • White Papers // May 2013

    A Fast and Scalable Multi-Dimensional Multiple-Choice Knapsack Heuristic

    Many combinatorial optimization problems in the embedded systems and design automation domains involve decision making in multi-dimensional spaces. The Multi-dimensional Multiple-choice Knapsack Problem (MMKP) is among the most challenging of the encountered optimization problems. MMKP problem instances appear for example in chip multiprocessor run-time resource management and in global routing...

    Provided By Association for Computing Machinery

  • White Papers // Nov 2011

    The ReNoC Reconfigurable Network-on-Chip: Architecture, Configuration Algorithms, and Evaluation

    In this paper, the authors present a reconfigurable network-on-chip architecture called ReNoC, which is intended for use in general-purpose multiprocessor system-on-chip platforms, and which enables application-specific logical NoC topologies to be configured, thus providing both efficiency and flexibility. The paper presents three novel algorithms that synthesize an application-specific NoC topology,...

    Provided By Association for Computing Machinery