parallel processing
(615 results)-
White Papers
Parallel Generation of l Sequences
Jan 2009
The generation of pseudo-random sequences at a high rate is an important issue in modern communication schemes. The representation of a sequence can be scaled by decimation to obtain parallelism...
Provided by Princeton University
-
White Papers
Mobile Agent Based Distributed System Computing in Network
Nov 2009
Mobile agents are emerging as a promising paradigm for the design and implementation of distributed applications. While mobile agents have generated considerable excitement in the research...
Provided by Academy Publisher
-
White Papers
A Taxonomy for Desktop Grids From Users Perspective
Jul 2008
The authors present here a taxonomy of desktop grid systems and some hints on how to use it to solve real-world problems. They have customized this taxonomy according to users' perspective to...
Provided by Norwegian University of Science and Technology
-
White Papers
Kademlia: A Peer-to-Peer Information System Based on the XOR Metric
Nov 2007
The authors describe a peer-to-peer distributed hash table with provable consistency and performance in a fault-prone environment. The system routes queries and locates nodes using a novel...
Provided by New York University
-
Whitepapers
Clustered Multiprocessing: Changing the Rules of the Performance Game
Jan 2011
Because improvements in practical computer power lag far behind these exponential hardware improvements, the key to maintaining processor performance has become multiprocessor or multicore design....
Provided by F5 Networks
-
White Papers
Performance Evaluation of Scheduling Precedence-Constrained Computations on Message-Passing Systems
Feb 2009
Scheduling precedence graphs with communication times is the theoretical basis for achieving efficient parallelism in message-passing machines. The lack of global information on the tasks, due to...
Provided by King Fahd University of Petroleum & Minerals
-
White Papers
Heterogeneous Distributed Computing
Jan 2011
One of the biggest challenges with high-performance computing is that as machine architectures become more advanced to obtain increased peak performance, only a small fraction of this performance...
Provided by Purdue University
-
White Papers
A Locked Cache-Based Synchronization Protocol for CMP
Jan 2011
CMP processors are already replacing complex single-core superscalar processor architectures. They offer better performance per watt and area. This is especially true in TLP-rich server and web...
Provided by American University in Cairo
-
White Papers
Working Sets, Cache Sizes, and Node Granularity Issues for Large-Scale Multiprocessors
Nov 2007
The distribution of resources among processors, memory and caches is a crucial question faced by designers of large-scale parallel machines. If a machine is to solve problems with a certain data...
Provided by Stanford University
-
Whitepapers
Defining Twenty First Century Merchandising
Feb 2011
RSR conducts benchmark studies on the state of the art and science of merchandising every year. This year, our going-in hypothesis was that after a series of fits and starts, the retail thought...
Provided by SAP
-
White Papers
Parallel Processing of Sequential Media Algorithms on Heterogeneous Multi-Processor System-on-Chip
Jun 2009
Heterogeneous Multi-Processor System-on-Chip (MPSoC) and media processing are comprehensively applied in mobile electronic commerce. Heterogeneous MPSoCs provide more opportunities for...
Provided by Academy Publisher
-
White Papers
A Multi-Level WEB Based Parallel Processing System a Hierarchical Volunteer Computing Approach
Jun 2009
Over the past few years, a number of efforts have been exerted to build parallel processing systems that utilize the idle power of LANs and PCs available in many homes and corporations. The main...
Provided by World Academy of Science, Engineering and Technology
-
White Papers
Analysis & Integrated Modeling of the Performance Evaluation Techniques for Evaluating Parallel Systems
Jul 2007
Parallel computing has emerged as an environment for computing inherently parallel and computation intensive applications. Performance is always a key factor in determining the success of any...
Provided by Guru Nanak Dev University
-
White Papers
Development of Irregular Routing Algorithms for Parallel Computing Environment
Oct 2007
This paper reviews various routing algorithms for regular and irregular parallel computing networks. Since irregular networks are usually less costly and multipath in nature as compared...
Provided by Guru Jambheshwar University of Science & Technology
-
White Papers
CUDA: Speeding Up Parallel Computing
Nov 2010
CUDA is a computing engine in NVIDIA GPUs that is available to programmers using common programming languages. It's intended for developing parallel processing applications. The CUDA...
Provided by Tongji University
-
White Papers
Towards Chip-on-Chip Neuroscience: Fast Mining of Neuronal Spike Streams Using Graphics Hardware
May 2010
Computational neuroscience is being revolutionized with the advent of multi-electrode arrays that provide real-time, dynamic perspectives into brain function. Mining neuronal spike streams from...
Provided by Association for Computing Machinery
-
White Papers
Communicating Memory Transactions
Feb 2011
Many concurrent programming models enable both transactional memory and message passing. For such models, researchers have built increasingly efficient implementations and defined reasonable...
Provided by Association for Computing Machinery
-
White Papers
Cooperative Cache Partitioning for Chip Multiprocessors
Jun 2007
This paper presents Cooperative Cache Partitioning (CCP) to allocate cache resources among threads concurrently running on CMPs. Unlike cache partitioning schemes that use a single spatial...
Provided by Association for Computing Machinery
-
White Papers
ComputErl - Erlang-Based Framework for Many Task Computing
May 2010
This paper shows how Erlang programming language can be used for creating a framework for distributing and coordinating the execution of many task computing problems. The goals of the proposed...
Provided by Erlang Solutions
-
White Papers
Concurrent Separation Logic With Weak Updates
Apr 2008
Concurrent Separation Logic (CSL) provides a simple but powerful technique for reasoning about shared-memory concurrent programs. Unfortunately, CSL and separation logic can only support "Strong...
Provided by Yale University
-
White Papers
MapCG: Writing Parallel Program Portable Between CPU and GPU
Sep 2010
Graphics Processing Units (GPU) have been playing an important role in the general purpose computing market recently. The common approach to program GPU today is to write GPU specific code with...
Provided by Association for Computational Linguistics
-
White Papers
The Emergence of SA in Enterprise IT - WP
Mar 2011
In June 2010, CA Technologies commissioned Forrester Consulting to evaluate issues surrounding the assurance of critical business services and applications. In surveying more than 150 IT...
Provided by CA
-
White Papers
On the Benefits of Work Stealing in Shared-Memory Multiprocessors
Jan 2011
Load balancing is one of the key techniques exploited to improve the performance of parallel programs. However, load balancing is a difficult task for the programmer. Work stealing is an...
Provided by Carnegie Mellon University
-
White Papers
Single-Level Integrity and Confidentiality Protection for Distributed Shared Memory Multiprocessors
Apr 2008
Multiprocessor computer systems are currently widely used in commercial settings to run critical applications. These applications often operate on sensitive data such as customer records, credit...
Provided by Institute of Electrical and Electronics Engineers
-
White Papers
Supporting Highly-Decoupled Thread-Level Redundancy for Parallel Programs
Apr 2008
The continued scaling of device dimensions and the operating voltage reduces the critical charge and thus natural noise tolerance level of transistors. As a result, circuits can produce transient...
Provided by Institute of Electrical and Electronics Engineers
-
Whitepapers
Get the most out of SSD storage with IBM Storwize v7000 and Easy Tier
Mar 2011
While the performance advantages of SSD storage are clear, the cost is often prohibitive. But what if you can target the data that really needs the performance edge at the SSD drives? You could...
Provided by IBM
-
White Papers
A Performance Study of General Purpose Applications on Graphics Processors
Jan 2011
Graphics Processing Units (GPUs), with many light-weight data-parallel cores, can provide substantial parallel computational power to accelerate general purpose applications. To best utilize the...
Provided by University of Virginia
-
White Papers
Federation: Out-of-Order Execution Using Simple In-Order Cores
Aug 2007
Manycore architectures with dozens, hundreds, or thousands of threads are likely to use single-issue, in-order execution cores with simple pipelines but multiple thread contexts per core. This...
Provided by University of Virginia
-
White Papers
Multi-Mode Energy Management for Multi-Tier Server Clusters
Oct 2008
This paper presents an energy management policy for reconfigurable clusters running a multi-tier application, exploiting DVS together with multiple sleep states. The authors develop a theoretical...
Provided by Association for Computing Machinery
-
White Papers
Leveraging Memory Level Parallelism Using Dynamic Warp Subdivision
Apr 2009
SIMD organizations have been shown to allow high throughput for data-parallel applications. They can operate on multiple datapaths under the same instruction sequencer, with its set of operations...
Provided by University of Virginia
-
White Papers
Automated Dynamic Analysis of CUDA Programs
Mar 2008
Recent increases in the programmability and performance of GPUs have led to a surge of interest in utilizing them for general-purpose computations. Tools such as NVIDIA's CUDA allow programmers to...
Provided by University of Virginia
-
White Papers
Modular Reasoning for Deterministic Parallelism
Jan 2011
Weaving a concurrency control protocol into a program is difficult and error-prone. One way to alleviate this burden is deterministic parallelism. In this well-studied approach to parallelization,...
Provided by Association for Computing Machinery
-
White Papers
Relaxed-Memory Concurrency and Verified Compilation
Jan 2011
In this paper, the authors consider the semantic design and verified compilation of a C-like programming language for concurrent shared memory computation above x86 multiprocessors. The design of...
Provided by Association for Computing Machinery
-
White Papers
A Parallel Framework for Multi-Objective Evolutionary Optimization
May 2010
This work focuses on the development of a parallel framework method to improve the effectiveness and the efficiency of the obtained solutions by Multi-objective Evolutionary Algorithms....
Provided by University of Memphis
-
White Papers
Unifying UPC and MPI Runtimes: Experience With MVAPICH
Oct 2010
Unified Parallel C (UPC) is an emerging parallel programming language that is based on a shared memory paradigm. MPI has been a widely ported and dominant parallel programming model for the past...
Provided by Association for Computing Machinery
-
White Papers
Quantifying Performance Benefits of Overlap Using MPI-2 in a Seismic Modeling Application
Jun 2010
AWM-Olsen is a widely used ground motion simulation code based on a parallel finite difference solution of the 3-D velocity-stress wave equation. This application runs on tens of thousands of...
Provided by Association for Computing Machinery
-
White Papers
Reducing Network Contention With Mixed Workloads on Modern Multicore Clusters
Aug 2009
Multi-core systems are now extremely common in modern clusters. In the past, commodity systems may have had up to two or four CPUs per compute node. In modern clusters, these systems still have the...
Provided by Ohio State University
-
White Papers
Communication-Avoiding QR Decomposition for GPU's
Oct 2010
The authors describe an implementation of the Communication-Avoiding QR (CAQR) factorization that runs entirely on a single graphics processor (GPU). They show that the reduction in memory traffic...
Provided by University of California
-
White Papers
Optimizing Irregular Data Accesses for Cluster and Multi-Core Architectures
Dec 2010
Applications with irregular accesses to shared state are one of the most challenging computational patterns in parallel computing. Accesses can involve both read and write operations, with writes...
Provided by University of California
-
White Papers
Exploiting Parallel Networks Using Dynamic Channel Scheduling
Nov 2008
Many researchers have been focusing on the outcomes and consequences of the rapid increase and proliferation of mobile wireless technologies. If it is not already the case, it will soon be rare...
Provided by ICST
-
Whitepapers
An FPGA Based Hardware Architecture for Network Flow Analysis
Aug 2012
Network monitoring and measurement have become more and more important in a modern complicated network. With the rapid growth in communication, the network link speed is increasing fast. So the...
Provided by EuroJournals
-
Whitepapers
Implementation of Low Power and Small Area 128-Point Mixed Radix 4-2 FFT Processor for OFDM Applications
Aug 2012
Discrete Fourier Transform (DFT) is a very important technique used in modern Digital Signal Processing (DSP) and Telecommunications, especially for the applications involving Orthogonal Frequency...
Provided by EuroJournals
-
Whitepapers
Monitoring and Diagnosis of Multi-Agent Plan: Centralized Approach
Oct 2012
Recently, a number of model-based approaches to monitoring and diagnosis of a Multi-Agent Plan (MAP) have been proposed; this fact signifies that the Artificial Intelligence (AI) community is...
Provided by EuroJournals
-
Whitepapers
Link Quality Based Multipath Routing Protocol in Wireless Mesh Networks
Oct 2012
In Wireless Mesh Networks (WMNs), the source relies on intermediate nodes to reach the destination. As a result, failure in a node greatly affects the transmission of data. To overcome this...
Provided by EuroJournals
-
Whitepapers
Design of 64 Bit Parallel Prefix Adder Using Transmission Gate
Nov 2012
The parallel prefix adder is a technique for increasing the speed of addition in a DSP processor. The proposed 64-bit adder is designed using four different types of prefix cell...
Provided by EuroJournals
-
Whitepapers
A Cross Layer Based Early Initiation of Layer 3 Handover in WiMAX Networks
Nov 2012
Handover Management in Mobile WiMAX is a crucial factor in providing seamless mobility to the users. In an Internetwork when a user moves from one network to another a Layer 3 (L3) HandOver (HO)...
Provided by EuroJournals
-
Whitepapers
Optimizing Microsoft Exchange in the Enterprise: Optimizing the Mailbox Server Role and the Client Access Server
Jan 2013
In this white paper, we first explore key Microsoft Exchange 2010 features: the Mailbox Server Role and Client Access Server (CAS). Through configuration of Exchange databases, resource mailboxes,...
Provided by Global Knowledge
-
Whitepapers
MIMO-OFDM System Based on MVDR Weighting and Approximate Weighting for Interference Cancellation
Dec 2012
Orthogonal Frequency Division Multiplexing (OFDM) has become a popular technique for transmission of signals over wireless channels. In this paper, Multiple-Input Multiple-Output (MIMO) system...
Provided by EuroJournals
-
Whitepapers
Challenges for Parallel I/O in Grid Computing
Jul 2011
With virtually limitless resources, GRID computing has the potential to solve large-scale scientific problems that eclipse even applications that run on the largest computing clusters today. The...
Provided by Northwestern University
-
Whitepapers
Exploring I/O Strategies for Parallel Sequence-Search Tools with S3aSim
Jul 2011
Parallel sequence-search tools are rising in popularity among computational biologists. With the rapid growth of sequence databases, database segmentation is the trend of the future for such...
Provided by Northwestern University
-
Whitepapers
Design and Implementation of an FPGA Architecture for High-Speed Network Feature Extraction
Oct 2007
Network feature extraction involves the storage and classification of network packet activity. Although primarily employed in network intrusion detection systems, feature extraction is also used...
Provided by Northwestern University
-
Whitepapers
Scaling Parallel I/O Performance through I/O Delegate and Caching System
Nov 2008
Increasingly complex scientific applications require massive parallelism to achieve the goals of fidelity and high computational performance. Such applications periodically offload checkpointing...
Provided by Institute of Electrical and Electronics Engineers
-
Whitepapers
Dynamically Adapting File Domain Partitioning Methods for Collective I/O Based on Underlying Parallel File System Locking Protocols
Nov 2008
Collective I/O, such as that provided in MPI-IO, enables process collaboration among a group of processes for greater I/O parallelism. Its implementation involves file domain partitioning, and...
Provided by Institute of Electrical and Electronics Engineers
-
Whitepapers
AHPIOS: An MPI-Based Ad-Hoc Parallel I/O System
Apr 2011
This paper presents the design and implementation of a portable Ad-Hoc Parallel I/O System (AHPIOS). AHPIOS virtualizes on-demand available distributed storage resources and allows the files to be...
Provided by Institute of Electrical and Electronics Engineers
-
Whitepapers
Combining I/O Operations for Multiple Array Variables in Parallel NetCDF
Apr 2011
Parallel netCDF (PnetCDF) is a popular library used in many scientific applications to store scientific datasets. It provides high-performance parallel I/O while maintaining file-format...
Provided by Institute of Electrical and Electronics Engineers
-
Whitepapers
Using Subfiling to Improve Programming Flexibility and Performance of Parallel Shared-file I/O
Apr 2011
There are two popular parallel I/O programming styles used by modern scientific computational applications: unique-file and shared-file. Unique-file I/O usually gives satisfactory performance, but...
Provided by Institute of Electrical and Electronics Engineers
-
Whitepapers
pFANGS: Parallel High Speed Sequence Mapping for Next Generation 454-Roche Sequencing Reads
Mar 2011
Millions of DNA sequences (reads) are generated by Next Generation Sequencing machines every day. There is a need for high performance algorithms to map these sequences to the reference genome to...
Provided by Northwestern University
-
Whitepapers
Automated Tracing of I/O Stack
Sep 2010
Efficient execution of parallel scientific applications requires high-performance storage systems designed to meet their I/O requirements. Most high-performance I/O intensive applications access...
Provided by Springer Healthcare
-
Whitepapers
Accelerating Data Mining Workloads: Current Approaches and Future Challenges in System Architecture Design
Feb 2011
Conventional systems based on general-purpose processors cannot keep pace with the exponential increase in the generation and collection of data. It is therefore important to explore alternative...
Provided by Northwestern University
-
Whitepapers
Supporting Computational Data Model Representation with High-Performance I/O in Parallel NetCDF
Jan 2012
Parallel computational scientific applications have been described by their computation and communication patterns. From a storage and I/O perspective, these applications can also be grouped into...
Provided by Northwestern University
-
Whitepapers
Delegation-Based I/O Mechanism for High Performance Computing Systems
Feb 2012
Massively parallel applications often require periodic data checkpointing for program restart and post-run data analysis. Although high performance computing systems provide massive parallelism...
Provided by Institute of Electrical and Electronics Engineers
-
Whitepapers
A Parallel Tiled Solver for Dense Symmetric Indefinite Systems on Multicore Architectures
Dec 2011
The authors describe an efficient and innovative parallel tiled algorithm for solving symmetric indefinite systems on multicore architectures. This solver avoids pivoting by using a multiplicative...
Provided by INRIA
-
Whitepapers
Block-Asynchronous Multigrid Smoothers for GPU-Accelerated Systems
Dec 2011
This paper explores the need for asynchronous iteration algorithms as smoothers in multi-grid methods. The hardware target for the new algorithms is top-of-the-line, highly parallel hybrid...
Provided by University of Tehran
-
Whitepapers
A Scalable Framework for Heterogeneous GPU-Based Clusters
Jun 2012
GPU-based heterogeneous clusters continue to draw attention from vendors and HPC users due to their high energy efficiency and much improved single-node computational performance; however, there...
Provided by University of Tehran
-
Whitepapers
A Checkpoint-on-Failure Protocol for Algorithm-Based Recovery in Standard MPI
May 2012
Most predictions of Exa-scale machines picture billion-way parallelism, encompassing not only millions of cores, but also tens of thousands of nodes. Even considering extremely optimistic advances...
Provided by University of Tehran
-
Whitepapers
Anatomy of a Globally Recursive Embedded LINPACK Benchmark
Sep 2012
The authors present a complete bottom-up implementation of an embedded LINPACK benchmark on the iPad 2. They use a novel formulation of a recursive LU factorization that is recursive and parallel...
Provided by University of Tehran
-
Whitepapers
Enhancing Parallelism of Tile Bidiagonal Transformation on Multicore Architectures Using Tree Reduction
Jan 2012
This paper aims to enhance the parallelism of the tile bi-diagonal transformation using tree reduction on multicore architectures. First introduced by Ltaief et al., the bi-diagonal transformation...
Provided by University of Malta
-
Whitepapers
Implementing a Blocked Aasen's Algorithm With a Dynamic Scheduler on Multicore Architectures
Sep 2012
Factorization of a dense symmetric indefinite matrix is a key computational kernel in many scientific and engineering simulations. However, it is difficult to develop a scalable factorization...
Provided by University of Tehran
-
Whitepapers
Energy Footprint of Advanced Dense Numerical Linear Algebra Using Tile Algorithms on Multicore Architecture
Jul 2012
The authors propose to study the impact on the energy footprint of two advanced algorithmic strategies in the context of high performance dense linear algebra libraries: mixed precision algorithms...
Provided by University of Tehran
-
Whitepapers
Use of Run Time Predictions for Automatic Co-Allocation of Multi-Cluster Resources for Iterative Parallel Applications
May 2011
Metaschedulers co-allocate resources by requesting a fixed number of processors and usage time for each cluster. These static requests, defined by users, limit the initial scheduling and prevent...
Provided by Reed Business Information
-
Whitepapers
Scheduling Workflow Applications Based on Multi-Source Parallel Data Retrieval in Distributed Computing Networks
Jan 2012
Many scientific experiments are carried out in collaboration with researchers around the world to use existing infrastructures and conduct experiments at massive scale. Data produced by such...
Provided by Oxford University Press
-
Whitepapers
MPPSOCGEN: A Framework for Automatic Generation of MPPSOC Architecture
Apr 2012
Automatic code generation is a standard method in software engineering since it improves the code consistency and reduces the overall development time. In this context, this paper presents a...
Provided by Academy & Industry Research Collaboration Center
-
Whitepapers
An Efficient Multiprocessor Memory Management Framework Using Multi-Agents
Dec 2012
The current generation computer users call for fast addressal of their requests. Multi-processing and Multicore architectures have been adopted for dynamic assignment of a program to two or more...
Provided by Academy & Industry Research Collaboration Center
-
Whitepapers
GPU-Optimized Coarse-Grained MD Simulations of Protein and RNA Folding and Assembly
Dec 2012
Molecular Dynamics (MD) simulations provide a molecular-resolution physical description of the folding and assembly processes, but the size and the timescales of simulations are limited because...
Provided by Wake Apps
-
Whitepapers
Parallel Hermite Interpolation on Extended Fibonacci Cubes
Sep 2012
In curve fitting, interpolation is the process of replacing a continuous function by a polynomial that agrees with the function at a specified number of data points. Using an interpolation polynomial a...
Provided by utix,Inc.
-
Whitepapers
A Study of Different Parallel Implementations of Single Source Shortest Path Algorithms
Sep 2012
The authors present a study of parallel implementations of Single Source Shortest Path (SSSP) algorithms. In the last three decades, a number of parallel SSSP algorithms have been developed and...
Provided by International Journal of Computer Applications
-
Whitepapers
Fast Computation of the Shortest Path Problem Through Simultaneous Forward and Backward Systolic Dynamic Programming
Sep 2012
A systolic parallel system based on simultaneous forward and backward dynamic programming is proposed for the solution of the shortest path problem. The speed-up advantage of this fast systolic...
Provided by International Journal of Computer Applications
-
Whitepapers
Multiple Bad Data Processing using Binary PSO Algorithm Based on PC Cluster System
Dec 2012
In power systems operation, state estimation takes an important role in security control. For the state estimation problem, the Weighted Least Squares (WLS) method and the fast decoupled method...
Provided by Science and Development Network (SciDev.Net)
-
Whitepapers
An Analysis of QoS specific Coherence Issues in Distributed Networks
Dec 2012
Distributed Systems remained one of the most recent development fields in computing research. The parallel applications development faced enormous hindrance in QoS delivery due to resource hungry...
Provided by National University of Sciences and Technology
-
Whitepapers
Modeling Variation of Waiting Time of Distributed Memory Heterogeneous Parallel Computer System Using Recursive Models
Jan 2013
In a heterogeneous parallel computer system, the computational power of each of the processors differs from one another. Furthermore, with distributed memory, the capacity of the memory, which is...
Provided by University of Nicosia
-
White Papers
Intel Multi-Core Processors: Making the Move to Quad-Core and Beyond
Sep 2006
This paper explains the advantages and challenges of multi-core processing, plus provides a glimpse into the upcoming Intel quad-core processors and the direction in which Intel is taking...
Provided by Intel
-
White Papers
MapCG: Writing Parallel Program Portable Between CPU and GPU
Sep 2010
Graphics Processing Units (GPU) have been playing an important role in the general purpose computing market recently. The common approach to program GPU today is to write GPU specific code with...
Provided by Association for Computational Linguistics
-
White Papers
The Emergence of SA in Enterprise IT - WP
Mar 2011
In June 2010, CA Technologies commissioned Forrester Consulting to evaluate issues surrounding the assurance of critical business services and applications. In surveying more than 150 IT...
Provided by CA
-
White Papers
On the Benefits of Work Stealing in Shared-Memory Multiprocessors
Jan 2011
Load balancing is one of the key techniques exploited to improve the performance of parallel programs. However, load balancing is a difficult task for the programmer. Work stealing is an...
Provided by Carnegie Mellon University
-
White Papers
Single-Level Integrity and Confidentiality Protection for Distributed Shared Memory Multiprocessors
Apr 2008
Multiprocessor computer systems are currently widely used in commercial settings to run critical applications. These applications often operate on sensitive data such as customer records, credit...
Provided by Institute of Electrical and Electronics Engineers
-
White Papers
Supporting Highly-Decoupled Thread-Level Redundancy for Parallel Programs
Apr 2008
The continued scaling of device dimensions and the operating voltage reduces the critical charge and thus natural noise tolerance level of transistors. As a result, circuits can produce transient...
Provided by Institute of Electrical and Electronics Engineers
-
Whitepapers
Get the most out of SSD storage with IBM Storwize v7000 and Easy Tier
Mar 2011
While the performance advantages of SSD storage are clear, the cost is often prohibitive. But what if you can target the data that really needs the performance edge at the SSD drives? You could...
Provided by IBM
-
White Papers
A Performance Study of General Purpose Applications on Graphics Processors
Jan 2011
Graphic Processors Unit (GPUs), with many light-weight data-parallel cores, can provide substantial parallel computational power to accelerate general purpose applications. To best utilize the...
Provided by University of Virginia
-
White Papers
Federation: Out-of-Order Execution Using Simple In-Order Cores
Aug 2007
Manycore architectures with dozens, hundreds, or thousands of threads are likely to use single-issue, in-order execution cores with simple pipelines but multiple thread contexts per core. This...
Provided by University of Virginia
-
White Papers
Multi-Mode Energy Management for Multi-Tier Server Clusters
Oct 2008
This paper presents an energy management policy for reconfigurable clusters running a multi-tier application, exploiting DVS together with multiple sleep states. The authors develop a theoretical...
Provided by Association for Computing Machinery
-
White Papers
Leveraging Memory Level Parallelism Using Dynamic Warp Subdivision
Apr 2009
SIMD organizations have shown to allow high throughput for data-parallel applications. They can operate on multiple datapaths under the same instruction sequencer, with its set of operations...
Provided by University of Virginia
-
White Papers
Automated Dynamic Analysis of CUDA Programs
Mar 2008
Recent increases in the programmability and performance of GPUs have led to a surge of interest in utilizing them for general-purpose computations. Tools such as NVIDIA's Cuda allow programmers to...
Provided by University of Virginia
-
White Papers
Modular Reasoning for Deterministic Parallelism
Jan 2011
Weaving a concurrency control protocol into a program is difficult and error-prone. One way to alleviate this burden is deterministic parallelism. In this well-studied approach to parallelization,...
Provided by Association for Computing Machinery
-
White Papers
Relaxed-Memory Concurrency and Verified Compilation
Jan 2011
In this paper, the authors consider the semantic design and verified compilation of a C-like programming language for concurrent shared memory computation above x86 multiprocessors. The design of...
Provided by Association for Computing Machinery
-
White Papers
A Parallel Framework for Multi-Objective Evolutionary Optimization
May 2010
This work focuses on the development of a parallel framework method to improve the effectiveness and the efficiency of the obtained solutions by Multi-objective Evolutionary Algorithms....
Provided by University of Memphis
-
White Papers
Unifying UPC and MPI Runtimes: Experience With MVAPICH
Oct 2010
Unified Parallel C (UPC) is an emerging parallel programming language that is based on a shared memory paradigm. MPI has been a widely ported and dominant parallel programming model for the past...
Provided by Association for Computing Machinery
-
White Papers
Quantifying Performance Benefits of Overlap Using MPI-2 in a Seismic Modeling Application
Jun 2010
AWM-Olsen is a widely used ground motion simulation code based on a parallel finite difference solution of the 3-D velocity-stress wave equation. This application runs on tens of thousands of...
Provided by Association for Computing Machinery
-
White Papers
Reducing Network Contention With Mixed Workloads on Modern Multicore Clusters
Aug 2009
Multi-core systems are now extremely common in modern clusters. In the past, commodity systems may have had up to two or four CPUs per compute node. In modern clusters, these systems still have the...
Provided by Ohio State University
-
White Papers
Communication-Avoiding QR Decomposition for GPU's
Oct 2010
The authors describe an implementation of the Communication-Avoiding QR (CAQR) factorization that runs entirely on a single graphics processor (GPU). They show that the reduction in memory traffic...
Provided by University of California
-
White Papers
Optimizing Irregular Data Accesses for Cluster and Multi-Core Architectures
Dec 2010
Applications with irregular accesses to shared state are one of the most challenging computational patterns in parallel computing. Accesses can involve both read and write operations, with writes...
Provided by University of California
-
White Papers
Exploiting Parallel Networks Using Dynamic Channel Scheduling
Nov 2008
Many researchers have been focusing on the outcomes and consequences of the rapid increase and proliferation of mobile wireless technologies. If it is not already the case, it will soon be rare...
Provided by ICST
-
White Papers
Inter-Core Prefetching for Multicore Processors Using Migrating Helper Threads
Mar 2011
Multicore processors have become ubiquitous in today's systems, but exploiting the parallelism they offer remains difficult, especially for legacy applications and applications with large serial...
Provided by Association for Computing Machinery
-
White Papers
University of Toronto - Department of Computer Science Technical Report CSRG-TR578 Impromptu Clusters for Near-Interactive Cloud-Based Services
Jul 2008
The authors introduce Impromptu Clusters (ICs), a new abstraction that makes it possible to leverage cloud-based clusters to execute short-lived parallel tasks, for example Internet services that...
Provided by University of Toronto
-
White Papers
Target Container: A Target-Centric Parallel Programming Abstraction for Video-Based Surveillance
Jan 2011
Surveillance systems are some of the most computationally intensive applications. Despite technological advances, low cost of sensors, and continuous improvement of computer vision algorithms,...
Provided by Rutgers University
-
White Papers
Logic Soft Errors in a Parallel CISC Decoder
Mar 2010
The instruction decoder is one of the most complex and least regular logic structures in a modern processor that attempts to process multiple variable-length CISC instructions per cycle. This...
Provided by Institute of Electrical and Electronics Engineers
-
White Papers
Performance and Fault Tolerance in the StoreTorrent Parallel Filesystem
Jan 2010
With a goal of supporting the timely and cost-effective analysis of Terabyte datasets on commodity components, the authors present and evaluate StoreTorrent, a simple distributed filesystem with...
Provided by Cornell University
-
Whitepapers
New Storage Whitepaper: Mesabi: IBM Storwize V7000 Easy Tier Midmarket Focused Analyst Paper
Apr 2011
While the performance advantages of SSD storage are clear, the cost is often prohibitive. But what if you can target the data that really needs the performance edge at the SSD drives? You could...
Provided by IBM
-
White Papers
Optimal Multi-Server Allocation to Parallel Queues With Independent Random Queue-Server Connectivity
Apr 2011
The authors investigate an optimal scheduling problem in a discrete-time system of L parallel queues that are served by K identical, randomly connected servers. Each queue may be connected to a...
Provided by University of Toronto
-
White Papers
Web-Based Interface in Public Cluster
Nov 2007
A web-based interface for a cluster computer that is publicly accessible for free is introduced. The interface plays an important role in enabling secure public access, while providing...
Provided by Lembaga Ilmu Pengetahuan Indonesia
-
White Papers
Tiling for Performance Tuning on Different Models of GPUs
Dec 2009
The strategy of using CUDA-compatible GPUs as a parallel computation solution to improve the performance of programs has been more and more widely approved during the last two years since the CUDA...
Provided by University West Trollhattan
-
White Papers
Blocked-Based Sparse Matrix-Vector Multiplication on Distributed Memory Parallel Computers
Apr 2011
The present paper discusses the implementations of sparse matrix-vector products, which are crucial for high performance solutions of large-scale linear equations, on a PC-Cluster. Three storage...
Provided by Association of Colleges of Computing and Information
-
White Papers
Behavior-Based Problem Localization for Parallel File System
Sep 2010
The authors present a behavior-based problem-diagnosis approach for PVFS that analyzes a novel source of instrumentation - CPU instruction-pointer samples and function-call traces - to localize...
Provided by Carnegie Mellon University
-
White Papers
An Empirical Performance Study of Connection Oriented Time Warp Parallel Simulation
Jul 2009
Time warp is a well-known optimistic mechanism for parallel execution of simulation programs. Implementing time warp using a connection-oriented communication approach is proposed in the...
Provided by King Saud University
-
White Papers
Input Variable Selection Using Parallel Processing of RBF Neural Networks
Jan 2010
In this paper the authors propose a new technique focused on the selection of the important input variable for modeling complex systems of function approximation problems, in order to avoid the...
Provided by Arab American University
-
White Papers
Knowledge-Based Modeling Approach for Performance Measurement of Parallel Systems
Jan 2009
Parallel systems are important computing platforms because they offer tremendous potential to solve inherently parallel and computation intensive applications. Performance is always a key...
Provided by Guru Nanak Dev University
-
White Papers
A Parallel Hardware Architecture for the Solution of Linear Equation Systems Implemented over GF(2^n)
Mar 2011
A parallel hardware architecture for the solution of linear equation systems implemented over finite fields is presented in this paper. This proposed hardware architecture could be efficiently...
Provided by South China University of Technology
-
White Papers
Fully Homomorphic SIMD Operations
Mar 2011
At PKC 2010 Smart and Vercauteren presented a variant of Gentry's fully homomorphic public key encryption scheme and mentioned that the scheme could support SIMD style operations. The slow key...
Provided by University of Bristol
-
White Papers
Generic Constructions of Parallel Key-Insulated Encryption: Stronger Security Model and Novel Schemes
Sep 2010
Exposure of a secret key is a significant threat in practice. As a notion of security against key exposure, Dodis et al. advocated key-insulated security, and proposed concrete Key-Insulated...
Provided by National Institute of Advanced Industrial Science and Technology (AIST)
-
White Papers
A Performance Study of General-Purpose Applications on Graphics Processors Using CUDA
Jul 2008
Graphics Processing Units (GPUs) provide a vast number of simple, data-parallel, deeply multithreaded cores and high memory bandwidths. GPU architectures are becoming increasingly programmable,...
Provided by University of Virginia