The University of Maine at Machias

Displaying 1-22 of 22 results

  • White Papers // Jan 2014

    Automatic Data Path Extraction in Large-Scale Register-Transfer Level Designs

    Extracting data paths in large-scale register transfer level designs has important usage in automatic verification of synchronous circuits and synthesis of asynchronous circuits. Current tools rely on users to provide the data/control partition or use state-space analyses to extract data paths. Due to the explosion of state-space, the latter method...

    Provided By The University of Maine at Machias

  • White Papers // Jan 2014

    A Result Forwarding Mechanism for Asynchronous Pipelined Systems

    Modern, fast microprocessors are deeply pipelined to enhance their performance. Thus they cannot afford to wait for each instruction to complete before starting the next. When inter-instruction dependencies are encountered it is essential that data are forwarded from their point of production to where they are needed as rapidly as...

    Provided By The University of Maine at Machias

  • White Papers // Jun 2011

    Routing of Asynchronous Clos Networks

    Clos networks provide the theoretically optimal solution to build high-radix switches. Dynamically reconfiguring a three-stage Clos network is more difficult in asynchronous circuits than in synchronous circuits. This paper proposes a novel Asynchronous Dispatching (AD) algorithm for general three-stage Clos networks. It is compared with the classic synchronous Concurrent Round-Robin...

    Provided By The University of Maine at Machias

  • White Papers // Apr 2011

    Distributed Configuration of Massively-Parallel Simulation on SpiNNaker Neuromorphic Hardware

    SpiNNaker is a massively-parallel neuromorphic computing architecture designed to model very large, biologically plausible spiking neural networks in real-time. A SpiNNaker machine consists of up to 2^16 homogeneous eighteen-core multiprocessor chips, each with an on-board router which forms links with neighboring chips for packet-switched inter-processor communications. The architecture is designed...

    Provided By The University of Maine at Machias

  • White Papers // Sep 2010

    GPU-Enabled Steady-State Solution of Large Markov Models

    More cores, not faster clock speeds, drive performance enhancement in today's processors. The authors describe a novel parallel steady-state solver that uses NVIDIA's Compute Unified Device Architecture (CUDA) library to perform calculations on a Graphics Processing Unit (GPU). They demonstrate speed-ups of over 8 times compared with a CPU-only solver....

    Provided By The University of Maine at Machias

  • White Papers // May 2010

    M-of-N Code Decomposition for Indicating Combinational Logic

    Self-timed circuits present an attractive solution to the problem of process variation. However, implementing self-timed combinational logic is complex and expensive. In particular, mapping large function blocks into cell-libraries is difficult as decomposing gates introduces new signals which may violate indication. This paper presents a novel method for implementing any...

    Provided By The University of Maine at Machias

  • White Papers // May 2010

    A Complete Synthesis Method for Block-Level Relaxation in Self-Timed Datapaths

    Self-timed circuits present an attractive solution to the problem of process variation. However, implementing self-timed combinational logic can be complex and expensive. This paper presents a complete synthesis flow that generates self-timed combinational networks from conventional Boolean networks. The Boolean network is partitioned into small function blocks which are then...

    Provided By The University of Maine at Machias

  • White Papers // Apr 2010

    Algorithm and Software for Simulation of Spiking Neural Networks on the Multi-Chip SpiNNaker System

    In this paper the authors present the algorithm and software developed for parallel simulation of spiking neural networks on multiple SpiNNaker universal neuromorphic chips. It not only describes approaches to simulating neural network models, such as dynamics, neural representations, and synaptic delays, but also presents the software design of loading...

    Provided By The University of Maine at Machias

  • White Papers // Mar 2010

    Wagging Logic: Implicit Parallelism Extraction Using Asynchronous Methodologies

    Asynchronous circuits have a number of potential performance advantages over their synchronous equivalents due to the ability to exploit average-case performance. These advantages are offset by the loss of performance caused by handshaking overheads, which cause designs to be throughput bound. This paper investigates the nature of the...

    Provided By The University of Maine at Machias

  • White Papers // Jan 2010

    The Leaky Integrate-and-Fire Neuron: A Platform for Synaptic Model Exploration on the SpiNNaker Chip

    Large-scale neural hardware systems are trending increasingly towards the "Neuromimetic" architecture: a general-purpose platform that specializes in hardware for neural networks but allows flexibility in model choice. Since the model is not hard-wired into the chip, exploration of different neural and synaptic models is not merely possible but provides a...

    Provided By The University of Maine at Machias

  • White Papers // Jun 2009

    Adaptive Admission Control on the SpiNNaker MPSoC

    Multi-processor systems-on-chip have strict performance, power and area cost goals. A key requirement to enable MPSoC platforms to handle real-time, high-demand applications is on-chip communication service guarantees. End-to-end communication service is critical to maximize both flexibility and performance on a Multi-Processor System-on-Chip (MPSoC). The authors introduce adaptive admission control to...

    Provided By The University of Maine at Machias

  • White Papers // May 2009

    A Synthesisable Quasi-Delay Insensitive Result Forwarding Unit for an Asynchronous Processor

    The implementation of an efficient result forwarding unit for asynchronous processors faces the problem of the inherent lack of synchronization between result producer and consumer units. An efficient, full-custom solution to this problem has been proposed and implemented before (in the AMULET3 asynchronous processor) with the consequent limitations on design-space...

    Provided By The University of Maine at Machias

  • White Papers // Apr 2009

    Power, Delay and Area Efficient Self-Timed Multiplexer and Demultiplexer Designs

    Self-Timed (ST) logic design, in general, guarantees that the required functionality is satisfied irrespective of delays in the circuit components or signal wires. Efficient gate-level design methods for robust self-timed realization of arbitrary-size multiplexer and demultiplexer function blocks, using elements of a commercial standard cell library, are discussed...

    Provided By The University of Maine at Machias

  • White Papers // Apr 2009

    Adaptive Stochastic Routing in Fault-Tolerant On-Chip Networks

    Due to shrinking transistor geometries, on-chip circuits are becoming vulnerable to errors, but at the same time on-chip networks are required to provide reliable services over unreliable physical interconnect. A Connection Oriented Stochastic Routing (COSR) algorithm has been used on one NoC platform that provides excellent fault-tolerance and dynamic reconfiguration...

    Provided By The University of Maine at Machias

  • White Papers // Mar 2009

    Self-Timed Full Adder Designs Based on Hybrid Input Encoding

    Self-timed full adder designs based on commercial synchronous resources (standard cells), constructed using a mix of complete delay-insensitive codes adopted for inputs are described in this paper. While one of the adder designs incorporates redundancy into the logic, the other design does not. Comparisons have been carried out with respect...

    Provided By The University of Maine at Machias

  • White Papers // Feb 2009

    A Programmable Adaptive Router for a GALS Parallel System

    In this paper, the authors describe a router which is the key component of a scalable asynchronous on-chip and inter-chip communication infrastructure for an application-specific parallel computing system. The authors use this system as a universal platform for real time simulations of large-scale neural networks. The communications router supports multiple...

    Provided By The University of Maine at Machias

  • White Papers // Feb 2009

    A Delay Efficient Robust Self-Timed Full Adder

    Addition forms the basis of digital computer systems. A gate-level self-timed full adder design, utilizing a pre-defined set of gates available in a commercial synchronous standard cell library, is discussed in this paper. The proposed adder satisfies Seitz's weak-indication specifications and exhibits reduced data path delay in comparison with...

    Provided By The University of Maine at Machias

  • White Papers // Nov 2008

    System-Level Modelling for SpiNNaker CMP System

    The SpiNNaker Chip-Multi-Processor (CMP) system is a novel SoC architecture, designed specifically for large-scale neural simulations in real-time. The authors have developed a multi-chip complete system simulation for the SpiNNaker massively parallel CMP system using SystemC Transaction Level Modeling (TLM) to analyze architectural tradeoffs, verify the design, and develop/test intended...

    Provided By The University of Maine at Machias

  • White Papers // Aug 2008

    Performance Analysis of Two Synchronizers

    Synchronizers are necessary when importing signals into any clocked domain. As multiple different clocks become increasingly common on chips, synchronizers also proliferate. To achieve high performance it is important that the system designer is aware of the timing characteristics of different synchronizers, which are non-deterministic by nature, and can...

    Provided By The University of Maine at Machias

  • White Papers // Aug 2008

    Configuring a Large-Scale GALS System

    The SpiNNaker massively parallel GALS system has been designed to support large-scale simulations of biologically inspired neural networks in real-time. The system is built around the Chip-Multi-Processor (CMP) technology using low-power ARM processors with an asynchronous Network-on-Chip (NoC) to support high performance parallel distributed processing. A novel asynchronous event-driven boot-up...

    Provided By The University of Maine at Machias

  • White Papers // Jul 2007

    Design and Implementation of an Energy Efficient, Parallel, Asynchronous DSP

    Energy-efficient computing in a DSP has become an important research issue, as longer battery operating time is needed to support modern portable devices. An energy-efficient functional unit has been designed and implemented for an in-house asynchronous DSP named Configurable Asynchronous DSP for Reduced Energy (CADRE)....

    Provided By The University of Maine at Machias

  • White Papers // Oct 2006

    Performance-Driven Syntax-Directed Synthesis of Asynchronous Processors

    The development of robust and efficient synthesis tools is important if asynchronous design is to gain more widespread acceptance. Syntax-directed translation is a powerful synthesis paradigm that compiles transparently a system specification written in a high-level language into a network of pre-designed handshaking modules. The transparency is provided by a...

    Provided By The University of Maine at Machias
