Research and Educational Society
This paper presents the implementation of a 2D 8 × 8 DCT block using a concurrent architecture. The block contains 16 processing elements working in parallel. High speed and high throughput are achieved by combining bit-serial and bit-parallel architectures with pipelining. The multiplier-accumulators in the DCT architecture are designed using distributed arithmetic, which reduces area by eliminating the parallel multipliers. Furthermore, very high-speed operation can be achieved because the critical path is formed by adders instead of multipliers.
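To illustrate how distributed arithmetic replaces multipliers with table lookups, shifts, and adds, the following is a minimal software sketch of a distributed-arithmetic inner product. It is not the hardware design described in this paper. The function names `build_lut` and `da_inner_product`, the 8-bit two's-complement input width, and the integer (fixed-point) coefficients are all assumptions made for illustration.

```python
def build_lut(coeffs):
    """Precompute, for every bit pattern p, the sum of coeffs[k] where bit k of p is set."""
    n = len(coeffs)
    return [
        sum(c for k, c in enumerate(coeffs) if (p >> k) & 1)
        for p in range(1 << n)
    ]


def da_inner_product(coeffs, xs, bits=8):
    """Compute sum(coeffs[k] * xs[k]) using only LUT reads, shifts, and adds.

    Assumption: each x is a signed integer that fits in `bits`-bit
    two's complement; coeffs are integers (e.g. scaled fixed-point values).
    """
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if any(x < lo or x > hi for x in xs):
        raise ValueError("input does not fit in the assumed word length")

    lut = build_lut(coeffs)
    mask = (1 << bits) - 1
    words = [x & mask for x in xs]  # two's-complement encoding of each input

    acc = 0
    for b in range(bits):  # process one bit plane per step (bit-serial)
        # Gather bit b of every input into a LUT address.
        pattern = 0
        for k, w in enumerate(words):
            pattern |= ((w >> b) & 1) << k
        term = lut[pattern] << b
        if b == bits - 1:
            acc -= term  # the sign bit carries weight -2^(bits-1)
        else:
            acc += term
    return acc


if __name__ == "__main__":
    coeffs = [3, -5, 7, 2]        # hypothetical integer coefficients
    xs = [10, -20, 127, -128]     # hypothetical 8-bit signed samples
    print(da_inner_product(coeffs, xs))
```

In this sketch the loop body contains no multiplication: each step performs one table read, one shift, and one add or subtract, which is why the critical path in a hardware realization runs through adders rather than multipliers.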