SCIENTIA SINICA Informationis, Volume 47, Issue 7: 846(2017) https://doi.org/10.1360/N112016-00164

## Implementation of parallel FFT algorithm on a ternary optical computer

• AcceptedAug 30, 2016
• PublishedFeb 27, 2017
Share
Rating

### Abstract

The Fast Fourier Transform (FFT) is widely used in modern digital signal processing systems. In order to improve the efficiency of FFT, parallel methods are often used to speed up FFT in high-speed real-time application environments. However, due to the restrictions of size, energy consumption, heat dissipation etc., traditional electronic methods are becoming unsuitable for FFT applications in certain areas, such as aviation, aerospace, etc. Due to its characteristics of a massive number of data bits and low energy consumption, the Ternary Optical Computer (TOC) is a potential solution for these special cases. To verify this possibility, we studied the design and implementation of a high-speed parallel FFT on a TOC. Through analysis of the traditional radix-2, radix-4, and radix-8 DIT FFT operation processes, several FFT algorithms with higher parallelism, implemented on TOC, are designed by taking advantage of the characteristics of TOC. The implementation processes of these algorithms and the differences between them are presented. Meanwhile, the clock cycles and hardware resources required for each algorithm are also discussed. Simulation results reveal that the FFT implementation method is accurate. The algorithms require less power and fewer clock cycles when implemented on a TOC compared to the traditional methods implemented on an FPGA. This provides a new possible solution for high-speed low-power FFT implementation.

### Acknowledgment

• Figure 1

• Figure 2

The process of continuous multiplication and addition ($\alpha_k^j$ stands for the $j$-th intermediate result vector in the $k$-th step)

• Figure 3

Sequence diagram of pipeline scheme to compute $X(i)$

• Figure 4

Radix-4 DIT-FFT algorithm implementation in TOC (one adders module stands for $N/4$ MSD adders)

• Figure 5

(Color online) Comparison of the clock cycles required for different FFT algorithms

• Table 1   Truth table for T, W, T$'$, W$'$, M transformations
 a b T W T$'$ W$'$ M $-$1 $-$1 $-$1 0 $-$1 0 1 $-$1 0 $-$1 1 0 $-$1 0 $-$1 1 0 0 0 0 $-$1 0 $-$1 $-$1 1 0 $-$1 0 0 0 0 0 0 0 0 0 1 1 $-$1 0 1 0 1 $-$1 0 0 0 0 $-$1 1 0 1 $-$1 0 1 0 1 1 1 0 1 0 1
• Table 2   Computation cost of different parallel FFT algorithms (divide DFT once)
 Computation Algorithm Multiplication of a $N$/2-dimensional complex Addition/Subtraction of vector and a complex matrix of order $N$/2 $N$/2-dimensional complex vectors Parallel Radix-2 FFT 2 times 2 times : 1 layer Parallel Radix-4 FFT 4 times 8 times : 2 layers; number of addition or subtraction in each layer are 4, 4 Parallel Radix-8 FFT 16 times 28 times : 3 layers; number of addition or subtraction in each layer are 12, 8, 8
• Table 3   Data bits and clock cycles required by different FFT algorithms implemented on TOC
 Algorithm Clock cycles Data bits Parallel DFT 49 3103774720 Parallel Radix-2 FFT 53 776553472 Parallel Radix-4 FFT 61 195188224 Parallel Radix-8 FFT 109 50556672
• Table 4   First six dimensions of simulation experiment inputs
 Input data MSD binary system Decimal system 1 Real part $10u10uuu01u0uu1u0u1100uuuu1uu10u$ 105.0817 Imaginary part $1110111101u0u11000110u1u01u0u100$ 239.1178 2 Real part $u1100uu010u0u0u1100uuu011u011000$ $-$37.6586 Imaginary part $u0u000uu0u0u10010u110u0uu1uu01uu$ $-$163.2776 3 Real part $00u01u1u00110uu1uu11u0u0u01uu0uu$ $-$26.8343 Imaginary part $110uuu0uu0u0u10u1uu111u011u1u111$ 162.3563 4 Real part $0u00u1uu100uu0u11uu0010uuu11u011$ $-$70.5971 Imaginary part $1u0011u110000uuu0u1uu01u00u1110u$ 75.4718 5 Real part $010u0u00u0u00111u01uuu10001u10u0$ 43.4004 Imaginary part $u0u1u0u0u1uu1u0u1001u0u0100u10uu$ $-$154.4237 6 Real part $011u01uuu001u01u1u1uu0u00uu001u1$ 80.5362 Imaginary part $0011u1u0u11u1u0u01uuuu0111uu1111$ 41.8243
• Table 5   Outputs of simulation experiment to implement parallel Radix-4 FFT
 Output data Ternary optical computer Electronic Result MSD binary system Decimal system computer 1 Real part $u100u01u001u0u10u101uu10000u10u01u$ $-$283.7847 $-$283.7847 Correct Imaginary part $1u01100uu01u000001u00100u110u0$ 21.6407 21.6407 2 Real part $uu10000100u0010u011u00000001u00u00$ $-$636.4482 $-$636.4482 Correct Imaginary part $1u0u01u10u00uu1u0011u00u1u00100100u1$ 811.3220 811.3220 3 Real part $1uu10u1u00010u0u10uu01000u0101u11u1$ 720.7079 720.7079 Correct Imaginary part $1u0u11u10u01u0u10u0000u0u0100000010$ 470.2106 470.2106 4 Real part $1uu1u01u000100u100u0010uu101u00u1$ 82.1162 82.1162 Correct Imaginary part $u10u00100uu10u1u0u010u1000u1001u1u0$ $-$626.5998 $-$626.5998 5 Real part $u011u0u100000uu1000u00u001u00u1u0$ $-$178.0396 $-$178.0396 Correct Imaginary part $1uu1u10u11u00001u10uu110u0000001u00u$ 698.0423 698.0423 6 Real part $1u0000u0010u01000100u1000u10u010u$ 124.4080 124.4080 Correct Imaginary part $u10000u100010u01u001u010u1u011uu100$ $-$519.6082 $-$519.6082
• Table 6   Outputs of simulation experiment to implement parallel Radix-8 FFT
 Output data Ternary optical computer Electronic Result MSD binary system Decimal system computer 1 Real part $u10u100100010u100u001u10000u10u01u$ $-$283.7847 $-$283.7847 Correct Imaginary part $1u10u00uu0010000001001000u10u0$ 21.6407 21.6407 2 Real part $u1u1000010u100100uu010000000010u100$ $-$636.4482 $-$636.4482 Correct Imaginary part $10u010uu10u011u01u1u00u1u0011uu00u1$ 811.3220 811.3220 3 Real part $110u1u00010u0u1u0101000u01010u1u01$ 720.7079 720.7079 Correct Imaginary part $1u0u11u10u01u00u0u0000u0u01000001u0$ 470.2106 470.2106 4 Real part $1u0101u000100u10u100011u101u000u$ 82.1162 82.1162 Correct Imaginary part $u10u1uu00uu10u0100uu00u000u1001u010$ $-$626.5998 $-$626.5998 5 Real part $u1u010u10000u10u0u1100u001u00u1u0$ $-$178.0396 $-$178.0396 Correct Imaginary part $1u10u00u0100001u1u10u10u000000010u1$ 698.0423 698.0423 6 Real part $10000u01u0u01000100u1000u10u01u1$ 124.4080 124.4080 Correct Imaginary part $u100000u001uu1001001u1uu11u1u1uu100$ $-$519.6082 $-$519.6082
• Table 7   Clock cycles comparison versus electronic methods
 Transform time (cycles) FFT design FFT size 16-bit 16-bit 16-bit 16-bit Device 16-point 64-point 256-point 1024-point Electronic method 1 [4] 37 137 525 2065 Spartan-3 Electronic method 2 [5] – – – 1283 XCV2P30 Electronic method 3 [6] 361 – 929 448805 EP2C35 Electronic method 4 [23] – – 1072 – Vertex-II Electronic method 5 [27] – – 255 – EP3SL70 Electronic method 6 [28] – – 255 – 5VSX35T Minimum clock cycle in electronic methods 36 137 255 1283 – Our parallel Radix-4 FFT method 43 49 55 61 TOC Our parallel Radix-8 FFT method 91 97 103 109 TOC
• #### 1

Citations

• Altmetric

Copyright 2020 Science China Press Co., Ltd. 《中国科学》杂志社有限责任公司 版权所有