logo

SCIENTIA SINICA Informationis, Volume 47, Issue 7: 846(2017) https://doi.org/10.1360/N112016-00164

Implementation of parallel FFT algorithm on a ternary optical computer

More info
  • ReceivedJun 21, 2016
  • AcceptedAug 30, 2016
  • PublishedFeb 27, 2017

Abstract

The Fast Fourier Transform (FFT) is widely used in modern digital signal processing systems. In order to improve the efficiency of FFT, parallel methods are often used to speed up FFT in high-speed real-time application environments. However, due to the restrictions of size, energy consumption, heat dissipation etc., traditional electronic methods are becoming unsuitable for FFT applications in certain areas, such as aviation, aerospace, etc. Due to its characteristics of a massive number of data bits and low energy consumption, the Ternary Optical Computer (TOC) is a potential solution for these special cases. To verify this possibility, we studied the design and implementation of a high-speed parallel FFT on a TOC. Through analysis of the traditional radix-2, radix-4, and radix-8 DIT FFT operation processes, several FFT algorithms with higher parallelism, implemented on TOC, are designed by taking advantage of the characteristics of TOC. The implementation processes of these algorithms and the differences between them are presented. Meanwhile, the clock cycles and hardware resources required for each algorithm are also discussed. Simulation results reveal that the FFT implementation method is accurate. The algorithms require less power and fewer clock cycles when implemented on a TOC compared to the traditional methods implemented on an FPGA. This provides a new possible solution for high-speed low-power FFT implementation.


Funded by

国家自然科学基金(61572305,61103054)

中国航天科工集团二院“自主"创新项目


Acknowledgment

感谢所有三值光学计算机团队成员, 尤其是徐群博士研究生对本文所做出的贡献.

  • Figure 1

    Operating procedures of plural vector addition/subtraction in Radix-4 FFT algorithm

  • Figure 2

    The process of continuous multiplication and addition ($\alpha_k^j$ stands for the $j$-th intermediate result vector in the $k$-th step)

  • Figure 3

    Sequence diagram of pipeline scheme to compute $X(i)$

  • Figure 4

    Radix-4 DIT-FFT algorithm implementation in TOC (one adders module stands for $N/4$ MSD adders)

  • Figure 5

    (Color online) Comparison of the clock cycles required for different FFT algorithms

  • Table 1   Truth table for T, W, T$'$, W$'$, M transformations
    a b T W T$'$ W$'$ M
    $-$1 $-$1 $-$1 0 $-$1 0 1
    $-$1 0 $-$1 1 0 $-$1 0
    $-$1 1 0 0 0 0 $-$1
    0 $-$1 $-$1 1 0 $-$1 0
    0 0 0 0 0 0 0
    0 1 1 $-$1 0 1 0
    1 $-$1 0 0 0 0 $-$1
    1 0 1 $-$1 0 1 0
    1 1 1 0 1 0 1
  • Table 2   Computation cost of different parallel FFT algorithms (divide DFT once)
    Computation
    Algorithm Multiplication of a $N$/2-dimensional complex Addition/Subtraction of
    vector and a complex matrix of order $N$/2 $N$/2-dimensional complex vectors
    Parallel Radix-2 FFT 2 times 2 times : 1 layer
    Parallel Radix-4 FFT 4 times 8 times : 2 layers; number of addition
    or subtraction in each layer are 4, 4
    Parallel Radix-8 FFT 16 times 28 times : 3 layers; number of addition
    or subtraction in each layer are 12, 8, 8
  • Table 3   Data bits and clock cycles required by different FFT algorithms implemented on TOC
    Algorithm Clock cycles Data bits
    Parallel DFT 49 3103774720
    Parallel Radix-2 FFT 53 776553472
    Parallel Radix-4 FFT 61 195188224
    Parallel Radix-8 FFT 109 50556672
  • Table 4   First six dimensions of simulation experiment inputs
    Input data MSD binary system Decimal system
    1 Real part $10u10uuu01u0uu1u0u1100uuuu1uu10u$ 105.0817
    Imaginary part $1110111101u0u11000110u1u01u0u100$ 239.1178
    2 Real part $u1100uu010u0u0u1100uuu011u011000$ $-$37.6586
    Imaginary part $u0u000uu0u0u10010u110u0uu1uu01uu$ $-$163.2776
    3 Real part $00u01u1u00110uu1uu11u0u0u01uu0uu$ $-$26.8343
    Imaginary part $110uuu0uu0u0u10u1uu111u011u1u111$ 162.3563
    4 Real part $0u00u1uu100uu0u11uu0010uuu11u011$ $-$70.5971
    Imaginary part $1u0011u110000uuu0u1uu01u00u1110u$ 75.4718
    5 Real part $010u0u00u0u00111u01uuu10001u10u0$ 43.4004
    Imaginary part $u0u1u0u0u1uu1u0u1001u0u0100u10uu$ $-$154.4237
    6 Real part $011u01uuu001u01u1u1uu0u00uu001u1$ 80.5362
    Imaginary part $0011u1u0u11u1u0u01uuuu0111uu1111$ 41.8243
  • Table 5   Outputs of simulation experiment to implement parallel Radix-4 FFT
    Output data Ternary optical computer Electronic Result
    MSD binary system Decimal system computer
    1 Real part $u100u01u001u0u10u101uu10000u10u01u$ $-$283.7847 $-$283.7847 Correct
    Imaginary part $1u01100uu01u000001u00100u110u0$ 21.6407 21.6407
    2 Real part $uu10000100u0010u011u00000001u00u00$ $-$636.4482 $-$636.4482 Correct
    Imaginary part $1u0u01u10u00uu1u0011u00u1u00100100u1$ 811.3220 811.3220
    3 Real part $1uu10u1u00010u0u10uu01000u0101u11u1$ 720.7079 720.7079 Correct
    Imaginary part $1u0u11u10u01u0u10u0000u0u0100000010$ 470.2106 470.2106
    4 Real part $1uu1u01u000100u100u0010uu101u00u1$ 82.1162 82.1162 Correct
    Imaginary part $u10u00100uu10u1u0u010u1000u1001u1u0$ $-$626.5998 $-$626.5998
    5 Real part $u011u0u100000uu1000u00u001u00u1u0$ $-$178.0396 $-$178.0396 Correct
    Imaginary part $1uu1u10u11u00001u10uu110u0000001u00u$ 698.0423 698.0423
    6 Real part $1u0000u0010u01000100u1000u10u010u$ 124.4080 124.4080 Correct
    Imaginary part $u10000u100010u01u001u010u1u011uu100$ $-$519.6082 $-$519.6082
  • Table 6   Outputs of simulation experiment to implement parallel Radix-8 FFT
    Output data Ternary optical computer Electronic Result
    MSD binary system Decimal system computer
    1 Real part $u10u100100010u100u001u10000u10u01u$ $-$283.7847 $-$283.7847 Correct
    Imaginary part $1u10u00uu0010000001001000u10u0$ 21.6407 21.6407
    2 Real part $u1u1000010u100100uu010000000010u100$ $-$636.4482 $-$636.4482 Correct
    Imaginary part $10u010uu10u011u01u1u00u1u0011uu00u1$ 811.3220 811.3220
    3 Real part $110u1u00010u0u1u0101000u01010u1u01$ 720.7079 720.7079 Correct
    Imaginary part $1u0u11u10u01u00u0u0000u0u01000001u0$ 470.2106 470.2106
    4 Real part $1u0101u000100u10u100011u101u000u$ 82.1162 82.1162 Correct
    Imaginary part $u10u1uu00uu10u0100uu00u000u1001u010$ $-$626.5998 $-$626.5998
    5 Real part $u1u010u10000u10u0u1100u001u00u1u0$ $-$178.0396 $-$178.0396 Correct
    Imaginary part $1u10u00u0100001u1u10u10u000000010u1$ 698.0423 698.0423
    6 Real part $10000u01u0u01000100u1000u10u01u1$ 124.4080 124.4080 Correct
    Imaginary part $u100000u001uu1001001u1uu11u1u1uu100$ $-$519.6082 $-$519.6082
  • Table 7   Clock cycles comparison versus electronic methods
    Transform time (cycles)
    FFT design FFT size
    16-bit 16-bit 16-bit 16-bit Device
    16-point 64-point 256-point 1024-point
    Electronic method 1 [4] 37 137 525 2065 Spartan-3
    Electronic method 2 [5] 1283 XCV2P30
    Electronic method 3 [6] 361 929 448805 EP2C35
    Electronic method 4 [23] 1072 Vertex-II
    Electronic method 5 [27] 255 EP3SL70
    Electronic method 6 [28] 255 5VSX35T
    Minimum clock cycle in electronic methods 36 137 255 1283
    Our parallel Radix-4 FFT method 43 49 55 61 TOC
    Our parallel Radix-8 FFT method 91 97 103 109 TOC

Copyright 2019 Science China Press Co., Ltd. 《中国科学》杂志社有限责任公司 版权所有

京ICP备18024590号-1