High-Performance Radix-2, 3 and 5 Parallel 1-D Complex FFT Algorithms for Distributed-Memory Parallel Computers | The Journal of Supercomputing Skip to main content
Log in

High-Performance Radix-2, 3 and 5 Parallel 1-D Complex FFT Algorithms for Distributed-Memory Parallel Computers

  • Published:
The Journal of Supercomputing Aims and scope Submit manuscript

Abstract

In this paper, we propose high-performance radix-2, 3 and 5 parallel 1-D complex FFT algorithms for distributed-memory parallel computers. We use the four-step or six-step FFT algorithms to implement the radix-2, 3 and 5 parallel 1-D complex FFT algorithms. In our parallel FFT algorithms, since we use cyclic distribution, all-to-all communication takes place only once. Moreover, the input data and output data are both in natural order.

We also show that the suitability of a parallel FFT algorithm is machine-dependent because of the differences in the architecture of the processor elements in distributed-memory parallel computers. Experimental results of 2p3q5r point FFTs on distributed-memory parallel computers, HITACHI SR2201 and IBM SP2 are reported. We succeeded to get performances of about 130 GFLOPS on a 1024PE HITACHI SR2201 and about 1.25 GFLOPS on a 32PE IBM SP2.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic
¥17,985 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Price includes VAT (Japan)

Instant access to the full article PDF.

Similar content being viewed by others

References

  1. R. C. Agarwal and J. W. Cooley, Vectorized mixed radix discrete Fourier transform algorithms. Proc. IEEE, 75: 1283-1292, 1987.

    Google Scholar 

  2. R. C. Agarwal and F. G. Gustavson, and M. Zubair, M., A high performance parallel algorithm for 1-D FFT. '94, pp. 34-40, 1994.

  3. A. Averbuch, E. Gabber, and B. Gordissky, and Y. Medan, A parallel FFT on a MIMD machine. Parallel Computing, 15: 61-74, 1990.

    Google Scholar 

  4. D. H. Bailey, FFTs in external or hierarchical memory. The Journal of Supercomputing, 4: 23-35, 1990.

    Google Scholar 

  5. G. D. Bergland, A fast Fourier transform algorithm using base 8 iterations. Math. Comp., 22: 275-279, 1968.

    Google Scholar 

  6. T. Boku, K. Itakura, H. Nakamura, and K. Nakazawa, CP-PACS: A massively parallel processor for large scale scientific calculations, Proceedings of the 1997 International Conference on Supercomputing, pp. 108-115, 1997.

  7. J. W. Cooley and J. W. Tukey, An algorithm for the machine calculation of complex Fourier series. Math. Comp., 19: 297-301, 1965.

    Google Scholar 

  8. M. Hegland, Real and complex fast Fourier transforms on the Fujitsu VPP 500. Parallel Computing, 22: 539-553, 1996.

    Google Scholar 

  9. S. L. Johnsson and R. L. Krawitz, Cooley-Tukey FFT on the connection machine. Parallel Computing, 18, 1201-1221, 1992.

    Google Scholar 

  10. Message Passing Interface Forum, MPI: A Message-Passing Interface Standard, Version 1.1, 1995.

  11. K. Nakazawa, H. Nakamura, H. Imori, and S. Kawabe, Pseudo vector processor based on registerwindowed superscalar pipeline, '92, pp. 642-651, 1992.

  12. C. M. Rader, Discrete Fourier transforms when the number of data samples is prime. Proc. IEEE, 56: 1107-1108, 1968.

    Google Scholar 

  13. R. C. Singleton, An algorithm for computing the mixed radix fast Fourier transform. IEEE Trans. Audio Electroacoust., 17: 93-103, 1969.

    Google Scholar 

  14. P. N. Swarztrauber, FFT algorithms for vector computers. Parallel Computing, 1: 45-63, 1984.

    Google Scholar 

  15. P. N. Swarztrauber, Multiprocessor FFTs, Parallel Computing, 5: 197-210, 1987.

    Google Scholar 

  16. C. Temperton, A note on prime factor FFT algorithms, J. Comput. Phys., 52: 198-204, 1983.

    Google Scholar 

  17. C. Temperton, Self-sorting mixed-radix fast Fourier transforms, J. Comput. Phys., 52: 1-23, 1983.

    Google Scholar 

  18. C. Temperton, A generalized prime factor FFT algorithm for any N = 2p3q5r, SIAM J. Sci. Stat. Comput., 13: 676-686, 1992.

    Google Scholar 

  19. C. Van Loan, Computational frameworks for the Fast Fourier Transform, SIAM Press, Philadelphia, 1992.

    Google Scholar 

  20. S. Winograd, On computing the discrete Fourier transform, Math. Comp., 32, 175-199, 1978.

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Rights and permissions

Reprints and permissions

About this article

Cite this article

Takahashi, D., Kanada, Y. High-Performance Radix-2, 3 and 5 Parallel 1-D Complex FFT Algorithms for Distributed-Memory Parallel Computers. The Journal of Supercomputing 15, 207–228 (2000). https://doi.org/10.1023/A:1008160021085

Download citation

  • Issue Date:

  • DOI: https://doi.org/10.1023/A:1008160021085

Navigation