WebSVN - shark - Rev 2 - /shark/branches/xen/ports/fftw/faq/fftw-faq.asc

            FFTW FREQUENTLY ASKED QUESTIONS WITH ANSWERS
                            18 May 1999
                             Matteo Frigo
                           Steven G. Johnson
                       <fftw@theory.lcs.mit.edu>

This is the list of Frequently Asked Questions about FFTW, a collection of
fast C routines for computing the Discrete Fourier Transform in one or
more dimensions.

===============================================================================

Index

 Section 1.  Introduction and General Information
 Q1.1        What is FFTW?
 Q1.2        How do I obtain FFTW?
 Q1.3        Is FFTW free software?
 Q1.4        What is this about non-free licenses?

 Section 2.  Installing FFTW
 Q2.1        Which systems does FFTW run on?
 Q2.2        Does FFTW run on DOS/Windows?
 Q2.3        My compiler crashes when compiling FFTW.
 Q2.4        FFTW does not compile on Solaris, complaining about const.
 Q2.5        Which language is FFTW written in?
 Q2.6        Can I call FFTW from FORTRAN?
 Q2.7        Can I call FFTW from C++?
 Q2.8        Why isn't FFTW written in FORTRAN/C++?
 Q2.9        How do I compile FFTW to run in single precision?

 Section 3.  Using FFTW
 Q3.1        FFTW seems really slow.
 Q3.2        FFTW gives results different from my old FFT.
 Q3.3        Can I save FFTW's plans?
 Q3.4        Why does your inverse transform return a scaled result?
 Q3.5        How can I make FFTW put the origin (zero frequency) at the center 
 Q3.6        How do I FFT an image/audio file in *foobar* format?
 Q3.7        My program does not link (on Unix).

 Section 4.  Internals of FFTW
 Q4.1        How does FFTW work?
 Q4.2        Why is FFTW so fast?
 Q4.3        What is this wisdom thing?
 Q4.4        Why do you use wisdom? I just wanted to save a plan.

 Section 5.  Known bugs
 Q5.1        FFTW 1.1 crashes in rfftwnd on Linux.
 Q5.2        The MPI transforms in FFTW 1.2 give incorrect results/leak memory.
 Q5.3        The test programs in FFTW 1.2.1 fail when I change FFTW to use sin
 Q5.4        The test program in FFTW 1.2.1 fails for n > 46340.
 Q5.5        The threaded code fails on Linux Redhat 5.0
 Q5.6        FFTW 2.0's rfftwnd fails for rank > 1 transforms with a final dime
 Q5.7        FFTW 2.0's complex transforms give the wrong results with prime fa
 Q5.8        FFTW 2.1.1's MPI test programs crash with MPICH.

===============================================================================

Section 1.  Introduction and General Information

 Q1.1        What is FFTW?
 Q1.2        How do I obtain FFTW?
 Q1.3        Is FFTW free software?
 Q1.4        What is this about non-free licenses?

-------------------------------------------------------------------------------

Question 1.1.  What is FFTW?

FFTW is a free collection of fast C routines for computing the Discrete
Fourier Transform in one or more dimensions.  It includes complex, real,
and parallel transforms, and can handle arbitrary array sizes efficiently.
FFTW is typically faster than other publically-available FFT
implementations, and is even competitive with vendor-tuned libraries.
(See our web page for extensive benchmarks.) To achieve this performance,
FFTW uses novel code-generation and runtime self-optimization techniques
(along with many other tricks).

-------------------------------------------------------------------------------

Question 1.2.  How do I obtain FFTW?

FFTW can be found at the FFTW web page.  You can also retrieve it from
theory.lcs.mit.edu in /pub/fftw.

-------------------------------------------------------------------------------

Question 1.3.  Is FFTW free software?

Starting with version 1.3, FFTW is Free Software in the technical sense
defined by the Free Software Foundation (see Categories of Free and
Non-Free Software), and is distributed under the terms of the GNU General
Public License.  Previous versions of FFTW were distributed without fee
for noncommercial use, but were not technically ``free.''

Non-free licenses for FFTW are also available that permit different terms
of use than the GPL.

-------------------------------------------------------------------------------

Question 1.4.  What is this about non-free licenses?

The non-free licenses are for companies that wish to use FFTW in their
products but are unwilling to release their software under the GPL (which
would require them to release source code and allow free redistribution).
Such users can purchase an unlimited-use license from MIT.  Contact us for
more details.

We could instead have released FFTW under the LGPL, or even disallowed
non-Free usage.  Suffice it to say, however, that MIT owns the copyright
to FFTW and they only let us GPL it because we convinced them that it
would neither affect their licensing revenue nor irritate existing
licensees.

===============================================================================

Section 2.  Installing FFTW

 Q2.1        Which systems does FFTW run on?
 Q2.2        Does FFTW run on DOS/Windows?
 Q2.3        My compiler crashes when compiling FFTW.
 Q2.4        FFTW does not compile on Solaris, complaining about const.
 Q2.5        Which language is FFTW written in?
 Q2.6        Can I call FFTW from FORTRAN?
 Q2.7        Can I call FFTW from C++?
 Q2.8        Why isn't FFTW written in FORTRAN/C++?
 Q2.9        How do I compile FFTW to run in single precision?

-------------------------------------------------------------------------------

Question 2.1.  Which systems does FFTW run on?

FFTW is written in ANSI C, and should work on any system with a decent C
compiler.  (See also Q2.2 `Does FFTW run on DOS/Windows?' and  Q2.3 `My
compiler crashes when compiling FFTW.'.)

-------------------------------------------------------------------------------

Question 2.2.  Does FFTW run on DOS/Windows?

It should.  FFTW was not developed on DOS or Windows, but the source code
is straight ANSI C.  Some users have reported using FFTW on DOS/Windows
using various compilers.  See also the FFTW Windows installation notes and
Q2.3 `My compiler crashes when compiling FFTW.'

-------------------------------------------------------------------------------

Question 2.3.  My compiler crashes when compiling FFTW.

Complain fiercely to the vendor of the compiler.

FFTW is a heavily-optimized piece of software that is likely to push
compilers to their limits.  We had no problems with, for example, gcc
2.7.2, Sun's SC4.0, IBM's XLC, Metrowerks' compilers for the Macintosh,
and SGI's compilers for IRIX 6.2.  Users have also reported successful
compilations of FFTW using Borland's C/C++ compilers on Windows.

Visual C++ 4.0 crashes when compiling FFTW 1.2 with all optimizations
turned on.  Visual C++ 5.0 reportedly produces incorrect code for the real
transforms in FFTW 2.x when the option "Maximize speed" is set.  We are
told that Service Pack 3 fixes the bug.

Various problems have also been observed with SGI's MIPSpro compilers,
versions 7.2.0 and 7.2.1 (you may have to lower the optimization level for
some files to get them to compile).  The test program in earlier versions
of FFTW had problems with the -xO5 option in Sun's SC4.0 C compiler.

-------------------------------------------------------------------------------

Question 2.4.  FFTW does not compile on Solaris, complaining about const.

We know that at least on Solaris 2.5.x with Sun's compilers 4.2 you might
get error messages from make such as

"./fftw.h", line 88: warning: const is a keyword in ANSI C

This is the case when the configure script reports that const does not
work:

checking for working const... (cached) no

You should be aware that Solaris comes with two compilers, namely,
/opt/SUNWspro/SC4.2/bin/cc and /usr/ucb/cc.  The latter compiler is
non-ANSI.  Indeed, it is a perverse shell script that calls the real
compiler in non-ANSI mode.  In order to compile FFTW, change your path so
that the right cc is used.

To know whether your compiler is the right one,  type cc -V.  If the
compiler prints ``ucbcc'', as in

ucbcc: WorkShop Compilers 4.2 30 Oct 1996 C 4.2

then the compiler is wrong.  The right message is something like

cc: WorkShop Compilers 4.2 30 Oct 1996 C 4.2

-------------------------------------------------------------------------------

Question 2.5.  Which language is FFTW written in?

FFTW is written in ANSI C.  Most of the code, however, was automatically
generated by a program called genfft, written in the Objective Caml
dialect of ML.  You do not need to know ML or to have an Objective Caml
compiler in order to use FFTW.

genfft is provided with the FFTW sources, which means that you can play
with the code generator if you want.  In this case, you need a working
Objective Caml system.  Objective Caml is available from ftp.inria.fr in
the directory /lang/caml-light.

-------------------------------------------------------------------------------

Question 2.6.  Can I call FFTW from FORTRAN?

Yes, but not directly.  The main problem is that Fortran cannot pass
parameters by value.  However, FFTW can be called indirectly from Fortran
through the use of special C "wrapper" routines.  Appropriate wrapper
code, documented in the FFTW manual, is included with FFTW (versions 1.3
and higher).

-------------------------------------------------------------------------------

Question 2.7.  Can I call FFTW from C++?

Most definitely.  FFTW should compile and run under any C++ compiler.

-------------------------------------------------------------------------------

Question 2.8.  Why isn't FFTW written in FORTRAN/C++?

Because we don't like those languages, and neither approaches the
portability of C.

-------------------------------------------------------------------------------

Question 2.9.  How do I compile FFTW to run in single precision?

On a Unix system: configure --enable-float.  On a non-Unix system: edit
fftw/fftw.h to #define the symbol FFTW_ENABLE_FLOAT.  In both cases, you
must then recompile FFTW.

===============================================================================

Section 3.  Using FFTW

 Q3.1        FFTW seems really slow.
 Q3.2        FFTW gives results different from my old FFT.
 Q3.3        Can I save FFTW's plans?
 Q3.4        Why does your inverse transform return a scaled result?
 Q3.5        How can I make FFTW put the origin (zero frequency) at the center 
 Q3.6        How do I FFT an image/audio file in *foobar* format?
 Q3.7        My program does not link (on Unix).

-------------------------------------------------------------------------------

Question 3.1.  FFTW seems really slow.

You are probably recreating the plan before every transform, rather than
creating it once and reusing it for all transforms of the same size.  FFTW
is designed to be used in the following way:

* First, you create a plan.  This will take several seconds.
* Then, you reuse the plan many times to perform FFTs.  These are fast.

If you don't need to compute many transforms and the time for the planner
is significant, you have two options.  First, you can use the
FFTW_ESTIMATE option in the planner, which uses heuristics instead of
runtime measurements and produces a good plan in a short time.  Second,
you can use the wisdom feature to precompute the plan; see Q3.3 `Can I
save FFTW's plans?'

-------------------------------------------------------------------------------

Question 3.2.  FFTW gives results different from my old FFT.

People follow many different conventions for the DFT, and you should be
sure to know the ones that we use (described in the FFTW manual).  In
particular, you should be aware that the FFTW_FORWARD/FFTW_BACKWARD
directions correspond to signs of -1/+1 in the exponent of the DFT
definition.  (*Numerical Recipes* uses the opposite convention.)

You should also know that we compute an unnormalized transform.  In
contrast, Matlab is an example of program that computes a normalized
transform.  See Q3.4 `Why does your inverse transform return a scaled
result?'.

-------------------------------------------------------------------------------

Question 3.3.  Can I save FFTW's plans?

Yes. Starting with version 1.2, FFTW provides the  wisdom mechanism for
saving plans.  See Q4.3 `What is this wisdom thing?' and the FFTW manual.

-------------------------------------------------------------------------------

Question 3.4.  Why does your inverse transform return a scaled result?

Computing the forward transform followed by the backward transform (or
vice versa) yields the original array scaled by the size of the array.
(For multi-dimensional transforms, the size of the array is the product of
the dimensions.)  We could, instead, have chosen a normalization that
would have returned the unscaled array. Or, to accomodate the many
conventions in this matter, the transform routines could have accepted a
"scale factor" parameter. We did not do this, however, for two reasons.
First, we didn't want to sacrifice performance in the common case where
the scale factor is 1. Second, in real applications the FFT is followed or
preceded by some computation on the data, into which the scale factor can
typically be absorbed at little or no cost.

-------------------------------------------------------------------------------

Question 3.5.  How can I make FFTW put the origin (zero frequency) at the center of its output?

For human viewing of a spectrum, it is often convenient to put the origin
in frequency space at the center of the output array, rather than in the
zero-th element (the default in FFTW).  If all of the dimensions of your
array are even, you can accomplish this by simply multiplying each element
of the input array by (-1)^(i + j + ...), where i, j, etcetera are the
indices of the element.  (This trick is a general property of the DFT, and
is not specific to FFTW.)

-------------------------------------------------------------------------------

Question 3.6.  How do I FFT an image/audio file in *foobar* format?

FFTW performs an FFT on an array of floating-point values.  You can
certainly use it to compute the transform of an image or audio stream, but
you are responsible for figuring out your data format and converting it to
the form FFTW requires.

-------------------------------------------------------------------------------

Question 3.7.  My program does not link (on Unix).

Please use the exact order in which libraries are specified by the FFTW
manual (e.g. -lrfftw -lfftw -lm).  Also, note that the libraries must be
listed after your program sources/objects.  (The general rule is that if
*A* uses *B*, then *A* must be listed before *B* in the link command.).
For example, switching the order to -lfftw -lrfftw -lm will fail.

===============================================================================

Section 4.  Internals of FFTW

 Q4.1        How does FFTW work?
 Q4.2        Why is FFTW so fast?
 Q4.3        What is this wisdom thing?
 Q4.4        Why do you use wisdom? I just wanted to save a plan.

-------------------------------------------------------------------------------

Question 4.1.  How does FFTW work?

The innovation (if it can be so called) in FFTW consists in having an
interpreter execute the transform.  The program for the interpreter (the
*plan*) is computed at runtime according to the characteristics of your
machine/compiler.  This peculiar software architecture allows FFTW to
adapt itself to almost any machine.

For more details, see the paper "The Fastest Fourier Transform in the
West", by M. Frigo and S. G. Johnson, available at the FFTW web page.  See
also "FFTW: An Adaptive Software Architecture for the FFT", in ICASSP '98.

-------------------------------------------------------------------------------

Question 4.2.  Why is FFTW so fast?

This is a complex question, and there is no simple answer.  In fact, the
authors do not fully know the answer, either.  In addition to many small
performance hacks throughout FFTW, there are three general reasons for
FFTW's speed.

*       FFTW uses an internal interpreter to adapt itself to a machine.  See
  Q4.1 `How does FFTW work?'.
*       FFTW uses a code generator to produce highly-optimized routines for
  computing small transforms.
*       FFTW uses explicit divide-and-conquer to take advantage of the memory
  hierarchy.

For more details on these three topics, see the paper "The Fastest Fourier
Transform in the West", by M. Frigo and S. G. Johnson, available at the
FFTW web page.

-------------------------------------------------------------------------------

Question 4.3.  What is this wisdom thing?

wisdom is the name of the mechanism that FFTW uses to save and restore
plans.  Rather than just saving plans, FFTW remembers what it learns about
your machine, and becomes wiser and wiser as time passes by.  You can save
wisdom for later use.

-------------------------------------------------------------------------------

Question 4.4.  Why do you use wisdom? I just wanted to save a plan.

wisdom could be implemented with less effort than a general plan-saving
mechanism would have required.  In addition, wisdom provides additional
benefits.  For example, if you are planning transforms of size 1024, and
later you want a transform of size 2048, most of the calculations of the
1024 case can be reused.

In short, wisdom does more things with less effort, and seemed like The
Right Thing to do.

===============================================================================

Section 5.  Known bugs

 Q5.1        FFTW 1.1 crashes in rfftwnd on Linux.
 Q5.2        The MPI transforms in FFTW 1.2 give incorrect results/leak memory.
 Q5.3        The test programs in FFTW 1.2.1 fail when I change FFTW to use sin
 Q5.4        The test program in FFTW 1.2.1 fails for n > 46340.
 Q5.5        The threaded code fails on Linux Redhat 5.0
 Q5.6        FFTW 2.0's rfftwnd fails for rank > 1 transforms with a final dime
 Q5.7        FFTW 2.0's complex transforms give the wrong results with prime fa
 Q5.8        FFTW 2.1.1's MPI test programs crash with MPICH.

-------------------------------------------------------------------------------

Question 5.1.  FFTW 1.1 crashes in rfftwnd on Linux.

This bug was fixed in FFTW 1.2.  There was a bug in rfftwnd causing an
incorrect amount of memory to be allocated.  The bug showed up in Linux
with libc-5.3.12 (and nowhere else that we know of).

-------------------------------------------------------------------------------

Question 5.2.  The MPI transforms in FFTW 1.2 give incorrect results/leak memory.

These bugs were corrected in FFTW 1.2.1.  The MPI transforms (really, just
the transpose routines) in FFTW 1.2 had bugs that could cause errors in
some situations.

-------------------------------------------------------------------------------

Question 5.3.  The test programs in FFTW 1.2.1 fail when I change FFTW to use single precision.

This bug was fixed in FFTW 1.3.  (Older versions of FFTW did work in
single precision, but the test programs didn't--the error tolerances in
the tests were set for double precision.)

-------------------------------------------------------------------------------

Question 5.4.  The test program in FFTW 1.2.1 fails for n > 46340.

This bug was fixed in FFTW 1.3.  FFTW 1.2.1 produced the right answer, but
the test program was wrong.  For large n, n*n in the naive transform that
we used for comparison overflows 32 bit integer precision, breaking the
test.

-------------------------------------------------------------------------------

Question 5.5.  The threaded code fails on Linux Redhat 5.0

We had problems with glibc-2.0.5.  The code should work with glibc-2.0.7.

-------------------------------------------------------------------------------

Question 5.6.  FFTW 2.0's rfftwnd fails for rank > 1 transforms with a final dimension >= 65536.

This bug was fixed in FFTW 2.0.1.  (There was a 32-bit integer overflow
due to a poorly-parenthesized expression.)

-------------------------------------------------------------------------------

Question 5.7.  FFTW 2.0's complex transforms give the wrong results with prime factors 17 to 97.

There was a bug in the complex transforms that could cause incorrect
results under (hopefully rare) circumstances for lengths with
intermediate-size prime factors (17-97).  This bug was fixed in FFTW
2.1.1.

-------------------------------------------------------------------------------

Question 5.8.  FFTW 2.1.1's MPI test programs crash with MPICH.

This was fixed in FFTW 2.1.2.  The 2.1/2.1.1 MPI test programs crashed
when using the MPICH implementation of MPI with the ch_p4 device (TCP/IP);
the transforms themselves worked fine.  (The source of the bug was some
strange constraints that MPICH imposes on access to the program argument
list.)
Subversion Repositories shark

(root)/shark/branches/xen/ports/fftw/faq/fftw-faq.asc - Rev 2