Subversion Repositories shark

Rev

Rev 3 | Details | Compare with Previous | Last modification | View Log | RSS feed

Rev Author Line No. Line
2 pj 1
            FFTW FREQUENTLY ASKED QUESTIONS WITH ANSWERS
2
                            18 May 1999
3
			     Matteo Frigo
4
			   Steven G. Johnson
5
		       <fftw@theory.lcs.mit.edu>
6
 
7
This is the list of Frequently Asked Questions about FFTW, a collection of
8
fast C routines for computing the Discrete Fourier Transform in one or
9
more dimensions.
10
 
11
===============================================================================
12
 
13
Index
14
 
15
 Section 1.  Introduction and General Information
16
 Q1.1        What is FFTW?
17
 Q1.2        How do I obtain FFTW?
18
 Q1.3        Is FFTW free software?
19
 Q1.4        What is this about non-free licenses?
20
 
21
 Section 2.  Installing FFTW
22
 Q2.1        Which systems does FFTW run on?
23
 Q2.2        Does FFTW run on DOS/Windows?
24
 Q2.3        My compiler crashes when compiling FFTW.
25
 Q2.4        FFTW does not compile on Solaris, complaining about const.
26
 Q2.5        Which language is FFTW written in?
27
 Q2.6        Can I call FFTW from FORTRAN?
28
 Q2.7        Can I call FFTW from C++?
29
 Q2.8        Why isn't FFTW written in FORTRAN/C++?
30
 Q2.9        How do I compile FFTW to run in single precision?
31
 
32
 Section 3.  Using FFTW
33
 Q3.1        FFTW seems really slow.
34
 Q3.2        FFTW gives results different from my old FFT.
35
 Q3.3        Can I save FFTW's plans?
36
 Q3.4        Why does your inverse transform return a scaled result?
37
 Q3.5        How can I make FFTW put the origin (zero frequency) at the center
38
 Q3.6        How do I FFT an image/audio file in *foobar* format?
39
 Q3.7        My program does not link (on Unix).
40
 
41
 Section 4.  Internals of FFTW
42
 Q4.1        How does FFTW work?
43
 Q4.2        Why is FFTW so fast?
44
 Q4.3        What is this wisdom thing?
45
 Q4.4        Why do you use wisdom? I just wanted to save a plan.
46
 
47
 Section 5.  Known bugs
48
 Q5.1        FFTW 1.1 crashes in rfftwnd on Linux.
49
 Q5.2        The MPI transforms in FFTW 1.2 give incorrect results/leak memory.
50
 Q5.3        The test programs in FFTW 1.2.1 fail when I change FFTW to use sin
51
 Q5.4        The test program in FFTW 1.2.1 fails for n > 46340.
52
 Q5.5        The threaded code fails on Linux Redhat 5.0
53
 Q5.6        FFTW 2.0's rfftwnd fails for rank > 1 transforms with a final dime
54
 Q5.7        FFTW 2.0's complex transforms give the wrong results with prime fa
55
 Q5.8        FFTW 2.1.1's MPI test programs crash with MPICH.
56
 
57
===============================================================================
58
 
59
Section 1.  Introduction and General Information
60
 
61
 Q1.1        What is FFTW?
62
 Q1.2        How do I obtain FFTW?
63
 Q1.3        Is FFTW free software?
64
 Q1.4        What is this about non-free licenses?
65
 
66
-------------------------------------------------------------------------------
67
 
68
Question 1.1.  What is FFTW?
69
 
70
FFTW is a free collection of fast C routines for computing the Discrete
71
Fourier Transform in one or more dimensions.  It includes complex, real,
72
and parallel transforms, and can handle arbitrary array sizes efficiently.
73
FFTW is typically faster than other publically-available FFT
74
implementations, and is even competitive with vendor-tuned libraries.
75
(See our web page for extensive benchmarks.) To achieve this performance,
76
FFTW uses novel code-generation and runtime self-optimization techniques
77
(along with many other tricks).
78
 
79
-------------------------------------------------------------------------------
80
 
81
Question 1.2.  How do I obtain FFTW?
82
 
83
FFTW can be found at the FFTW web page.  You can also retrieve it from
84
theory.lcs.mit.edu in /pub/fftw.
85
 
86
-------------------------------------------------------------------------------
87
 
88
Question 1.3.  Is FFTW free software?
89
 
90
Starting with version 1.3, FFTW is Free Software in the technical sense
91
defined by the Free Software Foundation (see Categories of Free and
92
Non-Free Software), and is distributed under the terms of the GNU General
93
Public License.  Previous versions of FFTW were distributed without fee
94
for noncommercial use, but were not technically ``free.''
95
 
96
Non-free licenses for FFTW are also available that permit different terms
97
of use than the GPL.
98
 
99
-------------------------------------------------------------------------------
100
 
101
Question 1.4.  What is this about non-free licenses?
102
 
103
The non-free licenses are for companies that wish to use FFTW in their
104
products but are unwilling to release their software under the GPL (which
105
would require them to release source code and allow free redistribution).
106
Such users can purchase an unlimited-use license from MIT.  Contact us for
107
more details.
108
 
109
We could instead have released FFTW under the LGPL, or even disallowed
110
non-Free usage.  Suffice it to say, however, that MIT owns the copyright
111
to FFTW and they only let us GPL it because we convinced them that it
112
would neither affect their licensing revenue nor irritate existing
113
licensees.
114
 
115
===============================================================================
116
 
117
Section 2.  Installing FFTW
118
 
119
 Q2.1        Which systems does FFTW run on?
120
 Q2.2        Does FFTW run on DOS/Windows?
121
 Q2.3        My compiler crashes when compiling FFTW.
122
 Q2.4        FFTW does not compile on Solaris, complaining about const.
123
 Q2.5        Which language is FFTW written in?
124
 Q2.6        Can I call FFTW from FORTRAN?
125
 Q2.7        Can I call FFTW from C++?
126
 Q2.8        Why isn't FFTW written in FORTRAN/C++?
127
 Q2.9        How do I compile FFTW to run in single precision?
128
 
129
-------------------------------------------------------------------------------
130
 
131
Question 2.1.  Which systems does FFTW run on?
132
 
133
FFTW is written in ANSI C, and should work on any system with a decent C
134
compiler.  (See also Q2.2 `Does FFTW run on DOS/Windows?' and  Q2.3 `My
135
compiler crashes when compiling FFTW.'.)
136
 
137
-------------------------------------------------------------------------------
138
 
139
Question 2.2.  Does FFTW run on DOS/Windows?
140
 
141
It should.  FFTW was not developed on DOS or Windows, but the source code
142
is straight ANSI C.  Some users have reported using FFTW on DOS/Windows
143
using various compilers.  See also the FFTW Windows installation notes and
144
Q2.3 `My compiler crashes when compiling FFTW.'
145
 
146
-------------------------------------------------------------------------------
147
 
148
Question 2.3.  My compiler crashes when compiling FFTW.
149
 
150
Complain fiercely to the vendor of the compiler.
151
 
152
FFTW is a heavily-optimized piece of software that is likely to push
153
compilers to their limits.  We had no problems with, for example, gcc
154
2.7.2, Sun's SC4.0, IBM's XLC, Metrowerks' compilers for the Macintosh,
155
and SGI's compilers for IRIX 6.2.  Users have also reported successful
156
compilations of FFTW using Borland's C/C++ compilers on Windows.
157
 
158
Visual C++ 4.0 crashes when compiling FFTW 1.2 with all optimizations
159
turned on.  Visual C++ 5.0 reportedly produces incorrect code for the real
160
transforms in FFTW 2.x when the option "Maximize speed" is set.  We are
161
told that Service Pack 3 fixes the bug.
162
 
163
Various problems have also been observed with SGI's MIPSpro compilers,
164
versions 7.2.0 and 7.2.1 (you may have to lower the optimization level for
165
some files to get them to compile).  The test program in earlier versions
166
of FFTW had problems with the -xO5 option in Sun's SC4.0 C compiler.
167
 
168
-------------------------------------------------------------------------------
169
 
170
Question 2.4.  FFTW does not compile on Solaris, complaining about const.
171
 
172
We know that at least on Solaris 2.5.x with Sun's compilers 4.2 you might
173
get error messages from make such as
174
 
175
"./fftw.h", line 88: warning: const is a keyword in ANSI C
176
 
177
This is the case when the configure script reports that const does not
178
work:
179
 
180
checking for working const... (cached) no
181
 
182
You should be aware that Solaris comes with two compilers, namely,
183
/opt/SUNWspro/SC4.2/bin/cc and /usr/ucb/cc.  The latter compiler is
184
non-ANSI.  Indeed, it is a perverse shell script that calls the real
185
compiler in non-ANSI mode.  In order to compile FFTW, change your path so
186
that the right cc is used.
187
 
188
To know whether your compiler is the right one,  type cc -V.  If the
189
compiler prints ``ucbcc'', as in
190
 
191
ucbcc: WorkShop Compilers 4.2 30 Oct 1996 C 4.2
192
 
193
then the compiler is wrong.  The right message is something like
194
 
195
cc: WorkShop Compilers 4.2 30 Oct 1996 C 4.2
196
 
197
-------------------------------------------------------------------------------
198
 
199
Question 2.5.  Which language is FFTW written in?
200
 
201
FFTW is written in ANSI C.  Most of the code, however, was automatically
202
generated by a program called genfft, written in the Objective Caml
203
dialect of ML.  You do not need to know ML or to have an Objective Caml
204
compiler in order to use FFTW.
205
 
206
genfft is provided with the FFTW sources, which means that you can play
207
with the code generator if you want.  In this case, you need a working
208
Objective Caml system.  Objective Caml is available from ftp.inria.fr in
209
the directory /lang/caml-light.
210
 
211
-------------------------------------------------------------------------------
212
 
213
Question 2.6.  Can I call FFTW from FORTRAN?
214
 
215
Yes, but not directly.  The main problem is that Fortran cannot pass
216
parameters by value.  However, FFTW can be called indirectly from Fortran
217
through the use of special C "wrapper" routines.  Appropriate wrapper
218
code, documented in the FFTW manual, is included with FFTW (versions 1.3
219
and higher).
220
 
221
-------------------------------------------------------------------------------
222
 
223
Question 2.7.  Can I call FFTW from C++?
224
 
225
Most definitely.  FFTW should compile and run under any C++ compiler.
226
 
227
-------------------------------------------------------------------------------
228
 
229
Question 2.8.  Why isn't FFTW written in FORTRAN/C++?
230
 
231
Because we don't like those languages, and neither approaches the
232
portability of C.
233
 
234
-------------------------------------------------------------------------------
235
 
236
Question 2.9.  How do I compile FFTW to run in single precision?
237
 
238
On a Unix system: configure --enable-float.  On a non-Unix system: edit
239
fftw/fftw.h to #define the symbol FFTW_ENABLE_FLOAT.  In both cases, you
240
must then recompile FFTW.
241
 
242
===============================================================================
243
 
244
Section 3.  Using FFTW
245
 
246
 Q3.1        FFTW seems really slow.
247
 Q3.2        FFTW gives results different from my old FFT.
248
 Q3.3        Can I save FFTW's plans?
249
 Q3.4        Why does your inverse transform return a scaled result?
250
 Q3.5        How can I make FFTW put the origin (zero frequency) at the center
251
 Q3.6        How do I FFT an image/audio file in *foobar* format?
252
 Q3.7        My program does not link (on Unix).
253
 
254
-------------------------------------------------------------------------------
255
 
256
Question 3.1.  FFTW seems really slow.
257
 
258
You are probably recreating the plan before every transform, rather than
259
creating it once and reusing it for all transforms of the same size.  FFTW
260
is designed to be used in the following way:
261
 
262
* First, you create a plan.  This will take several seconds.
263
* Then, you reuse the plan many times to perform FFTs.  These are fast.
264
 
265
If you don't need to compute many transforms and the time for the planner
266
is significant, you have two options.  First, you can use the
267
FFTW_ESTIMATE option in the planner, which uses heuristics instead of
268
runtime measurements and produces a good plan in a short time.  Second,
269
you can use the wisdom feature to precompute the plan; see Q3.3 `Can I
270
save FFTW's plans?'
271
 
272
-------------------------------------------------------------------------------
273
 
274
Question 3.2.  FFTW gives results different from my old FFT.
275
 
276
People follow many different conventions for the DFT, and you should be
277
sure to know the ones that we use (described in the FFTW manual).  In
278
particular, you should be aware that the FFTW_FORWARD/FFTW_BACKWARD
279
directions correspond to signs of -1/+1 in the exponent of the DFT
280
definition.  (*Numerical Recipes* uses the opposite convention.)
281
 
282
You should also know that we compute an unnormalized transform.  In
283
contrast, Matlab is an example of program that computes a normalized
284
transform.  See Q3.4 `Why does your inverse transform return a scaled
285
result?'.
286
 
287
-------------------------------------------------------------------------------
288
 
289
Question 3.3.  Can I save FFTW's plans?
290
 
291
Yes. Starting with version 1.2, FFTW provides the  wisdom mechanism for
292
saving plans.  See Q4.3 `What is this wisdom thing?' and the FFTW manual.
293
 
294
-------------------------------------------------------------------------------
295
 
296
Question 3.4.  Why does your inverse transform return a scaled result?
297
 
298
Computing the forward transform followed by the backward transform (or
299
vice versa) yields the original array scaled by the size of the array.
300
(For multi-dimensional transforms, the size of the array is the product of
301
the dimensions.)  We could, instead, have chosen a normalization that
302
would have returned the unscaled array. Or, to accomodate the many
303
conventions in this matter, the transform routines could have accepted a
304
"scale factor" parameter. We did not do this, however, for two reasons.
305
First, we didn't want to sacrifice performance in the common case where
306
the scale factor is 1. Second, in real applications the FFT is followed or
307
preceded by some computation on the data, into which the scale factor can
308
typically be absorbed at little or no cost.
309
 
310
-------------------------------------------------------------------------------
311
 
312
Question 3.5.  How can I make FFTW put the origin (zero frequency) at the center of its output?
313
 
314
For human viewing of a spectrum, it is often convenient to put the origin
315
in frequency space at the center of the output array, rather than in the
316
zero-th element (the default in FFTW).  If all of the dimensions of your
317
array are even, you can accomplish this by simply multiplying each element
318
of the input array by (-1)^(i + j + ...), where i, j, etcetera are the
319
indices of the element.  (This trick is a general property of the DFT, and
320
is not specific to FFTW.)
321
 
322
-------------------------------------------------------------------------------
323
 
324
Question 3.6.  How do I FFT an image/audio file in *foobar* format?
325
 
326
FFTW performs an FFT on an array of floating-point values.  You can
327
certainly use it to compute the transform of an image or audio stream, but
328
you are responsible for figuring out your data format and converting it to
329
the form FFTW requires.
330
 
331
-------------------------------------------------------------------------------
332
 
333
Question 3.7.  My program does not link (on Unix).
334
 
335
Please use the exact order in which libraries are specified by the FFTW
336
manual (e.g. -lrfftw -lfftw -lm).  Also, note that the libraries must be
337
listed after your program sources/objects.  (The general rule is that if
338
*A* uses *B*, then *A* must be listed before *B* in the link command.).
339
For example, switching the order to -lfftw -lrfftw -lm will fail.
340
 
341
===============================================================================
342
 
343
Section 4.  Internals of FFTW
344
 
345
 Q4.1        How does FFTW work?
346
 Q4.2        Why is FFTW so fast?
347
 Q4.3        What is this wisdom thing?
348
 Q4.4        Why do you use wisdom? I just wanted to save a plan.
349
 
350
-------------------------------------------------------------------------------
351
 
352
Question 4.1.  How does FFTW work?
353
 
354
The innovation (if it can be so called) in FFTW consists in having an
355
interpreter execute the transform.  The program for the interpreter (the
356
*plan*) is computed at runtime according to the characteristics of your
357
machine/compiler.  This peculiar software architecture allows FFTW to
358
adapt itself to almost any machine.
359
 
360
For more details, see the paper "The Fastest Fourier Transform in the
361
West", by M. Frigo and S. G. Johnson, available at the FFTW web page.  See
362
also "FFTW: An Adaptive Software Architecture for the FFT", in ICASSP '98.
363
 
364
-------------------------------------------------------------------------------
365
 
366
Question 4.2.  Why is FFTW so fast?
367
 
368
This is a complex question, and there is no simple answer.  In fact, the
369
authors do not fully know the answer, either.  In addition to many small
370
performance hacks throughout FFTW, there are three general reasons for
371
FFTW's speed.
372
 
373
* 	FFTW uses an internal interpreter to adapt itself to a machine.  See
374
  Q4.1 `How does FFTW work?'.
375
* 	FFTW uses a code generator to produce highly-optimized routines for
376
  computing small transforms.
377
* 	FFTW uses explicit divide-and-conquer to take advantage of the memory
378
  hierarchy.
379
 
380
For more details on these three topics, see the paper "The Fastest Fourier
381
Transform in the West", by M. Frigo and S. G. Johnson, available at the
382
FFTW web page.
383
 
384
-------------------------------------------------------------------------------
385
 
386
Question 4.3.  What is this wisdom thing?
387
 
388
wisdom is the name of the mechanism that FFTW uses to save and restore
389
plans.  Rather than just saving plans, FFTW remembers what it learns about
390
your machine, and becomes wiser and wiser as time passes by.  You can save
391
wisdom for later use.
392
 
393
-------------------------------------------------------------------------------
394
 
395
Question 4.4.  Why do you use wisdom? I just wanted to save a plan.
396
 
397
wisdom could be implemented with less effort than a general plan-saving
398
mechanism would have required.  In addition, wisdom provides additional
399
benefits.  For example, if you are planning transforms of size 1024, and
400
later you want a transform of size 2048, most of the calculations of the
401
1024 case can be reused.
402
 
403
In short, wisdom does more things with less effort, and seemed like The
404
Right Thing to do.
405
 
406
===============================================================================
407
 
408
Section 5.  Known bugs
409
 
410
 Q5.1        FFTW 1.1 crashes in rfftwnd on Linux.
411
 Q5.2        The MPI transforms in FFTW 1.2 give incorrect results/leak memory.
412
 Q5.3        The test programs in FFTW 1.2.1 fail when I change FFTW to use sin
413
 Q5.4        The test program in FFTW 1.2.1 fails for n > 46340.
414
 Q5.5        The threaded code fails on Linux Redhat 5.0
415
 Q5.6        FFTW 2.0's rfftwnd fails for rank > 1 transforms with a final dime
416
 Q5.7        FFTW 2.0's complex transforms give the wrong results with prime fa
417
 Q5.8        FFTW 2.1.1's MPI test programs crash with MPICH.
418
 
419
-------------------------------------------------------------------------------
420
 
421
Question 5.1.  FFTW 1.1 crashes in rfftwnd on Linux.
422
 
423
This bug was fixed in FFTW 1.2.  There was a bug in rfftwnd causing an
424
incorrect amount of memory to be allocated.  The bug showed up in Linux
425
with libc-5.3.12 (and nowhere else that we know of).
426
 
427
-------------------------------------------------------------------------------
428
 
429
Question 5.2.  The MPI transforms in FFTW 1.2 give incorrect results/leak memory.
430
 
431
These bugs were corrected in FFTW 1.2.1.  The MPI transforms (really, just
432
the transpose routines) in FFTW 1.2 had bugs that could cause errors in
433
some situations.
434
 
435
-------------------------------------------------------------------------------
436
 
437
Question 5.3.  The test programs in FFTW 1.2.1 fail when I change FFTW to use single precision.
438
 
439
This bug was fixed in FFTW 1.3.  (Older versions of FFTW did work in
440
single precision, but the test programs didn't--the error tolerances in
441
the tests were set for double precision.)
442
 
443
-------------------------------------------------------------------------------
444
 
445
Question 5.4.  The test program in FFTW 1.2.1 fails for n > 46340.
446
 
447
This bug was fixed in FFTW 1.3.  FFTW 1.2.1 produced the right answer, but
448
the test program was wrong.  For large n, n*n in the naive transform that
449
we used for comparison overflows 32 bit integer precision, breaking the
450
test.
451
 
452
-------------------------------------------------------------------------------
453
 
454
Question 5.5.  The threaded code fails on Linux Redhat 5.0
455
 
456
We had problems with glibc-2.0.5.  The code should work with glibc-2.0.7.
457
 
458
-------------------------------------------------------------------------------
459
 
460
Question 5.6.  FFTW 2.0's rfftwnd fails for rank > 1 transforms with a final dimension >= 65536.
461
 
462
This bug was fixed in FFTW 2.0.1.  (There was a 32-bit integer overflow
463
due to a poorly-parenthesized expression.)
464
 
465
-------------------------------------------------------------------------------
466
 
467
Question 5.7.  FFTW 2.0's complex transforms give the wrong results with prime factors 17 to 97.
468
 
469
There was a bug in the complex transforms that could cause incorrect
470
results under (hopefully rare) circumstances for lengths with
471
intermediate-size prime factors (17-97).  This bug was fixed in FFTW
472
2.1.1.
473
 
474
-------------------------------------------------------------------------------
475
 
476
Question 5.8.  FFTW 2.1.1's MPI test programs crash with MPICH.
477
 
478
This was fixed in FFTW 2.1.2.  The 2.1/2.1.1 MPI test programs crashed
479
when using the MPICH implementation of MPI with the ch_p4 device (TCP/IP);
480
the transforms themselves worked fine.  (The source of the bug was some
481
strange constraints that MPICH imposes on access to the program argument
482
list.)
483