Rev 3 | Details | Compare with Previous | Last modification | View Log | RSS feed
Rev | Author | Line No. | Line |
---|---|---|---|
2 | pj | 1 | <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN"> |
2 | <HTML> |
||
3 | <HEAD> |
||
4 | <!-- This HTML file has been created by texi2html 1.52 |
||
5 | from fftw.texi on 18 May 1999 --> |
||
6 | |||
7 | <TITLE>FFTW - Tutorial</TITLE> |
||
8 | </HEAD> |
||
9 | <BODY TEXT="#000000" BGCOLOR="#FFFFFF"> |
||
10 | Go to the <A HREF="fftw_1.html">first</A>, <A HREF="fftw_1.html">previous</A>, <A HREF="fftw_3.html">next</A>, <A HREF="fftw_10.html">last</A> section, <A HREF="fftw_toc.html">table of contents</A>. |
||
11 | <P><HR><P> |
||
12 | |||
13 | |||
14 | <H1><A NAME="SEC2">Tutorial</A></H1> |
||
15 | <P> |
||
16 | <A NAME="IDX14"></A> |
||
17 | This chapter describes the basic usage of FFTW, i.e., how to compute the |
||
18 | Fourier transform of a single array. This chapter tells the truth, but |
||
19 | not the <EM>whole</EM> truth. Specifically, FFTW implements additional |
||
20 | routines and flags, providing extra functionality, that are not |
||
21 | documented here. See Section <A HREF="fftw_3.html#SEC16">FFTW Reference</A>, for more complete information. |
||
22 | (Note that you need to compile and install FFTW before you can use it in |
||
23 | a program. See Section <A HREF="fftw_6.html#SEC66">Installation and Customization</A>, for the details of |
||
24 | the installation.) |
||
25 | |||
26 | |||
27 | <P> |
||
28 | This tutorial chapter is structured as follows. Section <A HREF="fftw_2.html#SEC3">Complex One-dimensional Transforms Tutorial</A> describes the basic usage of the |
||
29 | one-dimensional transform of complex data. Section <A HREF="fftw_2.html#SEC4">Complex Multi-dimensional Transforms Tutorial</A> describes the basic usage of the |
||
30 | multi-dimensional transform of complex data. Section <A HREF="fftw_2.html#SEC5">Real One-dimensional Transforms Tutorial</A> describes the one-dimensional transform of real |
||
31 | data and its inverse. Finally, Section <A HREF="fftw_2.html#SEC6">Real Multi-dimensional Transforms Tutorial</A> describes the multi-dimensional transform of real data and its |
||
32 | inverse. We recommend that you read these sections in the order that |
||
33 | they are presented. We then discuss two topics in detail. In |
||
34 | Section <A HREF="fftw_2.html#SEC7">Multi-dimensional Array Format</A>, we discuss the various |
||
35 | alternatives for storing multi-dimensional arrays in memory. Section <A HREF="fftw_2.html#SEC13">Words of Wisdom</A> shows how you can save FFTW's plans for future use. |
||
36 | |||
37 | |||
38 | |||
39 | |||
40 | <H2><A NAME="SEC3">Complex One-dimensional Transforms Tutorial</A></H2> |
||
41 | <P> |
||
42 | <A NAME="IDX15"></A> |
||
43 | <A NAME="IDX16"></A> |
||
44 | |||
45 | |||
46 | <P> |
||
47 | The basic usage of FFTW is simple. A typical call to FFTW looks like: |
||
48 | |||
49 | |||
50 | |||
51 | <PRE> |
||
52 | #include <fftw.h> |
||
53 | ... |
||
54 | { |
||
55 | fftw_complex in[N], out[N]; |
||
56 | fftw_plan p; |
||
57 | ... |
||
58 | p = fftw_create_plan(N, FFTW_FORWARD, FFTW_ESTIMATE); |
||
59 | ... |
||
60 | fftw_one(p, in, out); |
||
61 | ... |
||
62 | fftw_destroy_plan(p); |
||
63 | } |
||
64 | </PRE> |
||
65 | |||
66 | <P> |
||
67 | The first thing we do is to create a <EM>plan</EM>, which is an object |
||
68 | <A NAME="IDX17"></A> |
||
69 | that contains all the data that FFTW needs to compute the FFT, using the |
||
70 | following function: |
||
71 | |||
72 | |||
73 | |||
74 | <PRE> |
||
75 | fftw_plan fftw_create_plan(int n, fftw_direction dir, int flags); |
||
76 | </PRE> |
||
77 | |||
78 | <P> |
||
79 | <A NAME="IDX18"></A> |
||
80 | <A NAME="IDX19"></A> |
||
81 | <A NAME="IDX20"></A> |
||
82 | |||
83 | |||
84 | <P> |
||
85 | The first argument, <CODE>n</CODE>, is the size of the transform you are |
||
86 | trying to compute. The size <CODE>n</CODE> can be any positive integer, but |
||
87 | sizes that are products of small factors are transformed most |
||
88 | efficiently. The second argument, <CODE>dir</CODE>, can be either |
||
89 | <CODE>FFTW_FORWARD</CODE> or <CODE>FFTW_BACKWARD</CODE>, and indicates the direction |
||
90 | of the transform you |
||
91 | <A NAME="IDX21"></A> |
||
92 | <A NAME="IDX22"></A> |
||
93 | are interested in. Alternatively, you can use the sign of the exponent |
||
94 | in the transform, -1 or +1, which corresponds to |
||
95 | <CODE>FFTW_FORWARD</CODE> or <CODE>FFTW_BACKWARD</CODE> respectively. The |
||
96 | <CODE>flags</CODE> argument is either <CODE>FFTW_MEASURE</CODE> or |
||
97 | <A NAME="IDX23"></A> |
||
98 | <CODE>FFTW_ESTIMATE</CODE>. <CODE>FFTW_MEASURE</CODE> means that FFTW actually runs |
||
99 | <A NAME="IDX24"></A> |
||
100 | and measures the execution time of several FFTs in order to find the |
||
101 | best way to compute the transform of size <CODE>n</CODE>. This may take some |
||
102 | time, depending on your installation and on the precision of the timer |
||
103 | in your machine. <CODE>FFTW_ESTIMATE</CODE>, on the contrary, does not run |
||
104 | any computation, and just builds a |
||
105 | <A NAME="IDX25"></A> |
||
106 | reasonable plan, which may be sub-optimal. In other words, if your |
||
107 | program performs many transforms of the same size and initialization |
||
108 | time is not important, use <CODE>FFTW_MEASURE</CODE>; otherwise use the |
||
109 | estimate. (A compromise between these two extremes exists. See Section <A HREF="fftw_2.html#SEC13">Words of Wisdom</A>.) |
||
110 | |||
111 | |||
112 | <P> |
||
113 | Once the plan has been created, you can use it as many times as you like |
||
114 | for transforms on arrays of the same size. When you are done with the |
||
115 | plan, you deallocate it by calling <CODE>fftw_destroy_plan(plan)</CODE>. |
||
116 | <A NAME="IDX26"></A> |
||
117 | |||
118 | |||
119 | <P> |
||
120 | The transform itself is computed by passing the plan along with the |
||
121 | input and output arrays to <CODE>fftw_one</CODE>: |
||
122 | |||
123 | |||
124 | |||
125 | <PRE> |
||
126 | void fftw_one(fftw_plan plan, fftw_complex *in, fftw_complex *out); |
||
127 | </PRE> |
||
128 | |||
129 | <P> |
||
130 | <A NAME="IDX27"></A> |
||
131 | |||
132 | |||
133 | <P> |
||
134 | Note that the transform is out of place: <CODE>in</CODE> and <CODE>out</CODE> must |
||
135 | point to distinct arrays. It operates on data of type |
||
136 | <CODE>fftw_complex</CODE>, a data structure with real (<CODE>in[i].re</CODE>) and |
||
137 | imaginary (<CODE>in[i].im</CODE>) floating-point components. The <CODE>in</CODE> |
||
138 | and <CODE>out</CODE> arrays should have the length specified when the plan was |
||
139 | created. An alternative function, <CODE>fftw</CODE>, allows you to |
||
140 | efficiently perform multiple and/or strided transforms (see Section <A HREF="fftw_3.html#SEC16">FFTW Reference</A>). |
||
141 | <A NAME="IDX28"></A> |
||
142 | |||
143 | |||
144 | <P> |
||
145 | The DFT results are stored in-order in the array <CODE>out</CODE>, with the |
||
146 | zero-frequency (DC) component in <CODE>out[0]</CODE>. |
||
147 | <A NAME="IDX29"></A> |
||
148 | The array <CODE>in</CODE> is not modified. Users should note that FFTW |
||
149 | computes an unnormalized DFT, the sign of whose exponent is given by the |
||
150 | <CODE>dir</CODE> parameter of <CODE>fftw_create_plan</CODE>. Thus, computing a |
||
151 | forward followed by a backward transform (or vice versa) results in the |
||
152 | original array scaled by <CODE>n</CODE>. See Section <A HREF="fftw_3.html#SEC23">What FFTW Really Computes</A>, |
||
153 | for the definition of DFT. |
||
154 | <A NAME="IDX30"></A> |
||
155 | |||
156 | |||
157 | <P> |
||
158 | A program using FFTW should be linked with <CODE>-lfftw -lm</CODE> on Unix |
||
159 | systems, or with the FFTW and standard math libraries in general. |
||
160 | <A NAME="IDX31"></A> |
||
161 | |||
162 | |||
163 | |||
164 | |||
165 | <H2><A NAME="SEC4">Complex Multi-dimensional Transforms Tutorial</A></H2> |
||
166 | <P> |
||
167 | <A NAME="IDX32"></A> |
||
168 | <A NAME="IDX33"></A> |
||
169 | |||
170 | |||
171 | <P> |
||
172 | FFTW can also compute transforms of any number of dimensions |
||
173 | (<EM>rank</EM>). The syntax is similar to that for the one-dimensional |
||
174 | <A NAME="IDX34"></A> |
||
175 | transforms, with <SAMP>`fftw_'</SAMP> replaced by <SAMP>`fftwnd_'</SAMP> (which stands |
||
176 | for "<CODE>fftw</CODE> in <CODE>N</CODE> dimensions"). |
||
177 | |||
178 | |||
179 | <P> |
||
180 | As before, we <CODE>#include <fftw.h></CODE> and create a plan for the |
||
181 | transforms, this time of type <CODE>fftwnd_plan</CODE>: |
||
182 | |||
183 | |||
184 | |||
185 | <PRE> |
||
186 | fftwnd_plan fftwnd_create_plan(int rank, const int *n, |
||
187 | fftw_direction dir, int flags); |
||
188 | </PRE> |
||
189 | |||
190 | <P> |
||
191 | <A NAME="IDX35"></A> |
||
192 | <A NAME="IDX36"></A> |
||
193 | <A NAME="IDX37"></A> |
||
194 | |||
195 | |||
196 | <P> |
||
197 | <CODE>rank</CODE> is the dimensionality of the array, and can be any |
||
198 | non-negative integer. The next argument, <CODE>n</CODE>, is a pointer to an |
||
199 | integer array of length <CODE>rank</CODE> containing the (positive) sizes of |
||
200 | each dimension of the array. (Note that the array will be stored in |
||
201 | row-major order. See Section <A HREF="fftw_2.html#SEC7">Multi-dimensional Array Format</A>, for information |
||
202 | on row-major order.) The last two parameters are the same as in |
||
203 | <CODE>fftw_create_plan</CODE>. We now, however, have an additional possible |
||
204 | flag, <CODE>FFTW_IN_PLACE</CODE>, since <CODE>fftwnd</CODE> supports true in-place |
||
205 | <A NAME="IDX38"></A> |
||
206 | <A NAME="IDX39"></A> |
||
207 | <A NAME="IDX40"></A> |
||
208 | transforms. Multiple flags are combined using a bitwise <EM>or</EM> |
||
209 | (<SAMP>`|'</SAMP>). (An <EM>in-place</EM> transform is one in which the output |
||
210 | data overwrite the input data. It thus requires half as much memory |
||
211 | as--and is often faster than--its opposite, an <EM>out-of-place</EM> |
||
212 | transform.) |
||
213 | <A NAME="IDX41"></A> |
||
214 | <A NAME="IDX42"></A> |
||
215 | |||
216 | |||
217 | <P> |
||
218 | For two- and three-dimensional transforms, FFTWND provides alternative |
||
219 | routines that accept the sizes of each dimension directly, rather than |
||
220 | indirectly through a rank and an array of sizes. These are otherwise |
||
221 | identical to <CODE>fftwnd_create_plan</CODE>, and are sometimes more |
||
222 | convenient: |
||
223 | |||
224 | |||
225 | |||
226 | <PRE> |
||
227 | fftwnd_plan fftw2d_create_plan(int nx, int ny, |
||
228 | fftw_direction dir, int flags); |
||
229 | fftwnd_plan fftw3d_create_plan(int nx, int ny, int nz, |
||
230 | fftw_direction dir, int flags); |
||
231 | </PRE> |
||
232 | |||
233 | <P> |
||
234 | <A NAME="IDX43"></A> |
||
235 | <A NAME="IDX44"></A> |
||
236 | |||
237 | |||
238 | <P> |
||
239 | Once the plan has been created, you can use it any number of times for |
||
240 | transforms of the same size. When you do not need a plan anymore, you |
||
241 | can deallocate the plan by calling <CODE>fftwnd_destroy_plan(plan)</CODE>. |
||
242 | <A NAME="IDX45"></A> |
||
243 | |||
244 | |||
245 | <P> |
||
246 | Given a plan, you can compute the transform of an array of data by |
||
247 | calling: |
||
248 | |||
249 | |||
250 | |||
251 | <PRE> |
||
252 | void fftwnd_one(fftwnd_plan plan, fftw_complex *in, fftw_complex *out); |
||
253 | </PRE> |
||
254 | |||
255 | <P> |
||
256 | <A NAME="IDX46"></A> |
||
257 | |||
258 | |||
259 | <P> |
||
260 | Here, <CODE>in</CODE> and <CODE>out</CODE> point to multi-dimensional arrays in |
||
261 | row-major order, of the size specified when the plan was created. In |
||
262 | the case of an in-place transform, the <CODE>out</CODE> parameter is ignored |
||
263 | and the output data are stored in the <CODE>in</CODE> array. The results are |
||
264 | stored in-order, unnormalized, with the zero-frequency component in |
||
265 | <CODE>out[0]</CODE>. |
||
266 | <A NAME="IDX47"></A> |
||
267 | A forward followed by a backward transform (or vice-versa) yields the |
||
268 | original data multiplied by the size of the array (i.e. the product of |
||
269 | the dimensions). See Section <A HREF="fftw_3.html#SEC28">What FFTWND Really Computes</A>, for a discussion |
||
270 | of what FFTWND computes. |
||
271 | <A NAME="IDX48"></A> |
||
272 | |||
273 | |||
274 | <P> |
||
275 | For example, code to perform an in-place FFT of a three-dimensional |
||
276 | array might look like: |
||
277 | |||
278 | |||
279 | |||
280 | <PRE> |
||
281 | #include <fftw.h> |
||
282 | ... |
||
283 | { |
||
284 | fftw_complex in[L][M][N]; |
||
285 | fftwnd_plan p; |
||
286 | ... |
||
287 | p = fftw3d_create_plan(L, M, N, FFTW_FORWARD, |
||
288 | FFTW_MEASURE | FFTW_IN_PLACE); |
||
289 | ... |
||
290 | fftwnd_one(p, &in[0][0][0], NULL); |
||
291 | ... |
||
292 | fftwnd_destroy_plan(p); |
||
293 | } |
||
294 | </PRE> |
||
295 | |||
296 | <P> |
||
297 | Note that <CODE>in</CODE> is a statically-declared array, which is |
||
298 | automatically in row-major order, but we must take the address of the |
||
299 | first element in order to fit the type expected by <CODE>fftwnd_one</CODE>. |
||
300 | (See Section <A HREF="fftw_2.html#SEC7">Multi-dimensional Array Format</A>.) |
||
301 | |||
302 | |||
303 | |||
304 | |||
305 | <H2><A NAME="SEC5">Real One-dimensional Transforms Tutorial</A></H2> |
||
306 | <P> |
||
307 | <A NAME="IDX49"></A> |
||
308 | <A NAME="IDX50"></A> |
||
309 | <A NAME="IDX51"></A> |
||
310 | |||
311 | |||
312 | <P> |
||
313 | If the input data are purely real, you can save roughly a factor of two |
||
314 | in both time and storage by using the <EM>rfftw</EM> transforms, which are |
||
315 | FFTs specialized for real data. The output of a such a transform is a |
||
316 | <EM>halfcomplex</EM> array, which consists of only half of the complex DFT |
||
317 | amplitudes (since the negative-frequency amplitudes for real data are |
||
318 | the complex conjugate of the positive-frequency amplitudes). |
||
319 | <A NAME="IDX52"></A> |
||
320 | |||
321 | |||
322 | <P> |
||
323 | In exchange for these speed and space advantages, the user sacrifices |
||
324 | some of the simplicity of FFTW's complex transforms. First of all, to |
||
325 | allow maximum performance, the output format of the one-dimensional real |
||
326 | transforms is different from that used by the multi-dimensional |
||
327 | transforms. Second, the inverse transform (halfcomplex to real) has the |
||
328 | side-effect of destroying its input array. Neither of these |
||
329 | inconveniences should pose a serious problem for users, but it is |
||
330 | important to be aware of them. (Both the inconvenient output format |
||
331 | and the side-effect of the inverse transform can be ameliorated for |
||
332 | one-dimensional transforms, at the expense of some performance, by using |
||
333 | instead the multi-dimensional transform routines with a rank of one.) |
||
334 | |||
335 | |||
336 | <P> |
||
337 | The computation of the plan is similar to that for the complex |
||
338 | transforms. First, you <CODE>#include <rfftw.h></CODE>. Then, you create a |
||
339 | plan (of type <CODE>rfftw_plan</CODE>) by calling: |
||
340 | |||
341 | |||
342 | |||
343 | <PRE> |
||
344 | rfftw_plan rfftw_create_plan(int n, fftw_direction dir, int flags); |
||
345 | </PRE> |
||
346 | |||
347 | <P> |
||
348 | <A NAME="IDX53"></A> |
||
349 | <A NAME="IDX54"></A> |
||
350 | <A NAME="IDX55"></A> |
||
351 | |||
352 | |||
353 | <P> |
||
354 | <CODE>n</CODE> is the length of the <EM>real</EM> array in the transform (even |
||
355 | for halfcomplex-to-real transforms), and can be any positive integer |
||
356 | (although sizes with small factors are transformed more efficiently). |
||
357 | <CODE>dir</CODE> is either <CODE>FFTW_REAL_TO_COMPLEX</CODE> or |
||
358 | <CODE>FFTW_COMPLEX_TO_REAL</CODE>. |
||
359 | <A NAME="IDX56"></A> |
||
360 | <A NAME="IDX57"></A> |
||
361 | The <CODE>flags</CODE> parameter is the same as in <CODE>fftw_create_plan</CODE>. |
||
362 | |||
363 | |||
364 | <P> |
||
365 | Once created, a plan can be used for any number of transforms, and is |
||
366 | deallocated when you are done with it by calling |
||
367 | <CODE>rfftw_destroy_plan(plan)</CODE>. |
||
368 | <A NAME="IDX58"></A> |
||
369 | |||
370 | |||
371 | <P> |
||
372 | Given a plan, a real-to-complex or complex-to-real transform is computed |
||
373 | by calling: |
||
374 | |||
375 | |||
376 | |||
377 | <PRE> |
||
378 | void rfftw_one(rfftw_plan plan, fftw_real *in, fftw_real *out); |
||
379 | </PRE> |
||
380 | |||
381 | <P> |
||
382 | <A NAME="IDX59"></A> |
||
383 | |||
384 | |||
385 | <P> |
||
386 | (Note that <CODE>fftw_real</CODE> is an alias for the floating-point type for |
||
387 | which FFTW was compiled.) Depending upon the direction of the plan, |
||
388 | either the input or the output array is halfcomplex, and is stored in |
||
389 | the following format: |
||
390 | <A NAME="IDX60"></A> |
||
391 | |||
392 | |||
393 | <p align=center> |
||
394 | r<sub>0</sub>, r<sub>1</sub>, r<sub>2</sub>, ..., r<sub>n/2</sub>, i<sub>(n+1)/2-1</sub>, ..., i<sub>2</sub>, i<sub>1</sub> |
||
395 | </p> |
||
396 | |||
397 | <P> |
||
398 | Here, |
||
399 | r<sub>k</sub> |
||
400 | is the real part of the kth output, and |
||
401 | i<sub>k</sub> |
||
402 | is the imaginary part. (We follow here the C convention that integer |
||
403 | division is rounded down, e.g. 7 / 2 = 3.) For a halfcomplex |
||
404 | array <CODE>hc[]</CODE>, the kth component has its real part in |
||
405 | <CODE>hc[k]</CODE> and its imaginary part in <CODE>hc[n-k]</CODE>, with the |
||
406 | exception of <CODE>k</CODE> <CODE>==</CODE> <CODE>0</CODE> or <CODE>n/2</CODE> (the latter only |
||
407 | if n is even)---in these two cases, the imaginary part is zero due to |
||
408 | symmetries of the real-complex transform, and is not stored. Thus, the |
||
409 | transform of <CODE>n</CODE> real values is a halfcomplex array of length |
||
410 | <CODE>n</CODE>, and vice versa. <A NAME="DOCF1" HREF="fftw_foot.html#FOOT1">(1)</A> This is actually only half of the DFT |
||
411 | spectrum of the data. Although the other half can be obtained by |
||
412 | complex conjugation, it is not required by many applications such as |
||
413 | convolution and filtering. |
||
414 | |||
415 | |||
416 | <P> |
||
417 | Like the complex transforms, the RFFTW transforms are unnormalized, so a |
||
418 | forward followed by a backward transform (or vice-versa) yields the |
||
419 | original data scaled by the length of the array, <CODE>n</CODE>. |
||
420 | <A NAME="IDX61"></A> |
||
421 | |||
422 | |||
423 | <P> |
||
424 | Let us reiterate here our warning that an <CODE>FFTW_COMPLEX_TO_REAL</CODE> |
||
425 | transform has the side-effect of destroying its (halfcomplex) input. |
||
426 | The <CODE>FFTW_REAL_TO_COMPLEX</CODE> transform, however, leaves its (real) |
||
427 | input untouched, just as you would hope. |
||
428 | |||
429 | |||
430 | <P> |
||
431 | As an example, here is an outline of how you might use RFFTW to compute |
||
432 | the power spectrum of a real array (i.e. the squares of the absolute |
||
433 | values of the DFT amplitudes): |
||
434 | <A NAME="IDX62"></A> |
||
435 | |||
436 | |||
437 | |||
438 | <PRE> |
||
439 | #include <rfftw.h> |
||
440 | ... |
||
441 | { |
||
442 | fftw_real in[N], out[N], power_spectrum[N/2+1]; |
||
443 | rfftw_plan p; |
||
444 | int k; |
||
445 | ... |
||
446 | p = rfftw_create_plan(N, FFTW_REAL_TO_COMPLEX, FFTW_ESTIMATE); |
||
447 | ... |
||
448 | rfftw_one(p, in, out); |
||
449 | power_spectrum[0] = out[0]*out[0]; /* DC component */ |
||
450 | for (k = 1; k < (N+1)/2; ++k) /* (k < N/2 rounded up) */ |
||
451 | power_spectrum[k] = out[k]*out[k] + out[N-k]*out[N-k]; |
||
452 | if (N % 2 == 0) /* N is even */ |
||
453 | power_spectrum[N/2] = out[N/2]*out[N/2]; /* Nyquist freq. */ |
||
454 | ... |
||
455 | rfftw_destroy_plan(p); |
||
456 | } |
||
457 | </PRE> |
||
458 | |||
459 | <P> |
||
460 | Programs using RFFTW should link with <CODE>-lrfftw -lfftw -lm</CODE> on Unix, |
||
461 | or with the FFTW, RFFTW, and math libraries in general. |
||
462 | <A NAME="IDX63"></A> |
||
463 | |||
464 | |||
465 | |||
466 | |||
467 | <H2><A NAME="SEC6">Real Multi-dimensional Transforms Tutorial</A></H2> |
||
468 | <P> |
||
469 | <A NAME="IDX64"></A> |
||
470 | |||
471 | |||
472 | <P> |
||
473 | FFTW includes multi-dimensional transforms for real data of any rank. |
||
474 | As with the one-dimensional real transforms, they save roughly a factor |
||
475 | of two in time and storage over complex transforms of the same size. |
||
476 | Also as in one dimension, these gains come at the expense of some |
||
477 | increase in complexity--the output format is different from the |
||
478 | one-dimensional RFFTW (and is more similar to that of the complex FFTW) |
||
479 | and the inverse (complex to real) transforms have the side-effect of |
||
480 | overwriting their input data. |
||
481 | |||
482 | |||
483 | <P> |
||
484 | To use the real multi-dimensional transforms, you first <CODE>#include |
||
485 | <rfftw.h></CODE> and then create a plan for the size and direction of |
||
486 | transform that you are interested in: |
||
487 | |||
488 | |||
489 | |||
490 | <PRE> |
||
491 | rfftwnd_plan rfftwnd_create_plan(int rank, const int *n, |
||
492 | fftw_direction dir, int flags); |
||
493 | </PRE> |
||
494 | |||
495 | <P> |
||
496 | <A NAME="IDX65"></A> |
||
497 | <A NAME="IDX66"></A> |
||
498 | |||
499 | |||
500 | <P> |
||
501 | The first two parameters describe the size of the real data (not the |
||
502 | halfcomplex data, which will have different dimensions). The last two |
||
503 | parameters are the same as those for <CODE>rfftw_create_plan</CODE>. Just as |
||
504 | for fftwnd, there are two alternate versions of this routine, |
||
505 | <CODE>rfftw2d_create_plan</CODE> and <CODE>rfftw3d_create_plan</CODE>, that are |
||
506 | sometimes more convenient for two- and three-dimensional transforms. |
||
507 | <A NAME="IDX67"></A> |
||
508 | <A NAME="IDX68"></A> |
||
509 | Also as in fftwnd, rfftwnd supports true in-place transforms, specified |
||
510 | by including <CODE>FFTW_IN_PLACE</CODE> in the flags. |
||
511 | |||
512 | |||
513 | <P> |
||
514 | Once created, a plan can be used for any number of transforms, and is |
||
515 | deallocated by calling <CODE>rfftwnd_destroy_plan(plan)</CODE>. |
||
516 | |||
517 | |||
518 | <P> |
||
519 | Given a plan, the transform is computed by calling one of the following |
||
520 | two routines: |
||
521 | |||
522 | |||
523 | |||
524 | <PRE> |
||
525 | void rfftwnd_one_real_to_complex(rfftwnd_plan plan, |
||
526 | fftw_real *in, fftw_complex *out); |
||
527 | void rfftwnd_one_complex_to_real(rfftwnd_plan plan, |
||
528 | fftw_complex *in, fftw_real *out); |
||
529 | </PRE> |
||
530 | |||
531 | <P> |
||
532 | <A NAME="IDX69"></A> |
||
533 | <A NAME="IDX70"></A> |
||
534 | |||
535 | |||
536 | <P> |
||
537 | As is clear from their names and parameter types, the former function is |
||
538 | for <CODE>FFTW_REAL_TO_COMPLEX</CODE> transforms and the latter is for |
||
539 | <CODE>FFTW_COMPLEX_TO_REAL</CODE> transforms. (We could have used only a |
||
540 | single routine, since the direction of the transform is encoded in the |
||
541 | plan, but we wanted to correctly express the datatypes of the |
||
542 | parameters.) The latter routine, as we discuss elsewhere, has the |
||
543 | side-effect of overwriting its input (except when the rank of the array |
||
544 | is one). In both cases, the <CODE>out</CODE> parameter is ignored for |
||
545 | in-place transforms. |
||
546 | |||
547 | |||
548 | <P> |
||
549 | The format of the complex arrays deserves careful attention. |
||
550 | <A NAME="IDX71"></A> |
||
551 | Suppose that the real data has dimensions |
||
552 | n<sub>1</sub> x n<sub>2</sub> x ... x n<sub>d</sub> |
||
553 | (in row-major order). Then, after a real-to-complex transform, the |
||
554 | output is an |
||
555 | n<sub>1</sub> x n<sub>2</sub> x ... x (n<sub>d</sub>/2+1) |
||
556 | array of <CODE>fftw_complex</CODE> values in row-major order, corresponding to |
||
557 | slightly over half of the output of the corresponding complex transform. |
||
558 | (Note that the division is rounded down.) The ordering of the data is |
||
559 | otherwise exactly the same as in the complex case. (In principle, the |
||
560 | output could be exactly half the size of the complex transform output, |
||
561 | but in more than one dimension this requires too complicated a format to |
||
562 | be practical.) Note that, unlike the one-dimensional RFFTW, the real |
||
563 | and imaginary parts of the DFT amplitudes are here stored together in |
||
564 | the natural way. |
||
565 | |||
566 | |||
567 | <P> |
||
568 | Since the complex data is slightly larger than the real data, some |
||
569 | complications arise for in-place transforms. In this case, the final |
||
570 | dimension of the real data must be padded with extra values to |
||
571 | accommodate the size of the complex data--two extra if the last |
||
572 | dimension is even and one if it is odd. |
||
573 | <A NAME="IDX72"></A> |
||
574 | That is, the last dimension of the real data must physically contain |
||
575 | 2 * (n<sub>d</sub>/2+1) |
||
576 | <CODE>fftw_real</CODE> values (exactly enough to hold the complex data). |
||
577 | This physical array size does not, however, change the <EM>logical</EM> |
||
578 | array size--only |
||
579 | n<sub>d</sub> |
||
580 | values are actually stored in the last dimension, and |
||
581 | n<sub>d</sub> |
||
582 | is the last dimension passed to <CODE>rfftwnd_create_plan</CODE>. |
||
583 | |||
584 | |||
585 | <P> |
||
586 | For example, consider the transform of a two-dimensional real array of |
||
587 | size <CODE>nx</CODE> by <CODE>ny</CODE>. The output of the <CODE>rfftwnd</CODE> transform |
||
588 | is a two-dimensional real array of size <CODE>nx</CODE> by <CODE>ny/2+1</CODE>, |
||
589 | where the <CODE>y</CODE> dimension has been cut nearly in half because of |
||
590 | redundancies in the output. Because <CODE>fftw_complex</CODE> is twice the |
||
591 | size of <CODE>fftw_real</CODE>, the output array is slightly bigger than the |
||
592 | input array. Thus, if we want to compute the transform in place, we |
||
593 | must <EM>pad</EM> the input array so that it is of size <CODE>nx</CODE> by |
||
594 | <CODE>2*(ny/2+1)</CODE>. If <CODE>ny</CODE> is even, then there are two padding |
||
595 | elements at the end of each row (which need not be initialized, as they |
||
596 | are only used for output). |
||
597 | The following illustration depicts the input and output arrays just |
||
598 | described, for both the out-of-place and in-place transforms (with the |
||
599 | arrows indicating consecutive memory locations): |
||
600 | |||
601 | <p align=center><img src="rfftwnd.gif" width=389 height=583> |
||
602 | |||
603 | |||
604 | <P> |
||
605 | Figure 1 depicts the input and output arrays just |
||
606 | described, for both the out-of-place and in-place transforms (with the |
||
607 | arrows indicating consecutive memory locations). |
||
608 | |||
609 | |||
610 | <P> |
||
611 | The RFFTWND transforms are unnormalized, so a forward followed by a |
||
612 | backward transform will result in the original data scaled by the number |
||
613 | of real data elements--that is, the product of the (logical) dimensions |
||
614 | of the real data. |
||
615 | <A NAME="IDX73"></A> |
||
616 | |||
617 | |||
618 | <P> |
||
619 | Below, we illustrate the use of RFFTWND by showing how you might use it |
||
620 | to compute the (cyclic) convolution of two-dimensional real arrays |
||
621 | <CODE>a</CODE> and <CODE>b</CODE> (using the identity that a convolution corresponds |
||
622 | to a pointwise product of the Fourier transforms). For variety, |
||
623 | in-place transforms are used for the forward FFTs and an out-of-place |
||
624 | transform is used for the inverse transform. |
||
625 | <A NAME="IDX74"></A> |
||
626 | <A NAME="IDX75"></A> |
||
627 | |||
628 | |||
629 | |||
630 | <PRE> |
||
631 | #include <rfftw.h> |
||
632 | ... |
||
633 | { |
||
634 | fftw_real a[M][2*(N/2+1)], b[M][2*(N/2+1)], c[M][N]; |
||
635 | fftw_complex *A, *B, C[M][N/2+1]; |
||
636 | rfftwnd_plan p, pinv; |
||
637 | fftw_real scale = 1.0 / (M * N); |
||
638 | int i, j; |
||
639 | ... |
||
640 | p = rfftw2d_create_plan(M, N, FFTW_REAL_TO_COMPLEX, |
||
641 | FFTW_ESTIMATE | FFTW_IN_PLACE); |
||
642 | pinv = rfftw2d_create_plan(M, N, FFTW_COMPLEX_TO_REAL, |
||
643 | FFTW_ESTIMATE); |
||
644 | |||
645 | /* aliases for accessing complex transform outputs: */ |
||
646 | A = (fftw_complex*) &a[0][0]; |
||
647 | B = (fftw_complex*) &b[0][0]; |
||
648 | ... |
||
649 | for (i = 0; i < M; ++i) |
||
650 | for (j = 0; j < N; ++j) { |
||
651 | a[i][j] = ... ; |
||
652 | b[i][j] = ... ; |
||
653 | } |
||
654 | ... |
||
655 | rfftwnd_one_real_to_complex(p, &a[0][0], NULL); |
||
656 | rfftwnd_one_real_to_complex(p, &b[0][0], NULL); |
||
657 | |||
658 | for (i = 0; i < M; ++i) |
||
659 | for (j = 0; j < N/2+1; ++j) { |
||
660 | int ij = i*(N/2+1) + j; |
||
661 | C[i][j].re = (A[ij].re * B[ij].re |
||
662 | - A[ij].im * B[ij].im) * scale; |
||
663 | C[i][j].im = (A[ij].re * B[ij].im |
||
664 | + A[ij].im * B[ij].re) * scale; |
||
665 | } |
||
666 | |||
667 | /* inverse transform to get c, the convolution of a and b; |
||
668 | this has the side effect of overwriting C */ |
||
669 | rfftwnd_one_complex_to_real(pinv, &C[0][0], &c[0][0]); |
||
670 | ... |
||
671 | rfftwnd_destroy_plan(p); |
||
672 | rfftwnd_destroy_plan(pinv); |
||
673 | } |
||
674 | </PRE> |
||
675 | |||
676 | <P> |
||
677 | We access the complex outputs of the in-place transforms by casting |
||
678 | each real array to a <CODE>fftw_complex</CODE> pointer. Because this is a |
||
679 | "flat" pointer, we have to compute the row-major index <CODE>ij</CODE> |
||
680 | explicitly in the convolution product loop. |
||
681 | <A NAME="IDX76"></A> |
||
682 | In order to normalize the convolution, we must multiply by a scale |
||
683 | factor--we can do so either before or after the inverse transform, and |
||
684 | choose the former because it obviates the necessity of an additional |
||
685 | loop. |
||
686 | <A NAME="IDX77"></A> |
||
687 | Notice the limits of the loops and the dimensions of the various arrays. |
||
688 | |||
689 | |||
690 | <P> |
||
691 | As with the one-dimensional RFFTW, an out-of-place |
||
692 | <CODE>FFTW_COMPLEX_TO_REAL</CODE> transform has the side-effect of overwriting |
||
693 | its input array. (The real-to-complex transform, on the other hand, |
||
694 | leaves its input array untouched.) If you use RFFTWND for a rank-one |
||
695 | transform, however, this side-effect does not occur. Because of this |
||
696 | fact (and the simpler output format), users may find the RFFTWND |
||
697 | interface more convenient than RFFTW for one-dimensional transforms. |
||
698 | However, RFFTWND in one dimension is slightly slower than RFFTW because |
||
699 | RFFTWND uses an extra buffer array internally. |
||
700 | |||
701 | |||
702 | |||
703 | |||
704 | <H2><A NAME="SEC7">Multi-dimensional Array Format</A></H2> |
||
705 | |||
706 | <P> |
||
707 | This section describes the format in which multi-dimensional arrays are |
||
708 | stored. We felt that a detailed discussion of this topic was necessary, |
||
709 | since it is often a source of confusion among users and several |
||
710 | different formats are common. Although the comments below refer to |
||
711 | <CODE>fftwnd</CODE>, they are also applicable to the <CODE>rfftwnd</CODE> routines. |
||
712 | |||
713 | |||
714 | |||
715 | |||
716 | <H3><A NAME="SEC8">Row-major Format</A></H3> |
||
717 | <P> |
||
718 | <A NAME="IDX78"></A> |
||
719 | |||
720 | |||
721 | <P> |
||
722 | The multi-dimensional arrays passed to <CODE>fftwnd</CODE> are expected to be |
||
723 | stored as a single contiguous block in <EM>row-major</EM> order (sometimes |
||
724 | called "C order"). Basically, this means that as you step through |
||
725 | adjacent memory locations, the first dimension's index varies most |
||
726 | slowly and the last dimension's index varies most quickly. |
||
727 | |||
728 | |||
729 | <P> |
||
730 | To be more explicit, let us consider an array of rank d whose |
||
731 | dimensions are |
||
732 | n<sub>1</sub> x n<sub>2</sub> x n<sub>3</sub> x ... x n<sub>d</sub>. |
||
733 | Now, we specify a location in the array by a sequence of (zero-based) indices, |
||
734 | one for each dimension: |
||
735 | (i<sub>1</sub>, i<sub>2</sub>, i<sub>3</sub>,..., i<sub>d</sub>). |
||
736 | If the array is stored in row-major |
||
737 | order, then this element is located at the position |
||
738 | i<sub>d</sub> + n<sub>d</sub> * (i<sub>d-1</sub> + n<sub>d-1</sub> * (... + n<sub>2</sub> * i<sub>1</sub>)). |
||
739 | |||
740 | |||
741 | <P> |
||
742 | Note that each element of the array must be of type <CODE>fftw_complex</CODE>; |
||
743 | i.e. a (real, imaginary) pair of (double-precision) numbers. Note also |
||
744 | that, in <CODE>fftwnd</CODE>, the expression above is multiplied by the stride |
||
745 | to get the actual array index--this is useful in situations where each |
||
746 | element of the multi-dimensional array is actually a data structure or |
||
747 | another array, and you just want to transform a single field. In most |
||
748 | cases, however, you use a stride of 1. |
||
749 | <A NAME="IDX79"></A> |
||
750 | |||
751 | |||
752 | |||
753 | |||
754 | <H3><A NAME="SEC9">Column-major Format</A></H3> |
||
755 | <P> |
||
756 | <A NAME="IDX80"></A> |
||
757 | |||
758 | |||
759 | <P> |
||
760 | Readers from the Fortran world are used to arrays stored in |
||
761 | <EM>column-major</EM> order (sometimes called "Fortran order"). This is |
||
762 | essentially the exact opposite of row-major order in that, here, the |
||
763 | <EM>first</EM> dimension's index varies most quickly. |
||
764 | |||
765 | |||
766 | <P> |
||
767 | If you have an array stored in column-major order and wish to transform |
||
768 | it using <CODE>fftwnd</CODE>, it is quite easy to do. When creating the plan, |
||
769 | simply pass the dimensions of the array to <CODE>fftwnd_create_plan</CODE> in |
||
770 | <EM>reverse order</EM>. For example, if your array is a rank three |
||
771 | <CODE>N x M x L</CODE> matrix in column-major order, you should pass the |
||
772 | dimensions of the array as if it were an <CODE>L x M x N</CODE> matrix (which |
||
773 | it is, from perspective of <CODE>fftwnd</CODE>). This is done for you |
||
774 | automatically by the FFTW Fortran wrapper routines (see Section <A HREF="fftw_5.html#SEC62">Calling FFTW from Fortran</A>). |
||
775 | <A NAME="IDX81"></A> |
||
776 | |||
777 | |||
778 | |||
779 | |||
780 | <H3><A NAME="SEC10">Static Arrays in C</A></H3> |
||
781 | <P> |
||
782 | <A NAME="IDX82"></A> |
||
783 | |||
784 | |||
785 | <P> |
||
786 | Multi-dimensional arrays declared statically (that is, at compile time, |
||
787 | not necessarily with the <CODE>static</CODE> keyword) in C are <EM>already</EM> |
||
788 | in row-major order. You don't have to do anything special to transform |
||
789 | them. (See Section <A HREF="fftw_2.html#SEC4">Complex Multi-dimensional Transforms Tutorial</A>, for an |
||
790 | example of this sort of code.) |
||
791 | |||
792 | |||
793 | |||
794 | |||
795 | <H3><A NAME="SEC11">Dynamic Arrays in C</A></H3> |
||
796 | |||
797 | <P> |
||
798 | Often, especially for large arrays, it is desirable to allocate the |
||
799 | arrays dynamically, at runtime. This isn't too hard to do, although it |
||
800 | is not as straightforward for multi-dimensional arrays as it is for |
||
801 | one-dimensional arrays. |
||
802 | |||
803 | |||
804 | <P> |
||
805 | Creating the array is simple: using a dynamic-allocation routine like |
||
806 | <CODE>malloc</CODE>, allocate an array big enough to store N <CODE>fftw_complex</CODE> |
||
807 | values, where N is the product of the sizes of the array dimensions |
||
808 | (i.e. the total number of complex values in the array). For example, |
||
809 | here is code to allocate a 5x12x27 rank 3 array: |
||
810 | <A NAME="IDX83"></A> |
||
811 | |||
812 | |||
813 | |||
814 | <PRE> |
||
815 | fftw_complex *an_array; |
||
816 | |||
817 | an_array = (fftw_complex *) malloc(5 * 12 * 27 * sizeof(fftw_complex)); |
||
818 | </PRE> |
||
819 | |||
820 | <P> |
||
821 | Accessing the array elements, however, is more tricky--you can't simply |
||
822 | use multiple applications of the <SAMP>`[]'</SAMP> operator like you could for |
||
823 | static arrays. Instead, you have to explicitly compute the offset into |
||
824 | the array using the formula given earlier for row-major arrays. For |
||
825 | example, to reference the (i,j,k)-th element of the array |
||
826 | allocated above, you would use the expression <CODE>an_array[k + 27 * (j |
||
827 | + 12 * i)]</CODE>. |
||
828 | |||
829 | |||
830 | <P> |
||
831 | This pain can be alleviated somewhat by defining appropriate macros, or, |
||
832 | in C++, creating a class and overloading the <SAMP>`()'</SAMP> operator. |
||
833 | |||
834 | |||
835 | |||
836 | |||
837 | <H3><A NAME="SEC12">Dynamic Arrays in C--The Wrong Way</A></H3> |
||
838 | |||
839 | <P> |
||
840 | A different method for allocating multi-dimensional arrays in C is often |
||
841 | suggested that is incompatible with <CODE>fftwnd</CODE>: <EM>using it will |
||
842 | cause FFTW to die a painful death</EM>. We discuss the technique here, |
||
843 | however, because it is so commonly known and used. This method is to |
||
844 | create arrays of pointers of arrays of pointers of ...etcetera. For |
||
845 | example, the analogue in this method to the example above is: |
||
846 | |||
847 | |||
848 | |||
849 | <PRE> |
||
850 | int i,j; |
||
851 | fftw_complex ***a_bad_array; /* another way to make a 5x12x27 array */ |
||
852 | |||
853 | a_bad_array = (fftw_complex ***) malloc(5 * sizeof(fftw_complex **)); |
||
854 | for (i = 0; i < 5; ++i) { |
||
855 | a_bad_array[i] = |
||
856 | (fftw_complex **) malloc(12 * sizeof(fftw_complex *)); |
||
857 | for (j = 0; j < 12; ++j) |
||
858 | a_bad_array[i][j] = |
||
859 | (fftw_complex *) malloc(27 * sizeof(fftw_complex)); |
||
860 | } |
||
861 | </PRE> |
||
862 | |||
863 | <P> |
||
864 | As you can see, this sort of array is inconvenient to allocate (and |
||
865 | deallocate). On the other hand, it has the advantage that the |
||
866 | (i,j,k)-th element can be referenced simply by |
||
867 | <CODE>a_bad_array[i][j][k]</CODE>. |
||
868 | |||
869 | |||
870 | <P> |
||
871 | If you like this technique and want to maximize convenience in accessing |
||
872 | the array, but still want to pass the array to FFTW, you can use a |
||
873 | hybrid method. Allocate the array as one contiguous block, but also |
||
874 | declare an array of arrays of pointers that point to appropriate places |
||
875 | in the block. That sort of trick is beyond the scope of this |
||
876 | documentation; for more information on multi-dimensional arrays in C, |
||
877 | see the <CODE>comp.lang.c</CODE> |
||
878 | <A HREF="http://www.eskimo.com/~scs/C-faq/s6.html">FAQ</A>. |
||
879 | |||
880 | |||
881 | |||
882 | |||
883 | <H2><A NAME="SEC13">Words of Wisdom</A></H2> |
||
884 | <P> |
||
885 | <A NAME="IDX84"></A> |
||
886 | <A NAME="IDX85"></A> |
||
887 | |||
888 | |||
889 | <P> |
||
890 | FFTW implements a method for saving plans to disk and restoring them. |
||
891 | In fact, what FFTW does is more general than just saving and loading |
||
892 | plans. The mechanism is called <EM><CODE>wisdom</CODE></EM>. Here, we describe |
||
893 | this feature at a high level. See Section <A HREF="fftw_3.html#SEC16">FFTW Reference</A>, for a less casual |
||
894 | (but more complete) discussion of how to use <CODE>wisdom</CODE> in FFTW. |
||
895 | |||
896 | |||
897 | <P> |
||
898 | Plans created with the <CODE>FFTW_MEASURE</CODE> option produce near-optimal |
||
899 | FFT performance, but it can take a long time to compute a plan because |
||
900 | FFTW must actually measure the runtime of many possible plans and select |
||
901 | the best one. This is designed for the situations where so many |
||
902 | transforms of the same size must be computed that the start-up time is |
||
903 | irrelevant. For short initialization times but slightly slower |
||
904 | transforms, we have provided <CODE>FFTW_ESTIMATE</CODE>. The <CODE>wisdom</CODE> |
||
905 | mechanism is a way to get the best of both worlds. There are, however, |
||
906 | certain caveats that the user must be aware of in using <CODE>wisdom</CODE>. |
||
907 | For this reason, <CODE>wisdom</CODE> is an optional feature which is not |
||
908 | enabled by default. |
||
909 | |||
910 | |||
911 | <P> |
||
912 | At its simplest, <CODE>wisdom</CODE> provides a way of saving plans to disk so |
||
913 | that they can be reused in other program runs. You create a plan with |
||
914 | the flags <CODE>FFTW_MEASURE</CODE> and <CODE>FFTW_USE_WISDOM</CODE>, and then save |
||
915 | the <CODE>wisdom</CODE> using <CODE>fftw_export_wisdom</CODE>: |
||
916 | <A NAME="IDX86"></A> |
||
917 | |||
918 | |||
919 | |||
920 | <PRE> |
||
921 | plan = fftw_create_plan(..., ... | FFTW_MEASURE | FFTW_USE_WISDOM); |
||
922 | fftw_export_wisdom(...); |
||
923 | </PRE> |
||
924 | |||
925 | <P> |
||
926 | <A NAME="IDX87"></A> |
||
927 | |||
928 | |||
929 | <P> |
||
930 | The next time you run the program, you can restore the <CODE>wisdom</CODE> |
||
931 | with <CODE>fftw_import_wisdom</CODE>, and then recreate the plan using the |
||
932 | same flags as before. This time, however, the same optimal plan will be |
||
933 | created very quickly without measurements. (FFTW still needs some time |
||
934 | to compute trigonometric tables, however.) The basic outline is: |
||
935 | |||
936 | |||
937 | |||
938 | <PRE> |
||
939 | fftw_import_wisdom(...); |
||
940 | plan = fftw_create_plan(..., ... | FFTW_USE_WISDOM); |
||
941 | </PRE> |
||
942 | |||
943 | <P> |
||
944 | <A NAME="IDX88"></A> |
||
945 | |||
946 | |||
947 | <P> |
||
948 | Wisdom is more than mere rote memorization, however. FFTW's |
||
949 | <CODE>wisdom</CODE> encompasses all of the knowledge and measurements that |
||
950 | were used to create the plan for a given size. Therefore, existing |
||
951 | <CODE>wisdom</CODE> is also applied to the creation of other plans of |
||
952 | different sizes. |
||
953 | |||
954 | |||
955 | <P> |
||
956 | Whenever a plan is created with the <CODE>FFTW_MEASURE</CODE> and |
||
957 | <CODE>FFTW_USE_WISDOM</CODE> flags, <CODE>wisdom</CODE> is generated. Thereafter, |
||
958 | plans for any transform with a similar factorization will be computed |
||
959 | more quickly, so long as they use the <CODE>FFTW_USE_WISDOM</CODE> flag. In |
||
960 | fact, for transforms with the same factors and of equal or lesser size, |
||
961 | no measurements at all need to be made and an optimal plan can be |
||
962 | created with negligible delay! |
||
963 | |||
964 | |||
965 | <P> |
||
966 | For example, suppose that you create a plan for |
||
967 | N = 2<sup>16</sup>. |
||
968 | Then, for any equal or smaller power of two, FFTW can create a |
||
969 | plan (with the same direction and flags) quickly, using the |
||
970 | precomputed <CODE>wisdom</CODE>. Even for larger powers of two, or sizes that |
||
971 | are a power of two times some other prime factors, plans will be |
||
972 | computed more quickly than they would otherwise (although some |
||
973 | measurements still have to be made). |
||
974 | |||
975 | |||
976 | <P> |
||
977 | The <CODE>wisdom</CODE> is cumulative, and is stored in a global, private data |
||
978 | structure managed internally by FFTW. The storage space required is |
||
979 | minimal, proportional to the logarithm of the sizes the <CODE>wisdom</CODE> was |
||
980 | generated from. The <CODE>wisdom</CODE> can be forgotten (and its associated |
||
981 | memory freed) by a call to <CODE>fftw_forget_wisdom()</CODE>; otherwise, it is |
||
982 | remembered until the program terminates. It can also be exported to a |
||
983 | file, a string, or any other medium using <CODE>fftw_export_wisdom</CODE> and |
||
984 | restored during a subsequent execution of the program (or a different |
||
985 | program) using <CODE>fftw_import_wisdom</CODE> (these functions are described |
||
986 | below). |
||
987 | |||
988 | |||
989 | <P> |
||
990 | Because <CODE>wisdom</CODE> is incorporated into FFTW at a very low level, the |
||
991 | same <CODE>wisdom</CODE> can be used for one-dimensional transforms, |
||
992 | multi-dimensional transforms, and even the parallel extensions to FFTW. |
||
993 | Just include <CODE>FFTW_USE_WISDOM</CODE> in the flags for whatever plans you |
||
994 | create (i.e., always plan wisely). |
||
995 | |||
996 | |||
997 | <P> |
||
998 | Plans created with the <CODE>FFTW_ESTIMATE</CODE> plan can use <CODE>wisdom</CODE>, |
||
999 | but cannot generate it; only <CODE>FFTW_MEASURE</CODE> plans actually produce |
||
1000 | <CODE>wisdom</CODE>. Also, plans can only use <CODE>wisdom</CODE> generated from |
||
1001 | plans created with the same direction and flags. For example, a size |
||
1002 | <CODE>42</CODE> <CODE>FFTW_BACKWARD</CODE> transform will not use <CODE>wisdom</CODE> |
||
1003 | produced by a size <CODE>42</CODE> <CODE>FFTW_FORWARD</CODE> transform. The only |
||
1004 | exception to this rule is that <CODE>FFTW_ESTIMATE</CODE> plans can use |
||
1005 | <CODE>wisdom</CODE> from <CODE>FFTW_MEASURE</CODE> plans. |
||
1006 | |||
1007 | |||
1008 | |||
1009 | |||
1010 | <H3><A NAME="SEC14">Caveats in Using Wisdom</A></H3> |
||
1011 | <P> |
||
1012 | <A NAME="IDX89"></A> |
||
1013 | |||
1014 | |||
1015 | |||
1016 | <BLOCKQUOTE> |
||
1017 | <i> |
||
1018 | <P> |
||
1019 | For in much wisdom is much grief, and he that increaseth knowledge |
||
1020 | increaseth sorrow. |
||
1021 | </i> |
||
1022 | [Ecclesiastes 1:18] |
||
1023 | <A NAME="IDX90"></A> |
||
1024 | </BLOCKQUOTE> |
||
1025 | |||
1026 | <P> |
||
1027 | There are pitfalls to using <CODE>wisdom</CODE>, in that it can negate FFTW's |
||
1028 | ability to adapt to changing hardware and other conditions. For example, |
||
1029 | it would be perfectly possible to export <CODE>wisdom</CODE> from a program |
||
1030 | running on one processor and import it into a program running on another |
||
1031 | processor. Doing so, however, would mean that the second program would |
||
1032 | use plans optimized for the first processor, instead of the one it is |
||
1033 | running on. |
||
1034 | |||
1035 | |||
1036 | <P> |
||
1037 | It should be safe to reuse <CODE>wisdom</CODE> as long as the hardware and |
||
1038 | program binaries remain unchanged. (Actually, the optimal plan may |
||
1039 | change even between runs of the same binary on identical hardware, due |
||
1040 | to differences in the virtual memory environment, etcetera. Users |
||
1041 | seriously interested in performance should worry about this problem, |
||
1042 | too.) It is likely that, if the same <CODE>wisdom</CODE> is used for two |
||
1043 | different program binaries, even running on the same machine, the plans |
||
1044 | may be sub-optimal because of differing code alignments. It is |
||
1045 | therefore wise to recreate <CODE>wisdom</CODE> every time an application is |
||
1046 | recompiled. The more the underlying hardware and software changes |
||
1047 | between the creation of <CODE>wisdom</CODE> and its use, the greater grows the |
||
1048 | risk of sub-optimal plans. |
||
1049 | |||
1050 | |||
1051 | |||
1052 | |||
1053 | <H3><A NAME="SEC15">Importing and Exporting Wisdom</A></H3> |
||
1054 | <P> |
||
1055 | <A NAME="IDX91"></A> |
||
1056 | |||
1057 | |||
1058 | |||
1059 | <PRE> |
||
1060 | void fftw_export_wisdom_to_file(FILE *output_file); |
||
1061 | fftw_status fftw_import_wisdom_from_file(FILE *input_file); |
||
1062 | </PRE> |
||
1063 | |||
1064 | <P> |
||
1065 | <A NAME="IDX92"></A> |
||
1066 | <A NAME="IDX93"></A> |
||
1067 | |||
1068 | |||
1069 | <P> |
||
1070 | <CODE>fftw_export_wisdom_to_file</CODE> writes the <CODE>wisdom</CODE> to |
||
1071 | <CODE>output_file</CODE>, which must be a file open for |
||
1072 | writing. <CODE>fftw_import_wisdom_from_file</CODE> reads the <CODE>wisdom</CODE> |
||
1073 | from <CODE>input_file</CODE>, which must be a file open for reading, and |
||
1074 | returns <CODE>FFTW_SUCCESS</CODE> if successful and <CODE>FFTW_FAILURE</CODE> |
||
1075 | otherwise. In both cases, the file is left open and must be closed by |
||
1076 | the caller. It is perfectly fine if other data lie before or after the |
||
1077 | <CODE>wisdom</CODE> in the file, as long as the file is positioned at the |
||
1078 | beginning of the <CODE>wisdom</CODE> data before import. |
||
1079 | |||
1080 | |||
1081 | |||
1082 | <PRE> |
||
1083 | char *fftw_export_wisdom_to_string(void); |
||
1084 | fftw_status fftw_import_wisdom_from_string(const char *input_string) |
||
1085 | </PRE> |
||
1086 | |||
1087 | <P> |
||
1088 | <A NAME="IDX94"></A> |
||
1089 | <A NAME="IDX95"></A> |
||
1090 | |||
1091 | |||
1092 | <P> |
||
1093 | <CODE>fftw_export_wisdom_to_string</CODE> allocates a string, exports the |
||
1094 | <CODE>wisdom</CODE> to it in <CODE>NULL</CODE>-terminated format, and returns a |
||
1095 | pointer to the string. If there is an error in allocating or writing |
||
1096 | the data, it returns <CODE>NULL</CODE>. The caller is responsible for |
||
1097 | deallocating the string (with <CODE>fftw_free</CODE>) when she is done with |
||
1098 | it. <CODE>fftw_import_wisdom_from_string</CODE> imports the <CODE>wisdom</CODE> from |
||
1099 | <CODE>input_string</CODE>, returning <CODE>FFTW_SUCCESS</CODE> if successful and |
||
1100 | <CODE>FFTW_FAILURE</CODE> otherwise. |
||
1101 | |||
1102 | |||
1103 | <P> |
||
1104 | Exporting <CODE>wisdom</CODE> does not affect the store of <CODE>wisdom</CODE>. Imported |
||
1105 | <CODE>wisdom</CODE> supplements the current store rather than replacing it |
||
1106 | (except when there is conflicting <CODE>wisdom</CODE>, in which case the older |
||
1107 | <CODE>wisdom</CODE> is discarded). The format of the exported <CODE>wisdom</CODE> is |
||
1108 | "nerd-readable" LISP-like ASCII text; we will not document it here |
||
1109 | except to note that it is insensitive to white space (interested users |
||
1110 | can contact us for more details). |
||
1111 | <A NAME="IDX96"></A> |
||
1112 | <A NAME="IDX97"></A> |
||
1113 | |||
1114 | |||
1115 | <P> |
||
1116 | See Section <A HREF="fftw_3.html#SEC16">FFTW Reference</A>, for more information, and for a description of |
||
1117 | how you can implement <CODE>wisdom</CODE> import/export for other media |
||
1118 | besides files and strings. |
||
1119 | |||
1120 | |||
1121 | <P> |
||
1122 | The following is a brief example in which the <CODE>wisdom</CODE> is read from |
||
1123 | a file, a plan is created (possibly generating more <CODE>wisdom</CODE>), and |
||
1124 | then the <CODE>wisdom</CODE> is exported to a string and printed to |
||
1125 | <CODE>stdout</CODE>. |
||
1126 | |||
1127 | |||
1128 | |||
1129 | <PRE> |
||
1130 | { |
||
1131 | fftw_plan plan; |
||
1132 | char *wisdom_string; |
||
1133 | FILE *input_file; |
||
1134 | |||
1135 | /* open file to read wisdom from */ |
||
1136 | input_file = fopen("sample.wisdom", "r"); |
||
1137 | if (FFTW_FAILURE == fftw_import_wisdom_from_file(input_file)) |
||
1138 | printf("Error reading wisdom!\n"); |
||
1139 | fclose(input_file); /* be sure to close the file! */ |
||
1140 | |||
1141 | /* create a plan for N=64, possibly creating and/or using wisdom */ |
||
1142 | plan = fftw_create_plan(64,FFTW_FORWARD, |
||
1143 | FFTW_MEASURE | FFTW_USE_WISDOM); |
||
1144 | |||
1145 | /* ... do some computations with the plan ... */ |
||
1146 | |||
1147 | /* always destroy plans when you are done */ |
||
1148 | fftw_destroy_plan(plan); |
||
1149 | |||
1150 | /* write the wisdom to a string */ |
||
1151 | wisdom_string = fftw_export_wisdom_to_string(); |
||
1152 | if (wisdom_string != NULL) { |
||
1153 | printf("Accumulated wisdom: %s\n",wisdom_string); |
||
1154 | |||
1155 | /* Just for fun, destroy and restore the wisdom */ |
||
1156 | fftw_forget_wisdom(); /* all gone! */ |
||
1157 | fftw_import_wisdom_from_string(wisdom_string); |
||
1158 | /* wisdom is back! */ |
||
1159 | |||
1160 | fftw_free(wisdom_string); /* deallocate it since we're done */ |
||
1161 | } |
||
1162 | } |
||
1163 | </PRE> |
||
1164 | |||
1165 | <P><HR><P> |
||
1166 | Go to the <A HREF="fftw_1.html">first</A>, <A HREF="fftw_1.html">previous</A>, <A HREF="fftw_3.html">next</A>, <A HREF="fftw_10.html">last</A> section, <A HREF="fftw_toc.html">table of contents</A>. |
||
1167 | </BODY> |
||
1168 | </HTML> |