WebSVN - shark - Blame - Rev 3 - /shark/trunk/ports/fftw/news

Rev	Author	Line No.	Line
2	pj	1	Version 2.1.2
		2
		3	* Fixed incompatibility between our MPI test programs and MPICH with
		4	the p4 device (TCP/IP). (The 2.1.1 transforms worked, but the test
		5	programs crashed.)
		6
		7	* Added missing fftw_f77_threads_init function to the Fortran wrappers
		8	for the multi-threaded transforms. Thanks to V. Sundararajan for
		9	the bug report.
		10
		11	* The codelet generator can now output efficient hard-coded DCT/DST
		12	transforms. As a side effect of this work, we slightly reduced the
		13	code size of rfftw.
		14
		15	* Test programs now support GNU-style long options when used with glibc.
		16
		17	* Added some more ideas to our TODO list.
		18
		19	* Improved codelet generator speed.
		20
		21	Version 2.1.1
		22
		23	* Fixed bug in the complex transforms for certain sizes with
		24	intermediate-length prime factors (17-97), which under some
		25	(hopefully rare) circumstances could cause incorrect results.
		26	Thanks to Ming-Chang Liu for the bug report and patch. (The test
		27	program will now catch this sort of problem when it is run in
		28	paranoid mode.)
		29
		30	Version 2.1
		31
		32	* Added Fortran-callable wrapper routines for the multi-threaded
		33	transforms.
		34
		35	* Documentation fixes and improvements.
		36
		37	Version 2.1-beta1
		38
		39	* The --enable-type-prefix option to configure makes it easy to install
		40	both single- and double-precision versions of FFTW on the same
		41	(Unix) system. (See the installation section of the manual.)
		42
		43	* The MPI FFTW routines now include parallel one-dimensional transforms
		44	for complex data. (See the fftw_mpi documentation in the FFTW
		45	manual.)
		46
		47	* The MPI FFTW routines now include parallel multi-dimensional transforms
		48	specialized for real data. (See the rfftwnd_mpi documentation in the
		49	FFTW manual.)
		50
		51	* The MPI FFTW routines are now documented in the main
		52	manual (in the doc directory). On Unix systems, they are also
		53	automatically configured, compiled, and installed along with the main
		54	FFTW library when you include --enable-mpi in the flags to the
		55	configure script. (See the FFTW manual.)
		56
		57	* Largely-rewritten MPI code. It is now cleaner and (sometimes) faster.
		58	It also supports the option of a user-supplied workspace for (often)
		59	greater performance (using the MPI_Alltoall primitive). Beware that
		60	the interfaces have changed slightly, however.
		61
		62	* The multi-threaded FFTW routines now include parallel one- and
		63	multi-dimensional transforms of real data. (See the rfftw_threads
		64	documentation in the FFTW manual.)
		65
		66	* The multi-threaded FFTW routines are now documented in the main
		67	manual (in the doc directory). On Unix systems, they are also
		68	automatically configured, compiled, and installed along with the main
		69	FFTW library when you include --enable-threads in the flags to the
		70	configure script. (See the FFTW manual.)
		71
		72	* The multi-threaded FFTW routines now include support for Mach C
		73	threads (used, for example, in Apple's MacOS X).
		74
		75	* The Fortran-callable wrapper routines are now incorporated into
		76	the ordinary FFTW libraries by default (although you can
		77	disable this with the --disable-fortran option to configure) and
		78	are documented in the main FFTW manual.
		79
		80	* Added an illustration of the data layout to the rfftwnd tutorial
		81	section of the manual, in the hope of preventing future confusion
		82	on this subject.
		83
		84	* The test programs now allow you to specify multidimensional sizes
		85	(e.g. 128x54x81) for the -c and -s correctness and speed test options.
		86
		87	Version 2.0.1
		88
		89	* (bug fix) Due to a poorly-parenthesized expression, rfftwnd overflowed
		90	32-bit integer precision for rank > 1 transforms with a final
		91	dimension >= 65536. This is now fixed. (Thanks to Walter Brisken
		92	for the bug report.)
		93
		94	* (bug fix) Added definition of FFTW_OUT_OF_PLACE to fftw.h. The
		95	flag is mentioned several times in the documentation, but its
		96	definition was accidentally omitted since FFTW_OUT_OF_PLACE is the
		97	default behavior.
		98
		99	* Corrected various small errors in the documentation. Thanks to
		100	Geir Thomassen and Jeremy Buhler for their comments.
		101
		102	* Improved speed of the codelet generator by orders of magnitude,
		103	since a user needed a hard-coded fft of size 101.
		104
		105	* Modified buffering in multidimensional transforms for some speed
		106	improvements (only when fftwnd_create_plan_specific is used).
		107	Thanks to Geert van Kempen for his tips.
		108
		109	* Added Andrew Sterian's patch to allow FFTW to be used as a shared
		110	library more easily on Win32.
		111
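The rfftwnd overflow fixed in 2.0.1 is described only as a poorly-parenthesized expression, so the exact code is not shown in these notes. The following is a minimal sketch of that general class of bug, not the actual FFTW source: the function names and the use of uint32_t/uint64_t are assumptions made for illustration. Unsigned types are used so that the wraparound is well defined in C.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical illustration, not FFTW code.
   Buggy shape: the product is formed in 32 bits and only then widened,
   so it wraps modulo 2^32 before the cast can help (assuming a platform
   where int is no wider than 32 bits). */
uint64_t total_elements_narrow(uint32_t rows, uint32_t last_dim)
{
    return (uint64_t)(rows * last_dim);
}

/* Fixed shape: widen one operand first so the multiply itself is 64-bit. */
uint64_t total_elements_wide(uint32_t rows, uint32_t last_dim)
{
    assert(rows > 0 && last_dim > 0); /* array dimensions must be positive */
    return (uint64_t)rows * last_dim;
}
```

With rows = 65536 and last_dim = 65536, the narrow version wraps around while the wide version returns the full product; for small sizes the two agree, which is why a bug of this kind can go unnoticed until a large final dimension is used.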
		112	Version 2.0
		113
		114	* Completely rewritten real-complex transforms, now using
		115	specialized codelets and an inherently real-complex algorithm for
		116	greatly increased speed. Also, rfftw can now handle odd sizes and
		117	strided transforms. Beware that the output format for 1D rfftw
		118	transforms has changed. See the manual for more details.
		119
		120	* The complex transforms now use a fast algorithm for large prime
		121	factors, working in O(N lg N) time even for prime sizes.
		122	(Previously, the complexity contained an O(p^2) term, where p is
		123	the largest prime factor of N. This is still the case for the
		124	rfftw transforms.) Small prime factors are still more efficient,
		125	however.
		126
		127	* Added functions fftw_one, fftwnd_one, rfftw_one, etcetera, to
		128	simplify and clarify the use of fftw for single, unit-stride
		129	transforms.
		130
		131	* Renamed FFTW_COMPLEX, FFTW_REAL to fftw_complex, fftw_real (for
		132	greater consistency in capitalization). The all-caps names will
		133	continue to be supported indefinitely, but are deprecated. (Also,
		134	support for the COMPLEX and REAL types from FFTW 1.0 is now
		135	disabled by default.)
		136
		137	* There are now Fortran-callable wrappers for the rfftw real-complex
		138	transforms.
		139
		140	* New section of the manual discussing the use of FFTW with multiple
		141	threads, and a new FFTW_THREADSAFE flag (described therein).
		142
		143	* Added shared library support. Use configure --enable-shared to
		144	produce a shared library instead of a static library (the default).
		145
		146	* Dropped support for the operation-count (*_op_count) routines
		147	introduced in v1.3, as these were little-used and were a pain to
		148	keep up-to-date as FFTW changed internally.
		149
		150	* Made it easier to support floating-point types other than float
		151	and double (e.g. long double). (See the file fftw-int.h.)
		152
		153	Version 1.3
		154
		155	* Multi-dimensional transforms contain significant performance
		156	improvements for dimensions >= 3.
		157
		158	* Performance improvements in multi-dimensional transforms
		159	with howmany > 1 and stride > dist.
		160
		161	* Improved parallelization and performance in the threads
		162	code for dimensions >= 3.
		163
		164	* Changed the wisdom import/export format (the new wisdom remembers
		165	the stride of the plan that generated it, for use with the new
		166	create_plan_specific functions). (You should regenerate any stored
		167	wisdom you have anyway, since this is a new version of FFTW.)
		168
		169	* Several small fixes to aid compilation on some systems.
		170
		171	Version 1.3b1
		172
		173	* Fixed a bug in the MPI transform (in the transpose routine) that
		174	caused errors for some array sizes.
		175
		176	* Fixed the (hopefully) last few things causing problems with C++
		177	compilers.
		178
		179	* Hack for x86/gcc to properly align local double-precision variables.
		180
		181	* Completely rewritten codelet generator. Now it produces
		182	better code for non powers of 2, and is ready to produce
		183	real->complex transforms.
		184
		185	* Testing algorithm is now more robust, and has a more rigorous
		186	theoretical foundation. (Bugs in testing large transforms or
		187	in single precision are now fixed--these bugs were only in the
		188	test programs and not in the FFTW library itself.)
		189
		190	* Added "specific" planners, which allow plan optimization for a
		191	specific array/stride. They also reduce the memory requirements
		192	of the planner, and permit new optimizations in the multi-dimensional
		193	case. (See the *_create_plan_specific functions.)
		194
		195	* FFTW can now compute a count of the number of arithmetic operations
		196	it requires, which is useful for some academic purposes. (See the
		197	*_count_plan_ops functions.)
		198
		199	* Adapted for use with GNU autoconf to aid installation on UNIX systems.
		200	(Installation on non-UNIX systems should be the same as before.)
		201
		202	* Used gettimeofday function if available. (This function typically
		203	has much higher accuracy than clock(), permitting plans to be
		204	created much more quickly than before on many machines.)
		205
		206	* Made timing algorithm (hopefully) more robust in the face of
		207	system interrupts, etc.
		208
		209	* Added wrapper routines for calling FFTW from MATLAB (in the
		210	matlab/ directory).
		211
		212	* Added wrapper routines for calling FFTW from Fortran (in the
		213	fortran/ directory). (These were available separately before.)
		214
		215	Version 1.2.1
		216
		217	* Fixed a third bug in the mpi transpose routines (sheesh!) that
		218	could cause problems when re-using a transpose plan. Thanks
		219	to Eric Skyllingstad for the bug reports.
		220
		221	* Fixed another bug in the mpi transpose routines. This bug produced
		222	a memory leak and also occasionally tried to free a null pointer,
		223	which causes problems on some systems. The mpi transpose/fft routines
		224	now pass all of our malloc paranoia tests.
		225
		226	* Fixed bug in mpi transpose routines, where wrong results
		227	could be given for some large 2D arrays.
		228
		229	Version 1.2:
		230
		231	* Added a FAQ (in the FAQ/ directory).
		232
		233	* Fixed bug in rfftwnd routines where a block was accidentally
		234	allocated to be too small, causing random memory to be
		235	overwritten (yikes!). (Amazingly, this bug only caused the
		236	test program to fail on one system that we could find. Our
		237	test suite can now catch this sort of bug.)
		238
		239	* Abstracted taking differences of times (with fftw_time_diff
		240	macro/function) to allow more general timer data structures.
		241
		242	* Added "wisdom" mechanism for saving plans & related info.
		243
		244	* Made timing mechanism more robust and maintainable. (Instead of
		245	using a fixed number of iterations, we now repeatedly double
		246	the number of iterations until a specified time interval
		247	(FFTW_TIME_MIN) is reached.)
		248
		249	* Fixed header files to prevent difficulties when a mix of C and
		250	C++ compilers is used, and to prevent problems with multiple
		251	inclusions.
		252
		253	* Added experimental distributed-memory transforms using MPI.
		254
		255	* Fixed memory leak in fftwnd_destroy_plan (reported by Richard
		256	Sullivan). Our test programs now all check for leaks.
		257
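The timing change in 1.2 (doubling the iteration count until a minimum interval is reached) can be sketched roughly as follows. This is an illustration under stated assumptions, not the FFTW timer: TIME_MIN_SECONDS is a hypothetical stand-in for FFTW_TIME_MIN (whose value and units are not given in these notes), workload() is a dummy in place of a transform, and clock() is used only because it is standard C.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical threshold in seconds; stands in for FFTW_TIME_MIN. */
#define TIME_MIN_SECONDS 0.01

/* Dummy workload standing in for one execution of a transform. */
volatile double sink;
void workload(void)
{
    int i;
    for (i = 0; i < 1000; ++i)
        sink += (double)i * 0.5;
}

/* Time fn by doubling the iteration count until a single batch runs for
   at least min_seconds. Returns the measured seconds per call and stores
   the final iteration count in *iters_out. */
double time_by_doubling(void (*fn)(void), double min_seconds,
                        unsigned long *iters_out)
{
    unsigned long iters = 1;
    assert(fn != 0 && min_seconds > 0.0 && iters_out != 0);
    for (;;) {
        unsigned long i;
        double elapsed;
        clock_t start = clock();
        for (i = 0; i < iters; ++i)
            fn();
        elapsed = (double)(clock() - start) / CLOCKS_PER_SEC;
        if (elapsed >= min_seconds) {
            *iters_out = iters;
            return elapsed / (double)iters;
        }
        iters *= 2; /* batch was too short to measure reliably: double it */
    }
}
```

A caller would write something like time_by_doubling(workload, TIME_MIN_SECONDS, &iters). Compared with a fixed iteration count, this adapts to both fast and slow machines: the measured batch is always long relative to the timer resolution, and at most about twice as long as it needs to be.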
		258	Version 1.1:
		259
		260	* Improved speed (yes!) [Some clever tricks with twiddle factors
		261	and a better code generator]
		262
		263	* Renamed `blocks' to `codelets', just to be fashionable
		264
		265	* Rewritten planner and executor--much simpler and more readable
		266	code. Reference-counter garbage collection employed throughout.
		267
		268	* Much improved codelet generator. The ML code should now be
		269	readable by humans, and easier to modify.
		270
		271	* Support for Prime Factor transforms in the codelet generator.
		272
		273	* Renamed COMPLEX -> FFTW_COMPLEX to avoid clashes with
		274	existing packages. COMPLEX is still supported
		275	for compatibility with 1.0.
		276
		277	* Added experimental real->complex transform (quick hack,
		278	use at your own risk).
		279
		280	* Added experimental parallel transforms using Cilk.
		281
		282	* Added experimental parallel transforms using threads (currently,
		283	POSIX threads and Solaris threads are implemented and tested).
		284
		285	* Added DOS support, in the sense that we now support 8.3 filenames.
		286
		287	Version 1.0: First release
