Design Philosophy and Features for v3+
Here we describe in broad terms the design philosophy of the new 21cmFAST, and some of
its new features.
This is useful to get an initial bearing of how to go about using 21cmFAST, though
most likely the tutorials will be better for that.
It is also useful for those who have used the “old” 21cmFAST (versions 2.1 and earlier)
and want to know why they should use this new version (and how to convert).
In doing so, we’ll go over some of the key features of 21cmFAST v3+.
To get a more in-depth view of all the options and features available, look at the
very thorough API Reference.
Design Philosophy
The goal of v3 of 21cmFAST is to provide the same computational efficiency and
scientific features of the previous generations, but packaged in a form that adopts the
best modern programming standards, including:
simple installation
comprehensive documentation
comprehensive test suite
more modular code
standardised code formatting
truly open-source and collaborative design (via GitHub)
Partly to enable these standards, and partly due to the many extra benefits it brings, v3 also has the major distinction of being wrapped entirely in Python. The extra benefits brought by this include:
a native Python library interface (e.g. get your output box directly as a numpy array).
better file-writing, into the HDF5 format, which saves metadata along with the box data.
a caching system so that the same data never has to be calculated twice.
reproducibility: know which exact version of 21cmFAST, with what parameters, produced a given dataset.
significantly improved warnings and error checking/propagation.
simplicity for adding new effects, and inserting them in the calculation pipeline.
We hope that additional features and benefits will be found by the growing community
of 21cmFAST developers and users.
How it Works
v3 is not a complete rewrite of 21cmFAST. Most of the C code of previous versions
is kept, though it has been modularised and modified in many places. The fundamental
routines are the same (barring bugfixes!).
The major programs of the original version (init, perturb, ionize etc.) have
been converted into modular functions in C. Furthermore, most of the global parameters
(and, more often than not, global #define options) have been modularised and converted
into a series of input “parameter” structs. These get passed into the functions.
Furthermore, each C function, instead of writing a bunch of files, returns an output
struct containing everything it computed.
Each of these functions and structs is wrapped in Python using the cffi package.
CFFI compiles the C code once upon installation. Due to the fact that parameters are
now passed around to the different functions, rather than being global defines, we no
longer need to re-compile every time an option is changed. Python itself can handle
changing the parameters, and can use the outputs in whatever way the user desires.
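The shift from global defines to parameter structs can be pictured with a small pure-Python analogy. This is only a sketch: SketchParams, SketchOutput and compute_grid are hypothetical names invented for illustration, BOX_LEN and the default values are assumptions, and none of this is the actual py21cmfast API or its C structs.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SketchParams:
    """Hypothetical stand-in for an input "parameter" struct."""

    HII_DIM: int = 50      # illustrative default, not the library's
    BOX_LEN: float = 150.0  # assumed field, for illustration only


@dataclass(frozen=True)
class SketchOutput:
    """Hypothetical stand-in for an output struct returned by a function."""

    n_cells: int
    cell_size: float


def compute_grid(params: SketchParams) -> SketchOutput:
    # Everything the function needs arrives through `params`; nothing is
    # read from a global define, so nothing has to be recompiled.
    return SketchOutput(
        n_cells=params.HII_DIM ** 3,
        cell_size=params.BOX_LEN / params.HII_DIM,
    )


default_out = compute_grid(SketchParams())
# Changing an option just means building a new parameter object.
coarse_out = compute_grid(replace(SketchParams(), HII_DIM=25))
```

The point of the design is that the compiled routine stays fixed while the caller varies its inputs at run time and receives the results as an object rather than as files on disk.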
To maintain continuity with previous versions, a CLI is provided (see below) that acts in a similar fashion to them.
High-level configuration of 21cmFAST can be set using the py21cmfast.config
object. It is essentially a dictionary whose keys and values are the configuration
parameters. To make any changes in the object permanent, use the
py21cmfast.config.write() method.
One global configuration option is direc, which specifies the directory in which
21cmFAST will cache results by default (this can be overridden directly in any
function, see below for details).
Finally, 21cmFAST contains a more robust cataloguing/caching method. Saving data with
a selection of the dependent parameters written into the filename is prone to error if
a parameter which is not part of that selection is modified. Instead, 21cmFAST
writes all data into a configurable central directory with a hash
filename unique to all parameters upon which the data depends. Each kind of dataset has
attached methods which efficiently search this central directory for matching data to be
read when necessary.
Several arguments are available for all library functions which produce such datasets
that control this output. In this way, the data that is being retrieved is always
reliably produced with the desired parameters, and users need not concern themselves
with how and where the data is saved: it can be retrieved merely by creating an empty
object with the desired parameters and calling .read(), or even better, by calling
the function to produce the given dataset, which will by default just read it in if
available.
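The idea of a hash filename that depends on every parameter, combined with "read it in if available", can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not the scheme 21cmFAST actually uses: param_hash, get_or_compute and toy_init are hypothetical helpers, SHA-256 is an arbitrary choice of hash, and JSON files stand in for the HDF5 files the library writes.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def param_hash(kind, params):
    # Serialise with sorted keys so that parameter order cannot change the hash.
    payload = json.dumps({"kind": kind, "params": params}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def get_or_compute(kind, params, compute, direc):
    """Return (data, was_cached), reading from `direc` when a match exists."""
    path = Path(direc) / f"{kind}_{param_hash(kind, params)}.json"
    if path.exists():
        return json.loads(path.read_text()), True
    data = compute(params)
    path.write_text(json.dumps(data))
    return data, False


def toy_init(params):
    # Hypothetical stand-in for an expensive calculation.
    return {"n_cells": params["HII_DIM"] ** 3}


with tempfile.TemporaryDirectory() as direc:
    first, hit1 = get_or_compute("init", {"HII_DIM": 40, "hlittle": 0.7}, toy_init, direc)
    # Same parameters in a different order: found on disk, not recomputed.
    second, hit2 = get_or_compute("init", {"hlittle": 0.7, "HII_DIM": 40}, toy_init, direc)
    # Any changed parameter yields a different filename, hence a fresh run.
    third, hit3 = get_or_compute("init", {"HII_DIM": 40, "hlittle": 0.68}, toy_init, direc)
```

Because every parameter feeds the hash, a change to any one of them can never silently pick up stale data, which is the failure mode of filenames that encode only a selection of parameters.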
CLI
The CLI always starts with the command 21cmfast, and has a number of
subcommands. To list the available subcommands, use:
$ 21cmfast --help
To get help on any subcommand, simply use:
$ 21cmfast <subcommand> --help
Any subcommand which runs some aspect of 21cmFAST will have a --config option,
which specifies a configuration file. This config file specifies the parameters of the
run. Furthermore, any particular parameter that can be specified in the config file can
alternatively be specified on the command line by appending the parameter name
to the command, e.g.:
$ 21cmfast init --config=my_config.yml --HII_DIM=40 hlittle=0.7 --DIM 100 SIGMA_8 0.9
The above command shows the numerous ways in which these parameters can be specified (with or without leading dashes, and with or without “=”).
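To make the accepted spellings concrete, here is a rough sketch of how such trailing tokens could be turned into parameter overrides. It is not the parser that the 21cmfast command actually uses; parse_overrides is a hypothetical helper, and values are deliberately left as strings rather than guessed into types.

```python
def parse_overrides(tokens):
    """Parse tokens such as ['--HII_DIM=40', 'SIGMA_8', '0.9'] into a dict."""
    overrides = {}
    i = 0
    while i < len(tokens):
        token = tokens[i].lstrip("-")  # leading dashes are optional
        if "=" in token:
            # "name=value" form: one token carries both parts.
            name, value = token.split("=", 1)
            i += 1
        else:
            # "name value" form: the value is the following token.
            if i + 1 >= len(tokens):
                raise ValueError(f"parameter {token!r} has no value")
            name, value = token, tokens[i + 1]
            i += 2
        overrides[name] = value
    return overrides


example = "--HII_DIM=40 hlittle=0.7 --DIM 100 SIGMA_8 0.9".split()
parsed = parse_overrides(example)
```

All four spellings from the command above end up as plain name/value pairs, which could then be layered on top of whatever the config file specifies.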
The CLI, while simple to use, does have the limitation that the more complex
arguments that can be passed to the library functions are disallowed. For example,
one cannot pass a previously calculated initial conditions box to the perturb
command. However, if such a box has been calculated with the default option to write it
to disk, then it will automatically be found and used in such a situation, i.e. the
following will not re-calculate the init box:
$ 21cmfast init
$ 21cmfast perturb redshift=8.0
This means that almost all of the functionality provided in the library is accessible via the CLI.