Introduction to fitting
In addition to the simulation of grazing incidence X-ray and neutron scattering by multilayered samples, BornAgain also offers the option to fit the numerical model to reference data by modifying a selection of sample parameters from the numerical model.
- Implementation in BornAgain
- Stages to run fitting procedure
- How to get the right answer from fitting
Fitting in BornAgain deals with estimating the optimum parameters in the numerical model by minimizing the difference between numerical and reference data. The features include
- a variety of multidimensional minimization algorithms and strategies.
- the choice over possible fitting parameters, their properties and correlations.
- the full control on objective function calculations, including applications of different normalizations and assignments of different masks and weights to different areas of reference data.
- the possibility to fit simultaneously an arbitrary number of data sets.
Figure shows the general work flow of a typical fitting procedure. Before running the fitting the user is required to prepare some data and to configure the fitting kernel of BornAgain. The required stages are
- Preparing the sample and the simulation description (multilayer, beam, detector parameters).
- Choosing the fitting parameters.
- Loading the reference data.
- Defining the minimization settings.
FitSuite contains the main functionalities to be used for the fit and serves as the main interface between the user and the fitting work flow. The later involves iterations
- The minimizer makes an assumption about the optimal sample parameters.
- These parameters are propagated to the sample.
- The simulation is performed for the given state of the sample.
- The simulated data (intensities) are propagated to the χ2 module.
- The later calculates χ2 using the simulated and reference data.
- The value of χ2 is propagated to the minimizer, which makes new assumptions about optimal sample parameters.
The iteration process is going on under the control of the selected minimization algorithm, without any intervention from the user. It stops
- when the maximum number of iteration steps has been exceeded
- when the function’s minimum has been reached within the tolerance window,
- if the minimizer could not improve the values of the parameters.
After the control is returned, fitting results can be retrieved. They consist in the best χ2 value found, the corresponding optimal sample parameters and the intensity map simulated with this set of parameters.
Preparing the sample and the simulation description
This step is similar for any simulation using BornAgain (see simulation tutorial). It consists in first characterizing the geometry of the system: the particles (shapes, sizes, refractive indices), the different layers (thickness, order, refractive index, a possible roughness of the interface), the interference between the particles and theway they are distributed in the layers (buried particles or particles sitting on top of a layer). Then we specify the parameters of the input beam and of the output detector.
Choice of parameters to be fitted
Every parameter used in the construction of the sample can be used as a fitting parameter. For example, the particles’ heights, radii or the layer’s roughness or thickness could be selected using the parameter pool mechanism. This mechanism is explained in detail in section Working with sample parameters and it is therefore recommended to read it before proceeding any further.
The user specifies selected sample parameters as fit parameters using
FitSuite and its
fit_suite = FitSuite() fit_suite.addFitParameter(<name>, <initial value>, <limits>, <step>)
The second parameter <initial value> correspond to the initial value of the fitting parameter. The third parameter <AttLimits> corresponds to the boundaries imposed on parameter value. It can be
<max_value> are double values corresponding to the lower and higher boundary, respectively.
The last parameter
<step>, responsible for the initial iteration steps size, is optional. The majority of minimizers make they own assumption about step size and only
Simulated Annealing minimizers respects this value.
Associating reference and simulated data
The minimization procedure deals with a pair of reference data (normally associated with experimental data) and the theoretical model (presented by the sample and the simulation descriptions).
We assume that the experimental data are a two-dimensional intensity matrix as function of the output scattering angles αf and φf. The user is required to provide the data in the form of an ASCII file containing an axes binning description and the intensity data itself. Examples how to convert the experimental data to the intensity matrix from MARIA (MLZ) ascii and tiff format (used at GALAXI or in DESY) can be found in the
Examples folder of the
To associate the simulation and the reference data to the fitting engine, method
addSimulationAndRealData has to be used as shown
fit_suite = FitSuite() fit_suite.addSimulationAndRealData (<simulation>, <reference>)
<simulation> corresponds to a BornAgain simulation object with the sample, beam and detector fully defined,
<reference> corresponds to the experimental data object obtained from the ASCII file.
It is possible to call this given method more than once to submit more than one pair of
<reference> to the fitting procedure. In this way, simultaneous fits of
some combined data sets are performed.
BornAgain contains a variety of minimization engines from
GSL libraries. They are listed in the table below. By default,
Minuit2 minimizer with default settings will be used and no additional configuration needs to be done. The remainder of this section explains some of the expert settings, which can be applied to get better fit results.
The default minimization algorithm can be changed using
setMinimizer method as shown below
fit_suite = FitSuite() fit_suite.setMinimizer(<minimizer name>, <algorithm_name>)
<Minimizer name> and
<algorithm name> can be chosen from the first and second column of the table, respectively.
||According to the tutorial, best minimizer for nearly all functions, variable-metric method with inexact line search, a stable metric updating scheme, and checks for positive-definiteness.|
||Simplex method of Nelder and Mead usually slower than
||Minimization with Migrad, but switches to Simplex if Migrad fails to converge.|
||Not intended to minimize, just scans the function, one parameter at a time, retains the best value after each scan.|
||Optimized method for least square and log likelihood minimizations.|
||Fletcher-Reeves conjugate gradient algorithm.|
||Polak-Ribiere conjugate gradient algorithm.|
||Improved version of
||Follows the downhill gradient of the function at each step.|
|GSLSimAn||Simulated Annealing Algorithm|
There are several options common to every minimization algorithm, which can be changed before starting the minimization. They are handled by
In the above code snippet, a number of “maximum function calls”, namely the maximum number of times theminimizer is allowed to call the simulation, is limited to 10.
There are also expert-level options common for all minimizers as well as a number of options to tune individual minimization algorithms. They will be explained in due course.
Running the fitting and retrieving the results
After the initial configuration of
FitSuite has been performed, the fitting can be started using the command
Depending on the complexity of the sample and the number of free sample parameters the fitting process can take from tens to thousands of iterations. The results of the fit can
be printed on the screen using the command
This example gives more details on accessing the fitting results.
One of the main difficulties in fitting the data with the model is the presence of multiple local minima in the objective function. Many problems can cause the fit to fail, for example:
- an unreliable physical model
- an unappropriate choice of objective function
- multiple local minima
- an unphysical behavior of the objective function, unphysical regions in the parameters space
- an unreliable parameter error calculation in the presence of limits on the parameter value
- an exponential behavior of the objective function and the corresponding numerical inaccuracies, excessive numerical roundoff in the calculation of its value and derivatives
- large correlations between parameters
- very different scales of parameters involved in the calculation
- not positive definite error matrix even at minimum
The given list, of course, is not only related to BornAgain fitting. It remains applicable to any fitting program and any kind of theoretical model. Below we give some recommendations which might help the user to achieve reliable fit results.
- Initially choose a small number of free fitting parameters
- Eliminate redundant parameters
- Provide a good initial guess for the fit parameters
- Start from the default minimizer settings and perform some fine tuning after some experience has been acquired
- Repeat the fit using different starting values for the parameters or their limits
- Repeat the fit, fixing and varying different groups of parameter