Performing Fits and Analyzing Outputs¶
As shown in the previous chapter, a simple fit can be performed with the
minimize()
function. For more sophisticated modeling, the
Minimizer
class can be used to gain a bit more control, especially
when using complicated constraints or comparing results from related fits.
The minimize()
function¶
The minimize()
function is a wrapper around Minimizer
for
running an optimization problem. It takes an objective function (the
function that calculates the array to be minimized), a Parameters
object, and several optional arguments. See Writing a Fitting Function for
details on writing the objective function.

minimize
(fcn, params, method='leastsq', args=None, kws=None, scale_covar=True, iter_cb=None, reduce_fcn=None, **fit_kws)¶ Perform a fit of a set of parameters by minimizing an objective (or cost) function using one of the several available methods.
The minimize function takes an objective function to be minimized, a dictionary (
Parameters
) containing the model parameters, and several optional arguments.Parameters:  fcn (callable) – Objective function to be minimized. When method is leastsq or least_squares, the objective function should return an array of residuals (difference between model and data) to be minimized in a leastsquares sense. With the scalar methods the objective function can either return the residuals array or a single scalar value. The function must have the signature: fcn(params, *args, **kws)
 params (
Parameters
) – Contains the Parameters for the model.  method (str, optional) –
Name of the fitting method to use. Valid values are:
 ’leastsq’: LevenbergMarquardt (default)
 ’least_squares’: LeastSquares minimization, using Trust Region Reflective method by default
 ’differential_evolution’: differential evolution
 ’brute’: brute force method
 ’nelder’: NelderMead
 ’lbfgsb’: LBFGSB
 ’powell’: Powell
 ’cg’: ConjugateGradient
 ’newton’: NewtonCongugateGradient
 ’cobyla’: Cobyla
 ’tnc’: Truncate Newton
 ’trustncg’: Trust NewtonCongugateGradient
 ’dogleg’: Dogleg
 ’slsqp’: Sequential Linear Squares Programming
In most cases, these methods wrap and use the method of the same name from scipy.optimize, or use scipy.optimize.minimize with the same method argument. Thus ‘leastsq’ will use scipy.optimize.leastsq, while ‘powell’ will use scipy.optimize.minimizer(…, method=’powell’)
For more details on the fitting methods please refer to the SciPy docs.
 args (tuple, optional) – Positional arguments to pass to fcn.
 kws (dict, optional) – Keyword arguments to pass to fcn.
 iter_cb (callable, optional) – Function to be called at each fit iteration. This function should have the signature iter_cb(params, iter, resid, *args, **kws), where params will have the current parameter values, iter the iteration number, resid the current residual array, and *args and **kws as passed to the objective function.
 scale_covar (bool, optional) – Whether to automatically scale the covariance matrix (default is True, leastsq only).
 reduce_fcn (str or callable, optional) – Function to convert a residual array to a scalar value for the scalar minimizers. See notes in Minimizer.
 **fit_kws (dict, optional) – Options to pass to the minimizer being used.
Returns: Object containing the optimized parameter and several goodnessoffit statistics.
Return type: Changed in version 0.9.0: Return value changed to
MinimizerResult
.Notes
The objective function should return the value to be minimized. For the LevenbergMarquardt algorithm from leastsq(), this returned value must be an array, with a length greater than or equal to the number of fitting variables in the model. For the other methods, the return value can either be a scalar or an array. If an array is returned, the sum of squares of the array will be sent to the underlying fitting method, effectively doing a leastsquares optimization of the return values.
A common use for args and kws would be to pass in other data needed to calculate the residual, including such things as the data array, dependent variable, uncertainties in the data, and other data structures for the model calculation.
On output, params will be unchanged. The bestfit values and, where appropriate, estimated uncertainties and correlations, will all be contained in the returned
MinimizerResult
. See MinimizerResult – the optimization result for further details.This function is simply a wrapper around
Minimizer
and is equivalent to:fitter = Minimizer(fcn, params, fcn_args=args, fcn_kws=kws, iter_cb=iter_cb, scale_covar=scale_covar, **fit_kws) fitter.minimize(method=method)
Writing a Fitting Function¶
An important component of a fit is writing a function to be minimized – the objective function. Since this function will be called by other routines, there are fairly stringent requirements for its call signature and return value. In principle, your function can be any Python callable, but it must look like this:

func(params, *args, **kws):
Calculate objective residual to be minimized from parameters.
Parameters:  params (
Parameters
) – Parameters.  args – Positional arguments. Must match
args
argument tominimize()
.  kws – Keyword arguments. Must match
kws
argument tominimize()
.
Returns: Residual array (generally datamodel) to be minimized in the leastsquares sense.
Return type: numpy.ndarray. The length of this array cannot change between calls.
 params (
A common use for the positional and keyword arguments would be to pass in other data needed to calculate the residual, including things as the data array, dependent variable, uncertainties in the data, and other data structures for the model calculation.
The objective function should return the value to be minimized. For the
LevenbergMarquardt algorithm from leastsq()
, this returned value must be an
array, with a length greater than or equal to the number of fitting variables in the
model. For the other methods, the return value can either be a scalar or an array. If an
array is returned, the sum of squares of the array will be sent to the underlying fitting
method, effectively doing a leastsquares optimization of the return values.
Since the function will be passed in a dictionary of Parameters
, it is advisable
to unpack these to get numerical values at the top of the function. A
simple way to do this is with Parameters.valuesdict()
, as shown below:
def residual(pars, x, data=None, eps=None):
from numpy import exp, sign, sin, pi
# unpack parameters: extract .value attribute for each parameter
parvals = pars.valuesdict()
period = parvals['period']
shift = parvals['shift']
decay = parvals['decay']
if abs(shift) > pi/2:
shift = shift  sign(shift)*pi
if abs(period) < 1.e10:
period = sign(period)*1.e10
model = parvals['amp'] * sin(shift + x/period) * exp(x*x*decay*decay)
if data is None:
return model
if eps is None:
return model  data
return (modeldata) / eps
In this example, x is a positional (required) argument, while the
data array is actually optional (so that the function returns the model
calculation if the data is neglected). Also note that the model
calculation will divide x by the value of the period
Parameter. It
might be wise to ensure this parameter cannot be 0. It would be possible
to use bounds on the Parameter
to do this:
params['period'] = Parameter(value=2, min=1.e10)
but putting this directly in the function with:
if abs(period) < 1.e10:
period = sign(period)*1.e10
is also a reasonable approach. Similarly, one could place bounds on the
decay
parameter to take values only between pi/2
and pi/2
.
Choosing Different Fitting Methods¶
By default, the LevenbergMarquardt algorithm is used for fitting. While often criticized, including the fact it finds a local minima, this approach has some distinct advantages. These include being fast, and wellbehaved for most curvefitting needs, and making it easy to estimate uncertainties for and correlations between pairs of fit variables, as discussed in MinimizerResult – the optimization result.
Alternative algorithms can also be used by providing the method
keyword to the minimize()
function or Minimizer.minimize()
class as listed in the Table of Supported Fitting Methods.
Table of Supported Fitting Methods:
Fitting Method method
arg tominimize()
orMinimizer.minimize()
LevenbergMarquardt leastsq
orleast_squares
NelderMead nelder
LBFGSB lbfgsb
Powell powell
Conjugate Gradient cg
NewtonCG newton
COBYLA cobyla
Truncated Newton tnc
Dogleg dogleg
Sequential Linear Squares Programming slsqp
Differential Evolution differential_evolution
Brute force method brute
Note
The objective function for the LevenbergMarquardt method must return an array, with more elements than variables. All other methods can return either a scalar value or an array.
Warning
Much of this documentation assumes that the LevenbergMarquardt method is used. Many of the fit statistics and estimates for uncertainties in parameters discussed in MinimizerResult – the optimization result are done only for this method.
MinimizerResult
– the optimization result¶
New in version 0.9.0.
An optimization with minimize()
or Minimizer.minimize()
will return a MinimizerResult
object. This is an otherwise
plain container object (that is, with no methods of its own) that
simply holds the results of the minimization. These results will
include several pieces of informational data such as status and error
messages, fit statistics, and the updated parameters themselves.
Importantly, the parameters passed in to Minimizer.minimize()
will be not be changed. To find the bestfit values, uncertainties
and so on for each parameter, one must use the
MinimizerResult.params
attribute. For example, to print the
fitted values, bounds and other parameter attributes in a
wellformatted text tables you can execute:
result.params.pretty_print()
with results being a MinimizerResult object. Note that the method
pretty_print()
accepts several arguments
for customizing the output (e.g., column width, numeric format, etcetera).

class
MinimizerResult
(**kws)¶ The results of a minimization.
Minimization results include data such as status and error messages, fit statistics, and the updated (i.e., bestfit) parameters themselves in the
params
attribute.The list of (possible) MinimizerResult attributes is given below:

params
¶ Parameters
– The bestfit parameters resulting from the fit.

status
¶ int – Termination status of the optimizer. Its value depends on the underlying solver. Refer to message for details.

var_names
¶ list – Ordered list of variable parameter names used in optimization, and useful for understanding the values in
init_vals
andcovar
.

covar
¶ numpy.ndarray – Covariance matrix from minimization (leastsq only), with rows and columns corresponding to
var_names
.

init_values
¶ dict – Dictionary of initial values for variable parameters.

nfev
¶ int – Number of function evaluations.

success
¶ bool – True if the fit succeeded, otherwise False.

errorbars
¶ bool – True if uncertainties were estimated, otherwise False.

message
¶ str – Message about fit success.

ier
¶ int – Integer error value from scipy.optimize.leastsq (leastsq only).

lmdif_message
¶ str – Message from scipy.optimize.leastsq (leastsq only).

nvarys
¶ int – Number of variables in fit: \(N_{\rm varys}\).

ndata
¶ int – Number of data points: \(N\).

nfree
¶ int – Degrees of freedom in fit: \(N  N_{\rm varys}\).

residual
¶ numpy.ndarray – Residual array \({\rm Resid_i}\). Return value of the objective function when using the bestfit values of the parameters.

chisqr
¶ float – Chisquare: \(\chi^2 = \sum_i^N [{\rm Resid}_i]^2\).

redchi
¶ float – Reduced chisquare: \(\chi^2_{\nu}= {\chi^2} / {(N  N_{\rm varys})}\).

aic
¶ float – Akaike Information Criterion statistic: \(N \ln(\chi^2/N) + 2 N_{\rm varys}\).

bic
¶ float – Bayesian Information Criterion statistic: \(N \ln(\chi^2/N) + \ln(N) N_{\rm varys}\).

flatchain
¶ pandas.DataFrame – A flatchain view of the sampling chain from the emcee method.

show_candidates
()¶ Pretty_print() representation of candidates from the brute method.

GoodnessofFit Statistics¶
Table of Fit Results: These values, including the standard GoodnessofFit statistics, are all attributes of theMinimizerResult
object returned byminimize()
orMinimizer.minimize()
.
Attribute Name  Description / Formula 

nfev  number of function evaluations 
nvarys  number of variables in fit \(N_{\rm varys}\) 
ndata  number of data points: \(N\) 
nfree  degrees of freedom in fit: \(N  N_{\rm varys}\) 
residual  residual array, returned by the objective function: \(\{\rm Resid_i\}\) 
chisqr  chisquare: \(\chi^2 = \sum_i^N [{\rm Resid}_i]^2\) 
redchi  reduced chisquare: \(\chi^2_{\nu}= {\chi^2} / {(N  N_{\rm varys})}\) 
aic  Akaike Information Criterion statistic (see below) 
bic  Bayesian Information Criterion statistic (see below) 
var_names  ordered list of variable parameter names used for init_vals and covar 
covar  covariance matrix (with rows/columns using var_names) 
init_vals  list of initial values for variable parameters 
Note that the calculation of chisquare and reduced chisquare assume that the returned residual function is scaled properly to the uncertainties in the data. For these statistics to be meaningful, the person writing the function to be minimized must scale them properly.
After a fit using the leastsq()
method has completed
successfully, standard errors for the fitted variables and correlations
between pairs of fitted variables are automatically calculated from the
covariance matrix. The standard error (estimated \(1\sigma\)
errorbar) goes into the stderr
attribute of the Parameter. The
correlations with all other variables will be put into the
correl
attribute of the Parameter – a dictionary with keys for
all other Parameters and values of the corresponding correlation.
In some cases, it may not be possible to estimate the errors and
correlations. For example, if a variable actually has no practical effect
on the fit, it will likely cause the covariance matrix to be singular,
making standard errors impossible to estimate. Placing bounds on varied
Parameters makes it more likely that errors cannot be estimated, as being
near the maximum or minimum value makes the covariance matrix singular. In
these cases, the errorbars
attribute of the fit result
(Minimizer
object) will be False
.
Akaike and Bayesian Information Criteria¶
The MinimizerResult
includes the traditional chisquare and reduced chisquare statistics:
where \(r\) is the residual array returned by the objective function
(likely to be (datamodel)/uncertainty
for data modeling usages),
\(N\) is the number of data points (ndata
), and \(N_{\rm
varys}\) is number of variable parameters.
Also included are the Akaike Information Criterion, and
Bayesian Information Criterion statistics,
held in the aic
and bic
attributes, respectively. These give slightly
different measures of the relative quality for a fit, trying to balance
quality of fit with the number of variable parameters used in the fit.
These are calculated as:
When comparing fits with different numbers of varying parameters, one typically selects the model with lowest reduced chisquare, Akaike information criterion, and/or Bayesian information criterion. Generally, the Bayesian information criterion is considered the most conservative of these statistics.
Using a Iteration Callback Function¶
An iteration callback function is a function to be called at each iteration, just after the objective function is called. The iteration callback allows usersupplied code to be run at each iteration, and can be used to abort a fit.

iter_cb(params, iter, resid, *args, **kws):
Usersupplied function to be run at each iteration.
Parameters:  params (
Parameters
) – Parameters.  iter (int) – Iteration number.
 resid (numpy.ndarray) – Residual array.
 args – Positional arguments. Must match
args
argument tominimize()
 kws – Keyword arguments. Must match
kws
argument tominimize()
Returns: Residual array (generally datamodel) to be minimized in the leastsquares sense.
Return type: None for normal behavior, any value like True to abort the fit.
 params (
Normally, the iteration callback would have no return value or return
None
. To abort a fit, have this function return a value that is
True
(including any nonzero integer). The fit will also abort if any
exception is raised in the iteration callback. When a fit is aborted this
way, the parameters will have the values from the last iteration. The fit
statistics are not likely to be meaningful, and uncertainties will not be computed.
Using the Minimizer
class¶
For full control of the fitting process, you will want to create a
Minimizer
object.

class
Minimizer
(userfcn, params, fcn_args=None, fcn_kws=None, iter_cb=None, scale_covar=True, nan_policy='raise', reduce_fcn=None, **kws)¶ A general minimizer for curve fitting and optimization.
Parameters:  userfcn (callable) –
Objective function that returns the residual (difference between model and data) to be minimized in a leastsquares sense. This function must have the signature:
userfcn(params, *fcn_args, **fcn_kws)
 params (
Parameters
) – Contains the Parameters for the model.  fcn_args (tuple, optional) – Positional arguments to pass to userfcn.
 fcn_kws (dict, optional) – Keyword arguments to pass to userfcn.
 iter_cb (callable, optional) –
Function to be called at each fit iteration. This function should have the signature:
iter_cb(params, iter, resid, *fcn_args, **fcn_kws)
where params will have the current parameter values, iter the iteration number, resid the current residual array, and *fcn_args and **fcn_kws are passed to the objective function.
 scale_covar (bool, optional) – Whether to automatically scale the covariance matrix (default is True, leastsq only).
 nan_policy (str, optional) –
Specifies action if userfcn (or a Jacobian) returns NaN values. One of:
 ’raise’ : a ValueError is raised
 ’propagate’ : the values returned from userfcn are unaltered
 ’omit’ : nonfinite values are filtered
 reduce_fcn (str or callable, optional) –
Function to convert a residual array to a scalar value for the scalar minimizers. Optional values are (where r is the residual array):
 None : sum of squares of residual [default]= (r*r).sum()
 ’negentropy’ : neg entropy, using normal distribution= rho*log(rho).sum()`, where rho = exp(r*r/2)/(sqrt(2*pi))
 ’neglogcauchy’: neg log likelihood, using Cauchy distribution= log(1/(pi*(1+r*r))).sum()
 callable : must take one argument (r) and return a float.
 None : sum of squares of residual [default]
 **kws (dict, optional) – Options to pass to the minimizer being used.
Notes
The objective function should return the value to be minimized. For the LevenbergMarquardt algorithm from
leastsq()
orleast_squares()
, this returned value must be an array, with a length greater than or equal to the number of fitting variables in the model. For the other methods, the return value can either be a scalar or an array. If an array is returned, the sum of squares of the array will be sent to the underlying fitting method, effectively doing a leastsquares optimization of the return values. If the objective function returns nonfinite values then a ValueError will be raised because the underlying solvers cannot deal with them.A common use for the fcn_args and fcn_kws would be to pass in other data needed to calculate the residual, including such things as the data array, dependent variable, uncertainties in the data, and other data structures for the model calculation.
 userfcn (callable) –
The Minimizer object has a few public methods:

Minimizer.
minimize
(method='leastsq', params=None, **kws)¶ Perform the minimization.
Parameters:  method (str, optional) –
Name of the fitting method to use. Valid values are:
 ’leastsq’: LevenbergMarquardt (default)
 ’least_squares’: LeastSquares minimization, using Trust Region Reflective method by default
 ’differential_evolution’: differential evolution
 ’brute’: brute force method
 ’nelder’: NelderMead
 ’lbfgsb’: LBFGSB
 ’powell’: Powell
 ’cg’: ConjugateGradient
 ’newton’: NewtonCG
 ’cobyla’: Cobyla
 ’tnc’: Truncate Newton
 ’trustncg’: Trust NewtonCGn
 ’dogleg’: Dogleg
 ’slsqp’: Sequential Linear Squares Programming
In most cases, these methods wrap and use the method with the same name from scipy.optimize, or use scipy.optimize.minimize with the same method argument. Thus ‘leastsq’ will use scipy.optimize.leastsq, while ‘powell’ will use scipy.optimize.minimizer(…, method=’powell’)
For more details on the fitting methods please refer to the SciPy docs.
 params (
Parameters
, optional) – Parameters of the model to use as starting values.  **kws (optional) – Additional arguments are passed to the underlying minimization method.
Returns: Object containing the optimized parameter and several goodnessoffit statistics.
Return type: Changed in version 0.9.0: Return value changed to
MinimizerResult
. method (str, optional) –

Minimizer.
leastsq
(params=None, **kws)¶ Use LevenbergMarquardt minimization to perform a fit.
It assumes that the input Parameters have been initialized, and a function to minimize has been properly set up. When possible, this calculates the estimated uncertainties and variable correlations from the covariance matrix.
This method calls scipy.optimize.leastsq. By default, numerical derivatives are used, and the following arguments are set:
leastsq()
argDefault Value Description xtol 1.e7 Relative error in the approximate solution ftol 1.e7 Relative error in the desired sum of squares maxfev 2000*(nvar+1) Maximum number of function calls (nvar= # of variables) Dfun None Function to call for Jacobian calculation Parameters:  params (
Parameters
, optional) – Parameters to use as starting point.  **kws (dict, optional) – Minimizer options to pass to scipy.optimize.leastsq.
Returns: Object containing the optimized parameter and several goodnessoffit statistics.
Return type: Changed in version 0.9.0: Return value changed to
MinimizerResult
. params (

Minimizer.
least_squares
(params=None, **kws)¶ Use the least_squares (new in scipy 0.17) to perform a fit.
It assumes that the input Parameters have been initialized, and a function to minimize has been properly set up. When possible, this calculates the estimated uncertainties and variable correlations from the covariance matrix.
This method wraps scipy.optimize.least_squares, which has inbuilt support for bounds and robust loss functions.
Parameters:  params (
Parameters
, optional) – Parameters to use as starting point.  **kws (dict, optional) – Minimizer options to pass to scipy.optimize.least_squares.
Returns: Object containing the optimized parameter and several goodnessoffit statistics.
Return type: Changed in version 0.9.0: Return value changed to
MinimizerResult
. params (

Minimizer.
scalar_minimize
(method='NelderMead', params=None, **kws)¶ Scalar minimization using scipy.optimize.minimize.
Perform fit with any of the scalar minimization algorithms supported by scipy.optimize.minimize. Default argument values are:
scalar_minimize()
argDefault Value Description method NelderMead
fitting method tol 1.e7 fitting and parameter tolerance hess None Hessian of objective function Parameters:  method (str, optional) –
Name of the fitting method to use. One of:
 ’NelderMead’ (default)
 ’LBFGSB’
 ’Powell’
 ’CG’
 ’NewtonCG’
 ’COBYLA’
 ’TNC’
 ’trustncg’
 ’dogleg’
 ’SLSQP’
 ’differential_evolution’
 params (
Parameters
, optional) – Parameters to use as starting point.  **kws (dict, optional) – Minimizer options pass to scipy.optimize.minimize.
Returns: Object containing the optimized parameter and several goodnessoffit statistics.
Return type: Changed in version 0.9.0: Return value changed to
MinimizerResult
.Notes
If the objective function returns a NumPy array instead of the expected scalar, the sum of squares of the array will be used.
Note that bounds and constraints can be set on Parameters for any of these methods, so are not supported separately for those designed to use bounds. However, if you use the differential_evolution method you must specify finite (min, max) for each varying Parameter.
 method (str, optional) –

Minimizer.
prepare_fit
(params=None)¶ Prepare parameters for fitting.
Prepares and initializes model and Parameters for subsequent fitting. This routine prepares the conversion of
Parameters
into fit variables, organizes parameter bounds, and parses, “compiles” and checks constrain expressions. The method also creates and returns a new instance of aMinimizerResult
object that contains the copy of the Parameters that will actually be varied in the fit.Parameters: params ( Parameters
, optional) – Contains the Parameters for the model; if None, then the Parameters used to initialize the Minimizer object are used.Returns: Return type: MinimizerResult
Notes
This method is called directly by the fitting methods, and it is generally not necessary to call this function explicitly.
Changed in version 0.9.0: Return value changed to
MinimizerResult
.

Minimizer.
brute
(params=None, Ns=20, keep=50)¶ Use the brute method to find the global minimum of a function.
The following parameters are passed to scipy.optimize.brute and cannot be changed:
brute()
argValue Description full_output 1 Return the evaluation grid and the objective function’s values on it. finish None No “polishing” function is to be used after the grid search. disp False Do not print convergence messages (when finish is not None). It assumes that the input Parameters have been initialized, and a function to minimize has been properly set up.
Parameters:  params (
Parameters
object, optional) – Contains the Parameters for the model. If None, then the Parameters used to initialize the Minimizer object are used.  Ns (int, optional) – Number of grid points along the axes, if not otherwise specified (see Notes).
 keep (int, optional) – Number of best candidates from the brute force method that are
stored in the
candidates
attribute. If ‘all’, then all grid points from scipy.optimize.brute are stored as candidates.
Returns: Object containing the parameters from the brute force method. The return values (x0, fval, grid, Jout) from scipy.optimize.brute are stored as brute_<parname> attributes. The MinimizerResult also contains the candidates attribute and show_candidates() method. The candidates attribute contains the parameters and chisqr from the brute force method as a namedtuple, (‘Candidate’, [‘params’, ‘score’]), sorted on the (lowest) chisqr value. To access the values for a particular candidate one can use result.candidate[#].params or result.candidate[#].score, where a lower # represents a better candidate. The show_candidates(#) uses the
pretty_print()
method to show a specific candidate# or all candidates when no number is specified.Return type: New in version 0.9.6.
Notes
The
brute()
method evalutes the function at each point of a multidimensional grid of points. The grid points are generated from the parameter ranges using Ns and (optional) brute_step. The implementation in scipy.optimize.brute requires finite bounds and the range is specified as a twotuple (min, max) or sliceobject (min, max, brute_step). A sliceobject is used directly, whereas a twotuple is converted to a slice object that interpolates Ns points from min to max, inclusive.In addition, the
brute()
method in lmfit, handles three other scenarios given below with their respective sliceobject: lower bound (
min
) andbrute_step
are specified:  range = (min, min + Ns * brute_step, brute_step).
 lower bound (
 upper bound (
max
) andbrute_step
are specified:  range = (max  Ns * brute_step, max, brute_step).
 upper bound (
 numerical value (
value
) andbrute_step
are specified:  range = (value  (Ns//2) * brute_step, value + (Ns//2) * brute_step, brute_step).
 numerical value (
 params (
For more information, check the examples in examples/lmfit_brute.py
.

Minimizer.
emcee
(params=None, steps=1000, nwalkers=100, burn=0, thin=1, ntemps=1, pos=None, reuse_sampler=False, workers=1, float_behavior='posterior', is_weighted=True, seed=None)¶ Bayesian sampling of the posterior distribution using emcee.
Bayesian sampling of the posterior distribution for the parameters using the emcee Markov Chain Monte Carlo package. The method assumes that the prior is Uniform. You need to have emcee installed to use this method.
Parameters:  params (
Parameters
, optional) – Parameters to use as starting point. If this is not specified then the Parameters used to initialize the Minimizer object are used.  steps (int, optional) – How many samples you would like to draw from the posterior distribution for each of the walkers?
 nwalkers (int, optional) – Should be set so \(nwalkers >> nvarys\), where nvarys are the number of parameters being varied during the fit. “Walkers are the members of the ensemble. They are almost like separate MetropolisHastings chains but, of course, the proposal distribution for a given walker depends on the positions of all the other walkers in the ensemble.”  from the emcee webpage.
 burn (int, optional) – Discard this many samples from the start of the sampling regime.
 thin (int, optional) – Only accept 1 in every thin samples.
 ntemps (int, optional) – If ntemps > 1 perform a Parallel Tempering.
 pos (numpy.ndarray, optional) – Specify the initial positions for the sampler. If ntemps == 1 then pos.shape should be (nwalkers, nvarys). Otherwise, (ntemps, nwalkers, nvarys). You can also initialise using a previous chain that had the same ntemps, nwalkers and nvarys. Note that nvarys may be one larger than you expect it to be if your userfcn returns an array and is_weighted is False.
 reuse_sampler (bool, optional) – If you have already run emcee on a given Minimizer object then
it possesses an internal
sampler
attribute. You can continue to draw from the same sampler (retaining the chain history) if you set this option to True. Otherwise a new sampler is created. The nwalkers, ntemps, pos, and params keywords are ignored with this option. Important: the Parameters used to create the sampler must not change inbetween calls to emcee. Alteration of Parameters would include changedmin
,max
,vary
andexpr
attributes. This may happen, for example, if you use an altered Parameters object and call the minimize method inbetween calls to emcee.  workers (Poollike or int, optional) – For parallelization of sampling. It can be any Poollike object with a map method that follows the same calling sequence as the builtin map function. If int is given as the argument, then a multiprocessingbased pool is spawned internally with the corresponding number of parallel processes. ‘mpi4py’based parallelization and ‘joblib’based parallelization pools can also be used here. Note: because of multiprocessing overhead it may only be worth parallelising if the objective function is expensive to calculate, or if there are a large number of objective evaluations per step (ntemps * nwalkers * nvarys).
 float_behavior (str, optional) –
Specifies meaning of the objective function output if it returns a float. One of:
 ’posterior’  objective function returns a logposterior probability
 ’chi2’  objective function returns \(\chi^2\)
See Notes for further details.
 is_weighted (bool, optional) – Has your objective function been weighted by measurement uncertainties? If is_weighted is True then your objective function is assumed to return residuals that have been divided by the true measurement uncertainty (data  model) / sigma. If is_weighted is False then the objective function is assumed to return unweighted residuals, data  model. In this case emcee will employ a positive measurement uncertainty during the sampling. This measurement uncertainty will be present in the output params and output chain with the name __lnsigma. A side effect of this is that you cannot use this parameter name yourself. Important this parameter only has any effect if your objective function returns an array. If your objective function returns a float, then this parameter is ignored. See Notes for more details.
 seed (int or numpy.random.RandomState, optional) – If seed is an int, a new numpy.random.RandomState instance is used, seeded with seed. If seed is already a numpy.random.RandomState instance, then that numpy.random.RandomState instance is used. Specify seed for repeatable minimizations.
Returns: MinimizerResult object containing updated params, statistics, etc. The updated params represent the median (50th percentile) of all the samples, whilst the parameter uncertainties are half of the difference between the 15.87 and 84.13 percentiles. The MinimizerResult also contains the
chain
,flatchain
andlnprob
attributes. Thechain
andflatchain
attributes contain the samples and have the shape (nwalkers, (steps  burn) // thin, nvarys) or (ntemps, nwalkers, (steps  burn) // thin, nvarys), depending on whether Parallel tempering was used or not. nvarys is the number of parameters that are allowed to vary. Theflatchain
attribute is a pandas.DataFrame of the flattened chain, chain.reshape(1, nvarys). To access flattened chain values for a particular parameter use result.flatchain[parname]. Thelnprob
attribute contains the log probability for each sample inchain
. The sample with the highest probability corresponds to the maximum likelihood estimate.Return type: Notes
This method samples the posterior distribution of the parameters using Markov Chain Monte Carlo. To do so it needs to calculate the logposterior probability of the model parameters, F, given the data, D, \(\ln p(F_{true}  D)\). This ‘posterior probability’ is calculated as:
\[\ln p(F_{true}  D) \propto \ln p(D  F_{true}) + \ln p(F_{true})\]where \(\ln p(D  F_{true})\) is the ‘loglikelihood’ and \(\ln p(F_{true})\) is the ‘logprior’. The default logprior encodes prior information already known about the model. This method assumes that the logprior probability is numpy.inf (impossible) if the one of the parameters is outside its limits. The logprior probability term is zero if all the parameters are inside their bounds (known as a uniform prior). The loglikelihood function is given by [1]:
\[\ln p(DF_{true}) = \frac{1}{2}\sum_n \left[\frac{(g_n(F_{true})  D_n)^2}{s_n^2}+\ln (2\pi s_n^2)\right]\]The first summand in the square brackets represents the residual for a given datapoint (\(g\) being the generative model, \(D_n\) the data and \(s_n\) the standard deviation, or measurement uncertainty, of the datapoint). This term represents \(\chi^2\) when summed over all data points. Ideally the objective function used to create lmfit.Minimizer should return the logposterior probability, \(\ln p(F_{true}  D)\). However, since the inbuilt logprior term is zero, the objective function can also just return the loglikelihood, unless you wish to create a nonuniform prior.
If a float value is returned by the objective function then this value is assumed by default to be the logposterior probability, i.e. float_behavior is ‘posterior’. If your objective function returns \(\chi^2\), then you should use a value of ‘chi2’ for float_behavior. emcee will then multiply your \(\chi^2\) value by 0.5 to obtain the posterior probability.
However, the default behaviour of many objective functions is to return a vector of (possibly weighted) residuals. Therefore, if your objective function returns a vector, res, then the vector is assumed to contain the residuals. If is_weighted is True then your residuals are assumed to be correctly weighted by the standard deviation (measurement uncertainty) of the data points (res = (data  model) / sigma) and the loglikelihood (and logposterior probability) is calculated as: 0.5 * numpy.sum(res**2). This ignores the second summand in the square brackets. Consequently, in order to calculate a fully correct logposterior probability value your objective function should return a single value. If is_weighted is False then the data uncertainty, s_n, will be treated as a nuisance parameter and will be marginalized out. This is achieved by employing a strictly positive uncertainty (homoscedasticity) for each data point, \(s_n = \exp(\_\_lnsigma)\). __lnsigma will be present in MinimizerResult.params, as well as Minimizer.chain, nvarys will also be increased by one.
References
[1] http://dan.iel.fm/emcee/current/user/line/  params (
Minimizer.emcee()
 calculating the posterior probability distribution of parameters¶
Minimizer.emcee()
can be used to obtain the posterior probability distribution of
parameters, given a set of experimental data. An example problem is a double
exponential decay. A small amount of Gaussian noise is also added in:
>>> import numpy as np
>>> import lmfit
>>> import matplotlib.pyplot as plt
>>> x = np.linspace(1, 10, 250)
>>> np.random.seed(0)
>>> y = 3.0 * np.exp(x / 2)  5.0 * np.exp((x  0.1) / 10.) + 0.1 * np.random.randn(len(x))
>>> plt.plot(x, y)
>>> plt.show()
Create a Parameter set for the initial guesses:
>>> p = lmfit.Parameters()
>>> p.add_many(('a1', 4.), ('a2', 4.), ('t1', 3.), ('t2', 3., True))
>>> def residual(p):
... v = p.valuesdict()
... return v['a1'] * np.exp(x / v['t1']) + v['a2'] * np.exp((x  0.1) / v['t2'])  y
Solving with minimize()
gives the Maximum Likelihood solution:
>>> mi = lmfit.minimize(residual, p, method='Nelder', nan_policy='omit')
>>> lmfit.printfuncs.report_fit(mi.params, min_correl=0.5)
[[Variables]]
a1: 2.98623689 (init = 4)
a2: 4.33525597 (init = 4)
t1: 1.30993186 (init = 3)
t2: 11.8240752 (init = 3)
>>> plt.plot(x, y)
>>> plt.plot(x, residual(mi.params) + y, 'r')
>>> plt.show()
However, this doesn’t give a probability distribution for the parameters.
Furthermore, we wish to deal with the data uncertainty. This is called
marginalisation of a nuisance parameter. emcee
requires a function that returns
the logposterior probability. The logposterior probability is a sum of the
logprior probability and loglikelihood functions. The logprior probability is
assumed to be zero if all the parameters are within their bounds and np.inf
if any of the parameters are outside their bounds.
>>> # add a noise parameter
>>> mi.params.add('noise', value=1, min=0.001, max=2)
>>> # This is the loglikelihood probability for the sampling. We're going to estimate the
>>> # size of the uncertainties on the data as well.
>>> def lnprob(p):
... noise = p['noise']
... return 0.5 * np.sum((residual(p) / noise)**2 + np.log(2 * np.pi * noise**2))
Now we have to set up the minimizer and do the sampling:
>>> mini = lmfit.Minimizer(lnprob, mi.params)
>>> res = mini.emcee(burn=300, steps=1000, thin=20, params=mi.params)
Lets have a look at those posterior distributions for the parameters. This requires installation of the corner package:
>>> import corner
>>> corner.corner(res.flatchain, labels=res.var_names, truths=list(res.params.valuesdict().values()))
The values reported in the MinimizerResult
are the medians of the
probability distributions and a 1 \(\sigma\) quantile, estimated as half the
difference between the 15.8 and 84.2 percentiles. The median value is not
necessarily the same as the Maximum Likelihood Estimate. We’ll get that as well.
You can see that we recovered the right uncertainty level on the data:
>>> print("median of posterior probability distribution")
>>> print('')
>>> lmfit.report_fit(res.params)
median of posterior probability distribution

[[Variables]]
a1: 2.99342394 +/ 0.15851315 (5.30%) (init = 2.986237)
a2: 4.34384999 +/ 0.12454831 (2.87%) (init = 4.335256)
t1: 1.32338403 +/ 0.14120290 (10.67%) (init = 1.309932)
t2: 11.7962437 +/ 0.48632272 (4.12%) (init = 11.82408)
noise: 0.09761521 +/ 0.00431795 (4.42%) (init = 1)
[[Correlations]] (unreported correlations are < 0.100)
C(a2, t1) = 0.965
C(a2, t2) = 0.959
C(t1, t2) = 0.927
C(a1, a2) = 0.241
C(a1, t2) = 0.168
C(a2, noise) = 0.116
C(t1, noise) = 0.107
>>> # find the maximum likelihood solution
>>> highest_prob = np.argmax(res.lnprob)
>>> hp_loc = np.unravel_index(highest_prob, res.lnprob.shape)
>>> mle_soln = res.chain[hp_loc]
>>> for i, par in enumerate(p):
... p[par].value = mle_soln[i]
>>> print("\nMaximum likelihood Estimation")
>>> print('')
>>> print(p)
Maximum likelihood Estimation

Parameters([('a1', <Parameter 'a1', 2.9684811738216754, bounds=[inf:inf]>),
('a2', <Parameter 'a2', 4.355238699173162, bounds=[inf:inf]>),
('t1', <Parameter 't1', 1.3337647386777762, bounds=[inf:inf]>),
('t2', <Parameter 't2', 11.758394302818514, bounds=[inf:inf]>)])
>>> # Finally lets work out a 1 and 2sigma error estimate for 't1'
>>> quantiles = np.percentile(res.flatchain['t1'], [2.28, 15.9, 50, 84.2, 97.7])
>>> print("1 sigma spread", 0.5 * (quantiles[3]  quantiles[1]))
>>> print("2 sigma spread", 0.5 * (quantiles[4]  quantiles[0]))
1 sigma spread 0.1414604069179637
2 sigma spread 0.453234685099423
Getting and Printing Fit Reports¶

fit_report
(inpars, modelpars=None, show_correl=True, min_correl=0.1, sort_pars=False)¶ Generate a report of the fitting results.
The report contains the bestfit values for the parameters and their uncertainties and correlations.
Parameters:  inpars (Parameters) – Input Parameters from fit or MinimizerResult returned from a fit.
 modelpars (Parameters, optional) – Known Model Parameters.
 show_correl (bool, optional) – Whether to show list of sorted correlations (default is True).
 min_correl (float, optional) – Smallest correlation in absolute value to show (default is 0.1).
 sort_pars (bool or callable, optional) – Whether to show parameter names sorted in alphanumerical order. If False (default), then the parameters will be listed in the order they were added to the Parameters dictionary. If callable, then this (one argument) function is used to extract a comparison key from each list element.
Returns: Multiline text of fit report.
Return type: string
An example using this to write out a fit report would be:
#!/usr/bin/env python
# <examples/doc_fitting_withreport.py>
from numpy import exp, linspace, pi, random, sign, sin
from lmfit import Parameters, fit_report, minimize
p_true = Parameters()
p_true.add('amp', value=14.0)
p_true.add('period', value=5.46)
p_true.add('shift', value=0.123)
p_true.add('decay', value=0.032)
def residual(pars, x, data=None):
vals = pars.valuesdict()
amp = vals['amp']
per = vals['period']
shift = vals['shift']
decay = vals['decay']
if abs(shift) > pi/2:
shift = shift  sign(shift)*pi
model = amp * sin(shift + x/per) * exp(x*x*decay*decay)
if data is None:
return model
return model  data
n = 1001
xmin = 0.
xmax = 250.0
random.seed(0)
noise = random.normal(scale=0.7215, size=n)
x = linspace(xmin, xmax, n)
data = residual(p_true, x) + noise
fit_params = Parameters()
fit_params.add('amp', value=13.0)
fit_params.add('period', value=2)
fit_params.add('shift', value=0.0)
fit_params.add('decay', value=0.02)
out = minimize(residual, fit_params, args=(x,), kws={'data': data})
print(fit_report(out))
# <end of examples/doc_fitting_withreport.py>
which would write out:
[[Fit Statistics]]
# function evals = 85
# data points = 1001
# variables = 4
chisquare = 498.81176
reduced chisquare = 0.50031
Akaike info crit = 689.22252
Bayesian info crit = 669.58750
[[Variables]]
amp: 13.9121944 +/ 0.141202 (1.01%) (init= 13)
period: 5.48507044 +/ 0.026664 (0.49%) (init= 2)
shift: 0.16203676 +/ 0.014056 (8.67%) (init= 0)
decay: 0.03264538 +/ 0.000380 (1.16%) (init= 0.02)
[[Correlations]] (unreported correlations are < 0.100)
C(period, shift) = 0.797
C(amp, decay) = 0.582
C(amp, shift) = 0.297
C(amp, period) = 0.243
C(shift, decay) = 0.182
C(period, decay) = 0.150