Built-in Fitting Models in the models module¶
Lmfit provides several built-in fitting models in the models module. These pre-defined models each subclass from the Model class of the previous chapter and wrap relatively well-known functional forms, such as Gaussian, Lorentzian, and Exponential, that are used in a wide range of scientific domains. In fact, all the models are based on simple, plain Python functions defined in the lineshapes module. In addition to wrapping a function into a Model, these models also provide a guess() method that is intended to give a reasonable set of starting values from a data array that closely approximates the data to be fit.
As shown in the previous chapter, a key feature of the Model class is that models can easily be combined to give a composite CompositeModel. Thus, while some of the models listed here may seem pretty trivial (notably, ConstantModel and LinearModel), the main point of having these is to be able to use them in composite models. For example, a Lorentzian plus a linear background might be represented as:
from lmfit.models import LinearModel, LorentzianModel
peak = LorentzianModel()
background = LinearModel()
model = peak + background
Almost all the models listed below are one-dimensional, with an independent variable named x. Many of these models represent a function with a distinct peak, and so share common features. To maintain uniformity, common parameter names are used whenever possible. Thus, most models have a parameter called amplitude that represents the overall intensity (or area) of a peak or function, and a sigma parameter that gives a characteristic width.
After a list of built-in models, a few examples of their use are given.
Peak-like models¶
There are many peak-like models available. These include GaussianModel, LorentzianModel, VoigtModel, PseudoVoigtModel, and some less commonly used variations. Most of these models are unit-normalized and share the same parameter names so that you can easily switch between models and interpret the results. The amplitude parameter is the multiplicative factor for the unit-normalized peak lineshape, and so will represent the strength of that peak or the area under that curve. The center parameter will be the centroid x value. The sigma parameter is the characteristic width of the peak, with many functions using \((x-\mu)/\sigma\) where \(\mu\) is the centroid value. Most of these peak functions will have two additional parameters derived from and constrained by the other parameters. The first of these is fwhm, which will hold the estimated “Full Width at Half Max” for the peak, which is often easier to compare between different models than sigma. The second of these is height, which will contain the maximum value of the peak, typically the value at \(x = \mu\). Finally, each of these models has a guess() method that uses data to make a fairly crude but usually sufficient guess for the values of amplitude, center, and sigma, and sets a lower bound of 0 on the value of sigma.
GaussianModel¶
- class GaussianModel(independent_vars=['x'], prefix='', nan_policy='raise', **kwargs)¶
A model based on a Gaussian or normal distribution lineshape.
The model has three Parameters: amplitude, center, and sigma. In addition, parameters fwhm and height are included as constraints to report full width at half maximum and maximum peak height, respectively.
\[f(x; A, \mu, \sigma) = \frac{A}{\sigma\sqrt{2\pi}} e^{-(x-\mu)^2/{2\sigma^2}}\]where the parameter amplitude corresponds to \(A\), center to \(\mu\), and sigma to \(\sigma\). The full width at half maximum is \(2\sigma\sqrt{2\ln{2}}\), approximately \(2.3548\sigma\).
For more information, see: https://en.wikipedia.org/wiki/Normal_distribution
- Parameters:
- independent_vars (list of str, optional) – Arguments to the model function that are independent variables (default is ['x']).
- prefix (str, optional) – String to prepend to parameter names, needed to add two Models that have parameter names in common.
- nan_policy ({'raise', 'propagate', 'omit'}, optional) – How to handle NaN and missing values in data. See Notes below.
- **kwargs (optional) – Keyword arguments to pass to Model.
Notes
1. nan_policy sets what to do when a NaN or missing value is seen in the data. Should be one of:
‘raise’ : raise a ValueError (default)
‘propagate’ : do nothing
‘omit’ : drop missing data
LorentzianModel¶
- class LorentzianModel(independent_vars=['x'], prefix='', nan_policy='raise', **kwargs)¶
A model based on a Lorentzian or Cauchy-Lorentz distribution function.
The model has three Parameters: amplitude, center, and sigma. In addition, parameters fwhm and height are included as constraints to report full width at half maximum and maximum peak height, respectively.
\[f(x; A, \mu, \sigma) = \frac{A}{\pi} \big[\frac{\sigma}{(x - \mu)^2 + \sigma^2}\big]\]where the parameter amplitude corresponds to \(A\), center to \(\mu\), and sigma to \(\sigma\). The full width at half maximum is \(2\sigma\).
For more information, see: https://en.wikipedia.org/wiki/Cauchy_distribution
- Parameters:
- independent_vars (list of str, optional) – Arguments to the model function that are independent variables (default is ['x']).
- prefix (str, optional) – String to prepend to parameter names, needed to add two Models that have parameter names in common.
- nan_policy ({'raise', 'propagate', 'omit'}, optional) – How to handle NaN and missing values in data. See Notes below.
- **kwargs (optional) – Keyword arguments to pass to Model.
Notes
1. nan_policy sets what to do when a NaN or missing value is seen in the data. Should be one of:
‘raise’ : raise a ValueError (default)
‘propagate’ : do nothing
‘omit’ : drop missing data
SplitLorentzianModel¶
- class SplitLorentzianModel(independent_vars=['x'], prefix='', nan_policy='raise', **kwargs)¶
A model based on a Lorentzian or Cauchy-Lorentz distribution function.
The model has four parameters: amplitude, center, sigma, and sigma_r. In addition, parameters fwhm and height are included as constraints to report full width at half maximum and maximum peak height, respectively.
‘Split’ means that the width of the distribution is different between left and right slopes.
\[f(x; A, \mu, \sigma, \sigma_r) = \frac{2 A}{\pi (\sigma+\sigma_r)} \big[\frac{\sigma^2}{(x - \mu)^2 + \sigma^2} \cdot H(\mu-x) + \frac{\sigma_r^2}{(x - \mu)^2 + \sigma_r^2} \cdot H(x-\mu)\big]\]where the parameter amplitude corresponds to \(A\), center to \(\mu\), sigma to \(\sigma\), sigma_r to \(\sigma_r\), and \(H(x)\) is a Heaviside step function:
\[H(x) = 0 \;\mathrm{for}\; x < 0, \quad 1 \;\mathrm{for}\; x \geq 0\]The full width at half maximum is \(\sigma+\sigma_r\). Just as with the Lorentzian model, the integral of this function from \(-\infty\) to \(+\infty\) equals amplitude.
For more information, see: https://en.wikipedia.org/wiki/Cauchy_distribution
- Parameters:
- independent_vars (list of str, optional) – Arguments to the model function that are independent variables (default is ['x']).
- prefix (str, optional) – String to prepend to parameter names, needed to add two Models that have parameter names in common.
- nan_policy ({'raise', 'propagate', 'omit'}, optional) – How to handle NaN and missing values in data. See Notes below.
- **kwargs (optional) – Keyword arguments to pass to Model.
Notes
1. nan_policy sets what to do when a NaN or missing value is seen in the data. Should be one of:
‘raise’ : raise a ValueError (default)
‘propagate’ : do nothing
‘omit’ : drop missing data
VoigtModel¶
- class VoigtModel(independent_vars=['x'], prefix='', nan_policy='raise', **kwargs)¶
A model based on a Voigt distribution function.
The model has four Parameters: amplitude, center, sigma, and gamma. By default, gamma is constrained to have a value equal to sigma, though it can be varied independently. In addition, parameters fwhm and height are included as constraints to report full width at half maximum and maximum peak height, respectively. The definition for the Voigt function used here is:
\[f(x; A, \mu, \sigma, \gamma) = \frac{A \textrm{Re}[w(z)]}{\sigma\sqrt{2 \pi}}\]where
\begin{eqnarray*} z &=& \frac{x-\mu +i\gamma}{\sigma\sqrt{2}} \\ w(z) &=& e^{-z^2}{\operatorname{erfc}}(-iz) \end{eqnarray*}and erfc() is the complementary error function. As above, amplitude corresponds to \(A\), center to \(\mu\), and sigma to \(\sigma\). The parameter gamma corresponds to \(\gamma\). If gamma is kept at the default value (constrained to sigma), the full width at half maximum is approximately \(3.6013\sigma\).
For more information, see: https://en.wikipedia.org/wiki/Voigt_profile
- Parameters:
- independent_vars (list of str, optional) – Arguments to the model function that are independent variables (default is ['x']).
- prefix (str, optional) – String to prepend to parameter names, needed to add two Models that have parameter names in common.
- nan_policy ({'raise', 'propagate', 'omit'}, optional) – How to handle NaN and missing values in data. See Notes below.
- **kwargs (optional) – Keyword arguments to pass to Model.
Notes
1. nan_policy sets what to do when a NaN or missing value is seen in the data. Should be one of:
‘raise’ : raise a ValueError (default)
‘propagate’ : do nothing
‘omit’ : drop missing data
PseudoVoigtModel¶
- class PseudoVoigtModel(independent_vars=['x'], prefix='', nan_policy='raise', **kwargs)¶
A model based on a pseudo-Voigt distribution function.
This is a weighted sum of a Gaussian and Lorentzian distribution function that share values for amplitude (\(A\)), center (\(\mu\)), and full width at half maximum fwhm (and so have constrained values for sigma (\(\sigma\)) and height). The parameter fraction (\(\alpha\)) controls the relative weight of the Gaussian and Lorentzian components, giving the full definition of:
\[f(x; A, \mu, \sigma, \alpha) = \frac{(1-\alpha)A}{\sigma_g\sqrt{2\pi}} e^{-(x-\mu)^2/{2\sigma_g^2}} + \frac{\alpha A}{\pi} \big[\frac{\sigma}{(x - \mu)^2 + \sigma^2}\big]\]where \(\sigma_g = {\sigma}/{\sqrt{2\ln{2}}}\) so that the full width at half maximum of each component and of the sum is \(2\sigma\). The guess() function always sets the starting value for fraction at 0.5.
For more information, see: https://en.wikipedia.org/wiki/Voigt_profile#Pseudo-Voigt_Approximation
- Parameters:
- independent_vars (list of str, optional) – Arguments to the model function that are independent variables (default is ['x']).
- prefix (str, optional) – String to prepend to parameter names, needed to add two Models that have parameter names in common.
- nan_policy ({'raise', 'propagate', 'omit'}, optional) – How to handle NaN and missing values in data. See Notes below.
- **kwargs (optional) – Keyword arguments to pass to Model.
Notes
1. nan_policy sets what to do when a NaN or missing value is seen in the data. Should be one of:
‘raise’ : raise a ValueError (default)
‘propagate’ : do nothing
‘omit’ : drop missing data
MoffatModel¶
- class MoffatModel(independent_vars=['x'], prefix='', nan_policy='raise', **kwargs)¶
A model based on the Moffat distribution function.
The model has four Parameters: amplitude (\(A\)), center (\(\mu\)), a width parameter sigma (\(\sigma\)), and an exponent beta (\(\beta\)). In addition, parameters fwhm and height are included as constraints to report full width at half maximum and maximum peak height, respectively.
\[f(x; A, \mu, \sigma, \beta) = A \big[(\frac{x-\mu}{\sigma})^2+1\big]^{-\beta}\]The full width at half maximum is \(2\sigma\sqrt{2^{1/\beta}-1}\). The guess() function always sets the starting value for beta to 1. Note that for \(\beta=1\) the Moffat has a Lorentzian shape.
For more information, see: https://en.wikipedia.org/wiki/Moffat_distribution
- Parameters:
- independent_vars (list of str, optional) – Arguments to the model function that are independent variables (default is ['x']).
- prefix (str, optional) – String to prepend to parameter names, needed to add two Models that have parameter names in common.
- nan_policy ({'raise', 'propagate', 'omit'}, optional) – How to handle NaN and missing values in data. See Notes below.
- **kwargs (optional) – Keyword arguments to pass to Model.
Notes
1. nan_policy sets what to do when a NaN or missing value is seen in the data. Should be one of:
‘raise’ : raise a ValueError (default)
‘propagate’ : do nothing
‘omit’ : drop missing data
Pearson4Model¶
- class Pearson4Model(independent_vars=['x'], prefix='', nan_policy='raise', **kwargs)¶
A model based on a Pearson IV distribution.
The model has five parameters: amplitude (\(A\)), center (\(\mu\)), sigma (\(\sigma\)), expon (\(m\)) and skew (\(\nu\)). In addition, parameters fwhm, height and position are included as constraints to report estimates for the approximate full width at half maximum (20% error), the peak height, and the peak position (the position of the maximal function value), respectively. The fwhm value has an error of about 20% in the parameter range expon: (0.5, 1000], skew: [-1000, 1000].
\[f(x;A,\mu,\sigma,m,\nu)=A \frac{\left|\frac{\Gamma(m+i\tfrac{\nu}{2})}{\Gamma(m)}\right|^2}{\sigma\beta(m-\tfrac{1}{2},\tfrac{1}{2})}\left[1+\frac{(x-\mu)^2}{\sigma^2}\right]^{-m}\exp\left(-\nu \arctan\left(\frac{x-\mu}{\sigma}\right)\right)\]where \(\beta\) is the beta function (see scipy.special.beta). The guess() function always gives a starting value of 1.5 for expon, and 0 for skew.
For more information, see: https://en.wikipedia.org/wiki/Pearson_distribution#The_Pearson_type_IV_distribution
- Parameters:
- independent_vars (list of str, optional) – Arguments to the model function that are independent variables (default is ['x']).
- prefix (str, optional) – String to prepend to parameter names, needed to add two Models that have parameter names in common.
- nan_policy ({'raise', 'propagate', 'omit'}, optional) – How to handle NaN and missing values in data. See Notes below.
- **kwargs (optional) – Keyword arguments to pass to Model.
Notes
1. nan_policy sets what to do when a NaN or missing value is seen in the data. Should be one of:
‘raise’ : raise a ValueError (default)
‘propagate’ : do nothing
‘omit’ : drop missing data
Pearson7Model¶
- class Pearson7Model(independent_vars=['x'], prefix='', nan_policy='raise', **kwargs)¶
A model based on a Pearson VII distribution.
The model has four parameters: amplitude (\(A\)), center (\(\mu\)), sigma (\(\sigma\)), and exponent (\(m\)). In addition, parameters fwhm and height are included as constraints to report estimates for the full width at half maximum and maximum peak height, respectively.
\[f(x; A, \mu, \sigma, m) = \frac{A}{\sigma{\beta(m-\frac{1}{2}, \frac{1}{2})}} \bigl[1 + \frac{(x-\mu)^2}{\sigma^2} \bigr]^{-m}\]where \(\beta\) is the beta function (see scipy.special.beta). The guess() function always gives a starting value for exponent of 1.5.
For more information, see: https://en.wikipedia.org/wiki/Pearson_distribution#The_Pearson_type_VII_distribution
- Parameters:
- independent_vars (list of str, optional) – Arguments to the model function that are independent variables (default is ['x']).
- prefix (str, optional) – String to prepend to parameter names, needed to add two Models that have parameter names in common.
- nan_policy ({'raise', 'propagate', 'omit'}, optional) – How to handle NaN and missing values in data. See Notes below.
- **kwargs (optional) – Keyword arguments to pass to Model.
Notes
1. nan_policy sets what to do when a NaN or missing value is seen in the data. Should be one of:
‘raise’ : raise a ValueError (default)
‘propagate’ : do nothing
‘omit’ : drop missing data
StudentsTModel¶
- class StudentsTModel(independent_vars=['x'], prefix='', nan_policy='raise', **kwargs)¶
A model based on a Student’s t-distribution function.
The model has three Parameters: amplitude (\(A\)), center (\(\mu\)), and sigma (\(\sigma\)). In addition, parameters fwhm and height are included as constraints to report full width at half maximum and maximum peak height, respectively.
\[f(x; A, \mu, \sigma) = \frac{A \Gamma(\frac{\sigma+1}{2})} {\sqrt{\sigma\pi}\,\Gamma(\frac{\sigma}{2})} \Bigl[1+\frac{(x-\mu)^2}{\sigma}\Bigr]^{-\frac{\sigma+1}{2}}\]where \(\Gamma(x)\) is the gamma function.
For more information, see: https://en.wikipedia.org/wiki/Student%27s_t-distribution
- Parameters:
- independent_vars (list of str, optional) – Arguments to the model function that are independent variables (default is ['x']).
- prefix (str, optional) – String to prepend to parameter names, needed to add two Models that have parameter names in common.
- nan_policy ({'raise', 'propagate', 'omit'}, optional) – How to handle NaN and missing values in data. See Notes below.
- **kwargs (optional) – Keyword arguments to pass to Model.
Notes
1. nan_policy sets what to do when a NaN or missing value is seen in the data. Should be one of:
‘raise’ : raise a ValueError (default)
‘propagate’ : do nothing
‘omit’ : drop missing data
BreitWignerModel¶
- class BreitWignerModel(independent_vars=['x'], prefix='', nan_policy='raise', **kwargs)¶
A model based on a Breit-Wigner-Fano function.
The model has four Parameters: amplitude (\(A\)), center (\(\mu\)), sigma (\(\sigma\)), and q (\(q\)).
\[f(x; A, \mu, \sigma, q) = \frac{A (q\sigma/2 + x - \mu)^2}{(\sigma/2)^2 + (x - \mu)^2}\]
For more information, see: https://en.wikipedia.org/wiki/Fano_resonance
- Parameters:
- independent_vars (list of str, optional) – Arguments to the model function that are independent variables (default is ['x']).
- prefix (str, optional) – String to prepend to parameter names, needed to add two Models that have parameter names in common.
- nan_policy ({'raise', 'propagate', 'omit'}, optional) – How to handle NaN and missing values in data. See Notes below.
- **kwargs (optional) – Keyword arguments to pass to Model.
Notes
1. nan_policy sets what to do when a NaN or missing value is seen in the data. Should be one of:
‘raise’ : raise a ValueError (default)
‘propagate’ : do nothing
‘omit’ : drop missing data
LognormalModel¶
- class LognormalModel(independent_vars=['x'], prefix='', nan_policy='raise', **kwargs)¶
A model based on the Log-normal distribution function.
The model has three Parameters: amplitude (\(A\)), center (\(\mu\)), and sigma (\(\sigma\)). In addition, parameters fwhm and height are included as constraints to report estimates of full width at half maximum and maximum peak height, respectively.
\[f(x; A, \mu, \sigma) = \frac{A}{\sigma\sqrt{2\pi}}\frac{e^{-(\ln(x) - \mu)^2/ 2\sigma^2}}{x}\]
For more information, see: https://en.wikipedia.org/wiki/Lognormal
- Parameters:
- independent_vars (list of str, optional) – Arguments to the model function that are independent variables (default is ['x']).
- prefix (str, optional) – String to prepend to parameter names, needed to add two Models that have parameter names in common.
- nan_policy ({'raise', 'propagate', 'omit'}, optional) – How to handle NaN and missing values in data. See Notes below.
- **kwargs (optional) – Keyword arguments to pass to Model.
Notes
1. nan_policy sets what to do when a NaN or missing value is seen in the data. Should be one of:
‘raise’ : raise a ValueError (default)
‘propagate’ : do nothing
‘omit’ : drop missing data
DampedOscillatorModel¶
- class DampedOscillatorModel(independent_vars=['x'], prefix='', nan_policy='raise', **kwargs)¶
A model based on the Damped Harmonic Oscillator Amplitude.
The model has three Parameters: amplitude (\(A\)), center (\(\mu\)), and sigma (\(\sigma\)). In addition, the parameter height is included as a constraint to report the maximum peak height.
\[f(x; A, \mu, \sigma) = \frac{A}{\sqrt{ [1 - (x/\mu)^2]^2 + (2\sigma x/\mu)^2}}\]
For more information, see: https://en.wikipedia.org/wiki/Harmonic_oscillator#Amplitude_part
- Parameters:
- independent_vars (list of str, optional) – Arguments to the model function that are independent variables (default is ['x']).
- prefix (str, optional) – String to prepend to parameter names, needed to add two Models that have parameter names in common.
- nan_policy ({'raise', 'propagate', 'omit'}, optional) – How to handle NaN and missing values in data. See Notes below.
- **kwargs (optional) – Keyword arguments to pass to Model.
Notes
1. nan_policy sets what to do when a NaN or missing value is seen in the data. Should be one of:
‘raise’ : raise a ValueError (default)
‘propagate’ : do nothing
‘omit’ : drop missing data
DampedHarmonicOscillatorModel¶
- class DampedHarmonicOscillatorModel(independent_vars=['x'], prefix='', nan_policy='raise', **kwargs)¶
A model based on a variation of the Damped Harmonic Oscillator.
The model follows the definition given in DAVE/PAN (see: https://www.ncnr.nist.gov/dave) and has four Parameters: amplitude (\(A\)), center (\(\mu\)), sigma (\(\sigma\)), and gamma (\(\gamma\)). In addition, parameters fwhm and height are included as constraints to report estimates for full width at half maximum and maximum peak height, respectively.
\[f(x; A, \mu, \sigma, \gamma) = \frac{A\sigma}{\pi [1 - \exp(-x/\gamma)]} \Big[ \frac{1}{(x-\mu)^2 + \sigma^2} - \frac{1}{(x+\mu)^2 + \sigma^2} \Big]\]where \(\gamma = kT\), \(k\) is the Boltzmann constant in \(\mathrm{eV\,K^{-1}}\), and \(T\) is the temperature in \(\mathrm{K}\).
For more information, see: https://en.wikipedia.org/wiki/Harmonic_oscillator
- Parameters:
- independent_vars (list of str, optional) – Arguments to the model function that are independent variables (default is ['x']).
- prefix (str, optional) – String to prepend to parameter names, needed to add two Models that have parameter names in common.
- nan_policy ({'raise', 'propagate', 'omit'}, optional) – How to handle NaN and missing values in data. See Notes below.
- **kwargs (optional) – Keyword arguments to pass to Model.
Notes
1. nan_policy sets what to do when a NaN or missing value is seen in the data. Should be one of:
‘raise’ : raise a ValueError (default)
‘propagate’ : do nothing
‘omit’ : drop missing data
ExponentialGaussianModel¶
- class ExponentialGaussianModel(independent_vars=['x'], prefix='', nan_policy='raise', **kwargs)¶
A model of an Exponentially modified Gaussian distribution.
The model has four Parameters: amplitude (\(A\)), center (\(\mu\)), sigma (\(\sigma\)), and gamma (\(\gamma\)).
\[f(x; A, \mu, \sigma, \gamma) = \frac{A\gamma}{2} \exp\bigl[\gamma({\mu - x + \gamma\sigma^2/2})\bigr] {\operatorname{erfc}}\Bigl(\frac{\mu + \gamma\sigma^2 - x}{\sqrt{2}\sigma}\Bigr)\]where erfc() is the complementary error function.
For more information, see: https://en.wikipedia.org/wiki/Exponentially_modified_Gaussian_distribution
- Parameters:
- independent_vars (list of str, optional) – Arguments to the model function that are independent variables (default is ['x']).
- prefix (str, optional) – String to prepend to parameter names, needed to add two Models that have parameter names in common.
- nan_policy ({'raise', 'propagate', 'omit'}, optional) – How to handle NaN and missing values in data. See Notes below.
- **kwargs (optional) – Keyword arguments to pass to Model.
Notes
1. nan_policy sets what to do when a NaN or missing value is seen in the data. Should be one of:
‘raise’ : raise a ValueError (default)
‘propagate’ : do nothing
‘omit’ : drop missing data
SkewedGaussianModel¶
- class SkewedGaussianModel(independent_vars=['x'], prefix='', nan_policy='raise', **kwargs)¶
A skewed Gaussian model, using a skewed normal distribution.
The model has four Parameters: amplitude (\(A\)), center (\(\mu\)), sigma (\(\sigma\)), and gamma (\(\gamma\)).
\[f(x; A, \mu, \sigma, \gamma) = \frac{A}{\sigma\sqrt{2\pi}} e^{-(x-\mu)^2/{2\sigma^2}} \Bigl\{ 1 + {\operatorname{erf}}\bigl[ \frac{{\gamma}(x-\mu)}{\sigma\sqrt{2}} \bigr] \Bigr\}\]where erf() is the error function.
For more information, see: https://en.wikipedia.org/wiki/Skew_normal_distribution
- Parameters:
- independent_vars (list of str, optional) – Arguments to the model function that are independent variables (default is ['x']).
- prefix (str, optional) – String to prepend to parameter names, needed to add two Models that have parameter names in common.
- nan_policy ({'raise', 'propagate', 'omit'}, optional) – How to handle NaN and missing values in data. See Notes below.
- **kwargs (optional) – Keyword arguments to pass to Model.
Notes
1. nan_policy sets what to do when a NaN or missing value is seen in the data. Should be one of:
‘raise’ : raise a ValueError (default)
‘propagate’ : do nothing
‘omit’ : drop missing data
SkewedVoigtModel¶
- class SkewedVoigtModel(independent_vars=['x'], prefix='', nan_policy='raise', **kwargs)¶
A skewed Voigt model, modified using a skewed normal distribution.
The model has five Parameters: amplitude (\(A\)), center (\(\mu\)), sigma (\(\sigma\)), and gamma (\(\gamma\)), as usual for a Voigt distribution, plus a new Parameter skew.
\[f(x; A, \mu, \sigma, \gamma, \rm{skew}) = {\rm{Voigt}}(x; A, \mu, \sigma, \gamma) \Bigl\{ 1 + {\operatorname{erf}}\bigl[ \frac{{\rm{skew}}(x-\mu)}{\sigma\sqrt{2}} \bigr] \Bigr\}\]where erf() is the error function.
For more information, see: https://en.wikipedia.org/wiki/Skew_normal_distribution
- Parameters:
- independent_vars (list of str, optional) – Arguments to the model function that are independent variables (default is ['x']).
- prefix (str, optional) – String to prepend to parameter names, needed to add two Models that have parameter names in common.
- nan_policy ({'raise', 'propagate', 'omit'}, optional) – How to handle NaN and missing values in data. See Notes below.
- **kwargs (optional) – Keyword arguments to pass to Model.
Notes
1. nan_policy sets what to do when a NaN or missing value is seen in the data. Should be one of:
‘raise’ : raise a ValueError (default)
‘propagate’ : do nothing
‘omit’ : drop missing data
ThermalDistributionModel¶
- class ThermalDistributionModel(independent_vars=['x', 'form'], prefix='', nan_policy='raise', form='bose', **kwargs)¶
Return a thermal distribution function.
Variable form defines the kind of distribution as below with three Parameters: amplitude (\(A\)), center (\(x_0\)), and kt (\(kt\)). The following distributions are available:
‘bose’ : Bose-Einstein distribution (default)
‘maxwell’ : Maxwell-Boltzmann distribution
‘fermi’ : Fermi-Dirac distribution
The functional forms are defined as:
\begin{eqnarray*} & f(x; A, x_0, kt, {\mathrm{form={}'bose{}'}}) & = \frac{1}{A \exp(\frac{x - x_0}{kt}) - 1} \\ & f(x; A, x_0, kt, {\mathrm{form={}'maxwell{}'}}) & = \frac{1}{A \exp(\frac{x - x_0}{kt})} \\ & f(x; A, x_0, kt, {\mathrm{form={}'fermi{}'}}) & = \frac{1}{A \exp(\frac{x - x_0}{kt}) + 1} \end{eqnarray*}
Notes
kt should be defined in the same units as x (\(k_B = 8.617\times10^{-5}\) eV/K).
Set \(kt<0\) to implement the energy-loss convention common in scattering research.
For more information, see: http://hyperphysics.phy-astr.gsu.edu/hbase/quantum/disfcn.html
- Parameters:
- independent_vars (list of str, optional) – Arguments to the model function that are independent variables (default is ['x']).
- prefix (str, optional) – String to prepend to parameter names, needed to add two Models that have parameter names in common.
- nan_policy ({'raise', 'propagate', 'omit'}, optional) – How to handle NaN and missing values in data. See Notes below.
- **kwargs (optional) – Keyword arguments to pass to Model.
Notes
1. nan_policy sets what to do when a NaN or missing value is seen in the data. Should be one of:
‘raise’ : raise a ValueError (default)
‘propagate’ : do nothing
‘omit’ : drop missing data
DoniachModel¶
- class DoniachModel(independent_vars=['x'], prefix='', nan_policy='raise', **kwargs)¶
A model of a Doniach-Sunjic asymmetric lineshape.
This model is used in photo-emission and has four Parameters: amplitude (\(A\)), center (\(\mu\)), sigma (\(\sigma\)), and gamma (\(\gamma\)). In addition, parameter height is included as a constraint to report maximum peak height.
\[f(x; A, \mu, \sigma, \gamma) = \frac{A}{\sigma^{1-\gamma}} \frac{\cos\bigl[\pi\gamma/2 + (1-\gamma) \arctan{\bigl((x - \mu)/\sigma\bigr)}\bigr]} {\bigl[1 + (x-\mu)^2/\sigma^2\bigr]^{(1-\gamma)/2}}\]
For more information, see: https://www.casaxps.com/help_manual/line_shapes.htm
- Parameters:
- independent_vars (list of str, optional) – Arguments to the model function that are independent variables (default is ['x']).
- prefix (str, optional) – String to prepend to parameter names, needed to add two Models that have parameter names in common.
- nan_policy ({'raise', 'propagate', 'omit'}, optional) – How to handle NaN and missing values in data. See Notes below.
- **kwargs (optional) – Keyword arguments to pass to Model.
Notes
1. nan_policy sets what to do when a NaN or missing value is seen in the data. Should be one of:
‘raise’ : raise a ValueError (default)
‘propagate’ : do nothing
‘omit’ : drop missing data
Linear and Polynomial Models¶
These models correspond to polynomials of some degree. Of course, lmfit is a very inefficient way to do linear regression (see numpy.polyfit or scipy.stats.linregress), but these models may be useful as one of many components of a composite model. The SplineModel below corresponds to a cubic spline.
ConstantModel¶
- class ConstantModel(independent_vars=['x'], prefix='', nan_policy='raise', **kwargs)¶
Constant model, with a single Parameter: c.
Note that this is ‘constant’ in the sense of having no dependence on the independent variable x, not in the sense of being non-varying. To be clear, c will be a Parameter that will be varied in the fit (by default, of course).
- Parameters:
- independent_vars (list of str, optional) – Arguments to the model function that are independent variables (default is ['x']).
- prefix (str, optional) – String to prepend to parameter names, needed to add two Models that have parameter names in common.
- nan_policy ({'raise', 'propagate', 'omit'}, optional) – How to handle NaN and missing values in data. See Notes below.
- **kwargs (optional) – Keyword arguments to pass to Model.
Notes
1. nan_policy sets what to do when a NaN or missing value is seen in the data. Should be one of:
‘raise’ : raise a ValueError (default)
‘propagate’ : do nothing
‘omit’ : drop missing data
LinearModel¶
- class LinearModel(independent_vars=['x'], prefix='', nan_policy='raise', **kwargs)¶
Linear model, with two Parameters: intercept and slope.
Defined as:
\[f(x; m, b) = m x + b\]with slope for \(m\) and intercept for \(b\).
- Parameters:
- independent_vars (list of str, optional) – Arguments to the model function that are independent variables (default is ['x']).
- prefix (str, optional) – String to prepend to parameter names, needed to add two Models that have parameter names in common.
- nan_policy ({'raise', 'propagate', 'omit'}, optional) – How to handle NaN and missing values in data. See Notes below.
- **kwargs (optional) – Keyword arguments to pass to Model.
Notes
1. nan_policy sets what to do when a NaN or missing value is seen in the data. Should be one of:
‘raise’ : raise a ValueError (default)
‘propagate’ : do nothing
‘omit’ : drop missing data
QuadraticModel¶
- class QuadraticModel(independent_vars=['x'], prefix='', nan_policy='raise', **kwargs)¶
A quadratic model, with three Parameters: a, b, and c.
Defined as:
\[f(x; a, b, c) = a x^2 + b x + c\]
- Parameters:
- independent_vars (list of str, optional) – Arguments to the model function that are independent variables (default is ['x']).
- prefix (str, optional) – String to prepend to parameter names, needed to add two Models that have parameter names in common.
- nan_policy ({'raise', 'propagate', 'omit'}, optional) – How to handle NaN and missing values in data. See Notes below.
- **kwargs (optional) – Keyword arguments to pass to Model.
Notes
1. nan_policy sets what to do when a NaN or missing value is seen in the data. Should be one of:
‘raise’ : raise a ValueError (default)
‘propagate’ : do nothing
‘omit’ : drop missing data
PolynomialModel¶
- class PolynomialModel(degree=7, independent_vars=['x'], prefix='', nan_policy='raise', **kwargs)¶
A polynomial model with Parameters c0 through c7, with the polynomial degree (at most 7) specified by degree.
\[f(x; c_0, c_1, \ldots, c_7) = \sum_{i=0}^{7} c_i x^i\]
with parameters c0, c1, …, c7. The supplied degree will specify how many of these are actual variable parameters. This uses numpy.polyval for its calculation of the polynomial.
- Parameters:
independent_vars (list of str, optional) – Arguments to the model function that are independent variables (default is [‘x’]).
prefix (str, optional) – String to prepend to parameter names, needed to add two Models that have parameter names in common.
nan_policy ({'raise', 'propagate', 'omit'}, optional) – How to handle NaN and missing values in data. See Notes below.
**kwargs (optional) – Keyword arguments to pass to Model.
Notes
1. nan_policy sets what to do when a NaN or missing value is seen in the data. Should be one of:
‘raise’ : raise a ValueError (default)
‘propagate’ : do nothing
‘omit’ : drop missing data
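The polynomial sum above can be evaluated with Horner's rule, which is essentially what numpy.polyval does internally. A pure-Python sketch of that evaluation (taking coefficients lowest-order first, to match the c0, c1, … parameter names; the real model delegates to NumPy):

```python
def polyval(coeffs, x):
    """Evaluate sum_i c_i * x**i via Horner's rule.

    `coeffs` is [c0, c1, ..., cN], lowest order first, matching the
    naming of the PolynomialModel parameters.  Note that numpy.polyval
    itself expects the opposite (highest-order-first) ordering.
    """
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result

# f(x) = 1 + 2x + 3x^2 at x = 2  ->  1 + 4 + 12 = 17
print(polyval([1, 2, 3], 2.0))  # -> 17.0
```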
SplineModel
¶
- class SplineModel(xknots, independent_vars=['x'], prefix='', nan_policy='raise', **kwargs)¶
A 1-D cubic spline model with a variable number of knots and parameters s0, s1, …, sN, for N knots.
The user must supply a list or ndarray xknots: the x values for the ‘knots’ which control the flexibility of the spline function.
The parameters s0, …, sN (where N is the size of xknots) will correspond to the y values for the spline knots at the x=xknots positions where the highest order derivative will be discontinuous. The resulting curve will not necessarily pass through these knot points, but for finely-spaced knots, the spline parameter values will be very close to the y values of the resulting curve.
The maximum number of knots supported is 300.
Using the guess() method to initialize parameter values is highly recommended.
- Parameters:
xknots (list of floats or ndarray, required) – x-values of knots for spline.
independent_vars (list of str, optional) – Arguments to the model function that are independent variables (default is [‘x’]).
prefix (str, optional) – String to prepend to parameter names, needed to add two Models that have parameter names in common.
nan_policy ({'raise', 'propagate', 'omit'}, optional) – How to handle NaN and missing values in data. See Notes below.
Notes
1. There must be at least 4 knot points, and not more than 300.
2. nan_policy sets what to do when a NaN or missing value is seen in the data. Should be one of:
‘raise’ : raise a ValueError (default)
‘propagate’ : do nothing
‘omit’ : drop missing data
Periodic Models¶
These models correspond to periodic functions.
SineModel
¶
- class SineModel(independent_vars=['x'], prefix='', nan_policy='raise', **kwargs)¶
A model based on a sinusoidal lineshape.
The model has three Parameters: amplitude, frequency, and shift.
\[f(x; A, \phi, f) = A \sin (f x + \phi)\]where the parameter amplitude corresponds to \(A\), frequency to \(f\), and shift to \(\phi\). All are constrained to be non-negative, and shift additionally to be smaller than \(2\pi\).
- Parameters:
independent_vars (list of str, optional) – Arguments to the model function that are independent variables (default is [‘x’]).
prefix (str, optional) – String to prepend to parameter names, needed to add two Models that have parameter names in common.
nan_policy ({'raise', 'propagate', 'omit'}, optional) – How to handle NaN and missing values in data. See Notes below.
**kwargs (optional) – Keyword arguments to pass to Model.
Notes
1. nan_policy sets what to do when a NaN or missing value is seen in the data. Should be one of:
‘raise’ : raise a ValueError (default)
‘propagate’ : do nothing
‘omit’ : drop missing data
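As a concrete check of the formula, the sinusoidal lineshape can be written directly. This is only a scalar sketch; the actual model uses the vectorized function in the lineshapes module:

```python
from math import sin, pi

def sine(x, amplitude=1.0, frequency=1.0, shift=0.0):
    # f(x; A, phi, f) = A * sin(f*x + shift), matching the parameter
    # names amplitude (A), frequency (f), and shift (phi) above
    return amplitude * sin(frequency * x + shift)

# the sine reaches its maximum A where f*x + shift = pi/2
print(sine(pi / 2, amplitude=2.0, frequency=1.0, shift=0.0))  # -> 2.0
```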
Step-like models¶
Two models represent step-like functions, and share many characteristics.
StepModel
¶
- class StepModel(independent_vars=['x', 'form'], prefix='', nan_policy='raise', form='linear', **kwargs)¶
A model based on a Step function.
The model has three Parameters: amplitude (\(A\)), center (\(\mu\)), and sigma (\(\sigma\)).
There are four choices for form:
‘linear’ (default)
‘atan’ or ‘arctan’ for an arc-tangent function
‘erf’ for an error function
‘logistic’ for a logistic function (for more information, see: https://en.wikipedia.org/wiki/Logistic_function)
The step function starts with a value 0 and ends with a value of \(\tt{sign}(\sigma)A\) rising or falling to \(A/2\) at \(\mu\), with \(\sigma\) setting the characteristic width of the step. The functional forms are defined as:
\begin{eqnarray*} & f(x; A, \mu, \sigma, {\mathrm{form={}'linear{}'}}) & = A \min{[1, \max{(0, \alpha + 1/2)}]} \\ & f(x; A, \mu, \sigma, {\mathrm{form={}'arctan{}'}}) & = A [1/2 + \arctan{(\alpha)}/{\pi}] \\ & f(x; A, \mu, \sigma, {\mathrm{form={}'erf{}'}}) & = A [1 + {\operatorname{erf}}(\alpha)]/2 \\ & f(x; A, \mu, \sigma, {\mathrm{form={}'logistic{}'}})& = A \left[1 - \frac{1}{1 + e^{\alpha}} \right] \end{eqnarray*}where \(\alpha = (x - \mu)/{\sigma}\).
Note that \(\sigma > 0\) gives a rising step, while \(\sigma < 0\) gives a falling step.
- Parameters:
independent_vars (list of str, optional) – Arguments to the model function that are independent variables (default is [‘x’]).
prefix (str, optional) – String to prepend to parameter names, needed to add two Models that have parameter names in common.
nan_policy ({'raise', 'propagate', 'omit'}, optional) – How to handle NaN and missing values in data. See Notes below.
**kwargs (optional) – Keyword arguments to pass to Model.
Notes
1. nan_policy sets what to do when a NaN or missing value is seen in the data. Should be one of:
‘raise’ : raise a ValueError (default)
‘propagate’ : do nothing
‘omit’ : drop missing data
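As a cross-check of the four functional forms, here are scalar versions written with only the standard library. This is a sketch for illustration, not the vectorized NumPy code the model actually uses:

```python
from math import atan, erf, exp, pi

def step(x, amplitude=1.0, center=0.0, sigma=1.0, form='linear'):
    """Scalar versions of the four step lineshapes defined above."""
    alpha = (x - center) / sigma
    if form == 'linear':
        return amplitude * min(1.0, max(0.0, alpha + 0.5))
    if form in ('atan', 'arctan'):
        return amplitude * (0.5 + atan(alpha) / pi)
    if form == 'erf':
        return amplitude * (1.0 + erf(alpha)) / 2.0
    if form == 'logistic':
        return amplitude * (1.0 - 1.0 / (1.0 + exp(alpha)))
    raise ValueError(f"unknown form {form!r}")

# every form passes through A/2 at x = center
for form in ('linear', 'atan', 'erf', 'logistic'):
    print(form, step(5.0, amplitude=3.0, center=5.0, form=form))  # -> 1.5
```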
RectangleModel
¶
- class RectangleModel(independent_vars=['x', 'form'], prefix='', nan_policy='raise', form='linear', **kwargs)¶
A model based on a Step-up and Step-down function.
The model has five Parameters: amplitude (\(A\)), center1 (\(\mu_1\)), center2 (\(\mu_2\)), sigma1 (\(\sigma_1\)), and sigma2 (\(\sigma_2\)).
There are four choices for form, which is used for both the Step up and the Step down:
‘linear’ (default)
‘atan’ or ‘arctan’ for an arc-tangent function
‘erf’ for an error function
‘logistic’ for a logistic function (for more information, see: https://en.wikipedia.org/wiki/Logistic_function)
The function starts with a value 0 and transitions to a value of \(A\), taking the value \(A/2\) at \(\mu_1\), with \(\sigma_1\) setting the characteristic width. The function then transitions again to the value \(A/2\) at \(\mu_2\), with \(\sigma_2\) setting the characteristic width. The functional forms are defined as:
\begin{eqnarray*} &f(x; A, \mu, \sigma, {\mathrm{form={}'linear{}'}}) &= A \{ \min{[1, \max{(-1, \alpha_1)}]} + \min{[1, \max{(-1, \alpha_2)}]} \}/2 \\ &f(x; A, \mu, \sigma, {\mathrm{form={}'arctan{}'}}) &= A [\arctan{(\alpha_1)} + \arctan{(\alpha_2)}]/{\pi} \\ &f(x; A, \mu, \sigma, {\mathrm{form={}'erf{}'}}) &= A \left[{\operatorname{erf}}(\alpha_1) + {\operatorname{erf}}(\alpha_2)\right]/2 \\ &f(x; A, \mu, \sigma, {\mathrm{form={}'logistic{}'}}) &= A \left[1 - \frac{1}{1 + e^{\alpha_1}} - \frac{1}{1 + e^{\alpha_2}} \right] \end{eqnarray*}where \(\alpha_1 = (x - \mu_1)/{\sigma_1}\) and \(\alpha_2 = -(x - \mu_2)/{\sigma_2}\).
Note that, unlike a StepModel, \(\sigma_1 > 0\) is enforced, giving a rising initial step, and \(\sigma_2 > 0\) gives a falling final step.
- Parameters:
independent_vars (list of str, optional) – Arguments to the model function that are independent variables (default is [‘x’]).
prefix (str, optional) – String to prepend to parameter names, needed to add two Models that have parameter names in common.
nan_policy ({'raise', 'propagate', 'omit'}, optional) – How to handle NaN and missing values in data. See Notes below.
**kwargs (optional) – Keyword arguments to pass to Model.
Notes
1. nan_policy sets what to do when a NaN or missing value is seen in the data. Should be one of:
‘raise’ : raise a ValueError (default)
‘propagate’ : do nothing
‘omit’ : drop missing data
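The rectangle forms can be checked the same way with scalar standard-library code. Again this is a sketch of the formulas above, not the vectorized implementation:

```python
from math import atan, erf, exp, pi

def rectangle(x, amplitude=1.0, center1=0.0, sigma1=1.0,
              center2=1.0, sigma2=1.0, form='linear'):
    """Scalar versions of the rectangle lineshapes defined above."""
    a1 = (x - center1) / sigma1
    a2 = -(x - center2) / sigma2
    if form == 'linear':
        return amplitude * (min(1, max(-1, a1)) + min(1, max(-1, a2))) / 2
    if form in ('atan', 'arctan'):
        return amplitude * (atan(a1) + atan(a2)) / pi
    if form == 'erf':
        return amplitude * (erf(a1) + erf(a2)) / 2
    if form == 'logistic':
        return amplitude * (1 - 1/(1 + exp(a1)) - 1/(1 + exp(a2)))
    raise ValueError(f"unknown form {form!r}")

# between well-separated edges the function plateaus at A ...
print(rectangle(5.0, amplitude=2.0, center1=0.0, center2=10.0, form='erf'))
# ... and it takes the value A/2 at the first center
print(rectangle(0.0, amplitude=2.0, center1=0.0, center2=10.0, form='linear'))
```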
Exponential and Power law models¶
ExponentialModel
¶
- class ExponentialModel(independent_vars=['x'], prefix='', nan_policy='raise', **kwargs)¶
A model based on an exponential decay function.
The model has two Parameters: amplitude (\(A\)) and decay (\(\tau\)) and is defined as:
\[f(x; A, \tau) = A e^{-x/\tau}\]For more information, see: https://en.wikipedia.org/wiki/Exponential_decay
- Parameters:
independent_vars (list of str, optional) – Arguments to the model function that are independent variables (default is [‘x’]).
prefix (str, optional) – String to prepend to parameter names, needed to add two Models that have parameter names in common.
nan_policy ({'raise', 'propagate', 'omit'}, optional) – How to handle NaN and missing values in data. See Notes below.
**kwargs (optional) – Keyword arguments to pass to Model.
Notes
1. nan_policy sets what to do when a NaN or missing value is seen in the data. Should be one of:
‘raise’ : raise a ValueError (default)
‘propagate’ : do nothing
‘omit’ : drop missing data
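Because \(\log f(x) = \log A - x/\tau\), noise-free exponential data determines \(\tau\) from the logs of any two points. This relation is also a handy way to form an initial guess for decay by hand; a quick sketch:

```python
from math import exp, log

# simulate two noise-free samples of f(x) = A * exp(-x/tau)
A, tau = 5.0, 2.5
x1, x2 = 1.0, 4.0
y1, y2 = A * exp(-x1 / tau), A * exp(-x2 / tau)

# log y1 - log y2 = (x2 - x1) / tau, so:
tau_est = (x2 - x1) / (log(y1) - log(y2))
print(tau_est)  # -> 2.5, recovered exactly for noise-free data
```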
PowerLawModel
¶
- class PowerLawModel(independent_vars=['x'], prefix='', nan_policy='raise', **kwargs)¶
A model based on a Power Law.
The model has two Parameters: amplitude (\(A\)) and exponent (\(k\)) and is defined as:
\[f(x; A, k) = A x^k\]For more information, see: https://en.wikipedia.org/wiki/Power_law
- Parameters:
independent_vars (list of str, optional) – Arguments to the model function that are independent variables (default is [‘x’]).
prefix (str, optional) – String to prepend to parameter names, needed to add two Models that have parameter names in common.
nan_policy ({'raise', 'propagate', 'omit'}, optional) – How to handle NaN and missing values in data. See Notes below.
**kwargs (optional) – Keyword arguments to pass to Model.
Notes
1. nan_policy sets what to do when a NaN or missing value is seen in the data. Should be one of:
‘raise’ : raise a ValueError (default)
‘propagate’ : do nothing
‘omit’ : drop missing data
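Similarly, on a log-log scale a power law is a straight line with slope \(k\), which gives a simple way to estimate the exponent for an initial guess:

```python
from math import log

# simulate two noise-free samples of f(x) = A * x**k
A, k = 3.0, -1.5
x1, x2 = 2.0, 8.0
y1, y2 = A * x1**k, A * x2**k

# log f = log A + k log x, so the slope in log-log space is k:
k_est = (log(y2) - log(y1)) / (log(x2) - log(x1))
print(k_est)  # -> -1.5, recovered exactly for noise-free data
```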
Two dimensional Peak-like models¶
The one example of a two-dimensional peak is a two-dimensional Gaussian.
Gaussian2dModel
¶
- class Gaussian2dModel(independent_vars=['x', 'y'], prefix='', nan_policy='raise', **kwargs)¶
A model based on a two-dimensional Gaussian function.
The model has two independent variables x and y and five Parameters: amplitude, centerx, sigmax, centery, and sigmay. In addition, parameters fwhmx, fwhmy, and height are included as constraints to report the maximum peak height and the two full width at half maxima, respectively.
\[f(x, y; A, \mu_x, \sigma_x, \mu_y, \sigma_y) = A g(x; A=1, \mu_x, \sigma_x) g(y; A=1, \mu_y, \sigma_y)\]where subfunction \(g(x; A, \mu, \sigma)\) is a Gaussian lineshape:
\[g(x; A, \mu, \sigma) = \frac{A}{\sigma\sqrt{2\pi}} e^{[{-{(x-\mu)^2}/{{2\sigma}^2}}]}.\]
- Parameters:
independent_vars (list of str, optional) – Arguments to the model function that are independent variables (default is [‘x’, ‘y’]).
prefix (str, optional) – String to prepend to parameter names, needed to add two Models that have parameter names in common.
nan_policy ({'raise', 'propagate', 'omit'}, optional) – How to handle NaN and missing values in data. See Notes below.
**kwargs (optional) – Keyword arguments to pass to Model.
Notes
1. nan_policy sets what to do when a NaN or missing value is seen in the data. Should be one of:
‘raise’ : raise a ValueError (default)
‘propagate’ : do nothing
‘omit’ : drop missing data
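To make the normalization concrete: since each factor is a unit-area Gaussian evaluated with \(A=1\), the maximum height is \(A/(2\pi\sigma_x\sigma_y)\). A scalar sketch of the product form (the real model uses the vectorized lineshapes):

```python
from math import exp, pi, sqrt

def gaussian(x, amplitude=1.0, center=0.0, sigma=1.0):
    # unit-area Gaussian lineshape g(x; A, mu, sigma)
    return (amplitude / (sigma * sqrt(2 * pi))) * exp(-(x - center)**2 / (2 * sigma**2))

def gaussian2d(x, y, amplitude=1.0, centerx=0.0, sigmax=1.0,
               centery=0.0, sigmay=1.0):
    # product of two unit-amplitude 1-D Gaussians, scaled by amplitude
    return amplitude * gaussian(x, 1.0, centerx, sigmax) * gaussian(y, 1.0, centery, sigmay)

# the reported `height` is the value at the center:
# A / (2 * pi * sigmax * sigmay), here 2 / (2*pi) ~ 0.318
print(gaussian2d(0.0, 0.0, amplitude=2.0))
```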
User-defined Models¶
As shown in the previous chapter (Modeling Data and Curve Fitting), it is fairly straightforward to build fitting models from parametrized Python functions. The number of model classes listed so far in the present chapter should make it clear that this process is not too difficult. Still, it is sometimes desirable to build models from a user-supplied function. This may be especially true if model-building is built into some larger library or application for fitting, in which the user may not be able to easily build and use a new model from Python code.
The ExpressionModel
allows a model to be built from a
user-supplied expression. This uses the asteval module also used for
mathematical constraints as discussed in Using Mathematical Constraints.
ExpressionModel
¶
- class ExpressionModel(expr, independent_vars=None, init_script=None, nan_policy='raise', **kws)¶
ExpressionModel class.
Generate a model from user-supplied expression.
- Parameters:
expr (str) – Mathematical expression for model.
independent_vars (list of str or None, optional) – Variable names to use as independent variables.
init_script (str or None, optional) – Initial script to run in asteval interpreter.
nan_policy ({'raise', 'propagate', 'omit'}, optional) – How to handle NaN and missing values in data. See Notes below.
**kws (optional) – Keyword arguments to pass to Model.
Notes
1. Each instance of ExpressionModel will create and use its own version of an asteval interpreter.
2. prefix is not supported for ExpressionModel.
3. nan_policy sets what to do when a NaN or missing value is seen in the data. Should be one of:
‘raise’ : raise a ValueError (default)
‘propagate’ : do nothing
‘omit’ : drop missing data
Since the point of this model is that an arbitrary expression will be
supplied, the determination of what are the parameter names for the model
happens when the model is created. To do this, the expression is parsed,
and all symbol names are found. Names that are already known (there are
over 500 function and value names in the asteval namespace, including most
Python built-ins, more than 200 functions inherited from NumPy, and more
than 20 common lineshapes defined in the lineshapes
module) are not
converted to parameters. Unrecognized names are expected to be names of either
parameters or independent variables. If independent_vars
is the
default value of None
, and if the expression contains a variable named
x
, that will be used as the independent variable. Otherwise,
independent_vars
must be given.
For example, if one creates an ExpressionModel
as:
from lmfit.models import ExpressionModel
mod = ExpressionModel('off + amp * exp(-x/x0) * sin(x*phase)')
The name exp
will be recognized as the exponential function, so the model
will be interpreted to have parameters named off
, amp
, x0
and
phase
. In addition, x
will be assumed to be the sole independent variable.
In general, there is no obvious way to set default parameter values or
parameter hints for bounds, so this will have to be handled explicitly.
To evaluate this model, you might do the following:
from numpy import exp, linspace, sin
x = linspace(0, 10, 501)
params = mod.make_params(off=0.25, amp=1.0, x0=2.0, phase=0.04)
y = mod.eval(params, x=x)
While many custom models can be built with a single line expression
(especially since the names of the lineshapes like gaussian
, lorentzian
and so on, as well as many NumPy functions, are available), more complex
models will inevitably require multiple line functions. You can include
such Python code with the init_script
argument. The text of this script
is evaluated when the model is initialized (and before the actual
expression is parsed), so that you can define functions to be used
in your expression.
As a probably unphysical example, to make a model that is the derivative of a Gaussian function times the logarithm of a Lorentzian function, you could define this in a script:
script = """
def mycurve(x, amp, cen, sig):
loren = lorentzian(x, amplitude=amp, center=cen, sigma=sig)
gauss = gaussian(x, amplitude=amp, center=cen, sigma=sig)
return log(loren) * gradient(gauss) / gradient(x)
"""
and then use this with ExpressionModel
as:
mod = ExpressionModel('mycurve(x, height, mid, wid)', init_script=script,
                      independent_vars=['x'])
As above, this will interpret the parameter names to be height
, mid
,
and wid
, and build a model that can be used to fit data.
Example 1: Fit Peak data to Gaussian, Lorentzian, and Voigt profiles¶
Here, we will fit data to three similar lineshapes, in order to decide which
might be the better model. We will start with a Gaussian profile, as in
the previous chapter, but use the built-in GaussianModel
instead
of writing one ourselves. This is slightly different from the
previous example in that the parameter names are different, and the
parameters have built-in default values. We will simply use:
from numpy import loadtxt
from lmfit.models import GaussianModel
data = loadtxt('test_peak.dat')
x = data[:, 0]
y = data[:, 1]
mod = GaussianModel()
pars = mod.guess(y, x=x)
out = mod.fit(y, pars, x=x)
print(out.fit_report(min_correl=0.25))
which prints out the results:
[[Model]]
Model(gaussian)
[[Fit Statistics]]
# fitting method = leastsq
# function evals = 25
# data points = 401
# variables = 3
chi-square = 29.9943157
reduced chi-square = 0.07536260
Akaike info crit = -1033.77437
Bayesian info crit = -1021.79248
R-squared = 0.99045513
[[Variables]]
amplitude: 30.3135789 +/- 0.15712752 (0.52%) (init = 43.62238)
center: 9.24277046 +/- 0.00737497 (0.08%) (init = 9.25)
sigma: 1.23218496 +/- 0.00737506 (0.60%) (init = 1.35)
fwhm: 2.90157379 +/- 0.01736695 (0.60%) == '2.3548200*sigma'
height: 9.81457271 +/- 0.05087308 (0.52%) == '0.3989423*amplitude/max(1e-15, sigma)'
[[Correlations]] (unreported correlations are < 0.250)
C(amplitude, sigma) = +0.5774
We see a few interesting differences from the results of the previous
chapter. First, the parameter names are longer. Second, there are fwhm
and height
parameters, to give the full-width-at-half-maximum and
maximum peak height, respectively. And third, the automated initial guesses
are pretty good. A plot of the fit:
shows a decent match to the data – the fit worked with no explicit setting
of initial parameter values. Looking more closely, the fit is not perfect,
especially in the tails of the peak, suggesting that a different peak
shape, with longer tails, should be used. Perhaps a Lorentzian would be
better? To do this, we simply replace GaussianModel
with
LorentzianModel
:
from lmfit.models import LorentzianModel
mod = LorentzianModel()
with the rest of the script as above. Perhaps predictably, this first attempt gives worse results, as comparing the fit statistics shows:
[[Model]]
Model(lorentzian)
[[Fit Statistics]]
# fitting method = leastsq
# function evals = 25
# data points = 401
# variables = 3
chi-square = 53.7535387
reduced chi-square = 0.13505914
Akaike info crit = -799.830322
Bayesian info crit = -787.848438
R-squared = 0.98289441
[[Variables]]
amplitude: 38.9726380 +/- 0.31386754 (0.81%) (init = 54.52798)
center: 9.24439393 +/- 0.00927645 (0.10%) (init = 9.25)
sigma: 1.15483177 +/- 0.01315708 (1.14%) (init = 1.35)
fwhm: 2.30966354 +/- 0.02631416 (1.14%) == '2.0000000*sigma'
height: 10.7421504 +/- 0.08634317 (0.80%) == '0.3183099*amplitude/max(1e-15, sigma)'
[[Correlations]] (unreported correlations are < 0.250)
C(amplitude, sigma) = +0.7087
and also by visual inspection of the fit to the data (figure below).
The tails are now too big, and the value for \(\chi^2\) almost doubled.
A Voigt model does a better job. Using VoigtModel
, this is as simple as using:
from lmfit.models import VoigtModel
mod = VoigtModel()
with all the rest of the script as above. This gives:
[[Model]]
Model(voigt)
[[Fit Statistics]]
# fitting method = leastsq
# function evals = 25
# data points = 401
# variables = 3
chi-square = 14.5448627
reduced chi-square = 0.03654488
Akaike info crit = -1324.00615
Bayesian info crit = -1312.02427
R-squared = 0.99537150
[[Variables]]
amplitude: 35.7553799 +/- 0.13861559 (0.39%) (init = 65.43358)
center: 9.24411179 +/- 0.00505496 (0.05%) (init = 9.25)
sigma: 0.73015485 +/- 0.00368473 (0.50%) (init = 0.8775)
gamma: 0.73015485 +/- 0.00368473 (0.50%) == 'sigma'
fwhm: 2.62949983 +/- 0.01326979 (0.50%) == '1.0692*gamma+sqrt(0.8664*gamma**2+5.545083*sigma**2)'
height: 10.2204068 +/- 0.03959933 (0.39%) == '(amplitude/(max(1e-15, sigma*sqrt(2*pi))))*real(wofz((1j*gamma)/(max(1e-15, sigma*sqrt(2)))))'
[[Correlations]] (unreported correlations are < 0.250)
C(amplitude, sigma) = +0.6513
which has a much better value for \(\chi^2\) and the other goodness-of-fit measures, and an obviously better match to the data as seen in the figure below (left).
Fit to peak with Voigt model (left) and Voigt model with gamma
varying independently of sigma
(right).
Can we do better? The Voigt function has a \(\gamma\) parameter
(gamma
) that can be distinct from sigma
. The default behavior used
above constrains gamma
to have exactly the same value as sigma
. If
we allow these to vary separately, does the fit improve? To do this, we
have to change the gamma
parameter from a constrained expression and
give it a starting value using something like:
mod = VoigtModel()
pars = mod.guess(y, x=x)
pars['gamma'].set(value=0.7, vary=True, expr='')
which gives:
[[Model]]
Model(voigt)
[[Fit Statistics]]
# fitting method = leastsq
# function evals = 26
# data points = 401
# variables = 4
chi-square = 10.9301767
reduced chi-square = 0.02753193
Akaike info crit = -1436.57602
Bayesian info crit = -1420.60017
R-squared = 0.99652177
[[Variables]]
amplitude: 34.1914716 +/- 0.17946974 (0.52%) (init = 65.43358)
center: 9.24374846 +/- 0.00441904 (0.05%) (init = 9.25)
sigma: 0.89518951 +/- 0.01415479 (1.58%) (init = 0.8775)
gamma: 0.52540156 +/- 0.01857994 (3.54%) (init = 0.7)
fwhm: 2.72573678 +/- 0.01363994 (0.50%) == '1.0692*gamma+sqrt(0.8664*gamma**2+5.545083*sigma**2)'
height: 10.0872197 +/- 0.03482126 (0.35%) == '(amplitude/(max(1e-15, sigma*sqrt(2*pi))))*real(wofz((1j*gamma)/(max(1e-15, sigma*sqrt(2)))))'
[[Correlations]] (unreported correlations are < 0.250)
C(sigma, gamma) = -0.9285
C(amplitude, gamma) = +0.8210
C(amplitude, sigma) = -0.6512
and the fit shown on the right above.
Comparing the two fits with the Voigt function, we see that \(\chi^2\)
is definitely improved with a separately varying gamma
parameter. In
addition, the two values for gamma
and sigma
differ significantly
– well outside the estimated uncertainties. More compelling, reduced
\(\chi^2\) is improved even though a fourth variable has been added to
the fit. In the simplest statistical sense, this suggests that gamma
is a significant variable in the model. In addition, we can use either the
Akaike or the Bayesian Information Criterion (see
Akaike and Bayesian Information Criteria) to assess how much more likely the model with
variable gamma is to explain the data than the model with gamma
fixed to the value of sigma. According to theory,
\(\exp(-(\rm{AIC1}-\rm{AIC0})/2)\) gives the probability that a model with
AIC1 is more likely than a model with AIC0. For the two models here, with
AIC values of -1436 and -1324 (note: had we more carefully set the value
for weights based on the noise in the data, these values might be
positive, but their difference would be roughly the same), this says that
the model with gamma fixed to sigma has a probability of less than 5.e-25
of being the better model.
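The arithmetic behind that probability is worth making explicit:

```python
from math import exp

# AIC for the variable-gamma fit and the gamma-fixed-to-sigma fit,
# taken from the two fit reports above
aic_varying, aic_fixed = -1436, -1324

# probability that the gamma-fixed model is the more likely one:
# exp(-(AIC1 - AIC0)/2) with AIC1 the fixed model, AIC0 the varying one
prob = exp(-(aic_fixed - aic_varying) / 2)
print(prob)  # exp(-56), about 4.8e-25, i.e. less than 5e-25
```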
Example 2: Fit data to a Composite Model with pre-defined models¶
Here, we repeat the point made at the end of the last chapter that
instances of the Model
class can be added together to make a
composite model. By using the large number of built-in models available,
it is therefore very simple to build models that contain multiple peaks and
various backgrounds. An example of a simple fit to a noisy step function
plus a constant:
# <examples/doc_builtinmodels_stepmodel.py>
import matplotlib.pyplot as plt
import numpy as np
from lmfit.models import LinearModel, StepModel
x = np.linspace(0, 10, 201)
y = np.ones_like(x)
y[:48] = 0.0
y[48:77] = np.arange(77-48)/(77.0-48)
np.random.seed(0)
y = 110.2 * (y + 9e-3*np.random.randn(x.size)) + 12.0 + 2.22*x
step_mod = StepModel(form='erf', prefix='step_')
line_mod = LinearModel(prefix='line_')
pars = line_mod.make_params(intercept=y.min(), slope=0)
pars += step_mod.guess(y, x=x, center=2.5)
mod = step_mod + line_mod
out = mod.fit(y, pars, x=x)
print(out.fit_report())
plt.plot(x, y)
plt.plot(x, out.init_fit, '--', label='initial fit')
plt.plot(x, out.best_fit, '-', label='best fit')
plt.legend()
plt.show()
# <end examples/doc_builtinmodels_stepmodel.py>
After constructing step-like data, we first create a StepModel,
telling it to use the erf form (see details above), and a
LinearModel. We set initial values, in one case using the data
and the guess() method for the initial step function parameters, and
make_params() arguments for the linear component.
After making a composite model, we run fit() and report the
results, which gives:
[[Model]]
(Model(step, prefix='step_', form='erf') + Model(linear, prefix='line_'))
[[Fit Statistics]]
# fitting method = leastsq
# function evals = 55
# data points = 201
# variables = 5
chi-square = 593.709621
reduced chi-square = 3.02913072
Akaike info crit = 227.700173
Bayesian info crit = 244.216698
R-squared = 0.99897798
[[Variables]]
line_slope: 1.87162051 +/- 0.09318576 (4.98%) (init = 0)
line_intercept: 12.0964556 +/- 0.27606000 (2.28%) (init = 11.58574)
step_amplitude: 112.858605 +/- 0.65391558 (0.58%) (init = 134.7378)
step_center: 3.13494787 +/- 0.00516601 (0.16%) (init = 2.5)
step_sigma: 0.67393526 +/- 0.01091153 (1.62%) (init = 1)
[[Correlations]] (unreported correlations are < 0.100)
C(line_slope, step_amplitude) = -0.8791
C(step_amplitude, step_sigma) = +0.5643
C(line_slope, step_sigma) = -0.4569
C(line_intercept, step_center) = +0.4269
C(line_slope, line_intercept) = -0.3093
C(line_slope, step_center) = -0.2338
C(line_intercept, step_sigma) = -0.1372
C(line_intercept, step_amplitude) = -0.1173
C(step_amplitude, step_center) = +0.1095
with a plot of the data, initial fit, and best fit:
Example 3: Fitting Multiple Peaks – and using Prefixes¶
As shown above, many of the models have similar parameter names. For
composite models, this could lead to a problem of having parameters for
different parts of the model having the same name. To overcome this, each
Model
can have a prefix
attribute (normally set to a blank
string) that will be put at the beginning of each parameter name. To
illustrate, we fit one of the classic datasets from the NIST StRD suite
involving a decaying exponential and two Gaussians.
# <examples/doc_builtinmodels_nistgauss.py>
import matplotlib.pyplot as plt
import numpy as np
from lmfit.models import ExponentialModel, GaussianModel
dat = np.loadtxt('NIST_Gauss2.dat')
x = dat[:, 1]
y = dat[:, 0]
exp_mod = ExponentialModel(prefix='exp_')
pars = exp_mod.guess(y, x=x)
gauss1 = GaussianModel(prefix='g1_')
pars.update(gauss1.make_params(center=dict(value=105, min=75, max=125),
                               sigma=dict(value=15, min=0),
                               amplitude=dict(value=2000, min=0)))
gauss2 = GaussianModel(prefix='g2_')
pars.update(gauss2.make_params(center=dict(value=155, min=125, max=175),
                               sigma=dict(value=15, min=0),
                               amplitude=dict(value=2000, min=0)))
mod = gauss1 + gauss2 + exp_mod
init = mod.eval(pars, x=x)
out = mod.fit(y, pars, x=x)
print(out.fit_report(correl_mode='table'))
fig, axes = plt.subplots(1, 2, figsize=(12.8, 4.8))
axes[0].plot(x, y)
axes[0].plot(x, init, '--', label='initial fit')
axes[0].plot(x, out.best_fit, '-', label='best fit')
axes[0].legend()
comps = out.eval_components(x=x)
axes[1].plot(x, y)
axes[1].plot(x, comps['g1_'], '--', label='Gaussian component 1')
axes[1].plot(x, comps['g2_'], '--', label='Gaussian component 2')
axes[1].plot(x, comps['exp_'], '--', label='Exponential component')
axes[1].legend()
plt.show()
# <end examples/doc_builtinmodels_nistgauss.py>
where we give a separate prefix to each model (they all have an
amplitude
parameter). The prefix
values are attached transparently
to the models.
Note that the calls to make_params() used the bare names, without the
prefix. We could have used the prefixes, but because we used the
individual models gauss1 and gauss2, there was no need.
Note also in the example here that we explicitly set bounds on many of the parameter values.
The fit results printed out are:
[[Model]]
((Model(gaussian, prefix='g1_') + Model(gaussian, prefix='g2_')) + Model(exponential, prefix='exp_'))
[[Fit Statistics]]
# fitting method = leastsq
# function evals = 46
# data points = 250
# variables = 8
chi-square = 1247.52821
reduced chi-square = 5.15507524
Akaike info crit = 417.864631
Bayesian info crit = 446.036318
R-squared = 0.99648654
[[Variables]]
exp_amplitude: 99.0183278 +/- 0.53748593 (0.54%) (init = 162.2102)
exp_decay: 90.9508853 +/- 1.10310778 (1.21%) (init = 93.24905)
g1_amplitude: 4257.77360 +/- 42.3836478 (1.00%) (init = 2000)
g1_center: 107.030956 +/- 0.15006851 (0.14%) (init = 105)
g1_sigma: 16.6725772 +/- 0.16048381 (0.96%) (init = 15)
g1_fwhm: 39.2609181 +/- 0.37791049 (0.96%) == '2.3548200*g1_sigma'
g1_height: 101.880230 +/- 0.59217173 (0.58%) == '0.3989423*g1_amplitude/max(1e-15, g1_sigma)'
g2_amplitude: 2493.41735 +/- 36.1697789 (1.45%) (init = 2000)
g2_center: 153.270102 +/- 0.19466802 (0.13%) (init = 155)
g2_sigma: 13.8069464 +/- 0.18679695 (1.35%) (init = 15)
g2_fwhm: 32.5128735 +/- 0.43987320 (1.35%) == '2.3548200*g2_sigma'
g2_height: 72.0455941 +/- 0.61722243 (0.86%) == '0.3989423*g2_amplitude/max(1e-15, g2_sigma)'
[[Correlations]] (unreported correlations are < 0.100)
C(g1_amplitude, g1_sigma) = +0.8243
C(g2_amplitude, g2_sigma) = +0.8154
C(exp_amplitude, exp_decay) = -0.6946
C(g1_sigma, g2_center) = +0.6842
C(g1_center, g2_amplitude) = -0.6689
C(g1_center, g2_sigma) = -0.6520
C(g1_amplitude, g2_center) = +0.6477
C(g1_center, g2_center) = +0.6205
C(g1_center, g1_sigma) = +0.5075
C(exp_decay, g1_amplitude) = -0.5074
C(g1_sigma, g2_amplitude) = -0.4915
C(g2_center, g2_sigma) = -0.4889
C(g1_sigma, g2_sigma) = -0.4826
C(g2_amplitude, g2_center) = -0.4763
C(exp_decay, g2_amplitude) = -0.4270
C(g1_amplitude, g1_center) = +0.4183
C(g1_amplitude, g2_sigma) = -0.4010
C(g1_amplitude, g2_amplitude) = -0.3071
C(exp_amplitude, g2_amplitude) = +0.2821
C(exp_decay, g1_sigma) = -0.2520
C(exp_decay, g2_sigma) = -0.2329
C(exp_amplitude, g2_sigma) = +0.1714
C(exp_decay, g2_center) = -0.1514
C(exp_amplitude, g1_amplitude) = +0.1478
C(exp_decay, g1_center) = +0.1055
We get a very good fit to this problem (described at the NIST site as of average difficulty, but the tests there are generally deliberately challenging) by applying reasonable initial guesses and putting modest but explicit bounds on the parameter values. The overall fit is shown on the left, with its individual components displayed on the right:
One final point on setting initial values. From looking at the data
itself, we can see the two Gaussian peaks are reasonably well separated but
do overlap. Furthermore, we can tell that the initial guess for the
decaying exponential component was poorly estimated because we used the
full data range. We can improve the initial parameter estimates by using
only the relevant portion of the data for each guess, defining an index_of()
function to limit the data range.
That is, with:
def index_of(arrval, value):
    """Return index of array *at or below* value."""
    if value < min(arrval):
        return 0
    return max(np.where(arrval <= value)[0])
ix1 = index_of(x, 75)
ix2 = index_of(x, 135)
ix3 = index_of(x, 175)
exp_mod.guess(y[:ix1], x=x[:ix1])
gauss1.guess(y[ix1:ix2], x=x[ix1:ix2])
gauss2.guess(y[ix2:ix3], x=x[ix2:ix3])
we can get a better initial estimate (see below).
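For sorted x data, the same index_of() logic can be written with the standard-library bisect module instead of NumPy. This is a sketch assuming arrval is sorted ascending, as the x data here is:

```python
from bisect import bisect_right

def index_of(arrval, value):
    """Return index of sorted array *at or below* value
    (stdlib equivalent of the NumPy version above)."""
    if value < arrval[0]:
        return 0
    return bisect_right(arrval, value) - 1

x = [0.0, 1.0, 2.0, 3.0, 4.0]
print(index_of(x, 2.5))   # -> 2
print(index_of(x, 3.0))   # -> 3  (value equal to an element is "at")
print(index_of(x, -1.0))  # -> 0
```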
The fit converges to the same answer, giving identical values (to the precision printed in the report), but in fewer steps, and without any bounds on parameters at all:
[[Model]]
((Model(gaussian, prefix='g1_') + Model(gaussian, prefix='g2_')) + Model(exponential, prefix='exp_'))
[[Fit Statistics]]
# fitting method = leastsq
# function evals = 37
# data points = 250
# variables = 8
chi-square = 1247.52821
reduced chi-square = 5.15507524
Akaike info crit = 417.864631
Bayesian info crit = 446.036318
R-squared = 0.99648654
[[Variables]]
exp_amplitude: 99.0183265 +/- 0.53748764 (0.54%) (init = 94.53724)
exp_decay: 90.9508884 +/- 1.10310753 (1.21%) (init = 111.1985)
g1_amplitude: 4257.77384 +/- 42.3839276 (1.00%) (init = 3189.648)
g1_center: 107.030957 +/- 0.15006934 (0.14%) (init = 106.5)
g1_sigma: 16.6725783 +/- 0.16048220 (0.96%) (init = 14.5)
g1_fwhm: 39.2609209 +/- 0.37790669 (0.96%) == '2.3548200*g1_sigma'
g1_height: 101.880228 +/- 0.59216965 (0.58%) == '0.3989423*g1_amplitude/max(1e-15, g1_sigma)'
g2_amplitude: 2493.41698 +/- 36.1699974 (1.45%) (init = 2818.337)
g2_center: 153.270103 +/- 0.19466966 (0.13%) (init = 150)
g2_sigma: 13.8069440 +/- 0.18680331 (1.35%) (init = 15)
g2_fwhm: 32.5128679 +/- 0.43988818 (1.35%) == '2.3548200*g2_sigma'
g2_height: 72.0455954 +/- 0.61722288 (0.86%) == '0.3989423*g2_amplitude/max(1e-15, g2_sigma)'
[[Correlations]] (unreported correlations are < 0.100)
C(g1_amplitude, g1_sigma) = +0.8243
C(g2_amplitude, g2_sigma) = +0.8154
C(exp_amplitude, exp_decay) = -0.6946
C(g1_sigma, g2_center) = +0.6842
C(g1_center, g2_amplitude) = -0.6689
C(g1_center, g2_sigma) = -0.6521
C(g1_amplitude, g2_center) = +0.6477
C(g1_center, g2_center) = +0.6205
C(g1_center, g1_sigma) = +0.5075
C(exp_decay, g1_amplitude) = -0.5074
C(g1_sigma, g2_amplitude) = -0.4914
C(g2_center, g2_sigma) = -0.4890
C(g1_sigma, g2_sigma) = -0.4826
C(g2_amplitude, g2_center) = -0.4763
C(exp_decay, g2_amplitude) = -0.4270
C(g1_amplitude, g1_center) = +0.4183
C(g1_amplitude, g2_sigma) = -0.4011
C(g1_amplitude, g2_amplitude) = -0.3071
C(exp_amplitude, g2_amplitude) = +0.2821
C(exp_decay, g1_sigma) = -0.2520
C(exp_decay, g2_sigma) = -0.2329
C(exp_amplitude, g2_sigma) = +0.1714
C(exp_decay, g2_center) = -0.1514
C(exp_amplitude, g1_amplitude) = +0.1478
C(exp_decay, g1_center) = +0.1055
This script is in the file doc_builtinmodels_nistgauss2.py
in the examples folder,
and the figure above shows an improved initial estimate of the data.
Example 4: Using a Spline Model¶
In the example above, the two peaks might represent the interesting part of the data, and the exponential decay could be viewed as a “background”, perhaps due to other physical effects or to the response of the instrumentation used to make the measurement. That is, the background might be well understood to be modeled as an exponential decay, as in the example above, and so is easily included in the full analysis. As the results above show, there is some (but not huge) correlation between the peak amplitudes and the decay of the exponential function. That means it is helpful to include all of those components in a single fit, as the uncertainties in the peak amplitudes (which would be interpreted as “line strength” or “area”) will reflect some of the uncertainty in how well we modeled the background.
Sometimes a background is more complex or at least has a less obvious
functional form. In these cases, it can be useful to use a spline to
model part of the curve. For completeness: a spline is a piecewise
continuous polynomial function (typically made of cubic polynomials) that
has a series of x values known as “knots” at which the highest-order
derivative is allowed to be discontinuous. By adding more knots, the
spline function gains more flexibility to follow a particular function.
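The effect of adding knots can be sketched outside of lmfit with SciPy’s LSQUnivariateSpline, which fits a least-squares cubic spline with explicit interior knots. The data and knot positions below are made up purely for illustration; the finer knot set is chosen to contain the coarse one, so the fitted residual cannot grow:

```python
import numpy as np
from scipy.interpolate import LSQUnivariateSpline

x = np.linspace(0, 10, 201)
y = np.sin(x) + 0.1 * x            # made-up smooth "background"

# a coarse knot set, and a finer set that contains it
knots_coarse = [2.5, 5.0, 7.5]
knots_fine = [1.25, 2.5, 3.75, 5.0, 6.25, 7.5, 8.75]

spl_coarse = LSQUnivariateSpline(x, y, knots_coarse, k=3)
spl_fine = LSQUnivariateSpline(x, y, knots_fine, k=3)

# more knots -> more flexibility -> least-squares residual cannot increase
print(spl_fine.get_residual() <= spl_coarse.get_residual())  # True
```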
As an example (see the example file “doc_builtinmodels_splinemodel.py”), we start with data with a single peak and a background that is hard to characterize clearly: it is not a simple decay, and it has some oscillatory structure.
import numpy as np
import matplotlib.pyplot as plt
from lmfit.models import SplineModel, GaussianModel
data = np.loadtxt('test_splinepeak.dat')
x = data[:, 0]
y = data[:, 1]
plt.plot(x, y, label='data')
plt.legend()
plt.show()
which shows (figure below):
There is definitely a peak there, so we could start by building a model for a Gaussian peak, say with:
model = GaussianModel(prefix='peak_')
params = model.make_params(amplitude=8, center=16, sigma=1)
To account for that changing background, we’ll use a spline, but we need to
know where to put the “knots”. Picking points away from the peak makes
sense – we don’t want to fit the peak – but we want the spline to have some
flexibility near the peak. Let’s try spacing knot points at x=1, 3, ..., 13,
then skip over the peak at around x=16, and then pick up knot points
at x=19, 21, 23, 25.
knot_xvals = np.array([1, 3, 5, 7, 9, 11, 13, 19, 21, 23, 25])
bkg = SplineModel(prefix='bkg_', xknots=knot_xvals)
params.update(bkg.guess(y, x))
Note that we used bkg.guess() to guess the initial values of the spline
parameters and then updated the params Parameters object with these 11
parameters to account for the spline. These will be very close to the y
values at the knot x values. The precise definition of the spline knot
parameters is not “the y-values through which the resulting spline curve
goes”, but these values are pretty good estimates of the resulting spline
values. You’ll see below that these initial values are close.
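One way to see why such knot estimates land close to the data is that simply interpolating y at the knot x positions already gives good starting values for a smooth background. This standalone sketch uses made-up smooth data rather than the data file above:

```python
import numpy as np

x = np.linspace(0, 25, 251)
y = 3.0 + 0.1 * x                  # made-up smooth background

knot_xvals = np.array([1, 3, 5, 7, 9, 11, 13, 19, 21, 23, 25])
knot_init = np.interp(knot_xvals, x, y)   # y interpolated at the knots

# for smooth data these are excellent starting values
print(np.allclose(knot_init, 3.0 + 0.1 * knot_xvals))  # True
```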
With a spline background defined, we can create a composite model, and run a fit.
model = model + bkg
params['peak_amplitude'].min = 0
params['peak_center'].min = 10
params['peak_center'].max = 20
out = model.fit(y, params, x=x)
print(out.fit_report(min_correl=0.3))
You’ll see that we first set some “sanity bounds” on the peak parameters to prevent the fit from going completely wrong. That is not really necessary in this case, but it is often a reasonable thing to do; the general advice is to be generous with the bounds, not overly restrictive.
This fit will print out a report of
[[Model]]
(Model(gaussian, prefix='peak_') + Model(spline_model, prefix='bkg_'))
[[Fit Statistics]]
# fitting method = leastsq
# function evals = 92
# data points = 501
# variables = 14
chi-square = 52.6611549
reduced chi-square = 0.10813379
Akaike info crit = -1100.61674
Bayesian info crit = -1041.58425
R-squared = 0.94690612
[[Variables]]
peak_amplitude: 12.2231135 +/- 0.29554108 (2.42%) (init = 8)
peak_center: 16.4280869 +/- 0.01091051 (0.07%) (init = 16)
peak_sigma: 0.72096400 +/- 0.01336667 (1.85%) (init = 1)
peak_fwhm: 1.69774046 +/- 0.03147610 (1.85%) == '2.3548200*peak_sigma'
peak_height: 6.76360674 +/- 0.09854044 (1.46%) == '0.3989423*peak_amplitude/max(1e-15, peak_sigma)'
bkg_s0: 3.51175736 +/- 0.04941392 (1.41%) (init = 3.787995)
bkg_s1: 3.72930068 +/- 0.09558236 (2.56%) (init = 3.959487)
bkg_s2: 4.26846495 +/- 0.12650286 (2.96%) (init = 4.384009)
bkg_s3: 4.42375490 +/- 0.10170203 (2.30%) (init = 4.431971)
bkg_s4: 4.49590448 +/- 0.10615552 (2.36%) (init = 4.243976)
bkg_s5: 3.96515315 +/- 0.09336555 (2.35%) (init = 4.115153)
bkg_s6: 3.35531899 +/- 0.12669985 (3.78%) (init = 3.965325)
bkg_s7: 2.89909752 +/- 0.16190211 (5.58%) (init = 2.788437)
bkg_s8: 2.82656963 +/- 0.13445495 (4.76%) (init = 2.984317)
bkg_s9: 3.43338680 +/- 0.15987281 (4.66%) (init = 3.383491)
bkg_s10: 3.73024843 +/- 0.12096865 (3.24%) (init = 3.791937)
[[Correlations]] (unreported correlations are < 0.300)
C(bkg_s7, bkg_s8) = -0.8192
C(peak_amplitude, peak_sigma) = +0.7987
C(bkg_s8, bkg_s9) = -0.7063
C(bkg_s5, bkg_s6) = -0.6950
C(peak_amplitude, bkg_s7) = -0.6878
C(bkg_s2, bkg_s3) = -0.6672
C(bkg_s9, bkg_s10) = -0.6060
C(bkg_s3, bkg_s4) = -0.5743
C(bkg_s1, bkg_s2) = -0.5646
C(bkg_s4, bkg_s5) = -0.5542
C(bkg_s7, bkg_s9) = +0.5216
C(peak_sigma, bkg_s7) = -0.5193
C(peak_amplitude, bkg_s8) = +0.5185
C(bkg_s0, bkg_s1) = +0.4448
C(peak_sigma, bkg_s8) = +0.3733
C(peak_center, bkg_s6) = +0.3599
C(bkg_s4, bkg_s6) = +0.3597
C(bkg_s0, bkg_s2) = -0.3595
C(bkg_s2, bkg_s4) = +0.3504
C(bkg_s8, bkg_s10) = +0.3455
C(bkg_s6, bkg_s7) = -0.3332
C(peak_center, bkg_s7) = -0.3301
C(peak_amplitude, bkg_s9) = -0.3206
From this we can make a few observations. First, the correlation between the spline parameters and the peak parameters is noticeable but not extremely high – that’s good, and the estimated uncertainties do account for this correlation. The spline components are correlated with each other (especially with their neighboring, N-1 and N+1, spline parameters). Second, we can see that the initial values for the background spline parameters were pretty good.
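The correlation values reported above are just the off-diagonal elements of the fit’s covariance matrix normalized by the parameter uncertainties; a small standalone sketch with a made-up 2x2 covariance matrix:

```python
import numpy as np

# made-up covariance matrix for two parameters (illustration only):
# corr_ij = cov_ij / sqrt(cov_ii * cov_jj)
cov = np.array([[4.0, 3.0],
                [3.0, 9.0]])
d = np.sqrt(np.diag(cov))        # parameter standard errors: 2.0 and 3.0
corr = cov / np.outer(d, d)      # correlation matrix

print(round(corr[0, 1], 4))      # 0.5
```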
We can plot the results and fit components with
comps = out.eval_components()
plt.plot(x, out.best_fit, label='best fit')
plt.plot(x, comps['bkg_'], label='background')
plt.plot(x, comps['peak_'], label='peak')
plt.legend()
which will generate the plot shown below:
If you’re interested in seeing the locations of the knots, you might do
knot_yvals = np.array([o.value for o in out.params.values() if o.name.startswith('bkg')])
plt.plot(knot_xvals, knot_yvals, 'o', color='black', label='spline knots values')
which will generate the plot shown below:
You might be interested in assessing what impact the selection of knots has on the resulting peak intensity. For example, you might try some of the following sets of knot values:
knot_xvals1 = np.array([1, 3, 5, 7, 9, 11, 13, 19, 21, 23, 25])
knot_xvals2 = np.array([1, 3, 5, 7, 9, 11, 13, 16, 19, 21, 23, 25])
knot_xvals3 = np.array([1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25])
and re-run the fit with these different sets of knot points. The results are shown in the table below.
Table of peak amplitudes with varying spline knots:

    spline x points    N     peak amplitude value and uncertainty
    knot_xvals1        11    12.223 (0.295)
    knot_xvals2        12    11.746 (0.594)
    knot_xvals3        13    12.052 (0.872)
Adding more spline points, especially near the peak center around x=16.4,
can impact the measurement of the amplitude, but the uncertainty increases
dramatically enough to mostly cover the same range of values. This is an
interesting case of adding more parameters to a fit and having the
uncertainties in the fitted parameters get worse. The interested
reader is encouraged to explore the fit reports and plots for these different cases.
Finally, the basic case above used 11 spline points to fit the baseline. It would be reasonable to ask whether that is enough parameters to fit the full spectrum. Imposing that there is also a Gaussian peak nearby makes the spline fit only the background; without the Gaussian, the spline could fit the full curve. By way of example, we’ll try increasing the number of spline points to fit this data:
plt.plot(x, y, 'o', label='data')
for nknots in (10, 15, 20, 25):
    model = SplineModel(prefix='bkg_', xknots=np.linspace(0, 25, nknots))
    params = model.guess(y, x)
    out = model.fit(y, params, x=x)
    plt.plot(x, out.best_fit, label=f'best-fit ({nknots} knots)')
plt.legend()
plt.show()
which will show the fit below:
By itself, 10 knots does not give a very good fit, but 25 or more knots give a very good fit to the peak. This should give some confidence that the fit with 11 parameters for the background spline is acceptable, but also some reason to be careful in selecting the number of spline points to use.