Assume that, for the $i$th observation, the relationship between independent variable $\mathbf{x_i}=\begin{bmatrix} x_{1i},\, x_{2i},\, \ldots,\, x_{pi} \end{bmatrix}'$ and dependent variable $Y_i$ follows:

```math
Y_i = m(\mathbf{x_i}, \boldsymbol{\gamma}) + \epsilon_i
```

where $m$ is a non-linear model function that depends on the independent variable $\mathbf{x_i}$ and the parameter vector $\boldsymbol{\gamma}$, and $\epsilon_i$ is the error term. To find the parameter $\boldsymbol{\gamma}$ that best fits our data, we choose the $\boldsymbol{\gamma}$ which minimizes the sum of squared residuals:

```math
\underset{\boldsymbol{\gamma}}{\mathrm{min}} \quad s({\boldsymbol{\gamma}}) = \sum_{i=1}^{n} [m(\mathbf{x_i}, \boldsymbol{\gamma}) - Y_i]^2
```

Given that the function $m$ is non-linear, this problem generally has no analytical solution, so we use a computational tool, `LsqFit.jl` in this tutorial, to find the least squares solution.
One example of a non-linear model is the exponential model, which takes a one-element predictor variable $t$:

```math
m(t, \boldsymbol{\gamma}) = \gamma_1 \exp(\gamma_2 t)
```

and the model becomes:

```math
Y_i = \gamma_1 \exp(\gamma_2 t_i) + \epsilon_i
```
To fit data using `LsqFit.jl`, pass the defined model function (`m`), the data (`tdata` and `ydata`) and the initial parameter values (`p0`) to `curve_fit()`. For now, `LsqFit.jl` only supports the Levenberg-Marquardt algorithm.
```julia
julia> # t: array of independent variables
julia> # p: array of model parameters
julia> m(t, p) = p[1] * exp.(p[2] * t)
julia> p0 = [0.5, 0.5]
julia> fit = curve_fit(m, tdata, ydata, p0)
```

It will return a composite type `LsqFitResult` with some interesting values:
- `dof(fit)`: degrees of freedom
- `coef(fit)`: best fit parameters
- `fit.resid`: vector of residuals
- `fit.jacobian`: estimated Jacobian at the solution
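For example, these values can be inspected once a fit has converged. A minimal, self-contained sketch (the synthetic `tdata`/`ydata` here are hypothetical, made up for illustration):

```julia
using LsqFit

# Hypothetical synthetic data from y = 1.0 * exp(2.0 t) + noise
tdata = collect(range(0, stop=1, length=20))
ydata = 1.0 .* exp.(2.0 .* tdata) .+ 0.05 .* randn(20)

m(t, p) = p[1] * exp.(p[2] * t)
fit = curve_fit(m, tdata, ydata, [0.5, 0.5])

coef(fit)             # best fit parameters, close to [1.0, 2.0]
dof(fit)              # degrees of freedom: 20 samples - 2 parameters = 18
sum(abs2, fit.resid)  # residual sum of squares
```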
The Jacobian $J_f(\mathbf{x})$ of a vector function $f(\mathbf{x}): \mathbb{R}^p \to \mathbb{R}^n$ is defined as the matrix of first-order partial derivatives, with entries:

```math
[J_f(\mathbf{x})]_{ij} = \frac{\partial f_i}{\partial x_j}
```

The matrix is therefore:

```math
J_f(\mathbf{x}) = \begin{bmatrix} \frac{\partial f_1}{\partial x_1} & \cdots & \frac{\partial f_1}{\partial x_p} \\ \vdots & \ddots & \vdots \\ \frac{\partial f_n}{\partial x_1} & \cdots & \frac{\partial f_n}{\partial x_p} \end{bmatrix}
```

The Jacobian of the exponential model function with respect to $\boldsymbol{\gamma}$ is:

```math
J_m(t, \boldsymbol{\gamma}) = \begin{bmatrix} \exp(\gamma_2 t_1) & t_1 \gamma_1 \exp(\gamma_2 t_1) \\ \vdots & \vdots \\ \exp(\gamma_2 t_n) & t_n \gamma_1 \exp(\gamma_2 t_n) \end{bmatrix}
```
By default, finite differences (see `NLSolversBase.jl` for more information) are used to approximate the Jacobian for the data fitting algorithm and the covariance computation. Alternatively, a function that calculates the Jacobian can be supplied to `curve_fit()` for faster and/or more accurate results.
```julia
function j_m(t, p)
    J = Array{Float64}(undef, length(t), length(p))
    J[:, 1] = exp.(p[2] .* t)        # df/dp[1]
    J[:, 2] = t .* p[1] .* J[:, 1]   # df/dp[2]
    J
end

fit = curve_fit(m, j_m, tdata, ydata, p0)
```

The non-linear function $m$ can be approximated as a linear function by a Taylor expansion:

```math
m(\mathbf{x_i}, \boldsymbol{\gamma} + \boldsymbol{h}) \approx m(\mathbf{x_i}, \boldsymbol{\gamma}) + \nabla m(\mathbf{x_i}, \boldsymbol{\gamma})' \boldsymbol{h}
```
where $\boldsymbol{\gamma}$ is a fixed point and $\boldsymbol{h}$ is a small deviation from it.
Consider the residual vector function $r({\boldsymbol{\gamma}})=\begin{bmatrix} r_1({\boldsymbol{\gamma}}) \\ r_2({\boldsymbol{\gamma}}) \\ \vdots \\ r_n({\boldsymbol{\gamma}}) \end{bmatrix}$ with entries:

```math
r_i({\boldsymbol{\gamma}}) = m(\mathbf{x_i}, {\boldsymbol{\gamma}}) - Y_i
```
Each entry's linear approximation can hence be written as:

```math
r_i({\boldsymbol{\gamma}} + \boldsymbol{h}) \approx r_i({\boldsymbol{\gamma}}) + \nabla m(\mathbf{x_i}, {\boldsymbol{\gamma}})' \boldsymbol{h}
```
Since the $i$th row of the Jacobian $J({\boldsymbol{\gamma}})$ is the transpose of the gradient $\nabla m(\mathbf{x_i}, {\boldsymbol{\gamma}})$, the approximation in vector form is:

```math
r({\boldsymbol{\gamma}} + \boldsymbol{h}) \approx r({\boldsymbol{\gamma}}) + J({\boldsymbol{\gamma}}) \boldsymbol{h}
```
which is a linear function of $\boldsymbol{h}$, since ${\boldsymbol{\gamma}}$ is a fixed point.
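As a quick numerical illustration (not part of the tutorial's original code), both sides of this approximation can be compared for the exponential model, assuming the model `m`, the data `tdata`/`ydata`, and the analytic Jacobian `j_m` defined above:

```julia
r(p) = m(tdata, p) .- ydata      # residual vector function

γ = [1.0, 2.0]                   # fixed point
h = [1e-4, 1e-4]                 # small deviation

lhs = r(γ .+ h)                  # exact residuals at the perturbed parameters
rhs = r(γ) .+ j_m(tdata, γ) * h  # linear approximation
maximum(abs.(lhs .- rhs))        # tiny; the error is second order in h
```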
The linear approximation of the non-linear least squares problem leads to an approximation of the covariance matrix of the parameter estimates, from which we can perform regression analysis.
Consider a least squares solution $\boldsymbol{\gamma}^*$, which is a local minimizer of the non-linear problem:

```math
\boldsymbol{\gamma}^* = \underset{\boldsymbol{\gamma}}{\mathrm{arg\,min}} \quad \sum_{i=1}^{n} [m(\mathbf{x_i}, \boldsymbol{\gamma}) - Y_i]^2
```
Set $\boldsymbol{\gamma}^*$ as the fixed point in the linear approximation, with $r({\boldsymbol{\gamma}^*}) = r$ and $J(\boldsymbol{\gamma}^*) = J$. A parameter vector near $\boldsymbol{\gamma}^*$ can be expressed as $\boldsymbol{\gamma} = \boldsymbol{\gamma}^* + \boldsymbol{h}$, and the local approximation of the least squares problem is:

```math
\underset{\boldsymbol{h}}{\mathrm{min}} \quad \frac{1}{2} \| r + J\boldsymbol{h} \|^2
```
which is essentially the linear least squares problem:

```math
\underset{\boldsymbol{\beta}}{\mathrm{min}} \quad \frac{1}{2} \| X\boldsymbol{\beta} - Y \|^2
```

where $X = J$, $\boldsymbol{\beta} = \boldsymbol{h}$ and $Y = -r$. The analytical solution is:

```math
\hat{\boldsymbol{h}} = \hat{\boldsymbol{\gamma}} - \boldsymbol{\gamma}^* = -[J'J]^{-1} J' r
```
The covariance matrix for the analytical solution is:

```math
\mathbf{Cov}(\hat{\boldsymbol{h}}) = [J'J]^{-1} J' \mathbf{Cov}(\epsilon) J [J'J]^{-1}
```

Note that $\mathbf{Cov}(\hat{\boldsymbol{\gamma}}) = \mathbf{Cov}(\hat{\boldsymbol{h}})$, since $\boldsymbol{\gamma}^*$ is a constant vector.
Assume the errors in each sample are independent, normally distributed with zero mean and the same variance, i.e. $\epsilon \sim N(0, \sigma^2 I)$. The covariance matrix from the linear approximation is then:

```math
\mathbf{Cov}(\hat{\boldsymbol{\gamma}}) = [J'J]^{-1} J' (\sigma^2 I) J [J'J]^{-1} = \sigma^2 [J'J]^{-1}
```

where $\sigma^2$ can be estimated as the residual sum of squares divided by the degrees of freedom:

```math
\hat{\sigma}^2 = \frac{s(\hat{\boldsymbol{\gamma}})}{n - p}
```
In `LsqFit.jl`, the covariance matrix calculation uses QR decomposition to be more computationally stable, which has the form:

```math
\mathbf{Cov}(\hat{\boldsymbol{\gamma}}) = \hat{\sigma}^2 R^{-1} (R^{-1})'
```
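As a sanity check (this is not the library's code path, just the $\hat{\sigma}^2 [J'J]^{-1}$ formula above applied directly), the covariance estimate can be reproduced by hand from the fit object:

```julia
using LinearAlgebra

J = fit.jacobian
σ² = sum(abs2, fit.resid) / dof(fit)  # σ̂² = RSS / (n - p)
covar_manual = σ² * inv(J' * J)       # ≈ estimate_covar(fit)
sqrt.(diag(covar_manual))             # square roots of the diagonal give the standard errors
```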
`estimate_covar()` computes the covariance matrix of the fit:

```julia
julia> cov = estimate_covar(fit)
2×2 Array{Float64,2}:
 0.000116545  0.000174633
 0.000174633  0.00258261
```

The standard error is then the square root of each diagonal element of the covariance matrix. `standard_error()` returns the standard error of each parameter:
```julia
julia> se = standard_error(fit)
2-element Array{Float64,1}:
 0.0114802
 0.0520416
```

`margin_error()` computes the product of the standard error and the critical value of each parameter at a certain significance level (default is 5%) from the t-distribution. The margin of error at the 10% significance level can be computed by:
```julia
julia> margin_of_error = margin_error(fit, 0.1)
2-element Array{Float64,1}:
 0.0199073
 0.0902435
```

`confidence_interval()` returns the confidence interval of each parameter at a certain significance level, which is essentially the estimated value ± the margin of error. To get the confidence interval at the 10% significance level, run:
```julia
julia> confidence_intervals = confidence_interval(fit, 0.1)
2-element Array{Tuple{Float64,Float64},1}:
 (0.976316, 1.01613)
 (1.91047, 2.09096)
```
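The margin of error above is the standard error times the two-sided critical value of the t-distribution with `dof(fit)` degrees of freedom, and the interval is the estimate ± that margin. A sketch of the relationship using `Distributions.jl` (assumed available; `fit` and `se` are from the calls above):

```julia
using Distributions

α = 0.1                                    # significance level
crit = quantile(TDist(dof(fit)), 1 - α/2)  # two-sided t critical value
se .* crit                                 # ≈ margin_error(fit, 0.1)
coef(fit) .- se .* crit, coef(fit) .+ se .* crit  # ≈ confidence_interval(fit, 0.1)
```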
`curve_fit()` also accepts a weight parameter (`wt`) to perform Weighted Least Squares and General Least Squares, where the parameter $\boldsymbol{\gamma}^*$ minimizes the weighted residual sum of squares.
The weight parameter (`wt`) is an array or a matrix of weights for each sample. To perform Weighted Least Squares, pass the weight array `[w_1, w_2, ..., w_n]` or the weight matrix `W`:

```math
W = \begin{bmatrix} w_1 & & & \\ & w_2 & & \\ & & \ddots & \\ & & & w_n \end{bmatrix}
```
The weighted least squares problem becomes:

```math
\underset{\boldsymbol{\gamma}}{\mathrm{min}} \quad s({\boldsymbol{\gamma}}) = \sum_{i=1}^{n} w_i [m(\mathbf{x_i}, \boldsymbol{\gamma}) - Y_i]^2
```

or in matrix form:

```math
\underset{\boldsymbol{\gamma}}{\mathrm{min}} \quad s({\boldsymbol{\gamma}}) = r({\boldsymbol{\gamma}})' W r({\boldsymbol{\gamma}})
```
where $r({\boldsymbol{\gamma}})=\begin{bmatrix} r_1({\boldsymbol{\gamma}}) \\ r_2({\boldsymbol{\gamma}}) \\ \vdots \\ r_n({\boldsymbol{\gamma}}) \end{bmatrix}$ is the residual vector function with entries:

```math
r_i({\boldsymbol{\gamma}}) = m(\mathbf{x_i}, {\boldsymbol{\gamma}}) - Y_i
```
The algorithm in `LsqFit.jl` will then provide a least squares solution $\boldsymbol{\gamma}^*$.
!!! note
    In `LsqFit.jl`, the residual function passed to `levenberg_marquardt()` is in a different format if the weight is a vector:

    ```julia
    r(p) = sqrt.(wt) .* ( model(xpts, p) - ydata )
    lmfit(r, g, p0, wt; kwargs...)
    ```

    ```math
    r_i({\boldsymbol{\gamma}}) = \sqrt{w_i} \cdot [m(\mathbf{x_i}, {\boldsymbol{\gamma}}) - Y_i]
    ```
    Cholesky decomposition, which is effectively a square root of a matrix, will be performed if the weight is a matrix:

    ```julia
    u = cholesky(wt).U
    r(p) = u * ( model(xpts, p) - ydata )
    lmfit(r, p0, wt; kwargs...)
    ```

    ```math
    r({\boldsymbol{\gamma}}) = U \cdot [m(\mathbf{x}, {\boldsymbol{\gamma}}) - Y], \quad W = U'U
    ```
    The solution will be the same as that of the weighted least squares problem stated above.
Set $r({\boldsymbol{\gamma}^*}) = r$ and $J({\boldsymbol{\gamma}^*}) = J$; the linear approximation of the weighted least squares problem is then:

```math
\underset{\boldsymbol{h}}{\mathrm{min}} \quad (r + J\boldsymbol{h})' W (r + J\boldsymbol{h})
```
The analytical solution to the linear approximation is:

```math
\hat{\boldsymbol{h}} = -[J'WJ]^{-1} J' W r
```
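Translated directly into Julia, this is a single backslash solve. A toy check of the formula with hypothetical `J`, `W` and `r` of compatible sizes (not data from this tutorial):

```julia
using LinearAlgebra

n, p = 20, 2
J = randn(n, p)              # hypothetical Jacobian
W = Diagonal(rand(n) .+ 1)   # hypothetical positive weight matrix
r = randn(n)                 # hypothetical residual vector

h_hat = -(J' * W * J) \ (J' * W * r)   # ĥ = -[J'WJ]⁻¹ J'Wr
```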
Assume the errors in each sample are independent, normally distributed with zero mean and different variances (heteroskedastic error), i.e. $\epsilon \sim N(0, \Sigma)$ with:

```math
\Sigma = \begin{bmatrix} \sigma_1^2 & & & \\ & \sigma_2^2 & & \\ & & \ddots & \\ & & & \sigma_n^2 \end{bmatrix}
```
Assume we know the error variances and we set the weight as the inverse of the variance (the optimal weight), i.e. $W = \Sigma^{-1}$.
The covariance matrix is now:

```math
\mathbf{Cov}(\hat{\boldsymbol{\gamma}}) = [J' \Sigma^{-1} J]^{-1}
```
If we only know the relative ratios of the variances, i.e. $\epsilon \sim N(0, \sigma^2 W^{-1})$, the covariance matrix becomes:

```math
\mathbf{Cov}(\hat{\boldsymbol{\gamma}}) = \sigma^2 [J'WJ]^{-1}
```

where $\sigma^2$ has to be estimated. `curve_fit()` currently does not support this implementation; it assumes the weight is the inverse of the error covariance matrix rather than a ratio of it, i.e. the covariance of the estimated parameters is calculated as `covar = inv(J' * fit.wt * J)`.
!!! note
    Passing a vector of ones as the weight vector will cause mistakes in the covariance estimation.
Pass the vector `1 ./ var(ε)` or the matrix `inv(covar(ε))` as the weight parameter (`wt`) to the function `curve_fit()`:

```julia
julia> wt = inv(cov_ε)
julia> fit = curve_fit(m, tdata, ydata, wt, p0)
julia> cov = estimate_covar(fit)
```

!!! note
    If the weight matrix is not a diagonal matrix, General Least Squares will be performed.
Assume the errors in each sample are correlated, normally distributed with zero mean and different variances (heteroskedastic and autocorrelated error), i.e. $\epsilon \sim N(0, \Sigma)$, where $\Sigma$ is no longer diagonal.
Set the weight matrix as the inverse of the error covariance matrix (the optimal weight), i.e. $W = \Sigma^{-1}$. The covariance matrix is then:

```math
\mathbf{Cov}(\hat{\boldsymbol{\gamma}}) = [J' \Sigma^{-1} J]^{-1}
```
Pass the matrix `inv(covar(ε))` as the weight parameter (`wt`) to the function `curve_fit()`:

```julia
julia> wt = inv(cov_ε)
julia> fit = curve_fit(m, tdata, ydata, wt, p0)
julia> cov = estimate_covar(fit)
```

In most cases, the variances of the errors are unknown. To perform Weighted Least Squares, we first need to estimate the variance of each error, which can be estimated by the squared residual of the $i$th sample:

```math
\widehat{\sigma_i^2} = r_i^2
```
Unweighted fitting (OLS) will return the residuals we need, since the OLS estimator is unbiased. Then pass the reciprocal of the squared residuals as the estimated optimal weight to perform Weighted Least Squares:
```julia
julia> fit_OLS = curve_fit(m, tdata, ydata, p0)
julia> wt = 1 ./ fit_OLS.resid .^ 2
julia> fit_WLS = curve_fit(m, tdata, ydata, wt, p0)
julia> cov = estimate_covar(fit_WLS)
```

A `Plots.jl` plot recipe is provided for visualizing a fit. The simplest case
is plotting just the fit curve by itself, which can be done by supplying the
model and the fit object to a plot call:
```julia
plot(m, fit)
```

The plot recipe can also show the confidence and prediction intervals by supplying the `purpose` keyword (default `:neither`):
```julia
plot(
    plot(m, fit; purpose=:neither),
    plot(m, fit; purpose=:prediction),
    plot(m, fit; purpose=:confidence),
    plot(m, fit; purpose=:both);
    layout=(2, 2),
)
```

Changing the keyword `significance` (default `0.05`) changes which confidence level is illustrated.
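Putting the pieces together, a minimal end-to-end script might look like the following (the synthetic data are hypothetical, as in the earlier sketches):

```julia
using LsqFit, Plots

m(t, p) = p[1] * exp.(p[2] * t)
tdata = collect(range(0, stop=1, length=20))
ydata = 1.0 .* exp.(2.0 .* tdata) .+ 0.05 .* randn(20)
fit = curve_fit(m, tdata, ydata, [0.5, 0.5])

# fit curve with both confidence and prediction bands at the 10% level
plot(m, fit; purpose=:both, significance=0.1)
```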
