wsolve - Weighted linear and polynomial solver with uncertainty¶

`wsolve`	Given a linear system \(y = A x + \delta y\), estimates \(x\) and \(\delta x\).
`wpolyfit`	Return the polynomial of degree \(n\) that minimizes \(\sum(p(x_i) - y_i)^2/\sigma_i^2\).
`LinearModel`	Model evaluator for linear solution to \(Ax = y\).
`PolynomialModel`	Model evaluator for best fit polynomial \(p(x) = y +/- \delta y\).

Weighted linear and polynomial solver with uncertainty.

Given \(A \bar x = \bar y \pm \delta \bar y\), solve using s = wsolve(A,y,dy)

wsolve uses the singular value decomposition for increased accuracy.

The uncertainty in the solution is estimated from the scatter in the data. Estimates the uncertainty for the solution from the scatter in the data.

The returned model object s provides:

s.x

solution

s.std

uncertainty estimate assuming no correlation

s.rnorm

residual norm

s.DoF

degrees of freedom

s.cov

covariance matrix

s.ci(p)

confidence intervals at point p

s.pi(p)

prediction intervals at point p

s(p)

predicted value at point p


s.x	solution
s.std	uncertainty estimate assuming no correlation
s.rnorm	residual norm
s.DoF	degrees of freedom
s.cov	covariance matrix
s.ci(p)	confidence intervals at point p
s.pi(p)	prediction intervals at point p
s(p)	predicted value at point p

Example¶

Weighted system:

>>> import numpy as np
>>> from bumps import wsolve
>>> A = np.array([[1,2,3],[2,1,3],[1,1,1]], dtype='d')
>>> dy = [0.2,0.01,0.1]
>>> y = [ 14.16, 13.01, 6.15]
>>> s = wsolve.wsolve(A,y,dy)
>>> print(", ".join(f"{a:0.2f} +/- {b:0.2f}" for a,b in zip(s.x[:, 0],s.std)))
1.05 +/- 0.17, 2.20 +/- 0.12, 2.91 +/- 0.12

Note there is a counter-intuitive result that scaling the estimated uncertainty in the data does not affect the computed uncertainty in the fit. This is the correct result — if the data were indeed selected from a process with ten times the uncertainty, you would expect the scatter in the data to increase by a factor of ten as well. When this new data set is fitted, it will show a computed uncertainty increased by the same factor. Monte carlo simulations bear this out. The conclusion is that the dataset carries its own information about the variance in the data, and the weight vector serves only to provide relative weighting between the points.

class bumps.wsolve.LinearModel(x=None, DoF=None, SVinv=None, rnorm=None)[source]¶

Bases: object

Model evaluator for linear solution to \(Ax = y\).

Use s(A) to compute the predicted value of the linear model s at points given on the rows of \(A\).

Computes a confidence interval (range of likely values for the mean at \(x\)) or a prediction interval (range of likely values seen when measuring at \(x\)). The prediction interval gives the width of the distribution at \(x\). This should be the same regardless of the number of measurements you have for the value at \(x\). The confidence interval gives the uncertainty in the mean at \(x\). It should get smaller as you increase the number of measurements. Error bars in the physical sciences usually show a \(1-\alpha\) confidence value of \(\text{erfc}(1/\sqrt{2})\), representing a \(1-\sigma\) standard deviation of uncertainty in the mean.

Confidence intervals for the expected value of the linear system evaluated at a new point \(w\) are given by the \(t\) distribution for the selected interval \(1-\alpha\), the solution \(x\), and the number of degrees of freedom \(n-p\):

\[w^T x \pm t^{\alpha/2}_{n-p} \sqrt{ \text{var}(w) }\]

where the variance \(\text{var}(w)\) is given by:

\[\text{var}(w) = \sigma^2 (w^T (A^TA)^{-1} w)\]

Prediction intervals are similar, except the variance term increases to include both the uncertainty in the predicted value and the variance in the data:

\[\text{var}(w) = \sigma^2 (1 + w^T (A^TA)^{-1} w)\]

DoF¶: number of degrees of freedom in the solution

ci(A, sigma=1)[source]¶

Compute the calculated values and the confidence intervals for the linear model evaluated at \(A\).

sigma=1 corresponds to a \(1-\sigma\) confidence interval

Confidence intervals are sometimes expressed as \(1-\alpha\) values, where \(\alpha = \text{erfc}(\sigma/\sqrt{2})\).

property cov¶: covariance matrix [inv(A’A); O(n^3)]

property p¶: p-value probability of rejection

pi(A, p=0.05)[source]¶

Compute the calculated values and the prediction intervals for the linear model evaluated at \(A\).

p=0.05 corresponds to the 95% prediction interval.

rnorm¶: 2-norm of the residuals \(||y-Ax||_2\)

property std¶: solution standard deviation [sqrt(var); O(n^2)]

property var¶: solution variance [diag(cov); O(n^2)]

x¶: solution to the equation \(Ax = y\)

class bumps.wsolve.PolynomialModel(x, y, dy, s, origin=False)[source]¶

Bases: object

Model evaluator for best fit polynomial \(p(x) = y +/- \delta y\).

Use p(x) for PolynomialModel p to evaluate the polynomial at all points in the vector x.

DoF¶: number of degrees of freedom in the solution

ci(x, sigma=1)[source]¶

Evaluate the polynomial and the confidence intervals at x.

sigma=1 corresponds to a 1-sigma confidence interval

coeff¶: polynomial coefficients

property cov¶

covariance matrix

Note that the ones column will be absent if origin is True.

degree¶: polynomial degree

der(x)[source]¶: Evaluate the polynomial derivative at x.

origin¶: True if polynomial goes through the origin

property p¶: p-value probability of rejection

pi(x, p=0.05)[source]¶

Evaluate the polynomial and the prediction intervals at x.

p = 1-alpha = 0.05 corresponds to 95% prediction interval

plot(ci=1, pi=0)[source]¶

rnorm¶: 2-norm of the residuals \(||y-Ax||_2\)

property std¶: solution standard deviation

property var¶: solution variance

bumps.wsolve.wpolyfit(x, y, dy=1, degree=None, origin=False)[source]¶

Return the polynomial of degree \(n\) that minimizes \(\sum(p(x_i) - y_i)^2/\sigma_i^2\).

if origin is True, the fit should go through the origin.

Returns PolynomialModel.

bumps.wsolve.wsolve(A, y, dy=1, rcond=1e-12)[source]¶

Given a linear system \(y = A x + \delta y\), estimates \(x\) and \(\delta x\).

A is an n x m array of measurement points.

y is an n x k array or vector of length n of measured values at A.

dy is a scalar or an n x 1 array of uncertainties in the values at A.

Returns LinearModel.