
Classroom Tips and Techniques: Least-Squares Fits


Robert J. Lopez

Emeritus Professor of Mathematics and Maple Fellow

Maplesoft

 

Introduction


The least-squares fitting of functions to data can be done in Maple with eleven different commands from four different packages. The CurveFitting and LinearAlgebra packages each have a LeastSquares command, the former limited to the fitting of univariate linear models; the latter, applicable to univariate or multivariate linear models. The Optimization package has the LSSolve and NLPSolve commands, the former specifically designed to solve least-squares problems; the latter, capable of minimizing a nonlinear sum-of-squares.

 

Seven commands from the Statistics package can return various measures of regression analysis (see Table 2): Fit, LinearFit, PolynomialFit, ExponentialFit, LogarithmicFit, PowerFit, and NonlinearFit. The Fit command passes problems to either LinearFit or NonlinearFit, as appropriate. The NonlinearFit command invokes Optimization's LSSolve command, while the remaining commands (implementing a linearization where necessary) make use of LinearAlgebra's LeastSquares command.


This month's article will explore each of these eleven tools, examine the spectrum of problems to which they apply, and give examples of their use.


Tools


Table 1 summarizes the eleven least-squares commands available in Maple.


CurveFitting: LeastSquares
• Fit a univariate linear model to data
• Exact solutions supported
• Fitting curve can have free parameters

LinearAlgebra: LeastSquares
• Fit a univariate or multivariate linear model to data
• Obtain general or minimum-norm least-squares solution
• Input can be a set of linear equations
• Exact solutions supported
• Fitting curve can have free parameters

Optimization: LSSolve
• Obtain local minimum of (1/2)*(sum((g[k](u)-y[k])^2, k = 1 .. n))
• Input: list of residuals g[k](u)-y[k]
• Supports equality and/or inequality constraints, and bounds on variables
• Both supported methods use differentiation
• Numeric solutions only

Optimization: NLPSolve
• Obtain local minimum of SS = sum((f(x[k])-y[k])^2, k = 1 .. n)
• Input: SS
• Supports equality and/or inequality constraints, and bounds on variables
• Methods include nonlinear simplex (Nelder-Mead) for unconstrained multivariate objective functions
• Numeric solutions only

Statistics: Fit
• Passes least-squares fit of a linear model to LinearFit, and of a nonlinear model to NonlinearFit
• Accepts model only as an expression

Statistics: LinearFit
• Passes least-squares fit of a linear model to (numerical) LinearAlgebra
• Model input as list or vector of component expressions or functions

Statistics: PolynomialFit
• Passes least-squares fit of a polynomial to (numerical) LinearAlgebra
• Input: polynomial degree, data, and independent variable

Statistics: ExponentialFit
• Linearizes the fitting function y = a*e^(b*x) to ln(y) = ln(a)+b*x, and passes the problem to (numerical) LinearAlgebra
• Input: data and independent variable

Statistics: LogarithmicFit
• Treats the fitting function y = a+b*ln(x) as linear in ln(x) and passes the problem to (numerical) LinearAlgebra
• Input: data and independent variable

Statistics: PowerFit
• Linearizes the fitting function y = a*x^b to ln(y) = ln(a)+b*ln(x), and passes the problem to (numerical) LinearAlgebra
• Input: data and independent variable

Statistics: NonlinearFit
• Passes the least-squares fit of a nonlinear model to the LSSolve command in Optimization, obtaining a local best fit
• Input: model as expression or function, data, independent variable

Table 1   Maple commands for least-squares fitting


The LeastSquares command in the CurveFitting package fits a univariate linear model to data. The input data can be a list of points, or separate lists (or vectors) of values for the independent and dependent variables. The data points can be weighted, and the particular linear model can be provided as an expression linear in the model parameters. Computations are done in exact/symbolic form, so the fitting curve can contain free parameters that are not solved for. Both the Context Menu for a list of lists and the Curve Fitting Assistant provide an interactive interface to this command.
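As a minimal sketch of the exact/symbolic behavior (the three data points here are illustrative, not from this article; with no curve option, a straight line is fitted):

CurveFitting:-LeastSquares([[0, 1], [1, 3], [2, 4]], x); # returns the exact line 7/6+(3/2)*x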

 

The LeastSquares command in the LinearAlgebra package provides a number of additional functionalities for linear models: the model can be multivariate; the first argument can be a set of equations; for rank-deficient models both the general and minimum-norm solutions are available; and the user has control over the name of free parameters in a general solution. Like the CurveFitting version, this command can also work in exact/symbolic mode, so inputs and outputs can contain symbolic terms.

 

When applied to floating-point data, the LeastSquares command in LinearAlgebra bases its calculations either on a QR decomposition or on a singular value decomposition (SVD). Since the QR decomposition does not readily determine the rank of the decomposed matrix, least-squares fits based on this approach can fail for rank-deficient matrices. Because the default settings for automatically switching to the more robust SVD approach can be thwarted by a given matrix, the safest policy appears to be setting the method option to SVD in all cases.
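A minimal sketch of that policy (the 3×2 matrix below is contrived to be rank-deficient; it is not from this article):

A := evalf(Matrix([[1, 2], [2, 4], [3, 6]])): # rank 1, so the default QR route is unreliable
v := evalf(<1, 2, 4>):
LinearAlgebra:-LeastSquares(A, v, method = SVD); # the SVD-based calculation handles the rank deficiency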

``

The LSSolve command in the Optimization package provides a local solution to both linear and nonlinear least-squares problems. The objective function (a sum of squares of deviations) shown in Table 1 is minimized, possibly subject to constraints (equality, inequality, bounds on variables). The input to the command could be a list [y, A] corresponding to the linear least-squares problem A*u = y. Alternatively, the input to the command could be a list of residuals (deviations) of the form g[k](u)-y[k]. If the model is given by the function y = F(x; u), where u is a vector of parameters, then g[k](u) = F(x[k]; u). If the least-squares solution of the (inconsistent) equations F[k](x) = y[k], k = 1, ..., n, is required, then g[k](x) = F[k](x), enabling LSSolve to accept what is essentially a list of equations.
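A small sketch of the two input forms (the 3×2 system below is illustrative only; both calls pose the same problem):

A := Matrix([[1., 1.], [1., 2.], [1., 3.]]): y := <1., 2., 2.>:
Optimization:-LSSolve([y, A]);                              # matrix form of A . u = y
Optimization:-LSSolve([seq(u1 + u2*k - y[k], k = 1 .. 3)]); # the same problem as a list of residuals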

``

The NLPSolve command in the Optimization package provides a local extremum of a multivariate function, whether linear or nonlinear. If this function is the sum-of-squares of deviations, then finding its minimum is equivalent to solving a least-squares problem. This command is included in Table 1 because it can invoke the Nelder-Mead method (nonlinear simplex in Maple), which is the only option in Maple that does not use differentiation to find local extrema of unconstrained multivariate functions. (Additional derivative-free options are available in Dr. Moiseev's DirectSearch package, discussed in an earlier Reporter article.)
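As a sketch of this usage (the data and starting point are illustrative; method = nonlinearsimplex requests the derivative-free Nelder-Mead method):

X := [1., 2., 3.]: Y := [2.7, 7.4, 20.1]:
SS := add((a*exp(b*X[k]) - Y[k])^2, k = 1 .. 3):  # nonlinear sum-of-squares in a and b
Optimization:-NLPSolve(SS, initialpoint = {a = 1, b = 1}, method = nonlinearsimplex);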


All seven regression commands in the Statistics package work in floating-point arithmetic only. Output for each command can be one of, or a list of, the items in Table 2, or a module containing all the relevant regression-analysis details shown in the table. The first eight items are available for the NonlinearFit command; all 16 are available for the other six regression commands. The help page for these options can be opened by executing the command ?Statistics,Regression,Solution.

 

degreesoffreedom
leastsquaresfunction
parametervalues
parametervector
residuals
residualmeansquare
residualstandarddeviation
residualsumofsquares
AtkinsonTstatistic
confidenceintervals
CookDstatistic
externallystandardizedresiduals
internallystandardizedresiduals
leverages
standarderrors
variancecovariancematrix

Table 2   Regression analysis elements

Table 1 indicates that the Fit command is an interface to the LinearFit and NonlinearFit commands. The LinearFit and PolynomialFit commands invoke the numeric version of the LeastSquares command in LinearAlgebra, as do the ExponentialFit, LogarithmicFit, and PowerFit commands (after linearization). The NonlinearFit command invokes the LSSolve command in Optimization.

Universe of Discourse


Tables 3 and 4 summarize the universe of discourse for the least-squares options in Maple, Table 3 dealing with univariate models; and Table 4, multivariate models. The characteristics of this universe can be taken as linear/nonlinear, overdetermined/underdetermined/exactly determined, consistent/inconsistent, univariate/multivariate, provided the notion of "determined" is given a precise meaning. At first glance, a system can have more equations than unknown parameters, but if the equations are redundant, there may actually be fewer distinct equations in the system than unknowns. Such a system would be underdetermined, but by a simple count of equations might be called overdetermined. We opt for the former meaning, namely, that the terms underdetermined, overdetermined, and exactly determined be applied only after all redundancies have been eliminated.
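A quick check of this distinction (the matrix is contrived): three equations in two unknowns whose rows are all proportional constitute just one distinct equation, so the system is underdetermined despite the count.

A := Matrix([[1, 1], [2, 2], [3, 3]]): # three equations, two unknowns
LinearAlgebra:-Rank(A);                # returns 1: only one distinct equation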

 

According to Table 3, whether a univariate model is linear or nonlinear, if there are more distinct equations than unknowns (i.e., if the system is truly overdetermined), then the system is necessarily inconsistent, and a least-squares solution is appropriate. So too for the underdetermined, inconsistent system - a least-squares solution is appropriate, and will contain free parameters. The underdetermined, consistent system will also have a general solution containing free parameters, but here, the least-squares technique need not be invoked.

 

Underdetermined or exactly determined consistent systems are essentially interpolation problems; inconsistent systems are the ones requiring least-squares techniques. Underdetermined linear models will have a parameter-dependent general solution, from which can be extracted a unique solution of minimum (L2) norm.
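The simplest possible sketch of that extraction (one equation in two unknowns, not from this article): the general solution of x+y = 1 is a line, and the optimize option picks out its point closest to the origin.

LinearAlgebra:-LeastSquares(Matrix([[1, 1]]), <1>, optimize = true); # returns <1/2, 1/2>, the minimum-norm solution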

Univariate Models

Linear
• Overdetermined: LeastSquares (CurveFitting), LeastSquares (LinearAlgebra), LSSolve (Optimization), Fit (Statistics), LinearFit (Statistics)
• Underdetermined & Consistent: solve, LinearSolve (LinearAlgebra), LeastSquares (CurveFitting), LeastSquares (LinearAlgebra), LSSolve (Optimization)
• Underdetermined & Inconsistent: LeastSquares (CurveFitting), LeastSquares (LinearAlgebra)

Nonlinear
• Overdetermined: LSSolve^1 (Optimization), NonlinearFit (Statistics), NLPSolve^3 (Optimization)
• Underdetermined & Consistent: solve^2, LSSolve^1 (Optimization)
• Underdetermined & Inconsistent: solve^2 (normal equations), LSSolve^1 (Optimization)

Table 3   Problems and tools for fitting univariate models to data

(1) Local extrema, at best.  (2) Can fail for intractable algebra.  (3) Minimizes the sum-of-squares of deviations.

Table 4 categorizes multivariate linear models, those given as A*u = v, according to the number of rows (r) and columns (c) in the matrix A. However, this classification is affected by the rank of A. Systems that are truly underdetermined have a parameter-dependent general solution which, when projected onto the row space of A, becomes the minimum-norm solution. Full-rank matrices A with at least as many rows as columns have a trivial null space, and hence any associated linear system has a unique solution, even if only in the least-squares sense.
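Explicitly, if the columns of a matrix N form a basis for the row space of A, the projector used repeatedly in the examples below is P = N.(N^%T.N)^(-1).N^%T, and P maps any member of the general solution to the minimum-norm solution. A sketch with a contrived 2×3 matrix:

A := Matrix([[1, 0, 1], [0, 1, 1]]):                                     # rank 2; row space is a plane in R^3
N := Matrix(map(LinearAlgebra:-Transpose, LinearAlgebra:-RowSpace(A))):  # basis vectors as columns
P := N . (1/(N^%T . N)) . N^%T:                                          # projector onto the row space of A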

Multivariate Linear Models

r < c
• Full rank; A*u = v consistent (necessarily): LinearSolve and LeastSquares (LinearAlgebra)
• Deficient; consistent: LinearSolve and LeastSquares (LinearAlgebra)
• Deficient; inconsistent: LeastSquares (LinearAlgebra)

r = c
• Full rank; consistent (necessarily): LinearSolve (LinearAlgebra)
• Deficient; consistent: LinearSolve and LeastSquares (LinearAlgebra)
• Deficient; inconsistent: LeastSquares (LinearAlgebra)

r > c
• Full rank; consistent: LinearSolve and LeastSquares (LinearAlgebra), LSSolve (Optimization)
• Full rank; inconsistent: LeastSquares (LinearAlgebra), LSSolve (Optimization)
• Deficient; consistent: LinearSolve and LeastSquares (LinearAlgebra)
• Deficient; inconsistent: LeastSquares (LinearAlgebra)

Table 4   Problems and tools for fitting multivariate linear models to data

For both Tables 3 and 4, the bifurcation induced by the exact/floating-point distinction arises only for invocations of the LeastSquares command in LinearAlgebra, and is dealt with only in the context of specific examples. Problems that turn out to be consistent and exactly determined are actually interpolation problems, not least-squares problems.

The overdetermined nonlinear multivariate model can be solved with the NonlinearFit command in Statistics and the LSSolve command in Optimization. Of course, the sum-of-squares of deviations can be directly minimized by, for example, the NLPSolve command in Optimization. Underdetermined nonlinear multivariate models pose a special challenge. None of the tools in LinearAlgebra apply, and the numeric tools in Optimization and Statistics provide only local solutions, so these tools will not return a general solution. If the algebra is tractable, it might be possible for the solve command to yield the general solution: for a consistent system, apply it directly to the equations; for an inconsistent system, to the normal equations.

Examples

This section contains 21 examples illustrating the use of the Maple commands in Table 1. The organization of the examples is based on Tables 3 and 4, and on the remarks on nonlinear multivariate systems following Table 4.

Linear Univariate Models

Overdetermined Case

Example 1.

Fit f(x) = sum(b[k]*sin(k*x), k = 1 .. 5) to ten data points along y = x^2, 0 <= x <= 3.

Solution

• Define f, the fitting function.
• Context Menu: Assign Function

f(x) = sum(b[k]*sin(k*x), k = 1 .. 5) → f

• Define a list of x-values (as floats).
• Form a list of corresponding y-values: y[k] = x[k]^2

X := [seq(3.*k*(1/10), k = 1 .. 10)]; Y := map(x -> x^2, X)

Apply the LeastSquares command from the CurveFitting package.

CurveFitting:-LeastSquares(X, Y, x, curve = f(x))

HFloat(3.761652233381777)*sin(x)-HFloat(3.1917311225758707)*sin(2.*x)+HFloat(2.0766346632668946)*sin(3.*x)-HFloat(1.6745633371939552)*sin(4.*x)+HFloat(1.3699491229601735)*sin(5.*x)

 

Apply the LeastSquares command from the LinearAlgebra package.

The arguments are a set of equations of the form f(x[k]) = y[k], and a set of parameters (the b[k]).

LinearAlgebra:-LeastSquares({Equate(`~`[f](X), Y)[]}, {seq(b[k], k = 1 .. 5)})

{b[1] = HFloat(3.7616522333817786), b[2] = HFloat(-3.191731122575871), b[3] = HFloat(2.076634663266895), b[4] = HFloat(-1.6745633371939554), b[5] = HFloat(1.3699491229601752)}

Notice that the output is not the fitting function, but a set of equations defining the parameters.

 

To the problem in the form A*u = v, apply the LeastSquares command from LinearAlgebra.

A, v := LinearAlgebra:-GenerateMatrix(Equate(`~`[f](X), Y), [seq(b[k], k = 1 .. 5)]); LinearAlgebra:-LeastSquares(A, v)

Vector(5, {(1) = 3.76165223338178, (2) = -3.19173112257587, (3) = 2.07663466326689, (4) = -1.67456333719396, (5) = 1.36994912296017})

The output is now a vector of values for the parameters.

Apply the LSSolve command from the Optimization package. The input is the list [v, A].

Optimization:-LSSolve([v, A])

[15.4472790754279341, Vector(5, {(1) = 3.76165223338178, (2) = -3.19173112257587, (3) = 2.07663466326690, (4) = -1.67456333719395, (5) = 1.36994912296018})]

The output is a list: the first member is half the sum of the squares of the deviations; the second is a vector of values for the parameters.

Apply the LinearFit command from the Statistics package.

The arguments are a list of basis functions for the linear model, the data, and the independent variable for the model.

infolevel[Statistics] := 5; Statistics:-LinearFit([seq(sin(k*x), k = 1 .. 5)], X, Y, x); infolevel[Statistics] := 0

HFloat(3.7616522333817777)*sin(x)-HFloat(3.1917311225758707)*sin(2*x)+HFloat(2.0766346632668946)*sin(3*x)-HFloat(1.6745633371939554)*sin(4*x)+HFloat(1.3699491229601752)*sin(5*x)

By setting infolevel to 5, additional information about the calculation is printed. The fitting function is returned.

 

Apply the Fit command from Statistics; the first argument must now be the model function.

infolevel[Statistics] := 5; Statistics:-Fit(f(x), X, Y, x); infolevel[Statistics] := 0

HFloat(3.7616522333817777)*sin(x)-HFloat(3.1917311225758707)*sin(2*x)+HFloat(2.0766346632668946)*sin(3*x)-HFloat(1.6745633371939554)*sin(4*x)+HFloat(1.3699491229601752)*sin(5*x)

Notice how Fit passed the problem off to LinearFit.


Apply the Fit command from Statistics so as to return m, a module whose exports are the entries of Table 2.

m := Statistics:-Fit(f(x), X, Y, x, output = solutionmodule)

Access the exports of module m singly.

m:-Results(residualsumofsquares) = HFloat(30.894558150855875)

Not recommended: Return all 16 exports in a (poorly formatted) list:  m:-Results() 

The following device line-breaks the exports, making them easier to read.

`~`[print](m:-Results())

[]

Underdetermined Case

Consistent

Example 2.

Fit f(x) = a*x^2+b*x+c to the data points (1, 5) and (2, 3).

Solution

• Initialize Maple.

restart; with(LinearAlgebra)

• Define the fitting function:
Context Menu: Assign Function

f(x) = a*x^2+b*x+c → f

This is an interpolation problem requiring the solution of two consistent equations in three unknowns.
It is not necessarily a least-squares problem.

• Write and solve two (consistent) equations in three unknowns.

S := solve({f(1) = 5, f(2) = 3})

{a = -7/2+(1/2)*c, b = 17/2-(3/2)*c, c = c}

• Obtain the general solution, a one-parameter family of interpolating functions.

eval(f(x), S)

(-7/2+(1/2)*c)*x^2+(17/2-(3/2)*c)*x+c


Use the LeastSquares command from the CurveFitting package:

CurveFitting:-LeastSquares(<1, 2>, <5, 3>, x, curve = f(x))

7+2*_t[3]+(-2-3*_t[3])*x+_t[3]*x^2

Although the calculation is passed off to the LeastSquares command in LinearAlgebra, which has provision for controlling the name of the free parameter, this control is lacking in the CurveFitting package.

 

Use the LeastSquares command from LinearAlgebra. The arguments here will be a set of equations and a set of parameters. Note the control over the free parameter.

q := LinearAlgebra:-LeastSquares({f(1) = 5, f(2) = 3}, {a, b, c}, free = s); eval(f(x), q)

s[1]*x^2+(-3*s[1]-2)*x+2*s[1]+7

The LeastSquares command in LinearAlgebra returns a set of equations defining the parameters, which then have to be transferred to the model to obtain the fitting function. The appearance of the free parameter is best explained via the matrix formulation of the problem.

 

Cast the problem in the form A*u = v.

• Convert equations to matrix/vector form.

A, v := GenerateMatrix([f(1) = 5, f(2) = 3], [a, b, c])

Matrix(2, 3, {(1, 1) = 1, (1, 2) = 1, (1, 3) = 1, (2, 1) = 4, (2, 2) = 2, (2, 3) = 1}), Vector(2, {(1) = 5, (2) = 3})

• Obtain the general solution.

V := LeastSquares(A, v, free = s)

Vector(3, {(1) = s[1], (2) = -3*s[1]-2, (3) = 2*s[1]+7})

• Obtain the minimum-norm solution.

LeastSquares(A, v, optimize = true)

Vector(3, {(1) = -10/7, (2) = 16/7, (3) = 29/7})

• Project V onto the row space of A.

N := A^%T; P := N.(1/(N^%T.N)).N^%T; P.V

Vector(3, {(1) = -10/7, (2) = 16/7, (3) = 29/7})

 

The numeric solvers of the Optimization package, being local, will not necessarily find the minimum-norm solution, and might return any member of the general solution. The numeric solvers of the Statistics package reject underdetermined problems.
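A sketch of that behavior with this example's equations (Minimize is one of the package's local numeric solvers; which member of the general solution it returns is not guaranteed):

Optimization:-Minimize((f(1) - 5)^2 + (f(2) - 3)^2); # objective value 0 at some one member of the general solution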

Inconsistent

Example 3.

Fit f(x) = a*x^2+b*x+c to the data points (1, 5) and (1, 3).

Solution

• Initialize Maple.

restart

• Define the fitting function:
Context Menu: Assign Function

f(x) = a*x^2+b*x+c → f

Use the LeastSquares command from the CurveFitting package:

CurveFitting:-LeastSquares(<1, 1>, <5, 3>, x, curve = f(x))

4-_t[2]-_t[3]+_t[2]*x+_t[3]*x^2

 

Use the LeastSquares command from LinearAlgebra. The arguments here will be a set of equations and a set of parameters.

q := LinearAlgebra:-LeastSquares({f(1) = 3, f(1) = 5}, {a, b, c}, free = s); eval(f(x), q)

(4-s[2]-s[3])*x^2+s[2]*x+s[3]

The LeastSquares command in LinearAlgebra returns a set of equations defining the parameters, which then have to be transferred to the model to obtain the fitting function. The appearance of the free parameter is best explained via the matrix formulation of the problem.

 

Cast the problem in the form A*u = v.

A, v := LinearAlgebra:-GenerateMatrix([f(1) = 5, f(1) = 3], [a, b, c])

Matrix(2, 3, {(1, 1) = 1, (1, 2) = 1, (1, 3) = 1, (2, 1) = 1, (2, 2) = 1, (2, 3) = 1}), Vector(2, {(1) = 5, (2) = 3})

Obtain the general solution

ParamVect := LinearAlgebra:-LeastSquares(A, v, free = s); eval(f(x), Equate(<a, b, c>, ParamVect))

(4-s[2]-s[3])*x^2+s[2]*x+s[3]

Obtain the minimum-norm solution

LinearAlgebra:-LeastSquares(A, v, optimize = true)

Vector(3, {(1) = 4/3, (2) = 4/3, (3) = 4/3})

The null space of A is spanned by the vectors <-1, 0, 1> and <-1, 1, 0>.
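This can be confirmed directly (a one-line sketch using the matrix generated above):

LinearAlgebra:-NullSpace(Matrix([[1, 1, 1], [1, 1, 1]])); # returns a basis (possibly a different one) for the null space of A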

Nonlinear Univariate Models

Overdetermined Case

Logarithmic Model


Example 4.

Fit y = a+b*ln(x) to the 45 data points shown in Figure 1.

 

X:=[.4415800447, .5203850932, .5586711668, .6765059811, .7450887166, .9557085792, .9581748469, 1.056473538, 1.566338190, 1.816703465, 2.388745916, 2.650935552, 2.961694105, 3.135006301, 3.413191570, 3.525872385, 4.033977197, 4.058984896, 4.087467046, 4.448111434, 4.478647296, 4.704561792, 4.927323253, 4.937647961, 5.212111955, 5.350970947, 6.569312236, 6.619738280, 6.636234495, 7.225453222, 7.321555106, 7.421499760, 7.628867977, 7.696458584, 7.869296084, 8.005454234, 8.192046666, 8.247384597, 8.319562068, 8.352030955, 9.377657024, 9.532776673, 9.722178164, 9.759535988, 9.887836167]:

Y:=[-.3165315447, 0.525739071e-1, .2280755754, .9930696348, 1.005519640, 2.050502588, 1.684642486, 3.030733346, 2.676977277, 5.307499018, 4.151074969, 6.894632989, 4.731555880, 7.599050973, 3.409766036, 6.936460408, 4.947406689, 6.823078626, 4.979021138, 7.125182710, 4.548574222, 9.303837054, 4.070632602, 7.469734020, 4.867068792, 8.438200908, 4.588336461, 9.204201002, 7.677634124, 10.31267885, 5.580727779, 8.814457831, 4.857491046, 10.55896517, 4.913343507, 8.240369267, 6.647593036, 9.162657236, 8.355828851, 9.204265632, 7.843490860, 9.640628922, 5.293937431, 12.36862840, 5.324349598]:

plot(X,Y,style=point);

 

Figure 1   Data points to be fitted with y = a+b*ln(x). (Data hidden behind table: lists X and Y of values of the independent and dependent variables, respectively.)

Solution

• Define f, the logarithmic model function.

f := x -> a + b*ln(x)

• Form SS, the sum of squares of deviations:

SS := sum((f(X[k])-Y[k])^2, k = 1 .. 45)

Apply the LogarithmicFit command from Statistics

Statistics:-LogarithmicFit(X, Y, x, output = [leastsquaresfunction, residualsumofsquares])

[HFloat(2.1832670805931205)+HFloat(2.7341174136523545)*ln(x), HFloat(138.34802581127948)]

Apply the NonlinearFit command from Statistics

Statistics:-NonlinearFit(f(x), X, Y, x, output = [leastsquaresfunction, residualsumofsquares])

[HFloat(2.183267089444003)+HFloat(2.7341174133057136)*ln(x), 138.3480258]

Minimize SS via the Optimization package

Optimization:-Minimize(SS)

[138.34802581127, [a = HFloat(2.183267080593125), b = HFloat(2.7341174136523523)]]

Form and solve the normal equations, then evaluate the resulting SS 

Params := fsolve({diff(SS, a), diff(SS, b)}, {a, b}); eval(SS, Params)

138.3480257

The solution obtained by the linearization in LogarithmicFit closely matches the solutions obtained by methods that do not linearize.

Power Model

Example 5.

Fit y = a*x^b to the 45 data points shown in Figure 2.


X:=[.3311845858, .3382461426, .3596315385, .4171713246, .4172203361, .4373441686, .4871624354, .5074279163, .5970838801, .6782429163, .6822631566, .6857075734, .7123173532, .7928747878, .8613884024, .9423775810, 1.013359430, 1.015837175, 1.039779012, 1.073557773, 1.087726149, 1.104538148, 1.183382262, 1.288303209, 1.371372688, 1.411264277, 1.510435824, 1.541170242, 1.721992776, 1.884733992, 1.890335165, 1.935953579, 1.943825167, 2.302765894, 2.439615410, 2.443593963, 2.554861738, 2.652786753, 2.773245229, 2.775144735, 2.868025527, 2.888689619, 2.940485400, 2.987329661, 2.996133391]:

Y:=[3.901426454, 5.552778517, 3.682748534, 4.425793308, 2.581500466, 3.925060388, 3.308707232, 4.501901432, 2.869493552, 2.887041285, 1.829623341, 3.125460247, 1.775237930, 2.823371431, 2.220192758, 2.501806865, 1.981506660, 2.571558903, 1.946127380, 2.664282284, 1.885672629, 2.425196004, 1.599871000, 1.675009728, 1.282658772, 1.885755564, .8991034242, 1.773028886, 1.093699604, 1.411716974, .8965011818, 1.511418676, 1.255943074, 1.115462934, 1.071285035, 1.070063783, .7260572291, 1.111299304, .8814155598, 1.272545739, .5739487804, 1.237321990, .8460171871, 1.022644001, .9277633116]:

plot(X,Y,style=point);

 

Figure 2   Data points to be fitted with y = a*x^b. (Data hidden behind table: lists X and Y of values of the independent and dependent variables, respectively.)

Solution

• Context Menu: Assign Function

g(x) = a*x^b → g

• Form SS, the sum of squares of deviations:

SS := sum((g(X[k])-Y[k])^2, k = 1 .. 45)

Apply the PowerFit command from Statistics

S := Statistics:-PowerFit(X, Y, x, output = [leastsquaresfunction, residualsumofsquares, parametervalues])

[2.01022643676392/x^.734986123143025, 2.13617170057370, Vector(2, {(1) = .698247370832303, (2) = -.734986123143025})]

Evaluate SS for the linearized fit

eval(SS, [a = exp(S[3][1]), b = S[3][2]]) = HFloat(9.926744366845412)

Apply the NonlinearFit command from Statistics

Statistics:-NonlinearFit(g(x), X, Y, x, output = [leastsquaresfunction, residualsumofsquares])

[HFloat(2.0638656517819904)/x^HFloat(0.7119796871081435), 9.824989264]

Minimize SS via the Optimization package

Optimization:-Minimize(SS)

[9.82498926439639142, [a = HFloat(2.063865754024872), b = HFloat(-0.7119796194947002)]]

Form and solve the normal equations, then evaluate the resulting SS 

Params := fsolve({diff(SS, a), diff(SS, b)}, {a, b}); eval(SS, Params)

9.824989260

The parameters computed by the linearization in PowerFit differ slightly from those computed by the other methods, which do not linearize. The sum of squares of residuals returned by PowerFit is for the linearized model, not the nonlinear model; when the linearized parameters are substituted into the nonlinear sum of squares, the result (9.93) is slightly larger than the value for the nonlinear fits (9.82).

Exponential Model

Example 6.

Fit y = a*e^(b*x) to the 45 data points shown in Figure 3.


X:=[.3249060766, .4638590607, .4644519699, .4646954321, .5512620502, .5958265967, .6201785971, .6231171221, .6302313432, .6306638781, .6993610748, .7318673274, .8091876437, .9008821207, .9849002264, 1.064226572, 1.110383885, 1.259818068, 1.435261400, 1.453715259, 1.482196702, 1.532050477, 1.581696829, 1.712052277, 1.870325124, 1.872936109, 2.060662252, 2.074651185, 2.085638251, 2.114074877, 2.319376166, 2.357983108, 2.409166844, 2.417540344, 2.503005511, 2.523280179, 2.598296000, 2.603682807, 2.644623630, 2.662815612, 2.797604341, 2.896353971, 2.921562129, 2.927435786, 2.966131074]:

Y:=[1.593149583, 1.590034983, 1.011420580, 2.022496449, 1.223729582, 1.449732045, .9069637640, 1.422300135, 1.157919343, 1.543425077, 1.103220812, 1.437868212, .7945660884, 1.064526071, .9033515451, .9495079084, .8273875646, 1.159219310, .7323213934, 1.012091251, .5669217104, .8212180644, .5287792925, .6033309380, .3240342593, .5390709438, .3781516600, .6085085079, .3715978170, .5464110888, .2366337341, .3838739654, .3703637640, .4050191695, .3121357178, .3419301014, .1946629567, .3555389028, .1884514276, .4031447974, .2821896664, .3160107514, .2328625706, .2576745642, .1755520822]:

plot(X,Y,style=point);

 

Figure 3   Data points to be fitted with y = a*e^(b*x). (Data hidden behind table: lists X and Y of values of the independent and dependent variables, respectively.)

Solution

• Context Menu: Assign Function

h(x) = a*exp(b*x) → h

• Form SS, the sum of squares of deviations:

SS := sum((h(X[k])-Y[k])^2, k = 1 .. 45)

Apply the ExponentialFit command from Statistics

S := Statistics:-ExponentialFit(X, Y, x, output = [leastsquaresfunction, residualsumofsquares, parametervalues])

[2.02149034952439*exp(-.736048918381843*x), 2.37018488848112, Vector(2, {(1) = .703835036169110, (2) = -.736048918381843})]

Evaluate SS for the linearized fit

eval(SS, [a = exp(S[3][1]), b = S[3][2]]) = HFloat(1.464923581560898)

Apply the NonlinearFit command from Statistics

Statistics:-NonlinearFit(h(x), X, Y, x, output = [leastsquaresfunction, residualsumofsquares])

[HFloat(2.0399732765654055)*exp(-HFloat(0.7237873559839854)*x), 1.450953524]

Minimize SS via the Optimization package

Optimization:-Minimize(SS)

[35.1003272108939157, [a = HFloat(-52.56801063762647), b = HFloat(-138.82275650791343)]]

Form and solve the normal equations, then evaluate the resulting SS 

Params := fsolve({diff(SS, a), diff(SS, b)}, {a, b}, a = 1 .. 3, b = -1 .. 1); eval(SS, Params)

1.450953523

The parameters computed by the linearization in ExponentialFit differ slightly from those computed by the other methods, which do not linearize; substituted into the nonlinear sum of squares, the linearized parameters give 1.465, slightly larger than the 1.451 of the nonlinear fits. Note also that Minimize, invoked without a starting point, stalls at a spurious local minimum (SS ≈ 35.1); this is why fsolve was given the indicated ranges for a and b.

Michaelis-Menten Model

 

Example 7.

Fit v(s) = a*s/(b+s), the Michaelis-Menten model, to the 46 data points shown in Figure 4.


restart; S := [.50, .54, .67, 1.60, 1.88, 2.46, 2.80, 2.92, 3.15, 3.34, 4.10, 4.15, 4.18, 4.61, 4.75, 4.90, 5.05, 5.06, 5.43, 5.68, 5.75, 6.00, 6.29, 6.70, 7.04, 7.36, 8.28, 8.67, 9.14, 9.17, 9.54, 10.39, 10.64, 11.31, 11.50, 11.80, 12.18, 12.62, 13.89, 14.01, 14.10, 15.58, 15.75, 16.05, 16.28, 17.95]; V := [.484881, .539144, .614671, .752003, .851057, .944543, 1.070445, 1.261938, 1.167842, .740801, .993557, .820775, 1.062367, 1.142383, 1.430404, 1.144826, 1.258919, 1.472855, 1.466318, 1.290979, 1.592607, 1.140909, 1.410909, 1.383805, 1.756296, 1.563731, 1.694899, 1.603329, 2.052710, 1.648573, 1.860081, 2.007124, 1.845749, 2.336491, 2.088616, 1.924875, 1.836792, 1.945948, 2.170512, 2.089855, 2.174003, 2.287603, 2.379107, 2.593504, 2.422378, 2.497853]; p[1] := plot(S, V, style = point, symbol = circle, color = blue, labels = [s, v], tickmarks = [[0, 5, 10, 15, 20], [0, 1, 2, 3]], view = [0 .. 20, 0 .. 3]); p[1]

Figure 4   Data points to be fitted with v(s) = a*s/(b+s). (Data hidden behind table: lists S and V of values of the independent and dependent variables, respectively.)

This example is a summary of one that appears in the ebook Advanced Engineering Mathematics with Maple, an example that also appears in the Reporter article Nonlinear Fit, Optimization, and the DirectSearch Package. The fit is obtained with two different linearizations and nonlinearly, the outcome being that the linearized solutions provide decidedly poor fits.

Solution

• Define v(s):
Context Menu: Assign Function

v(s) = a*s/(b+s) → v

• Define SS, the sum of squares of deviations.

SS := sum((v(S[k])-V[k])^2, k = 1 .. 46)

Linearization 1

The linearization 1/v = 1/a+b/(a*s) requires the computation of the reciprocals z[k] = 1/s[k] and w[k] = 1/v[k].

Z := map(x -> 1/x, S); W := map(x -> 1/x, V)

Obtain the least-squares regression line w = A+B*z; here, a = 1/A and b = B/A.

L[1] := w = CurveFitting:-LeastSquares(Z, W, z)

w = HFloat(0.5458858614285285)+HFloat(0.8061236123410983)*z

a[1] := 1/coeff(rhs(L[1]), z, 0); b[1] := a[1]*coeff(rhs(L[1]), z, 1)

HFloat(1.4767255745212997)

v__1 := eval(v(s), [a = a[1], b = b[1]])

HFloat(1.8318847778601564)*s/(HFloat(1.4767255745212997)+s)

Linearization 2

The linearization v = a-b*(v/s) requires a list of the ratios lambda[k] = v[k]/s[k], with new independent variable lambda = v/s.

Lambda := zip((x, y) -> y/x, S, V)

Obtain the least-squares regression line w = A+B*lambda; here, a = A and b = -B.

L[2] := w = CurveFitting:-LeastSquares(Lambda, V, lambda)

w = HFloat(2.078083530506599)-HFloat(1.9214651430032514)*lambda

a[2] := coeff(rhs(L[2]), lambda, 0); b[2] := -coeff(rhs(L[2]), lambda, 1)

HFloat(1.9214651430032514)

v__2 := eval(v(s), [a = a[2], b = b[2]])

HFloat(2.078083530506599)*s/(HFloat(1.9214651430032514)+s)

Nonlinear Fit

Sol := Statistics:-NonlinearFit(v(s), S, V, s, output = [leastsquaresfunction, residualsumofsquares])

[HFloat(3.47770837575172)*s/(HFloat(8.224626958814595)+s), 1.622578611]

Optimization:-LSSolve([seq(v(S[k])-V[k], k = 1 .. 46)])

[.811289305695100, [a = HFloat(3.4777083908477047), b = HFloat(8.224627043788663)]]

fsolve({diff(SS, a) = 0, diff(SS, b) = 0}, {a, b}, a = 0 .. 10, b = 1 .. 10)

{a = 3.477708271, b = 8.224627001}

• Sum of squares for L[1]:

eval(SS, [a = a[1], b = b[1]]) = HFloat(6.431917451969999)

• Sum of squares for L[2]:

eval(SS, [a = a[2], b = b[2]]) = HFloat(4.526899123292298)

• Sum of squares for the nonlinear fitting function:

Sol[2] = 1.622578611

The NonlinearFit command from Statistics can return the fitting function, but the Optimization package's LSSolve command, whose input is a list of deviations (called residuals), returns the parameter values. In this example, it is even possible to form the normal equations and solve them numerically.

Figure 5 compares the graphs of the three fitting functions. Both from the graph and from the values of SS, it should be clear that linearizations do not necessarily provide the best fits to data.

 

p[2] := plot([v__1, v__2, Sol[1]], s = 1/2 .. 18, color = [red, green, black]); plots[display](p[1], p[2], view = [0 .. 20, 0 .. 3])

Figure 5   Nonlinear fit (black), L[1] (red), L[2] (green) superimposed on Figure 4

Underdetermined Case

Consistent

Example 8.

Fit f(x) = exp(a*x)*(b+ln(c+x)) to the two points (1, 3) and (3, 1).

 

This problem is essentially an interpolation, with the expected result being a one-parameter family of curves all going through the two given points. If f were a linear function, such an outcome, and the means to achieve it, would be clear. For this particular f, it is possible to find this one-parameter family of solutions, but in general, it might not be possible to implement the requisite manipulations.

 

Exact Solution

• Restart Maple.

restart

• Control-drag the equation f(x) = ...
Context Menu: Assign Function

f(x) = exp(a*x)*(b+ln(c+x)) → f

• Solve for b and c as functions of a, then obtain the one-parameter family of solutions.

q := solve({f(1) = 3, f(3) = 1}, {b, c})

simplify(eval(f(x), q))

-exp(a*x)*(ln(2)+ln(1/(-1+exp(-3*exp(-a))))-3*exp(-a)-ln((3-exp(-3*exp(-a))-x+x*exp(-3*exp(-a)))/(-1+exp(-3*exp(-a)))))

The LSSolve command in the Optimization package requires at least as many residuals as parameters; otherwise, an error results. Hence, it really does not apply here.
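A sketch of the failure mode (the call below is expected to raise the error just described rather than return a fit):

Optimization:-LSSolve([f(1) - 3]); # one residual for the three parameters a, b, c: LSSolve rejects this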

Inconsistent


Example 9.

Fit f(x) = exp(a*x)*(b+ln(c+x)) to the two points (1, 3) and (1, 5).

 

The equations f(1) = 3 and f(1) = 5 would necessarily be inconsistent, so this is not an interpolation, but a problem of fitting by least squares. A general solution consisting of a two-parameter family of curves is expected.

 

General Solution

• If necessary, define f.

f := x -> exp(a*x)*(b+ln(c+x))

• Define S, the sum of squares.

S := (f(1)-3)^2+(f(1)-5)^2

• Obtain and solve the normal equations. The parameters a and b are free, and c = c(a, b).

q := solve({diff(S, a), diff(S, b), diff(S, c)})

{a = a, b = b, c = exp(-(exp(a)*b-4)/exp(a))-1}

• Evaluate the sum of squares for this solution.

`assuming`([expand(eval(S, q))], [real]) = 2

• Obtain the general solution to the underdetermined, inconsistent least-squares problem.

eval(f(x), q) = exp(a*x)*(b+ln(exp(-(exp(a)*b-4)/exp(a))-1+x))

Casual inspection shows that every member of the general solution passes through (1, 4), whose ordinate is the mean of the two prescribed values 3 and 5.

 

As in Example 8, the LSSolve command in the Optimization package requires at least as many residuals as parameters; otherwise, an error results. Hence, it really does not apply here.

 

Linear Multivariate Models

This section considers linear multivariate models that are cast in the form A*u = v. As per Table 4, the examples are classified by the properties of the r × c matrix A and the vector v.

• Initialize Maple.

restart; with(LinearAlgebra)

r < c

Full Rank

Define the full-rank matrix

Af := Matrix(3, 4, {(1, 1) = 9, (1, 2) = -5, (1, 3) = -72, (1, 4) = -85, (2, 1) = 45, (2, 2) = 47, (2, 3) = -79, (2, 4) = -19, (3, 1) = -10, (3, 2) = -54, (3, 3) = 75, (3, 4) = 57})

for which Rank(Af) = 3.

Consistent

Example 10.

Solve the least-squares problem A*u = v, where A = Af and v = <1, 2, 3>.

Since there are fewer equations than variables, the system A*u = v cannot be inconsistent. Each row in A represents the left-hand side of a distinct equation, so no matter what appears on the right, the equations in the system must be consistent.

 

Solution

• Apply the LeastSquares command, designating s as the base name for free variables. The result is the general solution, containing one free parameter.

Ugen := LeastSquares(Af, <1, 2, 3>, free = s)

Vector(4, {(1) = 23661/76427+(197391/76427)*s[2], (2) = s[2], (3) = 14011/76427+(193352/76427)*s[2], (4) = -10262/76427-(147376/76427)*s[2]})

• Obtain the minimum-norm least-squares solution. This is the projection of the general solution onto the row space of Af.

LeastSquares(Af, <1, 2, 3>, optimize)

Vector(4, {(1) = 613579001/6927264966, (2) = -1778379167/20781794898, (3) = -344638939/10390897449, (4) = 319439654/10390897449})

Obtain P, the matrix that projects onto the row space of Af.

• The columns of N are a basis for the row space.

N := Matrix(map(Transpose, RowSpace(Af))); P := N.(1/(N^%T.N)).N^%T

• Project Ugen onto the row space. The result is the minimum-norm solution, a vector that lies in the row space of Af.

P.Ugen = Vector(4, {(1) = 613579001/6927264966, (2) = -1778379167/20781794898, (3) = -344638939/10390897449, (4) = 319439654/10390897449})

Rank-Deficient

Define the rank-deficient matrix

Ad := Matrix(3, 4, {(1, 1) = -18, (1, 2) = -9, (1, 3) = 63, (1, 4) = 72, (2, 1) = -11, (2, 2) = -8, (2, 3) = 37, (2, 4) = 33, (3, 1) = -3, (3, 2) = 1, (3, 3) = 12, (3, 4) = 23})

for which Rank(Ad) = 2.

Consistent

Example 11.

Solve the least-squares problem A*u = v, where A = Ad and v = <9, 3, 4>.

That the system is consistent can be seen from

ReducedRowEchelonForm(<Ad | <9, 3, 4>>) = Matrix(3, 5, {(1, 1) = 1, (1, 2) = 0, (1, 3) = -19/5, (1, 4) = -31/5, (1, 5) = -1, (2, 1) = 0, (2, 2) = 1, (2, 3) = 3/5, (2, 4) = 22/5, (2, 5) = 1, (3, 1) = 0, (3, 2) = 0, (3, 3) = 0, (3, 4) = 0, (3, 5) = 0})

General solution:

LeastSquares(Ad, <9, 3, 4>, free = s) = Vector(4, {(1) = -1+(19/5)*s[3]+(31/5)*s[4], (2) = 1-(3/5)*s[3]-(22/5)*s[4], (3) = s[3], (4) = s[4]})

Minimum-norm solution:

LeastSquares(Ad, <9, 3, 4>, optimize) = Vector(4, {(1) = 221/6065, (2) = 608/6065, (3) = -95/1213, (4) = 261/1213})

Inconsistent

Example 12.

Solve the least-squares problem A*u = v, where A = Ad and v = <1, 2, 3>.

That the system is inconsistent can be seen from

ReducedRowEchelonForm(<Ad | <1, 2, 3>>) = Matrix(3, 5, {(1, 1) = 1, (1, 2) = 0, (1, 3) = -19/5, (1, 4) = -31/5, (1, 5) = 0, (2, 1) = 0, (2, 2) = 1, (2, 3) = 3/5, (2, 4) = 22/5, (2, 5) = 0, (3, 1) = 0, (3, 2) = 0, (3, 3) = 0, (3, 4) = 0, (3, 5) = 1})

General solution:

LeastSquares(Ad, <1, 2, 3>, free = s) = Vector(4, {(1) = -344/1055+(19/5)*s[3]+(31/5)*s[4], (2) = 423/1055-(3/5)*s[3]-(22/5)*s[4], (3) = s[3], (4) = s[4]})

Minimum-norm solution:

LeastSquares(Ad, <1, 2, 3>, optimize) = Vector(4, {(1) = 26881/1279715, (2) = 63113/1279715, (3) = -12856/255943, (4) = 22207/255943})

r = c

Full Rank

A square system with a full-rank matrix is necessarily consistent, and therefore has a unique solution. There cannot be a least-squares problem in this case.

Rank-Deficient

Define the rank-deficient matrix

Bd := Matrix(4, 4, {(1, 1) = -6, (1, 2) = 42, (1, 3) = 30, (1, 4) = -24, (2, 1) = 22, (2, 2) = 6, (2, 3) = 2, (2, 4) = 32, (3, 1) = 30, (3, 2) = -50, (3, 3) = -38, (3, 4) = 64, (4, 1) = -14, (4, 2) = -2, (4, 3) = 0, (4, 4) = -21})

for which Rank(Bd) = 2.

Consistent

Example 13.

Solve the least-squares problem A*u = v, where A = Bd and v = <3, 2, -2, -9/8>.

That the system is consistent can be seen from

ReducedRowEchelonForm(<Bd | <3, 2, -2, -9/8>>) = Matrix(4, 5, {(1, 1) = 1, (1, 2) = 0, (1, 3) = -1/10, (1, 4) = 31/20, (1, 5) = 11/160, (2, 1) = 0, (2, 2) = 1, (2, 3) = 7/10, (2, 4) = -7/20, (2, 5) = 13/160, (3, 1) = 0, (3, 2) = 0, (3, 3) = 0, (3, 4) = 0, (3, 5) = 0, (4, 1) = 0, (4, 2) = 0, (4, 3) = 0, (4, 4) = 0, (4, 5) = 0})

General solution:

LeastSquares(Bd, <3, 2, -2, -9/8>, free = s) = Vector(4, {(1) = 3/7-(31/7)*s[2]-3*s[3], (2) = s[2], (3) = s[3], (4) = (20/7)*s[2]+2*s[3]-13/56})

Minimum-norm solution:

LeastSquares(Bd, <3, 2, -2, -9/8>, optimize) = Vector(4, {(1) = 257/8204, (2) = 73/1172, (3) = 83/2051, (4) = 439/16408}) → Vector(4, {(1) = 0.3132618235e-1, (2) = 0.6228668942e-1, (3) = 0.4046806436e-1, (4) = 0.2675524135e-1})

 

Because this system is essentially just an underdetermined one, a general solution is also available with the LinearSolve command in the LinearAlgebra package.

LinearSolve(Bd, <3, 2, -2, -9/8>, free = s) = Vector(4, {(1) = s[1], (2) = -7*s[1]+9/16-(21/2)*s[4], (3) = 10*s[1]-11/16+(31/2)*s[4], (4) = s[4]})

Seek a least-squares solution numerically:

Un := LeastSquares(evalf(Bd), evalf(<3, 2, -2, -9/8>)) = Vector(4, {(1) = .940625000000000, (2) = -.115625000000000, (3) = 0., (4) = -.562500000000000})

 

That this is a member of the general solution can be seen by projecting it onto the row space of Bd.

• The columns of N are a basis for the row space.

N := Matrix(map(Transpose, RowSpace(Bd))); P := N.(1/(N^%T.N)).N^%T

• Project Un onto the row space. The result is the minimum-norm solution, a vector that lies in the row space of Bd.

P.Un = Vector(4, {(1) = 0.313261823500731e-1, (2) = 0.622866894197952e-1, (3) = 0.404680643588493e-1, (4) = 0.267552413456850e-1})

 

However, to obtain the minimum-norm solution numerically in a single step, specify that the calculation is to be based on the singular value decomposition rather than on the default QR decomposition.

LeastSquares(evalf(Bd), evalf(<3, 2, -2, -9/8>), method = SVD) = Vector(4, {(1) = 0.313261823500731e-1, (2) = 0.622866894197952e-1, (3) = 0.404680643588493e-1, (4) = 0.267552413456850e-1})


Inconsistent

Example 14.

Solve the least-squares problem A*u = v, where A = Bd and v = <1, 2, 3, 4>.

That the system is inconsistent can be seen from

ReducedRowEchelonForm(<Bd | <1, 2, 3, 4>>) = Matrix(4, 5, {(1, 1) = 1, (1, 2) = 0, (1, 3) = -1/10, (1, 4) = 31/20, (1, 5) = 0, (2, 1) = 0, (2, 2) = 1, (2, 3) = 7/10, (2, 4) = -7/20, (2, 5) = 0, (3, 1) = 0, (3, 2) = 0, (3, 3) = 0, (3, 4) = 0, (3, 5) = 1, (4, 1) = 0, (4, 2) = 0, (4, 3) = 0, (4, 4) = 0, (4, 5) = 0})

General solution:

LeastSquares(Bd, <1, 2, 3, 4>, free = s) = Vector(4, {(1) = -404/19341-(31/7)*s[2]-3*s[3], (2) = s[2], (3) = s[3], (4) = (20/7)*s[2]+2*s[3]+668/19341})

Minimum-norm solution:

LeastSquares(Bd, <1, 2, 3, 4>, optimize) = Vector(4, {(1) = 49996/5666913, (2) = -3356/809559, (3) = -7148/1888971, (4) = 9524/629657}) → Vector(4, {(1) = 0.8822440013e-2, (2) = -0.4145466853e-2, (3) = -0.3784070798e-2, (4) = 0.1512569542e-1})

Seek a least-squares solution numerically:

LeastSquares(evalf(Bd), evalf(<1, 2, 3, 4>)) = Vector(4, {(1) = 0.6804415947e16, (2) = -0.6760826063e15, (3) = -0.1270111944e16, (4) = -0.4471888478e16})

The default method, based on a QR decomposition, utterly fails because this decomposition does not have an efficient way to determine rank. For problems such as this (and Example 13), specify the method as the one based on the singular value decomposition.

LeastSquares(evalf(Bd), evalf(<1, 2, 3, 4>), method = SVD) = Vector(4, {(1) = 0.882244001275476e-2, (2) = -0.414546685294091e-2, (3) = -0.378407079833411e-2, (4) = 0.151256954182992e-1})

Alternatively, use the LinearFit command from the Statistics package. Although this command is based on the LeastSquares command from LinearAlgebra, there is an additional wrapper that attempts to deal with the issues raised by numeric calculations.

Statistics:-LinearFit([x, y, z, w], evalf(Bd), evalf(`<,>`(1, 2, 3, 4)), [x, y, z, w])

HFloat(0.008822440012754743)*x-HFloat(0.004145466852940909)*y-HFloat(0.0037840707983341102)*z+HFloat(0.015125695418299175)*w

The deficiency in rank of the matrix Bd has been detected, and the calculation is based on the singular value decomposition. The control is via the ratio of the smallest to the largest singular values, which is the reciprocal of an estimated condition number for the input matrix. If this ratio is smaller than the default threshold 10^(-12), the matrix is deemed to be ill-conditioned, and the least-squares calculation is based on the singular value decomposition. This default threshold is modified with the svdtolerance parameter.

S := SingularValues(evalf(Bd)); ReciprocalEstimatedConditionNumber = S[4]/S[1]

ReciprocalEstimatedConditionNumber = HFloat(1.1165724997803094e-17)

infolevel[Statistics] := 5; Statistics:-LinearFit([x, y, z, w], evalf(Bd), evalf(`<,>`(1, 2, 3, 4)), [x, y, z, w])

HFloat(0.008822440012754743)*x-HFloat(0.004145466852940909)*y-HFloat(0.0037840707983341102)*z+HFloat(0.015125695418299175)*w

Statistics:-LinearFit([x, y, z, w], evalf(Bd), evalf(`<,>`(1, 2, 3, 4)), [x, y, z, w], svdtolerance = 0.1e-16); infolevel[Statistics] := 0

HFloat(1.767497815582338e14)*x-HFloat(2.5632047596050895e13)*y-HFloat(2.1078809306193355e13)*z-HFloat(1.1539204031538916e14)*w

The reciprocal of the estimated condition number is slightly larger than 10^(-17), well below the default threshold of 10^(-12), so the first least-squares calculation is based on the singular value decomposition. In the second calculation, svdtolerance is lowered to 10^(-17); the reciprocal of the estimated condition number is slightly larger than this threshold, so the calculation is based on the default QR decomposition, and consequently fails.

r > c

Full Rank

Define the full-rank matrix

Cf := Matrix(4, 3, {(1, 1) = -74, (1, 2) = -60, (1, 3) = 35, (2, 1) = 13, (2, 2) = 51, (2, 3) = -54, (3, 1) = 32, (3, 2) = 20, (3, 3) = -17, (4, 1) = 48, (4, 2) = -46, (4, 3) = -25})

for which Rank(Cf) = 3.

Consistent

Example 15.

Solve the least-squares problem A*u = v, where A = Cf and v = <1, 30, 1, 50091/13319>.

That the system is consistent can be seen from

ReducedRowEchelonForm(<Cf | <1, 30, 1, 50091/13319>>) = Matrix(4, 4, {(1, 1) = 1, (1, 2) = 0, (1, 3) = 0, (1, 4) = -3966/13319, (2, 1) = 0, (2, 2) = 1, (2, 3) = 0, (2, 4) = -454/13319, (3, 1) = 0, (3, 2) = 0, (3, 3) = 1, (3, 4) = -8783/13319, (4, 1) = 0, (4, 2) = 0, (4, 3) = 0, (4, 4) = 0})

Consequently, this is not a least-squares problem, but a properly determined system with a unique solution, obtainable, for example, by LinearSolve in LinearAlgebra.

LinearSolve(Cf, <1, 30, 1, 50091/13319>) = Vector(3, {(1) = -3966/13319, (2) = -454/13319, (3) = -8783/13319})

Inconsistent

Example 16.

Solve the least-squares problem A*u = v, where A = Cf and v = <1, 2, 3, 4>.

That the system is inconsistent can be seen from

ReducedRowEchelonForm(<Cf | <1, 2, 3, 4>>) = Matrix(4, 4, {(1, 1) = 1, (1, 2) = 0, (1, 3) = 0, (1, 4) = 0, (2, 1) = 0, (2, 2) = 1, (2, 3) = 0, (2, 4) = 0, (3, 1) = 0, (3, 2) = 0, (3, 3) = 1, (3, 4) = 0, (4, 1) = 0, (4, 2) = 0, (4, 3) = 0, (4, 4) = 1})

Because the matrix is of full rank, the null space is trivial, and the least-squares solution is unique.

U := LeastSquares(Cf, <1, 2, 3, 4>) = Vector(3, {(1) = -92649103/54194075900, (2) = -2590616723/54194075900, (3) = -1141568146/13548518975}) → Vector(3, {(1) = -0.1709579903e-2, (2) = -0.4780258137e-1, (3) = -0.8425778110e-1})

In general, the sum-of-squares of residuals is given by LinearAlgebra[Norm](A.u-v, 2)^2.

LinearAlgebra[Norm](Cf.U-<1, 2, 3, 4>, 2)^2 = 206391215809/27097037950 → 7.616744538

Alternatively, the solution can also be found with the LSSolve command in the Optimization package.

S := Optimization:-LSSolve([<1, 2, 3, 4>, Cf])

[3.80837226913578464, Vector(3, {(1) = -0.170957990262550e-2, (2) = -0.478025813703376e-1, (3) = -0.842577810981735e-1})]

The first member of the output list is half the sum-of-squares of the residuals; doubling this number gives 2*S[1] = 7.616744538.

Rank-Deficient

Define the rank-deficient matrix

Cd := Matrix(4, 3, {(1, 1) = -84, (1, 2) = -11, (1, 3) = 77, (2, 1) = -41, (2, 2) = 80, (2, 3) = 46, (3, 1) = 67, (3, 2) = 13, (3, 3) = -61, (4, 1) = 70, (4, 2) = 21, (4, 3) = -63})

for which Rank(Cd) = 2.

Consistent

Example 17.

Solve the least-squares problem A*u = v, where A = Cd and v = <101, 101, -78, -77>.

That the system is consistent can be seen from

ReducedRowEchelonForm(<Cd | <101, 101, -78, -77>>) = Matrix(4, 4, {(1, 1) = 1, (1, 2) = 0, (1, 3) = -66/71, (1, 4) = -91/71, (2, 1) = 0, (2, 2) = 1, (2, 3) = 7/71, (2, 4) = 43/71, (3, 1) = 0, (3, 2) = 0, (3, 3) = 0, (3, 4) = 0, (4, 1) = 0, (4, 2) = 0, (4, 3) = 0, (4, 4) = 0})

General solution:

LeastSquares(Cd, <101, 101, -78, -77>, free = s) = Vector(3, {(1) = 31/7-(66/7)*s[2], (2) = s[2], (3) = -(71/7)*s[2]+43/7})

Minimum-norm solution:

LeastSquares(Cd, <101, 101, -78, -77>, optimize) = Vector(3, {(1) = -3122/4723, (2) = 5099/9446, (3) = 6307/9446})

The general solution of an overdetermined but consistent system can also be found with the LinearSolve command from LinearAlgebra.

LinearSolve(Cd, <101, 101, -78, -77>, free = sigma) = Vector(3, {(1) = sigma[1], (2) = -(7/66)*sigma[1]+31/66, (3) = (71/66)*sigma[1]+91/66})

It is left to the reader to show that by appropriately redefining the free parameter in one general solution, the other will be obtained.

Inconsistent

Example 18.

Solve the least-squares problem A*u = v, where A = Cd and v = <1, 2, 3, 4>.

That the system is inconsistent can be seen from

ReducedRowEchelonForm(<Cd | <1, 2, 3, 4>>) = Matrix(4, 4, {(1, 1) = 1, (1, 2) = 0, (1, 3) = -66/71, (1, 4) = 0, (2, 1) = 0, (2, 2) = 1, (2, 3) = 7/71, (2, 4) = 0, (3, 1) = 0, (3, 2) = 0, (3, 3) = 0, (3, 4) = 1, (4, 1) = 0, (4, 2) = 0, (4, 3) = 0, (4, 4) = 0})

 

General solution:

LeastSquares(Cd, <1, 2, 3, 4>, free = s) = Vector(3, {(1) = 22579/59829-(66/7)*s[2], (2) = s[2], (3) = -(71/7)*s[2]+7723/19943})

Minimum-norm solution:

LeastSquares(Cd, <1, 2, 3, 4>, optimize) = Vector(3, {(1) = 454084/40367481, (2) = 1045071/26911654, (3) = -178369/26911654}) → Vector(3, {(1) = 0.1124875738e-1, (2) = 0.3883339909e-1, (3) = -0.6627946391e-2})

Numeric linear algebra:

LeastSquares(evalf(Cd), evalf(<1, 2, 3, 4>)) = Vector(3, {(1) = -0.1450119183e15, (2) = 0.1538005194e14, (3) = -0.1559976697e15})

This calculation fails because the default QR-based method does not recognize the rank deficiency of the matrix. The more robust SVD-based method must be invoked.

LeastSquares(evalf(Cd), evalf(<1, 2, 3, 4>), method = SVD) = Vector(3, {(1) = 0.112487573846879e-1, (2) = 0.388333990917095e-1, (3) = -0.662794639080900e-2})

 

The LSSolve command from the Optimization package can find only a local solution, that is, one member of the general solution family.

S := Optimization:-LSSolve([evalf(<1, 2, 3, 4>), evalf(Cd)])

[7.06546156546156556, Vector(3, {(1) = 0.174099469874118e-1, (2) = 0.381799395883903e-1, (3) = 0.})]

Project this solution onto the row space of Cd.

• The columns of N are a basis for the row space; P projects onto the row space.

N := Matrix(map(Transpose, RowSpace(Cd))); P := N.(1/(N^%T.N)).N^%T

• The projection is the minimum-norm solution.

P.S[2] = Vector(3, {(1) = 0.112487573846879e-1, (2) = 0.388333990917095e-1, (3) = -0.662794639080898e-2})

 

In this example, the LinearFit command from the Statistics package finds the minimum-norm solution, but this outcome depends on the relative values of the default setting of the svdtolerance parameter and the reciprocal of the approximate condition number computed for Cd.

Statistics:-LinearFit([x, y, z], evalf(Cd), evalf(`<,>`(1, 2, 3, 4)), [x, y, z])

HFloat(0.01124875738468794)*x+HFloat(0.038833399091709485)*y-HFloat(0.006627946390808976)*z

• The rank deficiency of Cd has been detected, and the SVD-based method invoked. The minimum-norm solution is returned.
• The reciprocal of the approximate condition number of Cd:

SV := SingularValues(evalf(Cd)) = Vector(4, {(1) = 183.938056091007, (2) = 84.6096420123703, (3) = 0.1035656303e-13, (4) = 0.})

ReciprocalConditionNumber = SV[4]/SV[1]

ReciprocalConditionNumber = HFloat(0.0)

• This value is smaller than 10^(-12), the default setting of the svdtolerance parameter, so the more robust SVD-based method is invoked.

Nonlinear Multivariate Fit

Overdetermined Case

The first two columns of the matrix

M := Matrix(5, 3, {(1, 1) = 2.2467, (1, 2) = 5.2219, (1, 3) = 6.5622, (2, 1) = 2.0083, (2, 2) = 6.0656, (2, 3) = 6.3261, (3, 1) = 5.8386, (3, 2) = 1.1084, (3, 3) = 11.942, (4, 1) = 7.7071, (4, 2) = 5.9855, (4, 3) = 32.096, (5, 1) = 4.6193, (5, 2) = 4.6921, (5, 3) = 15.297})

are the abscissas and ordinates, respectively, of five data points (x[k], y[k]), k = 1, ..., 5. The numbers in the third column are five corresponding observations z[k] = f(x[k], y[k]).

 

Example 19.

Fit the function f(x, y) = a*x^b*y^c to the data in M.

Since Rank(M) = 3, these data points generate a set of overdetermined nonlinear equations that are necessarily inconsistent. In contrast to the linear case, there is no exact functionality for obtaining a least-squares fit for nonlinear equations; the numeric tools of Statistics and Optimization are the only ones that apply.

 

Solution

• Specify the nonlinear model: define the function f.

f := (x, y) -> a*x^b*y^c

• Form SS, the sum of squares of residuals.

SS := add((f(M[k, 1], M[k, 2])-M[k, 3])^2, k = 1 .. 5)

Apply the NonlinearFit command from Statistics 

Statistics:-NonlinearFit(f(x, y), M, [x, y], output = [leastsquaresfunction, residualsumofsquares])

[HFloat(1.2888320485994063)*x^HFloat(1.2374513261289097)*y^HFloat(0.3836351476790401), 0.9674158460e-1]

Apply the LSSolve command from Optimization 

S := Optimization:-LSSolve([seq(f(M[k, 1], M[k, 2])-M[k, 3], k = 1 .. 5)])

[0.483707922990616e-1, [a = HFloat(1.2888320485994063), b = HFloat(1.2374513261289097), c = HFloat(0.3836351476790401)]]

• Half the sum of squares is given by S[1]; double it to get the minimized SS.

2*S[1] = 0.9674158460e-1

Apply the Minimize command from Optimization 

Optimization:-Minimize(SS, iterationlimit = 2000)

[0.967415845997482982e-1, [a = HFloat(1.2888318283295859), b = HFloat(1.2374514292696492), c = HFloat(0.383635132900081)]]

The results from all three approaches are fairly consistent.

Underdetermined Case

Consistent

The first two columns of the matrix

M := Matrix(2, 3, {(1, 1) = 2.2467, (1, 2) = 5.2219, (1, 3) = 6.5622, (2, 1) = 2.0083, (2, 2) = 6.0656, (2, 3) = 6.3261})

are the abscissas and ordinates, respectively, of two data points (x[k], y[k]), k = 1, 2. The numbers in the third column are two corresponding observations z[k] = f(x[k], y[k]).

Example 20.

Fit the function f(x, y) = a*x^b*y^c to the data in M.

Since the data generate a set of two equations in three unknown parameters, this is an interpolation problem in which Rank(M) = 2 suggests there will be a general solution with one free parameter. In the nonlinear case, there is no theory by which a (unique) minimum-norm solution is extracted.

 

Solution

• Specify the nonlinear model by defining f.

f := (x, y) -> a*x^b*y^c

• From the two given data points, form two equations in the three unknown parameters.

q[1] := f(M[1, 1], M[1, 2]) = M[1, 3]; q[2] := f(M[2, 1], M[2, 2]) = M[2, 3]

a*2.0083^b*6.0656^c = 6.3261

• Solve the two equations for any two parameters in terms of the third. Here, a is the free parameter.

S := solve({q[1], q[2]}, {b, c})

{b = -5.878610145*ln(.1523879187*a)+5.390184700*ln(.1580752754*a), c = -2.639756998*ln(.1580752754*a)+2.273944135*ln(.1523879187*a)}

• The general solution is a fitting function dependent on one free parameter:

`assuming`([('F')(x, y) = simplify(eval(f(x, y), S))], [a::real, x > 0, y > 0])

F(x, y) = a*x^(558197239/500000000)*a^(-.4884254450*ln(x)-.3658128630*ln(y))*y^(591487297/1000000000)

Numeric solutions that seek to minimize a sum-of-squares of residuals return, at best, individual members of this family of solutions.

Inconsistent

Example 21.

Fit f(x, y) = a*x^b*y^c to the two points (2.2467, 5.2219, 6.5622) and (2.2467, 5.2219, 6.3261).

The data determine two inconsistent equations in the three unknown parameters {a, b, c}. This is no longer an interpolation; it is a least-squares problem.

 

Solution

• Specify the nonlinear model by defining f.

f := (x, y) -> a*x^b*y^c

• Define P and Q, the two data points.

P := [2.2467, 5.2219, 6.5622]; Q := [2.2467, 5.2219, 6.3261]

• Form SS, the sum of squares of residuals.

SS := (f(P[1], P[2])-P[3])^2+(f(Q[1], Q[2])-Q[3])^2

• Form and solve the three normal equations.

q := solve({diff(SS, a) = 0, diff(SS, b) = 0, diff(SS, c) = 0})

{a = a, b = b, c = .6050114354*ln(6.444150000/a)-.4897340527*b}

• The general solution is a fitting function dependent on two free parameters:

('F')(x, y) = eval(f(x, y), q)

F(x, y) = a*x^b*y^(.6050114354*ln(6.444150000/a)-.4897340527*b)

Numeric solutions that seek to minimize SS return, at best, individual members of this family of solutions.


Legal Notice: © Maplesoft, a division of Waterloo Maple Inc. 2012. Maplesoft and Maple are trademarks of Waterloo Maple Inc. This application may contain errors and Maplesoft is not liable for any damages resulting from the use of this material. This application is intended for non-commercial, non-profit use only. Contact Maplesoft for permission if you wish to use this application in for-profit activities.