org.apache.commons.math3.stat.regression
Class AbstractMultipleLinearRegression

java.lang.Object
  extended by org.apache.commons.math3.stat.regression.AbstractMultipleLinearRegression
All Implemented Interfaces:
MultipleLinearRegression
Direct Known Subclasses:
GLSMultipleLinearRegression, OLSMultipleLinearRegression

public abstract class AbstractMultipleLinearRegression
extends Object
implements MultipleLinearRegression

Abstract base class for implementations of MultipleLinearRegression.

Since:
2.0
Version:
$Id: AbstractMultipleLinearRegression.java 7721 2013-02-14 14:07:13Z CardosoP $

Constructor Summary
AbstractMultipleLinearRegression()
           
 
Method Summary
protected abstract  RealVector calculateBeta()
          Calculates the beta of multiple linear regression in matrix notation.
protected abstract  RealMatrix calculateBetaVariance()
          Calculates the beta variance of multiple linear regression in matrix notation.
protected  double calculateErrorVariance()
          Calculates the variance of the error term.
protected  RealVector calculateResiduals()
          Calculates the residuals of multiple linear regression in matrix notation.
protected  double calculateYVariance()
          Calculates the variance of the y values.
 double estimateErrorVariance()
          Estimates the variance of the error.
 double estimateRegressandVariance()
          Returns the variance of the regressand, i.e. Var(y).
 double[] estimateRegressionParameters()
          Estimates the regression parameters b.
 double[] estimateRegressionParametersStandardErrors()
          Returns the standard errors of the regression parameters.
 double[][] estimateRegressionParametersVariance()
          Estimates the variance of the regression parameters, i.e. Var(b).
 double estimateRegressionStandardError()
          Estimates the standard error of the regression.
 double[] estimateResiduals()
          Estimates the residuals, i.e. u = y - X*b.
protected  RealMatrix getX()
           
protected  RealVector getY()
           
 boolean isNoIntercept()
           
 void newSampleData(double[] data, int nobs, int nvars)
          Loads model x and y sample data from a flat input array, overriding any previous sample.
protected  void newXSampleData(double[][] x)
          Loads new x sample data, overriding any previous data.
protected  void newYSampleData(double[] y)
          Loads new y sample data, overriding any previous data.
 void setNoIntercept(boolean noIntercept)
           
protected  void validateCovarianceData(double[][] x, double[][] covariance)
          Validates that the x data and covariance matrix have the same number of rows and that the covariance matrix is square.
protected  void validateSampleData(double[][] x, double[] y)
          Validates sample data.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

AbstractMultipleLinearRegression

public AbstractMultipleLinearRegression()
Method Detail

getX

protected RealMatrix getX()
Returns:
the X sample data.

getY

protected RealVector getY()
Returns:
the Y sample data.

isNoIntercept

public boolean isNoIntercept()
Returns:
true if the model has no intercept term; false otherwise
Since:
2.2

setNoIntercept

public void setNoIntercept(boolean noIntercept)
Parameters:
noIntercept - true means the model is to be estimated without an intercept term
Since:
2.2

newSampleData

public void newSampleData(double[] data,
                          int nobs,
                          int nvars)

Loads model x and y sample data from a flat input array, overriding any previous sample.

Assumes that rows are concatenated with y values first in each row. For example, an input data array containing the sequence of values (1, 2, 3, 4, 5, 6, 7, 8, 9) with nobs = 3 and nvars = 2 creates a regression dataset with two independent variables, as below:

   y   x[0]  x[1]
   --------------
   1     2     3
   4     5     6
   7     8     9
 

Note that there is no need to add an initial unitary column (column of 1's) when specifying a model including an intercept term. If isNoIntercept() is true, the X matrix will be created without an initial column of "1"s; otherwise this column will be added.

Throws a subclass of IllegalArgumentException if any of the preconditions listed under Throws below fails.

Parameters:
data - input data array
nobs - number of observations (rows)
nvars - number of independent variables (columns, not counting y)
Throws:
NullArgumentException - if the data array is null
DimensionMismatchException - if the length of the data array is not equal to nobs * (nvars + 1)
NumberIsTooSmallException - if nobs is not larger than nvars
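
The flat layout described above can be illustrated with a small stand-alone sketch. It is not the library's implementation: the class FlatSampleLayout and its methods are hypothetical names, and a plain IllegalArgumentException stands in for the library's exception types. It only shows how a row-major array with y first in each row maps to a y vector and an X matrix, with the column of 1's added unless noIntercept is true.

```java
import java.util.Arrays;

/** Hypothetical helper that mimics the flat (y, x[0], ..., x[nvars-1]) row layout. */
final class FlatSampleLayout {

    /** Returns the first value of each row, i.e. the y column. */
    static double[] extractY(double[] data, int nobs, int nvars) {
        checkLength(data, nobs, nvars);
        double[] y = new double[nobs];
        for (int i = 0; i < nobs; i++) {
            y[i] = data[i * (nvars + 1)];
        }
        return y;
    }

    /** Returns the X matrix, prepending a column of 1's unless noIntercept is true. */
    static double[][] extractX(double[] data, int nobs, int nvars, boolean noIntercept) {
        checkLength(data, nobs, nvars);
        int offset = noIntercept ? 0 : 1;
        double[][] x = new double[nobs][nvars + offset];
        for (int i = 0; i < nobs; i++) {
            if (!noIntercept) {
                x[i][0] = 1.0; // intercept column
            }
            for (int j = 0; j < nvars; j++) {
                // Skip the leading y value of row i, then take regressor j.
                x[i][j + offset] = data[i * (nvars + 1) + 1 + j];
            }
        }
        return x;
    }

    /** Stand-in for the documented null and length preconditions. */
    private static void checkLength(double[] data, int nobs, int nvars) {
        if (data == null) {
            throw new IllegalArgumentException("data is null");
        }
        if (data.length != nobs * (nvars + 1)) {
            throw new IllegalArgumentException("data.length must equal nobs * (nvars + 1)");
        }
    }

    public static void main(String[] args) {
        double[] data = {1, 2, 3, 4, 5, 6, 7, 8, 9};
        System.out.println(Arrays.toString(extractY(data, 3, 2)));
        System.out.println(Arrays.deepToString(extractX(data, 3, 2, false)));
    }
}
```

With the nine-value example above, y holds the values 1, 4, 7 and each row of X starts with the added 1 followed by that row's two regressors.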

newYSampleData

protected void newYSampleData(double[] y)
Loads new y sample data, overriding any previous data.

Parameters:
y - the array representing the y sample
Throws:
NullArgumentException - if y is null
NoDataException - if y is empty

newXSampleData

protected void newXSampleData(double[][] x)

Loads new x sample data, overriding any previous data.

The input x array should have one row for each sample observation, with columns corresponding to independent variables. For example, if
  x = new double[][] {{1, 2}, {3, 4}, {5, 6}} 
then newXSampleData(x) results in a model with two independent variables and 3 observations:
   x[0]  x[1]
   ----------
     1    2
     3    4
     5    6
 

Note that there is no need to add an initial unitary column (column of 1's) when specifying a model including an intercept term.

Parameters:
x - the rectangular array representing the x sample
Throws:
NullArgumentException - if x is null
NoDataException - if x is empty
DimensionMismatchException - if x is not rectangular
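
As a rough illustration of the behaviour described for newXSampleData, the sketch below builds a design matrix from a rectangular x array. The class DesignMatrixSketch is hypothetical, a plain IllegalArgumentException stands in for the library's exception types, and the handling of the intercept column is assumed to match what is documented for newSampleData (a column of 1's is added unless the model has no intercept).

```java
import java.util.Arrays;

/** Hypothetical helper: turns raw x sample data into a design matrix. */
final class DesignMatrixSketch {

    static double[][] toDesignMatrix(double[][] x, boolean noIntercept) {
        if (x == null) {
            throw new IllegalArgumentException("x is null");
        }
        if (x.length == 0) {
            throw new IllegalArgumentException("x is empty");
        }
        int nvars = x[0].length;
        int offset = noIntercept ? 0 : 1;
        double[][] design = new double[x.length][nvars + offset];
        for (int i = 0; i < x.length; i++) {
            if (x[i].length != nvars) {
                throw new IllegalArgumentException("x is not rectangular");
            }
            if (!noIntercept) {
                design[i][0] = 1.0; // column of 1's added automatically
            }
            // Copy the regressors of observation i after the optional intercept.
            System.arraycopy(x[i], 0, design[i], offset, nvars);
        }
        return design;
    }

    public static void main(String[] args) {
        double[][] x = {{1, 2}, {3, 4}, {5, 6}};
        System.out.println(Arrays.deepToString(toDesignMatrix(x, false)));
    }
}
```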

validateSampleData

protected void validateSampleData(double[][] x,
                                  double[] y)
                           throws MathIllegalArgumentException
Validates sample data. Checks that
  • neither x nor y is null or empty;
  • the length (i.e. number of rows) of x equals the length of y;
  • x has at least one more row than it has columns (i.e. there is sufficient data to estimate regression coefficients for each of the columns in x plus an intercept).

Parameters:
x - the [n,k] array representing the x data
y - the [n,1] array representing the y data
Throws:
NullArgumentException - if x or y is null
DimensionMismatchException - if x and y do not have the same length
NoDataException - if x or y is zero-length
MathIllegalArgumentException - if the number of rows of x is smaller than the number of columns + 1
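
The three checks listed for validateSampleData could be sketched as follows. This is only an illustration under stated assumptions: SampleDataChecks is a hypothetical class, the order of the checks is a guess, and IllegalArgumentException replaces the library's more specific exception types.

```java
/** Hypothetical stand-alone version of the documented sample data checks. */
final class SampleDataChecks {

    static void validate(double[][] x, double[] y) {
        if (x == null || y == null) {
            throw new IllegalArgumentException("x and y must not be null");
        }
        if (x.length == 0 || y.length == 0) {
            throw new IllegalArgumentException("x and y must not be empty");
        }
        if (x.length != y.length) {
            throw new IllegalArgumentException("x and y must have the same number of rows");
        }
        // At least one more row than columns: enough data for every
        // column of x plus an intercept.
        if (x.length < x[0].length + 1) {
            throw new IllegalArgumentException("not enough observations for the number of regressors");
        }
    }

    public static void main(String[] args) {
        validate(new double[][] {{1, 2}, {3, 4}, {5, 6}}, new double[] {1, 2, 3});
        System.out.println("sample data accepted");
    }
}
```

A 3-by-2 x array with three y values passes, while a 2-by-2 x array is rejected because two rows are fewer than the three coefficients to estimate.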

validateCovarianceData

protected void validateCovarianceData(double[][] x,
                                      double[][] covariance)
Validates that the x data and covariance matrix have the same number of rows and that the covariance matrix is square.

Parameters:
x - the [n,k] array representing the x sample
covariance - the [n,n] array representing the covariance matrix
Throws:
DimensionMismatchException - if the number of rows in x is not equal to the number of rows in covariance
NonSquareMatrixException - if the covariance matrix is not square
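
A minimal sketch of the two covariance checks, assuming a hypothetical class CovarianceChecks and using IllegalArgumentException in place of DimensionMismatchException and NonSquareMatrixException:

```java
/** Hypothetical stand-alone version of the documented covariance checks. */
final class CovarianceChecks {

    static void validate(double[][] x, double[][] covariance) {
        // One covariance row per observation in x.
        if (x.length != covariance.length) {
            throw new IllegalArgumentException("x and covariance must have the same number of rows");
        }
        // Square matrix: every row has as many entries as there are rows.
        for (double[] row : covariance) {
            if (row.length != covariance.length) {
                throw new IllegalArgumentException("covariance matrix must be square");
            }
        }
    }

    public static void main(String[] args) {
        double[][] x = {{1, 2}, {3, 4}, {5, 6}};
        double[][] identity = {{1, 0, 0}, {0, 1, 0}, {0, 0, 1}};
        validate(x, identity);
        System.out.println("covariance data accepted");
    }
}
```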

estimateRegressionParameters

public double[] estimateRegressionParameters()
Estimates the regression parameters b.

Specified by:
estimateRegressionParameters in interface MultipleLinearRegression
Returns:
The [k,1] array representing b

estimateResiduals

public double[] estimateResiduals()
Estimates the residuals, i.e. u = y - X*b.

Specified by:
estimateResiduals in interface MultipleLinearRegression
Returns:
The [n,1] array representing the residuals

estimateRegressionParametersVariance

public double[][] estimateRegressionParametersVariance()
Estimates the variance of the regression parameters, i.e. Var(b).

Specified by:
estimateRegressionParametersVariance in interface MultipleLinearRegression
Returns:
The [k,k] array representing the variance of b

estimateRegressionParametersStandardErrors

public double[] estimateRegressionParametersStandardErrors()
Returns the standard errors of the regression parameters.

Specified by:
estimateRegressionParametersStandardErrors in interface MultipleLinearRegression
Returns:
standard errors of estimated regression parameters

estimateRegressandVariance

public double estimateRegressandVariance()
Returns the variance of the regressand, i.e. Var(y).

Specified by:
estimateRegressandVariance in interface MultipleLinearRegression
Returns:
The double representing the variance of y

estimateErrorVariance

public double estimateErrorVariance()
Estimates the variance of the error.

Returns:
estimate of the error variance
Since:
2.2

estimateRegressionStandardError

public double estimateRegressionStandardError()
Estimates the standard error of the regression.

Returns:
regression standard error
Since:
2.2

calculateBeta

protected abstract RealVector calculateBeta()
Calculates the beta of multiple linear regression in matrix notation.

Returns:
beta

calculateBetaVariance

protected abstract RealMatrix calculateBetaVariance()
Calculates the beta variance of multiple linear regression in matrix notation.

Returns:
beta variance

calculateYVariance

protected double calculateYVariance()
Calculates the variance of the y values.

Returns:
Y variance

calculateErrorVariance

protected double calculateErrorVariance()

Calculates the variance of the error term.

Uses the formula
 var(u) = u · u / (n - k)
 
where n and k are the row and column dimensions of the design matrix X.

Returns:
error variance estimate
Since:
2.2
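
The formula var(u) = u · u / (n - k) can be written out on plain arrays as below. ErrorVarianceSketch is a hypothetical class used only for illustration; k is the column dimension of the design matrix X, so it is assumed to count the intercept column when one is present.

```java
/** Hypothetical helper: error variance from a residual vector. */
final class ErrorVarianceSketch {

    /**
     * @param u residuals, one per observation (n = u.length)
     * @param k number of columns of the design matrix X
     */
    static double errorVariance(double[] u, int k) {
        int n = u.length;
        if (n <= k) {
            throw new IllegalArgumentException("need more observations than parameters");
        }
        double dot = 0.0;
        for (double ui : u) {
            dot += ui * ui; // accumulates u · u
        }
        return dot / (n - k);
    }

    public static void main(String[] args) {
        double[] u = {0.5, -0.5, 0.5, -0.5};
        System.out.println(errorVariance(u, 2));
    }
}
```

For four residuals of magnitude 0.5 and k = 2, u · u is 1.0 and n - k is 2, so the estimate is 0.5.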

calculateResiduals

protected RealVector calculateResiduals()
Calculates the residuals of multiple linear regression in matrix notation.
 u = y - X * b
 

Returns:
The residuals [n,1] matrix
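
For illustration, u = y - X * b can be computed on plain arrays as in the sketch below. ResidualsSketch is a hypothetical class, not the library's code, which works on RealMatrix and RealVector instances instead; X is assumed to already contain the intercept column when the model has one.

```java
import java.util.Arrays;

/** Hypothetical helper: residuals of a linear model on plain arrays. */
final class ResidualsSketch {

    /**
     * @param x design matrix, n rows by k columns
     * @param y observed values, length n
     * @param b regression parameters, length k
     */
    static double[] residuals(double[][] x, double[] y, double[] b) {
        double[] u = new double[y.length];
        for (int i = 0; i < y.length; i++) {
            double fitted = 0.0;
            for (int j = 0; j < b.length; j++) {
                fitted += x[i][j] * b[j]; // row i of X times b
            }
            u[i] = y[i] - fitted;
        }
        return u;
    }

    public static void main(String[] args) {
        double[][] x = {{1, 1}, {1, 2}, {1, 3}};
        double[] y = {3, 5, 8};
        double[] b = {1, 2};
        System.out.println(Arrays.toString(residuals(x, y, b)));
    }
}
```

With b = (1, 2) the fitted values are 3, 5 and 7, so only the last observation has a nonzero residual.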


Copyright © 2017 CNES. All Rights Reserved.