dstats.regress

A module for performing linear regression. Its interface is unusual in that it is range-based rather than matrix-based: values for the independent variables are provided as either a tuple or a range of ranges. This means that one can use, for example, map to fit higher-order models and evaluate certain values lazily. (For details, see examples below.)
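As a minimal sketch of what range-based predictors can look like, the following uses only Phobos to build an intercept column and a quadratic term lazily. The helper name squared is hypothetical, and the commented-out call at the end is an assumption based on the linearRegressBeta signature listed on this page (including that an infinite range is accepted for the intercept); it is not executed here.

```d
import std.algorithm : map;
import std.array : array;
import std.range : repeat, take;

/// Hypothetical helper: lazily squares each element of x.
/// No new array is allocated until the result is materialized.
auto squared(double[] x)
{
    return x.map!(a => a * a);
}

void main()
{
    double[] x = [1.0, 2.0, 3.0, 4.0];

    // Intercept column: an infinite lazy range of ones.
    auto intercept = repeat(1.0);

    // Materialized here only so the values can be inspected.
    assert(intercept.take(x.length).array == [1.0, 1.0, 1.0, 1.0]);
    assert(squared(x).array == [1.0, 4.0, 9.0, 16.0]);

    // Hypothetical call (assumes dstats is installed and y holds responses):
    // auto betas = linearRegressBeta(y, intercept, x, squared(x));
}
```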

Members

Classes

Loess1D
class Loess1D

This class is returned from the loess1D function and holds the state of a loess regression with one predictor variable.

Functions

linearRegress
RegressRes linearRegress(U Y, TC input)

Perform a linear regression as in linearRegressBeta, but return a RegressRes containing additional results for statistical inference. If the last element of input is a real number, it specifies the confidence level of the intervals to be calculated. Otherwise, the default of 0.95 is used. The rest of input should be the elements of X.

linearRegressBeta
double[] linearRegressBeta(U Y, T XIn)

Perform a linear regression and return just the beta values. The advantages to just returning the beta values are that it's faster and that each range needs to be iterated over only once, and thus can be just an input range. The beta values are returned such that the smallest index corresponds to the leftmost element of X. X can be either a tuple or a range of input ranges. Y must be an input range.
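To illustrate why a single pass over input ranges can be enough, here is a rough sketch of one-pass least squares for the special case of an intercept plus one predictor. It is not the library's implementation, and simpleFitBeta is a hypothetical name. The result is ordered as [intercept, slope], mirroring the rule that the smallest index corresponds to the leftmost element of X.

```d
import std.range : zip;
import std.range.primitives : isInputRange;

/// Hypothetical one-pass least squares fit of y = b0 + b1 * x.
/// Both ranges are consumed exactly once, so input ranges suffice.
double[] simpleFitBeta(RY, RX)(RY y, RX x)
if (isInputRange!RY && isInputRange!RX)
{
    double n = 0, sx = 0, sy = 0, sxx = 0, sxy = 0;

    // Accumulate the sufficient statistics in a single traversal.
    foreach (t; zip(y, x))
    {
        double yi = t[0];
        double xi = t[1];
        n += 1;
        sx += xi;
        sy += yi;
        sxx += xi * xi;
        sxy += xi * yi;
    }

    // Closed-form solution of the 2x2 normal equations.
    double b1 = (n * sxy - sx * sy) / (n * sxx - sx * sx);
    double b0 = (sy - b1 * sx) / n;
    return [b0, b1];
}

void main()
{
    // Data generated from y = 1 + 2x.
    auto betas = simpleFitBeta([1.0, 3.0, 5.0, 7.0], [0.0, 1.0, 2.0, 3.0]);
    assert(betas.length == 2);
}
```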

linearRegressBetaBuf
double[] linearRegressBetaBuf(double[] buf, U Y, TRidge XRidge)

Same as linearRegressBeta, but allows the user to specify a buffer for the beta terms. If the buffer is too short, a new one is allocated. Otherwise, the results are returned in the user-provided buffer.
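The buffer rule described above can be sketched as follows. The function resultBuffer is hypothetical and only models the allocate-or-reuse decision; whether the library returns a slice of the buffer or the whole buffer is not stated in the source.

```d
/// Hypothetical model of the buffer rule: reuse buf when it is long
/// enough, otherwise allocate a fresh array of the required length.
double[] resultBuffer(double[] buf, size_t nCoefficients)
{
    if (buf.length < nCoefficients)
        return new double[nCoefficients];
    return buf[0 .. nCoefficients];
}

void main()
{
    auto buf = new double[4];

    // Long enough: the result aliases the caller's memory.
    assert(resultBuffer(buf, 2).ptr is buf.ptr);

    // Too short: a new array is allocated instead.
    assert(resultBuffer(buf, 8).ptr !is buf.ptr);
}
```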

linearRegressPenalized
double[] linearRegressPenalized(Y yIn, X xIn, double lasso, double ridge)

Performs lasso (L1) and/or ridge (L2) penalized linear regression. Due to the way the data is standardized, no intercept term should be included in x (unlike linearRegress and linearRegressBeta). The intercept coefficient is implicitly included and returned in the first element of the returned array. Usage is otherwise identical.

loess1D
Loess1D loess1D(RY y, RX x, double span, int degree = 1)

This function performs loess regression. Loess regression is a local regression procedure, where a prediction of the dependent (y) variable is made from an observation of the independent (x) variable by weighted least squares over x values in the neighborhood of the value being evaluated.
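The idea of weighted least squares over a neighborhood can be sketched with a common textbook formulation: a nearest-neighbor bandwidth, tricube weights, and a local straight line (matching the default degree of 1). The source does not say which weights or bandwidth rule Loess1D uses, so these choices and the name localLinearPredict are assumptions.

```d
import std.algorithm : map, sort;
import std.array : array;
import std.math : abs;

/// Hypothetical local linear prediction at x0 (illustrative only).
/// span is the fraction of points used as the neighborhood.
double localLinearPredict(const(double)[] x, const(double)[] y,
                          double x0, double span)
{
    assert(x.length == y.length && x.length >= 3);

    // Neighborhood size: at least 3 points, at most all of them.
    size_t k = cast(size_t)(span * x.length + 0.5);
    if (k < 3) k = 3;
    if (k > x.length) k = x.length;

    // Bandwidth h: distance from x0 to its k-th nearest neighbor.
    double[] dists = x.map!(xi => abs(xi - x0)).array;
    double[] sorted = dists.dup;
    sort(sorted);
    immutable double h = sorted[k - 1];

    // Weighted sums for a straight-line fit.
    double sw = 0, swx = 0, swy = 0, swxx = 0, swxy = 0;
    foreach (i, d; dists)
    {
        double w = 0;
        if (h > 0 && d < h)
        {
            // Tricube weight: (1 - (d/h)^3)^3.
            immutable double u = d / h;
            immutable double c = 1 - u * u * u;
            w = c * c * c;
        }
        else if (h == 0 && d == 0)
            w = 1;
        sw += w;
        swx += w * x[i];
        swy += w * y[i];
        swxx += w * x[i] * x[i];
        swxy += w * x[i] * y[i];
    }

    // Degenerate neighborhood: no point received a positive weight.
    if (sw == 0) return double.nan;

    // Fall back to the weighted mean if the slope is not identifiable.
    immutable double denom = sw * swxx - swx * swx;
    if (denom <= 1e-12 * sw * swxx) return swy / sw;

    immutable double b1 = (sw * swxy - swx * swy) / denom;
    immutable double b0 = (swy - b1 * swx) / sw;
    return b0 + b1 * x0;
}

void main()
{
    // Points on the exact line y = 1 + 2x.
    double[] xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0];
    double[] ys = [1.0, 3.0, 5.0, 7.0, 9.0, 11.0, 13.0, 15.0, 17.0, 19.0];
    immutable p = localLinearPredict(xs, ys, 4.5, 0.5);
    assert(abs(p - 10.0) < 1e-9);
}
```

Because the data lie on a straight line, any local linear fit reproduces the line, which makes the prediction easy to check.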

logistic
double logistic(double xb)

The logistic function used in logistic regression.
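For reference, the standard logistic function maps any real value to the interval (0, 1). A minimal sketch follows; the name logisticSketch is hypothetical and is used to avoid implying that this is the library's source.

```d
import std.math : exp;

/// Standard logistic function: 1 / (1 + exp(-xb)).
double logisticSketch(double xb)
{
    return 1.0 / (1.0 + exp(-xb));
}

void main()
{
    // A linear predictor of zero corresponds to a probability of 0.5.
    assert(logisticSketch(0.0) == 0.5);

    // Large positive and negative inputs approach 1 and 0.
    assert(logisticSketch(10.0) > 0.99);
    assert(logisticSketch(-10.0) < 0.01);
}
```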

logisticRegress
LogisticRes logisticRegress(T yIn, V input)

Similar to logisticRegressBeta, but returns a LogisticRes containing additional results for statistical inference. If the last element of input is a floating-point number instead of a range, it specifies the confidence level of the intervals to be calculated. Otherwise, the default of 0.95 is used.

logisticRegressBeta
double[] logisticRegressBeta(T yIn, U xRidge)

Computes a logistic regression using a maximum likelihood estimator and returns the beta coefficients. This is a generalized linear model in which the mean of Y is modeled as f(XB) = 1 / (1 + exp(-XB)), the inverse of the logit link function. It is generally used to model the probability that a binary Y variable is 1 given a set of X variables.

logisticRegressPenalized
double[] logisticRegressPenalized(Y yIn, X xIn, double lasso, double ridge)

Performs lasso (L1) and/or ridge (L2) penalized logistic regression. Due to the way the data is standardized, no intercept term should be included in x (unlike logisticRegress and logisticRegressBeta). The intercept coefficient is implicitly included and returned in the first element of the returned array. Usage is otherwise identical.

polyFit
PolyFitRes!(PowMap!(uint, T)[]) polyFit(U Y, T X, uint N, double confInt = 0.95)

Convenience function that takes a forward range X and a forward range Y, creates an array of PowMap structs for integer powers 0 through N, and calls linearRegress.

polyFitBeta
double[] polyFitBeta(U Y, T X, uint N, double ridge = 0)

Convenience function that takes a forward range X and a forward range Y, creates an array of PowMap structs for integer powers from 0 through N, and calls linearRegressBeta.
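Since the predictors are the powers 0 through N of X, the coefficient at index i presumably multiplies X raised to the power i. Under that assumption, a fitted polynomial can be evaluated with Horner's rule. The function evalPoly is a hypothetical helper, not part of the module.

```d
/// Hypothetical helper: evaluates betas[0] + betas[1]*x + betas[2]*x^2 + ...
/// using Horner's rule, assuming betas[i] is the coefficient of x^i.
double evalPoly(const(double)[] betas, double x)
{
    double acc = 0;
    foreach_reverse (b; betas)
        acc = acc * x + b;
    return acc;
}

void main()
{
    // 1 + 2x + 3x^2 at x = 2 is 1 + 4 + 12 = 17.
    assert(evalPoly([1.0, 2.0, 3.0], 2.0) == 17.0);
}
```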

polyFitBetaBuf
double[] polyFitBetaBuf(double[] buf, U Y, T X, uint N, double ridge = 0)

Same as polyFitBeta, but allows the caller to provide an explicit buffer to return the coefficients in. If it's too short, a new one will be allocated. Otherwise, results will be returned in the user-provided buffer.

powMap
PowMap!(ExpType, T) powMap(T range, ExpType exponent)

Maps a forward range to a power determined at runtime. ExpType is the type of the exponent. Using an int is faster than using a double, but less flexible, since only integer powers can be expressed.
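A lazy power mapping over a forward range might look roughly like the sketch below. LazyPow and lazyPow are hypothetical names, and the exponent is fixed to int for brevity; the actual PowMap is parameterized on ExpType.

```d
import std.math : pow;
import std.range.primitives : empty, front, popFront, save, isForwardRange;

/// Hypothetical forward range that raises each element of source to a
/// runtime exponent on demand, without storing the results.
struct LazyPow(R)
if (isForwardRange!R)
{
    R source;
    int exponent;

    @property bool empty() { return source.empty; }
    @property double front() { return pow(cast(double) source.front, exponent); }
    void popFront() { source.popFront(); }
    @property LazyPow save() { return LazyPow(source.save, exponent); }
}

/// Convenience constructor that infers the range type.
LazyPow!R lazyPow(R)(R source, int exponent)
{
    return LazyPow!R(source, exponent);
}

void main()
{
    double[] x = [1.0, 2.0, 3.0];
    double[] cubes;
    foreach (v; lazyPow(x, 3))
        cubes ~= v;
    assert(cubes == [1.0, 8.0, 27.0]);
}
```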

residuals
Residuals!(F, U, T) residuals(F[] betas, U Y, T X)

Given the beta coefficients from a linear regression, and X and Y values, returns a range that lazily computes the residuals.
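As a sketch of lazy residual computation, the following handles the simplest case of an intercept and a single predictor, taking a residual to be the observed value minus the fitted value. The function simpleResiduals is hypothetical, and the sign convention is an assumption, since the source does not state it.

```d
import std.algorithm : map;
import std.array : array;
import std.range : zip;

/// Hypothetical helper: lazily yields y - (b0 + b1 * x) per observation,
/// where betas is [b0, b1]. Nothing is computed until the range is read.
auto simpleResiduals(const(double)[] betas, const(double)[] y,
                     const(double)[] x)
{
    assert(betas.length == 2);
    immutable double b0 = betas[0];
    immutable double b1 = betas[1];
    return zip(y, x).map!(t => t[0] - (b0 + b1 * t[1]));
}

void main()
{
    // Fitted line y = 1 + 2x; only the middle point is off the line.
    auto res = simpleResiduals([1.0, 2.0], [1.0, 3.5, 5.0], [0.0, 1.0, 2.0]);
    assert(res.array == [0.0, 0.5, 0.0]);
}
```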

Structs

LogisticRes
struct LogisticRes

Plain old data struct to hold the results of a logistic regression.

PolyFitRes
struct PolyFitRes(T)

Struct returned by polyFit.

PowMap
struct PowMap(ExpType, T)

Struct returned by powMap.

RegressRes
struct RegressRes

Struct that holds the results of a linear regression. It's a plain old data struct.

Residuals
struct Residuals(F, U, T...)

Forward range that holds the residuals from a regression analysis.

Meta

Authors

David Simcha