DenseInfoTheory

Much faster implementations of the information theory functions for the special but common case where all observations are integers on the range [0, nBin). This is the case, for example, when the observations have previously been binned with dstats.base.frqBin().

Note that, due to the optimizations used, joint() cannot be used with the member functions of this struct, except entropy().

For those looking for hard numbers: in my quick-and-dirty benchmarks, this is on the order of 10x faster than the generic implementations.

Constructors

this
this(uint nBin)

Constructs a DenseInfoTheory object for nBin bins. Every observation passed to its member functions must then be an integer on the interval [0, nBin).

Members

Functions

condEntropy
double condEntropy(R1 x, R2 y)

Computes the conditional entropy H(X | Y) of x given y, which equals H(X, Y) - H(Y).

condMutualInfo
double condMutualInfo(R1 x, R2 y, R3 z)

Computes the conditional mutual information I(X; Y | Z) of x and y given z.
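As a reminder of what this quantity measures, conditional mutual information can be written purely in terms of joint entropies: I(X; Y | Z) = H(X, Z) + H(Y, Z) - H(X, Y, Z) - H(Z). The Python sketch below illustrates that identity only. It is not the library's D implementation, the helper names are hypothetical, and measuring in bits (log base 2) is an assumption.

```python
import math
from collections import Counter

def entropy_bits(symbols):
    """Plug-in entropy, in bits, of a sequence of hashable symbols."""
    counts = Counter(symbols)
    n = sum(counts.values())
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def cond_mutual_info_bits(xs, ys, zs):
    """I(X; Y | Z) = H(X, Z) + H(Y, Z) - H(X, Y, Z) - H(Z)."""
    return (entropy_bits(list(zip(xs, zs)))
            + entropy_bits(list(zip(ys, zs)))
            - entropy_bits(list(zip(xs, ys, zs)))
            - entropy_bits(zs))

# Classic XOR case: x and y are independent on their own,
# but become fully informative about each other once z = x ^ y is known.
xs = [0, 0, 1, 1]
ys = [0, 1, 0, 1]
zs = [x ^ y for x, y in zip(xs, ys)]
xor_cmi = cond_mutual_info_bits(xs, ys, zs)
```

In the XOR example, xor_cmi comes out to 1 bit even though the unconditional mutual information between xs and ys is zero.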

entropy
double entropy(R range)

Computes the entropy of a set of observations. Note that, for this function, the joint() function can be used to compute joint entropies as long as each individual range contains only integers on [0, nBin).
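One way to picture the dense approach: because every value is an integer on [0, nBin), occurrences can be tallied in a flat array instead of a hash table, and a joint observation can be packed into a single index. The Python sketch below illustrates that idea only. It is not the library's D implementation, the names dense_entropy and dense_joint_entropy are hypothetical, and reporting the result in bits (log base 2) is an assumption.

```python
import math

def dense_entropy(obs, n_bin):
    """Entropy, in bits, of integer observations on [0, n_bin)."""
    counts = [0] * n_bin              # one slot per possible value
    n = 0
    for v in obs:
        if not 0 <= v < n_bin:
            raise ValueError(f"observation {v} is outside [0, {n_bin})")
        counts[v] += 1
        n += 1
    if n == 0:
        raise ValueError("need at least one observation")
    # Empty bins contribute nothing (0 * log 0 is taken as 0).
    return -sum(c / n * math.log2(c / n) for c in counts if c)

def dense_joint_entropy(xs, ys, n_bin):
    """Joint entropy H(X, Y); assumes xs and ys are each already on [0, n_bin)."""
    # Pack each (x, y) pair into a single index on [0, n_bin * n_bin).
    packed = [x * n_bin + y for x, y in zip(xs, ys)]
    return dense_entropy(packed, n_bin * n_bin)
```

For instance, dense_entropy([0, 1, 0, 1], 2) gives 1 bit, and dense_joint_entropy([0, 0, 1, 1], [0, 1, 0, 1], 2) gives 2 bits because all four pairs are equally likely.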

mutualInfo
double mutualInfo(R1 x, R2 y)

Computes the mutual information I(X; Y) between x and y.
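For reference, mutual information follows from three entropies: I(X; Y) = H(X) + H(Y) - H(X, Y). The Python sketch below shows that identity on small inputs. It is an illustration under assumptions (hypothetical helper names, results in bits), not a description of how the D code is written.

```python
import math
from collections import Counter

def entropy_bits(symbols):
    """Plug-in entropy, in bits, of a sequence of hashable symbols."""
    counts = Counter(symbols)
    n = sum(counts.values())
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def mutual_info_bits(xs, ys):
    """I(X; Y) = H(X) + H(Y) - H(X, Y) for two equal-length lists."""
    joint = list(zip(xs, ys))         # each pair is one joint observation
    return entropy_bits(xs) + entropy_bits(ys) - entropy_bits(joint)

identical = mutual_info_bits([0, 0, 1, 1], [0, 0, 1, 1])    # y determines x
independent = mutual_info_bits([0, 0, 1, 1], [0, 1, 0, 1])  # y says nothing about x
```

Here identical is 1 bit, the full entropy of x, while independent is 0.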

mutualInfoPval
double mutualInfoPval(double mutualInfo, double n)

Calculates the P-value for I(X; Y), assuming x and y both have supports of [0, nBin). The P-value is calculated using a Chi-Square approximation, which is asymptotically correct but only approximate for finite sample sizes.
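The standard Chi-Square approximation of this kind is the G-test of independence: with mutual information measured in bits over n observations, the statistic G = 2 n ln(2) I(X; Y) is compared against a Chi-Square distribution with (nBin - 1)^2 degrees of freedom. The Python sketch below assumes that formulation, that the mutual information is in bits, and that n is the number of observations; none of these is confirmed here as the library's exact computation, and the function names are hypothetical.

```python
import math

def chi2_sf(x, df):
    """Upper tail P(ChiSquare(df) > x), for x >= 0 and integer df >= 1."""
    z = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)               # closed form for df = 2
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))    # closed form for df = 1
    # Recurrence Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1),
    # applied until a reaches df / 2.
    steps = (df - (2 if df % 2 == 0 else 1)) // 2
    for _ in range(steps):
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q

def mutual_info_pval(mi_bits, n, n_bin):
    """G-test P-value; assumes mi_bits is in bits and n is the sample size."""
    g = 2.0 * n * math.log(2.0) * mi_bits      # convert bits to nats, scale by 2n
    df = (n_bin - 1) ** 2                      # both variables have n_bin levels
    return chi2_sf(g, df)
```

A mutual information of zero yields a P-value of 1, and larger values or larger samples push the P-value toward 0.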

Meta