dstats.infotheory

Basic information theory: joint entropy, mutual information, and conditional mutual information. This module uses the base-2 definitions of these quantities, i.e., entropy, mutual information, etc. are reported in bits.

Members

Functions

condEntropy
double condEntropy(T data, U cond)

Calculates the conditional entropy H(data | cond).
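As an illustration of the quantity itself, conditional entropy can be computed from the identity H(data | cond) = H(data, cond) - H(cond). The following is a minimal Python sketch of that identity, not the dstats implementation; the names joint_entropy and cond_entropy are hypothetical helpers introduced only for this example.

```python
from collections import Counter
from math import log2

def joint_entropy(*columns):
    """Joint Shannon entropy, in bits, of equal-length observation vectors."""
    rows = list(zip(*columns))  # one tuple per joint observation
    n = len(rows)
    return -sum((c / n) * log2(c / n) for c in Counter(rows).values())

def cond_entropy(data, cond):
    """H(data | cond) = H(data, cond) - H(cond). Hypothetical helper."""
    return joint_entropy(data, cond) - joint_entropy(cond)

cond = [0, 0, 1, 1]
h_same = cond_entropy(cond, cond)            # data is fully determined by cond
h_indep = cond_entropy([0, 1, 0, 1], cond)   # cond carries no information about data
```

In the first case nothing remains uncertain once cond is known; in the second, knowing cond leaves the full one bit of uncertainty in data.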

condMutualInfo
double condMutualInfo(T x, U y, V z)

Calculates the conditional mutual information I(x; y | z) from a set of observations.
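One way to see what this quantity measures is the entropy identity H(x, z) + H(y, z) - H(x, y, z) - H(z). The Python sketch below applies that identity to a small XOR example; it is an illustration of the definition under that assumption, not the dstats code, and joint_entropy and cond_mutual_info are hypothetical names.

```python
from collections import Counter
from math import log2

def joint_entropy(*columns):
    """Joint Shannon entropy, in bits, of equal-length observation vectors."""
    rows = list(zip(*columns))
    n = len(rows)
    return -sum((c / n) * log2(c / n) for c in Counter(rows).values())

def cond_mutual_info(x, y, z):
    """Conditional mutual information of x and y given z, via entropies."""
    return (joint_entropy(x, z) + joint_entropy(y, z)
            - joint_entropy(x, y, z) - joint_entropy(z))

# x and y are independent fair bits; z is their XOR.
x = [0, 0, 1, 1]
y = [0, 1, 0, 1]
z = [a ^ b for a, b in zip(x, y)]

cmi = cond_mutual_info(x, y, z)
```

Although x and y share no information on their own, once z is known each one determines the other, so the conditional mutual information here is one bit.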

entropy
double entropy(T data)

Calculates the joint entropy of a set of observations. Each input range represents a vector of observations. If only one range is given, this reduces to the ordinary (marginal) entropy. Each input range must have a length.
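To make the relationship between joint and single-variable entropy concrete, here is a minimal Python sketch of the definition. It is not the dstats implementation, and entropy_bits is a hypothetical name used only for illustration.

```python
from collections import Counter
from math import log2

def entropy_bits(*columns):
    """Joint Shannon entropy, in bits, of one or more observation vectors."""
    rows = list(zip(*columns))   # each row is one joint observation
    n = len(rows)
    counts = Counter(rows)       # frequency of each distinct joint value
    return -sum((c / n) * log2(c / n) for c in counts.values())

foo = [1, 1, 1, 2, 2, 2, 3, 3, 3]
bar = [1, 2, 3, 1, 2, 3, 1, 2, 3]

h_foo = entropy_bits(foo)         # three equally likely values
h_joint = entropy_bits(foo, bar)  # nine equally likely (foo, bar) pairs
```

With a single vector the function returns the ordinary entropy, log2(3) bits here; with two vectors it returns the joint entropy, log2(9) bits.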

entropyCounts
double entropyCounts(T data)

Calculates the Shannon entropy of a forward range whose elements are treated as frequency counts of a set of discrete observations.
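The idea of computing entropy directly from counts can be sketched in a few lines of Python. This is an illustrative sketch rather than the library's code: entropy_from_counts is a hypothetical name, and skipping zero counts (treating 0 log 0 as 0) is an assumption of the sketch.

```python
from math import log2

def entropy_from_counts(counts):
    """Shannon entropy, in bits, of a distribution given as frequency counts."""
    total = sum(counts)
    # Zero counts contribute nothing (0 * log 0 is taken as 0).
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)

# Counts 2, 2, 4 correspond to probabilities 0.25, 0.25, 0.5.
h = entropy_from_counts([2, 2, 4])
```

For these counts the result is 0.25 * 2 + 0.25 * 2 + 0.5 * 1 = 1.5 bits.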

entropySorted
double entropySorted(T data)

Calculates the entropy of an input range of observations more quickly than entropy(), provided that all equal values are adjacent. If the input is sorted by more than one key, e.g. an array of structs, the result is the joint entropy of all of the keys. The compFun alias is used to compare adjacent elements and determine how many instances of each value exist.
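When equal values are adjacent, the counts can be read off as run lengths in a single pass, with no hash table. The Python sketch below shows that idea using itertools.groupby; it is an assumption about the general technique, not a description of the dstats internals, and entropy_grouped is a hypothetical name.

```python
from itertools import groupby
from math import log2

def entropy_grouped(data):
    """Entropy, in bits, of data in which equal values are adjacent."""
    # Each run of equal adjacent elements yields one count.
    run_lengths = [sum(1 for _ in run) for _, run in groupby(data)]
    n = sum(run_lengths)
    return -sum((r / n) * log2(r / n) for r in run_lengths)

observations = sorted([3, 1, 2, 1, 3, 2, 1, 1])  # sorting makes equal values adjacent
h = entropy_grouped(observations)
```

Here the runs have lengths 4, 2 and 2, giving 1.5 bits. If the input were not grouped, the same value would be counted as several separate runs and the result would be wrong, which is why the precondition matters.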

joint
Joint!(FlattenType!(T)) joint(T args)

Binds a set of ranges together to represent a joint probability distribution.

mutualInfo
double mutualInfo(T x, U y)

Calculates the mutual information of two vectors of discrete observations.
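For two vectors of discrete observations, mutual information can be written as H(x) + H(y) - H(x, y). The following Python sketch evaluates that identity on two small examples; it illustrates the definition and is not the dstats implementation. The names joint_entropy and mutual_info are hypothetical.

```python
from collections import Counter
from math import log2

def joint_entropy(*columns):
    """Joint Shannon entropy, in bits, of equal-length observation vectors."""
    rows = list(zip(*columns))
    n = len(rows)
    return -sum((c / n) * log2(c / n) for c in Counter(rows).values())

def mutual_info(x, y):
    """I(x; y) = H(x) + H(y) - H(x, y), in bits. Hypothetical helper."""
    return joint_entropy(x) + joint_entropy(y) - joint_entropy(x, y)

x = [0, 0, 1, 1]
mi_identical = mutual_info(x, x)               # y is a copy of x
mi_independent = mutual_info(x, [0, 1, 0, 1])  # every (x, y) pair equally likely
```

A copy of x shares all of its one bit of entropy with x, while the second vector shares none.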

mutualInfoTable
double mutualInfoTable(T table)

Calculates the mutual information of a contingency table representing a joint discrete probability distribution. Takes a set of finite forward ranges, one for each column in the contingency table. These can be expressed either as a tuple of ranges or a range of ranges.
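As a sketch of how mutual information follows from a contingency table, the Python code below takes the table as a list of columns of counts and sums p(i, j) * log2(p(i, j) / (p(i) * p(j))) over the non-empty cells. It illustrates the formula only; mutual_info_table is a hypothetical name, and the list-of-columns layout is an assumption chosen to mirror the description above.

```python
from math import log2

def mutual_info_table(columns):
    """Mutual information, in bits, of a contingency table given as columns."""
    total = sum(sum(col) for col in columns)
    col_totals = [sum(col) for col in columns]
    row_totals = [sum(cells) for cells in zip(*columns)]
    mi = 0.0
    for j, col in enumerate(columns):
        for i, cell in enumerate(col):
            if cell > 0:  # empty cells contribute nothing
                # p(i, j) / (p(i) * p(j)) simplifies to cell * total / (row * col)
                ratio = cell * total / (row_totals[i] * col_totals[j])
                mi += (cell / total) * log2(ratio)
    return mi

mi_diagonal = mutual_info_table([[10, 0], [0, 10]])  # rows and columns perfectly associated
mi_uniform = mutual_info_table([[5, 5], [5, 5]])     # rows and columns independent
```

The diagonal table yields one bit, since knowing the row fixes the column; the uniform table yields zero.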

Structs

DenseInfoTheory
struct DenseInfoTheory

Much faster implementations of information theory functions for the special but common case where all observations are integers in the range [0, nBin). This is the case, for example, when the observations have previously been binned using a function such as dstats.base.frqBin().

Joint
struct Joint(T...)

Iterates over a set of ranges by value in lockstep, returning on each iteration an ObsEnt, which is used internally by the entropy functions.

Meta

Authors

David Simcha