kendallCor

Kendall's Tau-b, O(N log N) version. This is a non-parametric measure of monotonic association and can be defined in terms of the bubble sort distance, or the number of swaps that would be needed in a bubble sort to sort input2 into the same order as input1.

Since a copy of the inputs is made anyhow because they need to be sorted, this function can work with any input range. However, the ranges must have the same length.

Note:

As an optimization, when a range is a SortedRange with predicate "a < b", it is assumed already sorted and not sorted a second time by this function. This is useful when applying this function multiple times with one of the arguments the same every time:

1 auto lhs = randArray!rNormal(1_000, 0, 1);
2 auto indices = new size_t[1_000];
3 makeIndex(lhs, indices);
4 
5 foreach(i; 0..1_000) {
6     auto rhs = randArray!rNormal(1_000, 0, 1);
7     auto lhsSorted = assumeSorted(
8         indexed(lhs, indices)
9     );
10 
11     // Rearrange rhs according to the sorting permutation of lhs.
12     // kendallCor(lhsSorted, rhsRearranged) will be much faster than
13     // kendallCor(lhs, rhs).
14     auto rhsRearranged = indexed(rhs, indices);
15     assert(kendallCor(lhsSorted, rhsRearranged) == kendallCor(lhs, rhs));
16 }

References: A Computer Method for Calculating Kendall's Tau with Ungrouped Data, William R. Knight, Journal of the American Statistical Association, Vol. 61, No. 314, Part 1 (Jun., 1966), pp. 436-439

The Variance of Tau When Both Rankings Contain Ties. M.G. Kendall. Biometrika, Vol 34, No. 3/4 (Dec., 1947), pp. 297-298

double
kendallCor
(
T
U
)
if (
isInputRange!(T) &&
isInputRange!(U)
)

Meta