Kendall's Tau-b, O(N log N) version. This is a non-parametric measure
of monotonic association and can be defined in terms of the
bubble sort distance, or the number of swaps that would be needed in a
bubble sort to sort input2 into the same order as input1.
Since a copy of the inputs is made anyhow because they need to be sorted,
this function can work with any input range. However, the ranges must
have the same length.
Note:
As an optimization, when a range is a SortedRange with predicate "a < b",
it is assumed already sorted and not sorted a second time by this function.
This is useful when applying this function multiple times with one of the
arguments the same every time:
1 autolhs = randArray!rNormal(1_000, 0, 1);
2 autoindices = newsize_t[1_000];
3 makeIndex(lhs, indices);
4 5 foreach(i; 0..1_000) {
6 autorhs = randArray!rNormal(1_000, 0, 1);
7 autolhsSorted = assumeSorted(
8 indexed(lhs, indices)
9 );
10 11 // Rearrange rhs according to the sorting permutation of lhs.12 // kendallCor(lhsSorted, rhsRearranged) will be much faster than13 // kendallCor(lhs, rhs).14 autorhsRearranged = indexed(rhs, indices);
15 assert(kendallCor(lhsSorted, rhsRearranged) == kendallCor(lhs, rhs));
16 }
References:
A Computer Method for Calculating Kendall's Tau with Ungrouped Data,
William R. Knight, Journal of the American Statistical Association, Vol.
61, No. 314, Part 1 (Jun., 1966), pp. 436-439
The Variance of Tau When Both Rankings Contain Ties. M.G. Kendall.
Biometrika, Vol 34, No. 3/4 (Dec., 1947), pp. 297-298
Kendall's Tau-b, O(N log N) version. This is a non-parametric measure of monotonic association and can be defined in terms of the bubble sort distance, or the number of swaps that would be needed in a bubble sort to sort input2 into the same order as input1.
Since a copy of the inputs is made anyhow because they need to be sorted, this function can work with any input range. However, the ranges must have the same length.
Note:
As an optimization, when a range is a SortedRange with predicate "a < b", it is assumed already sorted and not sorted a second time by this function. This is useful when applying this function multiple times with one of the arguments the same every time:
References: A Computer Method for Calculating Kendall's Tau with Ungrouped Data, William R. Knight, Journal of the American Statistical Association, Vol. 61, No. 314, Part 1 (Jun., 1966), pp. 436-439
The Variance of Tau When Both Rankings Contain Ties. M.G. Kendall. Biometrika, Vol 34, No. 3/4 (Dec., 1947), pp. 297-298