mgcpy.hypothesis_tests package
Submodules
mgcpy.hypothesis_tests.transforms module
mgcpy.hypothesis_tests.transforms.k_sample_transform(x, y, is_y_categorical=False)
Transform to represent a k-sample test as an independence test.
Parameters:
- X (2D numpy.array) – is interpreted as either:
  - a [n*n] distance matrix, a square matrix with zeros on the diagonal for n samples, OR
  - a [n*p] data matrix, a matrix with n samples in p dimensions
- Y (2D numpy.array) – is interpreted as either:
  - a [n*n] distance matrix, a square matrix with zeros on the diagonal for n samples, OR
  - a [n*p] data matrix, a matrix with n samples in p dimensions, OR
  - a [n*1] label matrix, categorical data for X, if is_y_categorical is set to True
- is_y_categorical (boolean) – if set to True, Y holds categorical data and is a labels array for X; otherwise, it is a plain data matrix
Returns:
- u: a concatenated data matrix of dimensions [2*n, p]
- v: a label matrix for u, which indicates the category to which each data entry in u belongs
Return type: list
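The idea behind the transform can be sketched in plain NumPy. This is only an illustration of the documented behaviour for data-matrix inputs, not mgcpy's implementation: the function name `k_sample_transform_sketch` is hypothetical, the 0/1 label coding is an assumption, and distance-matrix inputs are not handled.

```python
import numpy as np


def k_sample_transform_sketch(x, y, is_y_categorical=False):
    """Hypothetical NumPy-only illustration; not the mgcpy source."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y)
    if is_y_categorical:
        # y already carries one label per row of x, so x is the data matrix
        u = x
        v = y.reshape(-1, 1)
    else:
        y = y.astype(float)
        # Stack the two samples row-wise into a single data matrix
        u = np.concatenate([x, y], axis=0)
        # Assumed coding: rows that came from x get 0, rows from y get 1
        v = np.concatenate([np.zeros(x.shape[0]), np.ones(y.shape[0])]).reshape(-1, 1)
    return [u, v]


rng = np.random.default_rng(0)
x = rng.normal(size=(5, 3))
y = rng.normal(loc=1.0, size=(5, 3))
u, v = k_sample_transform_sketch(x, y)
print(u.shape, v.shape)  # (10, 3) (10, 1)
```

Testing whether u and v are independent then plays the role of testing whether the two samples come from the same distribution.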
mgcpy.hypothesis_tests.transforms.paired_two_sample_transform(x, y)
Transform to represent a paired two-sample test as an independence test.
Steps:
- combine x and y to get the joint_distribution
- sample n pairs from the joint_distribution
- compute the Euclidean distance between the sampled n pairs, which is randomly_sampled_pairs_distance
- compute the Euclidean distance between the actual x and y, which is actual_pairs_distance
- compute the two-sample transformed matrices of randomly_sampled_pairs_distance and actual_pairs_distance
Parameters:
- X (2D numpy.array) – is interpreted as either:
  - a [n*n] distance matrix, a square matrix with zeros on the diagonal for n samples, OR
  - a [n*p] data matrix, a matrix with n samples in p dimensions
- Y (2D numpy.array) – is interpreted as either:
  - a [n*n] distance matrix, a square matrix with zeros on the diagonal for n samples, OR
  - a [n*p] data matrix, a matrix with n samples in p dimensions
Returns:
- u: a data matrix of dimensions [2*n, p]
- v: a label matrix for u, which indicates the category to which each data entry in u belongs
Return type: list
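One possible reading of the steps listed for paired_two_sample_transform is sketched below in NumPy. It is an interpretation rather than the library's code: the function name, the `seed` argument, the sampling of pairs with replacement, and the 0/1 label coding are all assumptions, and only data-matrix inputs are covered. Under this reading u holds a single column of distances, whereas the documentation states a shape of [2*n, p], so the actual implementation may differ.

```python
import numpy as np


def paired_two_sample_transform_sketch(x, y, seed=0):
    """Hypothetical illustration of the documented steps; not the mgcpy source."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.shape[0]
    rng = np.random.default_rng(seed)

    # Step 1: pool x and y (the "joint_distribution" in the docs)
    joint = np.concatenate([x, y], axis=0)  # shape (2n, p)

    # Step 2: draw n random pairs of rows from the pool (assumed: with replacement)
    first = joint[rng.integers(0, 2 * n, size=n)]
    second = joint[rng.integers(0, 2 * n, size=n)]

    # Step 3: Euclidean distance within each randomly sampled pair
    randomly_sampled_pairs_distance = np.linalg.norm(first - second, axis=1).reshape(-1, 1)

    # Step 4: Euclidean distance within each actual (x_i, y_i) pair
    actual_pairs_distance = np.linalg.norm(x - y, axis=1).reshape(-1, 1)

    # Step 5: two-sample transform of the two distance samples
    u = np.concatenate([randomly_sampled_pairs_distance, actual_pairs_distance], axis=0)
    v = np.concatenate([np.zeros(n), np.ones(n)]).reshape(-1, 1)
    return [u, v]


rng = np.random.default_rng(1)
x = rng.normal(size=(6, 2))
y = x + rng.normal(scale=0.1, size=(6, 2))  # y is a noisy copy of x
u, v = paired_two_sample_transform_sketch(x, y)
print(u.shape, v.shape)  # (12, 1) (12, 1)
```

If the pairing carries information, the actual pair distances should look different from the distances between randomly matched rows, and an independence test on u and v can pick that up.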
mgcpy.hypothesis_tests.transforms.paired_two_sample_test_dcorr(x, y, which_test='biased', compute_distance_matrix=None, is_fast=False)
Compute the DCorr test statistic for the paired two-sample test.
Parameters:
- X (2D numpy.array) – is interpreted as either:
  - a [n*n] distance matrix, a square matrix with zeros on the diagonal for n samples, OR
  - a [n*p] data matrix, a matrix with n samples in p dimensions
- Y (2D numpy.array) – is interpreted as either:
  - a [n*n] distance matrix, a square matrix with zeros on the diagonal for n samples, OR
  - a [n*p] data matrix, a matrix with n samples in p dimensions
Returns: the paired two-sample DCorr test statistic
Return type: float
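For orientation, the biased (V-statistic) form of the squared sample distance correlation can be written in a few lines of NumPy. This sketch only shows that textbook statistic on two data matrices. The page does not say how paired_two_sample_test_dcorr combines it with the paired transform, so that part is left out, and the parameters which_test, compute_distance_matrix and is_fast are not modelled. The name `biased_dcorr_sketch` is hypothetical.

```python
import numpy as np


def biased_dcorr_sketch(x, y):
    """Biased squared sample distance correlation; a sketch, not the mgcpy source."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    def centered_distances(m):
        # Pairwise Euclidean distances between the rows of m, shape (n, n)
        d = np.linalg.norm(m[:, None, :] - m[None, :, :], axis=2)
        # Double centering: subtract row and column means, add back the grand mean
        return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()

    a = centered_distances(x)
    b = centered_distances(y)
    dcov2 = (a * b).mean()   # squared sample distance covariance
    dvar_x = (a * a).mean()  # squared sample distance variance of x
    dvar_y = (b * b).mean()  # squared sample distance variance of y
    denom = np.sqrt(dvar_x * dvar_y)
    # Convention assumed here: return 0 when either sample is constant
    return 0.0 if denom == 0 else float(dcov2 / denom)


rng = np.random.default_rng(0)
x = rng.normal(size=(20, 2))
y = rng.normal(size=(20, 2))
stat = biased_dcorr_sketch(x, y)
```

The value lies between 0 and 1, and it equals 1 when y is a shifted and rescaled copy of x, for example `biased_dcorr_sketch(x, 2 * x + 1)`.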