napkinxc.measures.Jain_et_al_inverse_propensity
- napkinxc.measures.Jain_et_al_inverse_propensity(Y, A=0.55, B=1.5)
Calculate inverse propensities as proposed in Jain et al. 2016. The inverse propensity \(q_l\) of label \(l\) is calculated as:
\[C = (\log N - 1)(B + 1)^A \,, \quad q_l = 1 + C(N_l + B)^{-A} \,,\]
where \(N\) is the total number of data points, \(N_l\) is the number of data points with label \(l\), and \(A\) and \(B\) are dataset-specific parameters.
- Parameters:
Y (ndarray, csr_matrix, list[list[tuple[int|str, float]]]) – Labels (typically the ground truth for the training data), provided as a matrix with non-zero values for relevant labels.
A (float, optional) –
Dataset-specific parameter. Typical values:
0.5: WikiLSHTC-325K and WikipediaLarge-500K
0.6: Amazon-670K and Amazon-3M
0.55: otherwise
Defaults to 0.55.
B (float, optional) –
Dataset-specific parameter. Typical values:
0.4: WikiLSHTC-325K and WikipediaLarge-500K
2.6: Amazon-670K and Amazon-3M
1.5: otherwise
Defaults to 1.5.
- Returns:
Array with the inverse propensity for all labels
- Return type:
ndarray
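The formula above can be sketched directly in NumPy. This is only an illustration, not napkinxc's implementation: `inverse_propensity_sketch` is a hypothetical helper name, and the sketch assumes a dense binary label matrix (rows are data points, columns are labels) and the natural logarithm.

```python
import numpy as np


def inverse_propensity_sketch(Y, A=0.55, B=1.5):
    """Hypothetical helper: evaluate q_l = 1 + C * (N_l + B)^(-A) per label."""
    Y = np.asarray(Y)
    n = Y.shape[0]                        # N: total number of data points
    n_l = np.count_nonzero(Y, axis=0)     # N_l: data points carrying label l
    # C = (log N - 1) * (B + 1)^A, assuming the natural logarithm
    C = (np.log(n) - 1.0) * (B + 1.0) ** A
    return 1.0 + C * (n_l + B) ** (-A)


# Toy data: 4 data points, 3 labels with frequencies 4, 2 and 0
Y = np.array([
    [1, 0, 0],
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 0],
])
q = inverse_propensity_sketch(Y)
```

In this toy example the rarer a label is, the larger its inverse propensity, so `q[0] < q[1] < q[2]`. That ordering relies on \(C > 0\), which holds whenever \(\log N > 1\).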