napkinxc.measures.Jain_et_al_inverse_propensity

napkinxc.measures.Jain_et_al_inverse_propensity(Y, A=0.55, B=1.5)

Calculate inverse propensities as proposed by Jain et al. (2016). The inverse propensity \(q_l\) of label \(l\) is calculated as:

\[C = (\log N - 1)(B + 1)^A \,, \ q_l = 1 + C(N_l + B)^{-A} \,,\]

where \(N\) is the total number of data points, \(N_l\) is the number of data points annotated with label \(l\), and \(A\) and \(B\) are dataset-specific parameters.
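The formula can be sketched directly in NumPy to show how it behaves. This is an illustrative re-implementation under stated assumptions, not napkinxc's own code: the helper name `inverse_propensity_sketch` is hypothetical, \(\log\) is assumed to be the natural logarithm, and `Y` is assumed to be a dense 0/1 matrix with one row per data point and one column per label.

```python
import numpy as np

def inverse_propensity_sketch(Y, A=0.55, B=1.5):
    """Hypothetical helper: evaluates q_l = 1 + C * (N_l + B)^(-A) per label."""
    Y = np.asarray(Y)
    n = Y.shape[0]                         # N: total number of data points
    label_counts = (Y != 0).sum(axis=0)    # N_l: data points carrying label l
    # Assumption: log is the natural logarithm.
    C = (np.log(n) - 1.0) * (B + 1.0) ** A
    return 1.0 + C * (label_counts + B) ** (-A)

# Toy data: 4 data points, 3 labels; label 0 is frequent, labels 1 and 2 are rare.
Y = np.array([
    [1, 0, 1],
    [1, 0, 0],
    [1, 1, 0],
    [1, 0, 0],
])
weights = inverse_propensity_sketch(Y)
print(weights)
```

In this toy example the rare labels receive a larger inverse propensity than the frequent one, which is the intended effect when these values are used as weights in propensity-scored measures. Note that for a label with \(N_l = 1\) the expression reduces to \(q_l = \log N\), since the \((B + 1)^A\) factors cancel.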

Parameters:

Y (ndarray, csr_matrix, list[list[tuple[int|str, float]]]) – Labels (typically the ground truth for the training data), provided as a matrix with non-zero values for relevant labels.
A (float, optional) –
Dataset-specific parameter. Typical values:
- 0.5: WikiLSHTC-325K and WikipediaLarge-500K
- 0.6: Amazon-670K and Amazon-3M
- 0.55: otherwise
Defaults to 0.55.
B (float, optional) –
Dataset-specific parameter. Typical values:
- 0.4: WikiLSHTC-325K and WikipediaLarge-500K
- 2.6: Amazon-670K and Amazon-3M
- 1.5: otherwise
Defaults to 1.5.

Returns:

Array with the inverse propensity for all labels

Return type:

ndarray