Nevertheless, repeatedly computing all 33 features during the instance generation process described in Section 3 could be time-consuming. Therefore, we employed feature selection to reduce this set to a more manageable one. Using the data from the 6480 COCO instances, we define a dissimilarity matrix with entries $1-\rho_{\lambda_i,\lambda_j}$, where $\rho_{\lambda_i,\lambda_j}$ is the Pearson correlation between features $\lambda_i$ and $\lambda_j$. We then use this dissimilarity matrix as input to a k-means clustering algorithm, so that similar features are grouped together. To determine the number of clusters, $k=8$, we use silhouette analysis. The results are shown in Table 2. We leverage our knowledge of the features to select the most suitable one from each cluster. For example, $CN$ and $R^2_Q$ can be computed from the same model, while $EL_{25}$ is required to calculate $LQ_{25}$. Moreover, $H(Y)$ has proven to be an effective predictor of ill-conditioning (Muñoz, Kirley, et al., 2015). The resulting feature vector used to summarize each instance is $\lambda = \left[\, R^2_Q \;\; CN \;\; H(Y) \;\; \xi(1) \;\; \gamma(Y) \;\; EL_{25} \;\; LQ_{25} \;\; PKS \,\right]^{\top}$.
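The procedure above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the feature matrix is a random stand-in for the $6480 \times 33$ COCO feature data, and the candidate range of $k$ and the scikit-learn routines are assumptions. Each feature is represented by its row of the dissimilarity matrix when running k-means, and the silhouette is evaluated directly on the precomputed dissimilarities.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
# Hypothetical stand-in for the 6480 x 33 matrix of feature values
# computed on the COCO instances (rows = instances, columns = features).
X = rng.normal(size=(200, 33))

# Dissimilarity between features i and j: 1 - Pearson correlation.
rho = np.corrcoef(X, rowvar=False)   # 33 x 33 correlation matrix
D = 1.0 - rho                        # dissimilarity matrix
np.fill_diagonal(D, 0.0)             # guard against floating-point residue

# Cluster the features: each feature's coordinates are its row of D,
# and cluster quality is judged by the mean silhouette value.
best_k, best_sil = None, -1.0
for k in range(2, 12):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(D)
    sil = silhouette_score(D, labels, metric="precomputed")
    if sil > best_sil:
        best_k, best_sil = k, sil
```

On the real data, the silhouette analysis selects $k=8$; with the random stand-in above the chosen `best_k` will of course differ.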

Table 2:
Average silhouette value for each feature cluster obtained using correlation as the dissimilarity measure for k-means clustering.
Silhouette | Features in cluster
1.000 | $CN$
0.907 | $H(Y)$, $\beta_{\min}$, $\beta_{\max}$, $\varepsilon_{\max}$
0.534 | $R^2_Q$, $R^2_L$, $R^2_{LI}$, $R^2_{QI}$, $EQ_{10}$, $EQ_{25}$, $EL_{50}$, $EQ_{50}$, $FDC$
0.525 | $LQ_{25}$, $LQ_{10}$, $DISP_{1\%}$, $H_{\max}$, $M_0$
0.514 | $\gamma(Y)$, $\kappa(Y)$
0.385 | $\xi(1)$, $\xi(2)$, $\xi(N)$
0.149 | $EL_{25}$, $EL_{10}$, $LQ_{50}$
0.146 | $PKS$, $\sigma(1)$, $\sigma(2)$, $ET_{10}$, $ET_{25}$, $ET_{50}$
