Following the online experiment setting in Section 6.3, we evaluated the reliability of SWORE on 40 participants in the online monitoring scenario. As before, we used the 25 prerecorded trials to pretrain the embryonic SWORE, LOR, and SVR models. We then ran SWORE and the two baselines independently 100 times each and report the average prediction accuracy for all 40 participants in Table 5.

Table 5: Comparison of average prediction accuracy on 40 participants (in %).

| ACC | P1 | P2 | P3 | P4 | P5 | P6 | P7 | P8 |
|---|---|---|---|---|---|---|---|---|
| SVR | 69.1 $±$ 0.36 | 76.9 $±$ 0.30 | 74.0 $±$ 0.31 | 70.4 $±$ 0.35 | 72.7 $±$ 0.34 | 70.6 $±$ 0.39 | 51.5 $±$ 0.47 | 76.6 $±$ 0.31 |
| LOR | 72.6 $±$ 0.33 | 78.3 $±$ 0.31 | 73.1 $±$ 0.37 | 73.8 $±$ 0.33 | 73.7 $±$ 0.36 | 74.5 $±$ 0.32 | 75.1 $±$ 0.33 | 74.8 $±$ 0.32 |
| SWORE | **76.0 $±$ 0.30** | **79.9 $±$ 0.29** | **76.5 $±$ 0.31** | **75.9 $±$ 0.31** | **76.7 $±$ 0.31** | **74.6 $±$ 0.33** | **78.3 $±$ 0.32** | **76.9 $±$ 0.30** |

| ACC | P9 | P10 | P11 | P12 | P13 | P14 | P15 | P16 |
|---|---|---|---|---|---|---|---|---|
| SVR | 69.7 $±$ 0.36 | 71.0 $±$ 0.37 | 73.7 $±$ 0.37 | 74.4 $±$ 0.33 | 73.7 $±$ 0.32 | 72.0 $±$ 0.35 | **74.2 $±$ 0.36** | 45.8 $±$ 0.44 |
| LOR | 73.2 $±$ 0.35 | 73.3 $±$ 0.36 | 76.5 $±$ 0.32 | 78.3 $±$ 0.30 | **74.3 $±$ 0.34** | 72.6 $±$ 0.35 | 72.5 $±$ 0.35 | 76.3 $±$ 0.32 |
| SWORE | **75.7 $±$ 0.33** | **75.2 $±$ 0.32** | **78.2 $±$ 0.30** | **78.4 $±$ 0.31** | 74.1 $±$ 0.36 | **74.0 $±$ 0.32** | 74.0 $±$ 0.36 | **79.9 $±$ 0.31** |

| ACC | P17 | P18 | P19 | P20 | P21 | P22 | P23 | P24 |
|---|---|---|---|---|---|---|---|---|
| SVR | 63.3 $±$ 0.41 | 65.6 $±$ 0.43 | 67.0 $±$ 0.38 | 68.8 $±$ 0.36 | 47.4 $±$ 0.45 | 63.7 $±$ 0.40 | 50.9 $±$ 0.38 | 75.3 $±$ 0.35 |
| LOR | 79.8 $±$ 0.29 | 74.4 $±$ 0.34 | 77.6 $±$ 0.32 | 73.7 $±$ 0.36 | **76.6 $±$ 0.34** | 73.1 $±$ 0.34 | **84.2 $±$ 0.27** | **84.6 $±$ 0.24** |
| SWORE | **80.0 $±$ 0.29** | **76.7 $±$ 0.31** | **79.0 $±$ 0.31** | **76.7 $±$ 0.35** | 76.4 $±$ 0.32 | **75.2 $±$ 0.35** | 84.0 $±$ 0.29 | 84.2 $±$ 0.25 |

| ACC | P25 | P26 | P27 | P28 | P29 | P30 | P31 | P32 |
|---|---|---|---|---|---|---|---|---|
| SVR | 71.7 $±$ 0.34 | 72.8 $±$ 0.32 | **75.2 $±$ 0.33** | 69.3 $±$ 0.39 | 72.6 $±$ 0.36 | 79.0 $±$ 0.32 | 70.1 $±$ 0.39 | 39.1 $±$ 0.41 |
| LOR | 71.1 $±$ 0.37 | 76.8 $±$ 0.31 | 72.3 $±$ 0.35 | 76.3 $±$ 0.33 | 79.7 $±$ 0.30 | 78.9 $±$ 0.30 | 77.1 $±$ 0.33 | 80.8 $±$ 0.31 |
| SWORE | **75.2 $±$ 0.32** | **77.8 $±$ 0.29** | 74.0 $±$ 0.32 | **76.9 $±$ 0.31** | **80.0 $±$ 0.29** | **79.5 $±$ 0.28** | **77.5 $±$ 0.30** | **81.1 $±$ 0.28** |

| ACC | P33 | P34 | P35 | P36 | P37 | P38 | P39 | P40 |
|---|---|---|---|---|---|---|---|---|
| SVR | 72.6 $±$ 0.35 | 59.3 $±$ 0.41 | 70.8 $±$ 0.35 | 75.5 $±$ 0.30 | 78.9 $±$ 0.29 | 74.0 $±$ 0.32 | 74.1 $±$ 0.35 | 56.1 $±$ 0.43 |
| LOR | 74.5 $±$ 0.33 | 80.2 $±$ 0.28 | 77.1 $±$ 0.31 | 74.9 $±$ 0.31 | 79.1 $±$ 0.28 | 79.4 $±$ 0.29 | 79.5 $±$ 0.29 | 76.5 $±$ 0.33 |
| SWORE | **77.7 $±$ 0.31** | **80.9 $±$ 0.29** | **77.3 $±$ 0.31** | **76.3 $±$ 0.31** | **79.8 $±$ 0.28** | **81.0 $±$ 0.28** | **80.6 $±$ 0.28** | **78.3 $±$ 0.32** |

Notes: We ran each method independently 100 times and report the mean and 95% confidence interval. The best results are in bold.
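
As a rough illustration of how each cell in Table 5 could be produced, the sketch below computes a mean and a 95% confidence half-width from a list of per-run accuracies. The text does not state how the interval was computed, so the normal approximation (1.96 times the standard error) is an assumption here, and the function names `mean_with_ci95` and `format_cell` are hypothetical helpers, not part of SWORE.

```python
import math
import statistics


def mean_with_ci95(accuracies):
    """Return (mean, half_width) for a 95% confidence interval.

    Assumption: the interval uses the normal approximation,
    half_width = 1.96 * sample_std / sqrt(n). The source does not
    specify the exact procedure.
    """
    n = len(accuracies)
    if n < 2:
        raise ValueError("need at least two runs to estimate an interval")
    mean = statistics.fmean(accuracies)
    # statistics.stdev is the sample standard deviation (n - 1 denominator).
    std_err = statistics.stdev(accuracies) / math.sqrt(n)
    return mean, 1.96 * std_err


def format_cell(mean, half_width):
    """Format one table cell in the 'mean $±$ half-width' style of Table 5."""
    return f"{mean:.1f} $±$ {half_width:.2f}"
```

With 100 independent runs per participant and method, one would call `mean_with_ci95` on the 100 accuracy values (in %) and pass the result to `format_cell`.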

