F1 scores of BERT-large models fine-tuned on CoNLL and evaluated on randomly permuted versions of the dev and test sets: π(dev) and π(test).