AED and F0.5 results on the test split of the English-ESL data. Lower AED is better; higher F0.5 is better. The first three columns (whose markers correspond to those in Figure 4) are the punctuation restoration baselines, which ignore the input punctuation. The fourth and fifth columns are our correction models, which use parsed and gold trees, respectively. The final column is the BiLSTM-CRF model tailored for the punctuation correction task.