A comparison of the log-perplexity of the base and target models (a), the corresponding histogram over the Δppl axis (b), and the relative proportions of the three datasets in each δppl percentile (c), for 30k examples sampled from PRE ∪ Lang-8BF, with 10k examples drawn from REV, RT, and Lang-8 respectively. The x-axis of the histogram (b) has been reversed so that the ‘best’ examples (those with the lowest Δppl) appear towards the right, matching the alignment of the δppl plot (c); in the scatter plot (a), the best examples lie towards the bottom-right. The δppl scores shown in (c) are the values actually used by the various training strategies.
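One plausible way to derive the quantities plotted above can be sketched as follows. This is an illustrative assumption, not the paper's implementation: here Δppl is taken as each example's log-perplexity under the base model minus its log-perplexity under the target model (so lower values mark the 'best' examples), and δppl is taken as the percentile rank of Δppl within the sampled pool.

```python
# Hedged sketch (assumed definitions, not the paper's code):
#   delta_ppl  = log-ppl under base model - log-ppl under target model
#   delta_pct  = percentile rank of delta_ppl within the 30k sample
import numpy as np


def delta_ppl(logppl_base: np.ndarray, logppl_target: np.ndarray) -> np.ndarray:
    """Per-example change in log-perplexity between the two models."""
    return logppl_base - logppl_target


def percentile_scores(scores: np.ndarray) -> np.ndarray:
    """Map raw scores to their percentile rank in [0, 100)."""
    order = scores.argsort()
    ranks = np.empty_like(order)
    ranks[order] = np.arange(len(scores))
    return 100.0 * ranks / len(scores)


# Synthetic stand-in log-perplexities for a 30k-example pool.
rng = np.random.default_rng(0)
base = rng.normal(5.0, 1.0, size=30_000)
target = base - rng.normal(0.5, 0.3, size=30_000)

d = delta_ppl(base, target)        # x-axis of histogram (b)
pct = percentile_scores(d)         # percentile binning for panel (c)
```

Binning `pct` into, say, 100 buckets and counting the share of REV, RT, and Lang-8 examples per bucket would reproduce the shape of panel (c) under these assumptions.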