Figure 1: Left: Histogram of BLEU scores showing wide variance in performance for a base NMT system (Transformer) across different hyperparameters (e.g., BPE operations, number of layers, initial learning rate). Right: Scatterplot of BLEU against decoding time across different hyperparameters. Gold stars mark the Pareto-optimal systems.

