Figure 4: 
Results for each combination of recurrent unit and attention type. All numbers are medians over 100 initializations. Symbols indicate the attention type (no attention, location-based attention, or content-based attention). A grayed-out cell indicates that the architecture scored below 50% on the test set. In (b), the SRN produced the first auxiliary 45% of the time; for all other models, the proportion of first-auxiliary outputs is almost exactly one minus the first-word accuracy (i.e., the proportion of main-auxiliary outputs).

