Table 1: Performance of the networks found by the MDL model compared with classical RNNs for the tasks in this paper. Test accuracy is deterministic accuracy, that is, accuracy restricted to deterministic steps. Dyck-n tasks have no deterministic steps, so for them we report categorical accuracy, defined as the fraction of steps where a network assigns a probability lower than ϵ = 0.005 to each of the illegal symbols. Where available, the last column refers to an infinite-accuracy theorem for MDL networks, which describes their behavior not only on a finite test set but over the relevant, infinite language.

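| Task | Training set size | Test cross-entropy (× 10⁻²): MDLRNN | Test cross-entropy (× 10⁻²): RNN | Test cross-entropy (× 10⁻²): optimal | Test accuracy (%): MDLRNN | Test accuracy (%): RNN | Best RNN type | Best RNN size | MDLRNN proof |
|---|---|---|---|---|---|---|---|---|---|
| aⁿbⁿ | 100 | 29.4 | 53.2 | 25.8 | 100.0 | 99.8 | Elman | | Th. 4.1 |
| | 500 | 25.8 | 51.0 | 25.8 | 100.0 | 99.8 | Elman | | |
| aⁿbⁿcⁿ | 100 | 49.3 | 62.6 | 17.2 | 96.5 | 99.8 | Elman | | Th. 4.2 |
| | 500 | 17.2 | 55.4 | 17.2 | 100.0 | 99.8 | Elman | | |
| aⁿbⁿcⁿdⁿ | 100 | 65.3 | 68.1 | 12.9 | 68.6 | 99.8 | GRU | | |
| | 500 | 13.5 | 63.6 | 12.9 | 99.9 | 99.8 | GRU | | |
| aⁿb²ⁿ | 100 | 17.2 | 38.0 | 17.2 | 100.0 | 99.9 | Elman | | Th. 4.3 |
| | 500 | 17.2 | 34.7 | 17.2 | 100.0 | 99.9 | GRU | | |
| aⁿbᵐcⁿ⁺ᵐ | 100 | 39.8 | 47.6 | 26.9 | 98.9 | 98.9 | Elman + L1 | 128 | Th. 4.4 |
| | 500 | 26.8 | 45.1 | 26.9 | 100.0 | 98.9 | Elman | 128 | |
| Dyck-1 | 100 | 110.7 | 94.5 | 88.2 | 69.9 | 10.9 | Elman | | Th. 4.5 |
| | 500 | 88.7 | 93.0 | 88.2 | 100.0 | 10.8 | LSTM | | |
| Dyck-2 | 20,000 | 1.19 | 1.19 | 1.18 | 99.3 | 89.0 | GRU | 128 | |
| Addition | 100 | 0.0 | 75.8 | 0.0 | 100.0 | 74.9 | Elman | | Th. 4.6 |
| | 400 | 0.0 | 72.1 | 0.0 | 100.0 | 79.4 | Elman | | |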
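
The categorical accuracy defined in the caption can be illustrated with a minimal Python sketch. It is not the authors' evaluation code. The function name `categorical_accuracy`, the per-step dictionary representation of the output distribution, and the toy Dyck-1 alphabet with `#` as an end-of-sequence marker are all assumptions made for illustration. Only the threshold ϵ = 0.005 and the definition (every illegal symbol must receive probability below ϵ) come from the caption.

```python
from typing import Dict, Sequence, Set

EPSILON = 0.005  # threshold stated in the table caption


def categorical_accuracy(
    step_probs: Sequence[Dict[str, float]],
    illegal_symbols: Sequence[Set[str]],
    epsilon: float = EPSILON,
) -> float:
    """Fraction of steps at which every illegal symbol gets probability < epsilon.

    step_probs[t] is a hypothetical representation of the network's predicted
    next-symbol distribution at step t; illegal_symbols[t] is the set of
    symbols that would be ungrammatical at that step.
    """
    if len(step_probs) != len(illegal_symbols):
        raise ValueError("need one illegal-symbol set per step")
    if not step_probs:
        raise ValueError("need at least one step")
    correct = sum(
        # A step with no illegal symbols counts as correct (all() of nothing).
        all(probs.get(sym, 0.0) < epsilon for sym in illegal)
        for probs, illegal in zip(step_probs, illegal_symbols)
    )
    return correct / len(step_probs)


# Toy Dyck-1 example (assumed alphabet: "(", ")", and "#" for end of sequence).
probs = [
    {"(": 0.5, ")": 0.499, "#": 0.001},  # open bracket pending: "#" is illegal
    {"(": 0.6, ")": 0.01, "#": 0.39},    # balanced: ")" is illegal
]
illegal = [{"#"}, {")"}]
score = categorical_accuracy(probs, illegal)
```

In this toy input the first step passes (0.001 is below ϵ) and the second fails (0.01 is not), so `score` is 0.5. Because the check is a strict threshold on each illegal symbol rather than an argmax comparison, a network can fail a step even when its most probable symbol is legal.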