Table 4:
Test Error Rates (%) on the 20 bAbI QA Tasks for Models Using 10,000 Training Examples.
| Tasks | Joint NTM | Joint DNC1 | Joint DNC2 | Single D-NTM (ff) | Single D-NTM (GRU) |
|---|---|---|---|---|---|
| 1: One supporting fact | 40.6 ± 6.7 | 9.0 ± 12.6 | 16.2 ± 13.7 | 2.1 ± 5.3 | 5.6 ± 8.8 |
| 2: Two supporting facts | 56.3 ± 1.5 | 39.2 ± 20.5 | 47.5 ± 17.3 | 43.4 ± 11.1 | 57.9 ± 3.9 |
| 3: Three supporting facts | 47.8 ± 1.7 | 39.6 ± 16.4 | 44.3 ± 14.5 | 66.8 ± 3.2 | 54.5 ± 12.2 |
| 4: Two argument relations | 0.9 ± 0.7 | 0.4 ± 0.7 | 0.4 ± 0.3 | 2.5 ± 5.6 | 0.0 ± 0.0 |
| 5: Three argument relations | 1.9 ± 0.8 | 1.5 ± 1.0 | 1.9 ± 0.6 | 2.9 ± 5.5 | 1.5 ± 0.5 |
| 6: Yes/no questions | 18.4 ± 1.6 | 6.9 ± 7.5 | 11.1 ± 7.1 | 35.7 ± 17.0 | 34.9 ± 17.6 |
| 7: Counting | 19.9 ± 2.5 | 9.8 ± 7.0 | 15.4 ± 7.1 | 8.5 ± 3.3 | 13.0 ± 6.9 |
| 8: Lists/sets | 18.5 ± 4.9 | 5.5 ± 5.9 | 10.0 ± 6.6 | 4.8 ± 7.3 | 11.3 ± 9.4 |
| 9: Simple negation | 17.9 ± 2.0 | 7.7 ± 8.3 | 11.7 ± 7.4 | 13.4 ± 11.1 | 36.1 ± 3.6 |
| 10: Indefinite knowledge | 25.7 ± 7.3 | 9.6 ± 11.4 | 14.7 ± 10.8 | 14.4 ± 9.7 | 36.4 ± 10.0 |
| 11: Basic coreference | 24.4 ± 7.0 | 3.3 ± 5.7 | 7.2 ± 8.1 | 3.6 ± 5.0 | 18.3 ± 13.0 |
| 12: Conjunction | 21.9 ± 6.6 | 5.0 ± 6.3 | 10.1 ± 8.1 | 6.2 ± 5.3 | 11.5 ± 10.6 |
| 13: Compound coreference | 8.2 ± 0.8 | 3.1 ± 3.6 | 5.5 ± 3.4 | 3.5 ± 3.2 | 8.3 ± 7.9 |
| 14: Time reasoning | 44.9 ± 13.0 | 11.0 ± 7.5 | 15.0 ± 7.4 | 15.0 ± 19.4 | 58.3 ± 1.5 |
| 15: Basic deduction | 46.5 ± 1.6 | 27.2 ± 20.1 | 40.2 ± 11.1 | 0.0 ± 0.0 | 30.2 ± 12.6 |
| 16: Basic induction | 53.8 ± 1.4 | 53.6 ± 1.9 | 54.7 ± 1.3 | 52.0 ± 2.2 | 49.1 ± 2.4 |
| 17: Positional reasoning | 29.9 ± 5.2 | 32.4 ± 8.0 | 30.9 ± 10.1 | 14.4 ± 15.2 | 36.9 ± 11.3 |
| 18: Size reasoning | 4.5 ± 1.3 | 4.2 ± 1.8 | 4.3 ± 2.1 | 0.1 ± 0.1 | 22.1 ± 22.5 |
| 19: Path finding | 86.5 ± 19.4 | 64.6 ± 37.4 | 75.9 ± 30.4 | 29.5 ± 19.4 | 64.3 ± 8.7 |
| 20: Agent motivation | 1.4 ± 0.6 | 0.0 ± 0.1 | 0.0 ± 0.0 | 0.1 ± 0.2 | 2.6 ± 1.7 |
| Avg. err (%) | 28.5 ± 2.9 | 16.7 ± 7.6 | 20.8 ± 7.1 | 15.9 ± 7.4 | 27.6 ± 8.2 |

Notes: This table reports the average error rate and standard deviation of several models, each trained with different random seeds. "Joint" denotes joint training of one model on all tasks, and "Single" denotes separate training of a separate model on each task. The number in bold indicates the best performance.
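As a quick consistency check on the table, the sketch below recomputes the "Avg. err (%)" entry for two columns from the per-task mean errors listed above. It assumes that the average row is the unweighted mean of the 20 per-task means, which the table does not state explicitly; the function name `average_error` is hypothetical. The standard deviations in the average row cannot be recovered from the table alone, since they depend on per-seed results that are not reported here, so the sketch covers only the means.

```python
from statistics import mean

# Per-task mean test errors (%) for tasks 1 to 20, copied from the table.
joint_ntm = [40.6, 56.3, 47.8, 0.9, 1.9, 18.4, 19.9, 18.5, 17.9, 25.7,
             24.4, 21.9, 8.2, 44.9, 46.5, 53.8, 29.9, 4.5, 86.5, 1.4]
single_dntm_ff = [2.1, 43.4, 66.8, 2.5, 2.9, 35.7, 8.5, 4.8, 13.4, 14.4,
                  3.6, 6.2, 3.5, 15.0, 0.0, 52.0, 14.4, 0.1, 29.5, 0.1]


def average_error(per_task_means):
    """Unweighted mean over tasks (assumed definition of the Avg. err row)."""
    return mean(per_task_means)


print(round(average_error(joint_ntm), 1))       # 28.5, matching the table
print(round(average_error(single_dntm_ff), 1))  # 15.9, matching the table
```

Both values agree with the reported averages to one decimal place, which is consistent with the assumed definition.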
