A road map of the various elements that affect MT optimization.
Which Loss Functions? | Which Optimization Algorithm? |
Error (§3.1) | Minimum Error Rate Training (§5.1) |
Softmax (§3.2) | Gradient-based Methods (§5.2, §6.5) |
Risk (§3.3) | Margin-based Methods (§5.3) |
Margin, Perceptron (§3.4) | Linear Regression (§5.4) |
Ranking (§3.5) | Perceptron (§6.2) |
Minimum Squared Error (§3.6) | MIRA (§6.3) |
 | AROW (§6.4) |
Which Evaluation Measure? | Which Hypotheses to Target? |
Corpus-level, Sentence-level (§2.5) | k-best vs. Lattice vs. Forest (§2.4) |
BLEU and Approximations (§2.5.1, §2.5.2) | Merged k-bests (§5) |
Other Measures (§8.3) | Forced Decoding (§2.4), Oracles (§4) |
Other Topics: Large Data Sets (§7), Non-linear Models (§8.1), Domain Adaptation (§8.2), Search and Optimization (§8.4) | |