The strengths (+), weaknesses (−), and improvement directions (?) of the three main streams of TST methods on non-parallel data.
| Method | Strengths & Weaknesses |
| --- | --- |
| Disentanglement | + Stronger theoretical grounding, e.g., in disentangled representation learning |
| | − Deep generative models (VAEs, GANs) are difficult to train for text |
| | − Hard to represent all styles as latent codes |
| | − Computational cost rises with the number of styles to model |
| Prototype Editing | + High BLEU scores due to heavy word preservation |
| | − The attribute marker detection step can fail if style and semantics are confounded (see the salience sketch below) |
| | − Template-based target attribute retrieval can fail when the transfer requires large rewrites, e.g., Shakespearean vs. modern English |
| | − The target attribute retrieval step is expensive (quadratic in the number of sentences) |
| | − Large computational cost if there are many styles, each of which needs a pre-trained LM for the generation step |
| | ? Future work can enable matching under syntactic variation |
| | ? Future work can use grammatical error correction to post-edit the output |
| Pseudo-Parallel Corpus Construction | + Performance can approach that of supervised models if the pseudo-parallel data are of high quality (see the matching sketch below) |
| | − May fail for small corpora |
| | − May fail if the mono-style corpora contain few samples with similar content |
| | − For iterative back-translation (IBT), training can diverge and sometimes needs special designs to prevent it (see the IBT sketch below) |
| | − For IBT, time complexity is high, due to iterative pseudo-data generation |
| | ? Improve the convergence of IBT |
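The attribute marker detection step flagged above is commonly realized with an n-gram salience score, in the spirit of the delete-retrieve-generate pipeline (Li et al., 2018): n-grams that occur far more often in one style corpus than the other are treated as markers of that style. The Python sketch below is a minimal illustration; the whitespace tokenization, threshold `gamma`, and smoothing constant are illustrative assumptions, not canonical values.

```python
from collections import Counter

def attribute_markers(style_corpus, other_corpus, max_n=2, gamma=2.0, smooth=1.0):
    """Rank n-grams by salience: the smoothed ratio of their counts in
    one style corpus versus the other. N-grams whose salience exceeds
    `gamma` are treated as attribute markers of the first corpus's style."""
    def ngram_counts(corpus):
        counts = Counter()
        for sent in corpus:
            tokens = sent.lower().split()
            for n in range(1, max_n + 1):
                for i in range(len(tokens) - n + 1):
                    counts[tuple(tokens[i:i + n])] += 1
        return counts

    style, other = ngram_counts(style_corpus), ngram_counts(other_corpus)
    markers = {}
    for gram, count in style.items():
        salience = (count + smooth) / (other[gram] + smooth)
        if salience > gamma:
            markers[gram] = salience
    return markers

# Toy usage: ('delicious',) is the only marker above gamma=2.0 here.
pos = ["the food was delicious", "delicious pasta and friendly staff"]
neg = ["the food was awful", "awful service and rude staff"]
markers = attribute_markers(pos, neg)
print(sorted(markers, key=markers.get, reverse=True))
```

Lowering `gamma` would also flag content words that merely correlate with one style (e.g., 'pasta' in the toy data), which is exactly the confound between style and semantics noted in the table.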
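For Pseudo-Parallel Corpus Construction, the retrieval-based route pairs sentences of similar content across the two mono-style corpora. The sketch below assumes sentence embeddings have already been produced by some encoder; the function name, threshold, and toy random inputs are hypothetical stand-ins. It also makes the quadratic retrieval cost visible: the all-pairs similarity matrix grows with |src| × |tgt|.

```python
import numpy as np

def pseudo_pairs(src_embs, tgt_embs, threshold=0.8):
    """Pair sentences across two mono-style corpora by cosine similarity
    of pre-computed sentence embeddings, keeping pairs above `threshold`.
    The all-pairs similarity matrix is O(|src| * |tgt|) in time and
    memory, the quadratic retrieval cost noted in the table; approximate
    nearest-neighbor indexes are the usual remedy at scale."""
    src = src_embs / np.linalg.norm(src_embs, axis=1, keepdims=True)
    tgt = tgt_embs / np.linalg.norm(tgt_embs, axis=1, keepdims=True)
    sims = src @ tgt.T                       # cosine similarity, all pairs
    best = sims.argmax(axis=1)               # nearest target per source
    return [(i, int(j)) for i, j in enumerate(best) if sims[i, j] >= threshold]

# Toy usage with random vectors standing in for real sentence embeddings.
rng = np.random.default_rng(0)
print(pseudo_pairs(rng.normal(size=(5, 16)), rng.normal(size=(8, 16)), threshold=0.1))
```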
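The IBT rows describe a loop in which two transfer models bootstrap each other with pseudo-parallel data. The following is a schematic sketch, not any specific published system: the `fwd`/`bwd` objects and their `.transfer()`/`.train()` methods are hypothetical stand-ins, and the loop omits the filtering or early-stopping safeguards that real systems use to keep training from diverging.

```python
def iterative_back_translation(fwd, bwd, corpus_x, corpus_y, rounds=3):
    """Schematic IBT loop for two style-transfer models: `fwd` maps style
    x -> y and `bwd` maps y -> x. Each round, one model's outputs become
    pseudo-parallel training pairs for the other. Because each round
    regenerates pseudo data for both directions, the cost grows with the
    number of rounds, the high time complexity noted in the table."""
    for _ in range(rounds):
        pseudo_y = fwd.transfer(corpus_x)              # x -> y' (pseudo targets)
        bwd.train(list(zip(pseudo_y, corpus_x)))       # train y -> x on (y', x)
        pseudo_x = bwd.transfer(corpus_y)              # y -> x' (pseudo targets)
        fwd.train(list(zip(pseudo_x, corpus_y)))       # train x -> y on (x', y)
    return fwd, bwd
```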