Human-rated fluency, entity-relevance, and faithfulness scores of entity-controlled models for entities in different document sentences. The Krippendorff's α inter-rater agreement values for these scores are 0.60, 0.78, and 0.44, respectively.
Sent. | Method | Fluency | Entity-relevance | Faithfulness |
---|---|---|---|---|
3&4 | SD2 | 4.75 | 2.81 | 63% |
3&4 | D.GPT2+CMDP | 4.79 | 3.36 | 64% |
5&6 | SD2 | 4.78 | 2.68 | 62% |
5&6 | D.GPT2+CMDP | 4.78 | 3.29 | 62% |