Abstract
We present a Chinese word segmentation model learned from punctuation marks, which are perfect word delimiters. The learning is aided by a manually segmented corpus. Our method is considerably more effective than previous methods at unknown word recognition. This is a step toward addressing one of the toughest problems in Chinese word segmentation.
© 2009 Association for Computational Linguistics