Abstract
In this article we report work on Chinese semantic role labeling, taking advantage of two recently completed corpora, the Chinese PropBank, a semantically annotated corpus of Chinese verbs, and the Chinese Nombank, a companion corpus that annotates the predicate-argument structure of nominalized predicates. Because the semantic role labels are assigned to the constituents in a parse tree, we first report experiments in which semantic role labels are automatically assigned to hand-crafted parses in the Chinese Treebank. This gives us a measure of the extent to which semantic role labels can be bootstrapped from the syntactic annotation provided in the treebank. We then report experiments using automatic parses with decreasing levels of human annotation in the input to the syntactic parser: parses that use gold-standard segmentation and POS-tagging, parses that use only gold-standard segmentation, and fully automatic parses. These experiments gauge how successful semantic role labeling for Chinese can be in more realistic situations. Our results show that when hand-crafted parses are used, semantic role labeling accuracy for Chinese is comparable to what has been reported for the state-of-the-art English semantic role labeling systems trained and tested on the English PropBank, even though the Chinese PropBank is significantly smaller in size. When an automatic parser is used, however, the accuracy of our system is significantly lower than the English state of the art. This indicates that an improvement in Chinese parsing is critical to high-performance semantic role labeling for Chinese.