Table 1

The top three rows summarize the components of the BioScope corpus, namely abstracts (BSA), full papers (BSP), and clinical reports (BSR), annotated for speculation and negation. The bottom row details the held-out evaluation data (BSE) provided for the CoNLL-2010 Shared Task. Columns indicate the total number of sentences and their average length, the number of hedged/negated sentences, the number of cues, and the number of multiword cues (MWCs). (Note that BSE is not annotated for negation, and we do not provide speculation statistics for BSR, as this data set will only be used for the negation experiments.)




                         Speculation              Negation
      Sentences  Length  Sentences  Cues   MWCs   Sentences  Cues   MWCs
BSA   11,871     26.1    2,101      2,659  364    1,597      1,719  86
BSP   2,670      25.7    519        668    84     339        376    23
BSR   6,383      7.7     –          –      –      865        870
BSE   5,003      27.6    790        1,033  87     –          –      –


