| Type | Error | Explanation |
|---|---|---|
| 1 | Could be considered correct | Cases of true semantic ambiguity, where both analyses could be considered correct. For example, in the phrase mrkz kwx erbi the adjective erbi (“Arab”) modifies mrkz (“center”) in gold, while the parser attaches it to kwx (“force”). |
| | Clause attachment | In complex sentences with multiple clauses or coordinated structures, the parser often identifies the conjunctions and the predicates correctly but makes mistakes in connecting the clauses. Semantic or world knowledge is required for disambiguation. |
| | PP attachment | Semantic or world knowledge is also often required to determine PP attachment. For example, in the clause kdi lmnwe hedptm el ewbdim ifralim the parser attaches the PP el ewbdim ifralim (“over Israeli workers”) to the verb lmnwe (“to prevent”) rather than to the required noun hedptm (“their preference”). |
| 2 | Seg/Tag err in focus word | Incorrect segmentation of a token may lead to missing or incorrect dependency heads. For example, the parser analyses the token bqrb as a single word (a preposition, “near”), while in the gold standard it is segmented into three words, b + h + qrb (preposition + def + noun, “in the battle”). This leads to missing dependency heads. |
| | Seg/Tag err in other word | Incorrect segmentation of a token may also lead to an incorrect dependent. For example, in the phrase bqrb mgnnh the parser analyses the PP b + qrb (preposition + noun, “in battle”) as a single word bqrb (preposition, “near”). As a result, the word mgnnh (“defence”) is labeled as the object of a preposition (pobj) rather than as the genitive object of a construct-state noun (gobj). |
| | Label err due to tagging err | An incorrect tag prediction may lead to an arc label that is appropriate for the predicted tag yet incorrect. For example, in the phrase amcei xi lhpgnwt (“living means for demonstrations”) the parser tags the adjective xi (“living”) as a noun, which is why it attaches xi to “means” as gobj (genitive object) rather than as amod. |
| 3 | Gold is wrong | The analysis in gold is wrong, while the analysis provided by the parser is correct. For example, in the phrase w+b+silwp ewbdwt (“and in distortion of facts”), the conjunction marker w is labeled comp in gold, while the parser correctly picks cc. |
| | Train is inconsistent | (a) Multiple labels are used for the same type of dependency. For example, prepmod and comp are both used in the train set for prepositional complements and prepositional modifiers without a clear distinction. (b) Identical structures are analyzed in different ways. For example, the train set uses different structures for the same type of partitive construction. In both (a) and (b), the predicted analyses might likewise be inconsistent and arbitrary. |
| | Label underspecified | The generic label dep is used in gold in place of more specific dependency types. In several cases the test set uses a more specific label where the parser predicts dep, and vice versa. |
| 4 | Other | A smaller number of errors involve linguistic structures that reflect particular Semitic phenomena. For example: (a) Indefinite objects in Hebrew are not case marked, so they are sometimes mislabeled as subjects due to flexible word order and object pre-posing. (b) Construct-state nouns may be analysed as names and vice versa. Since Hebrew lacks capitalization, Hebrew names very often string-match common nouns. (c) Adjective attachment errors inside construct-state nouns. For example, in the phrase hjlt qnswt kbdim the parser attaches the adjective kbdim (“heavy”) to the construct-state noun hjlt (“imposition-of”) instead of attaching it to the genitive object qnswt (“fines”). |
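
The mechanical part of such an error analysis can be illustrated with a minimal sketch that compares a gold and a predicted analysis of the same phrase and sorts each mismatch into a segmentation, attachment, or label error. This is not the tooling used to produce the table: the `Token` type, the `classify_errors` function, the `root` label, and the use of head index 0 for a head outside the phrase are all assumptions made for illustration. Only surface mismatches can be detected this way; deciding that an analysis "could be considered correct" or that "gold is wrong" still requires manual inspection.

```python
from collections import Counter
from typing import NamedTuple


class Token(NamedTuple):
    form: str   # transliterated word form after segmentation
    head: int   # 1-based index of the head; 0 = head outside the phrase (assumption)
    label: str  # dependency label, e.g. gobj, amod, pobj


def classify_errors(gold, pred):
    """Count coarse error kinds for one phrase (hypothetical helper)."""
    # If the segmentations differ, tokens cannot be compared one by one.
    if [t.form for t in gold] != [t.form for t in pred]:
        return Counter({"segmentation": 1})
    counts = Counter()
    for g, p in zip(gold, pred):
        if g.head != p.head:
            counts["attachment"] += 1   # wrong head
        elif g.label != p.label:
            counts["label"] += 1        # right head, wrong label
    return counts


# Example (c) from the table: hjlt qnswt kbdim
gold = [Token("hjlt", 0, "root"), Token("qnswt", 1, "gobj"), Token("kbdim", 2, "amod")]
pred = [Token("hjlt", 0, "root"), Token("qnswt", 1, "gobj"), Token("kbdim", 1, "amod")]
print(classify_errors(gold, pred))  # Counter({'attachment': 1})
```

Checking segmentation first mirrors the table's point that a segmentation error makes head and label comparisons for the affected tokens meaningless.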