Feature templates used for the Bio-NER task. $w_i$ is the current word token at position $i$, $t_i$ is the POS tag at position $i$, $o_i$ is the orthography mode at position $i$, and $y_i$ is the classification label at position $i$. $y_{i-1}y_i$ represents a label transition, and $\times$ represents the Cartesian product of two sets.
Word Token–based Features:
$\{w_{i-2},\ w_{i-1},\ w_i,\ w_{i+1},\ w_{i+2},\ w_{i-1}w_i,\ w_iw_{i+1}\} \times \{y_i,\ y_{i-1}y_i\}$
Part-of-Speech (POS)–based Features:
$\{t_{i-2},\ t_{i-1},\ t_i,\ t_{i+1},\ t_{i+2},\ t_{i-2}t_{i-1},\ t_{i-1}t_i,\ t_it_{i+1},\ t_{i+1}t_{i+2},\ t_{i-2}t_{i-1}t_i,\ t_{i-1}t_it_{i+1},\ t_it_{i+1}t_{i+2}\} \times \{y_i,\ y_{i-1}y_i\}$
Orthography Pattern–based Features:
$\{o_{i-2},\ o_{i-1},\ o_i,\ o_{i+1},\ o_{i+2},\ o_{i-2}o_{i-1},\ o_{i-1}o_i,\ o_io_{i+1},\ o_{i+1}o_{i+2}\} \times \{y_i,\ y_{i-1}y_i\}$
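The templates above can be read as a recipe: each observation n-gram around position i is paired with both the current label and the label transition. The following is a minimal sketch of that expansion, not the authors' implementation. The padding symbol `<PAD>`, the start label `<START>`, the feature-string format, and the names `observation_features` and `features_at` are all assumptions made for illustration.

```python
from itertools import product

# Each tuple lists offsets relative to position i that form one n-gram template.
WORD_TEMPLATES = [(-2,), (-1,), (0,), (1,), (2,), (-1, 0), (0, 1)]
POS_TEMPLATES = [(-2,), (-1,), (0,), (1,), (2,),
                 (-2, -1), (-1, 0), (0, 1), (1, 2),
                 (-2, -1, 0), (-1, 0, 1), (0, 1, 2)]
ORTH_TEMPLATES = [(-2,), (-1,), (0,), (1,), (2,),
                  (-2, -1), (-1, 0), (0, 1), (1, 2)]

PAD = "<PAD>"      # assumed symbol for positions outside the sentence
START = "<START>"  # assumed label preceding the first token


def observation_features(seq, i, templates, prefix):
    """Instantiate every n-gram template of one family at position i."""
    feats = []
    for offsets in templates:
        values = [seq[i + k] if 0 <= i + k < len(seq) else PAD for k in offsets]
        name = prefix + "[" + ",".join(str(k) for k in offsets) + "]"
        feats.append(name + "=" + "|".join(values))
    return feats


def features_at(words, tags, orths, labels, i):
    """Cross all observation n-grams with {y_i, y_{i-1} y_i}."""
    obs = (observation_features(words, i, WORD_TEMPLATES, "w")
           + observation_features(tags, i, POS_TEMPLATES, "t")
           + observation_features(orths, i, ORTH_TEMPLATES, "o"))
    prev = labels[i - 1] if i > 0 else START
    label_parts = ["y=" + labels[i], "yy=" + prev + ">" + labels[i]]
    # Cartesian product: every observation n-gram with every label part.
    return [o + "&" + y for o, y in product(obs, label_parts)]
```

With 7 word, 12 POS, and 9 orthography n-grams, each position yields 28 observation n-grams, and the product with the two label parts gives 56 feature instances per position under this sketch.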