DP-TSG notation. For consistency, we largely follow the notation of Liang, Jordan, and Klein (2010).
| Symbol | Description |
| --- | --- |
| $\alpha_c$ | DP concentration parameter for each non-terminal type $c \in V$ |
| $P_0(e \mid c)$ | CFG base distribution |
| $\mathbf{x}$ | Set of all non-terminal nodes in the treebank |
| $\mathcal{S}$ | Set of sampling sites (one for each $x \in \mathbf{x}$) |
| $S$ | A block of sampling sites, where $S \subseteq \mathcal{S}$ |
| $\mathbf{b}$ | Binary variables to be sampled ($b_s = 1$ for frontier nodes) |
| $z$ | Latent state of the segmented treebank |
| $m$ | Number of sites $s \in S$ such that $b_s = 1$ |
| $\mathbf{n} = \{n_{c,e}\}$ | Sufficient statistics of $z$ |
| $\Delta\mathbf{n}^{S:m}$ | Change in counts by setting $m$ sites in $S$ |
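
To make the roles of the count-related symbols concrete, the following is a minimal sketch of how the sufficient statistics n = {n_{c,e}}, a count change Δn, the concentration parameters α_c, and the base distribution P0(e|c) might fit together in code. It assumes the standard collapsed Dirichlet process predictive, (n_{c,e} + α_c P0(e|c)) / (n_{c,·} + α_c), which the table itself does not state. The names `SufficientStats`, `apply_delta`, and `predictive`, the bracketed-string encoding of elementary trees, and the constant base distribution are all hypothetical choices made for illustration.

```python
from collections import Counter, defaultdict


class SufficientStats:
    """Counts n[c][e]: uses of elementary tree e rooted at non-terminal c.

    Hypothetical container for illustration; e is any hashable tree encoding.
    """

    def __init__(self):
        self.n = defaultdict(Counter)

    def apply_delta(self, delta, sign=1):
        """Add (sign=1) or remove (sign=-1) a count change {(c, e): k}."""
        for (c, e), k in delta.items():
            self.n[c][e] += sign * k

    def total(self, c):
        """Total count of elementary trees rooted at c."""
        return sum(self.n[c].values())


def predictive(stats, c, e, alpha, p0):
    """Assumed collapsed-DP predictive probability of e given root c.

    alpha: dict mapping non-terminal c to its concentration parameter.
    p0:    callable p0(e, c) returning the base probability P0(e|c).
    """
    return (stats.n[c][e] + alpha[c] * p0(e, c)) / (stats.total(c) + alpha[c])


# Toy usage with made-up counts and a constant stand-in for P0.
stats = SufficientStats()
stats.apply_delta({("NP", "(NP (DT the) NN)"): 3, ("NP", "(NP DT NN)"): 1})
alpha = {"NP": 1.0}
p0 = lambda e, c: 0.5

p_before = predictive(stats, "NP", "(NP (DT the) NN)", alpha, p0)

# Removing one use of a tree plays the role of subtracting a count change.
stats.apply_delta({("NP", "(NP (DT the) NN)"): 1}, sign=-1)
p_after = predictive(stats, "NP", "(NP (DT the) NN)", alpha, p0)
```

Keeping the counts keyed first by root non-terminal makes the denominator a sum over a single `Counter`, which mirrors the per-type concentration parameter α_c in the table.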