MIT Press

Figure 2:

Training and inference phases of segmentation and topic classification (Sector). For training (A), we preprocess Wikipedia documents to supply a ground truth for the segmentation $T$, headings $\mathcal{Z}$, and topic labels $\mathcal{Y}$. During inference (B), we invoke Sector on unseen plain text to predict topic embeddings $e_k$ at the sentence level. The embeddings are used to segment the document and to classify headings $\hat{z}_j$ and normalized topic labels $\hat{y}_j$.
