Training and inference phases of segmentation and topic classification (Sector). For training (A), we preprocess Wikipedia documents to supply a ground truth for segmentation T, headings 𝒵, and topic labels 𝒴. During inference (B), we invoke Sector with unseen plain text to predict topic embeddings e_k at the sentence level. The embeddings are used to segment the document and to classify headings and normalized topic labels.
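To make the inference step (B) more concrete, the following is a minimal sketch of how per-sentence topic embeddings could be turned into segments. It is not the Sector method itself: it assumes, purely for illustration, that a boundary is placed wherever the cosine distance between consecutive sentence embeddings exceeds a fixed threshold. The names `cosine_distance`, `segment`, the `threshold` value, and the toy embeddings are all hypothetical.

```python
import math


def cosine_distance(u, v):
    """Cosine distance between two equal-length vectors (1.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def segment(embeddings, threshold=0.5):
    """Split sentence indices into (start, end) ranges, end exclusive.

    Assumption for this sketch: a new segment starts at sentence k when the
    embedding e_k differs from e_(k-1) by more than `threshold`.
    """
    if not embeddings:
        return []
    boundaries = [0]
    for k in range(1, len(embeddings)):
        if cosine_distance(embeddings[k - 1], embeddings[k]) > threshold:
            boundaries.append(k)
    boundaries.append(len(embeddings))
    return list(zip(boundaries[:-1], boundaries[1:]))


# Toy 2-dimensional "topic embeddings" for four sentences (made-up values).
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
segments = segment(toy_embeddings)
print(segments)
```

On the toy input, the first two sentences and the last two sentences each form one segment, because the only large change in embedding direction occurs between the second and third sentence. A fixed threshold is the simplest possible choice; it is used here only to show where the sentence-level embeddings enter the segmentation step.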