Table 4 
Feature set for our axiom identification model. The features are based on content and typography.
Content
  Sentence overlap: Semantic textual similarity between the current and next discourse element. We include features that compute the proportion of common unigrams and bigrams across the two discourse elements. This feature is conjoined with the tags assigned to the current and next sentence.
  Geometry entities: Number of geometry entities (constants, predicates, and functions), normalized by the number of tokens in the current discourse element. This feature is conjoined with the tag assigned to the current discourse element.
  Keywords: Indicator that the current discourse element contains any one of the following words: hence, if, equal, twice, proportion, ratio, product. This feature is conjoined with the tag assigned to the current discourse element.

Discourse
  RST edge: Indicator for the RST relation between the current and next discourse element. This feature is conjoined with the tags assigned to the current and next sentence.
  Axiom, Theorem, Corollary mention: (a) The current (or previous) discourse element is mentioned as an Axiom, Theorem, or Corollary (e.g., Similar Triangle Theorem or Corollary 2.1). (b) The section or subsection in the textbook containing the current (or previous) discourse element mentions an Axiom, Theorem, or Corollary. This feature is conjoined with the tag assigned to the current (and previous) discourse element.
  Equation: The current (or next) discourse element contains an equation (e.g., PA × PB = PT²). This feature is conjoined with the tag assigned to the current (and next) sentence.
  Associated diagram: The current discourse element contains a pointer to a figure (e.g., “Figure 2.1”). This feature is conjoined with the tag assigned to the current discourse element.
  Bold/Underline: The current (or previous) discourse element contains text that is in bold font or underlined. This feature is conjoined with the tag assigned to the current (and previous) discourse element.
  Bounding box: Indicator that the current and previous discourse elements are enclosed in the same bounding box in the textbook. This feature is conjoined with the tag assigned to the current (and previous) discourse element.
  JSON structure: Indicator that the current and previous discourse elements are in the same node of the JSON hierarchy. This feature is conjoined with the tag assigned to the current (and previous) discourse element.
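
To make the "conjoined with the tag" wording concrete, the following is a minimal sketch of how the sentence overlap and keyword features could be computed and then conjoined with tags. It is an illustration under stated assumptions, not the model's actual implementation: the table does not say how the overlap proportion is normalized (Jaccard overlap is assumed here), the tokenizer is a simple regular expression, and the function names, the feature-key format, and the example tags "B" and "I" are all hypothetical.

```python
import re
from typing import Dict, List, Set, Tuple

# Keyword list taken from the "Keywords" row of the table.
KEYWORDS = {"hence", "if", "equal", "twice", "proportion", "ratio", "product"}


def tokenize(text: str) -> List[str]:
    """Lowercase and split into alphanumeric tokens (assumed tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def ngrams(tokens: List[str], n: int) -> Set[Tuple[str, ...]]:
    """Set of distinct n-grams in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def overlap(a: List[str], b: List[str], n: int) -> float:
    """Proportion of shared n-grams; Jaccard normalization is an assumption."""
    grams_a, grams_b = ngrams(a, n), ngrams(b, n)
    union = grams_a | grams_b
    return len(grams_a & grams_b) / len(union) if union else 0.0


def conjoin(name: str, value: float, *tags: str) -> Dict[str, float]:
    """Conjoin a feature with one or more tags by folding them into its key."""
    return {"|".join((name,) + tags): value}


def content_features(cur: str, nxt: str, cur_tag: str, nxt_tag: str) -> Dict[str, float]:
    """Overlap features use both tags; the keyword feature uses the current tag only."""
    cur_toks, nxt_toks = tokenize(cur), tokenize(nxt)
    feats: Dict[str, float] = {}
    feats.update(conjoin("unigram_overlap", overlap(cur_toks, nxt_toks, 1), cur_tag, nxt_tag))
    feats.update(conjoin("bigram_overlap", overlap(cur_toks, nxt_toks, 2), cur_tag, nxt_tag))
    has_keyword = any(tok in KEYWORDS for tok in cur_toks)
    feats.update(conjoin("keyword", float(has_keyword), cur_tag))
    return feats


if __name__ == "__main__":
    example = content_features(
        "If two chords intersect", "the two chords are equal", "B", "I"
    )
    for key, value in sorted(example.items()):
        print(key, round(value, 3))
```

Folding the tags into the feature key means each (feature, tag) combination receives its own weight, which is one common way to realize a conjoined feature in a linear sequence model.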