## Abstract

Spurious ambiguity is the phenomenon whereby distinct derivations in a grammar may assign the same structural reading, resulting in redundancy in the parse search space and inefficiency in parsing. Understanding the problem depends on identifying the essential mathematical structure of derivations. This is trivial in the case of context free grammar, where the parse structures are ordered trees; in the case of type logical categorial grammar, the parse structures are proof nets. However, with respect to multiplicatives, intrinsic proof nets have not yet been given for displacement calculus, and proof nets for additives, which have applications to polymorphism, are not easy to characterize. In this context we approach here multiplicative-additive spurious ambiguity by means of the proof-theoretic technique of focalization.

## 1. Introduction

In context free grammar (CFG), sequential rewriting derivations exhibit spurious ambiguity: Distinct rewriting derivations may correspond to the same parse structure (tree) and the same structural reading.^{1} In this case, it is transparent to develop parsing algorithms avoiding spurious ambiguity by reference to parse trees. In categorial grammar (CG), the problem is more subtle. The Cut-free Lambek sequent proof search space is finite, but involves a combinatorial explosion of spuriously ambiguous sequential proofs. This spurious ambiguity in CG can be understood, analogously to CFG, as involving inessential rule reorderings, which we parallelize in underlying geometric parse structures that are (planar) proof nets.
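This contrast can be made concrete with a toy example. The following sketch (a hypothetical illustration; the two-rule grammar and the function names are ours) enumerates every sequential rewriting derivation of a two-word sentence and shows that the distinct derivations, which differ only in rewriting order, collapse to the same parse tree:

```python
# Toy CFG: S -> A B, A -> a, B -> b.  We enumerate every sequential
# rewriting derivation of "ab" and observe that distinct derivations
# (differing only in rewriting order) use the same rule instances,
# i.e., determine the same parse tree.

RULES = {"S": [["A", "B"]], "A": [["a"]], "B": [["b"]]}

def derivations(form, steps=()):
    """Yield complete derivations as sequences of (position, nonterminal)."""
    nts = [i for i, s in enumerate(form) if s in RULES]
    if not nts:
        yield steps
        return
    for i in nts:                     # choice of which nonterminal to rewrite next
        for rhs in RULES[form[i]]:
            yield from derivations(form[:i] + rhs + form[i+1:],
                                   steps + ((i, form[i]),))

ds = list(derivations(["S"]))
print(len(ds))                        # → 2 sequential derivations ...
# ... but one multiset of rule instances, i.e., one parse tree:
print({tuple(sorted(nt for _, nt in d)) for d in ds})  # → {('A', 'B', 'S')}
```

Here two rewriting derivations (rewrite A before B, or B before A) correspond to one ordered tree; a parser that works by reference to trees rather than to rewriting sequences avoids this redundancy by construction.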

The planarity of Lambek proof nets reflects that the formalism is continuous or concatenative. But the challenge of natural grammar is discontinuity or apparent displacement, whereby there is syntactic/semantic mismatch, or elements appearing out of place. Hence the subsumption of Lambek calculus by displacement calculus **D**, including intercalation as well as concatenation (Morrill, Valentín, and Fadda 2011).

Proof nets for **D** must be partially non-planar; steps towards intrinsic correctness criteria for displacement proof nets are made in Fadda (2010) and Moot (2014, 2016). Additive proof nets are considered in Hughes and van Glabbeek (2005) and Abrusci and Maieli (2016). However, even in the case of Lambek calculus, it is not clear that in practice parsing by reference to intrinsic criteria (Morrill 2011; Moot and Retoré 2012, Appendix B) is more efficient than parsing by reference to the extrinsic criteria of uniform sequent calculus (Miller et al. 1991; Hendriks 1993). Uniform proof, on the other hand, does not extend to product left rules and product unit left rules, nor to additives. The focalization of Andreoli (1992) is a methodology midway between proof nets and uniform proof. Here, we apply the focusing discipline to the parsing as deduction of **D** with additives.

In Chaudhuri, Miller, and Saurin (2008), multifocusing is defined for unit-free multiplicative-additive linear logic, providing canonical sequent proofs; an eventual goal would be to formulate multifocusing for multiplicative-additive categorial logic and for categorial logic generally. In this respect the present article represents an intermediate step (and includes units, which have linguistic use). Note that Simmons (2012) develops focusing for Lambek calculus with additives, but not for displacement logic, for which we show completeness of focusing here.

The article is structured as follows. In Sections 1.1 and 1.2 we describe spurious ambiguity in context-free grammar and Lambek calculus. In Section 2 we recall the displacement calculus with additives. In Section 3 we contextualize the problem of spurious ambiguity in computational linguistics. In Section 4 we discuss focalization. In Section 5 we present focalization for the displacement calculus with additives. In Section 6 we prove the completeness of focalization for displacement calculus with additives. In Section 7 we exemplify focalization and evaluate it compared with uniform proof. We conclude in Section 8. In Appendix A we prove the auxiliary technical result of Cut-elimination for weak focalization.

### 1.1. Spurious Ambiguity in CFG

In CFG, inessential reorderings of rewriting steps constitute **spurious ambiguity**, and the underlying geometric parse structures are ordered trees.

### 1.2. Spurious Ambiguity in CG

**L** (Lambek 1958) is a logic of strings with the operation + of concatenation. Recall the definitions of types, configurations, and sequents of **L** in terms of a set of primitive types,^{2} in Backus-Naur form (BNF) notation, where ℱ is the set of types, 𝒪 the set of configurations, and **Seq**(**L**) the set of sequents:

ℱ ::= At | ℱ•ℱ | ℱ/ℱ | ℱ∖ℱ
𝒪 ::= ℱ | ℱ, 𝒪
**Seq**(**L**) ::= 𝒪 ⇒ ℱ

Lambek calculus types have the following interpretation in semigroups or monoids:

[*A*•*B*] = {s₁+s₂ | s₁ ∈ [*A*] and s₂ ∈ [*B*]}
[*C*/*B*] = {s | for all s′ ∈ [*B*], s+s′ ∈ [*C*]}
[*A*∖*C*] = {s′ | for all s ∈ [*A*], s+s′ ∈ [*C*]}

Where *A*, *B*, *C*, *D* are types and Γ, Δ are configurations, the logical rules of Lambek calculus are as follows:

id: *A* ⇒ *A*
/*L*: from Γ ⇒ *B* and Δ(*C*) ⇒ *D*, infer Δ(*C*/*B*, Γ) ⇒ *D*
/*R*: from Γ, *B* ⇒ *C*, infer Γ ⇒ *C*/*B*
∖*L*: from Γ ⇒ *A* and Δ(*C*) ⇒ *D*, infer Δ(Γ, *A*∖*C*) ⇒ *D*
∖*R*: from *A*, Γ ⇒ *C*, infer Γ ⇒ *A*∖*C*
•*L*: from Δ(*A*, *B*) ⇒ *D*, infer Δ(*A*•*B*) ⇒ *D*
•*R*: from Γ ⇒ *A* and Δ ⇒ *B*, infer Γ, Δ ⇒ *A*•*B*

There is completeness when the types are interpreted in free semigroups or monoids (Pentus 1993, 1998).

Underlying equivalent Lambek sequent derivations are geometric parse structures, **proof nets**; for a survey see Lamarche and Retoré (1996). Proof nets provide a geometric perspective on derivational equivalence. Alternatively, we may identify the same algebraic parse structure (Curry-Howard term). But Lambek calculus is continuous (planarity). A major issue in grammar is discontinuity, namely, syntax/semantics mismatch (e.g., the fact that quantifier phrases occur in situ but take sentential scope; or “gapping” coordination), hence the displacement calculus. This provides a general accommodation of discontinuity in grammar, by contrast with, for example, Combinatory Categorial Grammar (Steedman 2000), which seeks minimal case-by-case augmentations of the formalism to deal with quantification, gapping, and so on.^{3}
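The combinatorial explosion of spuriously ambiguous sequential proofs can be seen concretely. The following minimal sketch (hypothetical illustrative code, not CatLog; the encoding is ours) performs exhaustive cut-free backward-chaining proof search for the product-free Lambek fragment and counts distinct sequent proofs. For the types *Q* = *S*/(*N*∖*S*) and *TV* = (*N*∖*S*)/*N* used later in Section 4, the sequent *Q*, *TV*, *N* ⇒ *S* has one structural reading but several cut-free sequent proofs:

```python
# Naive cut-free backward-chaining proof search for product-free Lambek
# calculus, counting *all* sequential proofs to expose spurious ambiguity.
# Type encoding: ("at", a) atoms, ("/", C, B) for C/B, ("\\", A, C) for A\C.

def count_proofs(ant, suc):
    total = 0
    # id (restricted to atomic types)
    if len(ant) == 1 and suc[0] == "at" and ant[0] == suc:
        total += 1
    # right rules
    if suc[0] == "/":                        # Γ ⇒ C/B  from  Γ, B ⇒ C
        total += count_proofs(ant + [suc[2]], suc[1])
    elif suc[0] == "\\":                     # Γ ⇒ A\C  from  A, Γ ⇒ C
        total += count_proofs([suc[1]] + ant, suc[2])
    # left rules: choose an occurrence and a split of its argument material
    for i, t in enumerate(ant):
        if t[0] == "/":                      # C/B consumes Γ to its right
            C, B = t[1], t[2]
            for j in range(i + 1, len(ant) + 1):
                total += (count_proofs(ant[i+1:j], B)
                          * count_proofs(ant[:i] + [C] + ant[j:], suc))
        elif t[0] == "\\":                   # A\C consumes Γ to its left
            A, C = t[1], t[2]
            for j in range(i + 1):
                total += (count_proofs(ant[j:i], A)
                          * count_proofs(ant[:j] + [C] + ant[i+1:], suc))
    return total

N, S = ("at", "N"), ("at", "S")
NS = ("\\", N, S)                            # N∖S
Q, TV = ("/", S, NS), ("/", NS, N)           # S/(N∖S), (N∖S)/N
print(count_proofs([Q, TV, N], S))           # → 3
```

Three cut-free proofs, one reading: the surplus is exactly the inessential rule-reordering discussed in this article.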

## 2. **D** with Additives, **DA**

The basic categorial grammar of Ajdukiewicz and Bar-Hillel is concatenative/continuous/projective, and this feature is reflected in the fact that it is context-free equivalent in weak generative power (Bar-Hillel, Gaifman, and Shamir 1960). The same is true of the logical categorial grammar of Lambek (1958), which is still context free in generative power (Pentus 1992). The main challenge in natural grammar comes from syntax/semantics mismatch or displacement; such non-concatenativity/discontinuity/non-projectivity is treated in mainstream linguistics by overt movement (e.g., the verb-raising of cross-serial dependencies) and covert movement (e.g., the quantifier-raising of quantification). The displacement calculus is a response to this challenge, which preserves all the good design features of Lambek calculus while extending the generative power and capturing “movement” phenomena such as cross-serial dependencies and quantifier-raising.^{4}

In this section we present displacement calculus **D**, and a displacement logic **DA** comprising **D** with additives. Although **D** is indeed a conservative extension of the Lambek calculus allowing empty antecedents (**L***), we think of it not just as an *extension* of Lambek calculus but as a *generalization*, because it involves a whole reformulation to deal with discontinuity while conserving **L*** as a special case.

**D** is a logic of discontinuous strings, with the operations + of concatenation and ×_{k} of intercalation (intercalation at the *k*th separator, counting from the left); see Figure 3. Recall the definition of types and their sorts, configurations and their sorts, and sequents, for the displacement calculus with additives (*i* and *j* range over the naturals 0, 1, …). Where Λ is the metasyntactic empty string there is now the BNF definition of configurations, in which the separator 1 marks a point of discontinuity. For example, there is the configuration (*S*↑_{1}*N*)↑_{2}*N*{*N*, 1: *S*↑_{1}*N*{*N*}, *S*}, 1, *N*, 1, of sort 3. The sort of a type *A* is defined by:

s(*A*•*B*) = s(*A*)+s(*B*); s(*I*) = 0; s(*C*/*B*) = s(*C*)−s(*B*); s(*A*∖*C*) = s(*C*)−s(*A*)
s(*A*⊙_{k}*B*) = s(*A*)+s(*B*)−1; s(*J*) = 1; s(*C*↑_{k}*B*) = s(*C*)−s(*B*)+1; s(*A*↓_{k}*C*) = s(*C*)−s(*A*)+1
s(*A*&*B*) = s(*A*⊕*B*) = s(*A*) = s(*B*)

Where Γ is a configuration of sort *i* and Δ_{1}, …, Δ_{i} are configurations, the **fold** Γ ⊗ 〈Δ_{1}: …: Δ_{i}〉 is the result of replacing the successive 1’s in Γ by Δ_{1}, …, Δ_{i} respectively. For example, ((*S*↑_{1}*N*)↑_{2}*N*{*N*, 1: *S*↑_{1}*N*{*N*}, *S*}, 1, *N*, 1) ⊗ 〈1: *N*, *N*∖*S*: Λ〉 is (*S*↑_{1}*N*)↑_{2}*N*{*N*, 1: *S*↑_{1}*N*{*N*}, *S*}, *N*, *N*∖*S*, *N*.

The continuous multiplicatives {/, ∖, •, *I*} of Lambek (1958, 1988), which are defined in relation to concatenation, are the basic means of categorial (sub)categorization. The directional divisions over, /, and under, ∖, are exemplified by assignments such as *the*: *N*/*CN* for *the man*: *N*, *sings*: *N*∖*S* for *John sings*: *S*, and *loves*: (*N*∖*S*)/*N* for *John loves Mary*: *S*; hence there are derivations for *the man*, *John sings*, and *John loves Mary*. The continuous product • is exemplified by an assignment *considers*: (*N*∖*S*)/(*N*•(*CN*/*CN*)) for *John considers Mary socialist*: *S*. Of course, this use of product is not essential: We could just as well have used ((*N*∖*S*)/(*CN*/*CN*))/*N*, because in general we have both *A*/(*C*•*B*) ⇒ (*A*/*B*)/*C* (currying) and (*A*/*B*)/*C* ⇒ *A*/(*C*•*B*) (uncurrying).
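The currying equivalence can be rendered as a small transformation on type trees. The following sketch is a hypothetical encoding of ours (• written `"*"`), rewriting every subtype of the form *A*/(*C*•*B*) to (*A*/*B*)/*C*:

```python
# Type encoding: ("at", a) atoms, ("/", A, B) = A/B,
# ("\\", A, B) = A∖B, ("*", A, B) = A•B.

def curry(t):
    """Recursively rewrite A/(C•B) to (A/B)/C (currying)."""
    if t[0] == "at":
        return t
    if t[0] == "/" and t[2][0] == "*":
        A, (_, C, B) = t[1], t[2]
        return curry(("/", ("/", A, B), C))
    return (t[0], curry(t[1]), curry(t[2]))

N, S, CN = ("at", "N"), ("at", "S"), ("at", "CN")
# considers: (N∖S)/(N•(CN/CN))
considers = ("/", ("\\", N, S), ("*", N, ("/", CN, CN)))
# curry(considers) is the product-free ((N∖S)/(CN/CN))/N of the text
print(curry(considers))
```

The result is the product-free assignment given above; by the two displayed derivabilities the curried and uncurried types are interderivable, so the choice between them is one of lexical convenience.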

The discontinuous multiplicatives {↑_{k}, ↓_{k}, ⊙_{k}, *J*}, the displacement connectives, of Morrill and Valentín (2010) and Morrill, Valentín, and Fadda (2011) are defined in relation to intercalation. When the value of the *k* subscript is 1 it may be omitted (i.e., it defaults to one). Circumfixation, or extraction, ↑, is exemplified by a discontinuous particle-verb assignment *calls*+1+*up*: (*N*∖*S*)↑*N* for *Mary calls the man up*: *S*.

## 3. The Problem of Spurious Ambiguity in Computational Linguistics

This section elaborates the bibliographic context of so-called spurious ambiguity, a problem frequently arising in varieties of parsing of different formalisms. The spurious ambiguity that has been discussed in the literature falls in two broad classes: that for categorial parsing and that for dependency parsing.

The literature on spurious ambiguity for categorial parsing is represented by the following.

- •
Hepple (1990) provides an analysis of normal form theorem proving for Cut-free Lambek calculus without product. Two systems are considered: first, a notion of normalization for this implicative Lambek calculus, and second, a constructive calculus generating all and only the normal forms of the first. The latter consists of applying right rules as much as possible and then switching to left rules, which revert to the right phase in the minor premise and which conserve the value active type in the major premise. Hepple shows that the system is sound and complete and that it delivers a unique proof in each Cut-free semantic equivalence class. This amounts to uniform proof (Miller et al. 1991), but was developed independently. Retrospectively, it can be seen as focusing for the implicative fragment of Lambek calculus. It is straightforwardly extendible to product right (Hendriks 1993), but not product left, which requires the deeper understanding of the focusing method.

- •
Eisner (1996) provides a normal form framework for Combinatory Categorial Grammar (CCG) with generalized binary composition rules. CCG is a version of categorial grammar with a small number of combinatory schemata (or a version of CFG with an infinite number of non-terminals) in which the basic directional categorial cancellation rules are extended with phrase structure schemata corresponding to combinators of combinatory logic. Eisner, following Hepple and Morrill (1989), defines a notion of normal form by a restriction on which rule mothers can serve as which daughters of subsequent rules in bottom–up parsing. Eisner notes that his marking of rules has a rough resemblance to the rule regulation of Hendriks (1993) (which is focusing). But whereas the former is Cut-based rule normalization, the latter is Cut-free metarule normalization. Eisner’s method applies to a wide range of harmonic and non-harmonic composition rules, but not to type-lifting rules. It has been suggested by G. Penn (personal communication) that it is not clear whether CCG is actually categorial grammar; at least, it is not logical categorial grammar; in any case, the focusing methodology we use here is still normalization, but represents a much more general discipline than that used by Eisner.

- •
Moortgat and Moot (2011) study versions of proof nets and focusing for so-called Lambek-Grishin calculus. Lambek-Grishin calculus is like Lambek calculus but includes a disjunctive multiplicative connective family as well as the conjunctive multiplicative connective family of the Lambek calculus

**L**, and it is thus multiple-conclusioned. For the case **LG** considered, suitable structural linear distributivity postulates facilitate the capacity to describe non-context-free patterns. By contrast with displacement calculus **D**, which is single-conclusioned (it is intuitionistic), and which absorbs the structural properties in the sequent syntax (it has no structural postulates), **LG** has a non-classical term regime that can assign multiple readings to a proof net, and has semantically non-neutral focusing and defocusing rules. Thus it is quite different from the system considered here in several respects. The multiplicative focusing for **LG** is a straightforward adaptation of that for linear logic and additives are not addressed; importantly, the backward-chaining focused **LG** search space still requires the linear distributivity postulates (which leave no trace in the terms assigned), whereas the backward-chaining **D** search space has no structural postulates.

The literature on spurious ambiguity for dependency parsing is represented by the following.

- •
Huang, Vogel, and Chen (2011) address the problem of word alignment between pairs of corresponding sentences in (statistical) machine translation. Because this task may be very complex (in fact the search space may grow exponentially), a technique called

**synchronous parsing** is used to constrain the search space. However, this approach exhibits the problem of spurious ambiguity, which that paper elaborates in depth. We do not know at this moment whether this work can bring us useful techniques to deal with spurious ambiguity in the field of logic theorem-proving. - •
Goldberg and Nivre (2012) and Cohen, Gómez-Rodríguez, and Satta (2012) focus on the problem of spurious ambiguity of a general technique for dependency parsing called

**transition-based dependency parsing**. The spurious ambiguity investigated in those papers arises in transition systems where different sequences of transitions yield the same dependency tree. The framework of dependency grammars is non-logical in its nature, and the spurious ambiguity referred to here is completely different from the one addressed in our article, and unfortunately not useful to our work. - •
Hayashi, Kondo, and Matsumoto (2013), which is situated in the field of dependency parsing, proposes sophisticated parsing techniques that may have spurious ambiguity. A method is presented to deal with it, based on normalization and canonicity of sequences of actions of appropriate transition systems.

All these approaches tackle by a kind of normalization the general phenomena of spurious ambiguity, whereby different but equivalent alternative applications of operations result in the same output. Focalization in particular is applicable to logical grammar, in which parsing is deduction. This turns out to be quite a deep methodology, revealing, for example, not only invertible (“reversible”) rules, which were known to the early proof-theorists, but also dual focusing (“irreversible”) rules, providing a general discipline for logical normalization, which we now elaborate.

## 4. Reducing Spurious Ambiguity: On Focalization

So far we have given the logical rules of **DA**. A further sequent rule remains, the so-called *Cut* rule, which incorporates in the sequent calculus the notion of (*contextualized*) *transitivity*:

Γ ⇒ *A*   Δ(*A*) ⇒ *B*
――――――――――――――― *Cut*
Δ(Γ) ⇒ *B*

Note that the Cut rule does not have the property that every type in the premises is a (sub)type of some type in the conclusion (the subformula property); hence it is an obstacle to proof search. Any standard logic should have the Cut rule because this encodes transitivity of the entailment relation, but at the same time a major result for logics presented as sequent calculi is the **Cut-elimination** theorem or, as it is usually known after Gentzen, the **Hauptsatz**.^{5} This very important theorem gives as byproducts the subformula property and decidability in the case of **DA**.

### 4.1. Properties of Cut-Free Proof Search

The reversible rules are /*R*, ∖*R*, •*L*, *IL*, ↑_{k}*R*, ↓_{k}*R*, ⊙_{k}*L*, *JL*, &*R*, and ⊕*L*. By way of example, consider /*R*:

(25a) Γ ⇒ *C*/*B*
(25b) Γ, *B* ⇒ *C*

We can safely reverse sequent (25a) into sequent (25b) because both provability and non-provability are preserved—that is, Γ ⇒ *C*/*B* is provable iff Γ, *B* ⇒ *C* is provable. We say that the type-occurrence of *C*/*B* in the succedent of (25a) is **reversible**.^{6} This means that in the face of alternative reversible rule options a choice can be made arbitrarily and the other possibilities forgotten (**don’t care nondeterminism**).

The irreversible rules are /*L*, ∖*L*, •*R*, *IR*, ↑_{k}*L*, ↓_{k}*L*, ⊙_{k}*R*, *JR*, &*L*, and ⊕*R*. By way of example, consider the /*L* rule:

Γ ⇒ *B*   Δ(*C*) ⇒ *D*
――――――――――――――― /*L*
Δ(*C*/*B*, Γ) ⇒ *D*

We cannot assume safely that there exist configurations Γ and Δ(*C*) that match the antecedent of the end-sequent and that preserve provability: We do not have that Δ(*C*/*B*, Γ) ⇒ *D* is provable iff Γ ⇒ *B* and Δ(*C*) ⇒ *D* are provable. In this case, we say that the distinguished type-occurrence *C*/*B* in the above sequent is **irreversible**.^{7} In the face of alternative irreversible rule options, the choice matters and different possibilities must be tried (**don’t know nondeterminism**).

#### 4.1.1. On Reversible Rules.

- •
Applying the two rules in either order, both (proof) derivations have the same syntactic structure (and the same semantic lambda-term labeling, which we introduce later when we present focused calculus).

- • The reversible rules •*L* and ∖*R* *commute* in the order of application in the proof search. These commutations contribute to spurious ambiguity.
- • Reversible rules can be applied don’t care nondeterministically. This result was already known to proof-theorists in the 1930s (Gentzen, and others such as Kleene). These rules were called **invertible** rules.

#### 4.1.2. On Irreversible Rules.

- • Dually, both (proof) derivations have the same syntactic structure (i.e., proof net), as well as the same semantic term labeling codifying the structure of derivations.
- • The irreversible rules for /_{(S/(N∖S))/CN} and /_{(N∖S)/N} *commute* in the order of application in the proof search. These commutations also contribute to spurious ambiguity.
- • Contrary to the behavior of reversible types, the choice of the active type (the type decomposed) is critical when it is irreversible.

### 4.2. Consequences

- •
A combinatorial explosion in the finite Cut-free proof search space due to the aforementioned rule commutativity.

- •
The solution: proof nets. But proof nets for categorial logic in general are not fully understood, for example, in the case of units and additive connectives.

- •
This problem is exacerbated in displacement calculus because there are more connectives and hence more rules giving rise to spurious ambiguities.

- •
A good partial solution: the discipline of focalization.

### 4.3. Toward a (Considerable) Elimination of Spurious Ambiguity: Focalization

Reversible rules commute in their order of application and their key property is reversibility. Dually, irreversible rules also commute and their key property (completely unknown to the early proof-theorists^{8}) is so-called **focalization**, which was defined by Andreoli (1992).

### 4.4. The Discipline of Focalization

Given a sequent, reversible rules are applied in a don’t care nondeterministic fashion until no longer possible. When there are no occurrences of reversible types, one chooses an irreversible type don’t know nondeterministically as active type-occurrence; we say that this type occurrence is in **focus**, or that it is **focalized**; and one applies proof search to its subtypes while these remain irreversible. When one finds a reversible type, reversible rules are applied in a don’t care nondeterministic fashion again until no longer possible, when another irreversible type is chosen, and so on.

#### 4.4.1. A Non-focalized Proof: A “Gymkhana in the Proof-search”.

Consider a sequent (27) where *Q* = *S*/(*N*∖*S*) and *TV* = (*N*∖*S*)/*N*, with its algebraic Curry-Howard term, and consider the proof fragment shown. This fragment of proof concludes in a correct proof derivation. But crucially, the discipline of focalization is not applied, resulting—in the words of the father of linear logic—in a gymkhana! (In the sense of jumping back and forth.) The foci on irreversible types are not preserved, and alternating irreversible types are chosen. Concretely, the underlines show how the active irreversible type in the right premise is not a subtype of the active irreversible type in the end-sequent. The discipline of Andreoli’s focalization consists of maintaining the focus on the irreversible subtypes, hence reducing the search space and spurious ambiguity. This rests on the focusing property: if an irreversible rule instance leads to a proof, it leads to a proof in which the subtypes of the active type are the active types of the premises, if they are irreversible. This is the remarkable proof-theoretic discovery of Andreoli, which fortunately can be applied in the present case, as this article explains. The discipline of focalization dramatically reduces the combinatorial explosion of categorial proof search.

#### 4.4.2. A Focalized Proof Derivation.

### 4.5. A Last Ingredient in the Focalization Discipline

- •
What happens with literal types? Can they be considered reversible or irreversible? What happens then in the focalized proof-search paradigm?

- • The answers: Literals can be assigned what is called in the literature a reversible or irreversible **bias**, in any manner, according to which they belong to the set of reversible or irreversible types. (As we state later, we leave open the question of which biases may be favorable.) We thus speak of **compound** reversible or irreversible types, and of **atomic** reversible or irreversible types.

## 5. Focalization for **DA**

**Situated** (in the antecedent of a sequent, input: ^{•}; in the succedent of a sequent, output: °) compound types are classified as of *reversible* or *irreversible polarity* according to whether their associated rule is reversible or not; situated atoms obviously do not have associated rules, but we can extend the concept of polarity to them, overloading the terms *reversible* and *irreversible*, applying the previously mentioned concept of bias. We define in Example (31) a BNF grammar for arbitrary types (observe that atomic types are classified as either At^{•} or At°). *P* and *Q* in Example (31) denote reversible types (including atomic types) occurring in input and output position, respectively. Dually, if *P* and *Q* occur in the output and input position, respectively, then they are said to occur with irreversible polarity. In the atomic case, the same terms (*reversible* and *irreversible*) are also used. The table in Example (32) summarizes the notational convention on arbitrary (atomic or compound) polarized formulas *P* and *Q*. Notice that in Example (32) arbitrary polarized types exhibit a kind of De Morgan duality: If reversible output types *Q* occur in input position, then they are irreversible; whereas if reversible input types *P* occur in output position, then they are irreversible.

## 6. Completeness of Focalization for **DA**

In order to prove that **DA** is complete with respect to focalization, we define a logic **DA**_{Foc} with the following features: (a) the set of configurations is extended to the set _{box}, (b) the set of sequents **Seq**(**DA**) is extended to the set **Seq**(**DA**_{Foc}), and (c) a new set of logical rules is given. The set of configurations _{box} contains , and in addition it contains **boxed** configurations, by which we understand configurations where a unique irreversible type-occurrence is decorated with a box. The set of sequents **Seq**(**DA**_{Foc}) includes **DA** sequents with possibly a box in the sequent. We have then **Seq**(**DA**) ⫋ **Seq**(**DA**_{Foc}). Sequents of **Seq**(**DA**_{Foc}) can contain at most one boxed type-occurrence. The meaning of such a box is to mark in the proof-search an irreversible (possibly atomic) type-occurrence either in the antecedent or in the succedent of a sequent. We will say that such a sequent is **focalized**.

We will use **judgements foc** and **rev** on **DA**_{Foc}-sequents. Where ∈ **Seq**(**DA**_{Foc}), **foc** means that contains a boxed type occurrence, and **rev** means that there is a complex reversible type occurrence. It follows that Constraint (33) can be judged as **foc** ∧ ¬**rev**. The judgment ¬ **foc** means that is not focalized (and so may contain reversible type-occurrences).

Proof search in **DA**_{Foc} is **strongly** focalized and induces maximal alternating phases of reversible and irreversible rule application. Pseudo-code for a recursive algorithm of strongly focalized proof search is as follows:

The top level call to determine whether a sequent *S* is provable is prove(*S*). The routine prove(*S*) calls the routine prove_rev_lst with actual parameter the unitary list [*S*]. The routine prove_rev_lst then applies reversible rules to its list of sequents Ls in a don’t care nondeterministic manner until none of the sequents contain any reversible type (i.e., it closes Ls under reversible rules). Then prove_irrev_lst is called on the list of sequents. This calls prove_irrev(*S*′) for focusings *S*′ of each sequent, and if some focusing of each sequent is provable the result **true** is returned, otherwise **false** is returned. The procedure prove_irrev applies focusing rules and recurses back on prove_rev_lst and prove_irrev_lst to determine provability for the given focusing.
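As a concrete illustration of this regime, here is a minimal Python sketch restricted (for brevity) to the product-free Lambek fragment, with atoms given irreversible bias; the function names and the collapsing of the prove_rev_lst list bookkeeping into plain recursion are our simplifications, not CatLog’s implementation. The reversible phase decomposes succedent implications don’t care nondeterministically; the focusing phase chooses an antecedent type-occurrence don’t know nondeterministically and keeps the focus on its value subtype:

```python
# Focused backward-chaining proof search, product-free Lambek fragment.
# Type encoding: ("at", a) atoms, ("/", C, B) = C/B, ("\\", A, C) = A∖C.
# Atoms get irreversible bias, so the reversible phase only decomposes
# succedent implications; a chosen focus is then maintained.

def count_focused(ant, suc):
    """Reversible phase: apply /R, ∖R don't-care; then choose a focus."""
    if suc[0] == "/":                       # Γ ⇒ C/B  from  Γ, B ⇒ C
        return count_focused(ant + [suc[2]], suc[1])
    if suc[0] == "\\":                      # Γ ⇒ A∖C  from  A, Γ ⇒ C
        return count_focused([suc[1]] + ant, suc[2])
    # succedent atomic: don't-know choice of focused antecedent occurrence
    return sum(focus(ant, i, suc) for i in range(len(ant)))

def focus(ant, i, suc):
    """Decompose the focused type at position i, keeping focus on its value."""
    t = ant[i]
    if t[0] == "at":                        # focused id must close the branch
        return 1 if len(ant) == 1 and t == suc else 0
    if t[0] == "/":                         # C/B consumes Γ to its right;
        C, B = t[1], t[2]                   # the argument premise loses focus
        return sum(count_focused(ant[i+1:j], B)
                   * focus(ant[:i] + [C] + ant[j:], i, suc)
                   for j in range(i + 1, len(ant) + 1))
    A, C = t[1], t[2]                       # A∖C consumes Γ to its left
    return sum(count_focused(ant[j:i], A)
               * focus(ant[:j] + [C] + ant[i+1:], j, suc)
               for j in range(i + 1))

N, S = ("at", "N"), ("at", "S")
NS = ("\\", N, S)
Q, TV = ("/", S, NS), ("/", NS, N)          # S/(N∖S), (N∖S)/N
print(count_focused([Q, TV, N], S))         # → 1
```

Whereas unfocused exhaustive search finds three cut-free proofs of this sequent, maintaining the focus collapses them to exactly one, with the same reading: the focusing discipline prunes precisely the inessential rule reorderings.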

In order to prove completeness of (strong) focalization we invoke also an intermediate *weakly* focalized system. In all we shall be dealing with three systems: the displacement calculus with additives **DA** with sequents notated Δ ⇒ *A*, the *weakly focalized* displacement calculus with additives **DA**_{foc} with sequents notated Δ ⇒_{w}*A*, and the *strongly focalized* displacement calculus with additives **DA**_{Foc} with sequents notated Δ⇒*A*. Sequents of both **DA**_{foc} and **DA**_{Foc} may contain at most one focalized formula. When a **DA**_{foc} sequent is notated Δ ⇒_{w}*A* ◊**focalized**, it means that the sequent possibly contains a (unique) focalized formula. Otherwise, Δ ⇒_{w}*A* means that the sequent does not contain a focus. In **DA**_{foc} Constraint (33) is not imposed. Thus, whereas strong focalization imposes maximal alternating phases of reversible and irreversible rule application, weak focalization does not impose this maximality. In this section we prove the strong focalization property for the displacement calculus with additives **DA**; that is, strong focalization is complete.

The focalization property for Linear Logic was defined by Andreoli (1992). In this article we follow the proof idea from Laurent (2004), which we adapt to the intuitionistic non-commutative case **DA** with twin multiplicative modes of combination, the continuous (concatenation), and the discontinuous (intercalation) products. The proof relies heavily on the Cut-elimination property for weakly focalized **DA**, which is proved in Appendix A. In our presentation of focalization we have avoided the *react* rules of Andreoli (1992) and Chaudhuri (2006), and use instead our simpler box notation suitable for non-commutativity.

**DA**_{Foc} is a subsystem of **DA**_{foc}. **DA**_{foc} has the focusing rules *foc* and Cut rules *p*-*Cut*_{1}, *p*-*Cut*_{2}, *n*-*Cut*_{1}, and *n*-*Cut*_{2}^{9} shown in Equation (34), and the reversible and irreversible rules displayed in the figures, which are read as allowing in irreversible rules the occurrence of reversible types, and in reversible rules as allowing arbitrary sequents possibly with a focalized type; when ◊**focalized** appears in both premise and conclusion, it means that they are either both focused or both unfocused.

### 6.1. Embedding of **DA** into **DA**_{foc}

The identity axiom *Id* we consider for **DA** and for both **DA**_{foc} and **DA**_{Foc} is restricted to atomic types, recalling that atomic types are classified into irreversible bias *At*° and reversible bias *At*^{•}. In fact, the identity axiom holds of any type *A*. The generalized case is called the **identity rule Id**. It has the following formulation in the sequent calculi considered here.

The identity rule *Id*, which applies not just to atomic types (like *Id*), but to all types is easy to prove in both **DA** and **DA**_{foc}, but the same is not the case for **DA**_{Foc}. This is the reason to consider what we have called weak focalization, which helps us to prove smoothly this crucial property for the proof of strong focalization.

(*Embedding of* **DA** *into* **DA**_{foc})

For any configuration Δ and type *A*, we have that if ⊢ Δ ⇒ *A* then ⊢ Δ ⇒_{w}*A*.

*Proof*. We proceed by induction on the length of the derivation of **DA** proofs. In the following lines, we apply the induction hypothesis (i.h.) for each premise of **DA** rules (with the exception of the identity rule and the right rules of units):

- – Cut rule: just apply *n*-*Cut*.
- – Left unit rules apply as in the case of **DA**.
- – Left discontinuous product: Directly translates.
- – Product and implicative continuous rules: These follow the same pattern as the discontinuous case. We interchange the metalinguistic *k*-th intercalation |_{k} with the metasyntactic concatenation ‘, ’, and we interchange ⊙_{k}, ↑_{k}, and ↓_{k} with •, /, and ∖, respectively.

### 6.2. Embedding of **DA**_{foc} into **DA**_{Foc}

*Proof*. We proceed by induction on the size of **DA**_{foc}-provable sequents, namely on ||.^{10} Since **DA**_{Foc} is Cut-free (see Appendix A) we consider Cut-free **DA**_{foc} proofs of .

- –
Case || = 0. This corresponds to the axiom case, which is the same for both calculi

**DA**_{foc}and**DA**_{Foc}— see Equation (35). - –
Suppose || > 0. Because || > 0, does not correspond to an axiom. Hence is derivable with at least one logical rule. Otherwise, the size of would be equal to 0. The last rule ⋆ can be either a logical rule or a

*foc*rule (keep in mind we are considering Cut-free**DA**_{foc}proofs!). We have two cases: - a)
If ⋆ is logical, because is supposed to belong to

**Seq**(**DA**_{Foc}), it follows that its premises (possibly only one premise) belong also to**Seq**(**DA**_{Foc}). The premises have size strictly less than . Therefore, we can safely apply i.h., whence we conclude. - b)

- – ′ is **foc** ∧ ¬**rev**. Because is focalized and there is no reversible type occurrence, the last rule of ′ must correspond either to a multiplicative or additive irreversible rule. The premises (possibly only one premise) have size strictly less than . We can then safely apply the induction hypothesis (i.h.), which gives us **DA**_{Foc} provable premises to which we can apply the ⋆ rule. Whence **DA**_{foc} ⊢ ′, and hence **DA**_{Foc} ⊢ .
- – ′ is **foc** ∧ **rev**.

The size of ′ equals that of the end-sequent , and moreover the premise is **foc** ∧ **rev**, which does not belong to **Seq**(**DA**_{foc}). Clearly, we cannot apply the i.h. What can we do?

Working within **DA**_{foc} we can overcome this difficulty. ′ contains at least one reversible formula. We see three cases that are illustrative of the situation, and consider these in turn:

- a) We have by *Id* that . We apply to this sequent the reversible ⊙_{k} left rule, whence . In case (37a), we have the following proof in **DA**_{foc}: To this **DA**_{foc} proof we apply Cut-elimination and we get the Cut-free **DA**_{foc} end-sequent . We have < . We can then apply the i.h. and we derive the provable **DA**_{Foc} sequent , to which we can apply the left ⊙_{k} rule. We have obtained .
- b) In the same way, we have that . Thus, in case (37b), we have the following proof in **DA**_{foc}: As before, we apply Cut-elimination to the above proof. We get the Cut-free **DA**_{foc} end-sequent . It has size less than . We can apply the i.h. and we obtain the **DA**_{Foc} provable sequent , to which we apply the ↑_{k} right rule.
- c) By applying the *foc* rule and the invertibility of &*R* we obtain the provable **DA**_{foc} sequents and . These sequents have smaller size than . The aforementioned sequents have a Cut-free proof in **DA**_{foc}. We apply the i.h. and we obtain and . We apply the & right rule in **DA**_{Foc}, and we obtain .

Let **Prov**_{Σ}(*C*) be the class of Σ sequents provable in calculus *C*.

Strong focalization is complete.

*Proof*. Observe that in particular **Prov**_{Seq(DA)}(**DA**_{Foc}) = **Prov**_{Seq(DA)}(**DA**_{foc}). Because, by Theorem 1, **Prov**_{Seq(DA)}(**DA**_{foc}) = **Prov**(**DA**), we have that **Prov**_{Seq(DA)}(**DA**_{Foc}) = **Prov**(**DA**).
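The chain of equalities in this proof can be set out explicitly (assuming, as used above, that Theorem 1 is the completeness of the weakly focalized calculus **DA**_{foc}):

```latex
\begin{align*}
\mathbf{Prov}_{\mathbf{Seq}(\mathbf{DA})}(\mathbf{DA}_{Foc})
  &= \mathbf{Prov}_{\mathbf{Seq}(\mathbf{DA})}(\mathbf{DA}_{foc})
     && \text{(strong focalization, theorem above)}\\
  &= \mathbf{Prov}(\mathbf{DA})
     && \text{(Theorem 1)}
\end{align*}
```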

## 7. Evaluation and Exemplification

CatLog version f1.2, CatLog1 (Morrill 2012), is a parser/theorem-prover using uniform proof and count-invariance for multiplicatives. CatLog version k2, CatLog3 (Morrill 2017), is a parser/theorem-prover using focusing, as expounded here, and count-invariance for multiplicatives, additives, brackets, and exponentials (Kuznetsov, Morrill, and Valentín 2017). To compare the performance of uniform proof and focusing, we created a system version CatLog3f1.2 by substituting the theorem-proving engine of CatLog1 into the theorem-proving environment of CatLog3, so that count-invariance and other factors were kept constant while the uniform and focusing theorem-proving engines were compared.

We ran the *Montague test* (Morrill and Valentín 2016) on the two systems, that is, the task of providing a computational grammar of the Montague fragment. In particular, we ran exhaustive parsing of the minicorpus given in Figure 13. The Montague lexicon was as in Figure 14. The tests were executed in XGP Prolog on a MacBook Air. The running times in seconds for exhaustive parsing were as follows.

The tests for the minicorpus show that the focusing parsing/proof-search runs in half the time of the uniform parsing/proof-search.
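The controlled comparison can be sketched as follows. This is an illustrative Python harness, not the actual experimental setup (CatLog is a Prolog system); `parse_uniform` and `parse_focused` are hypothetical stand-ins for the two theorem-proving engines:

```python
import timeit

# Hypothetical stand-ins for the two engines. In the actual experiment both
# ran inside the same CatLog3 environment, so that count-invariance and all
# other factors were held constant while only the engine varied.
def parse_uniform(sentence):
    return ["proof"] * len(sentence.split())   # placeholder for exhaustive uniform search

def parse_focused(sentence):
    return ["proof"] * len(sentence.split())   # placeholder for exhaustive focused search

def bench(parser, corpus, repeats=3):
    """Best wall-clock time over several runs of exhaustively parsing the corpus."""
    return min(timeit.repeat(lambda: [parser(s) for s in corpus],
                             number=1, repeat=repeats))

corpus = ["John walks", "every man talks"]     # stands in for the Montague minicorpus
t_uniform = bench(parse_uniform, corpus)
t_focused = bench(parse_focused, corpus)
print(f"uniform: {t_uniform:.4f}s  focused: {t_focused:.4f}s")
```

Taking the minimum over repeated runs is the usual way to damp scheduler noise when comparing two engines on the same corpus.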

We could also have made a comparison with proof net parsing/theorem proving (Moot 2016), but our proposal includes additives, for which proof nets are still an open question, and not just the displacement calculus multiplicatives; moreover, the point of focalization is that it is a general methodology extensible to, for example, exponentials and other modalities, for which proof nets are also still under investigation. That is, our focalization approach is *scalable* in a way that proof nets currently are not, and in this sense comparison with proof nets is not quite appropriate.

## 8. Conclusion

We have claimed that just as the parse structures of context free grammar are ordered trees, the parse structures of categorial grammar are proof nets. Thus, just as we think of context free algorithms as finding ordered trees, bottom–up, top–down, and so forth, we can think of categorial algorithms as finding proof nets. The complication is that proof nets are much more sublime objects than ordered trees: They embody not only syntactic coherence but also semantic coherence. Focalization is a step on the way to eliminating spurious ambiguity by building such proof nets systematically. A further step on the way, eliminating all spurious ambiguity, would be multifocusing. This remains a topic for future study. Another topic for further study in focusing is the question of which assignments of bias are favorable for the processing of given lexica/corpora.

Alternatively, for context free grammar one can perform chart parsing, or tabularization, and at least for the basic case of Lambek calculus suitable notions of proof net also support tabularization (Morrill 1996; de Groote 1999; Pentus 2010; Kanovich et al. 2017). This also remains a topic for future study.

For the time being, however, we hope to have motivated the relevance of focalization to categorial parsing as deduction in relation to the **DA** categorial logic fragment, which leads naturally to the program of focalization of extensions of this fragment with connectives such as exponentials.

## Appendix A. Cut Elimination in **DA**_{foc}

We prove Cut elimination by induction on the complexity (*d*, *h*) of topmost instances of *Cut*, where *d* is the size^{11} of the Cut formula and *h* is the length of the Cut-free derivation above the Cut rule. There are four cases to consider: Cut with axiom in the minor premise, Cut with axiom in the major premise, principal Cuts, and permutation conversions. In each case, the complexity of the Cut is reduced. In order to save space, we do not show all the cases exhaustively, because many follow the same pattern. In particular, for any irreversible logical rule there are always four cases to consider, corresponding to the polarity of the subformulas; in the following we show only one representative example. Concerning continuous and discontinuous formulas, we show only the discontinuous cases (discontinuous connectives are less well-known than the continuous ones of the plain Lambek calculus). For the continuous instances, the reader has only to interchange the metalinguistic wrap |_{k} with the metalinguistic concatenation ‘,’, ⊙_{k} with •, ↑_{k} with /, and ↓_{k} with ∖. The units cases (principal case and permutation conversion cases) are completely trivial.
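The induction is licensed by the well-foundedness of the lexicographic order on complexities (*d*, *h*): each principal conversion strictly decreases *d* (possibly increasing *h*), and each permutation conversion preserves *d* while strictly decreasing *h*. A minimal Python sketch of this ordering (the tuples below are illustrative complexities, not actual derivations):

```python
def smaller(c1, c2):
    """c1 < c2 in the lexicographic order on Cut complexities (d, h):
    d1 < d2, or d1 == d2 and h1 < h2."""
    (d1, h1), (d2, h2) = c1, c2
    return d1 < d2 or (d1 == d2 and h1 < h2)

# A principal conversion replaces a Cut on a compound formula by Cuts on
# its subformulas: d strictly decreases, even if h grows arbitrarily.
assert smaller((2, 99), (3, 1))

# A permutation conversion pushes the Cut upward past a non-principal
# rule: d is unchanged and h strictly decreases.
assert smaller((3, 4), (3, 5))

# Python's native tuple comparison implements exactly this order.
assert smaller((2, 99), (3, 1)) == ((2, 99) < (3, 1))
```

Since the order has no infinite descending chains, repeatedly converting topmost Cuts must terminate in a Cut-free proof.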

*Proof*. Identity cases:

Principal cases (the case of ↓_{k} is entirely similar to the ↑_{k} case):

Left commutative *p*-*Cut* conversions:

Right commutative *p*-*Cut* conversions (unordered multiple distinguished occurrences are separated by semicolons):

Left commutative *n*-*Cut* conversions:

Right commutative *n*-*Cut* conversions:

This completes the proof.

## Acknowledgments

We thank anonymous *Computational Linguistics* referees for comments and suggestions that have improved this article. This research was supported by grant TIN2017-89244-R from MINECO (Ministerio de Economía, Industria y Competitividad).

## Notes

This paper is a revised and expanded version of Morrill and Valentín (2015).

The original Lambek calculus did not include the product unit and had a non-empty antecedent condition (“Lambek’s restriction”). The displacement calculus used in the present article conservatively extends the Lambek calculus without Lambek’s restriction and with product units.

It is known that displacement calculus (without additives) generates a well-known class of mildly context free languages: the well-nested multiple context free languages (Sorokin 2013; Wijnholds 2014). At the time of writing, only this and other lower bounds are known; tight upper bounds on the weak generative capacity of displacement calculus constitute an open question.

This is because in Cut-free backward-chaining proof search for a given goal sequent a finite number of rules can be applied backwards in only a finite number of ways to generate subgoals at each step, and these subgoals have lower complexity (fewer connectives) than the goal matched; hence the proof search space is finite.
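This finiteness argument can be made concrete with a toy backward-chaining prover, sketched here for the product-free Lambek fragment only (the full displacement calculus additionally handles intercalation, additives, and so on; `prove` and `connectives` are illustrative names, not part of any described system). Every backward rule application removes exactly one connective, so recursion depth, and hence the search space, is bounded by the connective count of the goal sequent:

```python
def connectives(t):
    """Connective count of a type: an atom is a str; a compound is (op, a, b)."""
    if isinstance(t, str):
        return 0
    op, a, b = t
    return 1 + connectives(a) + connectives(b)

def prove(gamma, c):
    """Exhaustive Cut-free backward-chaining search for gamma => c in
    product-free Lambek calculus. ('/', a, b) is a/b; ('\\', a, b) is a\\b.
    Each backward step removes one connective, so search terminates."""
    if gamma == [c]:                                  # identity axiom
        return True
    if isinstance(c, tuple):                          # right rules
        op, a, b = c
        if op == '/' and prove(gamma + [b], a):       # gamma => a/b from gamma, b => a
            return True
        if op == '\\' and prove([a] + gamma, b):      # gamma => a\b from a, gamma => b
            return True
    for i, t in enumerate(gamma):                     # left rules
        if isinstance(t, str):
            continue
        op, a, b = t
        if op == '/':   # from Delta => b and G, a, G' => c infer G, a/b, Delta, G' => c
            for j in range(i + 1, len(gamma) + 1):
                if prove(gamma[i+1:j], b) and prove(gamma[:i] + [a] + gamma[j:], c):
                    return True
        else:           # from Delta => a and G, b, G' => c infer G, Delta, a\b, G' => c
            for j in range(0, i + 1):
                if prove(gamma[j:i], a) and prove(gamma[:j] + [b] + gamma[i+1:], c):
                    return True
    return False

# n, n\s => s (subject plus intransitive verb) and type raising n => s/(n\s):
assert prove(['n', ('\\', 'n', 's')], 's')
assert prove(['n'], ('/', 's', ('\\', 'n', 's')))
assert not prove(['n'], 's')
```

The naive antecedent-splitting in the left rules is exponential; the point here is only the termination argument, not efficiency.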

Other terms found in the literature are invertible, asynchronous, or negative.

Other terms found in the literature are non-invertible, synchronous, or positive.

Unknown even to the inventor of Linear Logic, J.-Y. Girard.

If it is convenient, we may drop the subscripts.

For a given type *A*, the *size* of *A*, |*A*|, is the number of connectives in *A*; the notion extends by recursion to configurations and sequents by summing the sizes of the types they contain.

The size |*A*| of *A* is the number of connectives appearing in *A*.

## References

*Special issue Festschrift for Joachim Lambek*.