Calculating Zero Pronominals in Situ: A Type Logical Approach

Zero pronominals challenge Type Logical Grammar (TLG) in two ways. First, TLG imposes a linear resource management regime on semantic composition, so pronominals call for special treatment if they are to multiply resources. Second, as a lexicalist grammar, TLG applies only to phonologically realized lexical entries, which rules out phonetically null items in syntactic derivation. Jäger extends the inventory of category-forming connectives of TLG with a third kind of implication that creates categories of anaphoric items, which solves the first problem. This article goes a step further and tackles the second. To formalize constructions with zero pronominals, we design a ternary category [A〈B〉C] and add it to Jäger's system. The proposed system is proof-theoretically well-behaved: it is sound, complete, and decidable. More importantly, zero pronominals of various forms can be derived in the system.


Introduction
Natural languages are economical, arguably on all tangible levels, conveying more information with less effort [1,2]. On the sentence level, various resource reuse strategies are used to avoid repetition. The reused resources are sometimes realized as pronouns, reflexives, and auxiliaries, as in (1)-(3), and are sometimes covert, as with the subject of the second coordinate in (4) and the PROs in the control constructions (5) and (6). Interestingly, however, the opposite holds on the semantic side. Each of the repeated resources, even if it has zero form, multiplies the meaning of its antecedent and reappears in the semantics. For instance, the two PROs in (5) and (6) pick up the meaning representations of their controllers in the logical interpretations. They are logically represented as the subject JOE′ and the object JEAN′ (in bold), respectively, of the upstairs verbs, as shown in (7) and (8).
(1) Joe claims that he will win. (pronoun)
The issue of how to calculate these resource reuse mechanisms in TLG is an apparent challenge to the grammar because TLG assumes a monostratal model of natural language, which means that it can only combine neighboring items or constructions. However, all the reused linguistic items in (1)-(6), including anaphora, non-constituent coordination, and control, involve discontinuous constructions, in which the anaphor/PRO is located several words away from its antecedent/controller. Thus, the grammar has to be capable of coping with discontinuity in order to be a competent natural language grammar. There are mainly two ways to deal with anaphoric constructions in a type-logical setting. One is to treat pronominals as being triggered by certain lexical items whose semantic representations contain a λ-operator that binds more than one variable occurrence [3][4][5][6][7]. However, a strategy of this kind forces highly complex lexical entries and brings formidable mechanisms, like secondary wrapping [5], into syntax. The other is to introduce into syntax a specifically designed operation that secures for pronominals a semantic interpretation as simple as an identity function. Hepple's permutation operator Δ [8], Jacobson's variable-free semantics [9,10], and Jäger's Lambek calculus with Limited Contraction (LLC) [11] are all attempts of this sort, among which the last has attracted the most attention and, thanks to its theoretical simplicity, has engendered a series of logical extensions [12][13][14][15].
Jäger [11] includes a limited version of the structural rule of contraction, as shown in Rule 1. Rule 1 does nothing more than allow antecedent formulae to be multiplied. The resulting system LLC features a vertical slash | that creates categories of anaphoric items. A sign has category A|B iff it needs an antecedent of category B and, if it finds one, behaves like an item of category A.
Thus, both the pronoun in (1) and the reflexive in (2) are identity functions λx.x of category np|np. Using the | Elimination rule (Rule 2), the simple reflexive sentence (2) is derived as in Figure 1. We use natural deduction for rules and syntactic derivations hereinafter because, as stated in [11], there is a tight correspondence between the structure of proofs and the syntactic structure of the Curry-Howard terms.
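The semantic effect of treating the pronoun as an identity function can be illustrated with a small sketch. This is only an informal illustration under stated assumptions: the strings standing in for logical constants, the argument order of `claim`, and the helper `anaphor_elim` are all hypothetical and are not taken from Rule 2 itself.

```python
# Strings stand in for logical constants; all names here are illustrative.
JOE = "JOE'"

def win(subject):
    """Meaning of 'will win' applied to its subject."""
    return f"WIN'({subject})"

def claim(proposition):
    """Meaning of 'claims that', curried: proposition first, then subject
    (this argument order is an assumption of the sketch)."""
    return lambda subject: f"CLAIM'({proposition})({subject})"

def he(x):
    """The pronoun of category np|np denotes the identity function."""
    return x

def anaphor_elim(anaphor, antecedent):
    """Semantic side of |E, sketched: the anaphor's meaning is applied to
    the antecedent's meaning; the antecedent itself is not consumed."""
    return anaphor(antecedent)

embedded_subject = anaphor_elim(he, JOE)       # 'he' resolved to Joe
sentence = claim(win(embedded_subject))(JOE)   # JOE' is used a second time
print(sentence)  # CLAIM'(WIN'(JOE'))(JOE')
```

The point of the sketch is that the meaning of the antecedent occurs twice in the result although the pronoun itself contributes nothing beyond identity.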
Despite all these successful treatments of discontinuity, we find that the current categorial machinery struggles to legitimately derive constructions involving zero pronominals, for example, the PROs in (5) and (6). PROs differ from elided constituents in coordination. The latter can be derived by the generalized coordination rule (Rule 3), as shown in Figure 2, whereas PROs occur in embedded clauses rather than in coordinate constructions and are thus not eligible for predicate coordination.
Here is the dilemma for deriving zero pronominals in TLG. On the one hand, zero pronominals have to be explicit to carry their meaning for appropriate semantic derivation; on the other hand, Lambek calculus does not allow zero pronominals to be made explicit, as doing so would induce the structural rule of monotonicity and jeopardize the decidability of the system by allowing the addition of a formula similar to one in the antecedent.
As a result, an ideal system for the present purpose extends LLC further to allow covert anaphoric items to be made overt without hurting the system's decidability. This is our goal in the present study. It differs from earlier extensions of LLC in that it is a direct logical expansion rather than a lexical enrichment that includes the anaphoric slash in the categories of control verbs [12], and it may offer a finer-grained basis for the recent line of work connecting TLG to computational distributional semantics [16][17][18][19][20]. In Section 2, we flesh out our theoretical assumptions for zero pronominals. We then give the axiomatic presentation, the Gentzen-style sequent formulation, and the labeled natural deduction of the new system LLCM in Sections 3-5, respectively, proving that LLCM is sound, complete, and decidable. More linguistic phenomena concerning zero pronominals are also discussed in Section 5.

Anaphora Slot and LLCM
We want to follow Jäger's approach to anaphors [11] and extend LLC so that it has two desirable properties: first, the system should allow a covert item to be made overt; second, the item made explicit should be anaphoric. This means that the system should allow both pronominals and zero pronominals. In this way, the PROs in control constructions can take an actual logical form as well as an interpretation during sentence derivation. Thus, (5) and (6) can ideally be derived as in Figures 3 and 4, where to stay takes the right subject, identical to its controller, instead of seizing the direct object of the upstairs predicate (its left neighbor) as its subject. (To simplify the derivation, we ignore the morphological distinction between finite and non-finite VPs and treat to stay as a single lexical entry.) We call the location where the zero pronominal resides an "anaphora slot." Thus, an anaphora slot introduction rule in embryo should look like Rule 4.

Complexity
From a type-logical perspective, introducing an anaphora slot in syntax means that the logic of grammatical composition allows a somewhat "redundant" (because covert in actual discourse) but not arbitrary category in a valid deduction. This amounts to assuming that the structural rule of monotonicity is part of the grammar in one way or another. Thus, we name our system LLCM, meaning LLC with limited monotonicity.
We put forward a ternary category [A〈B〉C], where [〈 〉] is an anaphora slot operator (sometimes simplified as 〈 〉). It may help to reveal how the category of a zero pronominal is introduced, deleted, or concatenated with the categories to its left or right. From our experience with control constructions and PROs, there is a preferred order of concatenation among the adjacent strings "A, B, C": B concatenates with C first (if there is a C to B's right), and the result then concatenates leftwards with its left neighbor A. Thus, the zero pronominals, when made overt, should obey the limited contraction 2 (Rule 5) in its sequent representation.
In the next two sections, we define LLCM in its axiomatic version and its Gentzen-style sequent formulation, and prove that it is sound, complete, and decidable under this expansion.

Model Theory of LLCM
Now we extend the inventory of LLC category-forming connectives by the ternary operator [〈〉]. So the set of LLCM categories F over a collection of atomic categories A is given below.

Definition 1. LLCM CATEGORIES
If F is a well-formed LLCM category, then F/F, F\F, F|F, F·F, and [F〈F〉F] are also well-formed LLCM categories.
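The recursive shape of Definition 1 can be mirrored by a small data type. The following is a minimal sketch, assuming that atomic categories form the base case; the class names `Atom`, `Binary`, and `Slot`, the helper functions, and the example category are hypothetical and serve only to exercise the constructors.

```python
from dataclasses import dataclass
from typing import Union

BINARY_OPS = ("/", "\\", "|", "·")  # the four binary connectives of Definition 1

@dataclass(frozen=True)
class Atom:
    name: str

@dataclass(frozen=True)
class Binary:
    op: str
    left: "Category"
    right: "Category"

    def __post_init__(self):
        if self.op not in BINARY_OPS:
            raise ValueError(f"unknown connective: {self.op!r}")

@dataclass(frozen=True)
class Slot:
    """The ternary category [left〈middle〉right]."""
    left: "Category"
    middle: "Category"
    right: "Category"

Category = Union[Atom, Binary, Slot]

def show(c):
    """Render a category as a string."""
    if isinstance(c, Atom):
        return c.name
    if isinstance(c, Binary):
        return f"({show(c.left)}{c.op}{show(c.right)})"
    return f"[{show(c.left)}〈{show(c.middle)}〉{show(c.right)}]"

def connectives(c):
    """Count connective occurrences, a simple complexity measure."""
    if isinstance(c, Atom):
        return 0
    if isinstance(c, Binary):
        return 1 + connectives(c.left) + connectives(c.right)
    return 1 + connectives(c.left) + connectives(c.middle) + connectives(c.right)

# An arbitrary example category, chosen only for illustration.
np, s = Atom("np"), Atom("s")
pronoun = Binary("|", np, np)
example = Slot(np, pronoun, Binary("\\", np, s))
print(show(example))         # [np〈(np|np)〉(np\s)]
print(connectives(example))  # 3
```

A connective count of this kind is the sort of measure over which inductions on category complexity proceed.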
All well-formed LLCM categories are recursively defined as in Definition 1. Next, a sound and complete model-theoretic interpretation for LLCM is presented. (See [11]; a preliminary version of this system is given in our earlier work [21].) A model M for LLCM is a tuple <W, R, S, T, ∼, f, g>, where W is a non-empty set of linguistic signs; T ⊆ W^4 is a quaternary relation on W; R, S ⊆ W^3 are ternary relations on W; ∼ ⊆ W^2 is a binary relation on W; f is a function from atomic categories to subsets of W; and g is a function from LLCM categories to W.
The verification relation between points in W and LLCM categories is defined as follows:
The ternary relation R can be taken as ordinary syntactic concatenation between linguistic signs. Rxyz means that if y and z occur adjacently in that order, the combination of the two yields x. Relation S is a ternary relation from [11]. It is similar to R, but it is responsible for anaphora resolution. Sxyz means that x is changed into y if there is an element similar to z (written y ∼ z in the meaning postulates) available as an antecedent for anaphora resolution. Relation T models the anaphora slot operation, and Txyzu means that x is the result of inserting z between y and u.
The following meaning postulates hold:
MP1-5 are meaning postulates about relation S [11]. MP6-7 are about relation T and exhibit an important feature of LLCM. They amount to saying that if x contains an anaphora slot and x is composed of y and z, then x can also be composed of y and v, where v is the result of conjoining z and the covert u. In other words, a complex sign with an anaphora slot can be represented similarly either with or without the anaphor participating in its syntactic composition. MP8-12 show how the categories in the ternary operator [〈 〉] compose with neighboring categories. MP13 says that the extension over a set of linguistic signs is closed under R. The last postulate is a structural postulate complementary to MP6 and MP7.

Definition 3. AXIOMATIC VERSION OF LLCM
The axiomatic version of LLCM is the system that is obtained when the following 12 axioms and 4 rules are added to the axiomatic version of L: D3 D4 .
Then we can prove the soundness and completeness of the axiomatic version in a way that closely follows the proof for L in [22] and for LLC in [11]. We start with Axiom 6 and D2. The remaining cases are already proved in [11]. The proofs for the deduction rules are similar and are left to the reader as an exercise.

Proof. We prove this by induction on the complexity of B. If B is atomic, the claim follows directly from the construction of f. If B is constructed by one of the three Lambek connectives or LLC's anaphoric connective, the proof of the induction step follows that in [11,21]. We will only show the induction steps for the ternary operator. This means that there are A_1 ∈ ║C║_CM, A_2 ∈ ║E║_CM, and […]. Then we have to prove that all postulates for LLCM in Definition 2 are fulfilled by the model. They follow directly from the model construction, the monotonicity of the product, and the truth lemma above, and thus will not be provided here. ■
Finally, we show that all LLCM-valid formulae are derivable in LLCM. Let ║A║_CM ⊆ ║B║_CM, and suppose that A ⟶ B is not derivable in LLCM. Then A ∉ ║B║_CM by the truth lemma. By the identity axiom, ├ A ⟶ A, so A ∈ ║A║_CM. Hence it is not the case that ║A║_CM ⊆ ║B║_CM, which contradicts our assumption. Therefore, A ⟶ B is derivable in LLCM.
Hence, the axiomatic system of LLCM is sound and complete.

Sequent Presentation of LLCM
In order to establish the decidability of LLCM, we need its Gentzen-style sequent presentation. The sequent presentation of LLCM extends that of LLC with right and left rules (R and L) for the anaphora slot operator [〈 〉], and with monotonicity. For simplicity, we omit the λ-term labels in the sequent presentation in the present section.
To prove that the sequent presentation is equivalent to the axiomatic version, Lemma 1 is needed. In addition, a function σ that maps all commas in a sequent to products • is used to ensure the correspondence between sequents and categories. Thus, [σ(…, A_1, Y_1, …, A_n, Y_n)〈B〉σ(Δ, C_1, Z_1, …, C_n, Z_n)] is derivable in LLCM's axiomatic version.
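The role of σ can be made concrete with a short sketch. It assumes a flat, non-empty antecedent given as a list of category strings and folds it into a left-nested product; the nesting direction, the tuple encoding, and the helper names `product`, `sigma`, and `render` are assumptions of this illustration, not part of the formal definition.

```python
from functools import reduce

def product(left, right):
    """Build the product formula left • right as a tagged tuple."""
    return ("•", left, right)

def sigma(antecedent):
    """Map the commas of a flat antecedent to products (left-nested here)."""
    items = list(antecedent)
    if not items:
        raise ValueError("sigma is only defined here for non-empty antecedents")
    return reduce(product, items)

def render(formula):
    """Pretty-print a formula built by sigma."""
    if isinstance(formula, tuple):
        _, left, right = formula
        return f"({render(left)} • {render(right)})"
    return formula

print(render(sigma(["np", "(np\\s)/np", "np"])))
# ((np • (np\s)/np) • np)
```

With an associative product, the choice between left and right nesting should not affect derivability, which is why the sketch fixes one direction arbitrarily.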

Theorem 3. EQUIVALENCE OF AXIOMATIC AND GENTZEN PRESENTATIONS
LLCM ├ X ⇒ A iff σ(X) ⟶ A is derivable in the axiomatic version.
The proof is omitted here. We prove the equivalence of LLCM's axiomatic and sequent versions for the same reason that Lambek proves the equivalence of L's sequent version and its axiomatic counterpart: decidability is established by Cut Elimination in the sequent presentation, and this result then carries over to the axiomatic version.

Theorem 4. CUT ELIMINATION
If LLCM├ X ⇒ A, then there is a Cut-free sequent proof of LLCM├ X ⇒ A.
To prove this theorem, we have to distinguish three cases: (1) at least one premise of the Cut is an identity axiom; (2) both premises are results of logical rules, and the Cut formula is the active formula in both premises; (3) both premises result from introducing logical rules, and the Cut formula is not the active formula in one premise. The proof is left as an exercise to the reader.
Proof. In the Cut-free sequent calculus, the conclusion of each rule contains more symbols than any of its premises, because each formula in a premise occurs as a subformula in the conclusion and each logical rule introduces one connective. In addition, there are only finitely many ways to match a given sequent with the conclusion of some sequent rule. As a result, a bottom-up proof search has only finitely many choices at each step, and every branch of the proof tree is finite. Derivability in LLCM is thus decidable.
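The termination argument above can be illustrated by a bottom-up proof search. The sketch below is deliberately restricted to the product-free Lambek calculus L with the two slashes only; the rules for |, for the anaphora slot operator, and for monotonicity are not included, so this is not a decision procedure for LLCM. The tuple encoding and the function names are assumptions of the illustration.

```python
from functools import lru_cache

def over(result, arg):
    """result/arg: seeks arg on its right."""
    return ("/", result, arg)

def under(arg, result):
    """arg\\result: seeks arg on its left."""
    return ("\\", arg, result)

@lru_cache(maxsize=None)
def derivable(ants, goal):
    """Cut-free backward search for the sequent ants => goal.

    ants is a non-empty tuple of categories. Every recursive call removes
    one connective, so the search space is finite and the search terminates.
    """
    if ants == (goal,):                                 # identity axiom
        return True
    if isinstance(goal, tuple):                         # right rules
        op, x, y = goal
        if op == "/" and derivable(ants + (y,), x):     # goal is x/y
            return True
        if op == "\\" and derivable((x,) + ants, y):    # goal is x\y
            return True
    for i, f in enumerate(ants):                        # left rules
        if not isinstance(f, tuple):
            continue
        op, x, y = f
        if op == "/":       # f is x/y: some ants[i+1:j] must derive y
            for j in range(i + 2, len(ants) + 1):
                if (derivable(ants[i + 1:j], y)
                        and derivable(ants[:i] + (x,) + ants[j:], goal)):
                    return True
        else:               # f is x\y: some ants[k:i] must derive x
            for k in range(i):
                if (derivable(ants[k:i], x)
                        and derivable(ants[:k] + (y,) + ants[i + 1:], goal)):
                    return True
    return False

tv = over(under("np", "s"), "np")             # transitive verb (np\s)/np
print(derivable(("np", tv, "np"), "s"))       # True
print(derivable((tv, "np", "np"), "s"))       # False
```

Each recursive call works on sequents with strictly fewer connectives, which is the same measure the proof uses to bound the branches of the proof tree.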

LLCM's Natural Deduction in Tree-format
Before testing on more linguistic phenomena, we offer LLCM's labeled natural deduction in tree format here. As shown in Sections 1 and 2, labeled deduction in tree format helps to visualize the type-logical deduction of a sentence. Suppose Δ is an n-ary operator; ΔE and ΔI stand for its elimination rule and introduction rule, respectively. Here, we offer 〈〉I only.
〈〉I enables covert or elided pronominals to be inserted before an [A〈B〉C] structure is constructed. 〈〉E, however, is not given because 〈〉 is a temporary notational device to show where the zero pronominal is. The notation disappears when the zero pronoun finds its antecedent and multiplies its interpretation with |E. Now we can show the charm of LLCM with more linguistic constructions that allow zero pronominals.

Deriving Pros.
The anaphora and control constructions listed in (1)-(6) exhibit only the tip of the iceberg of the zero pronominals used in natural languages. Generative grammar distinguishes two types of zero pronominals. In addition to the PROs, which are limited to the subject position of a non-finite clause, as in the control constructions (5) and (6), there are zero pronouns that occur elsewhere and are less restricted than PROs. Linguistic economy allows pronouns to be dropped in many languages. Such dropped pronouns are called little pro. For instance, a subject pronoun in Spanish may be dropped from a tensed clause as in (9), and in Chinese, both subject and object pronouns may be dropped in similar circumstances, as in (10) and (11). A type-logical system that is to derive correct readings for sentences in these languages is therefore expected to be capable of inserting the elided pronouns in the anaphora slots and multiplying the semantic resource legitimately.
José know that he/∅ has been see by María. 'José knows that [
Thus, when the Lambek system is equipped with the anaphoric slash and an anaphora slot operator, it becomes as powerful as we would expect it to be. For example, the derivation of (9) is illustrated in Figure 5.
In Chinese, both subject and object pronominals in a tensed clause can be dropped. However, the dropped object cannot take the matrix subject as its antecedent; it can only refer to some other person known in the discourse. For example, in the Chinese discourse (12) below, pro in speaker B's answer can only refer to the object in A's question. Ideally, if our system allows the inserted pronominal to search for its antecedent across the sentence border, it can derive (10) in the same way as (12) in Figure 6. The derivation of sentences like (11), which drop both subject and object, proceeds likewise.

Discussion
LLCM's labeled natural deduction in tree format shows that, in LLCM, sentences with different kinds of zero pronominals are all derivable, be it a PRO with restricted occurrence or a dropped pro with fewer syntactic restrictions. This is a very promising picture. Nevertheless, attractive as it seems, the situation of pros is more complicated than we assume because they are subject to different restrictions in different pro-drop languages. How to tailor the system to the requirements of different languages remains an open question. It might be a good idea to set up a universal model and assign different parameters to different languages, as in multimodal CCG [24]. We leave this for future work. [25].
From the viewpoint of substructural logics, system LLCM with the ternary complex category [A〈B〉C] is a specific substructural logic. It not only shares structural rules such as the associative law and the commutative law, but also contains monotonicity and a variant of limited contraction, whose axiomatic counterpart is A • C ⟶ A • (A • C). This rule is capable of inserting the left-hand side category A and is exactly what is needed for the characterization of zero pronominals. However, linguistic facts also demonstrate that category B in the anaphora slot may relate to categories outside the slot. In other words, the elided or covert category may bear an anaphoric relation to a category outside the slot. Thus, we stipulate that the three categories in the anaphora slot are different from each other and propose the variant of contraction. Further research is needed on the theoretical significance of this variant from the perspective of substructural logics.

Data Availability
The data used to support the analysis of the study are available from the corresponding author upon reasonable request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.