Combinatory categorial grammar

Combinatory categorial grammar (CCG) is an efficiently parseable, yet linguistically expressive grammar formalism. It has a transparent interface between surface syntax and underlying semantic representation, including predicate-argument structure, quantification and information structure.

CCG relies on combinatory logic, which has the same expressive power as the lambda calculus, but builds its expressions differently. The first linguistic and psycholinguistic arguments for basing the grammar on combinators were put forth by Steedman and Szabolcsi. More recent prominent proponents of the approach are Jacobson and Baldridge.

For example, the combinator B (the compositor) is useful in creating long-distance dependencies, as in "Who do you think Mary is talking about?", while the combinator W (the duplicator) is useful as the lexical interpretation of reflexive pronouns, as in "Mary talks about herself". Together with I (the identity mapping) and C (the permutator), these form a set of primitive, non-interdefinable combinators. Jacobson interprets personal pronouns as the combinator I; their binding is aided by a complex combinator Z, as in "Mary lost her way". Z is definable using W and B.
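The behaviour of these combinators can be sketched with curried Python functions. This is only an illustration: the toy predicates `talks_about`, `lost` and `way_of` are hypothetical stand-ins for lexical meanings, and the definition Z = B(BW)B is one way of building Z from W and B, not necessarily the formulation used in the literature cited here.

```python
# Curried combinators written as plain Python lambdas.
I = lambda x: x                              # identity:   I x     = x
B = lambda f: lambda g: lambda x: f(g(x))    # compositor: B f g x = f (g x)
C = lambda f: lambda x: lambda y: f(y)(x)    # permutator: C f x y = f y x
W = lambda f: lambda x: f(x)(x)              # duplicator: W f x   = f x x

# Z f g x = f (g x) x, assembled here from W and B only.
Z = B(B(W))(B)

# Hypothetical toy meanings: a verb takes its object first, then its subject.
talks_about = lambda obj: lambda subj: ("talks_about", subj, obj)
lost = lambda obj: lambda subj: ("lost", subj, obj)
way_of = lambda owner: ("way_of", owner)

# "Mary talks about herself": W feeds the subject in as the object too.
print(W(talks_about)("mary"))    # ('talks_about', 'mary', 'mary')

# "Mary lost her way": Z binds the pronoun inside the object to the subject.
print(Z(lost)(way_of)("mary"))   # ('lost', 'mary', ('way_of', 'mary'))
```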


Parts of the Formalism

The CCG formalism defines a number of combinators (application, composition, and type-raising being the most common). These operate on syntactically typed lexical items by means of natural-deduction-style proofs. The goal of a proof is to find some way of applying the combinators to a sequence of lexical items until every lexical item has been used. The type that results once the proof is complete is the type of the whole expression. Thus, proving that some sequence of words is a sentence of some language amounts to proving that the words reduce to the type S.

Syntactic Types

The syntactic type of a lexical item can be either a primitive type, such as S, N, or NP, or complex, such as S\NP, or NP/N.

The complex types, schematizable as X/Y and X\Y, denote functor types that take an argument of type Y and return an object of type X. A forward slash denotes that the argument should appear to the right, while a backslash denotes that the argument should appear on the left. Any type can stand in for the X and Y here, making syntactic types in CCG a recursive type system.
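Because complex types nest, they can be modelled as a small recursive data structure. The sketch below assumes an encoding chosen purely for illustration: a primitive type is a string, and a complex type is a triple `(result, slash, argument)`. The helper name `show` is likewise hypothetical.

```python
from typing import Tuple, Union

# A category is a primitive name or a triple (result, slash, argument).
Cat = Union[str, Tuple["Cat", str, "Cat"]]

def show(cat: Cat) -> str:
    """Render a category, parenthesising nested complex categories."""
    if isinstance(cat, str):
        return cat
    result, slash, arg = cat

    def wrap(c: Cat) -> str:
        return c if isinstance(c, str) else "(" + show(c) + ")"

    return wrap(result) + slash + wrap(arg)

determiner = ("NP", "/", "N")              # NP/N: wants an N to its right
intransitive = ("S", "\\", "NP")           # S\NP: wants an NP to its left
transitive = (intransitive, "/", "NP")     # (S\NP)/NP: object first, then subject

print(show(transitive))                    # (S\NP)/NP
```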

Application Combinators

The application combinators, often denoted by > for forward application and < for backward application, apply a lexical item with a functor type to an argument with an appropriate type. The definition of application is given as:

\dfrac{\alpha : X/Y \qquad \beta : Y}{\alpha \beta : X}>

\dfrac{\beta : Y \qquad \alpha : X\backslash Y}{\beta \alpha : X}<
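The two application rules can be sketched as functions over categories. The encoding is an assumption made for this example (a primitive type is a string, a complex type is a triple `(result, slash, argument)`), and the function names are hypothetical; each function returns `None` when the rule does not apply.

```python
def forward_apply(left, right):
    # Rule >:  X/Y  Y  =>  X
    if isinstance(left, tuple) and left[1] == "/" and left[2] == right:
        return left[0]
    return None

def backward_apply(left, right):
    # Rule <:  Y  X\Y  =>  X
    if isinstance(right, tuple) and right[1] == "\\" and right[2] == left:
        return right[0]
    return None

determiner = ("NP", "/", "N")
verb_phrase = ("S", "\\", "NP")

print(forward_apply(determiner, "N"))      # NP
print(backward_apply("NP", verb_phrase))   # S
print(forward_apply("NP", "N"))            # None
```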

Composition Combinators

The composition combinators, often denoted by B_> for forward composition and B_< for backward composition, are similar to function composition from mathematics, and can be defined as follows:

\dfrac{\alpha : X/Y \qquad \beta : Y/Z}{\alpha \beta : X/Z}B_>

\dfrac{\beta : Y\backslash Z \qquad \alpha : X\backslash Y}{\beta \alpha : X\backslash Z}B_<
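A minimal sketch of the two composition rules, again assuming that a primitive type is a string and a complex type is a triple `(result, slash, argument)`. The function names are hypothetical, and the restrictions that practical CCGs place on composition are not modelled.

```python
def forward_compose(left, right):
    # Rule B_>:  X/Y  Y/Z  =>  X/Z
    if (isinstance(left, tuple) and isinstance(right, tuple)
            and left[1] == "/" and right[1] == "/"
            and left[2] == right[0]):
        return (left[0], "/", right[2])
    return None

def backward_compose(left, right):
    # Rule B_<:  Y\Z  X\Y  =>  X\Z
    if (isinstance(left, tuple) and isinstance(right, tuple)
            and left[1] == "\\" and right[1] == "\\"
            and right[2] == left[0]):
        return (right[0], "\\", left[2])
    return None

raised_subject = ("S", "/", ("S", "\\", "NP"))       # S/(S\NP)
transitive_verb = (("S", "\\", "NP"), "/", "NP")     # (S\NP)/NP

print(forward_compose(raised_subject, transitive_verb))   # ('S', '/', 'NP')
```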

Type-raising Combinators

The type-raising combinators, often denoted by T_> for forward type-raising and T_< for backward type-raising, turn an argument type (usually a primitive type) into a functor type. The raised type takes as its argument the very functors that, before type-raising, would have taken the original type as their argument.

\dfrac{\alpha : X}{\alpha : T/(T\backslash X)}T_>

\dfrac{\alpha : X}{\alpha : T\backslash (T/X)}T_<
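Type-raising can be sketched in the same style. The encoding (strings for primitive types, triples `(result, slash, argument)` for complex ones) and the function names are assumptions of this example. The target type T is passed in explicitly because the rule schema itself leaves T free.

```python
def forward_raise(cat, target):
    # Rule T_>:  X  =>  T/(T\X)
    return (target, "/", (target, "\\", cat))

def backward_raise(cat, target):
    # Rule T_<:  X  =>  T\(T/X)
    return (target, "\\", (target, "/", cat))

# A subject NP raised over S now looks for a verb phrase S\NP to its right.
print(forward_raise("NP", "S"))    # ('S', '/', ('S', '\\', 'NP'))
```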


The sentence "the dog bit John" has a number of different possible proofs. Below are a few of them. The variety of proofs demonstrates that in CCG a sentence need not have a single structure, unlike in many other models of grammar.

Let the types of these lexical items be

the : NP/N \qquad dog : N \qquad John : NP \qquad bit : (S\backslash NP)/NP

We can perform the simplest proof (changing notation slightly for brevity) as:

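\dfrac{
    \dfrac{
        \dfrac{the}{NP/N} \qquad \dfrac{dog}{N}
    }{NP}>
    \qquad
    \dfrac{
        \dfrac{bit}{(S\backslash NP)/NP} \qquad \dfrac{John}{NP}
    }{S\backslash NP}>
}{S}<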

By type-raising the subject and composing it with the verb, we can instead obtain a fully incremental, left-to-right proof:

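\dfrac{
    \dfrac{
        \dfrac{
            \dfrac{
                \dfrac{the}{NP/N} \qquad \dfrac{dog}{N}
            }{NP}>
        }{S/(S\backslash NP)}T_>
        \qquad
        \dfrac{bit}{(S\backslash NP)/NP}
    }{S/NP}B_>
    \qquad
    \dfrac{John}{NP}
}{S}>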
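Both proofs of "the dog bit John" can be replayed mechanically. The sketch below is self-contained and rests on assumptions made only for illustration: a primitive type is a string, a complex type is a triple `(result, slash, argument)`, and the helper names `fapp`, `bapp`, `fcomp` and `fraise` are hypothetical. It checks that the two derivations differ in their intermediate types yet both end in S.

```python
def fapp(l, r):    # >    : X/Y  Y    =>  X
    return l[0] if isinstance(l, tuple) and l[1] == "/" and l[2] == r else None

def bapp(l, r):    # <    : Y    X\Y  =>  X
    return r[0] if isinstance(r, tuple) and r[1] == "\\" and r[2] == l else None

def fcomp(l, r):   # B_>  : X/Y  Y/Z  =>  X/Z
    ok = (isinstance(l, tuple) and isinstance(r, tuple)
          and l[1] == "/" and r[1] == "/" and l[2] == r[0])
    return (l[0], "/", r[2]) if ok else None

def fraise(x, t):  # T_>  : X  =>  T/(T\X)
    return (t, "/", (t, "\\", x))

# Lexicon from the example.
the, dog, john = ("NP", "/", "N"), "N", "NP"
bit = (("S", "\\", "NP"), "/", "NP")

# Proof 1: [the dog] [bit John]
subject = fapp(the, dog)               # NP
predicate = fapp(bit, john)            # S\NP
simple = bapp(subject, predicate)      # S

# Proof 2: ((the dog) bit) John, consuming the words left to right
raised = fraise(fapp(the, dog), "S")   # S/(S\NP)
prefix = fcomp(raised, bit)            # S/NP
incremental = fapp(prefix, john)       # S

print(simple, incremental)             # S S
```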

Formal properties

CCGs are known to be able to generate the language \{a^n b^n c^n d^n : n \geq 0\}, which is not context-free. Worked examples are too involved to reproduce here, but can be found in Vijay-Shanker and Weir (1994)[1].
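To make the language concrete, the following sketch tests whether a string belongs to it. It is not a CCG and says nothing about how a CCG derives these strings; it only spells out the membership condition, and the function name `in_language` is hypothetical.

```python
import re

def in_language(s: str) -> bool:
    """Return True if s has the form a^n b^n c^n d^n for some n >= 0."""
    match = re.fullmatch(r"(a*)(b*)(c*)(d*)", s)
    if match is None:
        return False
    # All four blocks must have the same length n.
    return len({len(block) for block in match.groups()}) == 1

print(in_language("aabbccdd"))   # True
print(in_language("aabbcd"))     # False
```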


Vijay-Shanker and Weir (1994)[1] demonstrate that Linear Indexed Grammars, Combinatory Categorial Grammars, Tree-adjoining Grammars, and Head Grammars are weakly equivalent formalisms, in that they all define the same string languages.

References


  1. Vijay-Shanker, K. and Weir, David J. (1994), "The Equivalence of Four Extensions of Context-Free Grammars." Mathematical Systems Theory 27(6), 511–546.
  • Baldridge, Jason (2002), "Lexically Specified Derivational Control in Combinatory Categorial Grammar." PhD dissertation, University of Edinburgh.
  • Curry, Haskell B. and Richard Feys (1958), Combinatory Logic, Vol. 1. North-Holland.
  • Jacobson, Pauline (1999), "Towards a variable-free semantics." Linguistics and Philosophy 22, 117–184.
  • Steedman, Mark (1987), "Combinatory grammars and parasitic gaps." Natural Language and Linguistic Theory 5, 403–439.
  • Steedman, Mark (1996), Surface Structure and Interpretation. The MIT Press.
  • Steedman, Mark (2000), The Syntactic Process. The MIT Press.
  • Szabolcsi, Anna (1989), "Bound variables in syntax (are there any?)." Semantics and Contextual Expression, ed. by Bartsch, van Benthem, and van Emde Boas. Foris, 294–318.
  • Szabolcsi, Anna (1992), "Combinatory grammar and projection from the lexicon." Lexical Matters. CSLI Lecture Notes 24, ed. by Sag and Szabolcsi. Stanford, CSLI Publications, 241–269.
  • Szabolcsi, Anna (2003), "Binding on the fly: Cross-sentential anaphora in variable-free semantics." Resource Sensitivity in Binding and Anaphora, ed. by Kruijff and Oehrle. Kluwer, 215–229.

Wikimedia Foundation. 2010.
