- Mildly context-sensitive language
In formal grammar theory, mildly context-sensitive languages are a class of formal languages which can be efficiently parsed, but still possess enough context sensitivity to allow the parsing of natural languages. The concept was first introduced by Aravind Joshi in 1985.
Mild context-sensitivity is defined in terms of sets of languages. A set of languages is mildly context-sensitive if and only if
- it contains all context-free languages.
- it admits limited cross-serial dependencies.
- all the languages are parsable in polynomial time.
- all the languages have constant growth; this means that the distribution of string lengths should be linear rather than supralinear. This is often guaranteed by proving a pumping lemma for some class of mildly context-sensitive languages.
Some attempts at creating mildly context-sensitive language formalisms include:
- Linear context-free rewriting systems developed by D. J. Weir
- Minimalist grammars of Edward P. Stabler, Alain Lecomte, Christian Retoré, etc.
- Head grammars of Carl Pollard
- Combinatory categorial grammars developed by Mark Steedman
- Linear indexed grammars defined by Gerald Gazdar
- Tree-adjoining grammars developed by Aravind Joshi
The first two of these grammar classes define the same set of languages, while the last four define another single, strictly smaller class. The larger of the two classes may be parsed by thread automatons, while the other smaller one may be parsed by embedded pushdown automatons.
While all languages in both classes are mildly context-sensitive and both classes support some cross-serial dependency, neither class exhausts the full set of mildly context-sensitive languages.
Control Language Hierarchy
A more precisely defined hierarchy of languages that correspond to the mildly context-sensitive class was defined by David J. Weir. Based on the work of Nabil A. Khabbaz, Weir's Control Language Hierarchy is a containment hierarchy of countable set of language classes where the Level-1 is defined as context-free, and Level-2 is the class of tree-adjoining and the other three grammars.
Following are some of the properties of Level-k languages in the hierarchy:
- Level-k languages are properly contained by Level-(k + 1) language class
- Level-k languages can be parsed in time
- Level-k contains the language , but not
- Level-k contains the language , but not
Those properties correspond well (at least for small k > 1) to the conditions of mildly context-sensitive languages imposed by Joshi, and as k gets bigger, the language class become, in a sense, less mildly context-sensitive.
- Joshi, Aravind; Vijay-Shanker, K.; Weir, David (1991), "The Convergence of Mildly Context-Sensitive Grammar Formalisms", in Sells, Peter; Shieber, Stuart; Wasow, Thomas, Foundational Issues in Natural Language Processing, Cambridge, MA: MIT Press, pp. 31–81, ISBN 0-262-19303-5, http://repository.upenn.edu/cgi/viewcontent.cgi?article=1571&context=cis_reports .
- Vijay-Shanker, K.; Weir, David (1994), "The Equivalence of Four Extensions of Context-Free Grammars", Mathematical Systems Theory 27 (6): 511–546, doi:10.1007/BF01191624, ISSN 0025-5661, http://citeseer.ist.psu.edu/vijay-shanker94equivalence.html .
- Villemonte de La Clergerie, Eric (2002), "Parsing Mildly Context-Sensitive Languages with Thread Automata", Proceedings of the 19th international conference on Computational linguistics, 1, pp. 1–7, http://acl.ldc.upenn.edu/C/C02/C02-1028.pdf .
- Weir, D. J. (1992), "A geometric hierarchy beyond context-free languages", Theoretical computer science 104 (2): 235–261, doi:10.1016/0304-3975(92)90124-X. .
- Parsing Beyond Context-Free Grammar, by Laura Kallmeyer
- Seminar on tree-adjoining grammars and mildly context-sensitive languages and formalisms, by Laura Kallmeyer
- Mildly Context-Sensitive Grammars, by Aravind Joshi
Automata theory: formal languages and formal grammars Chomsky hierarchyType-0—Type-1———Type-2——Type-3— Grammars(no common name)Linear context-free rewriting systems etc.Tree-adjoining etc.— LanguagesMildly context-sensitive Minimal automatonThread automata Each category of languages is a proper subset of the category directly above it. - Any automaton and any grammar in each category has an equivalent automaton or grammar in the category directly above it.
Wikimedia Foundation. 2010.
Look at other dictionaries:
Context-sensitive language — In theoretical computer science, a context sensitive language is a formal language that can be defined by a context sensitive grammar. That is one of the four types of grammars in the Chomsky hierarchy. Of the four, this is the least often used,… … Wikipedia
Context-sensitive grammar — A context sensitive grammar (CSG) is a formal grammar in which the left hand sides and right hand sides of any production rules may be surrounded by a context of terminal and nonterminal symbols. Context sensitive grammars are more general than… … Wikipedia
Context-free language — In formal language theory, a context free language is a language generated by some context free grammar. The set of all context free languages is identical to the set of languages accepted by pushdown automata. Contents 1 Examples 2 Closure… … Wikipedia
Deterministic context-free language — A deterministic context free language is a formal language which is defined by a deterministic context free grammar. The set of deterministic context free languages is called DCFL and is identical to the set of languages accepted by a… … Wikipedia
Context-free grammar — In formal language theory, a context free grammar (CFG) is a formal grammar in which every production rule is of the form V → w where V is a single nonterminal symbol, and w is a string of terminals and/or nonterminals (w can be empty). The… … Wikipedia
Indexed language — Indexed languages are a class of formal languages discovered by Alfred Aho [ cite journal | last = Aho | first = Alfred | year = 1968 | title = Indexed grammars an extension of context free grammars | journal = Journal of the ACM | volume = 15 |… … Wikipedia
Recursive language — This article is about a class of formal languages as they are studied in mathematics and theoretical computer science. For computer languages that allow a function to call itself recursively, see Recursion (computer science). In mathematics,… … Wikipedia
Recursively enumerable language — In mathematics, logic and computer science, a recursively enumerable language is a type of formal language which is also called partially decidable or Turing acceptable. It is known as a type 0 language in the Chomsky hierarchy of formal… … Wikipedia
Deterministic context-free grammar — In formal grammar theory, the deterministic context free grammars (DCFGs) are a proper subset of the context free grammars. The deterministic context free grammars are those a deterministic pushdown automaton can recognize. A DCFG is the finite… … Wikipedia
Bach language — In theoretical computer science, the Bach language is the formal language over an alphabet of three distinct symbols containing all strings in which the three symbols occur equally often. The Bach language is a context sensitive language. Pullum… … Wikipedia