Mildly context-sensitive language

﻿
Mildly context-sensitive language

In formal grammar theory, mildly context-sensitive languages are a class of formal languages which can be efficiently parsed, but still possess enough context sensitivity to allow the parsing of natural languages. The concept was first introduced by Aravind Joshi in 1985.

Definition

Mild context-sensitivity is defined in terms of sets of languages. A set of languages is mildly context-sensitive if and only if

1. it contains all context-free languages.
2. it admits limited cross-serial dependencies.
3. all the languages are parsable in polynomial time.
4. all the languages have constant growth; this means that the distribution of string lengths should be linear rather than supralinear. This is often guaranteed by proving a pumping lemma for some class of mildly context-sensitive languages.

Formalisms

Some attempts at creating mildly context-sensitive language formalisms include:

The first two of these grammar classes define the same set of languages, while the last four define another single, strictly smaller class.[citation needed] The larger of the two classes may be parsed by thread automatons, while the other smaller one may be parsed by embedded pushdown automatons.

While all languages in both classes are mildly context-sensitive and both classes support some cross-serial dependency, neither class exhausts the full set of mildly context-sensitive languages.

Control Language Hierarchy

A more precisely defined hierarchy of languages that correspond to the mildly context-sensitive class was defined by David J. Weir. Based on the work of Nabil A. Khabbaz, Weir's Control Language Hierarchy is a containment hierarchy of countable set of language classes where the Level-1 is defined as context-free, and Level-2 is the class of tree-adjoining and the other three grammars.

Following are some of the properties of Level-k languages in the hierarchy:

• Level-k languages are properly contained by Level-(k + 1) language class
• Level-k languages can be parsed in $O(n^{3\cdot2^{k-1}})$ time
• Level-k contains the language $\{a_1^n \dots a_{2^k}^n|n\geq0\}$, but not $\{a_1^n \dots a_{2^{k+1}}^n|n\geq0\}$
• Level-k contains the language $\{w^{2^{k-1}}|w\in\{a,b\}^*\}$, but not $\{w^{2^{k-1}+1}|w\in\{a,b\}^*\}$

Those properties correspond well (at least for small k > 1) to the conditions of mildly context-sensitive languages imposed by Joshi, and as k gets bigger, the language class become, in a sense, less mildly context-sensitive.

• Weir, D. J. (1992), "A geometric hierarchy beyond context-free languages", Theoretical computer science 104 (2): 235–261, doi:10.1016/0304-3975(92)90124-X. .

Wikimedia Foundation. 2010.

Look at other dictionaries:

• Context-sensitive language — In theoretical computer science, a context sensitive language is a formal language that can be defined by a context sensitive grammar. That is one of the four types of grammars in the Chomsky hierarchy. Of the four, this is the least often used,… …   Wikipedia

• Context-sensitive grammar — A context sensitive grammar (CSG) is a formal grammar in which the left hand sides and right hand sides of any production rules may be surrounded by a context of terminal and nonterminal symbols. Context sensitive grammars are more general than… …   Wikipedia

• Context-free language — In formal language theory, a context free language is a language generated by some context free grammar. The set of all context free languages is identical to the set of languages accepted by pushdown automata. Contents 1 Examples 2 Closure… …   Wikipedia

• Deterministic context-free language — A deterministic context free language is a formal language which is defined by a deterministic context free grammar.[1] The set of deterministic context free languages is called DCFL[2] and is identical to the set of languages accepted by a… …   Wikipedia

• Context-free grammar — In formal language theory, a context free grammar (CFG) is a formal grammar in which every production rule is of the form V → w where V is a single nonterminal symbol, and w is a string of terminals and/or nonterminals (w can be empty). The… …   Wikipedia

• Indexed language — Indexed languages are a class of formal languages discovered by Alfred Aho [ cite journal | last = Aho | first = Alfred | year = 1968 | title = Indexed grammars an extension of context free grammars | journal = Journal of the ACM | volume = 15 |… …   Wikipedia

• Recursive language — This article is about a class of formal languages as they are studied in mathematics and theoretical computer science. For computer languages that allow a function to call itself recursively, see Recursion (computer science). In mathematics,… …   Wikipedia

• Recursively enumerable language — In mathematics, logic and computer science, a recursively enumerable language is a type of formal language which is also called partially decidable or Turing acceptable. It is known as a type 0 language in the Chomsky hierarchy of formal… …   Wikipedia

• Deterministic context-free grammar — In formal grammar theory, the deterministic context free grammars (DCFGs) are a proper subset of the context free grammars. The deterministic context free grammars are those a deterministic pushdown automaton can recognize. A DCFG is the finite… …   Wikipedia

• Bach language — In theoretical computer science, the Bach language is the formal language over an alphabet of three distinct symbols containing all strings in which the three symbols occur equally often. The Bach language is a context sensitive language. Pullum… …   Wikipedia