link-parser (1) Linux Manual Page
link-parser – parses natural language sentences
Synopsis
link-parser [language] [-pp pp_knowledge_file] [-c constituent_knowledge_file] [-a affix_file] [-ppoff] [-coff] [-aoff] [-batch] [-<special "!" command>]
Description
In Sleator, D. and Temperley, D. "Parsing English with a Link Grammar" (1991), the authors defined a new formal grammatical system called a "link grammar". A sequence of words is in the language of a link grammar if there is a way to draw "links" between words such that the local requirements of each word are satisfied, the links do not cross, and the words form a consistent connected graph. The authors encoded English grammar into such a system, and wrote link-parser to parse English using this grammar. This package can be used for linguistic parsing for information retrieval or extraction from natural language documents. Abiword also uses it as a grammar checker.
Options
- -pp pp_knowledge_file
- -c constituent_knowledge_file
- -a affix_file
- -ppoff
- -coff
- -aoff
- -batch
- -<special ! command>
Use
link-parser depends on a link-grammar dictionary which contains lists of words and associated metadata about their grammatical properties in order to analyze sentences. A link-grammar dictionary provided by the authors of link-grammar is usually included with the link-grammar package, and can often be found somewhere in the /usr/share/link-grammar/ hierarchy. When this is the case, only the two-letter language code needs to be specified on the command-line. Alternatively, a user can provide their own dictionary as an argument, in which case the dictionary’s directory should be specified. Hence, either of the commands
- link-parser en
- link-parser /usr/share/link-grammar/en
will run link-parser using the English dictionary included with the parser.
While in a link-parser session, some example output could be:
- linkparser> Reading a man page is informative.
++++Time 0.00 seconds (0.01 total)
Found 1 linkage (1 had no P.P. violations)
  Unique linkage, cost vector = (UNUSED=0 DIS=0 AND=0 LEN=12)

    +------------------------Xp-----------------------+
    |         +---------Ss*g---------+                |
    |         +-------Os-------+     |                |
    |         |     +----Ds----+     |                |
    +----Wd---+     |   +--AN--+     +---Pa---+       |
    |         |     |   |      |     |        |       |
LEFT-WALL reading.g a man.n page.n is.v informative.a .
A P.P. violation is a post-processing violation. Post-processing is a step that runs after linkage and rejects invalid parses. The link types shown are specific to English; other languages will have different link types.
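As a rough illustration, the example linkage above can be checked against two of the structural conditions from the Description: the links do not cross, and the words form a connected graph. The sketch below is not part of link-parser and does not reflect how it is implemented. The word indices and link list are transcribed by hand from the diagram, and the function names (links_cross, is_planar, is_connected) are hypothetical. The third condition, that each word's local requirements are satisfied, depends on the dictionary and is not checked here.

```python
from itertools import combinations

# Words of the example sentence, indexed left to right (0 = LEFT-WALL).
WORDS = ["LEFT-WALL", "reading.g", "a", "man.n", "page.n",
         "is.v", "informative.a", "."]

# (left word index, right word index, link label), read off the diagram.
LINKS = [
    (0, 7, "Xp"),
    (1, 5, "Ss*g"),
    (1, 4, "Os"),
    (2, 4, "Ds"),
    (0, 1, "Wd"),
    (3, 4, "AN"),
    (5, 6, "Pa"),
]


def links_cross(first, second):
    """Return True if two links, drawn as arcs above the words, intersect."""
    (a, b), (c, d) = sorted([first[:2], second[:2]])
    # After sorting, a <= c. The arcs cross only when the second link
    # starts strictly inside the first one and ends strictly outside it.
    return a < c < b < d


def is_planar(links):
    """Return True if no pair of links crosses."""
    return not any(links_cross(x, y) for x, y in combinations(links, 2))


def is_connected(word_count, links):
    """Depth-first search over the undirected graph formed by the links."""
    neighbours = {i: set() for i in range(word_count)}
    for left, right, _label in links:
        neighbours[left].add(right)
        neighbours[right].add(left)
    seen, stack = {0}, [0]
    while stack:
        for nxt in neighbours[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return len(seen) == word_count


if __name__ == "__main__":
    print("no crossing links:", is_planar(LINKS))
    print("connected:", is_connected(len(WORDS), LINKS))
```

Removing the Xp link from the list leaves the final period unattached, so is_connected would then report False for the same sentence.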
link-parser can also be used non-interactively, either through its API, or via the -batch option. When used with the -batch option, link-parser passively receives input from standard input, and when the stream finishes, it then outputs its analysis. So one could construct an ad-hoc grammar checker by piping text through link-parser with the -batch option, and seeing which sentences fail to parse as valid:
- cat thesis.txt | link-parser /usr/share/link-grammar/en/4.0.dict -batch
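The same pipeline can be driven from a script. The following is a minimal sketch, assuming a link-parser executable is on PATH and that the dictionary argument is given in one of the forms described under Use. The helper names batch_command and run_batch are hypothetical, and the output is returned unparsed because this page does not specify its format.

```python
import subprocess


def batch_command(dictionary):
    """Build the argument list for a non-interactive link-parser run.

    `dictionary` is either a two-letter language code such as "en" or the
    path to a dictionary directory, as described in the Use section.
    """
    return ["link-parser", dictionary, "-batch"]


def run_batch(text, dictionary="en"):
    """Pipe `text` to link-parser on standard input and return its raw output.

    Assumes a `link-parser` executable is available on PATH. The output is
    returned as-is; interpreting it is left to the caller.
    """
    result = subprocess.run(
        batch_command(dictionary),
        input=text,
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout
```

For example, print(run_batch(open("thesis.txt").read())) would feed the file from the command above to the parser and print whatever it reports.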
See Also
Information on the shared-library API and the link types used in the parse is available at the Abiword website at http://www.abisource.com/projects/link-grammar/dict/index.html
Peer-reviewed papers explaining link-parser can be found at the original CMU site at http://www.link.cs.cmu.edu/link/papers/index.html.
Author
link-parser was written by Daniel Sleator <sleator [at] cs.cmu.edu>, Davy Temperley <dtemp [at] theory.esm.rochester.edu>, and John Lafferty <lafferty [at] cs.cmu.edu>.
This manual page was written by Ken Bloom <kbloom [at] gmail.com>, for the Debian project (but may be used by others).
