The CG framework allows us to capture highly domain-specific lexical patterns, such as words and cue phrases that have particular meanings or implications in the biological context, while still making use of more abstract linguistic structure. Clause and phrase boundaries, established through part-of-speech tagging and shallow parsing, constrain the recognition of patterns for protein/gene interactions in context. The framework furthermore accommodates the representation of domain-specific semantic properties of specific patterns, both to constrain recognition and to guide interpretation, without depending on the identification of deep structural relationships in the text. This approach increases the precision of interaction extraction without requiring complete linguistic analysis. In this prototype, we focus on constructions at the clausal level that tolerate intervening modifiers that do not contribute to the main content of the clause.
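As a rough illustration of the idea summarized above, the following minimal sketch matches a single clause-level construction (protein, interaction cue, protein) over shallow-parsed chunks, skipping intervening modifier chunks but never crossing a clause boundary. It is not the prototype's implementation: the `Chunk` type, the chunk labels, the `PROTEIN` semantic tag, the cue list, the choice of skippable chunk kinds, and the protein names `ProtA` and `ProtB` are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Chunk:
    """One shallow-parse chunk (hypothetical representation)."""
    text: str      # surface string of the chunk
    kind: str      # chunk label, e.g. "NP", "VP", "ADVP", "PP", "CLAUSE_END"
    sem: str = ""  # assumed domain-specific semantic tag, e.g. "PROTEIN"


# Assumed cue verbs that signal an interaction; a real lexicon would be larger.
INTERACTION_CUES = {"binds", "phosphorylates", "activates", "inhibits"}

# Chunk kinds treated as modifiers that may intervene between slots.
# Skipping every PP is a simplification made for this sketch.
SKIPPABLE = {"ADVP", "PP"}


def next_core(chunks: List[Chunk], start: int) -> Optional[int]:
    """Index of the next non-modifier chunk in the same clause, if any."""
    for idx in range(start, len(chunks)):
        if chunks[idx].kind == "CLAUSE_END":
            return None  # never match across a clause boundary
        if chunks[idx].kind not in SKIPPABLE:
            return idx
    return None


def match_interactions(chunks: List[Chunk]) -> List[Tuple[str, str, str]]:
    """Find (agent, cue, target) triples for the construction
    PROTEIN + interaction cue + PROTEIN within a single clause."""
    found = []
    for i, agent in enumerate(chunks):
        if agent.sem != "PROTEIN":
            continue
        j = next_core(chunks, i + 1)
        if j is None or chunks[j].kind != "VP":
            continue
        cue = chunks[j].text.split()[-1].lower()  # last word as verb head
        if cue not in INTERACTION_CUES:
            continue
        k = next_core(chunks, j + 1)
        if k is None or chunks[k].sem != "PROTEIN":
            continue
        found.append((agent.text, cue, chunks[k].text))
    return found


# "ProtA in vitro strongly binds ProtB ." as hand-built chunks.
example = [
    Chunk("ProtA", "NP", "PROTEIN"),
    Chunk("in vitro", "PP"),
    Chunk("strongly", "ADVP"),
    Chunk("binds", "VP"),
    Chunk("ProtB", "NP", "PROTEIN"),
    Chunk(".", "CLAUSE_END"),
]
print(match_interactions(example))  # [('ProtA', 'binds', 'ProtB')]
```

The lexical cue and the semantic tag make the pattern domain-specific, while the chunk labels and the clause boundary supply the more abstract structural constraint. No full parse of the sentence is needed.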