Author and Group Insight through Linguistic Expression (AGILE)
Navy SBIR FY2012.1


Sol No.: Navy SBIR FY2012.1
Topic No.: N121-080
Topic Title: Author and Group Insight through Linguistic Expression (AGILE)
Proposal No.: N121-080-0306
Firm: DECISIVE ANALYTICS Corporation
1235 South Clark Street
Suite 400
Arlington, Virginia 22202
Contact: Peter David
Phone: (703) 414-5009
Web Site: http://www.dac.us
Abstract: The intelligence value of a document goes far beyond the face value of its content. Clues to the identity, worldview, and even the psychological state of its author are encoded in features such as word choice, sentence structure, and explicit and implied statements of group membership. Years of research have shown that statistical and linguistic methods can shed light on a substantial amount of information about the identity and characteristics of an author. But traditional analysis techniques have been investigated in isolation, on a small scale, and with limited variety in the target documents. The Author and Group Insight through Linguistic Expression (AGILE) approach to author analysis extends DAC's text analytics platform by incorporating a variety of extensions to the standard set of stylometric features used to attribute authorship. AGILE uses DAC's existing semantic and sentiment processing technology to extract discourse-based features that capture the way authors perceive themselves and their relationships with other entities. The Phase I effort demonstrates how discourse features can be extracted from a variety of online sources of English and Arabic text. A series of experiments evaluates the power of discourse features to cluster documents and authors according to their social identity and worldview.
Benefits: Years of research across multiple disciplines have shown that statistical, linguistic, and psycholinguistic methods can shed light on a substantial amount of information about the identity and characteristics of authors based on analysis of their documents. But this diverse set of analysis techniques has been investigated in isolation, on a relatively small scale, and with limited linguistic variety in the target documents. The Author and Group Insight through Linguistic Expression (AGILE) effort integrates a suite of diverse analysis techniques into a unified capability that addresses the large-scale, multi-media, and multi-language author analysis problem facing the Warfighter. Our unique vision extends DAC's powerful text analytics platform by incorporating a variety of author analysis methods. This Phase I effort combines existing author analysis methods with novel features supported by our text analytics capabilities. Advanced grouping and clustering methods are used to determine authorship and community membership. We demonstrate and refine our analysis methods through challenging, real-world problems that mimic the scale and complexity of the document exploitation problems facing the Warfighter.
