Ghost Writer: Finding the Man behind the Pen
Navy SBIR FY2012.1


Sol No.: Navy SBIR FY2012.1
Topic No.: N121-080
Topic Title: Ghost Writer: Finding the Man behind the Pen
Proposal No.: N121-080-1156
Firm: Language Computer Corporation
2435 N. Central Expressway
Suite 1200
Richardson, Texas 75080-2747
Contact: Marc Tomlinson
Phone: (972) 231-0051
Web Site: www.languagecomputer.com
Abstract: Our proposed work addresses three significant challenges faced by systems that seek to enhance analysts' awareness and understanding of the authors and groups behind online documents. First, we will develop a novel feature, grounded in cognitive psychology, that identifies authors by their area of expertise. Second, we will address the problem of identifying predictive features by exploring novel clustering techniques that jointly learn the relative importance of features for predicting the author and group responsible for producing a document. Third, we will draw on our significant experience in parsing foreign languages to extend the available set of authorship identification tools to Chinese, Spanish, and Russian. Lastly, as with all software products at Language Computer, we will follow a tiered design that supports feature extraction, processing, and classification on real-time streaming text sources.
Benefits: Over the past 10 years, Language Computer Corporation (LCC) has been a pioneer in developing innovative solutions designed to help users unlock value from text corpora. We believe there is a growing need, across multiple sectors of the U.S. government, for systems capable of accurately identifying the author and group behind an anonymous document. These sectors include the Department of Defense, the Department of Homeland Security, the Federal Bureau of Investigation, and the national intelligence organizations overseen by the Office of the Director of National Intelligence. We anticipate that the prototype system developed as part of Ghost Writer will not only enhance the quality of Language Computer's CiceroCustom suite of open-domain customizable event extraction tools but will also serve the author identification needs of operational customers. Transitioning our technology into these organizations will support intelligence analysts, incident managers, intelligence preparation for the battlefield, and event detection / response collaboration. To accomplish our objectives, we will engage our current contacts within the Intelligence and Homeland Security communities to demonstrate this capability and tailor it to their needs. For DoD customers, we will work through our strategic partners to demonstrate and integrate conceptual visualization into their current and future systems. As our author identification tools become more robust, we expect the quality and coverage of our applications to drive demand for LCC's products in the civilian commercial sector as well.
