Information Extraction and Scoring System
Navy SBIR 2014.1 - Topic N141-037
NAVSEA - Mr. Dean Putnam - [email protected]
Opens: Dec 20, 2013 - Closes: Jan 22, 2014

N141-037 TITLE: Information Extraction and Scoring System

TECHNOLOGY AREAS: Materials/Processes

ACQUISITION PROGRAM: PEO IWS 2.0, Above Water Sensors

OBJECTIVE: Develop an automated information extraction and scoring system for Electronic Warfare (EW) systems to provide disambiguated and intelligent sustainment solutions.

DESCRIPTION: Systems exist today that track shipboard inventories of spare equipment and other sustainment-related needs. Using these tracking systems requires a user to manually research and combine the information in order to reach an objective conclusion on how best to maintain Electronic Warfare (EW) systems for optimal operational readiness. Obtaining a true picture of the ship's operational status and risk factors is hampered by inaccurate or out-of-date information in the inventory systems and by a partial or limited view of the interdependencies among systems, components, and parts. This is supported by Government Accountability Office report GAO-03-887 (Ref 1), which shows the Navy recognizes that it has not met its supply goals for over 20 years.

There also exists a breadth of textual resources which remain apart from the tracking systems in use. These resources include such documents as Department of Defense (DoD) policy and instructions, Concept of Operations documents which address new equipment or new methods for using existing equipment, intelligence briefings on novel threats, systems engineering plans, system sustainment strategies and plans, revolutions in training, and technical manuals for onboard equipment. This textual information is essential to an accurate understanding of overall system design requirements, development and sustainment efforts, and their impact on the readiness and suitability of EW systems (such as identifying tools and parts inventory shortages that put the system's readiness at risk, decreasing operating and support costs, and providing accurate data-driven corrective actions). Because this valuable information is only available in a non-machine-readable format as textual documents, the Navy has no cost-effective and timely method for determining whether the proper support solution is available to reduce repair time and maintain its equipment in an optimal operational readiness state.

The Navy needs an automated method that identifies a physical support solution and scores the impact that solution has on system or part replacement events, diagnoses recorded by maintenance personnel, overall spare equipment inventory, and training effectiveness throughout the design, development, and sustainment activities. The innovation sought is a software system to (a) automate linking the system and sustainment dependencies identified in acquisition documentation to the physical support solution (for parts, tools, equipment, and others) and (b) score the efforts (for example, development, procurement, actual utilization rates, inventory impact estimates, and others) through unstructured text analysis.

Entity Extraction is the ability to automatically extract meaningful information trapped in a variety of non-structured technical documents and information. State-of-the-art entity extraction systems can be customized to identify, with high accuracy, the tools, parts, and equipment mentioned in non-machine-readable text narratives; however, extraction alone will not meet the need to identify the implicit information about overall system impact from the text sources and provide a recommended solution. The solutions offered should document the metrics used to monitor the accuracy of text extraction and the impact scoring analysis algorithms (see Ref 2 for examples of performance criteria typically used). It is expected that benefits can be gained by exploiting structured data available in current data sources (such as configuration data, technical manuals, parts inventories) to boost entity extraction performance and to aid in the resolution of similar and identical equipment (Refs 3-4).
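As an illustration only, the idea of using structured data to boost extraction and of tracking extraction accuracy can be sketched in Python as follows. The inventory contents, part numbers, maintenance note, and gold annotations are all hypothetical, and precision, recall, and F1 are used here as common extraction metrics; they are not necessarily the criteria defined in Ref 2.

```python
import re

# Hypothetical structured inventory: part number -> nomenclature.
# All values are made up for illustration.
INVENTORY = {
    "PN-1001": "waveguide switch",
    "PN-2002": "power amplifier",
    "PN-3003": "cooling fan",
}


def build_pattern(inventory):
    """Compile one regex matching any known part number or nomenclature."""
    terms = list(inventory.keys()) + list(inventory.values())
    # Longest terms first so longer names win over shorter overlapping ones.
    terms.sort(key=len, reverse=True)
    alternation = "|".join(re.escape(term) for term in terms)
    return re.compile(r"\b(" + alternation + r")\b", re.IGNORECASE)


def extract_entities(text, pattern):
    """Return the set of lower-cased inventory terms mentioned in the text."""
    return {match.group(1).lower() for match in pattern.finditer(text)}


def precision_recall_f1(predicted, gold):
    """Score a predicted entity set against a hand-annotated gold set."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical maintenance narrative and its hand-annotated entities.
note = ("Replaced power amplifier (PN-2002) and inspected the cooling fan; "
        "antenna coupler degraded.")
gold = {"power amplifier", "pn-2002", "cooling fan", "antenna coupler"}

predicted = extract_entities(note, build_pattern(INVENTORY))
precision, recall, f1 = precision_recall_f1(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In this sketch the inventory-driven matcher finds every item it knows about but misses the "antenna coupler", which is absent from the inventory. That gap is the kind of shortfall the documented metrics would be expected to expose.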

The same physical piece of equipment will be identified in many different ways: by its function, by the manufacturer's part number, by a configuration item number, or by a reference designator. A proposed solution will therefore have to support a form of entity disambiguation that can gather the different identifiers, both from the current sustainment systems and from the extracted text, that refer to equivalent or identical parts, systems, and stock items, and then provide a recommended support solution. With this technology, the reliability and the overall operational readiness of the fleet will be improved while lowering operating and maintenance costs.
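One possible way to picture this kind of disambiguation, offered as a sketch rather than a required design, is to treat every identifier as a node and merge nodes whenever any source asserts that two identifiers refer to the same item. The Python below does this with a union-find structure. The class name AliasIndex, the identifier kinds, and all identifier values are hypothetical.

```python
from collections import defaultdict


class AliasIndex:
    """Groups identifiers that refer to the same physical item (union-find)."""

    def __init__(self):
        self._parent = {}

    def _find(self, key):
        # Register unseen identifiers as their own group.
        self._parent.setdefault(key, key)
        root = key
        while self._parent[root] != root:
            root = self._parent[root]
        # Path compression: point every visited node directly at the root.
        while self._parent[key] != root:
            self._parent[key], key = root, self._parent[key]
        return root

    def link(self, a, b):
        """Record that identifiers a and b refer to the same item."""
        root_a, root_b = self._find(a), self._find(b)
        if root_a != root_b:
            self._parent[root_b] = root_a

    def same_item(self, a, b):
        """True if a and b have been linked, directly or transitively."""
        return self._find(a) == self._find(b)

    def groups(self):
        """Return each set of equivalent identifiers."""
        clusters = defaultdict(set)
        for key in list(self._parent):
            clusters[self._find(key)].add(key)
        return list(clusters.values())


index = AliasIndex()
# Hypothetical link from a parts inventory record.
index.link(("part_number", "PN-2002"), ("config_item", "CI-17"))
# Hypothetical link from a technical manual.
index.link(("config_item", "CI-17"), ("ref_des", "A3A2"))
# Hypothetical link from an entity extracted out of a maintenance narrative.
index.link(("function", "power amplifier"), ("ref_des", "A3A2"))
# An unrelated item.
index.link(("part_number", "PN-3003"), ("function", "cooling fan"))

print(index.same_item(("part_number", "PN-2002"),
                      ("function", "power amplifier")))
print(len(index.groups()))
```

The part number and the functional name are never linked directly in this example; they end up in the same group only because both are tied to the same reference designator. A fielded system would also need to weigh conflicting or low-confidence links, which this sketch does not attempt.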

PHASE I: The company will develop a concept for an automated information extraction and scoring system that meets the requirements described above. The company will demonstrate the feasibility of the concept in meeting Navy needs and will establish that the concept can be feasibly developed into a useful product for the Navy. Feasibility will be established by demonstrating entity extraction and analytical modeling. The small business will provide a Phase II development plan that addresses technical risk reduction and provides performance goals and key technical milestones.

PHASE II: Based on the results of Phase I and the Phase II development plan, the small business will develop a prototype automated information extraction and scoring system for evaluation. The prototype will be evaluated to determine its capability in meeting the performance goals defined in the Phase II development plan and the Navy requirements. System performance will be demonstrated through prototype evaluation and modeling or analytical methods over the required range of parameters including numerous deployment cycles. Evaluation results will be used to refine the prototype into an initial design that will meet Navy requirements. The company will prepare a Phase III development plan to transition the technology to Navy use.

PHASE III: The company will be expected to support the Navy in transitioning the technology for Navy use. The company will develop an automated information extraction and scoring system according to the Phase III development plan for evaluation to determine its effectiveness in an operationally relevant environment. The company will support the Navy for test and validation to certify and qualify the system for Navy use.

PRIVATE SECTOR COMMERCIAL POTENTIAL/DUAL-USE APPLICATIONS: Possible commercial applications of these solutions include the medical and clinical records management environments where detailed physicians' and clinicians' notes can be extracted to track prescriptions, patient data, and others. The solutions developed here can also assist in other disciplines in which tracing textual impact on policies or availability of resources is an issue, such as for legal and regulatory agencies, software development efforts, and intelligence analysis.

REFERENCES:
1. U.S. Government Accountability Office. (2003, August). Opportunities Exist to Improve Spare Parts Support Aboard Deployed Navy Ships. (Publication No. GAO-03-887). <http://www.gao.gov/products/GAO-03-887>

2. "Automatic Content Extraction 2008 Evaluation Plan (ACE08), Assessment of Detection and Recognition of Entities and Relations Within and Across Documents." <http://www.itl.nist.gov/iad/894.01/tests/ace/2008/doc/ace08-evalplan.v1.2d.pdf>

3. Bratus, Sergey, Anna Rumshisky, Rajendra Magar, and Paul Thompson. "Using domain knowledge for ontology-guided entity extraction from noisy, unstructured text data." In Proceedings of The Third Workshop on Analytics for Noisy Unstructured Text Data (AND '09). ACM, New York, NY, USA, 101-106. 2009. <http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.158.2357>

4. Rao, R. Bharat, Sriram Krishnan, and Radu Stefan Niculescu. "Data mining for improved cardiac care." SIGKDD Explor. Newsl. 8, 1 (June 2006), 3-10. <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.105.9674&rep=rep1&type=pdf>

KEYWORDS: entity extraction; unstructured text analysis; entity disambiguation; information extraction and scoring; information analysis

** TOPIC AUTHOR (TPOC) **
DoD Notice:  
Between November 20 and December 19 you may talk directly with the Topic Authors (TPOC) to ask technical questions about the topics. Their contact information is listed above. For reasons of competitive fairness, direct communication between proposers and topic authors is not allowed starting Dec 20, 2013, when DoD begins accepting proposals for this solicitation. However, proposers may still submit written questions about solicitation topics through the DoD's SBIR/STTR Interactive Topic Information System (SITIS), in which the questioner and respondent remain anonymous and all questions and answers are posted electronically for general viewing until the solicitation closes. All proposers are advised to monitor SITIS (14.1 Q&A) during the solicitation period for questions and answers, and other significant information, relevant to the SBIR 14.1 topic under which they are proposing.

If you have general questions about the DoD SBIR program, please contact the DoD SBIR Help Desk at (866) 724-7457 or by email.