Analyst-centric multimedia analysis scattered system (AMASS)
Navy SBIR FY2014.2


Sol No.: Navy SBIR FY2014.2
Topic No.: N142-122
Topic Title: Analyst-centric multimedia analysis scattered system (AMASS)
Proposal No.: N142-122-0024
Firm: Mayachitra, Inc.
5266 Hollister Avenue, Suite 229
Santa Barbara, California 93111
Contact: Jelena Tesic
Phone: (646) 379-6042
Abstract: Intelligence cues in multimedia data arise from multiple sources: the inherent knowledge of multiple end-users (analysts), feature-rich digital data content (co-occurrence of specific behaviors and scenes in video, audio, and other sensors), and intelligence context (where, when, why, how). Analysts need full access to video and acoustic data content (even when multiple constraints are present), the ability to formulate complex queries across feature modalities, and visualization of patterns from retrieved results in multiple contextual spaces. Doing this in real time requires a sophisticated back end: storage, common representation, search, annotation, and tagging schemes to manage the rich and diverse information contained in sensor feeds (video metadata, acoustic files, analyst comments, spatial and temporal localization, context). Doing it accurately requires sophisticated data retrieval that fuses information from these sources. Analysts expect the system to produce time-critical, actionable intelligence and insights beyond simple querying. Single-domain techniques are not applicable here, as each solves only part of the problem (high-dimensional descriptor search for video and audio content, or text search for transcripts). To do this effectively, the project will explore deep learning techniques to capture the underlying dynamics that yield useful insights. The project will develop an end-to-end solution that supports (a) back-end development and integration of a wide range of video and audio descriptions at different semantic levels through a unified representation of content description, with inference over the stored semantic knowledge at retrieval time; (b) fast and versatile access (with respect to security and bandwidth) and addition of rich semantic descriptions in collaborative environments (back-end and front-end feature annotation and tagging); and (c) sequencing and discovery of information contained in distributed, networked sensor data files at the frame level.
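
To make the unified content representation and fusion-based retrieval of item (a) concrete, the following is a minimal Python sketch. MediaSegment, fused_search, and all field names in them are hypothetical illustrations, not interfaces from the actual AMASS design; the sketch assumes precomputed per-modality descriptor vectors and combines them with simple score-level fusion.

    from dataclasses import dataclass, field
    from typing import Optional
    import numpy as np

    # Hypothetical unified record for one frame-level segment of sensor data.
    # Descriptors from different modalities live side by side so a single
    # index can answer cross-modal queries with spatial/temporal context.
    @dataclass
    class MediaSegment:
        source_id: str                              # sensor / file of origin
        t_start: float                              # temporal localization (s)
        t_end: float
        video_desc: Optional[np.ndarray] = None     # e.g., appearance feature
        audio_desc: Optional[np.ndarray] = None     # e.g., acoustic feature
        tags: list = field(default_factory=list)    # analyst annotations
        context: dict = field(default_factory=dict) # where / when / why / how

    def cosine(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

    def fused_search(query: MediaSegment, index: list,
                     w_video=0.5, w_audio=0.5, k=10):
        """Late (score-level) fusion: score each modality separately, then
        combine with weights, so a query may constrain one or both modalities."""
        scored = []
        for seg in index:
            s = 0.0
            if query.video_desc is not None and seg.video_desc is not None:
                s += w_video * cosine(query.video_desc, seg.video_desc)
            if query.audio_desc is not None and seg.audio_desc is not None:
                s += w_audio * cosine(query.audio_desc, seg.audio_desc)
            scored.append((s, seg))
        scored.sort(key=lambda p: p[0], reverse=True)
        return scored[:k]

Score-level fusion is used here only because the two descriptor spaces need not share a dimensionality; the deep learning direction mentioned above would instead learn a joint embedding, replacing the hand-set weights.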
Benefits: The system will allow more intelligent data manipulation and access based on whichever cues the analyst chooses to start a search with. For example, in a single iteration the system can efficiently retrieve all relevant video frames and audio tracks from the distributed system, analyze trends within the answer set, and point the analyst to promising leads (e.g., "most of the relevant videos were captured by person x" or "a strong audio pattern of silence is also part of the answer set").
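
As a rough illustration of the answer-set trend analysis described above, here is a minimal sketch assuming each retrieved result carries a metadata dict; the function name answer_set_leads and the keys captured_by and audio_label are illustrative assumptions, not fields of the actual system.

    from collections import Counter

    def answer_set_leads(results, keys=("captured_by", "audio_label", "location"),
                         min_share=0.5):
        """Scan retrieved-result metadata for attributes shared by a large
        fraction of the answer set and report them as candidate leads."""
        n = len(results)
        leads = []
        if n == 0:
            return leads
        for key in keys:
            counts = Counter(meta.get(key) for meta in results
                             if meta.get(key) is not None)
            if counts:
                value, freq = counts.most_common(1)[0]
                if freq / n >= min_share:
                    leads.append(f"{freq}/{n} results share {key} = {value!r}")
        return leads

    # Example: metadata of a hypothetical 4-item answer set.
    hits = [
        {"captured_by": "person x", "audio_label": "silence"},
        {"captured_by": "person x", "audio_label": "silence"},
        {"captured_by": "person x", "audio_label": "speech"},
        {"captured_by": "person y", "audio_label": "silence"},
    ]
    print(answer_set_leads(hits))
    # -> ["3/4 results share captured_by = 'person x'",
    #     "3/4 results share audio_label = 'silence'"]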
