Back End Data Lake and Microservices (BEDLAM) Strategy for Battle Management Aid (BMA) Development

Navy SBIR 23.1 - Topic N231-015
NAVAIR - Naval Air Systems Command
Pre-release 1/11/23   Opens to accept proposals 2/08/23   Closes 3/08/23 12:00pm ET

N231-015 TITLE: Back End Data Lake and Microservices (BEDLAM) Strategy for Battle Management Aid (BMA) Development

OUSD (R&E) CRITICAL TECHNOLOGY AREA(S): Artificial Intelligence (AI)/Machine Learning (ML)

The technology within this topic is restricted under the International Traffic in Arms Regulation (ITAR), 22 CFR Parts 120-130, which controls the export and import of defense-related material and services, including export of sensitive technical data, or the Export Administration Regulation (EAR), 15 CFR Parts 730-774, which controls dual use items. Offerors must disclose any proposed use of foreign nationals (FNs), their country(ies) of origin, the type of visa or work permit possessed, and the statement of work (SOW) tasks intended for accomplishment by the FN(s) in accordance with the Announcement. Offerors are advised foreign nationals proposed to perform on this topic may be restricted due to the technical data under US Export Control Laws.

OBJECTIVE: Develop a prototype data lake that integrates multiple Battle Management Aids (BMAs), and expose the available data to developers for reuse while maintaining proper security boundaries between software applications to protect the intellectual property rights of all developers.

DESCRIPTION: The U.S. Government needs a method to standardize and add desired data and microservices into a common repository for use and reuse. Data sources are often common between applications, but the data is delivered to each application as needed, so that delivery of common data from a common source occurs asynchronously and consumes the bandwidth available for intra-application sharing multiple times. Similarly, developers often develop BMAs in nongovernment-controlled repositories, which, despite inherent common microservices, may not use the same sources for those microservices (e.g., time servers, network protocols, etc.). As developers deliver code into the U.S. Government's environment, redevelopment is often required to replace the original baseline microservices with those available in that environment, which means rework to recode the application against the U.S. Government source.

In addition, current-state microservices are typically limited to basic data such as time or position. As increasingly complex BMAs are developed, the potential for reuse, and therefore optimization, of shared data across BMAs is limited without a unified data strategy for development/security/operations (DevSecOps) environments. This SBIR topic seeks to take advantage of a subset of known data requirements across current BMAs, to include more complex data available as an output from a multitude of existing applications and leverage the current-state DevSecOps environment to provide that data as one of the available microservices for development. As the U.S. Government seeks to require developers not only to deliver, but actually develop in the U.S. Government's DevSecOps environment, a more flexible back end data lake to enable sharing of data across BMAs in the same environment is desired. An added benefit is that as this government purpose rights data lake is rendered, it should facilitate application porting between environments without imposing an overhead cost, because all used commercial DevSecOps environments may adopt and expand it.

This SBIR topic seeks to enable management of big data by standardizing the mechanism for delivering data repeatedly to multiple applications on a shared network, using data fields common between differing applications. This is also an enabler for network-aware applications to be developed, because one of the common data fields will natively become the required information to interface with a routing stack, and to build the request for data into a common mechanism across the same network. As capabilities like Communications-as-a-Service (CaaS), which is a Program of Record requiring compliant applications to proliferate, are fielded, the same applications may be useful across different platforms with minimal Non-Recurring Engineering to integrate them, and this data lake could become a commoditized government furnished software product to all developers in the future.

The desired solution should be flexible and adaptable. As developers request data and access to data sources, the back end data lake should, where feasible, relieve the computational, storage, and communications bottlenecks inherent in large, monolithic, traditional development, so that the same data is accessible to multiple developers, whether contractor or government. The goal is to avoid duplicative storage of data, and therefore the design might include data translation capabilities as infrastructure hosted by the data lake. The data lake should also provide an adequate mechanism for the U.S. Government to write BMA requirements into contractual efforts that leverage it, such as a RESTful application programming interface (API). Note that REST is not a standard; it is an architectural style that provides constraints to guide API design. Many APIs do not conform to every element of REST, which has led some to use the term RESTful to describe the most common types of APIs.
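As a rough illustration of the store-once, translate-on-request idea in the paragraph above, the following Python sketch keeps a single canonical copy of each record and renders it into whatever format a consumer asks for. The class name `DataLake`, the `put`/`get` methods, the record contents, and the two output formats (`json`, `csv`) are assumptions made for this example only; the topic does not specify any of them.

```python
import csv
import io
import json


class DataLake:
    """Hypothetical in-memory store: one canonical copy per record key."""

    def __init__(self):
        self._records = {}  # key -> canonical dict
        # Translators live in the lake, not in each consuming application.
        self._translators = {"json": self._to_json, "csv": self._to_csv}

    def put(self, key, record):
        """Store (or overwrite) the single canonical copy of a record."""
        self._records[key] = dict(record)

    def get(self, key, fmt="json"):
        """Return the stored record rendered in the requested format."""
        if fmt not in self._translators:
            raise ValueError(f"unsupported format: {fmt}")
        return self._translators[fmt](self._records[key])

    @staticmethod
    def _to_json(record):
        return json.dumps(record, sort_keys=True)

    @staticmethod
    def _to_csv(record):
        buf = io.StringIO()
        writer = csv.DictWriter(buf, fieldnames=sorted(record), lineterminator="\n")
        writer.writeheader()
        writer.writerow(record)
        return buf.getvalue()


lake = DataLake()
# Illustrative values only; not real platform data.
lake.put("own-ship", {"id": "node-1", "lat": 32.7, "lon": -117.2})
as_json = lake.get("own-ship", "json")  # one consumer wants JSON
as_csv = lake.get("own-ship", "csv")    # another wants CSV, from the same copy
```

Under these assumptions, adding a new output format means registering one more translator in the lake rather than storing a second copy of the data.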

As an example task, many applications exist that enable specific physical layer control over a tactical radio, but these are typically developed by the radio manufacturer, necessitating a license cost for any network controller leveraging that particular application. For next-generation Naval Tactical Grid (NTG) applications, the Department of the Navy (DON) must optimize the back end data lake supporting the front end graphical user interface for control and management of many network applications, including, but not limited to, the physical layer devices, crypto, and legacy interfaces. This should use the standardized interfaces that are becoming required, such as Simple Network Management Protocol (SNMP) for Common Data Link (CDL) systems (the Bandwidth Efficient CDL Rev B specification is classified, but SNMP is the protocol it requires), Dynamic Link Exchange Protocol (DLEP) for tactical radios, and so forth, translating to legacy interfaces at the data layer before delivering the requested data to the user interface layer.

The aim is to develop a single data lake that can serve as a repository for relevant data coming from all hosted applications and can cognitively recognize when duplicative data is being requested via a services-oriented architecture, such as Communications-as-a-Service (CaaS).

The proposed solution should minimize the bandwidth consumed by duplicate data traversing the network backbone and support all BMA developers with a unified data management strategy during development in a DevSecOps environment. It should do so by enabling a common data lake to provide microservices (e.g., map data, timing, navigation signals, own-ship position, etc.) across applications, rather than having each application uniquely request that data across what is often a bandwidth-constrained communications backbone during wartime operations.
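One simple way to picture the bandwidth saving described above is a shared front end that fetches each item across the backbone once and serves later requests from a local copy. The sketch below is a minimal illustration under stated assumptions: `SharedFeed`, `fetch_from_source`, the time-to-live policy, and the returned values are all hypothetical and are not drawn from the topic.

```python
import time


class SharedFeed:
    """Hypothetical microservice front end: fetch once per TTL, serve many."""

    def __init__(self, fetch, ttl_seconds=1.0, clock=time.monotonic):
        self._fetch = fetch        # callable that crosses the network backbone
        self._ttl = ttl_seconds    # how long a cached value counts as fresh
        self._clock = clock
        self._cache = {}           # name -> (fetched_at, value)
        self.upstream_calls = 0    # number of times the backbone was used

    def request(self, name):
        """Serve from the local copy when fresh; otherwise fetch and cache."""
        now = self._clock()
        hit = self._cache.get(name)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]
        self.upstream_calls += 1
        value = self._fetch(name)
        self._cache[name] = (now, value)
        return value


def fetch_from_source(name):
    # Stand-in for an expensive fetch over a bandwidth-constrained link.
    return {"name": name, "value": 42}


feed = SharedFeed(fetch_from_source, ttl_seconds=60.0)
# Three different applications ask for the same item in quick succession.
results = [feed.request("own-ship-position") for _ in range(3)]
```

In this sketch the three requests result in a single upstream fetch. A real design would also need an invalidation or freshness policy suited to each data type, which is outside what this example attempts.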

Proposed solutions should support standardized network management protocols such as simple network management protocol (SNMP) and adhere to government-defined data models such as the YANG data model. Proposed solutions should support Containers-as-a-Service (CaaS) development, to provide a solution for an initial data lake that could be supported in a DevSecOps environment such as Overmatch Software Armory (OSA) to support all BMA developers toward the end state of a common microservices architecture and data lake. Design should consider acquisition constraints for current-state processes for fielding new systems and applications to a shipboard environment in the strategy for implementation.

Work produced in Phase II may become classified. Note: The prospective contractor(s) must be U.S. owned and operated with no foreign influence as defined by DoD 5220.22-M, National Industrial Security Program Operating Manual, unless acceptable mitigating procedures can and have been implemented and approved by the Defense Counterintelligence and Security Agency (DCSA) formerly Defense Security Service (DSS). The selected contractor must be able to acquire and maintain a secret level facility and Personnel Security Clearances. This will allow contractor personnel to perform on advanced phases of this project as set forth by DCSA and NAVAIR in order to gain access to classified information pertaining to the national defense of the United States and its allies; this will be an inherent requirement. The selected company will be required to safeguard classified material IAW DoD 5220.22-M during the advanced phases of this contract.

PHASE I: Develop and demonstrate feasibility of a design citing industry standard methods for merging the superset of data inputs and outputs from a sample set of existing applications and rendering that into a back end data lake with an accessible API for use in development of new applications. Methods for manipulating data into multiple requested formats from the raw repository state, providing microservice requests to and from applications, methods of enabling parallel processing, and methods of data management to minimize redundancy and optimize network performance for multiple data requests of the same time to different endpoints, are all in scope of this effort. A proposed implementation plan, including a mechanism for publishing new data sources, formats, and microservices coded to specific applications that can be tailored, should be included. The Phase I effort will include prototype plans to be developed under Phase II.
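The publishing mechanism mentioned in Phase I could take many forms. As one hedged sketch, a small catalog might let a source be registered with its fields and formats, and let an application discover which sources carry the fields it needs. The names `Catalog`, `SourceEntry`, `publish`, and `find`, as well as the sample source names and fields, are assumptions for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class SourceEntry:
    """Hypothetical catalog entry describing one published data source."""
    name: str
    fields: tuple
    formats: set = field(default_factory=set)


class Catalog:
    def __init__(self):
        self._entries = {}  # source name -> SourceEntry

    def publish(self, name, fields, formats):
        """Register a new source, or add formats to an existing one."""
        entry = self._entries.get(name)
        if entry is None:
            entry = SourceEntry(name=name, fields=tuple(fields))
            self._entries[name] = entry
        elif entry.fields != tuple(fields):
            raise ValueError(f"field mismatch for existing source {name!r}")
        entry.formats.update(formats)
        return entry

    def find(self, required_fields):
        """Return sorted names of sources that carry every required field."""
        needed = set(required_fields)
        return sorted(
            name for name, entry in self._entries.items()
            if needed <= set(entry.fields)
        )


catalog = Catalog()
catalog.publish("link-positions", ["unique_id", "timestamp", "position"], {"json"})
catalog.publish("chat-log", ["unique_id", "timestamp", "text"], {"json"})
# Publishing an additional format for an existing source extends its entry.
catalog.publish("link-positions", ["unique_id", "timestamp", "position"], {"csv"})
matches = catalog.find(["timestamp", "position"])
```

Here `matches` contains only `link-positions`, since it is the only registered source that carries both requested fields.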

PHASE II: Develop a prototype data lake solution and reference implementation of BMA developer resource requests and automated delivery of requested data. An example using existing BMAs is not required, but would provide a meaningful deliverable. Implement the prototype in a U.S. Government DevSecOps environment (specified by the topic's Technical Point of Contact) and support BMA development in response to a validated fleet requirement specified by the government at kickoff, with the microservices and data lake on the back end as the prototype.

Work in Phase II may become classified. Please see note in Description paragraph.

PHASE III DUAL USE APPLICATIONS: Integrate government-specified third-party developers in refactoring, new development, or interfacing of at least two government-specified, third-party-developed BMAs to prove out the concept and continue to refine the application from Phase II to at least two service-level platform systems across the joint community in response to a validated fleet requirement.

Private sector has an equal, if not greater, requirement for big data analytics and real-time performance (e.g., analysis of market trends driving a decision to invest or divest in a given stock, fund, or sector).

REFERENCES:

1. Mapeso, R. (2020, September 18). Why data lakes are more powerful for the DOD than commercial industry. Nextgov. https://www.nextgov.com/ideas/2020/09/why-data-lakes-are-more-powerful-dod-commercial-industry/168475/

2. Boyd, A. (2020, September 29). Air Force wants novel ideas for building "data scientist's ecosystems" at operations centers. Nextgov. https://www.nextgov.com/analytics-data/2020/09/air-force-wants-novel-ideas-building-data-scientists-ecosystems-operations-centers/168859/

3. Haystead, J. (1997, October 1). Show me the data: High-speed commercial serial buses square off for real-time, military and aerospace applications. Military & Aerospace Electronics. https://www.militaryaerospace.com/computers/article/16710126/show-me-the-data-highspeed-commercial-serial-buses-square-off-for-realtime-military-and-aerospace-applications

4. Tek, M. (2017, April 12). MIL-STD-1553B in avionics: where data networking has been and where it's going. Intelligent Aerospace. https://www.intelligent-aerospace.com/commercial/article/16544804/milstd1553b-in-avionics-where-data-networking-has-been-and-where-its-going

5. Department of Defense. (2006, February 28). DoD 5220.22-M National Industrial Security Program Operating Manual (Incorporating Change 2, May 18, 2016). Department of Defense. https://www.esd.whs.mil/portals/54/documents/dd/issuances/dodm/522022m.pdf

 

KEYWORDS: Data Management; Data Strategy; Big Data; Optimization; Software Development; Battle Management Aid


** TOPIC NOTICE **

The Navy Topic above is an "unofficial" copy from the Navy Topics in the DoD 23.1 SBIR BAA. Please see the official DoD Topic website at www.defensesbirsttr.mil/SBIR-STTR/Opportunities/#announcements for any updates.

The DoD issued its Navy 23.1 SBIR Topics pre-release on January 11, 2023. The BAA opens to receive proposals on February 8, 2023, and closes March 8, 2023 (12:00pm ET).

Direct Contact with Topic Authors: During the pre-release period (January 11, 2023 thru February 7, 2023) proposing firms have an opportunity to directly contact the Technical Point of Contact (TPOC) to ask technical questions about the specific BAA topic. Once DoD begins accepting proposals on February 8, 2023 no further direct contact between proposers and topic authors is allowed unless the Topic Author is responding to a question submitted during the Pre-release period.

SITIS Q&A System: After the pre-release period, and until February 22, 2023, (at 12:00 PM ET), proposers may submit written questions through SITIS (SBIR/STTR Interactive Topic Information System) at www.dodsbirsttr.mil/topics-app/, login and follow instructions. In SITIS, the questioner and respondent remain anonymous but all questions and answers are posted for general viewing.

Topics Search Engine: Visit the DoD Topic Search Tool at www.dodsbirsttr.mil/topics-app/ to find topics by keyword across all DoD Components participating in this BAA.

Help: If you have general questions about the DoD SBIR program, please contact the DoD SBIR Help Desk via email at [email protected]

Topic Q & A

2/27/23  Q. Operational or Development Data Lake: Are you looking for a Data Lake for developers to integrate new systems that includes CICD pipelines, vendor data separation, integration documentation, etc., or are you looking for an operational Data Lake where the contracted systems are integrated together and that captures government-owned operational data that previously developed systems can leverage?
   A. Either/Both.
2/8/23  Q. Is the Data Lake expected to live in a cloud? Are the BMAs expected to be cloud-based applications?
   A. BMAs are not required to be cloud-based applications; however, the DevSecOps environments currently in use are government-owned environments hosted in commercial clouds. Once deployed to a given platform, the applications will be hosted on local hardware on that platform. It is recommended that applications be capable of providing service-request information to the network over which data lake hosted data will be passed. The specifications for this service request will be provided to Phase II awardees.
2/8/23  Q. What security considerations would you like to see addressed under this topic? Can we assume for now that all contents of the Data Lake are at the same level of classification? Can we assume that data sources have been vetted and are trustworthy? Can we assume the BMAs requesting information have been authenticated and are authorized?
   A. Assume that all contents of the Data Lake are managed at the appropriate classification level, have undergone an approval process for inclusion and sources are trustworthy.
All BMAs requesting information have undergone a cybersecurity approval process for the individual application (Authority to Operate) for the CANES network and comply with requirements for the current Software DevSecOps environment for deployment on CANES.
In Phase I, the government is exploring concepts. In Phase II, prototyping is anticipated but not required to occur within a DevSecOps environment controlled by the government. It is expected that commercial DevSecOps environments will potentially be used instead. In Phase III, the government would expect compliance with deployment and certification requirements to include onboarding the application into a government-owned DevSecOps environment, and potentially development within the environment if a suitable alternative is not available at that time.

2/8/23  Q. Can you give us examples (names) of BMAs that could benefit from this technology?
   A. Commercial Off-the-Shelf and Government-Developed Chat Applications. Situational awareness dashboard tools required to display real-time data such as platform position pulled from a data link feed.
2/8/23  Q. Are there specific BMAs and/or Programs of Record being targeted for transition of technology developed under this topic?
   A. One possibility is the Consolidated Afloat Networks and Enterprise Services (CANES) program of record. This is the common computing environment for more than 40 command, control, intelligence and logistics applications.
Point of Contact: Program Executive Office Command, Control, Communications, Computers and Intelligence, Tactical Networks Program Office, 4301 Pacific Highway, San Diego, CA 92110

2/8/23  Q. Additionally, the government wants microservices that will provide BMAs with commonly used data in standard data formats. BMAs needing this data will use these microservices instead of building their own.
Efficient distribution of information to the BMAs. Reduce network bandwidth utilization by minimizing the number of times the same piece of information is sent to different BMAs.
   A. This is correct.
2/8/23  Q. Is it correct that there are two aspects to the problem being addressed by this solicitation:
Information discovery in the Data Lake. The Data Lake is a collection of structured and unstructured data. The govt is looking for mechanisms that will allow Battle Management Applications (BMAs) to readily find the information they need from this Data Lake.
   A. This is correct.
2/6/23  Q. 1. What are the key BMAs you would like supported first?
   A. 1. Generic chat application, a parser of positional data for a designated set of nodes in a network
2/6/23  Q. 2. Protecting developer IP is a stated objective. Can you elaborate on the scope and boundaries of developer rights and gov't rights that you seek to enable?
   A. 2. As common development will occur in the shared environment, it is imperative that any data exposed to the data lake that might inadvertently provide insight into the intellectual property associated with the application is protected. This might be accomplished via a data tagging mechanism or perhaps the data lake will require some relinquishment of data harvest. We are open to new mechanisms.
2/6/23  Q. 3. Can you provide a validated fleet example of repeated asynchronous data availability and data delivery processes exceeding bandwidth?
   A. 3. Intermittently connected data links will provide an update to a known data model as connectivity is established (the asynchronous nature) but on a repeatable basis (whenever connectivity is established, which will be a function of line of sight or removal of obstruction by navigation).
2/6/23  Q. 4. Can you provide any insight into gov't identified problems with the current unified data strategy for DevSecOps?
   A. 4. Currently, all Naval Acquisition SYSCOMs pursue software development via their own mechanism. While within a given SYSCOM (e.g. NAVSEA) there is often a common DevSecOps environment (e.g. "The Forge"), other SYSCOMs seeking to implement applications developed in the Forge often have to introduce the common developer into a second environment to tailor the common application, due to differing interpretations of the rules governing DevSecOps security, different standardized back end applications and services, etc. The primary difficulty with DevSecOps is not so much the Dev, but the Continuous Integration/Continuous Deployment of common applications developed in these distinct environments, to multiple platforms or architectures. This common back end data lake seeks primarily to minimize the duplicate instances of common data traversing the network on the ship, but there is a case to be made for standardization of the data lake across environments in future.
2/6/23  Q. 5. Can you provide the most ideal common data fields you would like to see this SBIR address?
   A. 5. position, timestamp, unique ID, state functions for a network status
2/6/23  Q. 6. Can you provide examples of computational, storage and communications bottlenecks?
   A. 6. Often the ability to send data on and off ship/platform is limited to a low bandwidth link. Often new applications are fielded on a new piece of limited compute hardware with limited storage, and expecting robust logging of data files during operations requires interfacing to a common storage device instead of the host device.
2/6/23  Q. 7. Can you provide relevant background information on the CaaS program of record?
   A. 7. CaaS is part of PMW160 Advanced Digital Networking System (ADNS) Program of Record as a router. PMW160 also oversees the Consolidated Afloat Networks and Enterprise Services (CANES) backbone, which provides inter-ship connectivity to applications and interfaces with ADNS. ADNS provides off-ship communications and CaaS governs the quality of service for prioritization of network traffic.
2/6/23  Q. 8. Can you provide relevant examples of Navy Tactical Grid (NTG) data that would better enable us to address the tasks in Phase I?
   A. 8. Chat data that does not require continuous connectivity to a host server, positional data that must be harvested from a tactical data link interface for display in a third party application for situational awareness.
2/6/23  Q. 9. Please provide a more clear example of the Acquisition realities of fielding to a shipboard environment.
   A. 9. This is a process governed by the individual SYSCOM and out of scope for Phase I.
2/6/23  Q. 10. What are the TPOC's desired gov't DevSecOps environment(s)?
   A. 10. At this time, consensus is not established. Equally viable targets are the Software Forge (NAVSEA), Overmatch Software Armory (OSA), or an individual platform DevSecOps environment within NAVAIR.
2/6/23  Q. 11. Will the Navy provide at least 3 data source streams, or simulations, or specs so we can build mock streams that we can use to develop data lake ingest components?
   A. 11. Yes
2/6/23  Q. 12. Will the Navy provide at least 3 BMAs, or BMA simulators, or BMA specs so we can build mock BMAs that we can use to develop data lake clients?
   A. 12. Yes
2/6/23  Q. 13. Can we onboard engineers to the Overmatch Software Armory, or other chosen DevSecOps environment, in Phase 1 to inspect the environment to ensure data lake design is compatible with the target environment?
   A. 13. Potentially in Phase II. A technical exchange is possible at the conclusion of Phase I for scoping Phase II proposals.
2/6/23  Q. 14. Given that Overmatch supports multiple CSPs, what is the tradeoff preference between cloud native versus cloud agnostic implementation? Also, will there potentially be multi-cloud requirements (i.e. components operating in both AWS and Azure)?
   A. 14. Potentially in Phase II. A technical exchange is possible at the conclusion of Phase I for scoping Phase II proposals.
2/6/23  Q. 15. Would government consider data fabric or data mesh architectures as an alternative to a more traditional data lake (note, I'd need to get a little smarter about what these actually are)?
   A. 15. Yes.
2/6/23  Q. 16. What is the anticipated volume and frequency of data ingested into the data lake?
   A. 16. This will be variable and dependent on the host platform governance structure for software updates.
2/6/23  Q. 17. Is the government able to provide any details about components of the Overmatch Software Armory in order to replicate its functionality in an unclassified environment?
   A. 17. The Navy will provide this information to Phase I awardees (or Phase II or Phase III awardees).
2/6/23  Q. 18. Is the primary use case to consolidate embedded system communications?
   A. 18. Yes and to reduce the duplicative development required when lacking awareness of common data available to the application (and duplicate interfaces that are app-specific).
2/6/23  Q. 19. Do you want to read data out to multiple sources, or do you want to write data to the data lake from multiple sources too? We assume the latter, but want to confirm.
   A. 19. Both.
2/6/23  Q. 20. Is OSA or similar DoD supported platform (like Platform1, Black Pearl or Spork) required?
   A. 20. No, but knowledge of the requirements of those environments is beneficial to include.
2/6/23  Q. 21. Does the design need to be proven to run inside of OpenShift with standard pod security policies in place? Would that be sufficient, or are there further restrictions in the shipboard environment? If so, where could we find a specific list?
   A. 21. This would be ideal. Shipboard specific questions would need to be deferred to the technical warrant holder at time of transition and is out of scope for Phase I.
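
Two of the 2/6/23 answers above can be pictured together: answer 5 names the common data fields (position, timestamp, unique ID, network state), and answer 3 describes intermittently connected links that update a known data model whenever connectivity returns. The Python sketch below is one possible reading, assuming a newest-timestamp-wins rule per unique ID. The type `NodeReport`, the function `merge_reports`, the field units, the status values, and all sample data are hypothetical and are not taken from the topic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeReport:
    """Hypothetical record built from the common fields named in answer 5."""
    unique_id: str
    timestamp: float      # seconds since epoch (assumed unit)
    position: tuple       # (latitude, longitude) in degrees (assumed layout)
    network_status: str   # e.g. "up" or "down" (assumed values)


def merge_reports(model, incoming):
    """Apply a batch of reports, keeping the newest report per unique_id.

    Returns the number of reports that actually changed the model.
    """
    changed = 0
    for report in incoming:
        current = model.get(report.unique_id)
        if current is None or report.timestamp > current.timestamp:
            model[report.unique_id] = report
            changed += 1
    return changed


model = {}
# First connectivity window: two nodes report in.
merge_reports(model, [
    NodeReport("node-1", 100.0, (32.70, -117.20), "up"),
    NodeReport("node-2", 101.0, (32.80, -117.30), "up"),
])
# The link drops, then reconnects and delivers a backlog out of order.
applied = merge_reports(model, [
    NodeReport("node-1", 130.0, (32.75, -117.25), "up"),
    NodeReport("node-1", 120.0, (32.72, -117.22), "down"),  # stale, ignored
    NodeReport("node-2", 101.0, (32.80, -117.30), "up"),    # duplicate, ignored
])
```

Only one of the three backlog reports changes the model in this example, so only that one would need to be passed on to subscribing applications.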
