The Data Ingestion Services support the automated acquisition and import of structured and unstructured data from various sources for immediate use or storage in a database. In other words, these services extract data from the source where it was created or originally stored and load it into a destination or staging area, such as long-term storage in a data warehouse or data lake.
Ingested data can be streamed in real time or processed in batches. In real-time data ingestion, each data item is imported as the source emits it. In batch ingestion, data items are imported in discrete chunks at periodic intervals. The first step in an effective data ingestion process is to prioritize the data sources. Each individual data object must then be validated and routed to the correct destination.
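The two ingestion modes, together with validation and routing, can be illustrated with a minimal Python sketch. Nothing in it comes from the taxonomy itself: the function names, the `id` and `kind` fields, and the validation rule are hypothetical, and chunking by item count stands in for the periodic time intervals a real batch scheduler would use.

```python
from __future__ import annotations

from itertools import islice
from typing import Callable, Iterable, Iterator


def ingest_stream(source: Iterable[dict], handle: Callable[[dict], None]) -> int:
    """Real-time style: pass each item on as soon as the source emits it."""
    count = 0
    for item in source:
        handle(item)
        count += 1
    return count


def ingest_batches(source: Iterable[dict], batch_size: int) -> Iterator[list[dict]]:
    """Batch style: group items into discrete chunks of at most batch_size."""
    iterator = iter(source)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def is_valid(item: dict) -> bool:
    """Hypothetical rule: an item needs an integer 'id' and a 'kind' field."""
    return isinstance(item.get("id"), int) and "kind" in item


def route(item: dict, destinations: dict[str, list], rejected: list) -> None:
    """Send a valid item to the destination named by its 'kind'; reject the rest."""
    if is_valid(item) and item["kind"] in destinations:
        destinations[item["kind"]].append(item)
    else:
        rejected.append(item)
```

In this sketch, items that fail validation or name an unknown destination are collected in `rejected` rather than silently dropped, so they can be inspected later.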
A simple data ingestion pipeline might apply one or more light transformations, such as enriching or filtering data, before writing it to a set of destinations, such as a data store or a message queue. More complex transformations, such as joins, aggregates, and sorts for specific analytics, applications, and reporting systems, may be done with additional Data Processing pipelines.
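A simple pipeline of this kind might look like the following Python sketch. It is an illustration under stated assumptions, not a prescribed design: the `payload` and `source` fields, the `"sensor-feed"` tag, and all function names are hypothetical. A plain list stands in for the data store and a standard-library `queue.Queue` for the message queue.

```python
from __future__ import annotations

from queue import Queue
from typing import Callable, Iterable, Optional

Record = dict
# A transform returns the (possibly modified) record, or None to drop it.
Transform = Callable[[Record], Optional[Record]]
Sink = Callable[[Record], None]


def drop_empty_payload(record: Record) -> Optional[Record]:
    """Filtering step: discard records with a missing or empty 'payload'."""
    return record if record.get("payload") else None


def add_source_tag(record: Record) -> Optional[Record]:
    """Enrichment step: attach a (hypothetical) source label to the record."""
    return {**record, "source": "sensor-feed"}


def run_pipeline(
    records: Iterable[Record],
    transforms: list[Transform],
    sinks: list[Sink],
) -> int:
    """Apply each transform in order, then write surviving records to every sink."""
    written = 0
    for record in records:
        current: Optional[Record] = record
        for transform in transforms:
            current = transform(current)
            if current is None:
                break  # a filter dropped the record
        if current is None:
            continue
        for sink in sinks:
            sink(current)
        written += 1
    return written


# Usage: a list plays the data store, a Queue plays the message queue.
data_store: list[Record] = []
message_queue: Queue = Queue()
incoming = [{"payload": "a"}, {"payload": ""}, {"payload": "b"}]
written = run_pipeline(
    incoming,
    [drop_empty_payload, add_source_tag],
    [data_store.append, message_queue.put],
)
```

Keeping each transformation as a small function that returns either a record or `None` leaves heavier operations such as joins and aggregates out of the ingestion path, in line with their assignment to separate Data Processing pipelines.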
UUID | 7eb64aae-db21-4415-b39a-3fae9d2d3aed |
stereotype | Taxonomy Element |
C3T UUID | 7eb64aae-db21-4415-b39a-3fae9d2d3aed |
C3T URL | https://tide.act.nato.int/mediawiki/taxonomy/index.php/CR-1018 |
C3T Version | Generated from the Taxonomy Wiki on 8 December 2022 |
C3T Date | 8 December 2022 |
Creator | HQ SACT |
Publisher | HQ SACT |
Classification | Unmarked |
Policy Identifier | Public |