The information explosion and the rise of the Internet of Things are creating new business opportunities. However, to exploit these opportunities, organizations face new technological challenges. Traditional tools may not be adequate to meet demands for scalability, performance, fault tolerance, or real-time processing.
Timely and relevant information can fuel significant insights: consumer or competitor patterns and behavior, anomaly detection, supply chain optimization, and more. Developing a data ingestion strategy is therefore of great importance. Data ingestion is the process of loading small or large amounts of data, mainly for analytical needs, from a variety of sources arriving in different formats and at different speeds.
Such a solution is necessary because, in data management, the amount of data is not the only thing that matters. The speed at which data are produced and the time needed to process them are also important factors: the sooner we have the information, the sooner we can act on it.
A modern data ingestion platform should be able to:
- Integrate heterogeneous sources.
- Process small as well as large volumes of data.
- Process heterogeneous formats.
- Accommodate data arriving at different speeds.
- Support heterogeneous data sinks (Polyglot Persistence).
- Apply complex logic and correlate data with new or existing information.
- Support pull and push mechanisms.
- Be secure, fast, scalable and resilient.
- Be capable of running jobs on premises and/or in the cloud.
- Support version control.
- Continuously ingest data using a unified model.
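As an illustration only (not the platform's actual code), a minimal Python sketch can show how several of these capabilities fit together: heterogeneous sources and formats normalized through pluggable parsers, with every record fanned out to multiple sinks (Polyglot Persistence). All names here (`IngestionPipeline`, `register_format`, `add_sink`, `ingest`) are hypothetical.

```python
import csv
import io
import json
from typing import Callable, Dict, List

# A parser turns one raw payload into a list of dict records.
Parser = Callable[[str], List[dict]]

def parse_json_lines(payload: str) -> List[dict]:
    """Parse newline-delimited JSON into records."""
    return [json.loads(line) for line in payload.splitlines() if line.strip()]

def parse_csv(payload: str) -> List[dict]:
    """Parse CSV (with a header row) into records."""
    return list(csv.DictReader(io.StringIO(payload)))

class IngestionPipeline:
    """Pulls from heterogeneous sources and pushes to heterogeneous sinks."""

    def __init__(self) -> None:
        self.parsers: Dict[str, Parser] = {}
        self.sinks: List[Callable[[dict], None]] = []

    def register_format(self, fmt: str, parser: Parser) -> None:
        self.parsers[fmt] = parser

    def add_sink(self, sink: Callable[[dict], None]) -> None:
        self.sinks.append(sink)

    def ingest(self, fmt: str, payload: str) -> int:
        """Parse a payload and fan each record out to every sink."""
        records = self.parsers[fmt](payload)
        for record in records:
            for sink in self.sinks:
                sink(record)
        return len(records)

# Usage: one pipeline, two formats, two sinks.
pipeline = IngestionPipeline()
pipeline.register_format("jsonl", parse_json_lines)
pipeline.register_format("csv", parse_csv)

analytics_store: List[dict] = []
search_index: List[dict] = []
pipeline.add_sink(analytics_store.append)
pipeline.add_sink(search_index.append)

pipeline.ingest("jsonl", '{"id": 1, "source": "iot"}\n{"id": 2, "source": "web"}')
pipeline.ingest("csv", "id,source\n3,crm\n")
print(len(analytics_store))  # prints 3
```

A real platform would of course replace the in-memory lists with actual stores (a search engine, a warehouse, a key-value store), but the fan-out shape stays the same.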
We designed a data ingestion solution and called it the Unified Continuous Ingestion Platform. It is a NoETL solution that embraces the fast data approach: instead of treating data as a big silo to be processed over several hours, we treat it as a continuous stream, flowing quickly and delivering meaningful, up-to-date information.
Furthermore, our solution accommodates data arriving at different speeds, supports heterogeneous data sinks (Polyglot Persistence), and applies complex logic to correlate incoming data with new or existing information.
Our platform is:
- unified: it uses the same language and the same technology for both batch and streaming; it is a more advanced form of the delta architecture and overcomes the problems of the lambda architecture, since it does not require two different languages, two technologies, or making every change twice;
- continuous: there is no task scheduler; the platform is active 24/7. It is fault tolerant, restarting automatically after a problem is detected, and it decides autonomously on which "track" to send data (like a railway system);
- ingestion: it is versionable; it can anticipate how data will be processed, and therefore anticipate information and decisions; and finally, it improves the retrieval, processing, and presentation of data.
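The "unified" property above can be sketched in a few lines of Python. This is an illustrative toy, not the platform's implementation: the same business-logic function (`enrich`, a hypothetical name) serves both a batch path over a bounded dataset and a streaming path over a lazy iterator, so the logic is written once rather than duplicated across a batch layer and a speed layer as in the lambda architecture.

```python
from typing import Dict, Iterable, Iterator, List

def enrich(record: Dict[str, int]) -> Dict[str, int]:
    """Single business-logic definition shared by both execution modes."""
    return {**record, "value_doubled": record["value"] * 2}

def run_batch(records: Iterable[Dict[str, int]]) -> List[Dict[str, int]]:
    """Batch mode: process a bounded dataset eagerly."""
    return [enrich(r) for r in records]

def run_stream(records: Iterable[Dict[str, int]]) -> Iterator[Dict[str, int]]:
    """Streaming mode: process records lazily, one at a time."""
    for r in records:
        yield enrich(r)

# Same input, two execution modes, one code path for the logic.
batch_out = run_batch([{"value": 1}, {"value": 2}])
stream_out = list(run_stream(iter([{"value": 1}, {"value": 2}])))
assert batch_out == stream_out
```

Engines such as Spark or Flink apply this idea at scale; the point here is only that a change to `enrich` takes effect in both modes at once, which is exactly what the lambda architecture's duplicated codebases cannot guarantee.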