Without effective data wrangling, the efficacy of data analytics is compromised. SigmaStream has addressed this challenge by setting up a “Data Wrangling” process. The service is implemented by a group of analytics teams, each specializing in a specific domain.
Each team combines manual techniques with utilities, tools, and products. Data quality is ensured as part of this process.
The service is available as a standalone offering in which users submit their data to the SigmaStream team, which delivers the clean data back. The cleaned data is ready to be used by any analytics engine.
In the other engagement model, the SigmaStream team is engaged as part of the SigmaStream product ecosystem setup activity. In this model, the team runs the data quality management utilities included in the YellowHammer framework. Critical channel identification and mapping is then carried out manually. Next, HummingBird is used to carry out data correction, and the corrected data is consumed by BlueCardinal to generate operational events. In this model, 80% automation is achieved after two iterations of the data wrangling activity.
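The sequence of steps in this model (quality checks, manual channel mapping, correction, event generation) can be illustrated with a minimal Python sketch. It does not use or represent the actual YellowHammer, HummingBird, or BlueCardinal interfaces. Every name in it (Reading, CHANNEL_MAP, VALID_RANGE, quality_issues, correct, events) is hypothetical, and the correction rule shown (reusing the last good value per channel) is only one assumed possibility.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Reading:
    channel: str             # channel tag (raw before correction, mapped after)
    value: Optional[float]   # None marks a missing sample


# Hypothetical, hand-maintained mapping from raw tags to critical channels
# (stands in for the manual identification and mapping step).
CHANNEL_MAP = {"ch_01": "pressure", "ch_02": "flow_rate"}

# Hypothetical valid range for each mapped channel.
VALID_RANGE = {"pressure": (0.0, 500.0), "flow_rate": (0.0, 100.0)}


def is_valid(name, value):
    """True if the value is present and inside the channel's valid range."""
    if value is None:
        return False
    low, high = VALID_RANGE[name]
    return low <= value <= high


def quality_issues(readings):
    """Return indices of readings that are unmapped, missing, or out of range."""
    bad = []
    for i, r in enumerate(readings):
        name = CHANNEL_MAP.get(r.channel)
        if name is None or not is_valid(name, r.value):
            bad.append(i)
    return bad


def correct(readings):
    """Drop unmapped channels, rename the rest, and replace each bad value
    with the last good value seen on the same channel (assumed rule)."""
    last_good = {}
    cleaned = []
    for r in readings:
        name = CHANNEL_MAP.get(r.channel)
        if name is None:
            continue  # not a critical channel
        if is_valid(name, r.value):
            last_good[name] = r.value
            cleaned.append(Reading(name, r.value))
        elif name in last_good:
            cleaned.append(Reading(name, last_good[name]))
        # A bad value with no earlier good value is dropped.
    return cleaned


def events(cleaned, limits):
    """Emit a simple event string for every corrected value above its limit."""
    return [
        f"{r.channel} above {limits[r.channel]}"
        for r in cleaned
        if r.channel in limits and r.value > limits[r.channel]
    ]
```

In a sketch like this, the only hand-maintained pieces are the mapping and range tables. Once they exist, the check and correction functions can be rerun on new data without manual effort, which is one way to picture how automation could grow over successive wrangling iterations.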