We wanted to be more efficient and organised even when the data we were dealing with was not of high quality. Another challenge we encountered in upgrading our data operations was that every customer had different expectations, so every project was unique. We started to map and label the data activities in each of our projects, and after some time we were able to put it all together. Our data journey story became an important milestone in our overall data strategy and helps us navigate the complex projects we run for our customers.
In this article, we would like to take you on an 8-step data journey where you will discover our approach to tackling data projects.
So let’s begin!
1. Data Discovery
Once the key goals of a project are established, the first step for us is data discovery. By knowing the questions that our customers have, we know our destination. We then need to discover the data sources best suited to answering those questions. Using our domain knowledge, we examine our rich, curated database and other reliable and accurate data sources that can provide the results our customers need. We are not scared of any data, be it structured, semi-structured or unstructured. We can handle anything!
2. Data Gathering
With courage, we then begin to gather all the data we will need to reach our destination. We know that sometimes we will need to combine heterogeneous data sources and formats that were never designed to be combined.
We use cutting-edge technologies to gather data from external sources, from secure data collection portals to self-validating data entry tools. We also assist our customers with data collection, providing guidance through this often challenging process.
The techniques used to do this have been tried and tested over many decades. However, advanced technologies such as genomic sequencing, machine learning and the internet of things have become far more mainstream in recent years, and there is now an opportunity to benefit from them within the food manufacturing environment.
3. Data Profiling
As we move through the data gathering phase, we closely examine all gathered data. We know how important data quality is. Drawing on experience from all our previous data journeys, we have developed a methodology to assess “data health” so that we can advise our clients on any challenges that might arise with data obtained from external sources.
We also help our customers to increase the quality of their data by examining it closely and applying procedures that test for data completeness, stability and accuracy.
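As an illustration of what a completeness test can look like, here is a minimal sketch in Python. It is not our actual "data health" methodology; the function name `completeness`, the column names and the sample records are all hypothetical and chosen only for this example. It reports, for each column, the share of records that carry a usable value.

```python
from typing import Any


def completeness(rows: list[dict[str, Any]], columns: list[str]) -> dict[str, float]:
    """Return, for each column, the fraction of rows with a non-missing value."""
    if not rows:
        return {col: 0.0 for col in columns}
    report = {}
    for col in columns:
        # Treat absent keys, None and empty strings as missing (an assumption).
        present = sum(1 for row in rows if row.get(col) not in (None, ""))
        report[col] = present / len(rows)
    return report


# Hypothetical sample records, for illustration only.
records = [
    {"product": "yoghurt", "portion_g": 125, "country": "IE"},
    {"product": "cereal", "portion_g": None, "country": "FR"},
    {"product": "juice", "portion_g": 200, "country": ""},
    {"product": "bread", "portion_g": 40, "country": "DE"},
]

report = completeness(records, ["product", "portion_g", "country"])
for column, share in report.items():
    print(f"{column}: {share:.0%} complete")
```

Stability and accuracy checks could follow the same pattern, for example by comparing a column's summary statistics between deliveries or against a trusted reference.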
4. Data Modelling
Data modelling is the step where we work on the data structure. We collaborate across our teams to draw data maps, and we build logical and physical data models that translate our customers’ data requirements into functional schemas. Things are starting to take shape!
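To make the difference between a logical and a physical model concrete, here is a small sketch using only the Python standard library. The `Measurement` entity, its fields and the table layout are hypothetical examples, not a schema from one of our projects. The dataclass plays the role of the logical model, and the SQLite table definition is one possible physical model for it.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Measurement:
    """Logical model: what a measurement is, independent of storage."""
    sample_id: str
    analyte: str
    value: float
    unit: str


# Physical model: how the same entity is stored in a relational database.
DDL = """
CREATE TABLE measurement (
    sample_id TEXT NOT NULL,
    analyte   TEXT NOT NULL,
    value     REAL NOT NULL,
    unit      TEXT NOT NULL,
    PRIMARY KEY (sample_id, analyte)
)
"""

conn = sqlite3.connect(":memory:")  # in-memory database, for illustration
conn.execute(DDL)

m = Measurement("S-001", "sodium", 1.2, "g/100g")  # hypothetical record
conn.execute(
    "INSERT INTO measurement VALUES (?, ?, ?, ?)",
    (m.sample_id, m.analyte, m.value, m.unit),
)
stored = conn.execute(
    "SELECT sample_id, analyte, value, unit FROM measurement"
).fetchall()
print(stored)
```

The composite primary key is a design choice for this sketch: it states that each sample has at most one value per analyte, which is the kind of requirement a data model is meant to capture.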
5. Data Wrangling
Our Data Engineers roll up their sleeves for data wrangling. Data is checked, decoded, translated, cleaned, sorted, ordered, aggregated, joined, transposed, pivoted, split, merged and validated. We specialise in building data pipelines and complex ETL processes, which serve as the foundation for further detailed analysis and predictive modelling.
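A few of these operations (cleaning, validating, joining and aggregating) can be sketched in a handful of lines of Python. This is a simplified illustration rather than one of our pipelines; the records, the `food_groups` lookup and the function names `clean` and `wrangle` are hypothetical.

```python
from collections import defaultdict

# Hypothetical raw intake records with messy codes and a bad value.
intake = [
    {"subject": "s1", "food_code": "A1 ", "grams": "120"},
    {"subject": "s1", "food_code": "b2", "grams": "30"},
    {"subject": "s2", "food_code": "a1", "grams": "80"},
    {"subject": "s2", "food_code": "zz", "grams": "n/a"},
]

# Hypothetical lookup table used for the join step.
food_groups = {"A1": "dairy", "B2": "cereal"}


def clean(row):
    """Normalise the food code and parse grams; return None if invalid."""
    code = row["food_code"].strip().upper()
    try:
        grams = float(row["grams"])
    except ValueError:
        return None
    return {"subject": row["subject"], "food_code": code, "grams": grams}


def wrangle(rows, lookup):
    """Clean each row, join it to its food group, and total grams per group."""
    totals = defaultdict(float)
    for raw in rows:
        row = clean(raw)
        if row is None or row["food_code"] not in lookup:
            continue  # validation: drop unparseable or unknown records
        totals[lookup[row["food_code"]]] += row["grams"]
    return dict(sorted(totals.items()))  # sorted by food group name


totals = wrangle(intake, food_groups)
print(totals)
```

In a real pipeline, rejected records would normally be logged and reported back rather than silently dropped.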
6. Data Mining, Machine Learning and Statistical Modelling
Now we’re ready for the mountain top. Our Data Scientists work full steam to design and build predictive models that will give our customers the insights and predictions that they were looking for. These models and data sets are then deployed in our collaborative data science platform, Expert Models, so that our clients can access the model, run scenarios and develop predictions from the comfort of their own office chair.
7. Data Visualisation
With all the answers that our customers need in hand, we can proceed to visualise the data and results in our Expert Models platform and create scientific reports full of clear explanations, graphs and charts. We hold ourselves to high standards in delivering visualisation tools and final reports for our customers. We translate the results of our models into meaningful insights that our customers can understand, so that they can make the best possible decisions. This phase marks the completion of the data journey.
8. Data Storage
Our cloud-based data storage solutions support us all the way. Data journeys succeed at Creme Global because of our platform’s reliable architecture, its handling of data variety and growing volumes of data, and its high security standards. We continuously look for better solutions that will enable our data science platform to support the growing demands of large datasets and complex calculations.