An Azure data mesh platform with PaaS components that brings together data from many different sources. This allows Nutreco to make data-driven choices, optimize processes, and respond more quickly to changes in the market and in customer demand. Curious how we did this? Read on.
Nutreco, an internationally operating animal feed producer, wants to respond more quickly to changes in the market. Its customers are asking for higher yields and, at the same time, lower use of antibiotics and a better balance in the use of raw materials, taking food scarcity into account. To address this, Nutreco is pushing for rapidly growing use of data, analyses, and algorithms. In addition, its customers want to use more and more product data to optimize their own processes.
Timeliness, accuracy, availability, and reliability of data are the foundation on which these new digital products are built.
Nutreco initiated a data strategy with the aim of getting more value from the data and knowledge present in its own organization and in external data sources. This strategy resulted in the need for a data ecosystem consisting of data use cases, data governance, and a data policy that can guarantee the quality of data. Nutreco implements the strategy by realizing a data platform and data products, in collaboration with internal data analysts and external data consumers throughout the value chain. A reliable data platform, uniform use of master data management, reliable service levels, and high-quality datasets are essential for the value of these data products.
Therefore, Nutreco asked the AMIS Conclusion team to support them in designing and implementing the data platform and in increasing the quality, speed, timeliness, and availability of the data, in order to improve the use of data throughout the company. To this end, AMIS Conclusion created a data platform using Azure PaaS components, based on the data mesh principle. The data platform provides a generic data processing tool that serves all business units in harvesting, processing, enriching, and publishing actionable datasets for their users. In addition, the platform enables the animal feed producer to quickly deliver new data products.
Below you will find some practical results of this data platform.
The platform:
The data platform is built with Microsoft Azure PaaS and SaaS components. We designed a solution that is suitable for incremental extraction, loading, and transformation (ELT) of source data.
The solution uses Azure Data Factory (ADF) to automate the ELT pipelines. Using pre-built ADF connectors, we connect many different sources, such as ERP and CRM systems, relational and NoSQL databases, and SharePoint. The resulting raw dataset is stored in Azure Data Lake.
The next stage in the pipeline runs automatically to clean, filter, enrich, and transform the data. This logic is largely written in Python and runs in Azure Databricks, orchestrated from ADF. The resulting enriched data is written back to Azure Data Lake. Databricks is a leading cloud-based data engineering tool for processing and transforming massive amounts of data and for exploring that data with machine learning models.
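The article does not say how the incremental extraction is implemented. One common approach is a high-watermark: each run picks up only the rows changed since the previous run. The sketch below illustrates that idea in plain Python. The function name, the `modified_at` field, and the sample rows are all hypothetical and are not taken from the Nutreco platform or from ADF itself.

```python
from datetime import datetime


def extract_incremental(rows, last_watermark):
    """Return the rows changed after last_watermark and the new watermark.

    `rows` stands in for a source table; `modified_at` is an assumed
    change-tracking column.
    """
    new_rows = [r for r in rows if r["modified_at"] > last_watermark]
    # If nothing changed, keep the old watermark so the next run starts there.
    new_watermark = max(
        (r["modified_at"] for r in new_rows), default=last_watermark
    )
    return new_rows, new_watermark


# Hypothetical source data for illustration only.
source_rows = [
    {"id": 1, "modified_at": datetime(2023, 1, 1)},
    {"id": 2, "modified_at": datetime(2023, 1, 5)},
    {"id": 3, "modified_at": datetime(2023, 1, 9)},
]

batch, watermark = extract_incremental(source_rows, datetime(2023, 1, 3))
```

Storing the returned watermark after each successful run means a rerun with no source changes yields an empty batch, which keeps the raw layer free of duplicate loads.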
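As a rough illustration of the clean, filter, and enrich steps, here is a minimal sketch in plain Python. It is not the platform's actual Databricks code. The field names (`product_code`, `quantity_kg`), the validation rules, and the reference lookup are assumptions made for the example.

```python
def clean(record):
    """Trim string values and turn empty strings into None."""
    cleaned = {}
    for key, value in record.items():
        if isinstance(value, str):
            value = value.strip() or None
        cleaned[key] = value
    return cleaned


def is_valid(record):
    """Keep records that have a product code and a non-negative quantity."""
    quantity = record.get("quantity_kg")
    return (
        record.get("product_code") is not None
        and quantity is not None
        and quantity >= 0
    )


def enrich(record, product_names):
    """Add a readable product name from a (hypothetical) master data lookup."""
    name = product_names.get(record["product_code"], "unknown")
    return {**record, "product_name": name}


def run_stage(raw_records, product_names):
    """Clean, then filter, then enrich each raw record."""
    cleaned = (clean(r) for r in raw_records)
    valid = (r for r in cleaned if is_valid(r))
    return [enrich(r, product_names) for r in valid]


# Hypothetical raw records for illustration only.
raw = [
    {"product_code": " F100 ", "quantity_kg": 250},
    {"product_code": "", "quantity_kg": 40},
    {"product_code": "F200", "quantity_kg": -5},
]
curated = run_stage(raw, {"F100": "Starter feed"})
```

Keeping each step as a small, separate function makes it easier to test them one at a time and to reuse them across datasets.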
When all transformations are complete, data is retained in the user tier. Here, users and/or applications have fine-grained access to data that they are allowed to see. Users can search for and learn about data assets in the data catalog with details such as attribute definitions and dependencies, data lineage and quality, ownership, and sensitivity of data assets.
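To make the idea of fine-grained access combined with catalog metadata more concrete, the sketch below models a catalog entry with a sensitivity label per column and returns only the columns a user is cleared to see. The class names, the sensitivity levels, and the clearance model are hypothetical. The article does not describe how the platform enforces access.

```python
from dataclasses import dataclass

# Assumed ordering of sensitivity labels, lowest to highest.
LEVELS = {"public": 0, "internal": 1, "confidential": 2}


@dataclass(frozen=True)
class Column:
    name: str
    definition: str
    sensitivity: str


@dataclass(frozen=True)
class CatalogEntry:
    dataset: str
    owner: str
    columns: tuple


def visible_columns(entry, clearance):
    """List the column names whose sensitivity is within the clearance."""
    limit = LEVELS[clearance]
    return [c.name for c in entry.columns if LEVELS[c.sensitivity] <= limit]


def project(rows, entry, clearance):
    """Return the rows restricted to the columns the user may see."""
    allowed = visible_columns(entry, clearance)
    return [{name: row[name] for name in allowed} for row in rows]


# Hypothetical catalog entry and data for illustration only.
entry = CatalogEntry(
    dataset="feed_sales",
    owner="sales-domain-team",
    columns=(
        Column("product_code", "Internal product identifier", "public"),
        Column("quantity_kg", "Delivered quantity in kilograms", "internal"),
        Column("customer_name", "Name of the buying customer", "confidential"),
    ),
)
rows = [{"product_code": "F100", "quantity_kg": 250, "customer_name": "Farm A"}]
internal_view = project(rows, entry, "internal")
```

Driving the access check from the same metadata that the catalog shows keeps column definitions, ownership, and sensitivity in one place.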
With this platform, data users have access to a wide range of (master) datasets that were previously difficult to obtain. Data scientists can now use regularly updated datasets in their business intelligence, advanced analytics, or machine learning models. The available datasets are not limited to internal financial data; they can also include operational, production, logistics, sales, and inventory data, as well as external data sources such as IoT sensor data.
The data platform has a wide variety of stakeholders within Nutreco, and the data comes from many different sources. This made standardization of the sources and master data a challenge. The data requirements were also still changing during development, which made the project more complicated. Using an agile and iterative approach, AMIS Conclusion was able to demonstrate progress in all areas and deliver tangible results within the first few months. These results supported the organization's confidence in the platform, and the number of datasets in the platform gradually increased.
AMIS Conclusion designed and implemented the data platform using current best practices and design principles for cloud-based data platforms. Implementing the platform as code made deploying and promoting changes reliable and robust.
AMIS Conclusion makes a distinctive difference in realizing the data market and the data platform by deploying engineers who can see the added value and the use cases on top of this platform. This makes everyone who works on the platform keen to get business value out of the datasets as quickly as possible. This mindset also helps in making the right architecture decisions and in staying focused on adding value to use cases. This way of working made the team very productive and flexible in following new use cases and user requirements.
Retail, food, and agri is one of the markets in which the Conclusion ecosystem can make a real impact. Interested in learning more about our joint services in retail, food, and agri?
Want to learn more?
Plan a meeting with Robbrecht
Head of IoT at AMIS Conclusion