CHALLENGE
Scalable and Centralized Data Transformation
Danone recognized the need to modernize their data infrastructure to improve operational efficiency. The existing Blitz Data Platform was hosted on-premises and was scheduled for decommissioning. To address this, Danone aimed to transform their data architecture by migrating to a cloud platform—Azure—to enhance analytical capabilities, scalability, and overall business performance.
This migration was essential not only for business continuity but also for transforming the platform into a modern, scalable solution. However, significant performance challenges emerged during the migration, particularly in handling large datasets. The system struggled with inefficiencies and long running times when processing large volumes of data: the existing workflow took approximately 4 hours to process datasets ranging from tens to hundreds of GB. These issues had to be addressed to meet future demands, enable scalability, and enhance the user experience.
SOLUTION
Cloud-Based Data Centralization and Processing
To address these challenges and modernize the data infrastructure, we implemented a two-fold solution.
Cloud Migration and Data Centralization
The first step in modernizing the Blitz Data Platform was migrating the existing ETL pipelines and data from the on-premises environment to the Azure Cloud. By centralizing the platform in the cloud, we enhanced its scalability, operational efficiency, and ease of management. This migration also provided the foundation for seamless data integration and improved collaboration across teams.
Optimized Data Processing to Address Performance Challenges
To overcome inefficiencies in handling large datasets, we applied the following strategies:
- Cluster 'Big' Option
Utilized a larger cluster configuration, increasing computational capacity to 28GB RAM and 8 cores through the 'Big' cluster option. This enhancement significantly improved processing speed for large datasets.
- Writing Data Based on Monthly and Yearly Input
A method was added to write data only for the specified month and year in certain tables, which reduced unnecessary processing and improved system efficiency.
- Selective Data Deletion in DMT
A process was implemented to selectively delete data in the Datamart Schema (DMT) based on user-inputted month and year values for large tables, further optimizing resource usage and reducing running times.
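The two write-side strategies above amount to an idempotent "delete the target month, then write only that month" pattern: each run touches a single month/year slice of a large table instead of reprocessing everything. The sketch below illustrates the idea in Python with the standard-library sqlite3 module; the table name `dmt_sales` and its columns are hypothetical placeholders, and the production pipeline targets Snowflake rather than SQLite.

```python
import sqlite3

def load_month(conn, table, rows, year, month):
    """Idempotent monthly load: delete the target month's slice, then
    insert only that slice. Illustrative sketch (sqlite3 stands in for
    the Snowflake DMT schema; schema and names are assumptions)."""
    cur = conn.cursor()
    # Selective deletion: only the partition being reloaded is removed,
    # so the rest of the large table is untouched.
    cur.execute(f"DELETE FROM {table} WHERE year = ? AND month = ?", (year, month))
    # Write data only for the specified month and year instead of
    # rewriting the full table.
    cur.executemany(
        f"INSERT INTO {table} (year, month, amount) VALUES (?, ?, ?)",
        [(year, month, amount) for amount in rows],
    )
    conn.commit()

# Usage: reloading March 2024 twice leaves exactly one copy of its rows,
# and the February partition is never touched.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dmt_sales (year INT, month INT, amount REAL)")
conn.execute("INSERT INTO dmt_sales VALUES (2024, 2, 10.0)")  # other partition
load_month(conn, "dmt_sales", [100.0, 200.0], 2024, 3)
load_month(conn, "dmt_sales", [100.0, 200.0], 2024, 3)  # rerun: no duplicates
print(conn.execute("SELECT COUNT(*) FROM dmt_sales").fetchone()[0])  # → 3
```

Because the delete is scoped by the user-supplied month and year, reruns are safe and the per-run work scales with the size of one month's data rather than the whole table.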
What Made the Solution Stand Out:
Our approach not only addressed immediate challenges but also laid the groundwork for sustainable improvements:
- Public Cloud Implementation
Migrating to Azure Cloud provided secure, scalable, and highly available data storage, eliminating the limitations of the previous on-premises setup.
- Efficient Data Processing
Optimized cluster configurations and selective data operations significantly improved processing times while reducing resource consumption.
- Centralized Cloud Data Management
By consolidating data in the cloud, teams benefited from improved collaboration, disaster recovery capabilities, and better support for business continuity.
RESULT
Centralized, Secured, and Scalable Data Platform
The migration to the Azure Cloud and subsequent performance optimizations delivered significant improvements across Danone’s data infrastructure.
Centralized and Scalable Data Platform
The Blitz Data Platform has been migrated and centralized within the Azure Cloud environment, creating a modern, scalable infrastructure:
- Centralized Data Management
The Blitz Data Platform is now consolidated within the Azure environment, simplifying access and management.
- Scalability
The platform is designed to handle growing data volumes with ease, ensuring it meets future business demands.
- Ease of Maintenance
The cloud environment has streamlined maintenance processes, allowing for faster updates and greater system reliability, enabling Danone to focus on strategic initiatives instead of infrastructure challenges.
Enhanced Performance and Efficiency
The optimization strategies significantly improved the system’s ability to handle large datasets, addressing the performance challenges identified during migration:
- Running Times Reduction
Running time for individual datasets within the Processing Group was reduced by nearly 70%, with an overall improvement of about 40%.
- Scalability and Efficiency
The system can now handle larger datasets effectively, supporting current operational needs and future scalability.
- Improved User Experience
Faster processing and streamlined workflows positively impacted users, enabling them to work more efficiently without long waits for data processing results.
Note: These improvements were first implemented in the DEV environment, where the Snowflake warehouse specifications are the smallest. Once deployed to the PROD environment with larger specifications, the improvements will be even more significant, and we expect no further issues.
Business Impact
- Faster Decision-Making
With reduced data processing times, decision-makers can now access insights more quickly, leading to faster, more strategic actions.
- Cost Savings
The migration to Azure helped reduce operational costs by eliminating the need for expensive on-premise infrastructure and maintenance. Optimized data processing also minimized resource usage, leading to further savings.
Long-Term Business Benefits:
- Scalable Infrastructure
The cloud-based solution provides the flexibility to scale data processing capacity as Danone's data needs grow, enabling the infrastructure to evolve alongside Danone's future requirements.
- Enhanced Data Security
The Azure Cloud platform’s built-in security features offer robust protection for sensitive data, aligning with industry best practices and helping Danone meet regulatory compliance standards.
- Continuous Optimization
The cloud environment lays the foundation for ongoing improvements. With the potential integration of AI and machine learning capabilities, Danone can further optimize its data processing workflows, enhancing operational efficiency and staying ahead of emerging technological advancements.
FUTURE ROADMAP
To ensure continued operational excellence and meet growing business needs, we are exploring the following initiatives for the future:
- Generative AI Integration
To further maximize big data potential, we will explore the benefits of integrating Generative AI with Danone's big data and business processes.
- Improved Data Processing Algorithms
Continuously scaling processing algorithms in step with database growth to keep data processing efficient as volumes increase.
- Advanced Security Compliance
The platform will continue to meet global security standards (e.g., ISO 27001 and GDPR) to ensure the protection of sensitive data and compliance with evolving regulatory requirements. This is critical as we scale and handle larger data volumes.
By leveraging the scalability, performance, and security of the Azure Cloud, Danone is well-positioned to continue its digital transformation and maintain a competitive edge in the market.
TESTIMONIAL
"The running time reduction by nearly 70% for individual datasets and around 40% overall has significantly boosted Blitz efficiency. The improvements in the DEV environment already show great promise, and we're confident that once deployed to production, the impact will be even more substantial. This has been a game-changer for the overall solution, all kudos to Sarah, Wiby and Xtremax Team for making it happened." - Henri Denafel, SR Data Analyst