2024
Gen AI and Data Platform

Danone recognized the need to modernize its data infrastructure to improve operational efficiency. The existing Blitz Data Platform was hosted on-premises and was scheduled for decommissioning. To address this, Danone aimed to transform its data architecture by migrating to the Azure cloud platform to enhance analytical capabilities, scalability, and overall business performance.

This migration was essential not only for business continuity but also for transforming the platform into a modern, scalable solution. However, significant performance challenges emerged during the migration, particularly in handling large datasets. The system suffered from inefficiencies and long running times when processing large volumes of data: the data processing workflow previously took approximately 4 hours to complete for datasets ranging from tens to hundreds of GB in size. These issues had to be addressed to meet future demands, enable scalability, and enhance the user experience.

 

To address these challenges and modernize the data infrastructure, we implemented a two-fold solution.


Cloud Migration and Data Centralization

The first step in modernizing the Blitz Data Platform was migrating the existing ETL pipelines and data from the on-premises environment to the Azure Cloud. By centralizing the platform in the cloud, we enhanced its scalability, operational efficiency, and ease of management. This migration also provided the foundation for seamless data integration and improved collaboration across teams.


Optimized Data Processing to Address Performance Challenges

To overcome inefficiencies in handling large datasets, we applied the following strategies:

  • Cluster 'Big' Option
    Utilized a larger cluster configuration, increasing computational capacity to 28 GB RAM and 8 cores through the 'Big' cluster option. This enhancement significantly improved processing speed for large datasets.
  • Writing Data Based on Monthly and Yearly Input
    A method was added to write data only for the specified month and year in certain tables, which reduced unnecessary processing and improved system efficiency.
  • Selective Data Deletion in DMT
    A process was implemented to selectively delete data in the Datamart Schema (DMT) based on user-inputted month and year values for large tables, further optimizing resource usage and reducing running times.
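
As a rough illustration of the last two strategies, the sketch below shows one way a refresh scoped to a month and year could work: delete only the rows for the requested period, then write the new rows for that period, leaving all other periods untouched. It uses Python's built-in sqlite3 module as a stand-in for the actual warehouse. The table name dmt_sales, its columns, and the refresh_month helper are hypothetical and are not taken from the Blitz Data Platform.

```python
import sqlite3


def refresh_month(conn, year, month, amounts):
    """Replace only the rows for one (year, month) slice of a large table.

    Hypothetical helper: table and column names are assumptions for illustration.
    """
    with conn:  # single transaction: the delete and the insert commit together
        conn.execute(
            "DELETE FROM dmt_sales WHERE year = ? AND month = ?",
            (year, month),
        )
        conn.executemany(
            "INSERT INTO dmt_sales (year, month, amount) VALUES (?, ?, ?)",
            [(year, month, amount) for amount in amounts],
        )


# In-memory database standing in for the Datamart Schema (DMT).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dmt_sales (year INTEGER, month INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO dmt_sales VALUES (?, ?, ?)",
    [(2024, 1, 10.0), (2024, 2, 20.0), (2024, 2, 25.0)],
)
conn.commit()

# Reprocess February 2024 only; January is never read or rewritten.
refresh_month(conn, 2024, 2, [30.0])

result = conn.execute(
    "SELECT year, month, amount FROM dmt_sales ORDER BY year, month"
).fetchall()
print(result)
```

Because the delete and the write are limited to the user-supplied period, the amount of data touched depends on the size of that period rather than the size of the whole table, which is the source of the reduced running times described above.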


What Made the Solution Stand Out:

Our approach not only addressed immediate challenges but also laid the groundwork for sustainable improvements:

  • Public Cloud Implementation
    Migrating to Azure Cloud provided secure, scalable, and highly available data storage, eliminating the limitations of the previous on-premises setup.
  • Efficient Data Processing
    Optimized cluster configurations and selective data operations significantly improved processing times while reducing resource consumption.
  • Centralized Cloud Data Management
    Consolidating data in the cloud gave teams improved collaboration, disaster recovery capabilities, and better support for business continuity.
 

Centralized, Secured, and Scalable Data Platform

The migration to the Azure Cloud and subsequent performance optimizations delivered significant improvements across Danone’s data infrastructure.


Centralized and Scalable Data Platform

The Blitz Data Platform has been migrated to and centralized within the Azure Cloud environment, creating a modern, scalable infrastructure:

  • Centralized Data Management
    The Blitz Data Platform is consolidated within the Azure environment, simplifying access and management.
  • Scalability
    The platform is designed to handle growing data volumes with ease, ensuring it meets future business demands.
  • Ease of Maintenance
    The cloud environment has streamlined maintenance processes, allowing for faster updates and greater system reliability, enabling Danone to focus on strategic initiatives instead of infrastructure challenges.


Enhanced Performance and Efficiency

The optimization strategies significantly improved the system’s ability to handle large datasets, addressing the performance challenges identified during migration:

  • Running Times Reduction
    Running time for individual datasets within the Processing Group was reduced by nearly 70%, with an overall improvement of about 40%.
  • Scalability and Efficiency
    The system can now handle larger datasets effectively, supporting current operational needs and future scalability.
  • Improved User Experience
    Faster processing and streamlined workflows positively impacted users, enabling them to work more efficiently without long waits for data processing results.

Note: These improvements were first implemented in the DEV environment, where the Snowflake warehouse specifications are the smallest. Once the changes are deployed to the PROD environment, which has larger specifications, we expect the improvements to be even more significant and anticipate no further issues.


Business Impact

  • Faster Decision-Making
    With reduced data processing times, decision-makers can now access insights more quickly, leading to faster, more strategic actions.
  • Cost Savings
    The migration to Azure helped reduce operational costs by eliminating the need for expensive on-premises infrastructure and maintenance. Optimized data processing also minimized resource usage, leading to further savings.


Long-Term Business Benefits:

  • Scalable Infrastructure
    The cloud-based solution provides the flexibility to scale data processing capacity as Danone's data needs grow, so the infrastructure can evolve alongside future requirements.
  • Enhanced Data Security
    The Azure Cloud platform’s built-in security features offer robust protection for sensitive data, aligning with industry best practices and helping Danone meet regulatory compliance standards.
  • Continuous Optimization
    The cloud environment lays the foundation for ongoing improvements. With the potential integration of AI and machine learning capabilities, Danone can further optimize its data processing workflows, enhancing operational efficiency and staying ahead of emerging technological advancements.

 

To ensure continued operational excellence and meet growing business needs, we are exploring the following initiatives for the future:

  • Generative AI Integration
    To further maximize the potential of big data, we are exploring the benefits of integrating Generative AI with Danone's big data and business processes.
  • Improved Data Processing Algorithms
    Continuously adapting the algorithms to keep pace with database growth, so that data processing remains efficient.
  • Advanced Security Compliance
    The platform will continue to meet global security and data protection requirements (e.g., ISO 27001 and GDPR) to ensure the protection of sensitive data and compliance with evolving regulations. This is critical as we scale and handle larger data volumes.

By leveraging the scalability, performance, and security of the Azure Cloud, Danone is well-positioned to continue its digital transformation and maintain a competitive edge in the market.