The client is a leading health & fitness club brand with a worldwide presence: 5000+ franchised locations in more than 50 countries, including the USA, Canada, Belgium, the Netherlands, Poland, Luxembourg, the UK, Qatar, India, Singapore, Australia, and Japan.
Each country/geography had its own data sources, such as individual CRMs, individual accounting systems, and club management software.
In effect, there were different versions and silos of data that had to be analyzed, combined, and cleansed before the business could use them for any analytics. The client needed a single verifiable source of data for all analytics across the organization.
Data discovery was another problem: there was no central data catalog, and sources often conflicted with one another when providing similar information.
But the biggest problem of all was that the consumer data had a large number of duplicates.
Solution–Data Management & Analytics Solution
Contata was engaged as an extended team of experts to work with the stakeholders and the database administrators of the different systems to profile and analyze the data and establish data flows between data sources. Contata provided a complete data management & analytics solution, with dashboards and reports that let stakeholders analyze customer activity and revenue generated and get a quick, at-a-glance view of franchised location performance.
Data was extracted from both domestic and international data sources (>30 countries) and stored in Azure Data Lake using both Azure Functions and Azure Data Factory pipelines. Selected data was then promoted to an Azure SQL Data Warehouse, which served as the single verified source of truth for all enterprise reporting.
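To make the lake-to-warehouse flow concrete, here is a minimal sketch in Python. It is not the pipeline that was built: the folder layout, the `raw_lake_path` and `promote` helpers, and the column names are all hypothetical, chosen only to illustrate landing raw extracts by country and source and then promoting a selected subset of fields.

```python
from datetime import date
from pathlib import PurePosixPath

# Hypothetical set of columns selected for the warehouse; the real model is not described.
CURATED_COLUMNS = ("member_id", "country", "club_id", "revenue")


def raw_lake_path(country: str, source: str, run_date: date, filename: str) -> PurePosixPath:
    """Build a partitioned lake path for one raw extract (assumed layout)."""
    return PurePosixPath(
        "raw",
        country.lower(),
        source.lower(),
        f"{run_date:%Y/%m/%d}",  # one folder per year/month/day
        filename,
    )


def promote(rows: list[dict]) -> list[dict]:
    """Keep only the curated columns, skipping rows that lack any of them."""
    return [
        {col: row[col] for col in CURATED_COLUMNS}
        for row in rows
        if all(col in row for col in CURATED_COLUMNS)
    ]


path = raw_lake_path("US", "CRM", date(2024, 1, 5), "members.csv")
curated = promote([
    {"member_id": 1, "country": "US", "club_id": 9, "revenue": 10.0, "notes": "x"},
    {"member_id": 2},  # incomplete row, not promoted
])
```

Partitioning raw data by country, source, and date keeps each extract traceable to its origin, while the promotion step decides what reaches the warehouse.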
Personal attributes of consumers were used to de-duplicate consumer records across countries and zones using an ML algorithm. Confidential data (PII/PHI/PCI) was secured both at rest and in transit.
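The ML algorithm used for de-duplication is not described here. As a rough illustration of the idea only, the sketch below swaps in a much simpler technique: a weighted similarity score over a few personal attributes, with pairs above a threshold flagged as likely duplicates. The `Consumer` fields, the weights, and the threshold are all assumptions.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from itertools import combinations


@dataclass(frozen=True)
class Consumer:
    consumer_id: str
    name: str
    email: str
    birth_date: str  # ISO date string, e.g. "1990-04-12"
    country: str


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so formatting differences do not matter."""
    return " ".join(text.lower().split())


def similarity(a: Consumer, b: Consumer) -> float:
    """Weighted score in [0, 1]; the weights are illustrative, not tuned."""
    name_score = SequenceMatcher(None, normalize(a.name), normalize(b.name)).ratio()
    email_score = 1.0 if normalize(a.email) == normalize(b.email) else 0.0
    dob_score = 1.0 if a.birth_date == b.birth_date else 0.0
    return 0.5 * name_score + 0.3 * email_score + 0.2 * dob_score


def find_duplicates(consumers: list[Consumer], threshold: float = 0.85) -> list[tuple[str, str]]:
    """Compare every pair and return the IDs of pairs scoring at or above the threshold."""
    return [
        (a.consumer_id, b.consumer_id)
        for a, b in combinations(consumers, 2)
        if similarity(a, b) >= threshold
    ]


records = [
    Consumer("US-1", "Jane Doe", "jane.doe@example.com", "1990-04-12", "US"),
    Consumer("UK-7", "Jane  DOE", "Jane.Doe@example.com", "1990-04-12", "UK"),
    Consumer("IN-3", "John Smith", "john@example.com", "1985-01-01", "IN"),
]
duplicates = find_duplicates(records)
```

Comparing every pair is quadratic in the number of records, so at real scale a blocking step (for example, comparing only records that share a birth date) would normally come first.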
Daily ETL runs were monitored via an automated operations dashboard and notifications were sent using Azure alerts.
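The monitoring logic can be pictured as a simple rule over the day's run records. The sketch below is an assumption-laden illustration rather than the actual dashboard or alert configuration: the `PipelineRun` shape, the status strings, and the two-hour limit are hypothetical, and no Azure API is called.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class PipelineRun:
    pipeline: str
    status: str  # assumed values: "Succeeded", "Failed", "InProgress"
    duration: timedelta


def runs_needing_alert(
    runs: list[PipelineRun],
    max_duration: timedelta = timedelta(hours=2),  # hypothetical limit
) -> list[str]:
    """Return one message per run that failed or exceeded the duration limit."""
    alerts = []
    for run in runs:
        if run.status == "Failed":
            alerts.append(f"{run.pipeline}: run failed")
        elif run.duration > max_duration:
            alerts.append(f"{run.pipeline}: exceeded the duration limit")
    return alerts


todays_runs = [
    PipelineRun("crm_us_daily", "Succeeded", timedelta(minutes=40)),
    PipelineRun("accounting_uk_daily", "Failed", timedelta(minutes=5)),
    PipelineRun("clubs_in_daily", "Succeeded", timedelta(hours=3)),
]
alerts = runs_needing_alert(todays_runs)
```

In practice, the messages would feed a notification channel instead of a list, but the rule itself (fail or overrun triggers an alert) stays the same.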
Duplicates in historical data were identified by de-dupe algorithms.
Man-hours per day saved across the organization by the Power BI reports and certified datasets provided for self-service BI.
Sources of data provisioned into the data lake and catalogued for self-service BI.