Data is the new pot of gold for companies that play in a competitive landscape such as retail, and for an organisation like Coles Group, it’s a massive asset.
Building an effective data management platform during the past year has become an integral part of its business and the way in which it pits itself against its main competitors.
During the Databricks Data + AI Asia Pacific virtual conference, Richard Glew, head of data engineering and operations at Coles Group, discussed the path the supermarket chain is taking in developing its own electronic data processing platform (EDP). He covered why Coles is building it and the problems it hopes to solve, along with the challenges of moving from an on-premises environment and shifting its data management into the Azure cloud.
Glew pointed out some of the things that Coles Group would like to do but can’t within its current on-premises environment, such as taking advantage of machine learning. He also addressed issues such as finding the data, having the right hardware, and accessing the data in a timely manner.
“We have a team of data scientists that do some awesome work, but they’re very limited by the environment they have to work in today,” he said. “And even if we can do something, being able to do it quickly is another matter.”
Governance and compliance were also high on the agenda for Glew, along with security.
“We’re looking for ways on how to make that better, but at the same time, make it so that navigating those complexities is easier than what we’ve had to do in the past,” he said. “Our current environment, while it is secure, it’s secure in a way that doesn’t necessarily make it easier to do things with our data quickly, and in the spirit of continuous improvement, there’s always things that we can do better.”
Glew said the vision for the EDP is to help Coles re-imagine how it manages and treats data.
“In order for Coles to stay as a leading supermarket in Australia, we’ve got some work to do, there’s plenty of competition out there, and we know data is going to be central to that story,” he said.
“Our EDP platform is designed to be a universal data repository for all the data that we want to share either internal or external, and fully catalogue that,” he said.
Glew said Coles wants the platform to be scalable, managed, governed and agile, improving time to value for people who want to use data. It also wants to take advantage of the elastic nature of the cloud, so it can meet business demands as they arise and scale back when necessary.
He stressed the importance of data architecture and data quality, and said Coles thinks of data “as a first class asset rather than an afterthought”.
“We’re using the scalability of the cloud and bringing all the data into one place,” he said. “There’s a lot of great things that a platform like this will enable us to do, that we would struggle with in our existing environment.”
Coles currently runs large Oracle data warehouses and is using Oracle GoldenGate to help shift large amounts of data from its on-premises environment into the Azure cloud, where it is also taking advantage of Azure Data Factory and Azure Event Hubs.
Databricks is being used as Coles’ central processing technology, undertaking all the data preparation, transformation and data modelling, and sits on Azure Data Lake Storage Gen2 alongside Delta Lake.
Glew said the company was also using the open source product Amundsen, which has been linked up with Apache Atlas and JanusGraph to provide lineage and a metadata repository. It also uses Alex Solutions, which will be used down the track to help with metadata discovery.
“We picked products that were open source friendly, so we have the ability to move things around pretty easily,” he said.
Glew said there are some operating model challenges that come with moving to the cloud.
“Like most organisations that move to the cloud for the first time, it’s a bit of a shock to the system from procurement to security, finance and HR, it can be challenging to deal with the operating model challenges that moving to the cloud has,” he said. “Additionally, on the data management and governance side of things, there’s been some cultural challenges there and we are working with those people closely, but it is going quite well in terms of changing the way of how we do governance in taking a more automated approach.
“We want to catalog everything, but govern the absolute minimum.”
Last year Coles revealed it had picked Microsoft’s Azure as its ‘cloud platform of choice’ as part of a strategic partnership between the tech company and the supermarket chain.
Coles said at the time that it planned to migrate its applications to Microsoft’s cloud platform as part of an effort to drive “simplicity and efficiency”.
It also signed a long-term agreement with Accenture to implement a range of large-scale tech projects, with the retailer aiming to realise savings of around $1 billion over four years.
Accenture will help drive the overhaul of its supply chain, building on an existing relationship between the two companies. Coles’ tech-heavy Smarter Selling initiative is intended to deliver $1 billion in cumulative cost savings by FY23.