Apex and Databricks worked as one unit to provide a multinational oil and gas corporation with best-in-class consultation and engineering regarding their Databricks Unity Catalog tool.

SITUATION

The client wanted to validate and assess Databricks Unity Catalog within their existing data environment, but they did not have any in-house experts. Their highly federated, complex environment of more than 150 workspaces made it nearly impossible to share the data required for generative AI use cases and enterprise machine learning initiatives. Unity Catalog gives the organization more control over data, adds security, and centralizes governance for both data and AI assets. It also enables the enterprise to be AI-ready, provide metadata to enterprise data catalogs, and use policies defined in those catalogs. The intended outcome was not only to verify whether Unity Catalog was the right tool for their environment, but also to establish a plan for wider adoption within the enterprise. They wanted to assess the features, functionality, and ease of use involved in standing up Unity Catalog. The client chose Apex as a partner for this work because of an existing relationship and Apex's previous experience with the technologies in the project. Apex's existing partnership with Databricks allowed both teams to work closely together, provide the right leadership, and create or test new accelerators.

Custom Catalog Design for Multiple Databricks Workspaces

SOLUTION

The team assembled by Apex and Databricks worked as one unit to provide our mutual client with best-in-class consultation and engineering. This partnership allowed us to approach the client with one voice and minimize the risks of the implementation. The team included one of our solution leaders, who provided direction. Apex and Databricks designed an implementation plan that gave the client a way to assess Unity Catalog using their existing data environment. The approach stood up the initial Unity Catalog environment and used two workspaces as the platform for that assessment.

Our team quickly engaged all stakeholders outside of the central platform team to make sure their voices were represented and to give them an incentive to participate in the migration. Our team performed UCX assessments across the inventory of workspaces to identify common themes that needed to be addressed. They held workshops with infrastructure teams to lay the groundwork for Unity Catalog onboarding, and workshops with data engineering teams to classify pipelines into buckets based on source/target and type of processing. Based on these buckets, the team built template workflows to showcase the solution and shared them with engineering teams as best-practice guidance. Automation tools and scripts were refined throughout the project. Additionally, we worked with the client’s big data and warehousing team to identify efficiencies that would decrease costs and increase performance.
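As a rough illustration of the bucketing step described above, the following minimal Python sketch groups pipelines by source, target, and type of processing. The `Pipeline` fields, the example values, and the `bucket_pipelines` helper are hypothetical names chosen for illustration; they are assumptions, not the client's actual taxonomy or the tooling used on the project.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Pipeline:
    """Hypothetical pipeline record; field names are illustrative only."""
    name: str
    source: str       # where the data comes from, e.g. "files" or "stream"
    target: str       # where the data lands, e.g. "delta"
    processing: str   # type of processing, e.g. "batch" or "streaming"


def bucket_pipelines(pipelines):
    """Group pipeline names by their (source, target, processing) bucket."""
    buckets = defaultdict(list)
    for p in pipelines:
        buckets[(p.source, p.target, p.processing)].append(p.name)
    return dict(buckets)


# Made-up example inventory, not real project data.
EXAMPLE_PIPELINES = [
    Pipeline("pipeline_a", "files", "delta", "batch"),
    Pipeline("pipeline_b", "stream", "delta", "streaming"),
    Pipeline("pipeline_c", "files", "delta", "batch"),
]

if __name__ == "__main__":
    for bucket, names in bucket_pipelines(EXAMPLE_PIPELINES).items():
        print(bucket, names)
```

Each resulting bucket could then be matched to a single template workflow, so pipelines with the same shape share one migration pattern.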

RESULT

Apex helped the central IT team build the automation required to onboard 100+ workspaces with ease and repeatability. Apex also helped the client create and run a Databricks Center of Excellence to foster collaboration and reuse. Along the way, Apex identified cost savings and performance gains while reducing tech debt and improving the client's overall security posture. Our team provided a catalog design for multiple Databricks workspaces and a plan for scaling Unity Catalog across the client’s enterprise.