Calling data engineering a cost center is outdated. In today’s world, you simply cannot stay competitive without high-quality data. This has become even more apparent with the rise of data-hungry AI companies, where data essentially is the business.
I define data engineering as preparing and transforming data from any source into a useful form for a business purpose, including but not limited to analysis, reporting, monetization, product development, marketing, and decision making. It’s about building and maintaining the infrastructure that makes data useful at scale. There are many definitions and interpretations of data engineering, but this is the one I have in mind here.
These are some of the challenges keeping data engineering from being associated more closely with revenue and growth (as I believe it should be):
Visibility: Data engineers work behind the scenes, wrangling data, building pipelines, maintaining infrastructure, and ensuring data quality, security, and compliance. Because that work is largely invisible to the rest of the business, their contributions are harder to quantify.
Complexity: Modern data engineering requires domain knowledge of the data, infrastructure expertise, and software engineering skill. That complexity makes it difficult to communicate the value of data engineering to non-technical stakeholders.
Cost: Data engineering is expensive, especially when it comes to modern data tooling, compute, and talent. That expense makes it hard to justify investment in ongoing or new data engineering initiatives, which is why the function is so often treated as a cost center.
And here are a few thoughts on how data engineering can address these challenges and start driving revenue growth:
Data catalogs can help unlock value and increase visibility. A data catalog helps data engineers understand exactly what data they have, where it comes from, who has access to it, and other otherwise hidden attributes. More importantly, a catalog maintained by the data engineering team creates the visibility, support, and awareness needed to identify how data can drive revenue and growth for the business. Data catalogs also help enforce compliance by tagging sensitive data and ensuring it is anonymized or aggregated before use, which means businesses can monetize their data more confidently, both internally and externally. Internal teams like marketing, sales, and product can also use the catalog to find high-value customers, increase retention, identify new features to build, and more.
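To make the tagging idea concrete, here is a minimal sketch of a catalog entry with sensitivity tags and a pre-export check. The entry fields, dataset names, and the safe_to_share helper are hypothetical illustrations, not the API of any particular catalog tool:

```python
from dataclasses import dataclass, field

@dataclass
class CatalogEntry:
    """Minimal catalog record: where a dataset lives, who owns it, and what is sensitive."""
    name: str
    source: str                                           # upstream system the data comes from
    owner: str                                            # team accountable for the dataset
    pii_columns: set[str] = field(default_factory=set)    # columns tagged as sensitive

# Hypothetical catalog with one entry.
catalog = {
    "orders": CatalogEntry(
        name="orders",
        source="postgres.prod.orders",
        owner="data-engineering",
        pii_columns={"customer_email", "shipping_address"},
    ),
}

def safe_to_share(entry: CatalogEntry, requested_columns: set[str]) -> bool:
    """Allow an export only if none of the requested columns are tagged as PII."""
    return not (requested_columns & entry.pii_columns)

# A marketing export of non-sensitive columns passes; one that includes emails does not.
assert safe_to_share(catalog["orders"], {"order_id", "order_total"})
assert not safe_to_share(catalog["orders"], {"order_id", "customer_email"})
```

Even a check this simple gives other teams a self-serve way to know which data they can use, without a round trip to the data engineering team.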
Develop and responsibly monetize data products to drive new profits. A strong data engineering culture that can anonymize and aggregate data easily means a company can create valuable data products for both internal and external consumption, including APIs that aggregate industry benchmarks, online SaaS products, and much more. These products drive revenue directly, through data monetization partnerships, and indirectly, by improving customer retention, acquisition, and satisfaction. Doing this responsibly builds trust with customers and partners, which in turn opens new revenue opportunities. Data enrichment services can also be offered to customers as well as internal teams to help them make better decisions.
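As a rough illustration of "anonymize and aggregate," the sketch below builds an industry benchmark table and suppresses any group too small to publish safely. The column names, sample data, threshold, and pandas-based approach are assumptions for the example, not a prescribed method:

```python
import pandas as pd

# Hypothetical raw usage data; a real pipeline would read this from the warehouse.
usage = pd.DataFrame({
    "industry": ["retail", "retail", "retail", "finance", "finance"],
    "customer_id": [1, 2, 3, 4, 5],
    "monthly_spend": [120.0, 95.0, 210.0, 400.0, 380.0],
})

MIN_GROUP_SIZE = 3  # suppress benchmarks for groups too small to anonymize

# Aggregate per industry: how many customers and their average spend.
benchmarks = (
    usage.groupby("industry")
    .agg(
        customers=("customer_id", "nunique"),
        avg_monthly_spend=("monthly_spend", "mean"),
    )
    .reset_index()
)

# Keep only groups large enough that no single customer can be inferred,
# then drop the raw count before publishing the benchmark externally.
benchmarks = benchmarks[benchmarks["customers"] >= MIN_GROUP_SIZE]
print(benchmarks.drop(columns="customers"))
```

The published table contains only aggregates over sufficiently large groups, which is what makes it safe enough to sell or share as a benchmark product.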
Invest in proven open-source tools and technologies where possible. Open-source tools can help reduce costs and increase the potential profitability of a data engineering team. Automation, orchestration, and monitoring tools can help data engineers focus on the most important tasks, reduce errors, and increase the speed of data delivery without increasing costs.
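For orchestration specifically, a proven open-source option such as Apache Airflow can schedule, retry, and monitor pipelines without licensing costs. The sketch below assumes Airflow 2.x; the DAG name, task names, and placeholder callables are made up for illustration:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    """Pull raw records from a source system (placeholder)."""

def transform():
    """Clean and aggregate the extracted data (placeholder)."""

def load():
    """Write the transformed data to the warehouse (placeholder)."""

# One run per day; failures surface in Airflow's UI and alerting,
# which provides the monitoring visibility mentioned above for free.
with DAG(
    dag_id="daily_sales_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",   # Airflow 2.4+; older 2.x versions use schedule_interval
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)
    load_task = PythonOperator(task_id="load", python_callable=load)

    extract_task >> transform_task >> load_task
```

The same pattern applies to open-source monitoring and data quality tools: the cost is engineering time, not licenses, which directly improves the economics of the team.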
By increasing visibility, simplifying complexity, and leveraging open-source tools, data engineering can move out of the shadows and into the spotlight as a critical revenue enablement function.