It is the best of times and it is the worst of times. Well, at least if you are in the business of data or integration.
Back when I was starting my career as a programmer, I remember writing a stored procedure that would crawl through more than 100,000 physician records to update the insurance plans each doctor supported. It was driven by a new data feed that arrived from several sources each week, and the updated information would show up on a website. The stored procedure would run for hours, then stop whenever it hit a record that was incorrect or incomplete. The process went through many painful iterations before I finally figured out most of the patterns for detecting dirty and incomplete data. While the state of programming and debugging techniques has taken a quantum leap over the years, sadly, the state of data quality and the complexity of processing data have not changed that much. In fact, it is estimated that U.S. businesses lose about $600 billion every year due to bad data.
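With hindsight, what I really needed was a check that read into the health of each batch up front, quarantining bad records and reporting fill rates instead of halting mid-run. Here is a minimal sketch of that idea in Python; the field names (npi, name, insurance_plans) are invented for illustration and are not the actual schema I worked with.

```python
# A sketch of the up-front check I wish I had run: quarantine dirty
# records and report per-field fill rates instead of halting mid-run.
# The field names below are hypothetical, not the feed's real schema.

REQUIRED_FIELDS = ("npi", "name", "insurance_plans")

def triage(records):
    """Split a batch into clean and dirty records and profile fill rates."""
    clean, dirty = [], []
    fill_counts = {f: 0 for f in REQUIRED_FIELDS}
    for rec in records:
        missing = [f for f in REQUIRED_FIELDS if not rec.get(f)]
        for f in REQUIRED_FIELDS:
            if f not in missing:
                fill_counts[f] += 1
        (dirty if missing else clean).append(rec)
    total = len(records) or 1
    fill_rates = {f: n / total for f, n in fill_counts.items()}
    return clean, dirty, fill_rates

clean, dirty, rates = triage([
    {"npi": "1234567890", "name": "Dr. Smith", "insurance_plans": ["PlanA"]},
    {"npi": "", "name": "Dr. Jones", "insurance_plans": []},  # incomplete
])
print(len(clean), "clean,", len(dirty), "quarantined,", rates)
```

The point is not the twenty lines of code; it is that the batch never stops a multi-hour run over one bad record, and you know how healthy the feed is before you touch it.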
To make matters more difficult, the complexities of acquiring, conditioning, cleaning, and finally syndicating data have continued to worsen. Over the years, businesses have tried to solve this by throwing technologies and architectural patterns at the integration problem. Hub-and-spoke, Enterprise Service Bus (ESB), Enterprise Application Integration (EAI), and Integration Platform as a Service (iPaaS), to name a few, have each chipped away at parts of the problem, but they have only pushed the two worlds of integration and data analysis further apart.
At the end of the day, though, businesses are still trying to use data as a currency. The continued focus on the core problem, finding ways to give data scientists access to clean, accurate data so they can solve business challenges, has finally led to something refreshing that I heard about last week.
If you are a business leader, imagine a world where you could make real-time decisions and mine actionable insights from your data without ever having to worry about the changing integration landscape or the proliferation of applications and APIs.
Data Platform as a Service (dPaaS) promises to be that approach. Here are some fundamental tenets of dPaaS that are true indicators of maturity in this area:
- It provides all the productivity benefits of iPaaS at the data layer, enabling enterprises to focus on their data assets and extract insights that drive business value
- It puts control of the data in the hands of the enterprise data experts
- It provides full transparency into the data and the data flow, especially around the heuristics of the data itself (more about this in my next blog)
- It delivers integration as a fully managed service through a professionally curated integration Center of Excellence
- It uses sophisticated data technologies such as polyglot persistence and schema-on-read (a brief sketch of the latter follows this list)
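To make that last tenet concrete, here is a toy illustration of schema-on-read: raw records land in storage untouched, and structure is imposed only when the data is read. This is a sketch of the concept under invented field names, not any particular platform's implementation.

```python
import json

# Schema-on-read in miniature: raw payloads are stored as-is, and a
# schema is applied only at read time. Field names here are invented.
raw_store = [
    '{"doctor": "Dr. Smith", "plans": ["PlanA", "PlanB"]}',
    '{"doctor": "Dr. Jones"}',  # an older feed shape with no plans field
]

def read_with_schema(raw_record):
    """Apply today's schema at read time; defaults absorb older shapes."""
    rec = json.loads(raw_record)
    return {"doctor": rec["doctor"], "plans": rec.get("plans", [])}

for row in map(read_with_schema, raw_store):
    print(row)
```

Notice that when the feed's shape changed, nothing already stored had to be migrated; the reader absorbs the variation. That is exactly the flexibility you want when data arrives from many sources in many shapes.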
Back to my earlier story… when I was writing those stored procedures, I wish I had had a way to read into the health of the data before I embarked on processing it. The technologies to do that are finally here, and they are part of the dPaaS methodology. Stay tuned for more as we embark on the next exciting part of the journey of bringing integration and data back together to solve complex business problems.