Here’s an absurd scenario for you: I decide to build a car. I design it, make a list of all the parts I need, acquire them, and build the car. It turns out great. Then, I decide to build another model of the car—perhaps one with a few more features. So I restart the design process from a totally blank page, make an all-new list of all the parts I need, re-source them from the ground up, and—finally—build the new model. Why would I, or anyone, do that? I wouldn’t. It would be a total waste of time, labor, and intellectual capital.
Yet, every day, almost every company does this—not with cars, but with IT projects. For every new information system they design and implement, IT re-performs the process from the ground up—especially when it comes to data integration.
Data integration costs can run as high as 85% of the total cost of an analytics project. What’s more, probably 65% of the data to be integrated on a new project can be found in a composite of your last five projects. A little quick math tells us that if your project budget is $1 million, up to $850,000 of that will be data integration costs. Of that, about $550,000 would be money you’d be spending to re-integrate data that has already been dealt with on previous projects.
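That arithmetic is easy to sketch. The 85% and 65% figures below are the rough estimates cited above, not fixed constants—treat them as dials for your own projects:

```python
def integration_budget(total_budget, integration_share=0.85, reusable_share=0.65):
    """Return (data integration cost, portion spent re-integrating known data).

    integration_share and reusable_share are the article's rough estimates,
    not universal constants.
    """
    integration_cost = total_budget * integration_share
    reintegration_cost = integration_cost * reusable_share
    return integration_cost, reintegration_cost

integration, wasted = integration_budget(1_000_000)
print(f"Integration: ${integration:,.0f}; re-integration of known data: ${wasted:,.0f}")
# Prints roughly $850,000 and $552,500—the "about $550,000" above.
```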
That’s a lot of wasted time, effort, and money. However, for most companies, the problem lies in knowing which data has already been integrated and which is new data that must be assimilated into the new system. Fortunately, there’s a solution: artificial intelligence—specifically machine learning.
Machine learning (ML) is broadly defined as the use of statistical algorithms to enable computers to mimic human learning. Thus, theoretically with ML, computers can examine their past experience, parse the data from it, and learn from that experience on their own.
In using ML for data integration tasks, metadata is the key. Project teams can apply ML algorithms to analyze descriptive, structural, and administrative metadata to gain a clear picture of how data is used and organized throughout the various organizational systems it inhabits.
Using that information, ML algorithms learn about data structures, flows, and needs and can use that information to automate data integration tasks such as:
- Deducing appropriate data schemas and structures
- Cataloging data used across applications—both repetitive and unique elements
- Recommending transformation tasks
- Mapping metadata elements between applications
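The last task—mapping metadata elements between applications—can be sketched with a toy example. The field names and the 0.6 similarity threshold below are hypothetical, and a real ML-based matcher would also weigh data types, value distributions, and usage metadata rather than field names alone; still, a minimal name-based matcher captures the idea:

```python
from difflib import SequenceMatcher

def map_fields(source_fields, target_fields, threshold=0.6):
    """Propose source-to-target field mappings by name similarity.

    A toy stand-in for ML-based metadata matching: production systems
    also compare data types, value distributions, and usage patterns.
    """
    mappings = {}
    for src in source_fields:
        best, best_score = None, threshold
        for tgt in target_fields:
            score = SequenceMatcher(None, src.lower(), tgt.lower()).ratio()
            if score >= best_score:
                best, best_score = tgt, score
        if best is not None:
            mappings[src] = best
    return mappings

# Hypothetical field names from two applications' metadata catalogs.
crm_fields = ["cust_id", "cust_name", "signup_date"]
erp_fields = ["customer_id", "customer_name", "created_on"]
print(map_fields(crm_fields, erp_fields))
# Matches cust_id -> customer_id and cust_name -> customer_name;
# signup_date finds no candidate above the threshold, so a human reviews it.
```

Fields that clear the threshold are mapped automatically; the rest are flagged for human review—which is exactly the division of labor described below.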
Think about the impact of that automation on your business. By helping you better understand your data—and, more importantly, which of that data is common across applications—it can significantly decrease the time and cost of data integration for new projects.
In turn, because automation takes over many of the human tasks that can bog down data integration efforts, your people are freed up for other work—using all your resources more wisely. Also, with the machine taking on the bulk of the effort, human error is greatly reduced and speed to completion increases significantly. This can help you get systems up and running faster—not to mention running with all the data they need, without being weighed down by data they don’t.
In the end, automating your data integration tasks will increase user productivity and reduce the time you spend building integration workflows. The result? You’ll be able to improve the accuracy of your integration efforts and significantly cut costs. You’ll have the data you need to improve your agility and response time to market conditions and changes, which can ultimately have a positive impact on your business outcomes. What’s not to like about that?