Q&A: Emerging Tech Trends Pave Path to Integration-Platform-as-a-Service
- By Linda L. Briggs
- July 1, 2016
In this interview, SnapLogic CEO Gaurav Dhillon, the co-founder and former CEO of Informatica, discusses emerging technology trends, including those driving iPaaS, or integration-platform-as-a-service. "Cloud application integration has been the primary use case for iPaaS solutions, but the definition is broadening," Dhillon says. "In the era of social, mobile, big data analytics, and cloud computing, iPaaS represents a new approach designed to solve old and new data and app integration challenges."
Dhillon's company, SnapLogic, was founded in 2009 and offers a unified data and application iPaaS. Its hybrid cloud architecture is powered by 300-plus prebuilt "Snaps" -- integration components that simplify and automate complex enterprise integration patterns and processes. The prebuilt components are designed to enable enterprises to connect faster to cloud applications and big data investments.
What are some of the most interesting trends you're seeing in the BI, analytics, and data warehousing space?
Gaurav Dhillon: There are three trends that I believe are fundamentally changing the world of data.
The first is the shift to the cloud. Rapid provisioning, ease of use, and cost are just a few of the drivers as data gravity continues to shift.
(Data gravity means that when a large data set is sitting in a large Hadoop or Splunk instance in an on-premises system, it doesn't make sense to load all that data into the cloud to run analytics functions. Instead, one would ship the function to the data and return results. Being able to do this seamlessly can greatly simplify integration pipelines.)
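As a rough illustration of the data-gravity idea above, the following Python sketch contrasts moving the data with moving the function. The `OnPremStore` class and its `export_all` and `run_local` methods are hypothetical names invented for this example; they do not correspond to any Hadoop, Splunk, or SnapLogic API.

```python
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


class OnPremStore:
    """Hypothetical stand-in for a large on-premises data set."""

    def __init__(self, records: List[dict]) -> None:
        self._records = records

    def export_all(self) -> List[dict]:
        # Moving the data: every record leaves the site.
        return list(self._records)

    def run_local(self, fn: Callable[[Iterable[dict]], T]) -> T:
        # Moving the function: only the result travels back.
        return fn(self._records)


def error_count(records: Iterable[dict]) -> int:
    """Count the records whose level is ERROR."""
    return sum(1 for r in records if r["level"] == "ERROR")


store = OnPremStore(
    [{"level": "INFO"}, {"level": "ERROR"}, {"level": "ERROR"}]
)
moved = store.export_all()              # all three records are copied out
result = store.run_local(error_count)   # a single integer is returned
```

Both paths produce the same count. The difference is what crosses the network: the full record set in the first case, one small result in the second.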
Also driving this trend is the fact that cloud data warehousing and analytics have moved from rogue departmental use cases to enterprise deployments.
The second trend is the data lake and how to complement, extend -- and in some cases replace -- the traditional data warehouse with a reference architecture that is built to handle all new and future sources and enable more proactive and predictive analytics.
The third trend is the Internet of Things (IoT). It's already happening today in some industries with data velocity, variety, and, of course, volume. What's more important, though, are the kinds of analytics and insights that will become possible because of IoT sensors, wearables, and devices -- once organizations figure out how to separate signal from noise through the right data management techniques.
What is driving the integration-platform-as-a-service (iPaaS) model today?
Integration-platform-as-a-service (iPaaS) is the confluence of a number of technologies and innovations, as well as market convergence and timing. I would break the drivers into a few categories:
- The end of a cycle: Extract, transform, and load (ETL) and enterprise service bus (ESB) technologies were purpose-built for a different era. One was for batch-oriented, structured data movement. The other was for real-time data synchronization between enterprise applications.
These distinctions are no longer relevant in the era of social, mobile, big data analytics, and cloud computing. iPaaS represents a new approach designed to solve old and new data and app integration challenges. The ROI can be fast and the cost savings of eliminating legacy "technical debt" can be tremendous.
- The need for speed: "Self-service" has been a key requirement for business intelligence tools, and the same is true of integration. If you have to wait in an IT backlog to get access to integrated, trusted data, all kinds of rogue hand-coding start in the business. iPaaS solutions must provide a clicks-not-code metadata development environment while meeting the governance and security needs of enterprise IT.
- Hybrid infrastructure: Front-office applications have all but moved to the cloud (CRM, HR, and so forth). Every application category you can think of, including financial management and ERP, is moving in the same direction. The same is true of platforms and infrastructure, as well as data and analytics. The adoption of iPaaS is being driven by cloud and big data adoption, the need to respect data gravity by running integrations as close to the data as possible, and by the need to scale out elastically to meet variable workloads.
How much of a challenge do companies face in integrating data from applications versus integrating data from BI applications? Can you explain the issues involved?
Operational integration between applications is typically closely aligned with specific business processes that span multiple departments (for example, new employee onboarding or "quote to cash"). The requirements are for event-based, real-time synchronization between disparate systems with a need for robust monitoring and broad connectivity. Analytical integration is becoming more closely aligned with business initiatives as the demand for more real-time and predictive analytics grows.
Historically, these were different toolsets, teams, and approaches (think ESB vs. ETL), but in the world of cloud and big data, convergence has accelerated. Speed and self-service are the drivers, and legacy technologies built for either scheduled, structured data integration or low-latency, message-based application integration have struggled to keep up.
One emerging technology is certainly data lakes. What does data lake technology bring to the picture, especially regarding iPaaS? What do lakes mean for the data warehouse and its storage formats?
Cloud application integration has been the primary use case for iPaaS solutions, but the definition is broadening. New thinking around the data lake and how it complements -- and in some cases replaces -- the traditional data warehouse has prompted a fresh look at data ingestion, preparation, and delivery.
We worked with industry consultant Mark Madsen on a white paper titled "Will the Data Lake Drown the Data Warehouse?" and he put it this way: "JSON is the new CSV and streams are the new batch." An enterprise iPaaS solution should be able to handle the new data streams, scale elastically, and be much better suited for big data than legacy ETL tools.
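To make the "JSON is the new CSV and streams are the new batch" remark concrete, here is a minimal Python sketch that reads newline-delimited JSON one record at a time instead of loading a whole file. The feed contents, field names, and the `stream_records` helper are assumptions made up for illustration, not part of any product mentioned in the interview.

```python
import io
import json
from typing import Iterator, TextIO


def stream_records(source: TextIO) -> Iterator[dict]:
    """Yield one parsed JSON object per non-empty line as it is read."""
    for line in source:
        line = line.strip()
        if line:
            yield json.loads(line)


# Hypothetical feed. The second record carries an extra nested field,
# which a fixed-column CSV layout would not absorb without a schema change.
feed = io.StringIO(
    '{"device": "a1", "temp": 21.5}\n'
    '{"device": "b2", "temp": 19.0, "meta": {"fw": "1.2"}}\n'
)

running_total = 0.0
seen = 0
for record in stream_records(feed):
    # Each record is handled on arrival; nothing waits for a full batch.
    running_total += record["temp"]
    seen += 1

average = running_total / seen
```

Because `stream_records` is a generator, the same loop would work on a socket or message-queue wrapper that exposes lines incrementally, which is the property the "streams are the new batch" phrase points at.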
How important is governance in all this? How does it fit into the data lake picture?
The G-word and metadata have become the hottest terms in the world of big data as adoption continues to move from development and testing to production. Any time you introduce self-service capabilities, you have to balance them with the administrative requirements of governing who can access what, when, and how. The promise of a data lake is going to be realized only by following a new set of organizing principles that is starting to emerge, principles that respect the prior learnings from data warehousing.
An iPaaS solution for hybrid cloud and big data integration use cases must not be considered a rogue, point-to-point decision by the line of business. Think longer term. Look for a streaming integration platform that balances ease of use with what your IT organization is going to require. You want to eliminate some of the manual, day-to-day data engineering tasks and ensure you get maximum value from all your enterprise application and analytics investments.
According to a recent Gartner Magic Quadrant report, "By 2019, iPaaS will be the integration platform of choice for new integration projects, overtaking the annual revenue growth of traditional application integration suites on the way." Can you expand on that statement?
I still believe that Gartner is looking at iPaaS primarily from an application integration-centric perspective. iPaaS has already proven to be the right approach to integrate cloud and on-premises systems, so the 2019 forecast is somewhat conservative. You don't see a lot of new enterprise service bus (ESB) installations in 2016. Add-ons to legacy implementations will persist, of course.
On the other hand, data integration approaches are also being disrupted. ETL has come to mean "Everything Too Late" in the post-data-warehouse world. A new approach to addressing old and new data integration challenges is required, and the vendors who can handle both app and data requirements (any source, any speed, anywhere) in a unified platform will be the right choice today and in the future.
Incorporating new data models and uses will be just as crucial for the future of integration. Organizations will be looking for faster, simpler ways to integrate containers and data lakes in order to get the most out of potential insights.
New streaming data processing engines and open source messaging systems, such as Apache Kafka, are coming into play all the time. Companies want to be able to embrace them quickly and make sure they seamlessly integrate with the rest of their infrastructure. The need for legacy migrations and add-ons will undoubtedly dwindle, but iPaaS will continue to grow in value to organizations.