
Confluent Integration Into SAP: Interview With Greg Murphy From Confluent

E3 Magazine spoke with Greg Murphy, Staff Product Marketing Manager at Confluent, about Confluent's data streaming platform and its integration with SAP Datasphere on the BTP.
Laura Cepeda
March 28, 2024
This text has been automatically translated from German to English.

Greg Murphy is the Staff Product Marketing Manager focused on developing and evangelizing Confluent’s technology partner program. He helps customers better understand how Confluent’s data streaming platform fits within the larger partner ecosystem. Prior to Confluent, Greg held product marketing and product management roles at Salesforce and Google Cloud.

Greg Murphy, Staff Product Marketing Manager, Confluent

E3: Can you tell us more about the integration of Confluent into SAP?

What we're focused on with our SAP offering is providing SAP customers and BTP customers the ability to move their ERP data anywhere it needs to go. So, when SAP customers are thinking about building out real-time customer experiences, modern analytics, the types of solutions their business needs, that doesn't always mean that that data can stay within SAP; it needs to move downstream to third-party tools and applications to bring this to life.

So, thinking about that first and foremost, you have to look at the options that are available to SAP customers. Most commonly, what's being leveraged by those customers today is open source Apache Kafka, which is a data streaming technology that allows customers to pick up that activity in an event-based fashion and stream it downstream, wherever it needs to go. Confluent was founded by the original creators of Apache Kafka. What we've done is build out a cloud-native, fully managed data streaming platform that has Kafka at its epicenter. It allows businesses to completely eliminate the operational overhead, the expenses, and the challenges of managing the open-source technology. Customers are able to experience the benefits of data streaming so that they can become a real-time, intelligent business that can work with that ERP data wherever it's needed.

E3: What exactly does a data streaming platform do?

Apache Kafka is an open-source technology that was created at LinkedIn and open-sourced in 2011. It was created by Confluent's founders because there was a need for real-time data movement within LinkedIn at that time, and the technology to do it didn't exist yet. So, thinking about event-based activity: every single time somebody on LinkedIn liked a message or responded, any type of activity, there were a number of systems that wanted awareness of and insight into that activity to be able to build processes around it. That's what Kafka was originally created for.

The core initial use case is a publish/subscribe (pub/sub) model, where producers send messages into the data streaming platform, and any downstream consumers that want to work with that data in a one-to-many fashion can pick it up and work with it elegantly. So rather than a point-to-point solution, where every single upstream application or tool maintains a direct integration with every downstream service that needs that data, this is a much more elegant model: all those producers produce once into the data streaming technology, and consumers can then subscribe to, or read from, as many of those feeds as they want. It gives a business a very elegant way of sending all of its data throughout the business and really establishes a real-time central nervous system. That is Apache Kafka. At its core, it's a distributed system.
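To make the pub/sub pattern Murphy describes concrete, here is a minimal, hypothetical sketch using the open-source confluent-kafka Python client; the broker address, topic name, and consumer group are placeholders, not part of any Confluent or SAP offering.

```python
# Minimal pub/sub sketch with the open-source confluent-kafka client.
# The broker address, topic name, and consumer group are placeholders.
from confluent_kafka import Producer, Consumer

conf = {"bootstrap.servers": "localhost:9092"}  # hypothetical broker

# Producer side: publish an activity event once.
producer = Producer(conf)
producer.produce("profile-activity", key="user-42", value='{"action": "like"}')
producer.flush()

# Consumer side: each consumer group that subscribes receives the same feed,
# independently of every other group (one-to-many fan-out).
consumer = Consumer({**conf,
                     "group.id": "analytics-service",
                     "auto.offset.reset": "earliest"})
consumer.subscribe(["profile-activity"])

msg = consumer.poll(timeout=5.0)
if msg is not None and msg.error() is None:
    print(msg.key(), msg.value())
consumer.close()
```

Any number of additional consumer groups could subscribe to the same topic and each would receive the full stream independently, which is the one-to-many fan-out described above.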

E3: What makes the Confluent Data Platform different?

While Apache Kafka is a very powerful, widely adopted technology for data streaming and for real-time movement of data throughout an entire business, it is very challenging for most businesses to operate, scale, and really find success with. It's going to take multiple years for most businesses to find true value in the technology when accounting for infrastructure resources, the full-time employees responsible for keeping it afloat, and the risks and challenges tied to it.

Kafka is very powerful, but it is a cumbersome, operationally taxing, and expensive technology that ultimately provides value but pulls teams away from the core focus of the business. For example, a retailer who's dependent upon Apache Kafka isn't going to get any love or congratulations from their downstream customers for being great at managing Kafka. What that retailer wants to be focused on is net-new, customer-impacting use cases that leverage the real-time data they have access to. Confluent's goal is to build on top of that open-source technology and provide customers across a full range of industries with the full value of a data streaming platform, while eliminating the burdens, the challenges, and the expenses of managing the open-source technology, because that's not where our customers find value and not what they want to bring to market. That's what we do, and that's where our expertise is.

E3: Can you explain that more in depth?

There are three pillars that really differentiate the Confluent Data Streaming Platform from the open-source technology. The first is that Confluent is cloud-native. We took Apache Kafka, the open-source technology, and completely re-architected it for the cloud. That was a major investment: taking that technology, putting it in the cloud, and allowing customers to spin up, in a couple of clicks, a cluster that can run their entire business, something that would have taken teams months, if not years, to stand up.

All that functionality comes together in what we call the Kora engine. The Kora engine is the Apache Kafka engine that was built specifically for the cloud. It scales elastically, so individual clusters can go up to 20 gigabytes per second, and they have infinite storage.

E3: Could you give me an example of these features and how they’re impacting customers?

What we've done within the Kora engine is provide elastic scalability, up to 20 gigabytes per second. First of all, that is massive throughput; the largest organizations in the world get easy, single-click scale-up and scale-down of Kafka. In the past, a business running open source had one of two options. They would either need to provision Kafka with expensive infrastructure to be ready for peak performance moments, which might be a holiday season or a special moment for the business, always keeping that infrastructure at a high level, ready for those moments but carrying all that expense. Or they would need a large team dedicated to projects that prepare for those moments by adding infrastructure and then respond afterwards by pulling all that infrastructure back in order to avoid those costs.

Either way, that is a hugely distracting and expensive activity for businesses to be prepared for. Within Confluent Cloud, with that elastic scaling, we will automatically scale up with the activity to make sure that customers have the throughput and the capacity they need to handle any traffic coming through the platform. And then we will automatically scale down whenever that traffic recedes, to make sure that we are operating cost-efficiently and nobody needs to be overpaying.

E3: What about storage in the Kora engine?

Within the Kora engine, there's infinite storage. It's not just real-time data streaming through the platform that customers have access to; they can also store data with us, and an infinite amount of it. We provide an uptime SLA (Service Level Agreement) of 99.99 percent, so we give high guarantees that the platform is going to be available. We also have guaranteed low-latency throughput through the platform. So, all in all, what we've done with that cloud-native pillar is take all of the operational burden, expense, and distraction of Apache Kafka and rebuild the entire service as the kind of cloud experience a customer would expect.

E3: What is the second pillar that differentiates the Confluent Data Streaming Platform?

The second pillar is that Confluent is complete. It's not just Apache Kafka that our customers need in order to build out real-time experiences. What they need is a complete data streaming platform that lets them build around Apache Kafka and really work efficiently. That includes data integrations, SAP being one of those that we offer. We have more than 120 pre-built source and sink integrations into the platform. We talked before about the data streaming technology being able to pull data from anywhere it's produced and send it anywhere it needs to be consumed. Our customers have told us that each one of those individual integrations takes up to six months to develop, plus a lifetime of support and management to keep it running. Again, we have over 120 of these prebuilt integrations, so customers can save that time, automatically access all the data they need, and send it everywhere it needs to go.

SAP is a core offering, and those integrations are built directly within the SAP console. It's not just a matter of easily accessing SAP data and being able to move it downstream to any one of the locations that we provide. There is also high-value data sitting outside the SAP system as well: IoT data, data coming from marketing tools, real-time clickstreams from the web; there's any number of data sources that could be necessary. We allow SAP customers to access their SAP data and also merge it in real time with all those different data sources, so that they can send it in real time, as a full data product, downstream to databases, data warehouses, data lakes, and AI/ML tools.
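As a rough illustration of merging SAP data with another source before sending it downstream, the hypothetical sketch below enriches SAP order events with the latest IoT reading for the same plant; topic names, field names, and the broker address are assumptions, and in practice such joins would typically be handled by managed stream processing rather than a hand-written loop.

```python
# Hypothetical enrichment loop: join SAP order events with the latest IoT
# reading per plant and publish a combined "data product" topic.
# Topic names, JSON field names, and the broker address are assumptions.
import json
from confluent_kafka import Consumer, Producer

conf = {"bootstrap.servers": "localhost:9092"}  # hypothetical broker
consumer = Consumer({**conf,
                     "group.id": "order-enricher",
                     "auto.offset.reset": "earliest"})
consumer.subscribe(["sap-orders", "plant-telemetry"])
producer = Producer(conf)

latest_telemetry = {}  # plant_id -> most recent sensor reading

while True:
    msg = consumer.poll(1.0)
    if msg is None or msg.error() is not None:
        continue
    event = json.loads(msg.value())
    if msg.topic() == "plant-telemetry":
        latest_telemetry[event["plant_id"]] = event   # remember newest reading
    else:  # an SAP order: attach telemetry and forward as an enriched event
        event["telemetry"] = latest_telemetry.get(event["plant_id"])
        producer.produce("orders-enriched", value=json.dumps(event))
        producer.poll(0)  # serve delivery callbacks
```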

Part of that complete pillar, alongside real-time stream processing, is a full suite of security and governance tools. It's the industry's only fully managed governance suite for Apache Kafka, which secures and ensures high-quality data, but also opens that data up to the rest of the business, ensuring that more people within the business have access to real-time data to efficiently build the experiences that are expected today.

E3: What is the third pillar that differentiates the Confluent Data Streaming Platform?

And then lastly, it is an offering that is available everywhere. In the cloud, we're available on AWS, Azure, and GCP, and across clouds, with the ability to link all those environments together and send data between them. On premises, there is Confluent Platform, with connectivity there as well. So between an on-prem environment and a cloud environment, you can establish a hybrid environment, really a central nervous system. Real-time data throughout the entire business is ultimately what can be built.

E3: How can the two platforms be integrated specifically with SAP BTP, the Business Technology Platform?

In December, we made an external announcement about Confluent's availability within the SAP Store. As part of that announcement, we introduced our direct integration between SAP Datasphere, within BTP, and Confluent Cloud.

What this gives SAP customers is the ability to access the Confluent data streaming platform and fully managed data streams from directly within SAP Datasphere. So as I am working with BTP, I have access to S/4 Hana, ECC, and other tools on the SAP side. As a user there, I have the ability to configure real-time writing of that data over to fully managed data streams on the Confluent side. What that does is really unlock that ERP data from within SAP and allow it to move downstream to fuel applications and analytics with real-time data.

E3: In which cases would these platforms not be applicable or ideal for solutions for an IT landscape?

I think that the use cases are very broad. They're very open. On the Confluent side, the customers we work with have really opened up a near-unlimited number of use cases. We understand that the solutions customers are looking to build don't always depend exclusively on data coming from within SAP; that data alone is not always what builds out a use case. So, it is critical for SAP customers that they have access to data beyond BTP, as well as beyond SAP systems. And that's something we make easily available on the platform. The prebuilt connectors into the platform not only let you easily move data downstream; they also let you pick up additional data, merge it with SAP data, and build something that's more complete and ready for downstream use.

E3: What is an advantage of the Confluent Data Streaming Platform?

Our cloud-native offering is proven to lower the total cost of ownership of Apache Kafka by up to 60 percent for businesses. So, when we talk about the value of data streaming, and there's major demand coming from SAP customers, they want to be working with Apache Kafka; they want to use it for their real-time customer experiences. Confluent is able to provide a better solution for Kafka and a complete data streaming platform alongside it, one that's going to lower that total cost of ownership for Kafka by up to 60 percent. What that gives back to our customers is time: time not spent on infrastructure management, but focused instead on what's actually going to drive their business forward and what's going to surprise and delight their customers, whether those are internal or external.

E3: What options would you give customers who are looking for integration with S/4 Hana in the cloud and with BTP?

Yeah, the best route to go there is the Datasphere integration that we built. It is directly tied to the Datasphere tool within BTP. It's pulling from S/4 Hana, ECC, BW; there's a list of at least five, six, or seven different sources on the SAP side, but S/4 Hana is at the front there. That is really the flagship interest on the SAP side. So that is by far our recommendation as the easiest way to unlock that ERP data, get it onto the data streaming platform, and move it downstream wherever it's needed. For real-time e-commerce, inventory management, manufacturing use cases, IoT, AI, and ML, that is going to be the recommendation from our side.

E3: What does this mean for ERP customers in particular?

What we think is important for the market to understand about the Confluent offering is, first and foremost, that it lets the customer become that real-time, intelligent business; it fuels those downstream applications with ERP data in real time. The three pillars that we're focused on specific to SAP are: one, build real-time apps at a lower cost. That is done with the Kora engine, reducing the total cost of ownership of Kafka by up to 60 percent.

The second pillar is streaming SAP data everywhere it needs to go and merging it with other sources. That is done with the more than 120 connectors, and with stream processing with Apache Flink (an open-source stream-processing framework).
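As a sketch of what such stream processing might look like, the following uses the open-source PyFlink Table API to run a continuous Flink SQL aggregation over a hypothetical SAP order topic; the topic, schema, and broker address are illustrative assumptions, and running it locally would also require the Flink Kafka SQL connector on the classpath.

```python
# Illustrative PyFlink sketch: a continuous Flink SQL query over a
# hypothetical SAP order topic, aggregating revenue per material as
# new events arrive. Topic, fields, and broker address are assumptions.
from pyflink.table import EnvironmentSettings, TableEnvironment

t_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())

t_env.execute_sql("""
    CREATE TABLE sap_orders (
        order_id STRING,
        material STRING,
        amount   DOUBLE
    ) WITH (
        'connector' = 'kafka',
        'topic' = 'sap-orders',
        'properties.bootstrap.servers' = 'localhost:9092',
        'properties.group.id' = 'flink-revenue',
        'format' = 'json',
        'scan.startup.mode' = 'earliest-offset'
    )
""")

# Running revenue per material, updated continuously as orders stream in.
t_env.execute_sql(
    "SELECT material, SUM(amount) AS revenue FROM sap_orders GROUP BY material"
).print()
```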

The last pillar is ensuring and allowing customers to maintain strict security and governance standards while they move from BTP over to Confluent Cloud. We have an enterprise-grade suite of security features that are standard on the platform, for example data encryption at rest and in motion. We have a number of customization features: private networking, audit logs, a whole suite of security features available on the platform. The platform also comes with a default set of compliance standards.

E3: Can you tell us a bit more about the governance features on the platform?

We offer the industry's only fully managed governance suite for Apache Kafka. It consists of several pillars: stream quality, stream catalog and stream lineage. Stream quality is probably what people think about most often when they think about stream governance, and that's data integrity, data rules, contracts, ensuring that standards are in place for all data that passes through the platform. This also ensures that the data can be easily reused later. This is in place by default.
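As an illustration of the data-contract idea behind stream quality, here is a hedged sketch using Confluent's Schema Registry client for Python with an Avro schema; the registry URL, topic, and schema are placeholders rather than part of any specific SAP integration.

```python
# Hedged sketch of a data contract: the producer serializes against an Avro
# schema registered in Schema Registry, so malformed events are rejected
# before they ever reach downstream consumers. URL, topic, and schema are
# placeholders.
from confluent_kafka import Producer
from confluent_kafka.serialization import SerializationContext, MessageField
from confluent_kafka.schema_registry import SchemaRegistryClient
from confluent_kafka.schema_registry.avro import AvroSerializer

schema_str = """
{
  "type": "record",
  "name": "InventoryChange",
  "fields": [
    {"name": "material", "type": "string"},
    {"name": "plant",    "type": "string"},
    {"name": "quantity", "type": "double"}
  ]
}
"""

sr_client = SchemaRegistryClient({"url": "https://schema-registry.example"})  # placeholder
serializer = AvroSerializer(sr_client, schema_str)

producer = Producer({"bootstrap.servers": "localhost:9092"})  # placeholder
record = {"material": "M-100", "plant": "0001", "quantity": 25.0}

producer.produce(
    topic="sap-inventory",
    value=serializer(record, SerializationContext("sap-inventory", MessageField.VALUE)),
)
producer.flush()
```

Because serialization is checked against the registered schema, events that violate the contract fail at the producer rather than being discovered downstream.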

We also have stream catalog, and both stream catalog and stream lineage are the other side of governance, a more modern view of governance. They're part of the same suite, but they're not focused on locking down data; they're focused on opening that data up to the rest of the company through the stream catalog.

Our data portal allows customers on the data streaming platform to understand, for example, which high-value data streams are coming from SAP. So while we're making it incredibly easy for SAP customers to produce those data streams and move them downstream to Confluent, the data catalog on our side, and the data portal specifically, makes sure that customers within Confluent Cloud, those who spend their day using our platform, can see those data streams and are aware of them. They know what data is available to pick up and can build the tools they want to.

E3: Could you name an example of this?

For example, say we have a BTP user and a Confluent Cloud user. The Confluent Cloud user might find, via the catalog, a data stream from the BTP user that holds the manufacturing IoT data they want and need. So that's great; they've found something that drives the project forward. The next questions, however, might be: where did this data actually come from? What has happened to it along the way? Was it merged with other data? How can I trust that this data is going to be good for me to use within my project?

Stream lineage is the third piece of the governance suite. It gives customers an end-to-end, "Google Maps" style view of those data streams: where they came from, where they're going, and what's happening to them along the way, so that, with that governance, customers can be confident they're only finding high-quality data. They'll be able to find it via the catalog, and with lineage they can easily understand what all this data is and how to put it to use. So that is the third piece I wanted to add: the continuation of strict security, governance, and compliance standards that can be maintained for SAP customers when they move all that data onto the Confluent data streaming platform.

E3: Thank you for the interview!

confluent.io

Laura Cepeda

Laura Cepeda is Managing Editor for e3mag.com.


