Infinite Retention for Apache Kafka in Confluent Cloud
Creating centralized platform for all current and historic event streams with limitless storage and retention
This is a Press Release edited by StorageNewsletter.com on July 8, 2020 at 2:23 pm
Confluent, Inc. announced the next stage of its Project Metamorphosis initiative, which aims to build the next-gen event streaming platform any organization can put at the heart of their business.
As part of the Infinite release, the company announces Infinite retention, a capability in Confluent Cloud that creates a centralized platform for all current and historic event streams with limitless storage and retention. Organizations can now democratize access to event data and ensure all events are stored and accessible as long as needed. By combining all relevant past and current event data, organizations can build richer application experiences and make more informed data-driven decisions.
“Without the context of historical data, it’s difficult to take action on real-time events in an accurate, meaningful way,” said Jay Kreps, co-founder and CEO. “We’ve removed the limitations of storing and retaining events in Apache Kafka with infinite retention in Confluent Cloud. With event streaming as a business’s central nervous system, applications can pull from an unlimited source of past and present data to quickly become smarter, faster, and more precise.”
Contextually rich, personalized applications are in high demand as digital experiences have replaced in-person interactions during the pandemic. In order to build these sophisticated touch points, applications need input data on what is happening right now and how that relates to what happened in the past. This is challenging and costly for existing data architectures, especially when new events move through organizations at gigabyte-per-second scale. And due to high storage costs and complexities in data balancing, events are typically retained in Apache Kafka for only seven days. This limits event streaming use cases, like year-over-year analysis and predictive ML, and is often not long enough to meet compliance requirements.
“Although Apache Kafka is widely used for event streaming, many limitations still exist because of high infrastructure costs associated with storing data for longer periods of time,” said Dave Menninger, SVP and research director, Ventana Research. “Being able to extend from days or weeks of retention to several years with less operational overhead greatly increases the value event streaming brings to any organization.”
Introducing fully managed event streaming service with unlimited storage and Infinite retention
With the Infinite retention capability in Confluent Cloud, the company solves the technical and economic strains put on organizations by the rapidly growing volume of real-time event streams. Organizations can quickly and cost-effectively establish a central source of truth for all events across their entire ecosystem, unlocking more use cases for pervasive event streaming and mitigating the rising cost of Kafka storage.
Implement event streaming as central nervous system for all real-time data
In traditional data architectures, silos exist between storage systems that record past data and messaging services that process future events. On top of that, there are hundreds of in-house systems, SaaS applications, and micro-services linked together by point-to-point connections that create huge operational burdens. With infinite retention in Confluent Cloud, organizations can build one central nervous system where all events flow through and can be stored. Event streaming can become a single source of truth for all other systems, making it easy to scale and ensure data integrity across an entire business.
Do more with streaming applications
Within Kafka, compute and storage are tightly interlocked, making it difficult to retain high volumes of data while efficiently scaling storage as traffic grows. Infinite retention decouples compute and storage and also automates scaling so storage instantly grows based on traffic. Without storage limitations, organizations are able to leverage event streaming for more use cases, like providing a persistent log of all events for compliance audits that require several years of data. Infinite retention also makes it possible to train ML models to make real-time predictions based on a historical stream of data. With more data to draw from, infinite retention can also improve the accuracy and intelligence of existing event streaming use cases like recommendation engines and customer 360 analytics.
Reduce storage costs and billing complexities
To meet storage demands and avoid downtime, businesses often overprovision clusters and end up overpaying for more infrastructure and compute than is needed. Infinite retention provides elastically scalable storage that automatically grows with your traffic, and with the benefits of Confluent Cloud, organizations only pay for data that is retained rather than what is pre-provisioned. The high storage costs that traditionally come with retaining massive amounts of data are no longer a barrier to achieving pervasive event streaming.
Infinite retention is available in July for Confluent Cloud customers using AWS with rollout to additional cloud providers planned for this year.
Store infinite amounts of data on self-managed Kafka with Tiered Storage in Confluent Platform
Tiered Storage in Confluent Platform was built upon innovations in Confluent Cloud. Released as a preview earlier this year, it has the potential to reduce storage costs by up to 70% and enable use cases across financial services and retail that improve customer experience, meet regulatory requirements for data retention, and improve ML models.
The community has proposed KIP-405 to bring tiered storage support to Apache Kafka, and Confluent engineers are helping shape the design proposal based on their experience with the Tiered Storage preview in Confluent Platform.
Resources:
Blog: Project Metamorphosis Month 3: Infinite Storage in Confluent Cloud for Apache Kafka
Project Metamorphosis
Video: Introducing Project Metamorphosis from Confluent