Qubole Democratizes Access to Real-Time Streaming Data with Qubole Pipelines Service

New solution allows enterprises to manage real-time data pipelines and lower operational costs through Qubole’s open data lake platform

SANTA CLARA, Calif., Aug. 18, 2020 - Qubole, the open data lake company, today launched Qubole Pipelines Service, a new offering that makes it much easier and faster for customers to build robust, scalable streaming data pipelines and capitalize on the rapid growth of real-time data in their businesses. With Qubole Pipelines Service, data teams can now build, test, deploy, monitor, and manage hundreds of streaming data pipelines from a single platform, resulting in increased productivity, greater innovation, and reduced operating costs.

By 2022, more than half of all new major business systems will adopt continuous intelligence systems that use real-time data in order to drive important business decisions, according to Gartner. As data-driven innovations such as AI/ML and the Internet of Things (IoT) increasingly drive competitive advantage for businesses, troves of streaming data are continuously generated from multiple internal and external sources. A rapidly growing number of organizations need a solution that makes it faster and easier for developers to unlock the value of streaming data, and Qubole Pipelines Service fully addresses this need.

With Qubole Pipelines Service, businesses can complement their existing data lake with advanced features that help them instantly capture streaming data from various sources, accelerate the development of streaming applications, and run highly reliable and observable production applications at the lowest cost, all in a managed environment on the public cloud of their choice. Salient new features include:

  • Accelerated Development Cycle: Numerous built-in connectors, a code generation wizard, a dry-run framework, and quick-start options help accelerate development lifecycles. A pipeline can be developed within minutes without writing a single line of code and deployed instantly.
  • Robust and Cost-Efficient Stream Processing Engine: Building on Apache Spark Structured Streaming, Qubole added several enhancements, including RocksDB state storage, direct writes, and memory pressure scheduling, for reliably building and deploying long-running streaming applications.
  • Comprehensive Operational Management: Qubole Pipelines Service includes a broad set of APIs and user interfaces for engineers to holistically manage the lifecycle of streaming applications and get continuous operational insights.
  • Data Management and Consistency: The new Pipelines Service uses Qubole’s ACID framework to efficiently compact small files in the background while allowing concurrent read/write operations, without impacting performance.

“Today, organizations continue to struggle with stream processing at scale, and we found that they needed a comprehensive solution that would tackle the most complex pain points around reliability, scalability, and cost efficiencies,” said Joydeep Sarma, CTO and co-founder at Qubole. “Through our extensive discussions with Qubole’s customer and partner ecosystems, we knew that solving these problems was a large undertaking. The arrival of Pipelines Service not only equips data teams and engineers with the most comprehensive solution to quickly build streaming data pipelines and analyze massive streams of data, but also underscores Qubole’s longstanding tradition of providing businesses with an open data lake platform for batch and streaming analytics.”

“MiQ has developed a series of predictive targeting solutions that allow our customers to gather large volumes of data about real-world events, generate insights, and then take appropriate actions—all in real-time,” said Rohit Srivastava, Engineering Manager at MiQ. “We use Qubole Pipelines Service to build these solutions because it provides our teams with the ease of use and scalability they need to quickly build and deploy the high-performance targeting applications our customers have come to expect.”

Qubole Pipelines Service is now available on AWS, Google Cloud, and Microsoft Azure. To learn more, visit here.

For more information about how Qubole simplifies machine learning, streaming analytics, and data exploration, visit Qubole.com.


Qubole is the open data lake platform for analytics and machine learning that large enterprises depend on to quickly harness the power of data and gain valuable business insights. Only Qubole provides a truly open platform that works with all major cloud providers and data processing engines. The company’s unified environment includes optimized versions of Spark, Presto, Hive, and Airflow, with intelligent automation that scales usage up or down to meet service-level needs and minimize cloud costs. Based in Santa Clara, Calif., Qubole has offices in New York City, San Francisco, London, Singapore, and Bangalore. For more information, visit us online.