Qubole Data Service (QDS) makes Spark enterprise-ready, with Spark processing on the AWS Cloud and Google Cloud Platform. QDS provides flexibility by shortening time to deployment, making business users self-sufficient, and accelerating time to value.
Apache Spark can process data from a variety of data repositories, including the Hadoop Distributed File System (HDFS) and Amazon Simple Storage Service (S3). Spark supports in-memory processing to boost the performance of big data analytics applications and also supports disk-based processing.
Like all big data solutions supported by QDS, clusters scale up and down automatically during query execution. And you don't have to worry about starting or stopping Spark clusters. QDS does all the heavy lifting for you.
QDS makes it easy to debug both active and historical jobs with a Spark Application UI. Results and logs are always available even without active running clusters.
QDS supports a wide variety of Amazon EC2 instance types for your Spark workload, giving you the freedom to optimize instance selection for your workload requirements and AWS pricing options.
QDS lets you automatically incorporate Amazon spot instances, which can cost up to 90% less than on-demand instances.
With QDS' pay-per-use pricing model, you'll only pay for what you actually use by compute hour.
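To make the arithmetic behind spot savings and per-hour billing concrete, here is a minimal sketch. It is not a Qubole or AWS pricing calculator: the function name `cluster_cost`, the hourly rate of 0.50, and the cluster size are hypothetical values chosen for illustration, and the 90% discount is the upper bound quoted above rather than a typical figure. The sketch covers instance cost only and ignores any other charges.

```python
def cluster_cost(nodes, hours, on_demand_rate, spot_fraction=0.0, spot_discount=0.0):
    """Estimate instance cost for a cluster billed per node-hour.

    nodes:          number of instances in the cluster
    hours:          compute hours actually used
    on_demand_rate: hypothetical on-demand price per node-hour
    spot_fraction:  share of nodes run as spot instances (0.0 to 1.0)
    spot_discount:  fractional saving of spot vs. on-demand (0.9 means 90% less)
    """
    if not 0.0 <= spot_fraction <= 1.0:
        raise ValueError("spot_fraction must be between 0 and 1")
    if not 0.0 <= spot_discount <= 1.0:
        raise ValueError("spot_discount must be between 0 and 1")

    node_hours = nodes * hours
    spot_rate = on_demand_rate * (1.0 - spot_discount)

    on_demand_cost = node_hours * (1.0 - spot_fraction) * on_demand_rate
    spot_cost = node_hours * spot_fraction * spot_rate
    return on_demand_cost + spot_cost


# Hypothetical workload: 10 nodes for 4 hours at an assumed rate of 0.50 per node-hour.
all_on_demand = cluster_cost(nodes=10, hours=4, on_demand_rate=0.50)

# Same workload with half the nodes on spot at the best-case 90% discount.
half_spot = cluster_cost(nodes=10, hours=4, on_demand_rate=0.50,
                         spot_fraction=0.5, spot_discount=0.9)

print(f"all on-demand: {all_on_demand:.2f}")
print(f"half spot:     {half_spot:.2f}")
print(f"saving:        {1 - half_spot / all_on_demand:.0%}")
```

Because billing is by compute hour, the `hours` argument is the time the cluster actually ran, not the time it was provisioned for. Actual spot discounts vary by instance type and over time, so the result is best read as a lower bound on cost under the stated assumptions.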
QDS gives you user interface options to match your use case. The Spark Notebook and a web-based UI are suited for interactive analysis, and the SDKs and the REST API are ideal for programmatic access.
QDS supports Amazon VPC, a service that extends your private network into the cloud. Amazon VPC provides fine-grained access control both to and from Amazon EC2 instances in your virtual network. Plus, you can launch dedicated instances within a VPC on single-tenant hardware.
By allowing user programs to load data into a cluster's memory and query it repeatedly, Spark is well suited to running machine learning algorithms. In addition, Spark's MLlib provides common machine learning algorithms for classification, regression, clustering, collaborative filtering, and dimensionality reduction, as well as the underlying optimization primitives. SparkR lets you apply many additional algorithms. Using QDS for Spark, you can deliver machine learning applications that turn your data into actionable predictive intelligence, including recommendation engines, sentiment analysis, fraud detection, customer segmentation, and many other applications.
Spark's in-memory capabilities enable faster interactive exploration. This works well with a Spark Notebook, an interactive computational environment in which you can combine code execution, rich text, mathematics, plots, and rich media.
QDS preserves the Spark Notebook even when clusters are not in use.
Spark also supports additional use cases. Spark Streaming is useful for real-time processing of streaming data such as log files. Spark SQL supports relational data processing. And because Spark can cache data in memory, applications requiring fast SQL execution benefit from Spark's speed advantage over slower-running Hadoop MapReduce jobs.
In QDS, work with Spark, Hadoop MapReduce, Presto, and Hive as part of one unified interface with unified metadata. Choose the right solution for the right workload rather than being locked into any single technology. Use Spark for machine learning and other use cases that benefit from in-memory data and fast response time. Switch to Hive and MapReduce for batch workloads. Similarly, Presto is a proven scalable SQL engine for simple, interactive analysis at companies such as Facebook, Netflix, and Airbnb.
To help accelerate adoption of big data tools such as Spark running on the AWS cloud, Qubole is offering a promotion for commercial AWS users. AWS will cover two weeks of AWS usage for eligible proofs of concept.