QDS optimizes MapReduce to run on Amazon Web Services (AWS), Google Compute Engine (GCE) and Microsoft Azure, giving you the flexibility you need to succeed. Choose the cloud that’s right for you, knowing that QDS will make it simple, fast, cost-effective and secure to process your big data. Running MapReduce jobs on QDS gives data scientists the most granular control over ETL and data transformation.
As with all big data solutions supported by QDS, MapReduce clusters scale up and down automatically during query execution. And you don’t have to worry about starting or stopping Hadoop clusters. QDS does all the heavy lifting for you.
Read and write optimization for cloud storage dramatically improves query performance and the user experience while reducing processing costs. Plus, with advanced auto-scaling, you’ll pay only for the resources you actually use.
If you’re running on AWS, QDS lets you automatically incorporate Amazon spot instances that can be up to 90% less than the cost of on-demand instances.
With QDS’ pay-per-use pricing model, you pay only for the compute hours you actually use.
QDS gives you user interface options to match your use case. The web-based Workbench UI is suited for interactive analysis, and the SDKs and the REST API are ideal for programmatic access.
You can take care of your batch processing needs by scheduling your MapReduce jobs to run at periodic intervals.
You can inspect the logs and analyze the results of MapReduce jobs even after the cluster has shut down.
MapReduce is a core part of the Hadoop ecosystem and works well with large datasets for ETL and batch processing jobs.
With Qubole’s support for Hadoop MapReduce, you get the most granular control for your ETL processing needs. Take your unstructured data and transform it into structured data using custom-defined logic.
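To make the idea concrete, here is a minimal, self-contained sketch of the kind of custom mapper/reducer logic such a job might contain, written in plain Python in the Hadoop-streaming style. The access-log format, field names and the byte-count aggregation are illustrative assumptions, not a Qubole-specific API; a real job would read from stdin or cloud storage rather than an in-memory list.

```python
# Illustrative Hadoop-streaming-style mapper/reducer (hypothetical log format).
# Mapper: parse raw, unstructured access-log lines into (url, bytes) pairs.
# Reducer: aggregate total bytes per URL -- unstructured text in, structured rows out.
import re
from itertools import groupby

LOG_RE = re.compile(r'(\S+) \S+ \S+ \[.*?\] "GET (\S+) \S+" (\d{3}) (\d+)')

def mapper(lines):
    """Emit (url, bytes) for each well-formed, successful request; skip the rest."""
    for line in lines:
        m = LOG_RE.match(line)
        if m:
            _, url, status, nbytes = m.groups()
            if status == "200":
                yield url, int(nbytes)

def reducer(pairs):
    """Sum bytes per URL. Hadoop sorts by key between phases; we sort here to mimic that."""
    for url, group in groupby(sorted(pairs), key=lambda kv: kv[0]):
        yield url, sum(n for _, n in group)

if __name__ == "__main__":
    raw = [
        '10.0.0.1 - - [01/Jan/2024:00:00:01] "GET /index.html HTTP/1.1" 200 512',
        '10.0.0.2 - - [01/Jan/2024:00:00:02] "GET /index.html HTTP/1.1" 200 256',
        'corrupted line that the mapper silently skips',
        '10.0.0.3 - - [01/Jan/2024:00:00:03] "GET /about HTTP/1.1" 404 128',
    ]
    print(dict(reducer(mapper(raw))))  # -> {'/index.html': 768}
```

Packaged as streaming scripts or a Java MapReduce job, the same parse-then-aggregate pattern is what you would submit to a QDS Hadoop cluster.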
Use Qubole’s built-in scheduler and workflow capabilities to define a set of jobs that run on a recurring schedule. Qubole’s cluster lifecycle management automatically brings up clusters when the jobs start and shuts them down when all jobs are done. All logs and results are persisted, so you can still debug without a running cluster.
“With Pinterest’s current setup, Hadoop is a flexible service that’s adopted across the organization with minimal operational overhead. Pinterest has over 100 regular MapReduce users running over 2,000 jobs each day through QDS’ web interface, ad-hoc jobs and scheduled workflows.”
Pinterest Data Engineer
“DataXu needs high performance for its Big Data queries, and Qubole optimizes performance several ways including MapReduce split computations, and S3 I/O optimization.”
Vice President, Technology, DataXu
QDS gives you the freedom to work with Spark, Hadoop MapReduce, Presto, and Hive as part of one unified interface with unified metadata. Choose the right solution for the right workload rather than being locked into any single technology. MapReduce gives you the most control for transforming your data from unstructured to structured form. MapReduce is ideal for data engineers and developers who are comfortable with lower-level Hadoop functionality. In addition, Qubole offers services that allow for higher-level analysis, such as SQL querying with Hive and scripting analysis with Pig.
Qubole offers 2 weeks of QDS usage for free to explore MapReduce and other data engines. Users simply need to authenticate with SSO or enter their choice of cloud credentials to begin interacting with their data in their own cloud environment. Try MapReduce Today!
Qubole is a significantly more polished product than EMR. Data scientists can explore their data in S3, create tables and query those tables all via an easy-to-use web UI.
Qubole’s fantastic support has been key in our successful deployment. They continue to deliver new features and revisit the ones that we ask for.
Our goal at MediaMath was to take our existing industry-leading infrastructure to the next level by handling new, complex analytics tasks. Qubole has helped us achieve this goal with minimal risk.
Instead of worrying about provisioning clusters of machines or job flows or whatever, Qubole lets you focus on your data and your queries … The Qubole guys have been extremely helpful!
The service spins up users’ clusters only when a job is started, then automatically scales or contracts them based on the workload, and spins the servers down once the job is done.
Qubole’s Hadoop and Hive interfaces are vastly superior to the default CLIs, which scare business analysts and hinder meaningful analyses of the gaming logs that we collect. With Qubole, business analysts are self-sufficient in using a Big Data platform to meet their advanced analytic needs.
Online Gaming Company
Top-performing technologies in the data industry are definitely taking aim at democratizing data tools and bringing the power of data to smaller businesses. This is a major change in the data industry, and Qubole Data Service is a great example.
I’m very happy to be using Qubole in production. Qubole has saved me a lot of time, effort, and trouble in getting my data processing pipelines up and running. My data pipelines process Appnexus data in Amazon S3 which is then stored in Vertica. The engineering team understands the complexities and provided awesome support!
Real-time Ads Retargeting Startup
There’s a whole world of web companies, SMBs and other non-Facebooks or Yahoos that will want to use Hadoop but not want to run it in-house…offering a cloud service makes it easier for these users to get started with the platform and for Qubole to keep improving.
Qubole offers a big data ETL and exploration service through auto-scaling Hadoop clusters with a web user interface for data exploration and integration with various data sources. The service can do (nearly) everything EMR can do, and it goes further.
Big Data Republic
Simba knows Big Data access. Qubole knows Big Data. Qubole’s founders authored Apache Hive, built key parts of the Hadoop ecosystem and brought Apache HBase to Facebook.
“The integration of Tableau and Qubole makes it faster and easier for our customers to operationalize Big Data…lowers the resource barriers to deriving the benefits of Big Data because customers can deploy our joint solution seamlessly and cost effectively.”