Selecting the right big data solution for your business is critical to ensuring a successful big data project.

Three considerations must all be taken into account: where data must be stored or migrated, how the big data solution scales, and how much time is required before a big data analytics project can begin.

With so many big data analytics solutions available, identifying the key differences between vendors is difficult. The chart below provides a side-by-side comparison of six major on-premises and cloud big data vendors.

 

| | Hortonworks | Cloudera | HDInsight | Databricks | Amazon EMR | Qubole |
|---|---|---|---|---|---|---|
| Deployment Model | On-Premises/Hosted | On-Premises/Hosted | Cloud | Cloud | Cloud | Cloud |
| Product Summary | 100% open source Hadoop | Open source Hadoop with proprietary management | Big Data Infrastructure as a Service in the Azure Cloud | Standalone Spark Service | Big Data Infrastructure as a Service in the AWS Cloud | Cross-platform Big Data Service with Unified Metadata |
| Autonomous | NO (limited insight) | NO (limited insight) | NO | NO | NO | YES (Alerts, Insights, Recommendations and Agents) |
| Out-of-the-box Data Processing Engines | Installation required | Installation required | MapReduce, Hive, Pig, Spark, HBase, Storm | Spark | MapReduce, Hive, Pig, HBase, Cascading, Impala, Spark, Presto | MapReduce, Hive, Pig, HBase, Cascading, Spark, Presto |
| Must Migrate Data To Platform | YES | YES | NO* | NO** | NO** | NO*** |
| Data Store | On-Premises | On-Premises | Azure | AWS | AWS | AWS, GCP, Azure |
| Setup | Manual | Manual | Automatic | Automatic | Automatic | Automatic |
| Management | Support and 3rd Party Consulting | Support and 3rd Party Consulting | No Big Data-specific Support | Full Management and Support | No Big Data-specific Support | Full Management and Support |
| Economic Structure | Software License and Support, Infrastructure Purchase and Personnel | Software License and Support, Infrastructure Purchase and Personnel | Elastic compute pricing | Elastic compute pricing | Elastic compute pricing | Elastic compute pricing |
| Scalability | Fixed Cluster | Fixed Cluster | Manual scaling, elastic, on-demand, no graceful downscaling | Manual scaling, elastic, on-demand, no graceful downscaling | Manual scaling, elastic, on-demand | Automatic, elastic, on-demand |
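
One way to put the comparison above to work is to encode a few of its rows as data and filter vendors against a project's requirements. The Python below is only a minimal sketch: the `Vendor` class, its field names, and the `shortlist` helper are hypothetical names chosen for illustration, the values are simplified from the Deployment Model, Must Migrate Data To Platform, Setup, and Scalability rows, and the footnoted caveats on the "NO" migration entries are not captured.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vendor:
    """A simplified view of one column of the comparison chart (hypothetical model)."""
    name: str
    deployment: str     # "on-premises" or "cloud"
    must_migrate: bool  # True where the chart says YES; footnote caveats are ignored
    setup: str          # "manual" or "automatic"
    scaling: str        # "fixed", "manual", or "automatic"


# Values transcribed from the chart, reduced to one keyword per cell.
VENDORS = [
    Vendor("Hortonworks", "on-premises", True, "manual", "fixed"),
    Vendor("Cloudera", "on-premises", True, "manual", "fixed"),
    Vendor("HDInsight", "cloud", False, "automatic", "manual"),
    Vendor("Databricks", "cloud", False, "automatic", "manual"),
    Vendor("Amazon EMR", "cloud", False, "automatic", "manual"),
    Vendor("Qubole", "cloud", False, "automatic", "automatic"),
]


def shortlist(vendors, **requirements):
    """Return the names of vendors whose fields equal every given requirement."""
    return [
        v.name
        for v in vendors
        if all(getattr(v, field) == value for field, value in requirements.items())
    ]


if __name__ == "__main__":
    # Cloud vendors that need no data migration, per the simplified rows.
    print(shortlist(VENDORS, deployment="cloud", must_migrate=False))
    # Vendors with a fixed cluster size.
    print(shortlist(VENDORS, scaling="fixed"))
```

A filter like this only narrows the field on the attributes it models. Rows such as Management and Economic Structure still need to be weighed by hand.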