As topics of conversation go, the terms “big data” and “Hadoop functionality” seem more appropriate for IT and CIOs than for CEOs and CFOs. Yet, choosing the right Hadoop provider for your business is every bit as much a business decision as it is a technical decision. After all, the ultimate goal of big data analytics is to obtain actionable insights and gain a competitive business advantage. To this end, here are the top 5 questions to ask, from a business perspective, when choosing between a cloud-based and an on-premise Hadoop provider.
Choosing the right analytics platform and provider really comes down to how to store, manage, and analyze massive amounts of data safely, effectively, and, above all, affordably. A traditional on-premise Hadoop platform tends to be quite expensive. After all, it is a physical platform requiring large numbers of servers, a large facility to house them, and large amounts of electricity to run them. Additionally, on-premise Hadoop platforms require on-site IT teams to make sure that everything runs smoothly. In contrast, cloud storage requires no expensive on-site hardware or support. In addition, companies that run Hadoop with a cloud provider have the benefit of purchasing access to a fully scalable storage and analytics platform while paying only for what they use.
On-premise platforms come with hard limits on storage capacity and performance because of their physical nature. As a company’s data demands increase, more physical servers must be added to the cluster, and this process can be time-consuming and costly. A cloud platform scales on demand, meaning that companies can access practically unlimited storage space as they need it. If needed, thousands of virtual servers can be spun up in the cloud in minutes. Here again, companies pay only for the space that they actually use to meet increased data demands.
With analytics platforms, productivity is a function of data accessibility. The drawback of on-premise platforms is that they limit how quickly and easily data can be accessed. With a cloud-based Hadoop platform, however, data can be accessed anytime and from anywhere with an Internet connection, including on smartphones and tablets. The result of this greater and faster access to data is increased productivity.
For organizations, the ability of individuals and teams to collaborate on projects in real time is a big advantage. But with on-premise Hadoop, this type of collaboration just isn’t possible. With Hadoop in a virtual environment, however, syncing can occur, so files being worked on by individual employees are automatically updated across all platforms. Then, regardless of size, these files can be shared among co-workers and teams, enabling full collaboration in real time.
Although this may sound like an IT question, the degree of corporate security and protection that a platform provides can have a direct effect on business. When it comes to security, on-premise Hadoop platforms have a well-deserved reputation for excelling in this area. After all, sensitive data can safely be kept behind the corporate firewall. In contrast, the idea of storing sensitive information offsite with a cloud provider can make business executives a bit nervous. However, today’s cloud service providers typically adhere to modern cloud security practices, such as built-in encryption, to protect data both during transfer and at rest.
When choosing between an on-site and a cloud-based Hadoop platform, both IT and business executives need to work together to make sure that the solution works best from both a technical and a business standpoint.