Blog

 
 

5 Big Data Infrastructure Implementations

  • By Dharmesh Desai
  • July 28, 2016
 

One of the great things about the big data industry is how willing practitioners are to share their knowledge, thinking process and experience. We love it when our customers talk about their implementations and it’s amazing to see what they’ve accomplished. Here’s a collection of some of our favorite blog posts: 1. Powering Big Data at Pinterest […]

 

Hadoop Happenings: Data Lake

  • By Ari Amster
  • July 26, 2016
 

Grab the latest news and commentary on Hadoop in this week’s Hadoop Happenings. This week articles discussed building a data lake, applications of big data in retail, and Workday’s acquisition of Platfora. Read the full stories below. 1. 9 Ways Retailers are Using Big Data and Hadoop Datanami.com- From identifying a path to purchase to […]

 

Up to 80% savings with AWS Spot Instances

  • By Dharmesh Desai
  • July 21, 2016
 

In a previous post, we outlined the case for selecting cloud infrastructure over an on-premises deployment for managing big data workloads. Taking advantage of Spot instances to realize substantial cost savings is one of the benefits of selecting the cloud. Spot instances are a feature of AWS consisting of spare EC2 instances offered at a […]

 

Hadoop Happenings: Cancer Cure?

  • By Ari Amster
  • July 19, 2016
 

Grab the latest news and commentary on Hadoop in this week’s Hadoop Happenings. This week, articles discussed applications of big data in the healthcare industry, the new partnership between Microsoft and Boeing, and the unique storage demands of IoT projects. Read the full stories below. 1. Democratize big data: How to bring order and accessibility […]

 

Optimize Queries with Materialized Views and Quark

  • By Rajat Venkatesh
  • July 14, 2016
 

This blog post explores how queries can be sped up by keeping optimized copies of the data. First we will explore the techniques and benchmark some sample results. Later, we talk about how one can use Quark (which we detailed in a previous post) to easily implement these performance optimizations in a Big Data analytics […]

 

Qubole and WANdisco Move Enterprises to the Cloud with Cloudera Migration Program

  • By Ari Amster
  • July 11, 2016
 

New partnership with WANdisco reduces cost and complexity, and eliminates downtime during cloud migration. MOUNTAIN VIEW, CALIF. – July 11, 2016 – Qubole, the big data-as-a-service company, today announced the launch of its Cloudera Migration Program to assist enterprises in expanding their use of big data by leveraging the advantages of the cloud. As part of the […]

 

Build or Buy: The Case for Cloud Infrastructure

  • By Ari Amster
  • July 7, 2016
 

Managing big data creates several challenges for data infrastructure teams:

  1. Managing “bursty” and unpredictable workloads
  2. Coordinating ad hoc and batch workloads
  3. Storing rapidly growing data stores that require the capability to scale quickly
  4. Integrating data generated at the edge
  5. Managing storage and compute costs

While an on-premises deployment has been […]

 

Hadoop Happenings: Big Data Intelligence

  • By Ari Amster
  • July 6, 2016
 

Grab the latest news and commentary on Hadoop in this week’s Hadoop Happenings. This week articles discussed the convergence of big data and artificial intelligence, applications of big data in healthcare, and the ethics of big data in higher education. Read the full stories below. 1. Hadoop Summit News: ecosystem order, and fragmentation Zdnet.com- A […]

 

Hadoop Happenings: HDFS Weaknesses

  • By Ari Amster
  • June 28, 2016
 

Grab the latest news and commentary on Hadoop in this week’s Hadoop Happenings. This week, articles discussed the weaknesses of HDFS, the ODPi announced that several distributors had adopted its runtime specifications, and Pepperdata offered a free health check for Hadoop users. 1. HDFS: Big data analytics’ weakest link InfoWorld.com- This post argues Hadoop’s distributed file […]

 

Quark: Control and Optimize SQL Across Hadoop and RDBMS

  • By Rajat Venkatesh
  • June 27, 2016
 

One of the important functions of a database administrator is to manage storage structures to optimize performance in a relational database. Admins use tables, views, indexes, and cubes to tune the database as well as control the behavior of users (e.g., discourage full table scans and cross joins). There are similar well-known techniques in the […]

 

Qubole Makes Key Hires to Leadership Team to Support Accelerating Market Demand

  • By Ari Amster
  • June 21, 2016
 

Company Appoints David Hsieh as Senior Vice President of Marketing and Ken Tamura as Vice President of Finance. MOUNTAIN VIEW, CA–(Marketwired – Jun 21, 2016) – Qubole, the big data-as-a-service company, today announced that it has made two new additions to its executive team in key leadership roles. David Hsieh has been appointed senior vice […]

 

Hadoop Happenings: Strategies for Success

  • By Ari Amster
 

Grab the latest news and commentary on Hadoop in this week’s Hadoop Happenings. This week articles discussed strategies for successful Hadoop adoption, overarching methodologies of Hadoop adoption, and implications of big data databases for backup and recovery. Read the full stories below. 1. Two Methodologies Drive Hadoop Enterprise Adoption Data-Informed.com- Many businesses adopting big data […]

 

RubiX: Fast Cache Access for Big Data Analytics on Cloud Storage

  • By Shubam Tagra
 

Qubole introduced first-generation Caching for S3 files in Presto in 2014 and documented the observed performance gains. In a nutshell: for CPU-efficient engines like Presto and Spark, caching remote files on local disk storage improves performance by removing bottlenecks in network IO. Our users also benefited from these performance gains, as this blog post from […]

 

Hadoop Happenings: Fragmentation

  • By Ari Amster
  • June 14, 2016
 

Grab the latest news and commentary on Hadoop in this week’s Hadoop Happenings. This week articles discussed the danger of fragmenting open source technology. Doug Cutting weighed in on the impact of the Internet of Things, and several posts addressed big data security concerns. See the full stories below. 1. Spark fragmentation undermined community Techcrunch.com- […]

 

The Numbers Don’t Lie: Apache Spark is on the Rise

  • By Ari Amster
  • June 10, 2016
 

Apache Spark remains a growing force in the realm of big data. Perhaps that shouldn’t come as a surprise considering the overall momentum behind big data analytics, but the growth in just the past few months has been nothing short of impressive. No doubt part of the reason behind that growth — besides a greater […]

 

Qubole’s HBase-as-a-Service is Generally Available on AWS

  • By Rajat Venkatesh
  • June 9, 2016
 

The HBase team at Qubole is happy to announce the general availability of QDS HBase-as-a-Service on AWS. Through the Beta program, QDS has helped administrators run HBase at scale in production with higher uptime and reliability while exploiting cloud elasticity for more agile deployments. In building our HBase offering, we worked closely with early customers […]

 

Hadoop Happenings: Big Data Benchmark

  • By Ari Amster
  • June 7, 2016
 

Grab the latest news and commentary on Hadoop in this week’s Hadoop Happenings. This week a new big data benchmark was released, several companies made announcements in preparation for the Spark Summit, and Splice Machine open-sourced its Hadoop-based RDBMS. Read the full stories below. 1. Big Data Benchmark Gauges Hadoop Platforms Datanami.com- The Transaction Processing […]

 

The Future of Deep Learning

  • By Ari Amster
  • June 2, 2016
 

Once an obscure academic topic, much as “big data” used to be, deep learning has evolved into one of tech’s most exciting and promising disciplines in the field of AI, all in just a few short years. And in light of recent breakthroughs, deep learning technology is literally positioning itself (no pun intended) to transform AI altogether. […]

 

Hadoop Happenings: Integrating R

  • By Ari Amster
  • May 31, 2016
 

Grab the latest news and commentary on Hadoop in this week’s Hadoop Happenings. This week, posts offered strategies for integrating R with Hadoop, tips for protecting data in HDFS, and advice on how to implement a big data strategy. Read the full stories below. 1. Integrating R with Apache Hadoop r-bloggers.com- This post explores three […]

 

Big Data and the Rise of Self-Service Analytics

  • By Ari Amster
  • May 27, 2016
 

In the beginning, analyzing massive datasets on open source Hadoop was a complex process best left to PhDs. But over the past few years that has dramatically changed. Today, cloud platforms, paired with powerful business intelligence tools, have ushered in the rise of self-service analytics, enabling data analysis power users—along with users lacking a technical […]

 

Hadoop Happenings: Hadoop 3.0

  • By Ari Amster
  • May 24, 2016
 

Grab the latest news and commentary on Hadoop in this week’s Hadoop Happenings. This week articles discussed Hadoop 3.0, how big data is being applied in the healthcare industry, and the importance of machine learning. See the full stories below. 1. How Spark and Hadoop are Advancing Cancer Research Datanami.com- Researchers are using big data […]

 

Qubole Meets BI Tools: 5 Machine Learning Libraries and their Big Data Use Cases

  • By Ari Amster
  • May 19, 2016
 

In an ongoing effort to extract more useful information and insights from massive volumes of structured and unstructured data, many organizations have turned to cloud-based Hadoop big data analytics solutions such as Qubole. And as effective as these solutions are at capturing and analyzing large data volumes, their ability to interact with powerful Business […]

 

Hadoop Happenings: Apache’s Wacky Recipe

  • By Ari Amster
  • May 17, 2016
 

Grab the latest news and commentary on Hadoop in this week’s Hadoop Happenings. This week articles focused on resources for mastering big data skills, compliance challenges, and Apache’s recipe for big data development. Read the full stories below. 1. Can IT keep up with big data? TechRepublic.com- Communication rifts between business users and IT are […]

 

Hadoop Happenings: HBase Plateaus

  • By Ari Amster
  • May 10, 2016
 

Grab the latest news and commentary on Hadoop in this week’s Hadoop Happenings. This week articles discussed the plateau of HBase adoption, common challenges of using Hadoop, and how businesses are overcoming the big data skills shortage. Read the full stories below. 1. HBase: The database big data left behind InfoWorld.com- HBase adoption has plateaued […]

 

Which Programming Language Should You Use For Your Big Data Project?

  • By Ari Amster
  • May 6, 2016
 

Big data projects are becoming much more common as organizations seek to take advantage of all that big data has to offer. While many companies are on board with the idea of implementing a big data project, properly executing one is another matter entirely. Many factors have to be considered, from what types of legacy […]

 

Hadoop Happenings: Strengthening Authentication

  • By Ari Amster
  • May 3, 2016
 

Grab the latest news and commentary on Hadoop in this week’s Hadoop Happenings. This week articles explored big data applications including strengthening authentication and predicting the NBA playoffs. Read the full stories below. 1. Rise of Artificial Intelligence and Machine Learning Bloomberg.com-Artificial Intelligence is gaining prominence in the software industry. Read More 2. Elephant in […]

 

The Big Data Lifecycle At TubeMogul

  • By Ari Amster
  • April 29, 2016
 

This post was written by Chris Chanyi, Senior Data Architect at TubeMogul. It originally appeared here. TubeMogul handles over a trillion HTTP requests a month. To understand how we handle this amount of data, it’s important to understand how we started. Read on for an in-depth look at our big data history. One of our […]

 

3 Major Challenges to Implementing Big Data

  • By Ari Amster
 

With all the hype, it’s little wonder that organizations are getting caught up by the idea of having their own big data initiatives. But as promising as that idea sounds, the reality is that over half of all big data projects never reach fruition. And when it comes to on-premises big data initiatives, the majority […]

 

Qubole and Looker Join Forces to Empower Business Users to Make Data-Driven Decisions

  • By Ari Amster
  • April 27, 2016
 

Qubole, the big data-as-a-service company, and Looker, the company that is powering data-driven businesses, today announced that they are integrating Looker’s business analytics with Qubole’s cloud-based big data platform, giving line of business users across organizations access to powerful, yet easy-to-use big data analytics. Business units face an uphill battle when it comes to gleaning […]

 

Qubole Extends Big Data-as-a-Service Platform with StreamX

  • By Ari Amster
  • April 26, 2016
 

Qubole, the big data-as-a-service company, today announced it has open sourced StreamX, an ingestion service to help data teams efficiently and reliably capture large-scale, real-time data. Qubole will be adding support for StreamX as a managed service on the Qubole Data Service (QDS) platform to simplify and automate the ingestion of data for big […]

 

Hadoop Happenings: Quality of Service

  • By Ari Amster
 

Grab the latest news and commentary on Hadoop in this week’s Hadoop Happenings. This week articles discussed developments in managing Hadoop, the release of Apache Storm, and the challenge of quality of service. Read the full stories below. 1. 3 Reasons Why In-Hadoop Analytics are a Big Deal Dataconomy.com- Improvements to the leading SQL-on-Hadoop technologies […]

 

Will Poor Data Management Cause Your Big Data Project to Fail?

  • By Ari Amster
  • April 22, 2016
 

Most organizations have grand visions when it comes to using big data. Needless to say, there’s been a lot of hype surrounding big data analytics, with a lot of emphasis placed on businesses starting their own big data projects. Perhaps your company is interested in a big data project or has already started one. While […]

 

Hadoop Happenings: Apache HAWQ

  • By Ari Amster
  • April 19, 2016
 

Grab the latest news and commentary on Hadoop in this week’s Hadoop Happenings. This week, articles covered big data use cases in construction, privacy concerns and updates to Apache HAWQ. Read the full stories below. 1. Hadoop versus Spark: Who’s winning? LinkedIn.com- Will Spark overthrow Hadoop? Or will the two continue to coexist? Read More […]

 

Is Your Big Data Initiative Scalable?

  • By Ari Amster
  • April 14, 2016
 

The benefits of big data in the enterprise are no longer in question. Thanks to Hadoop, organizations both large and small are finding real value in capturing, storing, and analyzing large volumes of unstructured data. However, as data volumes continue to rise at exponential rates, organizations looking to stay profitable and competitive must be able […]

 

Hadoop Happenings: Announcements and New Releases

  • By Ari Amster
  • April 12, 2016
 

Grab the latest news and commentary on Hadoop in this week’s Hadoop Happenings. This week LinkedIn released another Hadoop tool, Hortonworks made several announcements, and MarketShare shared what it has learned since its Hadoop deployment. See the full stories below. 1. MarketShare’s big data do-over: Hadoop deployment overhaul ZDNet.com- MarketShare initially tackled the problem of […]

 

Hadoop and the Data Warehouse: A Winning Combination for Your Business

  • By Ari Amster
  • April 7, 2016
 

This post was originally published in August 2014 and has since been updated. Once the subject of speculation, big data analytics has emerged as a powerful tool that businesses can use to manage, mine, and monetize vast stores of unstructured data for competitive advantage. As a result, the rate of adoption of Hadoop big data analytics […]

 

Qubole Open Sources Quark for SQL Virtualization

  • By Ari Amster
  • April 5, 2016
 

Qubole, the big data-as-a-service company, today announced that it has open sourced Quark, a cost-based SQL optimizer that helps to simplify and optimize access to data for data analysts. Traditionally, the data sets generated by data teams are aggregated and copied to multiple analytics systems to balance performance and cost, making it nearly impossible to […]

 

Hadoop Happenings: Cloudera Valuation

  • By Ari Amster
 

Grab the latest news and commentary on Hadoop in this week’s Hadoop Happenings. This week we’re fresh off of Strata + Hadoop World. It was also announced that Cloudera’s value has been slashed. Read the full stories below. 1. To SQL or NoSQL? That’s the database question Arstechnica.com- The line between SQL and NoSQL databases […]

 

The Growth of the Industrial Internet of Things

  • By Ari Amster
  • March 31, 2016
 

The Internet of Things (IoT) has become a popular topic of discussion as it represents the direction the world is likely headed. Imagining a world filled with connected devices, all communicating with one another, opens up so many intriguing possibilities that can change our lives in new and exciting ways. While the prospect of the […]

 

Hadoop Happenings: Runtime Specification

  • By Ari Amster
  • March 29, 2016
 

Grab the latest news and commentary in one place with this week’s Hadoop Happenings. This week the Open Data Platform regrouped and released its first project. Other articles discussed big data’s next steps to providing true business value. See the full stories below. 1. Newcomer Galactic Exchange can spin up a Hadoop cluster in five […]

 

Moving past infrastructure limitations

  • By Ari Amster
  • March 24, 2016
 

This is a guest post written by Rory Sawyer, Software Engineer at MediaMath. Here at MediaMath, we’re quite fond of data. It would be surprising to hear someone say they’re not fond of data, of course, but we’ve spent the last 18 months proving to ourselves and our clients that we really mean it. Our […]

 

Big Data Applications: Use Cases for Big Data

  • By Ari Amster
 

The lure of big data analytics is unmistakeable and strong, and with good reason. Businesses have quickly caught on to the numerous advantages big data can give them. The benefits and potential are tremendous, and companies are responding by freeing up more budget for big data endeavors. A survey from Gartner in 2014 indicated that […]

 

Hadoop Happenings: Simplifying Hadoop

  • By Ari Amster
  • March 22, 2016
 

Grab the latest news and commentary on Hadoop in this week’s Hadoop Happenings. This week articles discussed the complexity of managing Hadoop, from learning how to configure Hadoop to choosing between the plethora of open source management tools. See the full stories below. 1. Don’t Expect Your DBA to Do a Hadoop Expert’s Job Data-Informed.com- […]

 

Qubole Appoints its Head of Web Services Division

  • By athusoo
  • March 18, 2016
 

The appointment of Suresh Ramaswamy will help Qubole scale its multi-tenant SaaS platform and develop highly responsive big data platforms to cater to industry demands. Qubole, the big data-as-a-service company, today announced that it has appointed Suresh Ramaswamy as Qubole’s Head of Web Services. In this role, Suresh will help Qubole scale the web services […]

 

Big Data and Customer Service: A Guide to Call Center Analytics

  • By Ari Amster
  • March 17, 2016
 

In today’s ultra-competitive business world, mobile technology and social media have made the customer king. No longer is it enough for a company to have quality products and services. In order to truly stand out from the competition and build a solid reputation, companies need to provide quality customer service on a consistent basis. Fortunately, […]

 

Hadoop Happenings: LinkedIn Leveraging Big Data

  • By Ari Amster
  • March 14, 2016
 

Grab the latest news and commentary on Hadoop in this week’s Hadoop Happenings. This week, MapR updated Hadoop to provide persistent storage, a post explained how LinkedIn uses Hadoop to leverage Big Data, and MapR’s President and COO explained why he left software giant Oracle. Read the full stories below. 1. MapR invites Docker and Mesos to […]

 

Top Apache Spark Use Cases

  • By Ari Amster
  • March 10, 2016
 

This post was originally published in July 2015 and has since been expanded and updated. Apache Spark is quickly gaining steam both in the headlines and real-world adoption. UC Berkeley’s AMPLab developed Spark in 2009 and open sourced it in 2010. Since then, it has grown to become one of the largest open source communities […]

 

Qubole Appoints its First Chief Information Security Officer

  • By athusoo
 

Andrew Daniels brings more than 20 years of experience in enterprise security to address industry-specific needs. Qubole, the big data-as-a-service company, today announced that it has appointed Andrew Daniels as Qubole’s first chief information security officer (CISO) and vice president of security, compliance and privacy. As CISO, Daniels will focus on developing industry-leading security […]

 

Qubole Extends Customer Support with New Education Program

  • By Ari Amster
  • March 7, 2016
 

Qubole, the big data-as-a-service company, announced today it will be extending its customer support services with the launch of Qubole Education, an extensive resource to empower data users throughout an organization with the skills needed to successfully implement a cloud-based data project. Qubole’s cloud-agnostic big data platform allows users to implement the right data […]

 

Survey: State of Big Data Adoption

  • By Ari Amster
 

In a recent survey of 766 respondents, Qubole uncovered several insights about the state of big data adoption. Among those currently using a big data implementation, Big Data-as-a-Service users were 33% more likely to be satisfied with their big data projects. The survey also demonstrated significant growth of BDaaS adoption in the enterprise, and echoed […]

 

Applications of Business Intelligence in Banking and Finance

  • By Ari Amster
  • March 3, 2016
 

Technology is transforming the banking and finance industry. Thanks to the Internet and the proliferation of mobile devices and apps, today’s financial institutions face mounting competition, changing client demands, and the need for strict control and risk management in a highly dynamic market. At the same time, technology has given rise to powerful business intelligence […]

 

Hadoop Happenings: SQL-on-Hadoop Benchmark

  • By Ari Amster
  • March 1, 2016
 

Grab the latest news and commentary on Hadoop in this week’s Hadoop Happenings. This week Yahoo open-sourced CaffeOnSpark, a benchmark displayed the strengths of various SQL-on-Hadoop engines, and one CTO weighed in on how he’d use data science if he were running for president. Read the full stories below. 1. Yahoo open-sources CaffeOnSpark deep learning […]

 

Cashing in on the New Currency- 5 Ways to Monetize Data

  • By Ari Amster
  • February 24, 2016
 

“Show me the money.” That’s not just a line made famous in a Tom Cruise movie. It’s what the CEOs and CFOs of organizations that have bought into Big Data initiatives are now demanding of their IT departments—“Show us how you are deriving monetary value from our data.” It’s a valid request. After all, today’s […]

 

Hadoop Happenings: Apache Arrow

  • By Ari Amster
  • February 23, 2016
 

Grab the latest news and commentary on Hadoop in this week’s Hadoop Happenings. This week Google’s Dataproc was made generally available, Spark continues to piggyback on Hadoop, and a post from Teradata discussed several deployment models. See the full stories below. 1. Google launches Cloud Dataproc service out of beta VentureBeat.com- Google’s Cloud Dataproc, a […]

 

Qubole’s Marcy Campbell Honored with the Silicon Valley Business Journal’s Women of Influence Award

  • By Ari Amster
  • February 19, 2016
 

MOUNTAIN VIEW, CALIF. – Feb. 19, 2016 – Qubole, the big data-as-a-service company, is pleased to announce that Silicon Valley Business Journal is honoring Marcy Campbell, Qubole’s senior vice president of worldwide sales and business development, with its Women of Influence Award. The Women of Influence honorees exercise power and influence within their industry and […]

 

Qubole Donates Access to Big Data Cloud Platform for University Research

  • By Ari Amster
  • February 18, 2016
 

Students Will Be Able to Conduct Data Analysis on Any Size Data Sets Using the Latest Technologies Such as Apache Spark, Presto, Hive and Hadoop on Qubole’s Self-Service, Infinitely Scalable Cloud Platform. Qubole, the big data-as-a-service company, announced today it will be donating time on the Qubole Data Service (QDS) to university classes, giving […]

 

The Role of Machine Learning in Big Data

  • By Ari Amster
 

With businesses eagerly pursuing big data analytics, it only stands to reason that they’d look for the methods and strategies that will best help them get the most out of it. There are many ways to perform analytics, and each will change depending on the type of business and what insights organizations want to gain. […]

 

Open Source Integration of Airflow and Qubole

  • By Xing Quan
  • February 17, 2016
 

This post was written by Yogesh Garg and Sumit Maheshwari, who are Members of the Technical Staff at Qubole. We are pleased to announce that Qubole has open sourced an Airflow extension to connect with Qubole Data Service (QDS). Using this extension, our customers will be able to use Airflow for creation and management of […]

 

Hadoop Happenings: Gartner Re-Boot

  • By Ari Amster
  • February 16, 2016
 

Grab the latest news and commentary on Hadoop in this week’s Hadoop Happenings. This week Gartner re-worked its BI Magic Quadrant, surveys once again demonstrated the growth of Apache Spark, and Hortonworks proved it’s not in financial straits. Read the full stories below. 1. Back to Work Hadoopers, Hortonworks Is Fine CMSWire.com- Hortonworks’ secondary stock […]

 

5 Tips for Boosting Public Cloud Security

  • By Ari Amster
  • February 11, 2016
 

It’s a long-held belief that data stored on-premises is a lot more secure than data stored in the public cloud. However, that may not be the case. While cloud security concerns have been around as long as cloud computing has existed, cloud providers have gone to great lengths to address them, improving their […]

 

Hadoop Happenings: Warning Signs

  • By Ari Amster
  • February 9, 2016
 

Grab the latest news and commentary about Hadoop in this week’s Hadoop Happenings. This week articles discussed warning signs that a Hadoop cluster is underperforming, predictions for Hadoop’s second decade and how big data is used for humanitarian aid. Read the full stories below. 1. Why Most Business Intelligence Tools Fail the ‘Hadoop Test’ Information-Management.com- […]

 

Our own Swati Singhi at the Grace Hopper Celebration

  • By Xing Quan
  • February 8, 2016
 

Swati Singhi, a Member of the Technical Staff at Qubole, was recently featured as a speaker at the Grace Hopper Celebration of Women in Computing, held in Bangalore, India. The Grace Hopper Celebration is the world’s largest technical conference for women in computing, and it is designed to bring the research and career interests of […]

 

Optimizing S3 Bulk Listings for Performant Hive Queries

  • By Amogh Margoor
 

Introduction: We previously wrote about the optimizations we made to Hadoop and Hive on S3. Since then, we’ve applied those same changes across the rest of our Big Data analytics offerings, including Spark and Presto. Today, we’ll discuss some recent optimizations we’ve made to make querying of data more performant and efficient for […]

 

Infographic: Big Data Belongs in the Cloud

  • By Xing Quan
  • February 4, 2016
 

Big Data infrastructure is complex, difficult to build and operate, and often requires highly specialized talent to maintain. To alleviate these challenges, businesses are turning to the cloud to provide simplicity, flexibility and agility. The graphic below highlights Qubole customers’ leadership due to the ease of administration, scaling, lifecycles, flexibility, and costs. Qubole […]

 

CIO Focus 2016: Technology and Team Management

  • By Ari Amster
  • February 3, 2016
 

In today’s world of big data, information technology is advancing at unprecedented rates. This presents some major challenges for organizations in general, and CIOs in particular, as they search for ways to boost growth and profits in the face of mounting competition. Not long ago the terms “big data” and “competitive advantage” were dismissed as […]

 

Hadoop Happenings: Happy Birthday Hadoop!

  • By Ari Amster
  • February 2, 2016
 

Grab the latest news and commentary about Hadoop in this week’s Hadoop Happenings. This week industry thought leaders weighed in on Hadoop’s 10th birthday. Multiple posts addressed potential big data use cases including applications of GeoSpatial data. See the full stories below. 1. Hadoop turns 10, Big Data industry rolls along ZDNet.com- Hadoop’s founder Doug Cutting […]

 

Big Data’s Moment in the Cloud Has Been Acknowledged

  • By Xing Quan
  • January 29, 2016
 

We were delighted to see the announcement of the latest version of Cloudera Director, and a corresponding write-up on Curt Monash’s DBMS2 blog. The industry’s movement toward cloud-optimized features, such as support for Spot Instances and dynamic creation and termination of clusters, validates the direction that we’ve set for our company and product. Qubole’s […]

 

Cassandra vs. Hadoop: A Comparative Look

  • By Ari Amster
  • January 28, 2016
 

Technology is reshaping our world. The proliferation of mobile devices, the explosion of social media, and the rapid growth of cloud computing have given rise to a perfect storm that is flooding the world with data. The challenge for enterprises is that, according to Gartner estimates, 80 percent of this “big data” is unstructured, and […]

 
Read More..

Hadoop Happenings: Hadoop Just Getting Started

  • By Jonathan Buckley
  • January 26, 2016
hadoop-happenings
 

Grab the latest news and commentary about Hadoop in this week’s Hadoop Happenings. This week a new Hadoop survey was released, Qubole secured additional funding, and PCWorld covered why open source is the new normal. Read the full stories below. 1. Why open source is the ‘new normal’ for big data PCWorld.com- Talend’s CEO believes […]

 
Read More..

Building a Collaborative Team With Data Scientists, Business Analysts, and Developers

  • By Jonathan Buckley
  • January 21, 2016
624x154-building-collabrotive-team
 

This blog post originally appeared on the Import.io blog. Start the new year off right by making sure your Big Data team is aligned. It is the goal of many business leaders to effectively utilize big data analytics to improve their companies. That means having the best people on the job as part of a […]

 
Read More..

Qubole Closes $30 Million Investment to Extend Leadership in Big Data in the Cloud

  • By Jonathan Buckley
  • January 20, 2016
 

IVP leads Series C financing along with existing investors CRV, Lightspeed Venture Partners and Norwest Venture Partners. Qubole, the big data-as-a-service company, today announced that it has closed a $30 million Series C financing, bringing its total funding to $50 million. IVP led the financing and General Partner Somesh Dash will join the Qubole board […]

 
Read More..

Hadoop Happenings: Spark on the Rise

  • By Jonathan Buckley
  • January 19, 2016
hadoop-happenings
 

Grab the latest news and commentary about Hadoop in this week’s Hadoop Happenings. This week surveys indicated a growing interest in Spark deployment, and Datanami discussed the growing SQL on Hadoop market. Read the full stories below. 1. How Barclays is cashing in on big data & Hadoop to stay ahead in fintech CBROnline.com- Head […]

 
Read More..

Meetup: Machine Learning at Scale Using Spark and Hive

  • By Jonathan Buckley
  • January 14, 2016
624x154-oracle-qubole-presentation
 

A large crowd recently attended the Boulder/Denver Big Data Meetup group hosted by Oracle where experts from Qubole discussed their latest findings from a real world case study. The evening’s presentations were titled “Case Study: Machine Learning at Scale using Spark and Hive” and detailed practical ways businesses can implement machine learning techniques using the […]

 
Read More..

Hadoop Happenings: Cognitive Analytics in 2016

  • By Jonathan Buckley
  • January 12, 2016
hadoop-happenings
 

Grab the latest news and commentary on Hadoop in this week’s Hadoop Happenings. This week articles offered predictions on the future of big data and cognitive computing and discussed why Hortonworks’ stock value continues to falter. Read the full stories below. 1. 16 for ’16: What you must know about Hadoop Spark right now InfoWorld.com- […]

 
Read More..

Building Qubole: Metrics and Alerts

  • By Rajat Venkatesh
  • January 11, 2016
 

In this blog post, we’ll show you how we collect metrics and set up alerts to ensure the availability of Qubole Data Service (QDS). QDS Architecture: Before getting into the details about monitoring, we’ll give a quick introduction to the QDS architecture. QDS runs and manages Hadoop/Spark/Presto clusters in our customers’ AWS, GCP, […]

 
Read More..

5 Signs You’re Failing at Data Science

  • By Jonathan Buckley
  • January 7, 2016
624x154-five-signs-failing-data-science
 

Most businesses understand that big data analytics is where it’s at. They view data science as the one new thing they need to truly improve their operations and become even more successful as an organization. The problem, though, is that too many companies are failing at data science. One report from PricewaterhouseCoopers (PwC) and […]

 
Read More..

Hadoop Happenings: Data Governance

  • By Jonathan Buckley
  • January 5, 2016
hadoop-happenings
 

Grab the latest news and commentary about Hadoop in this week’s Hadoop Happenings. This week articles focused on data governance and getting the most out of a Hadoop deployment. InformationWeek offered its predictions for the coming year. See the full stories below. 1. Data governance process taxed by self-service BI, big data Techtarget.com- Data governance […]

 
Read More..

The Public Cloud Market Continues to Expand

  • By Jonathan Buckley
  • December 31, 2015
public cloud growth
 

Businesses are truly coming around to all that cloud computing has to offer. While the cloud has been around for years, only recently has it reached levels of popularity where it isn’t hyperbole to refer to it as a global phenomenon. This has led to many companies taking advantage of the public cloud’s many benefits, […]

 
Read More..

Hadoop Happenings: Looking to 2016

  • By Jonathan Buckley
  • December 29, 2015
hadoop-happenings
 

Grab the latest news and commentary about Hadoop in this week’s Hadoop Happenings. It was a short news week due to the holiday. This week stories covered the expected growth of big data analytics and the growing data science talent shortage. Read the full stories below. 1. The Top 3 Big Data Trends of 2016 […]

 
Read More..

Qubole Appoints Jonathan Trail as Vice President of Customer Success

  • By Jonathan Buckley
  • December 22, 2015
 

Qubole, the big data-as-a-service company, today announced that it has appointed Jonathan Trail as Qubole’s first Vice President of Customer Success. As VP of Customer Success, Trail will work closely with Jonathan Buckley, SVP of marketing, and Marcy Campbell, SVP of worldwide sales and business development. Together, they will work to continue the company’s […]

 
Read More..

Hadoop Happenings: New Performance Benchmark

  • By Jonathan Buckley
hadoop-happenings
 

Grab the latest news and commentary about Hadoop in this week’s Hadoop Happenings. This week the Transaction Processing Performance Council released a new performance benchmark, a podcast discussed big data in the election season, and an article explored how eBay uses big data. 1. Hadoop in Banking: Changing the Game PredictiveAnalyticsWorld.com- Big data has multiple […]

 
Read More..

Apache Spark vs. Hadoop: Which Big Data Framework is the Best Fit?

  • By Jonathan Buckley
  • December 17, 2015
spark vs hadoop
 

In the early days of big data, Apache Hadoop wasn’t just the “elephant in the room”, as some have called it. Hadoop was the room. But that is all changing as Hadoop moves over to make way for Apache Spark, a newer and more advanced big data tool from the Apache Software Foundation. There’s no […]

 
Read More..

Qubole Ignites Apache Spark on Google Cloud Platform

  • By Jonathan Buckley
 

Qubole, the big data-as-a-service company, today announced the availability of Apache Spark on Qubole Data Service (QDS) for Google Cloud Platform. The integration will enable Google Cloud Platform customers to use QDS’s 1-click persistent Spark Notebooks for fast data analysis, and auto-scale Spark clusters that deliver the right compute power for specific workloads. Qubole Data […]

 
Read More..

Getting started with Spark on QDS for Google Cloud Platform

  • By Ashish Sachdeva
 

Starting today, Qubole Data Service (QDS) users can launch Auto-scaling Spark Clusters and 1-click Persistent Notebooks to analyze data persisting in Google Cloud Storage. To set up a trial account, follow the instructions in our Google Cloud Platform Quick Start Guide. With auto-scaling, you no longer need to manually set the cluster size to achieve […]

 
Read More..

Hadoop Happenings: Apache Kylin

  • By Jonathan Buckley
  • December 15, 2015
hadoop-happenings
 

Grab the latest news and commentary about Hadoop in this week’s Hadoop Happenings. This week Apache Kylin was moved to top-level status, discussion continued on Apache Spark vs. Hadoop, and Forbes offered big data predictions for the coming year. See the full stories below. 1. CIO Explainer: What is Hadoop? WSJ.com- This post provides a brief […]

 
Read More..

Keeping Big Data Safe: Common Hadoop Security Issues and Best Practices

  • By Jonathan Buckley
  • December 10, 2015
624x154-keep-big-data-safe
 

The big data explosion has given rise to a host of information technology tools and capabilities that enable organizations to capture, manage and analyze large sets of structured and unstructured data for actionable insights and competitive advantage. But with this new technology comes the challenge of keeping sensitive information private and secure. Big data that […]

 
Read More..

Hadoop Happenings: Adoption Barriers

  • By Jonathan Buckley
  • December 8, 2015
hadoop-happenings
 

Grab the latest news and commentary about Hadoop in this week’s Hadoop Happenings. This week articles addressed Hadoop’s complexity and barriers to Hadoop adoption. See the full stories below. 1. Why Spark and Hadoop are Both Here to Stay ReadWrite.com- This post debunks common myths about Hadoop including that Spark will replace Hadoop. Read More […]

 
Read More..

Where’s the Value in Big Data—Storage or Apps?

  • By Jonathan Buckley
  • December 3, 2015
624x154-value-in-storage-or-apps
 

Big data has become a big industry. The lofty promise of big data analytics to deliver actionable insights and create competitive advantage is being realized. And organizations that once dismissed the idea of implementing a big data strategy are giving it a second look as they consider the benefits of capturing, managing and analyzing mountains […]

 
Read More..

Hadoop Happenings: Optimal Big Data Platform

  • By Jonathan Buckley
  • December 1, 2015
hadoop-happenings
 

Grab the latest news and commentary on Hadoop in this week’s Hadoop Happenings. This week articles discussed the managerial challenges of using Hadoop, overall job satisfaction among data scientists and survey data indicating continued growth in Hadoop adoption. See the full stories below. 1. Three Reasons Why I Love Hadoop, and You Should Too! SupplyChainshaman.com- […]

 
Read More..

Hadoop Happenings: Personalized Medicine

  • By Jonathan Buckley
  • November 24, 2015
hadoop-happenings
 

Grab the latest news and commentary about Hadoop in the latest Hadoop Happenings. This week articles explored various big data use cases from personalized medicine to mapping the waters around Antarctica. See the full stories below. 1. Spark or Hadoop: Which is the Best Big Data Framework? DataScienceCentral.com- This post discusses the key differences between […]

 
Read More..

The Main Types of Big Data Vendors: A Comparative Look

  • By Jonathan Buckley
  • November 19, 2015
624x154-main-types-of-big-data-vendors
 

The big data boom has given rise to a host of vendors, each promoting their own unique ways of meeting the growing data demands of today’s businesses. As a result, businesses seeking a big data solution have a fairly long list of big data vendors to choose from. Selecting the right vendor is both a […]

 
Read More..

Hadoop Happenings: Thick Data

  • By Jonathan Buckley
  • November 17, 2015
hadoop-happenings
 

Grab the latest news and commentary about Hadoop in this week’s Hadoop Happenings. This week a new study was released on the state of big data jobs, articles focused on boosting big data security, and one post promoted combining big data with thick data. See the full stories below. 1. Top 10 Priorities for a […]

 
Read More..

Share RDDs Across Jobs with Qubole’s Spark Job Server

  • By Rohit Agarwal
  • November 16, 2015
 

When we launched our Spark as a Service offering in February, we designed it to run production workloads. Users would write standalone Spark applications and run them via our UI or API. We then enhanced the offering by adding support for running these standalone Spark applications on a schedule using our scheduler or as part […]

 
Read More..

4 Tips For Breaking Down Data Silos

  • By Jonathan Buckley
  • November 12, 2015
624x154-breaking-silos
 

Companies are eager to use big data analytics to improve their business operations, but many have found that fully implementing the strategy is extremely difficult. Granted, big data can be complex, but many of the challenges businesses have encountered have nothing to do with big data itself. The real problem lies in the organizational structure […]

 
Read More..

Riding the Spotted Elephant

Riding-the-Spotted-Elephant
 

Introduction: One of the benefits of moving Hadoop workloads to the cloud is reducing cost and risk. No upfront capital expense on hardware is required and ongoing expenditure scales only in response to actual usage. This greatly lowers risk. Services like Qubole eliminate administration overhead as well. Amazon EC2 offers multiple instance purchasing options. […]

 
Read More..

Hadoop Happenings: Most Failing at Big Data

  • By Jonathan Buckley
  • November 10, 2015
hadoop-happenings
 

Grab the latest news and commentary about Hadoop in this week’s Hadoop Happenings. This week’s articles addressed the surging opportunities presented by big data technology coupled with the challenges of hiring big data talent and keeping big data secure. Read the full stories below. 1. Big Data Knowledge Base: Hadoop, Spark, Flink SparkBigData.com- This post provides […]

 
Read More..

Building Blocks of a Data-Driven Organization

  • By Jonathan Buckley
  • November 5, 2015
624x154-building-blocks-big-data
 

Organizations have seen the value that big data can add. It’s no mistake that so many businesses have chosen to adopt big data solutions in recent years, since the potential those solutions bring can be monumental. Success always seems right around the corner when using big data, but too often, success can be hard to […]

 
Read More..

Share Data Across Accounts with Data Exchange

  • By Xing Quan
  • November 4, 2015
 

This post was written by Vikram Agrawal and Aswin Anand, who are both lead engineers at Qubole. Qubole has the concept of users and accounts. While customers sign in as a single user, they can also belong to one or more accounts. This account segregation provides some nice logical separation for compute clusters and metadata. […]

 
Read More..

Hadoop Happenings: Applications Platform

  • By Jonathan Buckley
  • November 3, 2015
hadoop-happenings
 

Grab the latest news and commentary about Hadoop in this week’s Hadoop Happenings. This week’s articles focused on applications of Hadoop in energy and agriculture and the growth of Hadoop in the enterprise. Read the full stories below. 1. The Real Scoop on Hadoop SAS.com- Cloudera’s Mike Olson discusses the latest trends and changes to Hadoop. […]

 
Read More..

Introducing Hadoop, Spark, and Presto Clusters With Zero Local Disk Storage

  • By Sourabh Goyal
  • November 1, 2015
 

We’re excited to announce that Qubole can now run Hadoop, Spark, and Presto clusters with zero local disk storage. We now support AWS M4 and C4 instance types, which do not include local disk storage and instead utilize either S3 (for long-lived data) or EBS (network-attached disk storage for holding intermediate and temporary data) for […]

 
Read More..