Blog

 
 

5 Ways to Leverage Social Media Data For Your Business

  • By Jonathan Buckley
  • September 3, 2015
 

Big Data is transforming the business world. The ability to capture, manage and analyze massive volumes of unstructured data for insights that lead to competitive advantage is a game-changer for businesses large and small. With the explosion of social media, never-ending streams of data flowing in from Facebook, Twitter, Pinterest, and other social sites […]

 

Hadoop Happenings: Rethinking Enterprise Search

  • By Jonathan Buckley
  • September 1, 2015
 

Grab the latest news and commentary about Hadoop in one place with this week’s Hadoop Happenings. This week articles focused on big data’s growth across industries, data security and governance and improving enterprise search. See the full stories below. 1. Are you a data hoarder? Hadoop offers little choice InfoWorld.com- Data governance tools are being […]

 

Causes of Dirty Data and How to Combat Them

  • By Jonathan Buckley
  • August 27, 2015
 

By now, most businesses understand the appeal of using big data analytics. With big data, companies can improve their efficiency, increase productivity, and gain valuable insights that drive their work forward. Few will deny the important role big data now plays in organizations all over the world, but gaining those unique benefits requires having high […]

 

Hadoop Happenings: Spark won’t Die

  • By Jonathan Buckley
  • August 25, 2015
 

Grab the latest news and commentary about Hadoop in one place with this week’s Hadoop Happenings. This week posts focused on why Spark will continue to grow, case studies within HR and retail, and Hortonworks’ recent acquisition of Onyara. See the full stories below. 1. Hortonworks buys better Hadoop data flow management InfoWorld.com- Hortonworks has […]

 

Multi-tenant Job History Server for Ephemeral Hadoop and Spark Clusters

  • By Rohit Agarwal
 

Introduction Qubole Data Service (QDS) allows users to configure logical Hadoop and Spark clusters that are instantiated when required. These clusters auto-scale according to the workload and shut down automatically when there is a period of inactivity, resulting in substantial cost savings. This feature, however, presents an additional challenge for supporting and debugging logs. For […]

 

The Benefits of Decoupling Storage and Compute

  • By Jonathan Buckley
  • August 20, 2015
 

Big data has come to dominate advantages in nearly every type of business out there, and the need to gather and analyze enormous amounts of data has become extremely important. To make the most of big data, many companies are utilizing big data platforms capable of sorting all that information into actionable data. Such platforms […]

 

Infographic: 5 Crucial Considerations for Big Data Adoption

  • By Jonathan Buckley
 

Big data has the potential to enhance, evolve and drive business, but big data adoption must be carefully planned and executed in order to be effective. The graphic below highlights 5 crucial factors that all organizations should take into account before selecting a big data vendor. Interested in learning how the cloud can help you […]

 

Hadoop Happenings: SQL-on-Hadoop Evaluation

  • By Jonathan Buckley
  • August 18, 2015
 

Grab the latest news and commentary about Hadoop in this week’s Hadoop Happenings. This week LinkedIn open-sourced its Hadoop plugin, Pearson offered a SQL-on-Hadoop evaluation, and MapR’s Ted Dunning weighed in on open source projects that aren’t really open. Read the full stories below. 1. Hadoop: What is it and Why Does it Matter? SAS.com- […]

 

5 Best Practices for Big Data Project Management

  • By Jonathan Buckley
  • August 13, 2015
 

Big data has gone mainstream. The constant, exponential growth of volumes of structured and unstructured data has significantly increased the number of big data projects, especially over the last few years. Thanks to the increased availability of the open-source Hadoop analytics platform, and the growth of big data in the cloud services, big data’s barriers […]

 

SQL-On-Hadoop Evaluation by Pearson

  • By Nate Philip
 

This is a guest post written by Sumit Arora, Lead Big Data Architect at Pearson, and Asgar Ali, Senior Architect at Happiest Minds Technologies Pvt. Ltd. About Pearson Pearson is the world’s leading learning company, with 40,000 employees in more than 70 countries working to help people of all ages to make measurable progress in […]

 

Hadoop Happenings: Navigating Hadoop’s Ecosystem

  • By Jonathan Buckley
  • August 11, 2015
 

Grab the latest news and commentary about Hadoop in one place with this week’s Hadoop Happenings. This week articles focused on navigating the complex Hadoop ecosystem and understanding the differences between Hadoop and Spark. See the full stories below. 1. Big data startup Platfora appoints Jason Zintak as CEO; founder Ben Werther steps down VentureBeat.com- […]

 

5 Factors That Impact the Performance of Your Big Data Project

  • By Jonathan Buckley
  • August 6, 2015
 

The drive to make the most out of big data is in full swing, with companies eagerly looking into big data analytics tools designed to get the most out of the valuable information they are collecting. The insights gained from proper analysis of big data can lead to big dividends later on, but getting to […]

 

Hadoop Happenings: Compliance and Dirty Data

  • By Jonathan Buckley
  • August 4, 2015
 

Grab the latest news and commentary about Hadoop in this week’s Hadoop Happenings. This week articles focused on meeting regulatory compliance, avoiding dirty data, and the role of bias in machine learning. See the full stories below. 1. Lack of Legacy Lets Capital One Build Nimble Infrastructure ThePlatform.net-Capital One has always relied on data analytics […]

 

Top Apache Spark Use Cases

  • By Jonathan Buckley
  • July 30, 2015
 

Apache Spark is quickly gaining steam both in the headlines and real-world adoption. UC Berkeley’s AMPLab developed Spark in 2009 and open sourced it in 2010. Since then, it has grown to become one of the largest open source communities in big data with over 200 contributors from more than 50 organizations. This open source […]

 

Qubole’s Big Data as a Service Platform Gains Rapid Traction in Mobile Data Applications

  • By Nate Philip
 

MOUNTAIN VIEW, Calif.—July 30, 2015—Qubole, the big data-as-a-service company founded by the team that developed Facebook’s data infrastructure, today reported rapid adoption of its self-service big data analytics platform for mobile applications in the first half of 2015. The Qubole big data as a service platform processes data stored on the three major public clouds: […]

 

Hadoop Happenings: Big Data Is Doing The Thinking For Us

  • By Jonathan Buckley
  • July 27, 2015
 

Grab the latest news and commentary about Hadoop in one place with this week’s Hadoop Happenings. This week’s articles focus on big data’s growing ability to think for us and the safety precautions every company should be taking. 1. What Big Data Strategists Can Learn From a Con Artist Forbes.com – Avoid getting lost in […]

 

6 Tips for Big Data Marketing

  • By Jonathan Buckley
  • July 23, 2015
 

Big data has ushered in the era of data-driven marketing. Massive volumes of data, streaming in at lightning speeds from a variety of channels, are rich with raw customer information containing valuable insights marketers can use to create more personalized, relevant and effective campaigns. McKinsey studies show that companies that factor data insights heavily […]

 

Hadoop Happenings: Best Practices

  • By Jonathan Buckley
  • July 21, 2015
 

Grab the latest news and commentary about Hadoop in one place with this week’s Hadoop Happenings. This week articles focused on best practices for securing and managing Hadoop. VMware also published a benchmark test of virtualized Hadoop. See the full stories below. 1. Your Checklist for getting started with Hadoop SAS.com- 8 items you need […]

 

The Future of Big Data

  • By Jonathan Buckley
  • July 16, 2015
 

Big Data is both revolutionary and evolutionary. The early promise that organizations could derive valuable insights through the analysis of massively large sets of unstructured data was seen as a potential game changer for how businesses operate and compete. However, early big data adoption was complex and costly, requiring large investments in hardware and teams […]

 

Presto-Amazon Kinesis Connector for Interactively Querying Streaming Data

  • By Sivaramakrishnan Narayanan
 

This content was authored by Qubole and originally published on the AWS Big Data Blog. Amazon Kinesis is a scalable and fully managed service for streaming large, distributed data sets. As applications (particularly on mobile and wearable devices) start to collect more and more data, Amazon Kinesis is becoming the starting point for data ingestion […]

 

Drag-n-Drop upgrades of Hadoop, Spark and Presto Clusters

  • By Mayank Ahuja
  • July 15, 2015
 

Introduction As the Big Data stack has matured, many companies have started using large clusters for running business critical applications. Workloads in such clusters are often long running (for hours or even days) and restarting a cluster poses a big problem: What happens to jobs that are already running? Restarting all these jobs wastes a […]

 

Hadoop Happenings: The March Continues

  • By Jonathan Buckley
  • July 14, 2015
 

Grab the latest news and commentary about Hadoop all in one place with this week’s Hadoop Happenings. This week, discussion continued on Apache Spark and the growing Hadoop ecosystem. An article from Forbes discussed the complexity of hiring a data scientist, and articles covered additional big data use cases. Read the full stories below. 1. […]

 

Hive JDBC Storage Handler

  • By Divyanshu Goyal
 

As a part of my summer internship project at Qubole, I worked on an open-source Hive JDBC storage handler (GitHub). This project helped me improve my knowledge of distributed systems and gave me exposure to working on a team on large projects. In many big data projects, integrating data from multiple sources is a common […]

 

NoSQL and Big Data: Is a NoSQL Database for You?

  • By Jonathan Buckley
  • July 9, 2015
 

Big data is getting bigger and more chaotic every day. Thanks to the Internet, social media, mobile devices and other technologies, massive volumes of varied and unstructured data—streaming in at unprecedented speeds—are bombarding today’s businesses both large and small. This explosion of data is proving to be too large and too complex for relational databases […]

 

Hadoop Happenings: A Better Hadoop Cluster

  • By Jonathan Buckley
  • July 7, 2015
 

Grab the latest news and commentary about Hadoop in one place with this week’s Hadoop Happenings. This week Gartner once again addressed the question, “What is Hadoop?” Several posts addressed big data challenges, and LANDR gained funding for its intelligent mastering engine. See the full stories below. 1. Now, What is Hadoop? Gartner.com- In […]

 

Announcing Saved Queries for Qubole Data Service

  • By Raghunandan Balachandran
  • July 2, 2015
 

We are always striving to add features to simplify the experience of our customers using Qubole Data Service (QDS). One of the major feature asks which has come up time and again is the ability to design queries and save them in a design time repository. This concept would allow separation of design time artifacts […]

 

Qubole Recognized as Advanced Technology Partner by Amazon Web Services

  • By Nate Philip
  • July 1, 2015
 

With Qubole on AWS, any size organization can become data-driven with self-service access to the latest big data technologies MOUNTAIN VIEW, Calif., July 1, 2015—Qubole, the big data-as-a-service company founded by the team that developed Facebook’s data infrastructure, today announced it is now an Amazon Web Services (AWS) Advanced Technology Partner. Qubole’s self-service platform for […]

 

Hadoop Happenings: Business Applications

  • By Jonathan Buckley
  • June 30, 2015
 

Grab the latest news and commentary about Hadoop all in one place with this week’s Hadoop Happenings. This week posts focused on industry use cases from industry management to HR. Read the full stories below. 1. comScore CTO shares big data lessons CIO.com- Mike Brown, CTO at comScore, shares some of the lessons he’s learned in […]

 

Hadoop is Hard! But Big Data Doesn’t Have To Be

  • By Jonathan Buckley
  • June 25, 2015
 

When it comes to big data analytics, Hadoop has been heralded as the all-in-one solution for the enterprise. And while the many benefits of Hadoop adoption tend to support all the praise, the reality is that organizations that attempt to manage Hadoop themselves quickly discover that doing so is flat out hard, if not impossible. […]

 

CUBE Keyword in Apache Hive

  • By Rajat Venkatesh
  • June 19, 2015
 

Introduction As part of a recent project, I had to experiment with CUBE functionality in Hive. This functionality was added somewhat recently to Hive (version 0.10) and is an advanced use case in Hive. Perhaps for these reasons, it is difficult to find examples other than the one in the Hive Wiki. In […]

 

Big Data Challenges: Why the Majority of Big Data Projects Fail

  • By Jonathan Buckley
  • June 18, 2015
 

To truly experience growth in the future, most businesses are turning to big data. In many cases, big data is seen as the new trend guaranteed to make companies more successful. Businesses frequently turn to big data solutions for special projects designed to integrate data into normal operations and open up new business opportunities. […]

 

Hadoop Happenings: Spark Rises

  • By Jonathan Buckley
  • June 16, 2015
 

Grab the latest news and commentary about Hadoop in one place with this week’s Hadoop Happenings. This week Apache Spark and other new open source projects came to the forefront. Meanwhile, use cases for Hadoop abounded at the recent Hadoop Summit. See the full stories below. 1. Companies Move on From Big Data Technology Hadoop […]

 

5 Tips for Creating a Data-Driven Culture

  • By Jonathan Buckley
  • June 11, 2015
 

The way businesses operate is rapidly changing every single day. Perhaps no example better illustrates this than the rapid growth and adoption of big data solutions. To say many companies are seeking to become more data-driven would be an understatement. Right now, organizations are working hard to utilize new business tools intended for the […]

 

Hadoop Happenings: Success Stories

  • By Jonathan Buckley
  • June 9, 2015
 

Grab all the latest news and commentary about Hadoop in one place with this week’s Hadoop Happenings. This week a new real-time engine for Hadoop was released, discussion continued on Hadoop’s growth, and commentators discussed several big data success stories. Get the full stories below. 1. Can Hortonworks Dominate the Hadoop Market? Forbes.com- CEO of Hortonworks […]

 

Rebalancing Hadoop Clusters for Higher Spot Utilization

  • By Hariharan Iyer
 

Running Hadoop clusters efficiently is an important customer use case at Qubole. When running in AWS, this often means using Spot instances efficiently. In this post we introduce the notion of Rebalancing Hadoop clusters to achieve a higher mix of Spot instances – while still maintaining reliability and meeting SLAs. Spot Instances At Qubole, many […]

 

Apache Hadoop 2.6.0 Now Generally Available on Qubole

  • By Xing Quan
  • June 4, 2015
 

We’re excited to announce that Apache Hadoop 2.6.0, the latest stable release* of Apache Hadoop, is now generally available on Qubole. Hadoop 2.6.0 is compatible with all of the usual services that Qubole offers, including Spark, Hive, Pig, and MapReduce. In addition, the optimizations that we’ve made for operating in the cloud, such as auto-scaling […]

 

How to Choose a Big Data-as-a-Service Company

  • By Jonathan Buckley
 

The world of big data is all around us. Transactions, sensors, social media, mobile devices, wearables, and a host of other sources are generating datasets of unprecedented volume, velocity and variety. This big data explosion presents enormous opportunities for organizations that are able to capture, manage, and analyze massive volumes of disparate data for insights […]

 

Hadoop Happenings: Semantic Data Lake

  • By Jonathan Buckley
  • June 2, 2015
 

Grab all of the latest news and commentary about Hadoop in this week’s Hadoop Happenings. This week commentary continued on how and why Hadoop will move into the mainstream. The semantic data lake was introduced, and a new list of the most influential people in big data was released. See the full stories below. 1. […]

 

Choosing the Right Infrastructure: The Key to Success With Big Data

  • By Jonathan Buckley
  • May 29, 2015
 

The benefits of big data analytics are no longer debatable. Businesses large and small are enjoying greater profitability and competitive advantage through the capture, management, and analysis of vast volumes of unstructured data. The main debate with big data now is whether an on-premise big data analytics infrastructure offers the flexibility needed to be successful […]

 

Hadoop Happenings: It’s Still Complicated

  • By Jonathan Buckley
  • May 26, 2015
 

Grab all the latest news and commentary about Hadoop in one place with this week’s Hadoop Happenings. This week vendors and analysts pushed back against a Gartner study indicating that Hadoop adoption is slowing. Apache Drill and Apache Hive were updated, and big data is taking off in the oil industry. Get the full stories […]

 

5 Reasons Savvy New Gen Companies Turn to the Cloud for Big Data

  • By Jonathan Buckley
  • May 21, 2015
 

Of all the current trends in technology, few have created as much buzz as cloud computing and big data. While both grew in popularity, it only stands to reason that they would eventually cross paths. This is exactly what has happened in recent years as the number of cloud services based around big data analytics […]

 

Hadoop Happenings: Hadoop Growth Slowing?

  • By Jonathan Buckley
  • May 19, 2015
 

Grab all the latest news and commentary about Hadoop in this week’s Hadoop Happenings. Gartner’s latest survey indicating slow growth in demand for Hadoop garnered extensive media attention this week. Commentators pointed out that the high opportunity cost for deploying Hadoop can be overcome by Hadoop-as-a-Service solutions, and others dismissed the concerns altogether. See the […]

 

Hadoop Happenings: ORC, Spark and Flink

  • By Jonathan Buckley
  • May 12, 2015
 

Grab all the latest news and commentary about Hadoop in this week’s Hadoop Happenings. This week Apache ORC became a top-level project. Commentary continued on Apache Spark and Apache Flink, and Forbes discussed whether big data will have an impact on next year’s presidential election. See all the stories below. 1. Apache ORC Launches as […]

 

7 Big Data Security Concerns

  • By Jonathan Buckley
  • May 7, 2015
 

Big data is more than just some trending business phrase that’s big on style and low on substance; it brings with it tangible benefits for any company willing to use it. The advantages of leveraging big data are real and oftentimes far-reaching, which is why so many organizations have adopted big data for their own […]

 

Announcing Qubole’s HBase-as-a-Service for AWS

  • By Jonathan Buckley
  • May 6, 2015
 

Today we are pleased to announce the Beta offering of Qubole’s HBase-as-a-Service. QDS can now provide fully managed HBase 1.0.0 running on Hadoop 2.6.0 as a managed service on the AWS Cloud. Introduction to HBase Apache HBase is an integral part of the Apache Hadoop ecosystem. When fast reads and writes with high concurrency and […]

 

Hadoop Happenings: Actionable Big Data

  • By Jonathan Buckley
  • May 5, 2015
 

Grab the latest news and commentary about Hadoop and big data in this week’s Hadoop Happenings. This week focus turned to big data use cases and asking the right questions when performing data analytics. Apache Parquet was also announced as a top-level project. 1. Apache Parquet paves the way for better Hadoop data storage Infoworld.com- […]

 

Opportunities and Challenges of Big Data in Ad Tech

  • By Jonathan Buckley
  • April 30, 2015
 

Big data is transforming numerous industries around the world, so perhaps it shouldn’t come as a surprise that advertising technology is one of those recipients. While ad tech has been around for a while now, only in the past few years have companies latched onto the idea that big data can make online advertising that […]

 

Hadoop Happenings: ODP Fireworks

  • By Jonathan Buckley
  • April 28, 2015
 

Grab all of the latest news and commentary about Hadoop in one place in this week’s Hadoop Happenings. This week the debate continued over the Open Data Platform. SAP highlighted its support for Hadoop in the enterprise, and a data engineer from Shazam discussed why he chose Presto over Apache Hive. See the full stories […]

 

In Case You Missed It: Webinar – Getting to 1.5M Ads/sec

  • By Jonathan Buckley
  • April 23, 2015
 

At the end of March, Qubole, along with DataXu and Amazon Web Services, hosted a special webinar detailing DataXu’s work with big data and the special platforms provided by both Qubole and AWS. Speakers at this highly informative webinar included Scott Ward, a Solutions Architect at Amazon Web Services, Ashish Dubey, a Solutions Architect at […]

 

Hadoop Happenings: Positioning Battles

  • By Jonathan Buckley
  • April 21, 2015
 

Grab all of the latest news and commentary about Hadoop in this week’s Hadoop Happenings. This week vendors continued their positioning battle as Pepperdata got more funding, Hortonworks acquired a new partner, and commentators dismissed much of the messaging as a distraction. See all of the stories below. 1. Pepperdata Scores $15M for Hadoop […]

 

Connecting Offline and Online Data: A Powerful Tool for Marketers/Advertisers

  • By Jonathan Buckley
  • April 16, 2015
 

The rise of big data has ushered in a new era for marketers and advertisers: the era of data-driven marketing. With massive amounts of rich online data constantly flowing in from multiple sources, marketers can use analytics to gain insights about customer habits, behaviors and preferences that the analysis of offline data could never […]

 

Hadoop Happenings: Cloud Rises

  • By Jonathan Buckley
  • April 14, 2015
 

Grab the latest news and commentary about Hadoop in this week’s Hadoop Happenings. This week focus turned to the cloud as more vendors are seeking to offer cloud deployments. Apple and IBM also teamed up to offer more advanced digital health data, and Think Big released its Dashboard Engine for Hadoop. 1. Why Cybersecurity Needs […]

 

6 Big Data Mistakes Businesses Make

  • By Jonathan Buckley
  • April 9, 2015
 

Big Data is reaping big benefits for businesses. But the key to success with big data lies in doing it right. All too often businesses jump on the big data bandwagon with unclear strategies and unreasonable expectations for what big data can do. As a result, the full potential and value of big data are […]

 

Hadoop Happenings: Spark, Hadoop, Infrastructure

  • By Jonathan Buckley
  • April 7, 2015
 

Grab all of the latest news about Hadoop in this week’s Hadoop Happenings. This week discussion continued on Apache Spark with some arguing its hype could be a self-fulfilling prophecy and others claiming the debate shouldn’t be about the technology at all but rather the infrastructure. See the full debate and other commentary below. 1. […]

 

Big Data, Ad Tech, and Privacy Concerns: How The Digital Advertising Industry Can Boost Transparency

  • By Jonathan Buckley
  • April 2, 2015
 

Big data is a game changer for the digital advertising industry. Thanks to powerful big data tools, digital advertisers can analyze mountains of insight-rich data from multiple sources, enabling them to deliver online and mobile ads to consumers that are more personalized and targeted than ever before. This new era of data-driven marketing is […]

 

Hadoop Happenings: Velocity and Quality

  • By Jonathan Buckley
  • March 31, 2015
 

Grab the latest news and commentary about Hadoop in one place with this week’s Hadoop Happenings. This week conversation centered on businesses ensuring the quality of their data. Hotels.com provided an interesting use case, and Forbes published an article focused on the velocity of data. 1. Apple acquires big data analytics firm Acunu AppleInsider.com- Apple […]

 

Qubole Presents New Webinar: Getting to 1.5M Ads/sec

  • By Jonathan Buckley
  • March 27, 2015
 

Qubole, DataXu, and Amazon Web Services are set to present a new webinar on March 30, 2015 at 10 a.m. PDT / 1 p.m. EDT. The webinar is titled “Getting to 1.5M Ads/sec.” Much of the focus of the upcoming webinar involves DataXu and how the company manages its big data. DataXu works […]

 

5 Tips to Attract and Retain Talent From the STEM Fields

  • By Jonathan Buckley
  • March 26, 2015
 

Every day more and more companies are turning to big data and analytics to become more competitive and profitable. The natural result of this growing trend is an increase in demand for talented individuals in the Science, Technology, Engineering, and Mathematics (STEM) fields. This presents a problem for companies in the Ad Tech, Internet, and […]

 

Hadoop Happenings: Growth, Jabs and Questions

  • By Jonathan Buckley
  • March 24, 2015
 

Grab the latest news and commentary about Hadoop in one place with this week’s Hadoop Happenings. This week discussion continued on the Hadoop vs. Spark debate. Security continues to be a top concern with enterprises adopting Hadoop, and big data adoption continues to grow. 1. Is Apache Spark going to replace Hadoop? Aptuz.com- This post […]

 

5 Tips To Communicate Your Tech Company’s Benefits To Potential Customers

  • By Jonathan Buckley
  • March 19, 2015
 

In 1999, NASA’s $125 million Mars Climate Orbiter was forever lost in space because one team of scientists measured in metric units and the other team used English (imperial) units. While this fundamental lack of communication on NASA’s part is laughable, the loss of potential customers due to an inability to effectively communicate the benefits of their […]

 

Qubole connects to Amazon Redshift

  • By Jonathan Buckley
  • March 18, 2015
 

A few weeks ago, we announced the addition of Apache Spark to the Qubole Data Service. This new capability has been received with tremendous excitement among customers who can now take advantage of Spark’s blazing fast speed. Today, we’ve extended the Qubole platform even further with a connector to Amazon Redshift. This makes data scientists’ […]

 

Hadoop Happenings: Apache Tajo Released

 

Grab the latest news and commentary about Hadoop all in one place in this week’s Hadoop Happenings. This week Apache Tajo was officially released for commercial use, commentary focused on the ever-changing Hadoop ecosystem, and discussion continued on Hadoop security. 1. Navigating the Hadoop Ecosystem Oreilly.com- An introductory overview of Field Guide to Hadoop, this […]

 

Bridging HDFS2 with HDFS1

  • By Rajat Jain
  • March 14, 2015
 

Industry is rapidly moving to adopt Hadoop 2.x. With every upgrade process — especially one that is so big in nature — there is a level of complexity involved. Qubole has already started offering a beta service to our customers. Our customers have started to try out Hadoop 2 as well, and as with any […]

 

Hadoop with Enhanced Networking on AWS

  • By Hariharan Iyer
  • March 13, 2015
 

Introduction At Qubole, many of our customers run their Hadoop clusters on AWS EC2 instances. Each of these instances is a Linux guest on a Xen hypervisor. Traditionally each guest’s network traffic goes through the hypervisor, which adds a little bit of overhead to the bandwidth. EC2 now supports Single Root I/O Virtualization (called Enhanced […]

 

Information and Insights: Big Data vs. Actionable Data

 

The era of data-driven business has arrived. Big data analytics tools are enabling organizations to capture, manage and mine mountains of raw chaotic data from multiple sources to gain insights that inform products, services and marketing strategies. The challenge is that not all big data insights are relevant and meaningful enough to spark real change […]

 

Hadoop Happenings: War of Words

 

Grab the latest news and commentary about Hadoop in one place with this week’s Hadoop Happenings. This week distributions waged a war of words as Cloudera dismissed its competitors, and Hortonworks defended the Open Data Platform. Apache Tajo was also declared ready for commercial use by the Apache Software Foundation. 1. The End Game for […]

 

Improving the Consumer Experience: How Media Companies are Using Big Data

 

Big data is transforming entire industries, and the media industry is no exception. Once media companies relied on traditional data to make educated guesses at what content consumers were looking for. Today they have mountains of rich data that reveals what consumers are doing, searching, consuming, tweeting, liking and sharing. Armed with this information, media […]

 

Plugging in Presto UDFs

  • By Sivaramakrishnan Narayanan
  • March 4, 2015
 

Presto is a great query engine for a variety of SQL workloads. We’ve been offering Presto-as-a-Service for many months now and a frequent question that comes up is: “How can I plug in custom user-defined functions in Presto?” In this blog post, we will answer this very question. We’ve created a Presto UDF Project in GitHub that […]

 

Hadoop Happenings: Market Rumblings

 

1. Making Sense of the ODP-Where Does Hadoop Go From Here? Datanami.com- Debate over the purpose of the Open Data Platform got heated, with some dismissing it as a distraction or a grasp at relevance while others claim it’s necessary to spur Hadoop innovation. Read More 2. Big Data Bits: Strata + Hadoop World Rewind CMSWire.com- […]

 
Read More..

Clickstream Data Analysis: A Powerful Tool for Your Business

Clickstream-data-analysis-thumb-v02
 

  Big data analytics platforms such as cloud-based Hadoop have become powerful tools for businesses looking to leverage vast sets of customer data for competitive advantage. But with so much rich data streaming in from multiple sources, the analytics challenge for many businesses is determining what types of data will yield the highest amounts of […]

 
Read More..

Hadoop Happenings: Strata + Hadoop 2015

hadoop-happenings
 

Grab the latest news and commentary about Hadoop in one place in this week’s Hadoop Happenings. From the controversial Open Data Platform to increasing support from Spark, coverage of the recent O’Reilly Strata + Hadoop World conference along with accompanying vendor announcements dominated the news this week. Hadoop vendor Cloudera also pressed pause on movement […]

 
Read More..

Qubole Adds Apache Spark to Hadoop-based Cloud Offering

 

One of the things customers love about Qubole is that they’re able to use the latest and greatest technologies without having to fiddle with deploying them on their own. Continuing this tradition, I’m pleased to announce that we’ve expanded our portfolio of services on the Qubole Data Services (QDS) platform to include Apache Spark. Data scientists […]

 
Read More..

Hadoop Happenings: Open Data Platform

hadoop-happenings
 

Grab the latest news and commentary about Hadoop in one place with this week’s Hadoop Happenings. This week several large companies formed a new alliance. Mortar Data was purchased by one of its customers, and the first hints at Cloudera’s IPO were released. 1. Cloudera appears to be preparing for an IPO in the next […]

 
Read More..

Using Big Data for Digital Decision Making

Using-Big-Data-for-Decision-Making3-thumb
 

  Technology is a non-stop rollercoaster ride of innovation, with devices becoming increasingly faster and smarter. Computers infused with artificial intelligence systems are now able to analyze more data, recognize patterns and make decisions in real-time like never before. This presents a number of new opportunities for many industries. Improved analytics and decision-making abilities allow […]

 
Read More..

2014 Was a Great Year for Qubole

 

Today we reported some impressive stats on our growth in 2014. In short, last year was a phenomenal year for the company. The amount of data our clients processed on Qubole in 2014 soared to 519 petabytes of data, compared to 34 petabytes in 2013. In fact, we’re now processing more than 100 petabytes per […]

 
Read More..

Hadoop Happenings: Vendor Shifts

hadoop-happenings
 

Grab the latest news and commentary about Hadoop in one place with this week’s Hadoop Happenings. This week Cloudera announced its acquisition of startup Xplain.io. Discussion continued on the future of Apache Spark, and rumors flew about Pivotal’s future in the big data arena. 1. Cloudera acquires self-service data-modeling startup Xplain.io Gigaom.com- Cloudera acquired Xplain.io, […]

 
Read More..

Qubole on Azure

  • By Swati Singhi
  • February 9, 2015
 

Qubole is the leading provider of Hadoop as a service. Our mission is to provide a simple, integrated, high-performance big data stack that businesses can use to derive actionable insights from their data sources quickly. Qubole Data Service (QDS) offers self-service and auto-scaling Hadoop in the cloud (patent pending) along with an integrated suite of data […]

 
Read More..

Reaping the Benefits of Real-Time Analytics

reaping-benefits-blog-banner-624x154
 

  Discussions surrounding big data often mention its three Vs: volume, variety and velocity. The most commonly discussed of those three is obviously volume, which isn’t surprising given the name big data. However, variety and velocity are just as important in the equation. In fact, velocity is too often overlooked. Companies are so focused on […]

 
Read More..

Hadoop Happenings: Spark Escalates

hadoop-happenings
 

Grab the latest news and commentary about Hadoop all in one place in this week’s Hadoop Happenings. This week focus turned to Apache Spark as it continues to gain interest as a faster, more flexible alternative to MapReduce. There was also discussion on the growing role of data scientists and whether the role should be […]

 
Read More..

360-Degree View of Customer: Seeing the Big Picture Through the Big Data Lens

360-Degree-Customer-View_small
 

  Poaching continues to be a significant problem throughout the world, specifically in Africa. Every year thousands of different animals are illegally hunted, many of which are endangered and on the brink of extinction. In an effort to fight this problem, scientists have gone to great lengths to better understand these creatures. They study them […]

 
Read More..

Qubole Uses Drone To Announce Expansion

 

Big Data as a Service leader Qubole recently moved to larger quarters in Mountain View, Calif. To mark the occasion, the company used a DJI Phantom 2 Vision+ drone to film the move. Qubole has joined a number of high technology firms using drones to get the word out. Qubole co-founder and CEO Ashish […]

 
Read More..

Hadoop Happenings: Apache Falcon Graduates

hadoop-happenings
 

Grab the latest news and commentary about Hadoop in this week’s Hadoop Happenings. This week the Apache Software Foundation announced Apache Falcon has graduated to a top-level project. Hortonworks’ distribution is now available on the Google Cloud Platform, and Netflix is open sourcing some of its analytics tools. 1. Netflix is open sourcing tools for […]

 
Read More..

Streamline Multi-Channel Marketing with Big Data

multi-channel-marketing
 

The growing impact of mobile devices is no secret to marketers. A 2013 report from the Winterberry Group found that from 2012 to 2013 spend on mobile search marketing doubled, and the cost-per-click is now higher on tablets than desktops. Of course, the true challenge isn’t mobile marketing but the consumer’s propensity to move from […]

 
Read More..

Qubole partner ecosystem continues to grow: Xcentium uses Qubole Data Service to enhance analytics for e-commerce companies

 

2015 is off to a great start for Qubole, with the addition of a new member to our partner ecosystem—Xcentium. Xcentium is a world-class digital services and technology provider to Fortune 1000 organizations. Its E-commerce & Digital Services Practice is using Qubole’s self-service big data platform to quickly build enterprise scale solutions for its e-commerce […]

 
Read More..

Hadoop Happenings: Security Concerns

hadoop-happenings
 

Grab all of the latest news and commentary about Hadoop in this week’s Hadoop Happenings. This week Gigaom continued a series on the rival Hadoop security projects. Cloudian became a Hortonworks technology partner, and Information Week covered Hortonworks’ dilemma of becoming a profitable company while offering solely open source software. 1. Big data upstart MapR […]

 
Read More..

Big Data and Customer Micro-Segmentation: Applications in Media and E-Commerce

customer-segmentation
 

Customer segmentation is nothing new to an experienced marketer. Traditional B2C segments based on demographic, psychographic and behavioral data are taught in introductory college courses, and B2B marketers are well familiar with segments based on company size or purchase criteria. While these segments have served as a useful guide for decades, the era of big […]

 
Read More..

Hadoop Happenings: A Re-framing of Hadoop

hadoop-happenings
 

  Grab all the latest news and commentary on Hadoop in this week’s Hadoop Happenings. This week a guest post on the Hortonworks blog discussed Apache Ranger, AutoDesk discussed its plans for Hadoop and the cloud, and several sites weighed in on how Hadoop will continue to transform as an enterprise solution. 1. Hadoop Security: […]

 
Read More..

Big Data: The Solution to Ad Fraud?

big-data-ad-fraud
 

Those in the online advertising industry have a grim reality to face: fraud is rampant and increasingly costly. Considering how important advertising revenue is for online companies, media sites, and publishers, the problem hasn’t received much attention until recently. Having advertisers paying for fake ad impressions represents a potential breach in trust as long as […]

 
Read More..

Hadoop Happenings: Revitalize a Brand

hadoop-happenings
 

Grab the latest news and commentary about Hadoop in this week’s Hadoop Happenings. This week Datameer published statistics on big data and Hadoop in the news. Timberland discussed how it used data science to revitalize its brand, and a post discussed Facebook’s development of HydraBase. 1. In 2015, enterprises can better utilize the paradigm-shifting Hadoop […]

 
Read More..

Finding Business Value in Sentiment Analysis Data

sentiment-analysis
 

The explosion of social media and the proliferation of mobile devices have created a “perfect storm” of opportunities for customers to express their feelings and attitudes about anything and everything at any time. This opinion or “sentiment” data, generated through social channels in the form of reviews, chats, shares, likes, tweets, etc., often includes comments that […]

 
Read More..

Hadoop Happenings: Hadoop Lives Up to Hype

hadoop-happenings
 

Grab all of the latest news and commentary about Hadoop in one place with this week’s Hadoop Happenings. This week focused on analysts’ projections for the Hadoop industry’s growth as well as the growing need for data scientists. Use cases for Hadoop in banking and telecommunications were also discussed. 1. Is Hadoop over-hyped? Market-watchers say […]

 
Read More..

Looking Forward: Hadoop Industry Trends

The-Internet-of-Things-and-Big-Data_small
 

  From its primitive beginnings as a modest open source search engine called “Nutch”, Hadoop has evolved into a powerful big data analytics platform. As big data technologies and policies rapidly advance, Hadoop is just getting started. In a recent article on Computerworld.com, writer Robert L. Mitchell interviews IT leaders, consultants and industry analysts to […]

 
Read More..

Hadoop Happenings: Predictions for Hadoop

hadoop-happenings
 

  Grab all of the latest news and commentary about Hadoop in one place with this week’s Hadoop Happenings. Predictions for 2015 continued this week with a focus on big data adoption and the simplification of Hadoop. Forrester released a report on the big data in the cloud industry, and MapR’s CEO discussed the possibility […]

 
Read More..

Re-using JVMs across Hadoop jobs

  • By Sivaramakrishnan Narayanan
  • December 22, 2014
 

One of the oft-discussed problems with Hadoop is that it launches new JVMs for each map or reduce task. Launching a new JVM and loading all the classes is pretty expensive and can take anywhere from 4 to 8 seconds. If the job is a small one, this startup overhead can be a substantial part of overall […]

 
Read More..

Not All Hadoop Distributions are Created Equal

hadoop-distributions
 

The debate is over. Big data analytics has proven benefits, and organizations looking to implement a big data solution now have a number of options to choose from. The challenge is selecting the right Hadoop vendor, as not all Hadoop distributions are created equal. To help you find the best fit, here are a […]

 
Read More..

Hadoop Happenings: New Round of Funding

hadoop-happenings
 

1. The 6 Things Everyone Needs to Know about the Big Data Economy SmartDataCollective.com- In this post Bernard Marr argues that big data is moving mainstream and discusses several elements of the big data economy. Read More 2. Altiscale Lands $30M To Continue Building Hadoop Cloud Service Techcrunch.com- Altiscale announced Series B funding led by […]

 
Read More..

Hadoop Happenings: Looking to 2015

hadoop-happenings
 

Grab all of the latest news and commentary about Hadoop in one place with this week’s Hadoop Happenings. This week there are many predictions for what Hadoop’s future holds. 1. The End of the Hadoop Bubble? Forbes.com- Hortonworks’ rush into a reduced IPO may be to grasp the interest of a potential buyer. There are […]

 
Read More..

Upcoming Webinar: Forrester Analyst Discusses Big Data in the Cloud

Big-Data-in-the-Cloud_small
 

Join us for a live webinar Dec. 10, 2014 at 10 am PST/1 pm EST hosted by Noel Yuhanna, principal analyst of enterprise architecture at Forrester Research, and Ashish Thusoo, co-founder and CEO of Qubole. The webinar will discuss how the cloud has helped companies keep up with the fast-changing technology landscape and discuss why big […]

 
Read More..

Hadoop Happenings: Apache Pig 0.14.0

hadoop-happenings
 

Grab all of the latest news and commentary about Hadoop in one place with this week’s Hadoop Happenings. This week a new version of Apache Pig was released. Forrester has a new report reviewing the Hadoop ecosystem, and LinkedIn provided details about its Gobblin big data framework. 1. Storage Hangout: Hadoop Plug-in Refresh Release and […]

 
Read More..

Hadoop Happenings: New Releases; Partnerships

hadoop-happenings
 

Grab all of the latest news and commentary about Hadoop in one place with this week’s Hadoop Happenings. This week Cloudera and MapR formed new partnerships. Splice Machine’s SQL on Hadoop database went on general release, and eHarmony discussed its future plans with Hadoop. 1. Why eHarmony is rebuilding itself atop Hadoop and (probably) OpenStack […]

 
Read More..
 
 
 


 
 
 
 

Featured Blogs