Nauto Improves its Data Scientist Productivity, Accelerates Product Development

Core Business Problem

Data is at the heart of Nauto’s business. Its core activity is to detect real-time driver, vehicle and road information, then analyze it to provide actionable insights that alert a driver when risky behavior is present and in turn enable car fleet managers to identify and coach high-risk drivers.

To achieve this, Nauto must integrate as much information as possible and use it rapidly to develop new algorithms and product feature sets for its customers. The scale is significant: Nauto is deployed in more than 250 fleets worldwide and detects safety events ranging from hard braking and collisions to distracted driving, along with the relevant information around those events.

One of the challenges Nauto faces is managing its growing data volume. Lei Pan, Nauto’s director of engineering, explained: “As we started to scale, our data infrastructure was not ideal as it became very hard to manage our processing.” Nauto then turned to Qubole.

About Nauto

Nauto is a Silicon Valley-based AI-technology company improving the safety of commercial vehicle fleets today and the autonomous fleets of tomorrow. Its intelligent driver safety system – combining dual-facing cameras, computer vision, and proprietary algorithms – assesses how drivers interact with their vehicle and the road ahead, to reduce distracted driving, prevent collisions and save lives.

Single Qubole Platform Helps Improve Analysis, Inform Products and Algorithms

In a rapid three-month project, Nauto migrated to the scalable, cloud-based Qubole data platform. Qubole is accessed by Nauto’s team of DevOps and Data Engineers, as well as business users. Nauto uses tools such as Apache Spark, Presto, and Amazon SageMaker to cleanse and prepare datasets, run analytics, and build and deploy machine learning models. One key result of this integration has been the time Nauto saves when developing new products.

The biggest value of Qubole has been our ability to rapidly productize new data use cases.

Lei Pan, Director of Engineering, Cloud Infrastructure, Nauto

Better Productivity and Cost Control in Data Science

Qubole has helped Nauto’s Data Scientists become more productive. Previously, engineers spent many hours on data cleanup, maintenance, and management, including manual spreadsheet-based data processing to tune algorithms. Qubole has largely automated these processes and cut the typical time to provide actionable insights from 1-2 weeks to 2-3 days, a roughly five-fold improvement. This means Nauto’s Data Scientists can focus on ‘productionalizing’ algorithms and providing customers with valuable insights.
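The five-fold figure can be sanity-checked with simple arithmetic. The sketch below is illustrative only and assumes seven-day calendar weeks (the case study does not say whether it means calendar or working weeks); the function name `speedup_range` is hypothetical.

```python
DAYS_PER_WEEK = 7  # assumption: calendar weeks, not five-day working weeks


def speedup_range(before_days, after_days):
    """Return (smallest, largest) speedup for two (low, high) duration ranges in days."""
    before_low, before_high = before_days
    after_low, after_high = after_days
    smallest = before_low / after_high  # shortest old time vs. longest new time
    largest = before_high / after_low   # longest old time vs. shortest new time
    return smallest, largest


before = (1 * DAYS_PER_WEEK, 2 * DAYS_PER_WEEK)  # 1-2 weeks, in days
after = (2, 3)                                   # 2-3 days
low, high = speedup_range(before, after)
print(f"speedup between {low:.1f}x and {high:.1f}x")
```

Under that assumption the speedup falls between about 2.3x and 7x, so "five-fold" is best read as a representative value inside the range and not as an exact ratio.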

Using Presto on Qubole, Nauto’s engineers can also ‘productionalize’ algorithms much faster by testing them against all of its disparate customer data assets, rather than a subset. In addition, Qubole’s integration with Amazon SageMaker means Nauto can easily prepare data and define a model in Qubole, then automatically push it to SageMaker to train the model and make it available immediately. Nauto uses Qubole Airflow to orchestrate its data pipelines in Qubole and its model training in Amazon SageMaker.
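The ordering of that workflow can be sketched as a small dependency graph. This is a minimal illustration using only the Python standard library, not Nauto’s actual pipeline and not the Airflow or Qubole API; the step names are hypothetical, and a real Airflow DAG would express the same dependencies between operators.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical step names, mapped to the platform the case study says runs them.
PLATFORM = {
    "prepare_data": "Qubole",
    "define_model": "Qubole",
    "train_model": "Amazon SageMaker",
    "deploy_model": "Amazon SageMaker",
}

# Each step lists the steps that must finish before it can start.
DEPENDS_ON = {
    "define_model": {"prepare_data"},
    "train_model": {"define_model"},
    "deploy_model": {"train_model"},
}


def run_order(depends_on):
    """Return the steps in an order that respects every dependency."""
    return list(TopologicalSorter(depends_on).static_order())


order = run_order(DEPENDS_ON)
for step in order:
    print(f"{step} -> {PLATFORM[step]}")
```

Declaring dependencies and letting a scheduler derive the run order is the same idea an orchestrator applies: data preparation and model definition complete on one platform before training and deployment start on the other.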

Through Qubole, Nauto has full control of, and visibility into, its data infrastructure costs. It supports a growing user base of data scientists and data analysts without increasing the size of its data team. Qubole’s auto-scaling and other cost-control features enable Nauto to better plan its allocation of resources to support different use cases.

The savings from Qubole makes our data engineering team much more productive. Our data engineering team moved away from doing routine maintenance and management work to focusing on serving our customers’ needs and road safety.

Lei Pan, Director of Engineering, Cloud Infrastructure, Nauto

Looking Ahead

Nauto plans to complete its global rollout of Qubole during 2019 and automate the entire end-to-end process of training new customer algorithms.

Qubole’s integration with Amazon SageMaker streamlines data processing and machine learning.

Lei Pan, Director of Engineering, Cloud Infrastructure, Nauto

Business Value

Delivery of Integrated, Large-Scale Data

  • Ability to utilize one unified repository
  • Facilitates the use of industry-standard tools such as Apache Spark, Presto, and SageMaker

Faster Development of Better Products

  • Qubole’s holistic view of data means products are deployed faster, with better quality
  • Time to develop new products has been cut from one year to 2-3 months

Higher Data Engineer Productivity

  • Data cleanup, maintenance, and management tasks are automated
  • Time to provide actionable insights cut from 1-2 weeks to 2-3 days
  • No extra overhead
