Aju Tamang
Hive and Presto Clusters with Jupyter on AWS and Azure
Jupyter™ Notebook is one of the most popular IDEs among Python users. Traditionally, Jupyter users work with small or sampled datasets that do…
Eligibility Guidelines
How We Learned to Stop Data Wrangling and Love Machine Learning
Nexla is a data operations platform that focuses on enabling data movement between companies with security and scale. The platform is simple enough for the…
Connecting Jupyter with Remote Qubole Spark Cluster on AWS, MS Azure
Jupyter™ Notebook is one of the most popular IDEs among Python users. Traditionally, most Jupyter users work with small or sampled datasets that…
Lunch and Learn
Processing Hierarchical Data using Spark GraphX Pregel API
Today, distributed compute engines are the backbone of many analytic, batch, and streaming applications. Spark provides many advanced features (pivot, analytic window functions, etc.) out…
Data Platform Pricing Comparison
The Fun of Creating Apache Airflow as a Service
A while back we shared the post about Qubole choosing Apache Airflow as its workflow manager. Then last year there was a post about GAing…






