Apache Airflow on Qubole

Apache Airflow is an open-source tool for programmatically authoring, scheduling, and monitoring data workflows. With Airflow, users author workflows as directed acyclic graphs (DAGs) of tasks. A DAG is the set of tasks needed to complete a pipeline, organized to reflect their relationships and interdependencies. Qubole provides Apache Airflow as a managed service that is easy to administer and scales with your data engineering needs.
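To make the DAG idea concrete, here is a minimal sketch in plain Python. It does not use the Airflow API; it relies only on the standard-library graphlib module, and the task names (extract, transform, validate, load) are hypothetical. It shows how declaring each task's upstream dependencies is enough to derive a valid run order, and how a cycle is rejected.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical pipeline: each key is a task, each value is the set of
# tasks that must finish before it can start.
pipeline = {
    "extract": set(),
    "transform": {"extract"},
    "validate": {"extract"},
    "load": {"transform", "validate"},
}


def execution_order(dag):
    """Return one valid run order; raises CycleError if the graph has a cycle."""
    return list(TopologicalSorter(dag).static_order())


def is_acyclic(dag):
    """Report whether the dependency graph is a valid DAG."""
    try:
        execution_order(dag)
        return True
    except CycleError:
        return False


order = execution_order(pipeline)
print(order)
```

In this sketch, extract always runs first and load always runs last, while transform and validate may run in either order because neither depends on the other. A scheduler such as Airflow uses the same kind of dependency information to decide which tasks are ready to run.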


Simplifying Data Engineering Pipeline Orchestration

Apache Airflow on Qubole improves performance, lowers cost, improves the user experience, and enables smooth administration of Airflow clusters. With Apache Airflow on Qubole, data engineers can set up Airflow clusters in seconds, use a single panel to track usage and manage clusters and ACLs, easily edit DAGs and plugins, and use the integrated “Goto QDS” button on the Qubole Operator to navigate to the corresponding Qubole command page.

Three key benefits of Apache Airflow on Qubole

Agility with Flexibility

Reduce the complexity of managing Airflow resources while retaining fine-grained control over cluster type and size, number of executors, Airflow version, security, and RBAC policies.

Integrated Monitoring and Alerting

Lower downtime and improve business continuity with an integrated monitoring service that monitors and automatically starts all relevant Airflow services. Prometheus with Grafana provides better resource monitoring and alerting to endpoints such as email, Slack, and PagerDuty.

Available On Multiple Clouds

Future-proof your solution with Airflow on Qubole. It is cloud-agnostic and generally available on AWS, Azure, and Google Cloud Platform.

Airflow on Qubole vs. Apache Airflow

Ease of use & debuggability

Airflow on Qubole | Apache Airflow
Versioning
Easy deployment of DAGs with DAG Explorer
Auto-sync DAGs from cloud storage
Git integration
Automated log backup and cleanup
Airflow on Anaconda Virtual Environment

Monitoring

Airflow on Qubole | Apache Airflow
Integrated Grafana- and Prometheus-based alerts and monitoring
Integrated monitoring service for improved availability
Monit dashboard for better visibility of service health
Monitoring (Ganglia, Datadog, etc.)

Security

Airflow on Qubole | Apache Airflow
Integrated RBAC for Airflow clusters
SSO with SAML 2.0 support
Data encryption (at rest and in motion)
Audit end-user activity logs
HIPAA, SOC 2 Type 2, and ISO 27001 compliant environments

Support and Services

Airflow on Qubole | Apache Airflow
24/7 support from our Airflow experts
Support for multiple versions of Airflow