Data Privacy and Integrity

Have a built-in multi-layer approach to protect the confidentiality, integrity, and availability of customer information.

Qubole provides data protection and secure access through encryption and role-based access control (RBAC). Qubole also integrates with leading cloud providers' IAM services and with AD and LDAP implementations, so that users access data with the same rights and privileges defined in those systems.

Granular and Efficient Updates and Deletes


Have efficient updates and deletes for data stored in cloud data lakes to comply with regulations covering the right to be forgotten and the right to erasure.

Qubole supports ACID transactions natively across multiple engines to help avoid lost updates, dirty reads, and stale reads, and to enforce application-specific integrity constraints. Data integrity is maintained when concurrent users read from and write to the data lake simultaneously. ACID transactions support the right to be forgotten and the right to erasure by ensuring that data in the data lake is current and that data requested for deletion is actually deleted.
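As a rough illustration of why atomic deletes matter for erasure requests, the sketch below uses Python's built-in sqlite3 module as a stand-in for a transactional engine. It is not Qubole's API; the table names and the erase_customer helper are hypothetical. The point is only that all rows for one customer are removed in a single transaction, so a failure partway through leaves no partial deletion visible.

```python
import sqlite3


def erase_customer(conn: sqlite3.Connection, customer_id: int) -> int:
    """Delete every row for one customer in a single transaction.

    Returns the number of rows removed. If any statement raises, the
    transaction is rolled back and nothing is deleted.
    """
    with conn:  # commits on success, rolls back on exception
        removed = conn.execute(
            "DELETE FROM events WHERE customer_id = ?", (customer_id,)
        ).rowcount
        removed += conn.execute(
            "DELETE FROM customers WHERE id = ?", (customer_id,)
        ).rowcount
    return removed


# Hypothetical sample data, held in memory for the illustration.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT);
    CREATE TABLE events (customer_id INTEGER, action TEXT);
    INSERT INTO customers VALUES (1, 'a@example.com'), (2, 'b@example.com');
    INSERT INTO events VALUES (1, 'login'), (1, 'purchase'), (2, 'login');
    """
)

removed = erase_customer(conn, 1)
remaining_events = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
remaining_customers = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
```

In a data lake the same guarantee has to hold across files and concurrent readers, which is what engine-level ACID support is meant to provide.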

Granular Data Access Controls


Provide granular data access controls and the ability to mask data with a single policy across multiple engines (Apache Spark, Presto, and Hive) running on multiple clouds.

Anonymize data based on row and column filtering. Apache Ranger with Qubole provides centralized security administration for all security-related tasks, along with fine-grained authorization for specific actions and operations, including data anonymization and custom masking. With Apache Ranger, authorization methods can be standardized across the underlying engines.
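To make the idea of row filtering and column masking concrete, here is a minimal sketch in plain Python. It does not use the Apache Ranger or Qubole APIs; the POLICY structure, the role name, and the mask_email helper are all hypothetical and exist only to show one policy being applied to query results regardless of which engine produced them.

```python
def mask_email(value):
    """Hide the local part of an email address and keep the domain."""
    _, _, domain = str(value).partition("@")
    return "***@" + domain if domain else "***"


# Hypothetical policy: one row filter and a set of column masks per role.
POLICY = {
    "analyst": {
        "row_filter": lambda row: row["region"] == "EU",
        "column_masks": {"email": mask_email},
    },
}


def apply_policy(rows, role):
    """Return only the rows the role may see, with masked columns."""
    policy = POLICY[role]
    masks = policy["column_masks"]
    visible = []
    for row in rows:
        if not policy["row_filter"](row):
            continue  # row-level filtering
        visible.append(
            {
                # column-level masking; unmasked columns pass through
                col: masks[col](val) if col in masks else val
                for col, val in row.items()
            }
        )
    return visible


rows = [
    {"name": "Ann", "email": "ann@example.com", "region": "EU"},
    {"name": "Bob", "email": "bob@example.com", "region": "US"},
]
result = apply_policy(rows, "analyst")
```

Here the analyst role sees only the EU row, and the email column is returned in masked form.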

Role-Based Access Control


Enable access controls at a minimum of three levels, from data ingest to data access: the infrastructure, platform, and data levels. Together these provide effective policy management.

Security teams can now leverage their cloud provider’s IAM, AD, or LDAP services to restrict a particular user’s Qubole access to particular compute resources.

Qubole’s built-in role-based access control (RBAC) capabilities restrict users’ access to specific platform artifacts such as clusters, notebooks, and dashboards. Access can be granted through predefined roles or through custom roles.
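The role model described above can be sketched as follows. This is a generic RBAC illustration, not Qubole's implementation; the role names, the user names, and the is_allowed function are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Role:
    name: str
    permissions: frozenset  # set of (resource_type, action) pairs


# Hypothetical roles: one read-only role and one custom cluster role.
ROLES = {
    "viewer": Role(
        "viewer",
        frozenset({("notebook", "read"), ("dashboard", "read")}),
    ),
    "cluster-admin": Role(
        "cluster-admin",
        frozenset({("cluster", "read"), ("cluster", "start"), ("cluster", "terminate")}),
    ),
}

# Hypothetical user-to-role assignments.
USER_ROLES = {
    "ana": ["viewer"],
    "raj": ["viewer", "cluster-admin"],
}


def is_allowed(user, resource_type, action):
    """Return True if any of the user's roles grants the action."""
    return any(
        (resource_type, action) in ROLES[role].permissions
        for role in USER_ROLES.get(user, [])
    )
```

With these assignments, is_allowed("raj", "cluster", "start") is True, while the same check for "ana" is False because the viewer role grants no cluster permissions. Users with no assigned roles are denied by default.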

We had very large datasets and had to run queries every day to populate smaller normalized tables so that we could analyze data over time. This required writing Python scripts, which can be time-consuming for our business analysts. Making our business users self-sufficient got us where we wanted to be. - Adam Rose, Head of Engineering - Adobe Advertising Cloud