Case Study

GAIA Case Study


Resilience, Reliability, and Availability

Due to years of band-aids and workarounds, Gaia's legacy technology architecture was complex and fragile. Outages were common and time-consuming. "If an overnight process failed, fixing it was what you did for the rest of the day," recalls data engineer Alex Mendoza. Even when overnight processes didn't fail, they sometimes ran long, extending into working hours, a product of limited compute power. This often resulted in a logjam effect that prevented users from accessing critical data.

Since Gaia implemented Qubole, it's a different story. Now, the system automatically scales up to complete processes, ensuring users always have access to the resources they need. And the system is stable and reliable, meaning major problems have largely become a thing of the past. "I can't remember the last time we all spent swarming a fire," says Mendoza. As for those rare occasions when problems do occur, improved data-validation practices, implemented in Qubole, make it easier to identify the root cause of the issue and to resolve it quickly.

"We've reduced the amount of time our engineers spend troubleshooting by at least a factor of three."
Patrick Lawlor, Product Data Analyst, Gaia

In addition to freeing engineers from the frustrating task of troubleshooting, Qubole relieves them of the burden of maintaining, patching, and upgrading a dedicated in-house infrastructure, and automates other administrative tasks. "Qubole takes care of that background heavy lifting," says senior data engineer Jami Amore, "so we can focus more on providing value to the business."

[Figure: Qubole Open Data Lake Platform architecture. Labels: Airflow DAGs, Scheduler, Ad-Hoc Queries, Workers, RDBMS, Data Lake, Billing System (Financials), Third Party Email System, Helpdesk Ticketing System, BI Solution.]
