Modern distributed data platforms are expanding at a quadratic rate in both size and complexity. It is more important than ever to draw clear lines between compute and storage so that each can scale independently. Scalability matters more and more, both for handling the volume of a single data source and for adding collections of disparate data sources to the data catalogue. Ultimately, it is inevitable that data sources will be shared in both directions: from outside the platform into it, and from inside the platform out. In this presentation we will discuss the challenges of scaling storage and delve into the approach we have taken in The Walt Disney Company’s DTCI Data Platform to solve these complex issues in an easily maintainable, governable, auditable, and simple way.