Businesses make many decisions to deliver value to their customers and users. Because data is the key input to those decisions, a business must not only collect the right data but also be able to use it whenever a decision has to be made. A system that makes this possible is therefore crucial, and Bukalapak, one of Indonesia's e-commerce unicorns, addresses that need with a decentralized data platform. As an organization whose product has become a super app with millions of users every day, we need to move fast with data to make the right decisions at the right time. We used our migration from on-premise to the cloud as the moment to make this transformation. With data growing by tens of terabytes a day and thousands of queries executed per day, the migration was a big challenge in itself. We will share what we learned from managing the platform on-premise, which became the main reason a decentralized model was the solution for us. The changes were not only conceptual: other parts of the architecture also changed to make sure that we could provide the right service to our internal users.
On the ingestion side, we decided to replace Debezium with a homegrown CDC (Change Data Capture) system and used Apache Beam with Google Cloud Dataflow to serve the data lake to our internal users. Downstream, we decided to rely more on cloud services by using Qubole for batch processing. We will explain the reasons behind these decisions and the impact they had on our efficiency. On the presentation layer, we will show how we implemented the lambda architecture to provide both delayed and fresh data to our internal users. Finally, to better understand our users' experience, we built a comprehensive monitoring system that catches problems early and helps us investigate users' complaints and reports. Ultimately, the success of fundamental changes to a system is measured by the positive impact they bring to the organization. We will therefore explain some of the benefits the platform brought to Bukalapak, especially how it increased our productivity by letting us get the data we need whenever a decision has to be made.
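To make the lambda architecture idea concrete, here is a minimal sketch of a serving-layer merge that combines a delayed batch view with fresh change events. It is an illustration only, not Bukalapak's implementation: the names `ChangeEvent` and `merge_views`, and the assumption that every change carries a per-key increasing `version`, are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class ChangeEvent:
    """One captured change to a row (hypothetical shape)."""
    key: str
    value: Optional[dict]  # None marks a delete
    version: int           # assumed to increase with every change to the same key


def merge_views(
    batch_view: Dict[str, ChangeEvent],
    speed_events: Iterable[ChangeEvent],
) -> Dict[str, dict]:
    """Combine the delayed batch view with fresh events from the speed layer.

    For each key, the event with the highest version wins, so late or
    duplicated events from the speed layer cannot overwrite newer data.
    """
    latest = dict(batch_view)
    for event in speed_events:
        current = latest.get(event.key)
        if current is None or event.version > current.version:
            latest[event.key] = event
    # Drop keys whose latest change is a delete.
    return {k: e.value for k, e in latest.items() if e.value is not None}


batch = {
    "a": ChangeEvent("a", {"price": 10}, 1),
    "b": ChangeEvent("b", {"price": 5}, 1),
}
speed = [
    ChangeEvent("a", {"price": 12}, 2),  # newer than the batch row
    ChangeEvent("b", None, 3),           # delete
    ChangeEvent("c", {"price": 7}, 1),   # row not yet in the batch view
    ChangeEvent("a", {"price": 11}, 1),  # stale event, ignored
]
merged = merge_views(batch, speed)
```

In this sketch the batch view supplies the complete but delayed state, while the speed layer only has to hold the changes made since the last batch run.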