White Papers

The Evolving Role of the Data Engineer

Issue link: https://www.qubole.com/resources/i/1243713


APPENDIX

Best Practices for Managing Resources

This report has laid out a dizzying platter of tools available for data engineering, but until now has avoided the question of what computer systems to run the tools on. As in any computing operation, you need physical resources in order to do ingestion and transformations. Nearly everyone now schedules resources through either containers or virtual machines (VMs), on-premises or in the cloud, because both allow easy scaling and the efficient exploitation of physical resources.

Containers and Virtual Machines

A container essentially runs a single application in an isolated environment, whereas a VM runs a whole operating system, hosting any applications you want to include. Proponents of VMs claim they are more secure than containers, although attacks against both have been recorded. Containers are more lightweight than VMs and can spin up faster.

Platform as a Service (PaaS) is another convenient cloud solution, providing an API on which you can run your programming functions. PaaS makes resource management particularly easy for the programmer, because the vendor handles all of the CPU and memory resources behind the scenes.
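To make the PaaS idea concrete, here is a minimal sketch of the kind of function a programmer might hand to such a platform. The name `handler`, the shape of the `event` dictionary, and the returned fields are all hypothetical and not taken from any particular vendor's API. The point is only that the code contains the transformation logic and nothing about CPU or memory allocation.

```python
import json


def handler(event: dict) -> dict:
    """Hypothetical entry point that a platform would call once per request.

    The function holds only transformation logic. Nothing here requests
    CPU or memory; in a PaaS setting the vendor would provision those.
    """
    # Assumed input shape: {"records": ["raw string", ...]}
    records = event.get("records", [])

    # A small ingestion-style cleanup: trim, lowercase, drop empty rows.
    cleaned = [r.strip().lower() for r in records if r.strip()]

    return {
        "status": 200,
        "body": json.dumps({"count": len(cleaned), "records": cleaned}),
    }


if __name__ == "__main__":
    # Local invocation for illustration; a platform would supply the event.
    print(handler({"records": [" Alpha ", "", "BETA"]}))
```

Because the function takes a plain dictionary and returns one, it can be exercised locally in the same way a platform would invoke it, which keeps the resource question entirely outside the code.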
