This Apache Airflow training course is designed to familiarize delegates with using Airflow to schedule and maintain numerous Extract, Transform and Load (ETL) processes running on a large-scale Enterprise Data Warehouse (EDW).
As Data Warehouses (DWs) increase in complexity, it is important to have a dependable, scalable, intuitive, and simple scheduling and management program to monitor the flow of data and track how transformations are completed. Apache Airflow, originally created at Airbnb to help manage the complexity of its EDW, is being widely adopted by tech companies for its ease of management, scalability, and elegant design. Airflow is rapidly becoming the go-to technology for companies scaling out large data warehouses.
The course begins with an introduction to Airflow, which includes a brief background and history and covers the Airflow framework, database, and User Interface (UI). Next, the course dives into Airflow development, including operators and plugins, Directed Acyclic Graphs (DAGs), and scheduling. The course concludes with a session on deploying with Airflow and complex task dependency management.
By attending this Apache Airflow workshop, delegates will learn to:
- Assess how to organize and arrange scheduling.
- Determine how to standardize Extract, Transform and Load (ETL) formats and processes.
- Integrate scheduling code into regular code flows.
This Apache Airflow class is ideal for DevOps engineers who want to monitor their enterprise data warehouses.