Airflow

Concept

Architecture Overview

  • A scheduler, which handles both triggering scheduled workflows and submitting tasks to the executor to run.
  • An executor, which handles running tasks. In the default Airflow installation, this runs everything inside the scheduler, but most production-suitable executors actually push task execution out to workers.
  • A webserver, which presents a handy user interface to inspect, trigger and debug the behaviour of DAGs and tasks.
  • A folder of DAG files, read by the scheduler and executor (and any workers the executor has).
  • A metadata database, used by the scheduler, executor and webserver to store state.
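
How the components in the list above fit together can be sketched with a toy model in plain Python. This is not Airflow code: `ToyScheduler`, `ToyExecutor`, `webserver_view`, and the `etl` DAG are hypothetical names made up for illustration, and the executor runs tasks in-process, roughly like the default installation where everything runs inside the scheduler.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-ins: the "DAG folder" maps a dag_id to ordered
# (task_id, callable) pairs; the "metadata database" maps
# (dag_id, task_id) to a state string.
DagFolder = Dict[str, List[Tuple[str, Callable[[], object]]]]
MetadataDB = Dict[Tuple[str, str], str]


class ToyExecutor:
    """Runs each task in-process and records its state."""

    def __init__(self, metadata_db: MetadataDB) -> None:
        self.metadata_db = metadata_db

    def run(self, dag_id: str, task_id: str, fn: Callable[[], object]) -> None:
        key = (dag_id, task_id)
        self.metadata_db[key] = "running"
        try:
            fn()
            self.metadata_db[key] = "success"
        except Exception:
            self.metadata_db[key] = "failed"


class ToyScheduler:
    """Triggers a DAG and submits its tasks to the executor in order."""

    def __init__(self, dag_folder: DagFolder, executor: ToyExecutor,
                 metadata_db: MetadataDB) -> None:
        self.dag_folder = dag_folder
        self.executor = executor
        self.metadata_db = metadata_db

    def trigger(self, dag_id: str) -> None:
        for task_id, fn in self.dag_folder[dag_id]:
            self.metadata_db[(dag_id, task_id)] = "queued"
            self.executor.run(dag_id, task_id, fn)


def webserver_view(metadata_db: MetadataDB) -> Dict[str, str]:
    """What a UI would read: task states from the shared database."""
    return {f"{dag}.{task}": state
            for (dag, task), state in sorted(metadata_db.items())}


metadata_db: MetadataDB = {}
dag_folder: DagFolder = {
    "etl": [("extract", lambda: [1, 2, 3]), ("load", lambda: 1 / 0)],
}
scheduler = ToyScheduler(dag_folder, ToyExecutor(metadata_db), metadata_db)
scheduler.trigger("etl")
print(webserver_view(metadata_db))
```

The point of the sketch is only that the scheduler, executor, and UI never talk to each other directly about task state; they all read and write the same store. A production-style executor would hand `fn` to a separate worker instead of calling it inline.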

DAG

Configuration

Operator

Task

DAG documentation only supports Markdown so far, while task documentation supports plain text, Markdown, reStructuredText, JSON, and YAML.
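
As a rough illustration of the formats listed above, the sketch below prepares one documentation string per format, keyed by the task-level attribute that is assumed to accept it (`doc`, `doc_md`, `doc_rst`, `doc_json`, `doc_yaml`), plus a Markdown string for the DAG-level `doc_md`. The text content, the `nightly_etl` name, and `SomeOperator` are placeholders. The block itself uses only the standard library, so it runs without Airflow installed.

```python
import json

# DAG-level documentation: Markdown only.
dag_doc_md = "### Nightly ETL\nLoads yesterday's orders into the warehouse."

# Task-level documentation: one attribute per supported format.
# The content is illustrative; the attribute names are the part that matters.
task_docs = {
    "doc": "Loads yesterday's orders.",                    # plain text
    "doc_md": "**Loads** yesterday's orders.",             # Markdown
    "doc_rst": "Loads ``orders`` for yesterday.",          # reStructuredText
    "doc_json": json.dumps({"owner": "data-team", "retries": 2}),
    "doc_yaml": "owner: data-team\nretries: 2\n",
}

# In a DAG file these would be passed as keyword arguments, for example:
#   DAG(dag_id="nightly_etl", doc_md=dag_doc_md, ...)
#   SomeOperator(task_id="load", **task_docs)
# where SomeOperator is a placeholder for any operator class.
for attribute, text in task_docs.items():
    print(f"{attribute}: {text!r}")
```

In practice a task usually sets just one of these attributes, whichever format suits the note being attached.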

Airflow Executor

XCom

Pools

Task Flow

Variables

DAG writing best practices in Apache Airflow | Astronomer Documentation

Timetables — Airflow Documentation (apache.org) https://docs.astronomer.io/learn/airflow-datasets