# Orchestrating a Video Processing Pipeline with Airflow and Isolated Worker Queues
The backend behind a self-hosted video compression service: turning "compress this video" into an Airflow DAG, isolating workloads with per-tier Celery queues, and picking FFmpeg settings that actually shrink the file.
I run a self-hosted video compression service. The web app and API are the easy part; the hard part is everything that happens after someone clicks "compress." A video has to be pulled from storage, transcoded, pushed back, have its metadata extracted, and cleaned up — and that has to be reliable, observable, and fair across users on very different plans. Here's how I built that backend.
## The job is a DAG, not a request
A transcode can take anywhere from a few seconds to many minutes, so it can't live inside a web request. I model each job as an **Apache Airflow** DAG with a linear task chain:
```
update status: downloading -> download -> update status: processing
-> process (FFmpeg) -> update status: uploading -> upload
-> extract & send metadata -> update status: completed -> cleanup
```
The status-update tasks write each state (`downloading`, `processing`, `uploading`, `completed`) back to the Django web app, and that status is what the UI polls to show live progress. A DAG-level failure callback flips the job to `failed` and logs the run, so a job never silently disappears: it always ends in a terminal state someone can see.
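To make the status lifecycle concrete, here is a minimal plain-Python sketch of the same chain, deliberately written without Airflow so it runs anywhere. The names `PIPELINE`, `run_job`, `steps`, and `report` are hypothetical and not taken from the real DAG; the `except` branch only stands in for the DAG-level failure callback.

```python
from typing import Callable, Dict, List, Optional, Tuple

# (step name, status reported just before the step runs).
# None means the step has no status update in front of it.
# Step names are illustrative, mirroring the chain described above.
PIPELINE: List[Tuple[str, Optional[str]]] = [
    ("download", "downloading"),
    ("process", "processing"),
    ("upload", "uploading"),
    ("extract_metadata", None),
    ("cleanup", "completed"),
]


def run_job(
    job_id: str,
    steps: Dict[str, Callable[[str], None]],
    report: Callable[[str, str], None],
) -> str:
    """Run the steps in order, reporting status as the job advances.

    `report(job_id, status)` is a stand-in for the call back to the web app.
    Any exception plays the role of the failure callback: the job is
    flipped to "failed" so it always ends in a visible terminal state.
    """
    try:
        for name, status in PIPELINE:
            if status is not None:
                report(job_id, status)
            steps[name](job_id)
    except Exception:
        report(job_id, "failed")
        return "failed"
    return "completed"
```

In the real pipeline each entry would be its own Airflow task, which is what lets the scheduler retry and log the steps individually.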
## Isolating workloads so big jobs don't starve small ones
The naive version runs every job through one worker pool. That falls apart the moment a 4 GB premium upload lands ahead of a handful of quick free-tier clips: the small jobs wait behind the big one.
So I split the work along two axes — plan tier and video size — into separate DAGs (`process_video_freemium_S` through `process_video_premium_XL`), and route each DAG's tasks to dedicated **Celery queues** by the *kind* of work:
- `w_sync` — the tiny status-update tasks, concurrency 1. They're fast and must stay responsive.
- `w_<mode>_io` — download, upload, metadata. I/O-bound, concurrency 2.
- `w_<mode>_process` — the FFmpeg transcode itself. CPU-bound, concurrency 3.
Because the queues are separate, I can size and scale each worker pool independently, keep CPU-heavy encoding off the same workers doing quick I/O, and make sure a premium-XL encode runs in its own lane instead of blocking the freemium-S lane. Airflow runs on the CeleryExecutor with Redis as the broker and Postgres as the metadata DB.
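The naming scheme can be sketched as a pair of small helpers. This is an illustration under assumptions, not the service's actual code: the tier list, the intermediate sizes `M` and `L`, and the helper names are guesses, and `mode` is treated as an opaque string because the post does not spell out what `<mode>` expands to. The cleanup task is left out since its queue is not stated.

```python
from itertools import product

TIERS = ("freemium", "premium")   # assumption: only these two tiers exist
SIZES = ("S", "M", "L", "XL")     # assumption: only S and XL are named above

# Concurrency per kind of work, as listed in the bullets above.
CONCURRENCY = {"sync": 1, "io": 2, "process": 3}

# Which kind of work each task is (task names are illustrative).
TASK_KIND = {
    "update_status": "sync",
    "download": "io",
    "upload": "io",
    "extract_metadata": "io",
    "process": "process",
}


def dag_id(tier: str, size: str) -> str:
    """Build a DAG id such as process_video_freemium_S."""
    return f"process_video_{tier}_{size}"


def all_dag_ids() -> list:
    """One DAG per (tier, size) combination."""
    return [dag_id(tier, size) for tier, size in product(TIERS, SIZES)]


def queue_for(task: str, mode: str) -> str:
    """Pick the Celery queue for a task.

    Status updates share the single w_sync queue; everything else goes to
    a per-mode queue for its kind of work (w_<mode>_io or w_<mode>_process).
    """
    kind = TASK_KIND[task]
    if kind == "sync":
        return "w_sync"
    return f"w_{mode}_{kind}"
```

With the CeleryExecutor, the returned name is the kind of value that would be passed as an operator's `queue` argument, and each worker pool then subscribes only to the queues it should serve.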
## Encoding settings that actually make the file smaller
The subtle trap in video compression: blindly re-encoding can produce a *bigger* file than you started with, especially if the input is already well-compressed. So before transcoding I probe the input with `ffprobe` for its bitrate, height, and codec, and pick the CRF (Constant Rate Factor: lower values mean higher quality and larger files, higher values mean more compression) adaptively:
- **Same-codec re-encode** (e.g. H.264 in, H.264 out) needs a *higher* CRF to actually beat the existing encode. The input is already efficient, so I have to trade away a little more quality to win on size.
- **Higher-bitrate input** is easier to compress, so it can take a *lower* CRF and still shrink.
- **Low-bitrate input** is already squeezed, so it gets a higher CRF to avoid bloating.
A user-supplied CRF always overrides the heuristic. The result is a default that trends toward "smaller than what you uploaded" instead of gambling on it.
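The rules above could be sketched roughly like this. Every number here is an illustrative assumption rather than the service's tuned value: the thresholds, the size of each adjustment, and the helper names are made up for the example. The only fixed points are libx264's default CRF of 23 and its 0 to 51 range. Height is parsed but unused, since the post does not say how it feeds into the choice.

```python
import json
from typing import Optional

# The JSON is assumed to come from a command along these lines (not run here):
#   ffprobe -v error -select_streams v:0 \
#           -show_entries stream=codec_name,height,bit_rate -of json INPUT

BASE_CRF = 23               # libx264's default CRF, used as the starting point
SAME_CODEC_BUMP = 3         # illustrative adjustment
HIGH_BITRATE = 8_000_000    # illustrative threshold, bits per second
LOW_BITRATE = 1_500_000     # illustrative threshold, bits per second


def parse_probe(ffprobe_json: str) -> dict:
    """Extract codec, height, and bitrate of the first video stream."""
    stream = json.loads(ffprobe_json)["streams"][0]
    raw = stream.get("bit_rate")
    # bit_rate arrives as a string and may be absent or non-numeric.
    bitrate = int(raw) if raw is not None and str(raw).isdigit() else None
    return {
        "codec": stream["codec_name"],
        "height": int(stream["height"]),
        "bitrate": bitrate,
    }


def choose_crf(probe: dict, target_codec: str,
               user_crf: Optional[int] = None) -> int:
    """Pick a CRF from the probed input; a user-supplied value always wins."""
    if user_crf is not None:
        return user_crf
    crf = BASE_CRF
    if probe["codec"] == target_codec:
        crf += SAME_CODEC_BUMP      # same codec: compress harder to win on size
    bitrate = probe["bitrate"]
    if bitrate is not None:
        if bitrate >= HIGH_BITRATE:
            crf -= 2                # plenty of headroom: keep more quality
        elif bitrate <= LOW_BITRATE:
            crf += 3                # already squeezed: avoid bloating
    return max(0, min(51, crf))     # clamp to the 8-bit x264 CRF range
```

The chosen value would then be passed to FFmpeg through its `-crf` option.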
## Keeping it honest in production
Storage is MinIO (S3-compatible), with Postgres and Redis behind the app and the whole thing fronted by nginx. Since it's self-hosted, I can't lean on a managed dashboard for visibility, so it ships with a **Prometheus + Grafana + Loki** stack — metrics and logs for the workers and the queues, so when a job is slow I can see *which* queue is backed up rather than guessing.
None of the individual pieces are exotic. The leverage comes from the shape: a pipeline you can watch step-by-step, workloads that can't trample each other, and encode settings derived from the input instead of hard-coded. That's the difference between a demo that transcodes one file and a service that transcodes other people's files all day.
If you're building media-processing pipelines, wiring up Airflow + Celery for fan-out work, or fighting with FFmpeg encode settings, I'm happy to compare notes — reach out via my portfolio at [tahayusufkomur.me](https://tahayusufkomur.me).