Running Scheduled Pipelines
Many workflows need to run automatically — ingesting labour records from field systems, updating financial ledgers, reconciling payroll totals. These are executed using scheduled pipelines.
Scheduling model
Workflows can be triggered using a cron schedule. Example configuration:
```yaml
trigger_type: SCHEDULED
schedule_cron: "0 2 * * *"
```

This executes the workflow every day at 02:00 UTC. The scheduler checks for eligible workflows once per minute.
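The per-minute eligibility check can be sketched as follows. This is a minimal illustration, not the platform's actual implementation: `cron_matches` is a hypothetical helper that handles only `*` and plain integer fields, which is enough for a schedule like `"0 2 * * *"`.

```python
from datetime import datetime, timezone

def cron_matches(expr: str, now: datetime) -> bool:
    """Return True if a five-field cron expression matches the given time.

    Supports only '*' and plain integers per field. Field order follows
    cron convention: minute, hour, day of month, month, day of week.
    """
    fields = expr.split()
    values = [now.minute, now.hour, now.day, now.month, now.isoweekday() % 7]
    return all(f == "*" or int(f) == v for f, v in zip(fields, values))

# The scheduler wakes once per minute and fires any workflow whose
# expression matches the current UTC minute.
now = datetime(2024, 6, 1, 2, 0, tzinfo=timezone.utc)
print(cron_matches("0 2 * * *", now))                  # matches 02:00 UTC
print(cron_matches("0 2 * * *", now.replace(hour=3)))  # does not match 03:00
```

A production scheduler would use a full cron parser (ranges, steps, lists); the point here is only that matching is evaluated against the current minute on each tick.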
Execution flow
When the scheduler detects a workflow ready to run:
- The workflow definition is loaded
- A `WorkflowRun` record is created
- `TaskRun` records are generated for every step
- The job is enqueued for the ARQ worker
The worker then processes the workflow through the standard execution lifecycle.
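The dispatch sequence above can be sketched in a few lines. The record shapes and the enqueue call are illustrative stand-ins; the real ORM models and ARQ `enqueue_job` signature are not specified in this doc.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class TaskRun:
    """Stand-in for the per-step run record."""
    step_name: str
    status: str = "PENDING"

@dataclass
class WorkflowRun:
    """Stand-in for the per-run record."""
    workflow_id: str
    task_runs: List[TaskRun] = field(default_factory=list)
    status: str = "QUEUED"

def dispatch(workflow_id: str, steps: List[str], queue: list) -> WorkflowRun:
    """Create a WorkflowRun plus one TaskRun per step, then enqueue the job."""
    run = WorkflowRun(workflow_id=workflow_id,
                      task_runs=[TaskRun(step_name=s) for s in steps])
    queue.append(run)  # stands in for handing the job to the ARQ worker
    return run

queue: list = []
run = dispatch("daily-labour", ["FETCH", "TRANSFORM", "TRANSFORM", "PUSH"], queue)
print(len(run.task_runs))  # one TaskRun per step
```
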
Preventing overlapping runs
If a workflow is still executing when the next schedule fires, the scheduler skips that run. This prevents duplicate operations such as double financial postings or duplicate data ingestion.
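The skip decision reduces to a status check before firing. This is a sketch under the assumption that run states include queued/running values; `should_fire` and the state names are hypothetical, not the platform's API.

```python
# States in which a previous run is considered still in flight
# (illustrative names; the real state enum is not documented here).
RUNNING_STATES = {"QUEUED", "RUNNING"}

def should_fire(workflow_id: str, active_runs: dict) -> bool:
    """Return False (skip this tick) if the previous run has not finished.

    `active_runs` maps workflow_id to its latest run status and stands in
    for a database query against WorkflowRun records.
    """
    return active_runs.get(workflow_id) not in RUNNING_STATES

print(should_fire("daily-labour", {"daily-labour": "RUNNING"}))    # skip
print(should_fire("daily-labour", {"daily-labour": "SUCCEEDED"}))  # fire
```

Skipping rather than queueing the missed tick is the safer default for financial pipelines: a late duplicate posting is worse than waiting for the next scheduled run.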
Example pipeline
A typical daily labour alignment pipeline:
```
FETCH     → assignar.site_diaries
TRANSFORM → extract_labour_entries
TRANSFORM → normalize_cost_categories
PUSH      → business_central.project_ledger
```

This ensures labour costs are recorded in the ERP each day without manual intervention.
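Tying the schedule to these steps, a full definition might look like the sketch below. The doc does not show the complete definition syntax, so every field beyond `trigger_type` and `schedule_cron` (`name`, `steps`, `type`, `target`) is an assumed, illustrative shape:

```yaml
name: daily-labour-alignment
trigger_type: SCHEDULED
schedule_cron: "0 2 * * *"   # every day at 02:00 UTC
steps:
  - type: FETCH
    target: assignar.site_diaries
  - type: TRANSFORM
    target: extract_labour_entries
  - type: TRANSFORM
    target: normalize_cost_categories
  - type: PUSH
    target: business_central.project_ledger
```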
Observability
Each scheduled workflow run produces records including:
- run start time
- step execution results
- runtime duration per step
- failure details
View execution history under Workflows → Runs.
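From these records, per-step durations and failures fall out directly. The record shape below is hypothetical (the real field names are not documented here); it shows only how the timing and failure data compose.

```python
from datetime import datetime

# Hypothetical per-step records as a run might produce them.
steps = [
    {"name": "FETCH",
     "started": datetime(2024, 6, 1, 2, 0, 0),
     "finished": datetime(2024, 6, 1, 2, 0, 42),
     "status": "SUCCEEDED"},
    {"name": "TRANSFORM",
     "started": datetime(2024, 6, 1, 2, 0, 42),
     "finished": datetime(2024, 6, 1, 2, 1, 5),
     "status": "FAILED"},
]

# Runtime duration per step, in seconds
durations = {s["name"]: (s["finished"] - s["started"]).total_seconds()
             for s in steps}
# Steps whose failure details need inspection
failures = [s["name"] for s in steps if s["status"] == "FAILED"]

print(durations)  # {'FETCH': 42.0, 'TRANSFORM': 23.0}
print(failures)   # ['TRANSFORM']
```
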
Best practices
- Ensure idempotency — workflows that run on a schedule may re-process records; design steps to handle duplicates gracefully
- Validate before posting — use
GOVERNsteps to catch bad data before it reaches the ERP - Test in sandbox first — run the workflow with
dry_run: truebefore activating the schedule - Monitor for failures — set up alerting on
WorkflowRunfailure states
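One common way to achieve the idempotency called for above is a stable deduplication key per source record, so a scheduled re-run posts each record at most once. This is a sketch of the pattern, not the platform's mechanism; `dedup_key` and `post_once` are hypothetical helpers.

```python
import hashlib

def dedup_key(source: str, record_id: str, business_date: str) -> str:
    """Stable key for a source record, so re-processing it is detectable."""
    raw = f"{source}|{record_id}|{business_date}"
    return hashlib.sha256(raw.encode()).hexdigest()

def post_once(record: dict, posted: set) -> bool:
    """Post a record unless its key was seen before.

    Returns True if posted, False if skipped as a duplicate. `posted`
    stands in for a persistent store of already-posted keys.
    """
    key = dedup_key(record["source"], record["id"], record["date"])
    if key in posted:
        return False
    posted.add(key)
    # ... push the record to the ERP here ...
    return True

posted: set = set()
rec = {"source": "assignar", "id": "SD-1042", "date": "2024-06-01"}
print(post_once(rec, posted))  # True  (first run posts it)
print(post_once(rec, posted))  # False (scheduled re-run skips it)
```

In practice the key set would live in the database alongside the `WorkflowRun` records, so duplicates are caught across worker restarts.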