Data Health & Scheduling
Module Summary
Set up build schedules, health checks, and expectations so pipelines run reliably without manual intervention.
Build Schedules
You can schedule datasets to build on a time-based cadence (hourly, daily, or a cron expression) or on a trigger (when an upstream input updates). Foundry's scheduler respects the dependency graph — it won't build a transform until all of its inputs are fresh.
Combine both modes: schedule the top of the pipeline on a cron job and let downstream datasets build automatically when their inputs complete.
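The dependency-aware behavior described above can be sketched with a toy graph: roots build on a cron schedule, and downstream datasets build only after every input they depend on. The dataset names and the `DEPS` mapping are illustrative assumptions, not Foundry APIs — Foundry maintains this graph for you.

```python
from graphlib import TopologicalSorter

# Toy dependency graph: dataset -> set of upstream inputs (illustrative only).
DEPS = {
    "raw_orders": set(),             # root: built on a cron schedule
    "clean_orders": {"raw_orders"},  # trigger-based: builds when its input updates
    "daily_revenue": {"clean_orders"},
}

def build_order(deps):
    """Return an order that builds every dataset only after all its inputs."""
    return list(TopologicalSorter(deps).static_order())
```

Running `build_order(DEPS)` always places `raw_orders` before `clean_orders`, and `clean_orders` before `daily_revenue` — the same guarantee the scheduler gives you when the cron-scheduled root finishes and triggers the rest of the pipeline.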
Expectations and Data Health Checks
Expectations are rules you attach to a dataset:
- Row count must be greater than 0.
- Column email must match a regex pattern.
- Null rate on revenue must be below 1%.
- Dataset must be no more than 24 hours stale.
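The rules above can be expressed as a small post-build check in plain Python. This is a hypothetical sketch of how such expectations might be evaluated — the function name, the regex, and the row layout are assumptions, not Foundry's expectations API.

```python
import re

def check_dataset(rows, max_null_rate=0.01,
                  email_pattern=r"[^@\s]+@[^@\s]+\.[^@\s]+"):
    """Evaluate declarative expectations against rows; return failure messages.

    Hypothetical sketch only — not Foundry's API. Each rule mirrors one
    bullet from the list above.
    """
    failures = []
    # Row count must be greater than 0.
    if len(rows) == 0:
        failures.append("row count must be > 0")
        return failures
    # Column email must match a regex pattern.
    if any(not re.fullmatch(email_pattern, r["email"]) for r in rows):
        failures.append("email failed regex check")
    # Null rate on revenue must be below 1%.
    null_rate = sum(r["revenue"] is None for r in rows) / len(rows)
    if null_rate >= max_null_rate:
        failures.append(f"revenue null rate {null_rate:.0%} is not below 1%")
    return failures
```

An empty failure list means the dataset is healthy; a non-empty one corresponds to the dataset being flagged unhealthy after the build.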
After each build, Foundry evaluates these expectations. If any fail, the dataset is flagged unhealthy and alerts are sent.
Alerting and SLAs
Data Health integrates with alerting: you can send Slack messages, emails, or PagerDuty incidents when a dataset goes unhealthy. Combine this with SLA monitoring to define data freshness guarantees — if the 8 AM dashboard data isn't ready by 7:55 AM, the on-call engineer gets paged.
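A freshness SLA like the 7:55 AM example boils down to a staleness comparison at the deadline. A minimal sketch, where the function name and the paging hook are assumptions rather than Foundry features:

```python
from datetime import datetime, timedelta

def sla_breached(last_build: datetime, deadline: datetime,
                 max_staleness: timedelta) -> bool:
    """True if, at the deadline, the latest build is staler than the SLA allows.

    Hypothetical helper — in practice the monitoring system would route a
    breach to Slack, email, or PagerDuty.
    """
    return deadline - last_build > max_staleness

# Dashboard needs data no more than 24h old by 7:55 AM.
deadline = datetime(2024, 1, 2, 7, 55)
fresh = sla_breached(datetime(2024, 1, 2, 6, 0), deadline, timedelta(hours=24))
stale = sla_breached(datetime(2023, 12, 31, 6, 0), deadline, timedelta(hours=24))
```

Here `fresh` is False (data built at 6 AM the same day) and `stale` is True (data nearly two days old), which is the condition that would page the on-call engineer.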
Key Takeaways
- Schedules can be time-based, trigger-based, or a combination of both.
- Expectations are declarative data quality rules evaluated after each build.
- Unhealthy datasets trigger alerts via Slack, email, or PagerDuty.
- SLA monitoring ensures data freshness guarantees are met.