Intermediate

Data Health & Scheduling

Set up build schedules, health checks, and expectations so pipelines run reliably without manual intervention.

⏱ 14 min read · 📖 3 chapters · +150 XP

1 · Build Schedules

You can schedule datasets to build on a time-based cadence (hourly, daily, or a cron expression) or on a trigger (when an upstream input updates). Foundry's scheduler respects the dependency graph: it won't build a transform until all its inputs are fresh. You can combine both modes: schedule the top of the pipeline with cron and let downstream datasets build automatically when their inputs complete.
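The dependency-ordering idea can be illustrated with a small, self-contained sketch. This is not Foundry's scheduler or its API; it is plain Python using the standard-library `graphlib` module, and the dataset names and the `build_order` helper are hypothetical. It assumes that when a root dataset updates, everything downstream of it becomes stale and must be rebuilt only after its inputs.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each dataset maps to the set of inputs it depends on.
PIPELINE = {
    "raw_orders": set(),
    "raw_customers": set(),
    "clean_orders": {"raw_orders"},
    "orders_enriched": {"clean_orders", "raw_customers"},
}


def build_order(graph, updated_roots):
    """Return the datasets to build, in dependency order, after the given roots update."""
    # Mark the updated roots and everything downstream of them as stale.
    stale = set(updated_roots)
    changed = True
    while changed:
        changed = False
        for dataset, inputs in graph.items():
            if dataset not in stale and inputs & stale:
                stale.add(dataset)
                changed = True
    # static_order() yields a dataset only after all of its inputs,
    # so no transform is scheduled before its inputs are fresh.
    return [d for d in TopologicalSorter(graph).static_order() if d in stale]


order = build_order(PIPELINE, {"raw_orders"})
print(order)
```

In this sketch only `raw_orders` would sit on a cron schedule; the two downstream datasets follow from the graph, which mirrors the combined time-based and trigger-based setup described above.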

2 · Expectations and Data Health Checks

Expectations are rules you attach to a dataset:

- Row count must be greater than 0.
- Column email must match a regex pattern.
- Null rate on revenue must be below 1%.
- Dataset must be no more than 24 hours stale.

After each build, Foundry evaluates these expectations. If any fail, the dataset is flagged unhealthy and alerts are sent.
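To make the four example rules concrete, here is a minimal sketch of how such checks could be evaluated over a list of rows. It is plain Python and does not use Foundry's expectations API; the `evaluate_expectations` function, the row layout, and the simplified email pattern are all assumptions made for illustration.

```python
import re
from datetime import datetime, timedelta, timezone

# Simplified email pattern, for illustration only (an assumption, not a full validator).
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def evaluate_expectations(rows, last_built, now):
    """Return the names of failed expectations; an empty list means healthy."""
    failures = []
    # Rule 1: row count must be greater than 0.
    if len(rows) == 0:
        failures.append("row_count > 0")
    # Rule 2: every non-null email must match the pattern.
    if any(r["email"] is not None and not EMAIL_RE.match(r["email"]) for r in rows):
        failures.append("email matches pattern")
    # Rule 3: null rate on revenue must be below 1%.
    null_rate = sum(r["revenue"] is None for r in rows) / len(rows) if rows else 0.0
    if null_rate >= 0.01:
        failures.append("revenue null rate < 1%")
    # Rule 4: dataset must be no more than 24 hours stale.
    if now - last_built > timedelta(hours=24):
        failures.append("staleness <= 24h")
    return failures


now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
rows = [
    {"email": "a@example.com", "revenue": 10.0},
    {"email": "bad", "revenue": None},
]
print(evaluate_expectations(rows, now - timedelta(hours=30), now))
```

Returning the list of failed rule names, rather than a single boolean, keeps the result useful for alerting: the alert can say which expectation failed.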

3 · Alerting and SLAs

Data Health integrates with alerting: when a dataset goes unhealthy, you can send Slack messages, emails, or PagerDuty incidents. Combine this with SLA monitoring to define data freshness guarantees: for example, if the data behind an 8 AM dashboard isn't ready by 7:55 AM, the on-call engineer gets paged.
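The 7:55 AM example boils down to a deadline check. The sketch below shows that logic in plain Python; it is not Foundry's SLA monitoring or any alerting integration, and the `should_page` function and its parameters (including `window_start`, the earliest build time that still counts as today's data) are hypothetical.

```python
from datetime import datetime


def should_page(now, deadline, last_successful_build, window_start):
    """Return True when the deadline has passed and no build succeeded since window_start."""
    if now < deadline:
        return False  # The SLA deadline has not been reached yet.
    fresh = last_successful_build is not None and last_successful_build >= window_start
    return not fresh


window_start = datetime(2024, 1, 2, 0, 0)   # builds from today count as fresh
deadline = datetime(2024, 1, 2, 7, 55)      # data must be ready by 7:55 AM
stale_build = datetime(2024, 1, 1, 23, 0)   # last success was yesterday

print(should_page(datetime(2024, 1, 2, 7, 56), deadline, stale_build, window_start))
```

A real setup would route a `True` result to the paging channel; that step is left out here because it depends on the alerting tool in use.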

✅ Key Takeaways

Schedules can be time-based, trigger-based, or a combination of both.
Expectations are declarative data quality rules evaluated after each build.
Unhealthy datasets trigger alerts via Slack, email, or PagerDuty.
SLA monitoring ensures data freshness guarantees are met.
