Intermediate

Data Connection

Module Summary

Learn how to ingest data from external sources into Foundry using agents, source connectors, and syncs.

Sources and Agents

To ingest data, you configure a Source that describes where the data lives (e.g., an S3 bucket, a Snowflake database, or a REST API). An Agent is a lightweight process that runs inside your network, near the source system, and securely transfers data into Foundry. Because the agent sits inside your network, sensitive data travels from the source to Foundry without passing through an intermediary you do not control. Foundry manages the scheduling; the agent manages the transfer.
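The split of responsibilities described above can be sketched in a few lines of Python. This is only an illustrative model, not Foundry's API: the `Source`, `Agent`, and `Scheduler` classes, their fields, and the `run_due` method are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Source:
    """Hypothetical description of where the data lives."""
    name: str
    kind: str      # e.g. "s3", "snowflake", "rest"
    location: str  # placeholder connection string or URL


class Agent:
    """Stands in for the process that runs near the source system."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.transfers: List[str] = []

    def transfer(self, source: Source) -> str:
        # A real agent would read from the source and upload the data;
        # this sketch only records that a transfer happened.
        record = f"{self.name} transferred from {source.kind}:{source.name}"
        self.transfers.append(record)
        return record


class Scheduler:
    """Stands in for the platform side, which decides when transfers run."""

    def __init__(self, agent: Agent) -> None:
        self.agent = agent

    def run_due(self, sources: List[Source]) -> List[str]:
        # The scheduler triggers the work; the agent performs it.
        return [self.agent.transfer(s) for s in sources]


agent = Agent("agent-1")
orders = Source(name="orders", kind="snowflake", location="example-location")
results = Scheduler(agent).run_due([orders])
```

The point of the sketch is the direction of control: the scheduler never touches the source itself, it only asks the agent to run.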

Syncs and File Imports

A Sync defines what data to pull and how often. You can sync entire tables, specific queries, or incremental change feeds. For file-based sources, Foundry can watch a directory and import new files automatically. Syncs support full refreshes (replace everything) or append mode (add new rows), so you can choose the mode and schedule that your downstream pipelines need.
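The difference between the two modes can be shown with a minimal sketch. It models a dataset as a list of rows and is not how Foundry implements syncs; the `apply_sync` function and the mode names `"full_refresh"` and `"append"` are assumptions made for this example.

```python
from typing import Dict, List

Row = Dict[str, object]


def apply_sync(existing: List[Row], incoming: List[Row], mode: str) -> List[Row]:
    """Return the dataset contents after one sync run (illustrative only)."""
    if mode == "full_refresh":
        # Replace everything with the latest pull.
        return list(incoming)
    if mode == "append":
        # Keep the existing rows and add the new ones at the end.
        return existing + incoming
    raise ValueError(f"unknown sync mode: {mode}")


current = [{"id": 1}, {"id": 2}]
pulled = [{"id": 3}]

refreshed = apply_sync(current, pulled, "full_refresh")
appended = apply_sync(current, pulled, "append")
```

A full refresh suits sources where rows can change or disappear; append suits sources that only ever add rows, such as logs or event streams.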

Handling Schema Evolution

Source systems change: columns get added, types shift. Data Connection provides schema evolution handling: it can auto-detect new columns, alert on breaking changes, and apply default values for missing fields. Combine this with Data Health checks downstream and you have an early warning system for upstream changes.
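The three behaviors named above (detecting new columns, flagging breaking changes, filling defaults) can be illustrated with a small schema comparison. This is a sketch under stated assumptions, not Data Connection's actual mechanism: schemas are modeled as plain dictionaries from column name to type name, and `diff_schema` and `fill_missing` are hypothetical helpers.

```python
from typing import Dict, List, Tuple

Schema = Dict[str, str]  # column name -> type name


def diff_schema(expected: Schema, observed: Schema) -> Tuple[List[str], List[str], List[str]]:
    """Compare two schemas and return (added, missing, retyped) column names."""
    added = sorted(c for c in observed if c not in expected)
    missing = sorted(c for c in expected if c not in observed)
    retyped = sorted(
        c for c in expected if c in observed and expected[c] != observed[c]
    )
    return added, missing, retyped


def fill_missing(row: Dict[str, object], defaults: Dict[str, object]) -> Dict[str, object]:
    """Apply default values for fields the row does not contain."""
    result = dict(defaults)
    result.update(row)  # values present in the row win over defaults
    return result


expected = {"id": "long", "name": "string"}
observed = {"id": "string", "email": "string"}

added, missing, retyped = diff_schema(expected, observed)
patched = fill_missing({"id": 1}, {"name": "unknown"})
```

In this model, an added column is usually safe to accept, while a missing or retyped column is the kind of breaking change that would warrant an alert.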

Key Takeaways

  • Data Connection ingests data from external systems into Foundry datasets.
  • Agents run inside your network for secure data transfer.
  • Syncs control what data to pull, how often, and in what mode.
  • Schema evolution handling detects new columns, alerts on breaking changes, and applies defaults for missing fields.