Intermediate

Pipeline Builder

Module Summary

Discover the no-code Pipeline Builder tool for creating transforms visually with a drag-and-drop interface.

No-Code Transforms

Pipeline Builder lets you build transforms without writing code. You drag input datasets onto a canvas, add transformation nodes (filter, join, aggregate, pivot, derive column), and connect them to an output dataset. Under the hood, Pipeline Builder generates the same Spark execution plan as a Python or SQL transform — so there is no performance trade-off.
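For instance, a three-node pipeline (filter, join, aggregate) compiles to the same logical plan you would get from a hand-written SQL transform. Below is a minimal sketch of that equivalent query, using Python's stdlib sqlite3 as a stand-in engine; the `orders` and `customers` tables and their columns are hypothetical, not part of any real pipeline.

```python
import sqlite3

# In-memory database standing in for two input datasets (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount REAL, status TEXT);
    CREATE TABLE customers (customer_id INTEGER, region TEXT);
    INSERT INTO orders VALUES (1, 10, 50.0, 'complete'), (2, 10, 30.0, 'cancelled'),
                              (3, 11, 20.0, 'complete');
    INSERT INTO customers VALUES (10, 'EMEA'), (11, 'APAC');
""")

# Filter node -> WHERE, Join node -> JOIN, Aggregate node -> GROUP BY.
rows = conn.execute("""
    SELECT c.region, COUNT(*) AS n_orders, SUM(o.amount) AS total
    FROM orders o
    JOIN customers c ON o.customer_id = c.customer_id
    WHERE o.status = 'complete'
    GROUP BY c.region
    ORDER BY c.region
""").fetchall()

print(rows)  # [('APAC', 1, 20.0), ('EMEA', 1, 50.0)]
```

Whether you drag those three nodes onto the canvas or write the query yourself, the engine sees the same filter-join-aggregate plan.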

Common Operations

Pipeline Builder supports:

  • Filter — keep rows matching a condition.
  • Join — combine datasets on shared keys.
  • Aggregate — group and summarise (count, sum, avg).
  • Derive Column — add computed columns with expressions.
  • Pivot / Unpivot — reshape wide to long or vice versa.
  • Union — stack datasets vertically.

Each node shows a live data preview so you can verify results at every step.
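The semantics of a few of these nodes can be sketched in plain Python over rows represented as dicts. This is a simplified stand-in with made-up sample data; in Foundry the nodes execute as Spark operations over full datasets.

```python
# Rows as dicts: a simplified stand-in for dataset rows (sample data is made up).
orders = [
    {"order_id": 1, "amount": 50.0, "status": "complete"},
    {"order_id": 2, "amount": 30.0, "status": "cancelled"},
    {"order_id": 3, "amount": 20.0, "status": "complete"},
]

# Filter node: keep rows matching a condition.
complete = [r for r in orders if r["status"] == "complete"]

# Derive Column node: add a computed column from an expression.
derived = [{**r, "amount_with_tax": round(r["amount"] * 1.2, 2)} for r in complete]

# Aggregate node: group and summarise (here a single global group).
summary = {
    "n_orders": len(derived),
    "total": sum(r["amount"] for r in derived),
}
print(summary)  # {'n_orders': 2, 'total': 70.0}
```

Each intermediate list plays the role of a node's live preview: you can inspect `complete` or `derived` at any step, just as you would click a node on the canvas.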

When to Use Pipeline Builder vs. Code

Pipeline Builder shines for straightforward ETL: filtering, joining, and aggregating. For complex logic — custom UDFs, machine learning, recursive algorithms — Code Repositories give you full flexibility. Many teams use Pipeline Builder for 80% of their transforms and drop into code for the remaining 20%. Both approaches produce standard Foundry datasets.
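As an example of the kind of logic that typically lands in the code-side 20%: resolving each record in a parent-child hierarchy to its root ancestor is recursive, so it does not map onto a fixed chain of visual nodes. A hedged plain-Python sketch, where the `parent_of` mapping is a hypothetical example rather than real data:

```python
# Hypothetical parent-child mapping: child -> parent (None means root).
parent_of = {"a": None, "b": "a", "c": "b", "d": None, "e": "d"}

def root_of(node: str) -> str:
    """Walk up the hierarchy until reaching a node with no parent."""
    while parent_of[node] is not None:
        node = parent_of[node]
    return node

# Resolve every node to its root ancestor.
roots = {node: root_of(node) for node in parent_of}
print(roots)  # {'a': 'a', 'b': 'a', 'c': 'a', 'd': 'd', 'e': 'd'}
```

In a Code Repository this logic could live in a custom function inside a Python transform; the output is still a standard dataset, so it composes cleanly with pipelines built visually.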

Key Takeaways

  • Pipeline Builder lets you build transforms visually — no code required.
  • It generates the same Spark execution plan as Python or SQL transforms.
  • Live previews at every node make debugging straightforward.
  • Use Pipeline Builder for standard ETL; use code for complex logic.