Pipeline configuration

Learn the pipeline concept and how to configure a new pipeline.

A pipeline in GlassFlow orchestrates the flow of data from various sources, through transformations, and ultimately sends it to destinations where the data is stored or further utilized. Configuring a pipeline involves specifying these elements and defining how data is processed at each stage. There are three ways to configure a new pipeline:

  1. Using the GlassFlow WebApp at app.glassflow.dev.

  2. Using the GlassFlow CLI.

  3. Using the YAML configuration file (not yet available).

Creating a Pipeline with the WebApp

In the GlassFlow WebApp, you can create a space and a pipeline and upload the Python script that contains your transformation function.

Step 1: Log in to app.glassflow.dev.

The first time you log in, a space called main already exists by default. You can also create a new space.

Step 2: To create a new pipeline, switch to the Pipelines tab and choose Create a New Pipeline. A pop-up form opens where you provide a pipeline name, specify the space, and upload your Python code.

The app automatically integrates this custom logic into the specified pipeline and executes the transformations in real time as data passes through.

Creating a Pipeline with the CLI

Before creating a pipeline, you need to create a new space where one or more pipelines can be initiated.

glassflow space create <space_name>

After the space is created successfully, the CLI prints a SpaceID in the terminal. You need this ID to create a pipeline in the space.

To create a new pipeline in the existing space, you'll use the following command in the GlassFlow CLI:

glassflow pipeline create <pipeline_name> --space-id=<space_id> --function=<location_of_transformation_function>

Parameters Explained

  • <pipeline_name>: This is a name for your pipeline. Choose a name that reflects the purpose of the pipeline for easier management and reference.

  • --space-id=<space_id>: The ID of the workspace or "space" where your pipeline will be created. This is the SpaceID returned when you created the space. Spaces help you organize and isolate your pipelines so that you can manage projects and collaborate with team members.

  • --function=<location_of_transformation_function>: Points to the location of the Python script containing your transformation logic. This script must include a handle function that defines how incoming data is processed.

After the pipeline is created successfully, you will get a PipelineID in the terminal.
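
For scripted setups, the two CLI commands above can be assembled programmatically. The following is a minimal sketch, assuming the glassflow CLI is installed and accepts exactly the arguments shown in this section. The helper names space_create_cmd and pipeline_create_cmd are hypothetical and are not part of GlassFlow. The sketch only builds the argument lists and does not run them.

```python
import shlex


def space_create_cmd(space_name: str) -> list[str]:
    """Build the argument list for `glassflow space create <space_name>`."""
    return ["glassflow", "space", "create", space_name]


def pipeline_create_cmd(pipeline_name: str, space_id: str, function_path: str) -> list[str]:
    """Build the argument list for `glassflow pipeline create`.

    The flags mirror the command shown in this section; nothing else is assumed.
    """
    return [
        "glassflow", "pipeline", "create", pipeline_name,
        f"--space-id={space_id}",
        f"--function={function_path}",
    ]


if __name__ == "__main__":
    # "my-space-id" is a placeholder; use the SpaceID printed by the CLI.
    cmd = pipeline_create_cmd(
        "weatherdataprocessing", "my-space-id", "./transformations/convert_temp.py"
    )
    # Print the command as a single shell-quoted string for inspection.
    print(shlex.join(cmd))
```

To actually execute a command, the list can be passed to subprocess.run(cmd, check=True) from the Python standard library.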

Example

Suppose you want to create a pipeline named "weatherdataprocessing" in a space called "climateanalytics", and your transformation function is located at ./transformations/convert_temp.py. Replace {space-id} with the SpaceID you received when you created the "climateanalytics" space. The command would look like this:

glassflow pipeline create weatherdataprocessing --space-id={space-id} --function=./transformations/convert_temp.py
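
As an illustration, a transformation script such as convert_temp.py might look like the sketch below. Everything in it is an assumption made for the example: the handle(data) signature, the dictionary-shaped event, and the temperature_c and temperature_f field names are hypothetical, so check the function signature that your GlassFlow version expects before using it.

```python
def handle(data: dict) -> dict:
    """Add a Fahrenheit reading to an event that carries a Celsius reading.

    Assumption: each event is a dict with a numeric "temperature_c" field.
    """
    celsius = data["temperature_c"]
    # Copy the event so the input is not modified in place.
    result = dict(data)
    result["temperature_f"] = celsius * 9 / 5 + 32
    return result


if __name__ == "__main__":
    # Local smoke test with a made-up event.
    print(handle({"city": "Berlin", "temperature_c": 20.0}))
```

Running the file directly before uploading it is a quick way to catch errors in the transformation logic.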

Guidelines to name a space and pipeline

A GlassFlow resource name is the name you provide for a space, pipeline, or organization when you create it.

When you create a resource by providing a name, GlassFlow generates a unique ID for the resource. The resource can be accessed by this ID.

A resource name must follow these rules:

  • Letters and Numbers: The name may contain uppercase and lowercase letters and numbers.

  • No Spaces: Spaces are not permitted anywhere in the name.

  • Special Characters: Dashes -, underscores _, and other special characters (e.g., !, @, #, $, %, ^, &, *, (, ), +, =, {, }, [, ], |, \, :, ;, ', ", <, >, ,, ., ?, /) are allowed.

  • Length Limit: To ensure compatibility and readability, keep the name within the length limit, typically no more than 64 characters.
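
The rules above can be checked before calling the CLI or filling in the WebApp form. This is a minimal sketch, not GlassFlow's own validation: the function name validate_resource_name is hypothetical, the 64-character limit is taken from the "typically" figure above, and treating tabs and newlines like spaces is an extra assumption.

```python
MAX_NAME_LENGTH = 64  # assumed limit, based on the "typically" figure in the rules


def validate_resource_name(name: str) -> list[str]:
    """Return a list of problems; an empty list means the name passed these checks."""
    problems = []
    if not name:
        problems.append("name is empty")
    # Assumption: any whitespace (spaces, tabs, newlines) is rejected.
    if any(ch.isspace() for ch in name):
        problems.append("name contains whitespace")
    if len(name) > MAX_NAME_LENGTH:
        problems.append(f"name is longer than {MAX_NAME_LENGTH} characters")
    return problems


if __name__ == "__main__":
    for candidate in ("weatherdataprocessing", "weather data", "x" * 65):
        print(repr(candidate[:20]), validate_resource_name(candidate))
```

Returning a list of problems rather than a boolean lets a script report every violation at once.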

Best Practices:

  • Descriptive Names: Choose names that clearly describe the purpose or function of the pipeline, making it easier to identify and manage multiple pipelines.

  • Consistent Naming Scheme: Adopt a consistent naming scheme across your pipelines, especially if you're managing many of them. This could involve prefixes or suffixes that indicate the pipeline's stage in the data processing workflow (e.g., ingest, transform, export) or its data source.
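
One way to keep a naming scheme consistent is to generate names from their parts instead of typing them by hand. The sketch below assumes a stage-prefix convention joined with underscores; the pipeline_name helper and the separator are illustrative choices, and only the stage names come from the guideline above.

```python
STAGES = ("ingest", "transform", "export")  # stages named in the guideline above


def pipeline_name(stage: str, subject: str) -> str:
    """Build a name of the form <stage>_<subject>, with no spaces."""
    if stage not in STAGES:
        raise ValueError(f"unknown stage: {stage!r}")
    # Collapse runs of whitespace into single underscores.
    cleaned = "_".join(subject.split())
    if not cleaned:
        raise ValueError("subject must not be empty")
    return f"{stage}_{cleaned}"


if __name__ == "__main__":
    print(pipeline_name("ingest", "weather data"))
```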

© 2023 GlassFlow