Consume data

This page explains how to consume data from GlassFlow pipelines.

Consuming data means pulling transformed data from a data pipeline in GlassFlow. You use the GlassFlow Python SDK to retrieve and consume data from the pipeline.

Create a data consumer for the mobility project

In this section, you'll learn how to consume transformed data from the mobility project data pipeline in GlassFlow and store it in a local file. You'll use the GlassFlow Python SDK to interact with the pipeline and retrieve the transformed data in real time.

Prerequisites

We assume that you have already completed the following before proceeding with the tutorial:

Consume transformed data

Create a Python script consumer_file.py inside the mobility folder and add the following code:

This script continuously checks the pipeline for newly transformed data and consumes it as it arrives. Its use of the GlassFlow SDK centers on two objects: a GlassFlow client, which interacts with the GlassFlow platform, and a pipeline client, which consumes data from the data pipeline. The script does the following:

  1. Initializes a GlassFlow client to establish a connection with the GlassFlow platform.

client = glassflow.GlassFlowClient()

  2. Creates a pipeline client for the specific data pipeline identified by pipeline_id within the specified space_id.

pipeline_client = client.pipeline_client(space_id=space_id,
                                         pipeline_id=pipeline_id,
                                         pipeline_access_token=token)

  3. Consumes the transformed data from the pipeline. It returns a response object containing the consumed event data.

res = pipeline_client.consume()
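
The steps above can be combined into a polling loop that appends each consumed event to a file. The following is a minimal sketch, not the tutorial's original script. The helper name consume_to_file and its parameters are hypothetical. It also assumes that the response object returned by consume() exposes a status_code attribute and a json() method, which this page does not confirm, so check the SDK reference before relying on it.

```python
import json
import time


def consume_to_file(pipeline_client, path, max_events=None, poll_interval=1.0):
    """Poll the pipeline and append each consumed event to `path` as one JSON line.

    Hypothetical helper. Assumes the response from consume() has a
    `status_code` attribute and a `json()` method (not confirmed by this page).
    Runs until `max_events` events are written, or forever if it is None.
    """
    written = 0
    with open(path, "a", encoding="utf-8") as out:
        while max_events is None or written < max_events:
            res = pipeline_client.consume()
            if res.status_code == 200:
                out.write(json.dumps(res.json()) + "\n")
                out.flush()  # make the new line visible to `tail -f` right away
                written += 1
            else:
                # Nothing to consume yet (or a transient error): wait, then poll again
                time.sleep(poll_interval)
    return written
```

With the pipeline client created in step 2, the call would look like consume_to_file(pipeline_client, "mobility_data_transformed.txt"). Writing one JSON object per line keeps the file easy to follow with tail and easy to parse later.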

Run the script

python consumer_file.py

The script starts consuming data continuously from the pipeline and stores it locally on disk. You can see an example of consumed data here. To watch new data being written to the file, run this command in another terminal window:

tail -f mobility_data_transformed.txt

You can extend this functionality to push the consumed data to cloud storage buckets or real-time databases, depending on your project requirements. See the other tutorials for more complex scenarios.
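
As one illustration of such an extension, a consumed event can be handed to a "sink" function instead of being written to a text file. The sketch below uses Python's built-in sqlite3 module as a local stand-in for a real database. The function name make_sqlite_sink and the events table layout are hypothetical, and it assumes each consumed event is available as a JSON-serializable Python object.

```python
import json
import sqlite3


def make_sqlite_sink(db_path):
    """Return (sink, conn): sink(event) stores one event as a JSON string.

    Hypothetical helper; SQLite stands in for whichever database you use.
    """
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "payload TEXT NOT NULL)"
    )

    def sink(event):
        # Store the event as JSON text; commit so other readers see it immediately
        conn.execute("INSERT INTO events (payload) VALUES (?)", (json.dumps(event),))
        conn.commit()

    return sink, conn
```

A consumer would call sink(event) for every event it receives. Swapping in a different destination then only requires a different sink function, while the consuming code stays the same.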

© 2023 GlassFlow