Mobility Project

A practical example of configuring a new data pipeline for a car-sharing (Mobility) project.

The mobility project processes real-time ride events from an API to monitor fuel consumption and manage the supply of cars available for sharing. Its primary objectives are to identify low-fuel vehicles and to find the nearest fuel station based on GPS coordinates and fuel type. It also offers drivers a discount when a vehicle is low on fuel and a nearby fuel station is found.

Link to the GitHub project repository

We'll use the GlassFlow CLI to create a new space and configure our data pipeline.

Prerequisites

This tutorial assumes that you have completed the following:

You created a GlassFlow account.
You installed GlassFlow CLI and logged into your account via the CLI.

Custom Transformation Function

For the mobility project, the function processes real-time ride event data to identify vehicles with low fuel levels, find the closest fuel station, and attach a discount for drivers who need to refuel. This lets the car-sharing service encourage drivers to refill vehicles when necessary and helps optimize fleet management.

Creating a transform function

Create a Python script file transform.py inside a new mobility project folder.

https://github.com/glassflow/glassflow-examples/blob/main/use-cases/mobility/transform.py

"""
Transform function by the user
"""
import json
import requests


def handler(data, log):
    log.info("Event:" + json.dumps(data), data=data)
    try:
        transformed = handle(data)
    except Exception as e:
        log.error("Error in transformation", error=str(e))
        raise e
    return transformed


def get_nearest_fuel_station(gps_cordinates, fuel_type):
    print("get nearest fuel station")
    url = "https://mock-mobility-s3r3lbzina-ey.a.run.app/mobility/gas-stations/nearest"
    resp = requests.get(url,
                        params={
                            'cordinates_lat': gps_cordinates[0],
                            'cordinates_long': gps_cordinates[1],
                            'fuel_type': fuel_type
                        },
                        # avoid blocking the pipeline if the API is unreachable
                        timeout=10)
    if resp.status_code == 200:
        fuel_station = resp.json()
        print(fuel_station)
        return fuel_station
    else:
        print("error getting nearest fuel station")
        print(resp.status_code)
        return None


def handle(data: dict):
    data['discount'] = {"discount": False}
    if not data['is_electric'] and data['current_fuel_percentage'] < 25:
        # find nearest gas station using a partner API
        fuel_station = get_nearest_fuel_station(data['gps_cordinates'],
                                                data['fuel_type'])
        if fuel_station:
            data['discount'] = {
                "discount": True,
                "fuel_station": fuel_station,
                "discount_type": "fuel"
            }

    return data

The handler function is the entry point for each event: it logs the event and calls handle, which contains the transformation logic and modifies the event data based on specific conditions.

If the vehicle is not electric and its current fuel percentage is below 25%, it calls the get_nearest_fuel_station function to find the nearest fuel station via the mock API server.
If a fuel station is found, it updates the 'discount' key to mark the event as discounted and attaches the fuel station details and the discount type.
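
To see what the rule does to an event without calling the mock API, the sketch below re-implements the same condition with the station lookup passed in as a parameter. Only the field names (is_electric, current_fuel_percentage, gps_cordinates, fuel_type, discount) come from transform.py. The function names apply_fuel_discount and fake_station_lookup, the sample events, and the station dictionary are made up for illustration, and the real mock API may return different fields.

```python
def apply_fuel_discount(event, find_station):
    """Same rule as handle() in transform.py, with the lookup injected.

    find_station is a hypothetical stand-in for get_nearest_fuel_station.
    """
    event["discount"] = {"discount": False}
    if not event["is_electric"] and event["current_fuel_percentage"] < 25:
        station = find_station(event["gps_cordinates"], event["fuel_type"])
        if station:
            event["discount"] = {
                "discount": True,
                "fuel_station": station,
                "discount_type": "fuel",
            }
    return event


def fake_station_lookup(gps_cordinates, fuel_type):
    # Made-up response used only so the example runs offline.
    return {"name": "Example Station", "fuel_type": fuel_type}


# Illustrative events; the values are not taken from the mock server.
low_fuel = {
    "is_electric": False,
    "current_fuel_percentage": 12,
    "gps_cordinates": [52.52, 13.40],
    "fuel_type": "petrol",
}
electric = {
    "is_electric": True,
    "current_fuel_percentage": 12,
    "gps_cordinates": [52.52, 13.40],
    "fuel_type": "electric",
}

print(apply_fuel_discount(low_fuel, fake_station_lookup)["discount"]["discount"])  # True
print(apply_fuel_discount(electric, fake_station_lookup)["discount"]["discount"])  # False
```

Passing the lookup as an argument keeps the rule testable on its own; in the deployed function the lookup is the HTTP call to the mock API.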

In the next steps, we will configure a pipeline on GlassFlow that uses transform.py.

Creating the pipeline on GlassFlow

Step 1: Create a new space

Open a terminal and create a new space called examples to organize multiple pipelines:

glassflow space create examples

After creating the space successfully, you will get a SpaceID in the terminal.

Save the SpaceID for reference. You'll set it as an environment variable for the project in the upcoming section.

Step 2: Configure the pipeline

Create a new pipeline in the selected space with a transformation function:

glassflow pipeline create mobilitydemo --space-id={space_id} --function=transform.py

This command creates a pipeline named mobilitydemo in the examples space and sets transform.py as its transformation function. Replace {space_id} with the SpaceID from Step 1. The command returns a new Pipeline ID and its Access Token.

Save the Pipeline ID and Access Token for reference. You'll set them as environment variables in the upcoming section.

The pipeline is now deployed and running on the GlassFlow cloud.

Step 3: Create an environment configuration file

Add a .env file in the project directory with the following configuration variables and their values:

PIPELINE_ID=your_pipeline_id
SPACE_ID=your_space_id
PIPELINE_ACCESS_TOKEN=your_pipeline_access_token

Replace your_pipeline_id, your_space_id, and your_pipeline_access_token with appropriate values obtained in the previous steps.
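
A missing or empty value in .env only shows up later as a failed publish or consume call, so it can help to check the configuration first. The following is a minimal sketch, not part of the example repository; REQUIRED_KEYS and missing_settings are hypothetical names. It accepts any dictionary-like object, such as the result of dotenv_values(".env") used in the scripts below.

```python
REQUIRED_KEYS = ("PIPELINE_ID", "SPACE_ID", "PIPELINE_ACCESS_TOKEN")


def missing_settings(config):
    """Return the required keys that are absent or empty in config."""
    return [key for key in REQUIRED_KEYS if not config.get(key)]


# In the scripts, config would come from dotenv_values(".env").
# Here a plain dictionary with one empty value stands in for it.
example_config = {
    "PIPELINE_ID": "your_pipeline_id",
    "SPACE_ID": "",
    "PIPELINE_ACCESS_TOKEN": "your_pipeline_access_token",
}
print(missing_settings(example_config))  # ['SPACE_ID']
```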

Publish Data

Generate data for the mobility project and publish it to the data pipeline in GlassFlow using the Python SDK.

Install required libraries

Install the required libraries, including the GlassFlow SDK, listed in the requirements.txt file using the pip command in a terminal.

pip install -r requirements.txt

Publish real-time API events to the pipeline

Create a new Python script file called producer_api.py in your project root directory and insert the code below. This Python script serves as a data producer, fetching mobility events data from a mock API server and publishing it to a GlassFlow pipeline.

https://github.com/glassflow/glassflow-examples/blob/main/use-cases/mobility/producer_api.py

"""
Get mobility events data via a mockserver and publish it to glassflow
"""
import glassflow
from dotenv import dotenv_values
import requests


def get_mock_events():
    """
    Get mock events from the mock server
    """
    res = requests.get(
        "https://mock-mobility-s3r3lbzina-ey.a.run.app/mobility/producer/events/ride-completed",
        timeout=10,  # avoid hanging forever if the mock server is unreachable
    )
    if res.status_code == 200:
        return res.json()
    else:
        print("Failed to get mock events")
        return None


def main():
    config = dotenv_values(".env")
    pipeline_id = config.get("PIPELINE_ID")
    space_id = config.get("SPACE_ID")
    token = config.get("PIPELINE_ACCESS_TOKEN")

    client = glassflow.GlassFlowClient()
    pipeline_client = client.pipeline_client(space_id=space_id,
                                             pipeline_id=pipeline_id,
                                             pipeline_access_token=token)
    counter = 0
    while counter < 1000:
        try:
            event = get_mock_events()
            counter += 1
            if event:
                req = pipeline_client.publish(request_body=event[0])

                if req.status_code == 200:
                    print("Event published successfully", event[0])
                else:
                    print("Failed to publish event")
                    print(req.text)
        except Exception as e:
            print(e)
            break
        except KeyboardInterrupt:
            break


if __name__ == "__main__":
    main()

Run the script

Run the Python script producer_api.py:

python producer_api.py

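This script repeatedly fetches mock mobility events from the mock API server and publishes the first event of each response to the specified GlassFlow pipeline. It stops after 1,000 fetches, on the first error, or when you interrupt it.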
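
The fetch-and-publish loop can also be exercised without network access by passing the two operations in as callables. The sketch below is one way to structure it, not the code from the repository: publish_events, fetch, publish, pause_seconds, and the event_id field are hypothetical names introduced here. The optional pause is an addition for pacing requests and is not something the original script does.

```python
import time


def publish_events(fetch, publish, limit=1000, pause_seconds=0.0):
    """Fetch and publish up to `limit` events; return how many were published.

    fetch   -- callable returning a list of events, or None on failure
    publish -- callable taking one event and returning True on success
    Both are hypothetical stand-ins for get_mock_events() and the
    pipeline client's publish call in producer_api.py.
    """
    published = 0
    for _ in range(limit):
        events = fetch()
        # Publish only the first event of each response, as the script does.
        if events and publish(events[0]):
            published += 1
        if pause_seconds:
            time.sleep(pause_seconds)
    return published


# Offline demonstration with fake callables.
sent = []
count = publish_events(
    fetch=lambda: [{"event_id": len(sent)}],
    publish=lambda event: sent.append(event) is None,  # append returns None
    limit=3,
)
print(count)  # 3
```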

Consume Data

Consume transformed data from the mobility project data pipeline in GlassFlow and store it in a local file. You'll use the GlassFlow Python SDK to interact with the pipeline and retrieve the transformed data in real time.

Consume transformed data

Create a Python script consumer_file.py inside the mobility folder and add the following code:

https://github.com/glassflow/glassflow-examples/blob/main/use-cases/mobility/consumer_file.py

"""Get transformed data and store it locally on disk
"""
import glassflow
import sys
from dotenv import dotenv_values
import json


def main():
    config = dotenv_values(".env")
    pipeline_id = config.get("PIPELINE_ID")
    space_id = config.get("SPACE_ID")
    token = config.get("PIPELINE_ACCESS_TOKEN")

    client = glassflow.GlassFlowClient()
    pipeline_client = client.pipeline_client(space_id=space_id,
                                             pipeline_id=pipeline_id,
                                             pipeline_access_token=token)

    with open("mobility_data_transformed.txt", "a+") as f:
        while True:
            try:
                # consume transformed data from the pipeline
                res = pipeline_client.consume()
                if res.status_code == 200:
                    # get the transformed data as json
                    data = res.body.event
                    print("Data consumed successfully")
                    print(data)
                    f.write(json.dumps(data) + "\n")
                    f.flush()
            except KeyboardInterrupt:
                print("exiting")
                sys.exit(0)


if __name__ == "__main__":
    main()

Run the script

python consumer_file.py

The script will start consuming data continuously from the pipeline and storing it locally on disk. You can see an example of consumed data here. To watch new data being written to the file, run this command in another terminal window:

tail -f mobility_data_transformed.txt

You can extend this functionality to push the consumed data to cloud storage buckets or real-time databases per your project requirements.
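
As a small first step in that direction, the output file can be read back for a quick check. The consumer writes one JSON document per line, so the sketch below parses each line and counts how many events carry a discount. It assumes each consumed event is the dictionary returned by handle in transform.py; summarize_discounts and the sample events are hypothetical and only illustrate the file format.

```python
import json
import os
import tempfile


def summarize_discounts(path):
    """Count total events and discounted events in a JSON-lines file."""
    total = discounted = 0
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            event = json.loads(line)
            total += 1
            if event.get("discount", {}).get("discount"):
                discounted += 1
    return total, discounted


# Demonstration on a temporary file with two made-up events.
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "sample.txt")
    with open(sample_path, "w") as f:
        f.write(json.dumps({"discount": {"discount": True}}) + "\n")
        f.write(json.dumps({"discount": {"discount": False}}) + "\n")
    print(summarize_discounts(sample_path))  # (2, 1)
```

To run it against real output, call summarize_discounts("mobility_data_transformed.txt") after the consumer has written some events.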

See other use cases for more complex scenarios.

