Templates
Processing Events

Processing Events With Streaming Functions

In the /datamodels/models.ts file, you will find the PageViewProcessed data model:

moose/app/datamodels/models.ts
export interface PageViewProcessed extends PageViewRaw {
  hostname: string;
  device_vendor: string;
  device_type: string;
  device_model: string;
  browser_name: string;
  browser_version: string;
  os_name: string;
  os_version: string;
}

In this model, the user_agent field from the PageViewRaw model has been replaced by several fields that provide more structured information about the user's device, browser, and operating system.

We achieve this transformation using the Streaming Functions primitive. You can see the transformation code in the PageViewRaw__PageViewProcessed.ts file located in the /functions folder of your Moose project. Here is an example of what the code might look like:

moose/app/functions/PageViewRaw__PageViewProcessed.ts
import { PageViewRaw, PageViewProcessed } from "../datamodels/models";
import { UAParser } from "ua-parser-js";
 
export default function run(event: PageViewRaw): PageViewProcessed {
  // Process the user_agent to extract device, browser, and OS details
  const { browser, device, os } = UAParser(event.user_agent);
 
  // Extract the hostname from the URL
  const hostname = new URL(event.href).hostname;
 
  return {
    ...event,
    hostname: hostname,
    device_vendor: device.vendor || "",
    device_type: device.type || "desktop",
    device_model: device.model || "",
    browser_name: browser.name || "",
    browser_version: browser.version || "",
    os_name: os.name || "",
    os_version: os.version || "",
  };
}

Here's a breakdown of how this works:

  1. Function Definition: The run() function takes in an event of type PageViewRaw.
  2. User Agent Parsing: The function processes the user_agent property to extract details about the device, browser, and operating system. It uses the ua-parser-js library to parse the user agent string.
  3. Hostname Extraction: It extracts the hostname from the event URL.
  4. Return Processed Event: The function returns a new PageViewProcessed event object with the extracted details.

When you save this function in the functions folder, Moose automatically listens for new PageViewRaw events. Upon receiving a new record, Moose runs the run function to process the data and publishes the resulting PageViewProcessed event to the appropriate streaming topic. Finally, Moose stores this processed data in the PageViewProcessed table in Clickhouse, making it available for querying and analysis.

You can validate that the steaming function provided already worked on the PageViewRaw data you ingested by checking your terminal. You should see a message indicating that the function processed the data and the number of messages processed.

You can also verify the processed data by querying the PageViewProcessed table in your database explorer.

Important Note

The naming convention of your filename is critical. In order for Moose to identify your streaming functions, the file you define your function in must follow this convention:

<SOURCE_DATAMODEL_NAME>__<DESTINATION_DATAMODEL_NAME>

-SOURCE_DATAMODEL_NAME: The pre-processed data model that you want to transform via the function.

  • DESTINATION_DATAMODEL_NAME: The model for the return type of transformed data.

By adhering to this convention, Moose ensures that your functions are correctly recognized and executed. For more advanced details and additional capabilities of streaming functions, refer to the Streaming Functions documentation.