Data Modeling

Introduction to Data Models

Data Models are the backbone of your Moose application, enabling you to define the schemas for the data your application supports. Moose takes these definitions and automatically configures the entire data infrastructure, including:

  • Typed SDK: Captures data from upstream data sources or producers
  • Ingestion API Endpoint: Receives data from upstream sources, including but not limited to the SDK
  • Streaming Topic: Buffers incoming data from the ingestion server, handling peak loads without data loss
  • Database Landing Table: Stores ingested data in a structured format
  • Database View: Facilitates access and querying of the underlying data

Each component is strongly typed and kept in sync with the Data Model schema defined in your Moose application's source code.

Core Concepts

File and Folder Conventions

Data Models must be placed in the /datamodels directory of your Moose application. The directory structure should look like this:

  • datamodels
      • models.ts

    Data Model Definition

    Data Models are defined as TypeScript interfaces and must be exported from a .ts file with the export keyword:

    datamodels/models.ts
    import { Key } from "@514labs/moose-lib";
     
    export interface UserActivity {
      eventId: Key<string>;
      timestamp: string;
      userId: string;
      activity: string;
    }
    Notes
    • You can define multiple Data Models within a single file.
    • The file name is flexible, but it must have a .ts extension.
    • Moose automatically detects and generates the necessary infrastructure for any exported interfaces that adhere to the prescribed file and folder structure conventions.
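
As a minimal sketch of the first note, one file can export more than one Data Model. `PageView` is a hypothetical second model invented for illustration, and `Key` is declared locally as a stand-in (an assumption) so the snippet runs without the library; in a Moose project you would import `Key` from "@514labs/moose-lib" as shown above.

```typescript
// Local stand-in for the library's Key type (an assumption) so the snippet
// runs on its own. In a Moose project: import { Key } from "@514labs/moose-lib";
type Key<T extends string | number> = T;

// First Data Model, as defined above.
export interface UserActivity {
  eventId: Key<string>;
  timestamp: string;
  userId: string;
  activity: string;
}

// Hypothetical second Data Model living in the same file.
export interface PageView {
  viewId: Key<string>;
  userId: string;
  url: string;
}

// Sample records that type-check against each model.
const activity: UserActivity = {
  eventId: "evt-1",
  timestamp: "2024-01-01T00:00:00Z",
  userId: "user-1",
  activity: "login",
};

const view: PageView = { viewId: "view-1", userId: "user-1", url: "/home" };
```

Both interfaces are exported, so under the conventions described here each one would be treated as its own Data Model.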

    How Moose Interprets Data Models

    Moose translates the property names and data types of each detected Data Model into the corresponding infrastructure components (SDK, Ingestion API, Streaming Topic, Database Landing Table, Database View).

    Data Model Examples

    Basic Data Model

    sample_data.json
    {
      "example_UUID": "123e4567-e89b-12d3-a456-426614174000",
      "example_string": "string",
      "example_number": 123,
      "example_boolean": true,
      "example_array": [1, 2, 3]
    }
    datamodels/models.ts
    import { Key } from "@514labs/moose-lib";
     
    export interface BasicDataModel {
      example_UUID: Key<string>;
      example_string: string;
      example_number: number;
      example_boolean: boolean;
      example_array: number[];
    }
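
To see how the sample JSON lines up with the interface, the sketch below checks a parsed record against `BasicDataModel` at runtime. The `isBasicDataModel` helper is hypothetical and is not part of Moose; `Key` is again a local stand-in (an assumption) for the type imported from "@514labs/moose-lib".

```typescript
// Local stand-in for the library's Key type (an assumption) so the snippet
// runs on its own. In a Moose project: import { Key } from "@514labs/moose-lib";
type Key<T extends string | number> = T;

export interface BasicDataModel {
  example_UUID: Key<string>;
  example_string: string;
  example_number: number;
  example_boolean: boolean;
  example_array: number[];
}

// Hypothetical helper: checks at runtime that a value has the BasicDataModel shape.
function isBasicDataModel(value: unknown): value is BasicDataModel {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const arr = v["example_array"];
  return (
    typeof v["example_UUID"] === "string" &&
    typeof v["example_string"] === "string" &&
    typeof v["example_number"] === "number" &&
    typeof v["example_boolean"] === "boolean" &&
    Array.isArray(arr) &&
    arr.every((n) => typeof n === "number")
  );
}

// The contents of sample_data.json, parsed as untyped input.
const sample: unknown = JSON.parse(`{
  "example_UUID": "123e4567-e89b-12d3-a456-426614174000",
  "example_string": "string",
  "example_number": 123,
  "example_boolean": true,
  "example_array": [1, 2, 3]
}`);

const sampleIsValid = isBasicDataModel(sample);
```

Each JSON key matches a property name in the interface, and each JSON value matches that property's declared type.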

    Optional Fields

    sample.json
    [
      {
        "example_UUID": "123e4567-e89b-12d3-a456-426614174000",
        "example_string": "string",
        "example_number": 123,
        "example_boolean": true,
        "example_array": [1, 2, 3],
        "example_optional_string": "optional"
      },
      {
        "example_UUID": "123e4567-e89b-12d3-a456-426614174000",
        "example_string": "string",
        "example_number": 123,
        "example_boolean": true,
        "example_array": [1, 2, 3]
      }
    ]
    datamodels/models.ts
    import { Key } from "@514labs/moose-lib";
     
    export interface DataModelWithOptionalField {
      example_UUID: Key<string>;
      example_string: string;
      example_number: number;
      example_boolean: boolean;
      example_array: number[];
      example_optional_string?: string; // Use the `?` operator to mark a field as optional
    }
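
The two records in sample.json can be sketched as typed values: both satisfy the interface, and code reading the optional field has to handle `undefined`. The `optionalOrDefault` helper and its `"n/a"` fallback are hypothetical, and `Key` is a local stand-in (an assumption) for the library type.

```typescript
// Local stand-in for the library's Key type (an assumption) so the snippet
// runs on its own. In a Moose project: import { Key } from "@514labs/moose-lib";
type Key<T extends string | number> = T;

export interface DataModelWithOptionalField {
  example_UUID: Key<string>;
  example_string: string;
  example_number: number;
  example_boolean: boolean;
  example_array: number[];
  example_optional_string?: string;
}

// First record from sample.json: the optional field is present.
const withOptional: DataModelWithOptionalField = {
  example_UUID: "123e4567-e89b-12d3-a456-426614174000",
  example_string: "string",
  example_number: 123,
  example_boolean: true,
  example_array: [1, 2, 3],
  example_optional_string: "optional",
};

// Second record: the optional field is omitted and the object still type-checks.
const withoutOptional: DataModelWithOptionalField = {
  example_UUID: "123e4567-e89b-12d3-a456-426614174000",
  example_string: "string",
  example_number: 123,
  example_boolean: true,
  example_array: [1, 2, 3],
};

// Hypothetical helper: read the optional field, falling back to a default.
function optionalOrDefault(
  record: DataModelWithOptionalField,
  fallback: string = "n/a",
): string {
  return record.example_optional_string ?? fallback;
}
```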

    Nested Fields

    sample.json
    {
      "example_UUID": "123e4567-e89b-12d3-a456-426614174000",
      "example_string": "string",
      "example_number": 123,
      "example_boolean": true,
      "example_array": [1, 2, 3],
      "example_nested_object": {
        "example_nested_number": 456,
        "example_nested_boolean": true,
        "example_nested_array": [4, 5, 6]
      }
    }
    datamodels/models.ts
    import { Key } from "@514labs/moose-lib";
     
    // Define the nested object interface separately
    interface NestedObject {
      example_nested_number: number;
      example_nested_boolean: boolean;
      example_nested_array: number[];
    }
     
    export interface DataModelWithNestedObject {
      example_UUID: Key<string>;
      example_string: string;
      example_number: number;
      example_boolean: boolean;
      example_array: number[]; 
      example_nested_object: NestedObject; // Reference nested object interface
    }
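
As a rough sketch of working with the nested model, the snippet below parses the sample record and reads values through the nested interface. The cast after `JSON.parse` only informs the compiler and does not validate the data; `Key` is a local stand-in (an assumption) for the library type, and `nestedSum` is an illustrative name.

```typescript
// Local stand-in for the library's Key type (an assumption) so the snippet
// runs on its own. In a Moose project: import { Key } from "@514labs/moose-lib";
type Key<T extends string | number> = T;

// Nested object interface, defined separately as above.
interface NestedObject {
  example_nested_number: number;
  example_nested_boolean: boolean;
  example_nested_array: number[];
}

export interface DataModelWithNestedObject {
  example_UUID: Key<string>;
  example_string: string;
  example_number: number;
  example_boolean: boolean;
  example_array: number[];
  example_nested_object: NestedObject;
}

// The contents of sample.json. The cast is unchecked at runtime.
const record = JSON.parse(`{
  "example_UUID": "123e4567-e89b-12d3-a456-426614174000",
  "example_string": "string",
  "example_number": 123,
  "example_boolean": true,
  "example_array": [1, 2, 3],
  "example_nested_object": {
    "example_nested_number": 456,
    "example_nested_boolean": true,
    "example_nested_array": [4, 5, 6]
  }
}`) as DataModelWithNestedObject;

// Nested properties are typed through the NestedObject interface.
const nested: NestedObject = record.example_nested_object;
const nestedSum: number = nested.example_nested_array.reduce((a, b) => a + b, 0);
```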

    Moose offers a CLI helper that generates a Data Model by inferring its schema from a sample JSON file containing the data you want to ingest. The next section explains how to use it.