Introduction to Data Models
Data Models are the backbone of your Moose application, enabling you to define the schemas for the data your application supports. Moose takes these definitions and automatically configures the entire data infrastructure, including:
- Typed SDK: Captures data from upstream data sources or producers
- Ingestion API Endpoint: Receives data from upstream sources, including but not limited to the SDK
- Streaming Topic: Buffers incoming data from the ingestion server, handling peak loads without data loss
- Database Landing Table: Stores ingested data in a structured format
- Database View: Facilitates access and querying of the underlying data
Each component is strongly typed and kept in sync with the Data Model schema defined in your Moose application's source code.
Core Concepts
File and Folder Conventions
Data Models must be placed in the /datamodels directory of your Moose application. The directory structure should look like this:
- datamodels/
  - models.ts (TypeScript projects)
  - models.py (Python projects)
Data Model Definition
In TypeScript, Data Models are represented as interfaces and must be exported from a .ts file (using the export keyword):
import { Key } from "@514labs/moose-lib";
export interface UserActivity {
  eventId: Key<string>;
  timestamp: string;
  userId: string;
  activity: string;
}
In Python, Data Models are defined as dataclasses using the @dataclass decorator and must be located within a .py file. To ensure Moose automatically detects and registers the dataclass as a Data Model, import the @moose_data_model decorator from the moose_lib package and apply it to your dataclass:
from moose_lib import Key, moose_data_model
from dataclasses import dataclass
from typing import List
@moose_data_model
@dataclass
class UserActivity:
    event_id: Key[str]
    timestamp: str
    user_id: str
    activity: str
- You can define multiple Data Models within a single file.
- The file name is flexible, but it must have a .ts or .py extension.
- Moose automatically detects and generates the necessary infrastructure for any exported interfaces (TypeScript) or dataclasses decorated with @moose_data_model (Python) that adhere to the prescribed file and folder structure conventions.
How Moose Interprets Data Models
Moose automatically detects and uses any exported interfaces (TypeScript) or dataclasses decorated with @moose_data_model (Python) that adhere to the file and folder conventions. The property names and data types in the Data Model are interpreted and translated into the infrastructure components (SDK, Ingestion API, Streaming Topic, Database Landing Table, Database View).
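To make the idea of reading property names and data types off a Data Model more concrete, here is a minimal sketch using only the Python standard library. It is not Moose's actual implementation: `describe_fields` is a hypothetical helper, and a plain `str` stands in for `Key[str]` so the example runs without `moose_lib` installed.

```python
from dataclasses import dataclass, fields
from typing import get_type_hints


@dataclass
class UserActivity:
    # Plain str stands in for Key[str] so the sketch runs without moose_lib.
    event_id: str
    timestamp: str
    user_id: str
    activity: str


def describe_fields(model) -> dict[str, str]:
    """Hypothetical helper: map each field name to the name of its declared type."""
    hints = get_type_hints(model)
    return {f.name: hints[f.name].__name__ for f in fields(model)}


schema = describe_fields(UserActivity)
print(schema)
```

A mapping like this carries the kind of information (names plus types) that a framework would need in order to derive table columns or request validators from a single definition.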
Data Model Examples
Basic Data Model
{
  "example_UUID": "123e4567-e89b-12d3-a456-426614174000",
  "example_string": "string",
  "example_number": 123,
  "example_boolean": true,
  "example_array": [1, 2, 3]
}
import { Key } from "@514labs/moose-lib";
export interface BasicDataModel {
  example_UUID: Key<string>;
  example_string: string;
  example_number: number;
  example_boolean: boolean;
  example_array: number[];
}
from dataclasses import dataclass
from moose_lib import Key, moose_data_model
@moose_data_model
@dataclass
class BasicDataModel:
    example_UUID: Key[str]
    example_string: str
    example_number: int
    example_boolean: bool
    example_array: list[int]
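As a quick check that the sample JSON and the dataclass agree, the sketch below loads the payload and unpacks it into the model. It assumes a plain dataclass: the `@moose_data_model` decorator and `Key` type are left out so the snippet runs with the standard library alone. Note that dataclasses do not validate types at runtime, so this only confirms that the field names line up.

```python
import json
from dataclasses import dataclass


@dataclass
class BasicDataModel:
    example_UUID: str  # Key[str] in the Moose Data Model
    example_string: str
    example_number: int
    example_boolean: bool
    example_array: list[int]


sample = """
{
  "example_UUID": "123e4567-e89b-12d3-a456-426614174000",
  "example_string": "string",
  "example_number": 123,
  "example_boolean": true,
  "example_array": [1, 2, 3]
}
"""

# Unpacking raises TypeError if the payload has missing or unexpected keys.
record = BasicDataModel(**json.loads(sample))
print(record)
```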
Optional Fields
[
  {
    "example_UUID": "123e4567-e89b-12d3-a456-426614174000",
    "example_string": "string",
    "example_number": 123,
    "example_boolean": true,
    "example_array": [1, 2, 3],
    "example_optional_string": "optional"
  },
  {
    "example_UUID": "123e4567-e89b-12d3-a456-426614174000",
    "example_string": "string",
    "example_number": 123,
    "example_boolean": true,
    "example_array": [1, 2, 3]
  }
]
import { Key } from "@514labs/moose-lib";
export interface DataModelWithOptionalField {
  example_UUID: Key<string>;
  example_string: string;
  example_number: number;
  example_boolean: boolean;
  example_array: number[];
  example_optional_string?: string; // Use the `?` operator to mark a field as optional
}
from dataclasses import dataclass
from typing import Optional
from moose_lib import Key, moose_data_model
@moose_data_model
@dataclass
class DataModelWithOptionalField:
    example_UUID: Key[str]
    example_string: str
    example_number: int
    example_boolean: bool
    example_array: list[int]
    example_optional_string: Optional[str]  # Use the Optional type to mark a field as optional
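For illustration, the following sketch shows one way an `Optional[...]` annotation can be recognized programmatically, relying on the fact that `Optional[X]` is shorthand for `Union[X, None]`. The helper `optional_field_names` is hypothetical and is not part of `moose_lib`; the model is a trimmed, plain-dataclass version of the one above.

```python
from dataclasses import dataclass, fields
from typing import Optional, Union, get_args, get_origin, get_type_hints


@dataclass
class DataModelWithOptionalField:
    example_UUID: str  # Key[str] in the Moose Data Model
    example_string: str
    example_optional_string: Optional[str]


def optional_field_names(model) -> list[str]:
    """Hypothetical helper: list fields annotated as Optional[...]."""
    hints = get_type_hints(model)
    names = []
    for f in fields(model):
        hint = hints[f.name]
        # Optional[X] is Union[X, None], so look for NoneType among the union members.
        if get_origin(hint) is Union and type(None) in get_args(hint):
            names.append(f.name)
    return names


optional_names = optional_field_names(DataModelWithOptionalField)
print(optional_names)
```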
Nested Fields
{
  "example_UUID": "123e4567-e89b-12d3-a456-426614174000",
  "example_string": "string",
  "example_number": 123,
  "example_boolean": true,
  "example_array": [1, 2, 3],
  "example_nested_object": {
    "example_nested_number": 456,
    "example_nested_boolean": true,
    "example_nested_array": [4, 5, 6]
  }
}
import { Key } from "@514labs/moose-lib";
// Define the nested object interface separately
interface NestedObject {
  example_nested_number: number;
  example_nested_boolean: boolean;
  example_nested_array: number[];
}
export interface DataModelWithNestedObject {
  example_UUID: Key<string>;
  example_string: string;
  example_number: number;
  example_boolean: boolean;
  example_array: number[];
  example_nested_object: NestedObject; // Reference nested object interface
}
from moose_lib import Key, moose_data_model
from dataclasses import dataclass
@dataclass
class ExampleNestedObject:
    example_nested_number: int
    example_nested_boolean: bool
    example_nested_array: list[int]
@moose_data_model  # Only register the outer dataclass
@dataclass
class DataModelWithNestedObject:
    example_UUID: Key[str]
    example_string: str
    example_number: int
    example_boolean: bool
    example_array: list[int]
    example_nested_object: ExampleNestedObject
The example_nested_object field in the Data Model is a nested object that is defined using the ExampleNestedObject dataclass. The @moose_data_model decorator is used to register only the outer DataModelWithNestedObject dataclass with Moose.
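To see how the nested dataclass corresponds to the nested JSON shape shown earlier, the sketch below converts an instance to a dictionary with the standard library's `dataclasses.asdict`, which recurses into nested dataclasses. It uses trimmed, plain-dataclass versions of the models (no `moose_lib`), so it only illustrates the shape of the data, not how Moose serializes it.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ExampleNestedObject:
    example_nested_number: int
    example_nested_boolean: bool
    example_nested_array: list[int]


@dataclass
class DataModelWithNestedObject:
    example_UUID: str  # Key[str] in the Moose Data Model
    example_number: int
    example_nested_object: ExampleNestedObject


record = DataModelWithNestedObject(
    example_UUID="123e4567-e89b-12d3-a456-426614174000",
    example_number=123,
    example_nested_object=ExampleNestedObject(456, True, [4, 5, 6]),
)

# asdict converts the nested dataclass into a nested dict as well.
payload = asdict(record)
print(json.dumps(payload, indent=2))
```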
Moose provides a CLI helper that automatically generates a Data Model, inferring its schema from a sample JSON file containing the data you want to ingest. The next section explains how to use this helper.