Data Stack Architecture
Based on the Moose primitives that you define in your application, Moose will automatically configure and manage your underlying data infrastructure, including:
- An ingestion layer with API endpoints and automatically generated SDKs (as defined by your Data Models), to capture data from other applications or services
- A storage and processing layer - with streaming buffers to handle peak load ingestion, and database tables to capture raw and processed data - again, as defined by your Data Models
- An orchestration layer for managing data movement and transformations - as defined by your Streaming Functions and Aggregations
- A semantic layer to maintain data products available for downstream consumers - as defined by your Data Models, Streaming Functions, and Aggregations
- A consumption layer with API endpoints and automatically generated SDKs (as defined by your Data Models, Streaming Functions, and Aggregations), to make data easily accessible by other applications and services
Packaging Architecture
Your Moose applications can be packaged up to run in two similar but distinct architectures:
- Local dev server, aka “Dev”
- Production deployment, aka “Prod”
The “Dev” packaging is designed to run locally on your machine as you are developing.
You can locally run your Moose app in this configuration automatically via the Moose CLI with the moose dev
command.
Your app will update in realtime based on code changes, and generally does not seek to preserve local data across changes.
This architecture comes with “batteries included”, container-based infrastructure packaged in, so it’s lightweight and easy to get up and running.
Specifically, a Redpanda (opens in a new tab) container is included for data streaming,
a Clickhouse (opens in a new tab) container is included for data storage.
Node/Python processes are also created for data processing (ie. Streaming Functions).
The “Prod” architecture is designed to run on-prem or in the cloud of your choice, and support highly scaled production workloads. Your entire Moose application is packaged into a single container for easy deployment and scaling. You can package your Moose app in this configuration automatically via the Moose CLI with the moose build --docker
command. It can be updated with a standard deployment process (such as CI/CD automation), and it provides deep assurances of underlying data integrity across versions and deployments. This architecture does NOT come with storage and streaming infrastructure included – it’s designed to run on top of the production data streaming and data storage clusters of your choice. Currently Moose supports Redpanda (opens in a new tab) for data streaming, and Clickhouse (opens in a new tab) for data storage.