This documentation defines the architectural and implementation standard for new projects ("greenfield"). It establishes guidelines for building a robust, scalable, and maintainable Modular Monolith using a modern, strictly typed technology stack.
All implementations must strictly adhere to the following technologies:
- Language: Go 1.24+.
- Architecture: Modular Monolith.
- Communication/Contract: gRPC and Protocol Buffers (Single Source of Truth).
- External API:
  - gRPC: Primary backend-backend communication protocol.
  - REST/HTTP: Automatically exposed via `grpc-gateway` (reverse proxy).
  - WebSocket: Bidirectional real-time communication (`/ws`).
  - GraphQL (Optional): Flexible API with subscriptions via WebSocket (`/graphql`).
  - Documentation: Swagger UI (OpenAPI v2) available at `/swagger-ui/` (dev only).
- Persistence: SQLC (Type-safe SQL).
- Database: PostgreSQL (with versioned migrations).
- Local Infrastructure: Docker Compose.
- Migrations: `golang-migrate` (schema management).
- Observability:
  - Logs: Structured Logging (`log/slog`) with JSON format.
  - Metrics: OpenTelemetry (OTel) exposing metrics in Prometheus format.
  - Tracing: OpenTelemetry (Context propagation).
Folder organization is critical for maintaining modularity. Each module must be self-contained.
project/
├── cmd/
│ ├── server/ # Monolith Entrypoint
│ │ ├── main.go # Minimal entrypoint (orchestrates setup)
│ │ ├── setup/ # Server setup utilities
│ │ │ ├── config.go # Configuration loading
│ │ │ ├── database.go # Database initialization
│ │ │ ├── gateway.go # HTTP gateway setup (includes GraphQL)
│ │ │ ├── registry.go # Registry creation and module registration
│ │ │ └── server.go # gRPC and HTTP server setup
│ │ ├── health/ # Health check handlers
│ │ │ └── handlers.go # Health, readiness, liveness endpoints
│ │ ├── observability/ # Observability setup
│ │ │ ├── init.go # Observability initialization
│ │ │ ├── logger.go # Structured logging setup
│ │ │ ├── metrics.go # Prometheus metrics setup
│ │ │ └── tracing.go # OpenTelemetry tracing setup
│ │ └── commands/ # Command-line commands
│ │ ├── migrate.go # Migration command
│ │ ├── seed.go # Seed data command
│ │ └── admin.go # Admin tasks command
│ └── auth/ # Microservice Entrypoint (main.go)
├── configs/ # YAML configurations per application
│ ├── server.yaml # Monolith configuration
│ └── auth.yaml # Microservice configuration
├── internal/
│ ├── config/ # Central configuration loader (YAML + Env)
│ ├── testutil/ # Testing utilities
│ │ ├── registry.go # Test registry builder
│ │ ├── grpc.go # gRPC test server utilities
│ │ ├── events.go # Event bus testing utilities
│ │ └── migration.go # Migration testing helpers
│ └── graphql/ # Optional GraphQL support
│ ├── schema/ # GraphQL schema files
│ ├── resolver/ # GraphQL resolvers
│ └── generated/ # Generated GraphQL code
├── scripts/ # Automation scripts (scaffolding)
├── proto/ # Centralized API definitions
│ ├── google/ # Google dependencies (API, Protobuf)
│ └── [module]/ # Module-specific protos (v1)
├── modules/ # Business Modules
│ └── [module_name]/
│ ├── internal/
│ │ ├── service/ # gRPC Server implementation (Business Logic)
│ │ ├── repository/ # Data access adapters (Interface)
│ │ ├── models/ # Domain models
│ │ └── db/
│ │ ├── query/ # .sql files (Handwritten queries)
│ │ └── store/ # Go code generated by SQLC
│ └── resources/
│ └── db/
│ └── migration/ # SQL DDL scripts (Schema Versioning)
├── examples/ # Integration test examples
├── sqlc.yaml # Global SQL generation configuration
├── buf.yaml # Buf configuration
└── go.mod
The success of a modulith depends on discipline. A rotten module infects the others.
- Imports: A module `A` can NEVER import anything from the `internal/` folder of a module `B`.
- Communication: The only legitimate forms of communication between modules are:
- gRPC (in-process): Calling through the generated gRPC client (using the internal gateway). Being in-process, there are no network hops; it's a direct function call through the gRPC stack, guaranteeing performance and strong contracts.
- Events: Publish/Subscribe (if implemented in the future).
- Data: Sharing repositories, SQLC queries, or database models between modules is forbidden. Each module is the absolute owner of its schema.
- DTOs: Protobuf messages are the common language. Types from `store/` or `repository/` should not leak outside the module.
To avoid endless debates, we establish the following standard:
- Domain Ownership: Business logic resides in the `service/` layer.
- Simple Models: We don't use rich entities (complex DDD) unless strictly necessary.
- Flow: `store` (DB) -> `repository` (Adapter) -> `service` (Domain/Business) -> `proto` (DTO).
- Repository: Returns simple structs from `store` or basic domain models in `internal/models/`. There is no business logic in the repository.
To improve traceability, debugging, and sortability of data, we adopt the standard of Prefixed and Time-Orderable Identifiers (Stripe style).
- Standard: We will use TypeID (`github.com/jetpack-io/typeid-go`), which combines a readable prefix with a UUIDv7.
- Format: `prefix_01h455vb4pex5vsknk084sn02q`.
  - Prefix: Indicates the entity type (e.g. `user`, `role`, `org`). Maximum 8 characters.
  - Suffix: A UUIDv7 encoded in Base32 (Crockford), making it lexicographically sortable.
- Advantages:
  - Sortable: Time-based sortability allows databases (PostgreSQL) to index more efficiently than with random UUIDs.
  - Contextual: When seeing an ID in a log (`user_...`), we immediately know which entity it belongs to.
  - Security: They are globally unique and hard to predict.
- Ownership: TypeIDs are generated only in the `service` layer. The repository and database are passive and never generate identifiers.
- Semantics: Prefixes are purely informative for humans and traceability; they should not be used for authorization logic or cross-domain access.
Note
In this document, for simplicity, TypeIDs are represented and stored as complete VARCHAR. In high-performance implementations, only the binary suffix could be stored as UUID and the prefix reconstructed in the application.
All gRPC requests are automatically validated using protovalidate. The validation interceptor runs globally for all modules - no per-module setup required.
- Global Interceptor: Registered once in `cmd/server/setup/server.go` and applies to all gRPC requests
- Automatic Detection: Validates any protobuf message with validation annotations
- Zero Configuration: New modules automatically get validation
Add validation annotations to your proto messages:
import "buf/validate/validate.proto";
message CreateUserRequest {
string email = 1 [(buf.validate.field).string = {
email: true,
min_len: 1
}];
string phone = 2 [(buf.validate.field).string.pattern = "^\\+?[1-9]\\d{1,14}$"];
}

- Interceptor Level (Automatic): Field format validation (email, phone, URI, length, patterns)
- Service Level (Business Logic): Cross-field validation, database lookups, domain rules
Validation errors are automatically converted to codes.InvalidArgument with detailed field-level messages.
See .cursor/rules/25-protobuf-validation.mdc for comprehensive validation examples and best practices.
The template follows a package-based versioning strategy for Protocol Buffers, enabling multiple API versions to coexist while maintaining backward compatibility.
- Package Versioning: Each API version uses a distinct package name (e.g., `auth.v1`, `auth.v2`)
- Directory Structure: Proto files are organized by module and version: `proto/{module}/v{version}/`
- REST Path Versioning: HTTP endpoints include the version prefix via `grpc-gateway` annotations (e.g., `/v1/auth/...`, `/v2/auth/...`)
- Breaking Changes: Require a new version directory; non-breaking changes modify existing versions
- Coexistence: Multiple versions can run simultaneously during migration periods
proto/
├── auth/
│ ├── v1/
│ │ └── auth.proto # package auth.v1; → /v1/auth/...
│ └── v2/
│ └── auth.proto # package auth.v2; → /v2/auth/...
└── order/
└── v1/
└── order.proto # package order.v1; → /v1/order/...
Create a new API version (v2, v3, etc.) only when you need to make breaking changes:
- Removing fields from messages
- Changing field types (e.g., `string` → `int32`)
- Removing RPC methods from services
- Changing RPC signatures (request/response types)
- Changing field numbers (violates protobuf compatibility)
Non-breaking changes can be made to existing versions:
- Adding new fields (with new field numbers)
- Adding new RPC methods
- Adding new optional fields
- Deprecating fields (using `deprecated = true`)
- Create new version directory: `mkdir -p proto/{module}/v{version}`
- Copy and modify the proto file from the previous version
- Update package name: `package {module}.v{version};`
- Update REST paths: Change `/v{old}/` to `/v{new}/` in HTTP annotations
- Update Go package option: `option go_package = ".../proto/{module}/v{version};{module}v{version}";`
- Generate code: `just proto`
- Implement new service handlers in the module
Use the provided tooling:
# Create a new API version for a module
just proto-version-create MODULE_NAME=auth VERSION=v2
# This will:
# - Create proto/auth/v2/ directory
# - Copy proto/auth/v1/auth.proto as a starting point
# - Update package name and paths
# - Generate code automatically

The project uses Buf for breaking change detection:
# buf.yaml
version: v1
breaking:
  use:
    - FILE

Check for breaking changes before committing:
# Check for breaking changes in proto files
just proto-breaking-check
# Or check a specific module
just proto-breaking-check MODULE_NAME=auth

- Maintain Old Versions: Keep previous versions active during migration
- Gradual Migration: Migrate clients to new versions over time
- Deprecation Warnings: Use `deprecated = true` in proto fields/methods
- Documentation: Document migration guides for breaking changes
- Sunset Policy: Define a timeline for removing old versions
Step 1: Create new version
just proto-version-create MODULE_NAME=auth VERSION=v2

Step 2: Modify the new proto file
// proto/auth/v2/auth.proto
syntax = "proto3";
package auth.v2; // Changed from auth.v1
import "google/api/annotations.proto";
option go_package = ".../gen/go/proto/auth/v2;authv2";
service AuthService {
rpc RequestLogin(RequestLoginRequest) returns (RequestLoginResponse) {
option (google.api.http) = {
post: "/v2/auth/login/request" // Changed from /v1/
body: "*"
};
}
// New method in v2
rpc RequestLoginWithBiometric(RequestLoginWithBiometricRequest) returns (RequestLoginResponse) {
option (google.api.http) = {
post: "/v2/auth/login/biometric"
body: "*"
};
}
}

Step 3: Generate code

just proto

Step 4: Implement service handlers
Both auth.v1 and auth.v2 services will be registered and available simultaneously.
Generated Go code follows the same version structure:
gen/go/proto/
├── auth/
│ ├── v1/
│ │ ├── auth.pb.go
│ │ ├── auth_grpc.pb.go
│ │ └── auth.pb.gw.go
│ └── v2/
│ ├── auth.pb.go
│ ├── auth_grpc.pb.go
│ └── auth.pb.gw.go
Import both versions in your code:
import (
authv1 "github.com/.../gen/go/proto/auth/v1"
authv2 "github.com/.../gen/go/proto/auth/v2"
)

REST endpoints automatically reflect the proto version through grpc-gateway:
- `proto/auth/v1/auth.proto` → `/v1/auth/*` endpoints
- `proto/auth/v2/auth.proto` → `/v2/auth/*` endpoints
Both versions are accessible simultaneously:
# v1 endpoint
curl -X POST http://localhost:8080/v1/auth/login/request
# v2 endpoint
curl -X POST http://localhost:8080/v2/auth/login/request

- Start with v1: All new modules begin with `v1`
- Avoid Premature Versioning: Only create new versions for breaking changes
- Document Changes: Use CHANGELOG.md to document version changes
- Test Both Versions: Ensure old and new versions work correctly
- Migration Windows: Provide sufficient time for clients to migrate
- Use Deprecation: Mark old fields/methods as deprecated before removal
- Monitor Usage: Track which versions are actively used before sunsetting
The template provides a standardized error handling system in internal/errors that eliminates boilerplate and guarantees consistency.
Instead of manually mapping each error to gRPC codes, we use typed domain errors:
import "github.com/LoopContext/go-modulith-template/internal/errors"
// In the service
func (s *Service) CreateUser(ctx context.Context, req *pb.Request) (*pb.Response, error) {
// Domain errors are automatically mapped
if err := s.repo.CreateUser(ctx, id, email); err != nil {
return nil, errors.ToGRPC(errors.Internal("failed to create user", err))
}
return &pb.Response{Id: id}, nil
}

The `internal/errors` package provides constructors for all common cases:
// Not found (maps to codes.NotFound)
errors.NotFound("user not found")
// Validation (maps to codes.InvalidArgument)
errors.Validation("invalid email format")
// Already exists (maps to codes.AlreadyExists)
errors.AlreadyExists("user already exists")
// Unauthorized (maps to codes.Unauthenticated)
errors.Unauthorized("authentication required")
// Forbidden (maps to codes.PermissionDenied)
errors.Forbidden("access denied")
// Conflict (maps to codes.AlreadyExists)
errors.Conflict("resource conflict")
// Internal (maps to codes.Internal)
errors.Internal("internal server error")
// Unavailable (maps to codes.Unavailable)
errors.Unavailable("service temporarily unavailable")

Errors can include additional details:
err := errors.NotFound("user not found",
errors.WithDetail("user_id", userID),
)
// Note: Domain errors automatically wrap underlying errors when created

if errors.Is(err, errors.TypeNotFound) {
// Handle not found case
}
var domainErr *errors.DomainError
if errors.As(err, &domainErr) {
// Access error details
log.Info("error type", "type", domainErr.Type)
}

- ✅ Consistency: All services use the same error format
- ✅ Traceability: Errors wrap the complete chain with `%w`
- ✅ Type-safe: The compiler catches type errors
- ✅ Less Boilerplate: No more manual `status.Error()` in each service
Transactions must be controlled by the business layer (service) but executed by the repository.
- WithTx Pattern: The repository must offer a way to execute multiple operations in an atomic transaction.
- Conceptual Example:
err := r.WithTx(ctx, func(txRepo Repository) error {
// These operations occur within the same transaction
if err := txRepo.CreateUser(ctx, ...); err != nil { return err }
if err := txRepo.AssignRole(ctx, ...); err != nil { return err }
return nil
})

We establish a clear boundary to avoid duplicate validations:
- Structural (Proto) - Automatic: Format, length, required fields, ranges, patterns. Automatically validated by the protovalidate interceptor (see Section 6). Add validation annotations to proto messages - no code needed.
- Business (Service) - Manual: Cross-field validation, database existence checks, complex permissions, state rules, temporal logic. Handled in the service layer.
See Section 6: gRPC Request Validation for details on adding validation annotations to proto messages.
- Token Validation: Performed centrally in a global gRPC Interceptor.
- Context: The interceptor extracts `user_id` and `role` from the token and injects them into `context.Context` so they're available throughout the call chain.
- Public Endpoints: Modules declare their public endpoints (login, registration) that don't require authentication.
The template includes authorization helpers in internal/authz to implement role and permission-based access control:
import "github.com/LoopContext/go-modulith-template/internal/authz"
func (s *Service) DeleteUser(ctx context.Context, req *pb.Request) (*pb.Response, error) {
// Require specific permission
if err := authz.RequirePermission(ctx, "users:delete"); err != nil {
return nil, errors.ToGRPC(err)
}
// Business logic...
}

// Require one of multiple roles
if err := authz.RequireRole(ctx, authz.RoleAdmin, authz.RoleModerator); err != nil {
return nil, errors.ToGRPC(err)
}

// Ensure user owns the resource
if err := authz.RequireOwnership(ctx, req.UserId); err != nil {
return nil, errors.ToGRPC(err)
}
// Allow ownership OR specific roles (flexible)
if err := authz.RequireOwnershipOrRole(ctx, req.UserId, authz.RoleAdmin); err != nil {
return nil, errors.ToGRPC(err)
}

Register custom roles during module initialization:
func init() {
authz.RegisterRole("moderator",
"posts:delete",
"comments:delete",
"users:ban",
)
authz.RegisterRole("editor",
"posts:create",
"posts:edit",
"posts:publish",
)
}

- `admin`: Has wildcard permission (`*`) - full access
- `user`: Basic permissions (`users:read`, `profile:read`, `profile:edit`)
- ✅ Centralized: All authorization logic in one place
- ✅ Reusable: Same helpers work across all modules
- ✅ Type-safe: Roles and permissions as typed constants
- ✅ Flexible: Supports permissions, roles, and ownership
The template supports authentication with external providers using markbates/goth:
- Supported providers: Google, Facebook, GitHub, Apple, Microsoft, Twitter/X
- Auto-link by email: Automatically links external accounts to existing users with the same email
- Manual linking: Users can link/unlink accounts from their profile
- Token encryption: OAuth tokens are encrypted with AES-256-GCM before storage
For complete configuration, see OAuth Integration Guide.
The configuration hierarchy favors flexibility both in development and complex microservices deployments.
The application loads configuration following a strict precedence order (from lowest to highest priority):
1. Default Values: Hardcoded values in `config.go` (e.g. `Env: "dev"`, `HTTPPort: "8080"`).
2. System Environment Variables: Variables defined in the environment where the application runs (`os.Getenv`).
3. `.env` File: Variables loaded from the `.env` file in the project root (using `godotenv`). Overrides system variables.
4. YAML File: Configuration located in `configs/` (e.g. `configs/server.yaml`). Has the highest priority and overrides everything above.
Final Precedence Order: YAML > .env > system ENV vars > defaults
On application startup, a structured log is recorded showing the final value and source of each configuration variable:
Configuration sources
ENV="dev = yaml"
HTTP_PORT="8080 = yaml"
DB_DSN="postgres://... = yaml"
JWT_SECRET="[42 bytes] = yaml"
This facilitates debugging and understanding which source is providing each value.
- Add the field to the struct in `internal/config/config.go` with the corresponding `yaml` and `env` tags.
- Implement loading logic in `OverrideWithEnv` and `OverrideWithEnvFromDotenv` to support environment variables.
- Update YAML files in `configs/` if the value is environment-specific.
- Inject the configuration struct in the `Initialize` function of the corresponding module.
Although they reside in YAML, these variables are critical for the runtime environment:
- `ENV`: `dev` or `prod`. Determines log level and activation of debugging tools.
- `DB_DSN`: PostgreSQL connection string.
- `JWT_SECRET`: Secret key for JWT tokens. Must be at least 32 bytes (256 bits) for the HS256 algorithm. Automatically validated when loading configuration.
- `HTTP_PORT` / `GRPC_PORT`: Listening ports.
The system automatically validates configuration before starting:
- JWT Secret: Must be at least 32 bytes to meet HS256 security requirements.
- Production: In `prod` mode, `DB_DSN` and `JWT_SECRET` are required.
We use Docker Compose to start dependencies (Database).
- PostgreSQL port is configurable via `DB_PORT` in the host's `.env`.
- Useful commands in the `justfile`: `just docker-up`, `just docker-down`.
Observability is a first-class citizen. Code should not be deployed without visibility.
We use the standard library log/slog (Go 1.21+).
- Format: JSON in production, Text in development.
- Context: Every log must include `trace_id` and `span_id` if they exist in the context.
- Levels: INFO (normal flow), ERROR (exceptions), DEBUG (dev only). DEBUG level is enabled by default in development to facilitate debugging.
- Early Initialization: The logger is initialized in two phases: first with a basic logger before loading configuration (to see initialization logs), and then re-initialized with complete configuration (format, level) after loading configuration.
- Privacy (PII): NEVER log sensitive information (emails, tokens, passwords).
slog.InfoContext(ctx, "user created", "user_id", id) // Avoid logging email here

We instrument the application using the OpenTelemetry SDK.
- Protocol: Prometheus (`/metrics`).
- Standard Metrics:
  - `http_request_duration_seconds` (Histogram)
  - `grpc_server_handled_total` (Counter)
- Mapping: Automatic middleware/interceptors for gRPC and HTTP.
The system exposes two critical endpoints for the orchestrator (K8s):
- `/healthz` (Liveness): Indicates if the process is alive. Returns `200 OK`.
- `/readyz` (Readiness): Indicates if the service can receive traffic. Validates the database connection using `db.PingContext(r.Context())` to respect HTTP client timeouts and allow the orchestrator to cancel the check if necessary.
We implement distributed tracing using the OTLP exporter.
- Propagation: Traces automatically travel through gRPC interceptors.
- Context: Allows seeing the path of a request from the gateway to the repository.
To eliminate OpenTelemetry boilerplate, the template provides helpers that simplify instrumentation:
import "github.com/LoopContext/go-modulith-template/internal/telemetry"
// Service layer - auto-includes module and operation attributes
func (s *Service) CreateUser(ctx context.Context, req *pb.Request) (*pb.Response, error) {
ctx, span := telemetry.ServiceSpan(ctx, "auth", "CreateUser")
defer span.End()
// Add custom attributes
telemetry.SetAttribute(ctx, "user_email", req.Email)
// Business logic...
if err != nil {
telemetry.RecordError(ctx, err)
return nil, errors.ToGRPC(err)
}
return &pb.Response{Id: id}, nil
}
// Repository layer - includes entity name
func (r *Repo) GetUser(ctx context.Context, id string) (*User, error) {
ctx, span := telemetry.RepositorySpan(ctx, "auth", "GetUser", "user")
defer span.End()
user, err := r.q.GetUserByID(ctx, id)
if err != nil {
telemetry.RecordError(ctx, err)
return nil, fmt.Errorf("failed to get user: %w", err)
}
return user, nil
}

- `telemetry.StartSpan(ctx, name)` - Basic span
- `telemetry.ServiceSpan(ctx, module, operation)` - Service layer span
- `telemetry.RepositorySpan(ctx, module, operation, entity)` - Repository span
- `telemetry.SetAttribute(ctx, key, value)` - Add attribute to current span
- `telemetry.RecordError(ctx, err)` - Record error in span (uses ctx, not span)
- `telemetry.AddEvent(ctx, name, attrs)` - Add event to span
- ✅ Less Boilerplate: No more imports from multiple OTel packages
- ✅ Consistency: All spans follow the same naming convention
- ✅ Automatic Attributes: Module, operation and entity are automatically included
- ✅ Context Propagation: Context propagates correctly between layers
To avoid tight coupling between modules, we have an internal Event Bus (internal/events).
- Pub/Sub Pattern: Modules subscribe to events (e.g. `user.created`) without knowing who emits them.
- Non-Blocking: Event publication occurs in separate goroutines to avoid penalizing gRPC/HTTP response time.
- Extensibility: Facilitates adding side effects (auditing, notifications) without modifying the original service.
- Distributed Events: For multi-instance deployments, see Distributed Events Guide for Kafka, Valkey Pub/Sub, and other distributed implementations.
To avoid typos and improve autocomplete, the template includes typed constants for common events:
import "github.com/LoopContext/go-modulith-template/internal/events"
// In the service - using typed constants
bus.Publish(ctx, events.Event{
Name: events.UserCreatedEvent, // Autocomplete available!
Payload: events.NewUserCreatedPayload(userID, email),
})
// Subscription - using the same constants
bus.Subscribe(events.UserCreatedEvent, func(ctx context.Context, e events.Event) error {
slog.InfoContext(ctx, "audit: logging user creation", "user_id", e.Payload["user_id"])
return nil
})

The template includes common events from the auth module:
// Auth module events
events.UserCreatedEvent // "user.created"
events.MagicCodeRequestedEvent // "auth.magic_code_requested"
events.SessionCreatedEvent // "auth.session_created"
events.ProfileUpdatedEvent // "user.profile_updated"
events.OAuthAccountLinkedEvent // "auth.oauth_account_linked"
events.ContactChangeRequestedEvent // "user.contact_change_requested"

Add your events in internal/events/types.go:
const (
OrderCreatedEvent = "order.created"
OrderCancelledEvent = "order.cancelled"
OrderShippedEvent = "order.shipped"
)
// Helper to create type-safe payloads
func NewOrderCreatedPayload(orderID, userID string, amount float64) (map[string]any, error) {
if orderID == "" || userID == "" {
return nil, Validation("order ID and user ID are required")
}
return map[string]any{
"order_id": orderID,
"user_id": userID,
"amount": amount,
}, nil
}

- ✅ Type-safe: The compiler detects incorrect event names
- ✅ Autocomplete: IDEs suggest available events
- ✅ Validation: Payload helpers validate required fields
- ✅ Documentation: Events are centralized and easy to discover
The project includes complete support for WebSocket (internal/websocket), enabling bidirectional real-time communication with clients.
- Event Bus Integration: Events published to the bus can be automatically sent to connected WebSocket clients.
- JWT Authentication: WebSocket connections are protected via a JWT extracted from a query parameter (`?token=...`).
- Directed Messages: Support for broadcast (all clients) and messages directed to a specific `user_id`.
- Lifecycle Management: Automatic handling of connections, disconnections, and heartbeat (ping/pong).
┌─────────────┐ ┌──────────────┐ ┌─────────────┐
│ Client │─────▶│ WebSocket │─────▶│ Hub │
│ (Browser) │ │ Handler │ │ (Manager) │
└─────────────┘ └──────────────┘ └─────────────┘
│
▼
┌─────────────┐
│ Event Bus │
│ Subscriber │
└─────────────┘
// Send event from a module (will propagate via WebSocket)
bus.Publish(ctx, events.Event{
Name: "notification.new",
Payload: map[string]any{
"user_id": "user_123",
"message": "New notification",
},
})
// WebSocket subscriber captures it and sends to connected clients

Endpoint: `ws://localhost:8080/ws?token={jwt_token}`
See complete guide: docs/WEBSOCKET_GUIDE.md
The project supports optional GraphQL integration using gqlgen, providing a flexible alternative to gRPC/REST.
- Schema per Module: Each module defines its own GraphQL schema (`internal/graphql/schema/{module}.graphql`).
- Subscriptions: Support for real-time subscriptions via WebSocket.
- Event Bus Integration: Subscriptions can listen to events from the internal bus.
- Automated Setup: Installation and configuration script (`scripts/graphql-add-to-project.sh`).
internal/graphql/
├── schema/
│ ├── schema.graphql # Root schema (combines all)
│ ├── auth.graphql # Auth module schema
│ └── order.graphql # Order module schema
├── resolver/
│ ├── resolver.go # Root resolver
│ ├── auth.go # Auth resolvers
│ └── order.go # Order resolvers
└── server.go # GraphQL setup
# 1. Add GraphQL to project
just graphql-init
# 2. Define schemas per module in internal/graphql/schema/
# 3. Generate code
just graphql-generate-all
# 4. Implement resolvers in internal/graphql/resolver/
# 5. Validate
just graphql-validate

Endpoints:
- GraphQL API: `POST /graphql`
- Playground: `GET /graphql/playground` (dev only)
See complete guide: docs/GRAPHQL_INTEGRATION.md
The modular design and packaging allow efficient system scaling:
The system supports automatic scaling based on CPU/Memory defined in the Helm Chart. An 80% threshold is recommended to trigger new replicas.
The application handles termination signals to close database connections and finish in-flight gRPC requests before dying.
We guarantee minimum availability during Kubernetes cluster maintenance, ensuring there's always at least one operational replica.
Development begins by defining the API. This ensures frontend and backend agree on data structure before writing code.
proto/users/v1/users.proto:
syntax = "proto3";
package users.v1;
import "google/api/annotations.proto";
service UserService {
// Creates a new user
rpc CreateUser(CreateUserRequest) returns (CreateUserResponse) {
option (google.api.http) = {
post: "/v1/users"
body: "*"
};
}
}
message CreateUserRequest {
string username = 1;
string email = 2;
}
message CreateUserResponse {
string id = 1;
string username = 2;
}

We design the database and necessary operations. SQLC will generate the data access code.
1. Migration (DDL):
We create migrations using golang-migrate.
modules/users/resources/db/migration/000001_initial_schema.up.sql:
CREATE TABLE users (
id VARCHAR(64) PRIMARY KEY,
username VARCHAR(255) NOT NULL,
email VARCHAR(255) NOT NULL UNIQUE,
created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP -- Must be updated from application or via Trigger
);

2. Queries (SQL):
modules/users/internal/db/query/users.sql:
-- name: CreateUser :exec
INSERT INTO users (id, username, email) VALUES ($1, $2, $3);
-- name: GetUserByEmail :one
SELECT * FROM users WHERE email = $1 LIMIT 1;
-- name: GetValidMagicCodeByEmail :one
SELECT * FROM magic_codes
WHERE user_email = $1 AND code = $2 AND expires_at > $3
ORDER BY created_at DESC LIMIT 1;

Note
For queries involving time comparisons (e.g. magic codes with expiration), it's recommended to pass the current time as a parameter ($3) from the application instead of using CURRENT_TIMESTAMP in SQL. This guarantees consistency between application time and database time, avoiding synchronization issues.
3. SQLC Configuration:
sqlc.yaml:
version: "2"
sql:
  - engine: "postgresql"
    queries: "modules/users/internal/db/query/"
    schema: "modules/users/resources/db/migration/"
    gen:
      go:
        package: "store"
        out: "modules/users/internal/db/store"
        sql_package: "database/sql"
        emit_interface: true
        emit_json_tags: true

4. Generation:
Run sqlc generate. This creates modules/users/internal/db/store/ with type-safe code.
5. Multi-Module Migration System (internal/migration):
The template includes an automatic discovery and execution system for migrations for all registered modules.
Each module implements the ModuleMigrations interface to declare its migration path:
// In modules/users/module.go
func (m *Module) MigrationPath() string {
return "modules/users/resources/db/migration"
}

Migrations run automatically when starting the server (handled in cmd/server/main.go):
// Migrations are executed automatically in cmd/server/main.go
runner := migration.NewRunner(cfg.DBDSN, reg)
if err := runner.RunAll(); err != nil {
slog.Error("Failed to run migrations", "error", err)
return
}

The system:
- Discovers all modules that implement `ModuleMigrations`
- Executes migrations in module registration order
- Uses `golang-migrate` internally for version tracking
- Each module maintains its own migration history
- Automatic path resolution: Migration paths are automatically resolved relative to the project root, making tests work seamlessly
# Run all migrations for all modules
just migrate-up # or simply: just migrate
# Revert last migration for a specific module
just migrate-down MODULE_NAME=users
# Create a new migration for a module
just migrate-create MODULE_NAME=users NAME=add_profile_fields
# Delete all tables and re-run migrations
just db-down # Only deletes tables
just db-reset # Deletes and re-runs (db-down + migrate-up)

# Run only migrations without starting server
go run cmd/server/main.go -migrate
# or
just migrate

- ✅ Automatic: You don't need to modify `main.go` or gateway setup when adding modules (migrations are auto-discovered)
- ✅ Ordered: Migrations execute in registration order
- ✅ Autonomous: Each module manages its own schema
- ✅ Portable: Works in both monolith and microservices
Similar to migrations, the template includes an automatic discovery and execution system for seed data.
Each module implements the ModuleSeeder interface (from internal/migration) to declare its seed data path:
// In modules/users/module.go
func (m *Module) SeedPath() string {
return "modules/users/resources/db/seed"
}

Seed data can be executed via:
# Run seed data for all modules
just seed
# Or using subcommand
go run cmd/server/main.go seed

The system:
- Discovers all modules that implement `ModuleSeeder`
- Executes seed SQL files in alphabetical order (e.g., `001_initial_data.sql`, `002_more_data.sql`)
- Each module manages its own seed data
- Seed data is typically used for development and testing
Note: Seed data is NOT executed automatically on server startup. It must be run explicitly via just seed or the seed subcommand.
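The alphabetical-order guarantee above boils down to filtering for `.sql` files and sorting their names. A minimal sketch (the helper name is illustrative, not the template's API):

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// orderSeedFiles sketches the seeder's deterministic ordering: keep only
// .sql files and sort them lexicographically, so 001_ runs before 002_.
func orderSeedFiles(files []string) []string {
	var ordered []string
	for _, f := range files {
		if strings.HasSuffix(f, ".sql") {
			ordered = append(ordered, f)
		}
	}
	sort.Strings(ordered)
	return ordered
}

func main() {
	// Directory listings are not ordered; sorting makes seeding reproducible.
	fmt.Println(orderSeedFiles([]string{"002_more_data.sql", "README.md", "001_initial_data.sql"}))
}
```

This is why the zero-padded numeric prefix convention matters: lexicographic order only matches execution intent when prefixes have a fixed width.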
We create an intermediate layer that abstracts sqlc from the rest of the application. The repository is subordinate to the service: it neither generates IDs nor contains business logic.
modules/users/internal/repository/repository.go:
package repository
import (
"context"
"database/sql"
"fmt"
"project/modules/users/internal/db/store"
)
// Repository defines business operations on data
type Repository interface {
CreateUser(ctx context.Context, id, username, email string) error
}
type SQLRepository struct {
q *store.Queries
db *sql.DB
}
func NewSQLRepository(db *sql.DB) *SQLRepository {
return &SQLRepository{
q: store.New(db),
db: db,
}
}
func (r *SQLRepository) CreateUser(ctx context.Context, id, username, email string) error {
// Type-safe execution. The repository does NOT generate the ID.
err := r.q.CreateUser(ctx, store.CreateUserParams{
ID: id,
Username: username,
Email: email,
})
if err != nil {
return fmt.Errorf("error persisting user: %w", err)
}
return nil
}

We implement the gRPC interface generated by protoc. This is where orchestration logic and domain ownership reside.
modules/users/internal/service/service.go:
package service
import (
"context"
"database/sql"
"errors"
"log/slog"
"go.jetify.com/typeid"
"google.golang.org/grpc/codes"
"google.golang.org/grpc/status"
usersv1 "project/gen/go/users/v1" // Code generated by Buf/Protoc
"project/modules/users/internal/repository"
)
type UserService struct {
usersv1.UnimplementedUserServiceServer
repo repository.Repository
}
func NewUserService(repo repository.Repository) *UserService {
return &UserService{repo: repo}
}
func (s *UserService) CreateUser(ctx context.Context, req *usersv1.CreateUserRequest) (*usersv1.CreateUserResponse, error) {
// 1. Domain Logic: Identity Generation (TypeID)
tid, _ := typeid.WithPrefix("user") // Centralized generation in Service
idStr := tid.String()
// 2. Persistence call
err := s.repo.CreateUser(ctx, idStr, req.Username, req.Email)
if err != nil {
// Specific error handling: mapping to appropriate gRPC codes
if errors.Is(err, sql.ErrNoRows) {
slog.DebugContext(ctx, "user not found", "email", req.Email)
return nil, status.Error(codes.NotFound, "user not found")
}
slog.ErrorContext(ctx, "failed to create user", "error", err)
return nil, status.Error(codes.Internal, "failed to create user")
}
// 3. Mapping to Proto response
return &usersv1.CreateUserResponse{
Id: idStr,
Username: req.Username,
}, nil
}

- Create a new migration script: `modules/[mod]/resources/db/migration/00X_add_field.up.sql`.
- Update queries in the `.sql` files if necessary to include the field in SELECTs or INSERTs.
- Run `sqlc generate`. The Go struct will update automatically.
- Fix compilation errors (the Go compiler will tell you where the field is missing).
We establish a testing discipline that guarantees quality without bureaucracy:
To facilitate unit testing, we use gomock (go.uber.org/mock) to generate automatic mocks of interfaces.
Philosophy:
- Type-safe: Mocks fail at compilation if the interface changes, ensuring tests are always synchronized.
- Automatic: Generation via `//go:generate`, aligned with the project philosophy (sqlc, buf).
- Validatable: Expectations are verified in tests to ensure code calls its dependencies correctly.
Commands:
# Install tool
just install-mocks
# Generate all mocks
just generate-mocks
# Run unit tests (generates mocks automatically)
just test-unit

Adding mocks to a new interface:
- Add annotation at the start of the file (before package doc):
//go:generate mockgen -source=myinterface.go -destination=mocks/myinterface_mock.go -package=mocks
// Package mypackage provides...
package mypackage
type MyInterface interface {
DoSomething(ctx context.Context, id string) error
}

- Generate: `just generate-mocks`
- Use in tests:
package service_test
import (
"testing"
"go.uber.org/mock/gomock"
"yourproject/path/to/mocks"
)
func TestWithMock(t *testing.T) {
ctrl := gomock.NewController(t)
defer ctrl.Finish()
mock := mocks.NewMockMyInterface(ctrl)
// Setup expectations
mock.EXPECT().
DoSomething(gomock.Any(), "user_123").
Return(nil).
Times(1)
// Test code that uses the mock
// ...
}

Mocks vs Real Repository:
- Unit Tests: Use mocks (fast, isolated, don't require DB).
- Integration Tests: Use real DB with Testcontainers (validate real SQL queries).
Real Example:
See modules/auth/internal/service/service_mock_test.go for complete examples of how to test services using repository mocks.
For a smooth development experience, we use Air to automatically recompile code on save:
- Monolith: `just dev`
- Any Module: `just dev-module {name}` (e.g. `just dev-module auth`)
Tip
Air watches for changes in .go, .yaml, .yml, .proto, .sql, .env files and specific configuration files, restarting the binary instantly. The module generator (just new-module) automatically creates the necessary .air.{module}.toml file.
The project provides wildcard commands to work with any module:
# Build
just build-module auth # Generates bin/auth
just build-module payments # Generates bin/payments
just build-all # Compiles server + all modules
# Docker
just docker-build-module auth # Generates modulith-auth:latest
just docker-build-module payments # Generates modulith-payments:latest
# Development with Hot Reload
just dev-module auth # Runs auth with hot reload
just dev-module payments # Runs payments with hot reload

Note
All binaries are compiled in the centralized bin/ directory, ignored by Git.
- Convention: `*_test.go` files next to the code they test.
- Unit Tests:
  - Approach: Test pure business logic and transformations.
  - Mocks: Mock the `repository.Repository` interface in Service tests. Using a real DB in unit tests is forbidden.
- Integration Tests:
  - Location: Can live within each module or in a separate `tests/integration` folder.
  - Infrastructure: Use `docker-compose` or Testcontainers to start a real database.
  - Flow: Test gRPC endpoint -> Repository -> real DB and verify side effects.
The template provides comprehensive testing utilities to simplify integration testing:
Create test registries easily with testutil.NewTestRegistryBuilder():
import "github.com/LoopContext/go-modulith-template/internal/testutil"
func TestMyModule(t *testing.T) {
// Set up test database (using testcontainers)
pgContainer, err := testutil.NewPostgresContainer(ctx, t)
require.NoError(t, err)
defer func() {
if err := pgContainer.Close(ctx); err != nil {
t.Logf("Failed to close container: %v", err)
}
}()
db, err := pgContainer.DB(ctx)
require.NoError(t, err)
defer db.Close()
// Create test registry with database
reg := testutil.NewTestRegistryBuilder().
WithDatabase(db).
Build()
// Register your module
reg.Register(myModule.NewModule())
// Initialize and run migrations
if err := reg.InitializeAll(); err != nil {
t.Fatalf("Failed to initialize: %v", err)
}
if err := testutil.RunMigrationsForTest(ctx, pgContainer.DSN, reg); err != nil {
t.Fatalf("Failed to run migrations: %v", err)
}
// Now you can test your module with a real database
}

Test gRPC services end-to-end with testutil.NewGRPCTestServer():
func TestGRPCService(t *testing.T) {
ctx := context.Background()
pgContainer, err := testutil.NewPostgresContainer(ctx, t)
require.NoError(t, err)
defer func() {
if err := pgContainer.Close(ctx); err != nil {
t.Logf("Failed to close container: %v", err)
}
}()
db, err := pgContainer.DB(ctx)
require.NoError(t, err)
defer db.Close()
cfg := testutil.TestConfig()
cfg.DBDSN = pgContainer.DSN
reg := testutil.NewTestRegistryBuilder().
WithDatabase(db).
WithConfig(cfg).
Build()
reg.Register(auth.NewModule())
// ... initialize and migrate ...
// Create gRPC test server
grpcServer, err := testutil.NewGRPCTestServer(cfg, reg)
if err != nil {
t.Fatalf("Failed to create gRPC test server: %v", err)
}
defer grpcServer.Stop()
// Get gRPC client
authClient := pb.NewAuthServiceClient(grpcServer.Conn())
// Test your service
resp, err := authClient.GetProfile(ctx, &pb.GetProfileRequest{UserId: "user_123"})
// ... assertions ...
}

Test event bus interactions with testutil.EventCollector:
func TestEventBus(t *testing.T) {
bus := events.NewBus()
collector := testutil.NewEventCollector()
// Subscribe to events
collector.Subscribe(bus, "user.created")
// Publish event
bus.Publish(ctx, events.Event{
Name: "user.created",
Payload: map[string]interface{}{
"user_id": "user_123",
},
})
// Wait for event (with timeout)
event, err := collector.WaitForEvent(5 * time.Second)
if err != nil {
t.Fatalf("Failed to receive event: %v", err)
}
// Verify event
assert.Equal(t, "user.created", event.Name)
assert.Equal(t, "user_123", event.Payload.(map[string]interface{})["user_id"])
// Or get all collected events
allEvents := collector.AllEvents()
assert.Len(t, allEvents, 1)
}

See the examples/ directory for comprehensive integration test examples:
- `examples/grpc_service_test.go` - Testing gRPC services end-to-end
- `examples/module_communication_test.go` - Testing inter-module communication
- `examples/event_bus_test.go` - Testing event bus interactions
- `examples/repository_transaction_test.go` - Testing repository transactions
- `examples/full_module_test.go` - Complete module integration test
These examples demonstrate best practices for:
- Setting up test databases with Testcontainers
- Creating test registries
- Testing gRPC services
- Testing event bus interactions
- Testing database transactions
- Testing complete module workflows
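The event-bus tests above assume an in-process pub/sub bus. A minimal synchronous sketch of that pattern is shown below; the template's `events.Bus` may differ (for example, by dispatching asynchronously), so treat the struct and methods as illustrative:

```go
package main

import (
	"fmt"
	"sync"
)

// Event mirrors the shape used in the test examples above.
type Event struct {
	Name    string
	Payload interface{}
}

// bus is a minimal synchronous pub/sub implementation.
type bus struct {
	mu       sync.RWMutex
	handlers map[string][]func(Event)
}

func newBus() *bus {
	return &bus{handlers: make(map[string][]func(Event))}
}

// Subscribe registers a handler for a named event.
func (b *bus) Subscribe(name string, fn func(Event)) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.handlers[name] = append(b.handlers[name], fn)
}

// Publish delivers the event to every handler subscribed to its name.
func (b *bus) Publish(e Event) {
	b.mu.RLock()
	defer b.mu.RUnlock()
	for _, fn := range b.handlers[e.Name] {
		fn(e)
	}
}

func main() {
	b := newBus()
	b.Subscribe("user.created", func(e Event) { fmt.Println("got", e.Name) })
	b.Publish(Event{Name: "user.created", Payload: map[string]interface{}{"user_id": "user_123"}})
}
```

A synchronous bus makes tests deterministic; an asynchronous one is why the `EventCollector` above exposes `WaitForEvent` with a timeout.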
To accelerate the start of new modules and ensure they follow defined standards, we have a robust scaffolding tool.
- Command: `just new-module {name}` (e.g. `just new-module payments`)
- Automation:
  - Generates the standard folder structure.
  - Creates boilerplate files (`module.go`, `service.go`, `repository.go`, proto).
  - Automatically configures `sqlc.yaml`, adding the entry for the new module.
  - Generates the `.air.{module}.toml` file for hot reload with Air.
  - Creates `cmd/{module}/main.go` for independent deployment as a microservice.
  - Creates `configs/{module}.yaml` with module-specific configuration.
- Plural Handling: Detects plural names (e.g. `products`) and adjusts the generated struct name (e.g. `Product`) in templates to avoid compilation errors.
- Generated Files:
  - `cmd/[name]/main.go`: Entrypoint for the independent microservice.
  - `configs/[name].yaml`: Module-specific configuration.
  - `.air.[name].toml`: Hot reload configuration.
  - `modules/[name]/module.go`: Complete `registry.Module` implementation with:
    - `Name()` - Module identifier
    - `Initialize(reg)` - Initialization with registry access
    - `RegisterGRPC(server)` - gRPC handler registration
    - `RegisterGateway(ctx, mux, conn)` - HTTP gateway registration
    - `MigrationPath()` - Module migration path (optional, for automatic migration discovery)
    - `SeedPath()` - Module seed data path (optional, for automatic seed data discovery)
    - `PublicEndpoints()` - Public endpoints (no auth, optional)
  - `modules/[name]/internal/service/service.go`: Service with:
    - Integration with `internal/errors` for error handling
    - Integration with `internal/telemetry` for tracing
    - Integration with `internal/events` for pub/sub
    - TypeID generation
    - Validation and authorization
  - `modules/[name]/internal/repository/repository.go`:
    - `Repository` interface for testability
    - `SQLRepository` implementation with SQLC
    - Transaction support with `WithTx()`
  - `modules/[name]/resources/db/migration/`: Initial SQL scripts (up/down)
  - `proto/[name]/v1/`: Protocol Buffer definition with HTTP annotations
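The plural handling mentioned above can be sketched as a naive singularization: trim a trailing "s" and title-case the result. The real generator may handle irregular plurals differently; this helper is an illustration, not the template's code:

```go
package main

import (
	"fmt"
	"strings"
)

// structName derives a Go struct name from a module name, trimming a
// trailing "s" for simple plurals (naive sketch: "products" -> "Product").
func structName(module string) string {
	if module == "" {
		return ""
	}
	name := strings.TrimSuffix(module, "s")
	if name == "" {
		name = module // e.g. a module literally named "s"
	}
	return strings.ToUpper(name[:1]) + name[1:]
}

func main() {
	fmt.Println(structName("products")) // Product
	fmt.Println(structName("payments")) // Payment
	fmt.Println(structName("auth"))     // Auth
}
```

Without this adjustment, a module named `products` would generate a `Products` struct that clashes with the singular naming used in the templates.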
After generating a module:
# Generate code
just proto # Generates gRPC code
just sqlc # Generates DB code
# Build
just build-module payments
# Docker
just docker-build-module payments
# Development
just dev-module payments

Once the module is generated with `just new-module orders`, implement the business logic:
// modules/orders/internal/service/service.go
func (s *Service) CreateOrder(ctx context.Context, req *pb.CreateOrderRequest) (*pb.CreateOrderResponse, error) {
// 1. Telemetry (already included in template)
ctx, span := telemetry.ServiceSpan(ctx, "orders", "CreateOrder")
defer span.End()
// 2. Authorization (using template helpers)
if err := authz.RequirePermission(ctx, "orders:create"); err != nil {
return nil, errors.ToGRPC(err)
}
// 3. Business validation
if req.Amount <= 0 {
return nil, errors.ToGRPC(errors.Validation("amount must be positive"))
}
// 4. Generate TypeID (sortable, prefixed)
tid, _ := typeid.WithPrefix("order")
id := tid.String()
// 5. Persistence
if err := s.repo.CreateOrder(ctx, id, req); err != nil {
telemetry.RecordError(ctx, err)
return nil, errors.ToGRPC(errors.Internal("failed to create order", err))
}
// 6. Publish event (typed event)
payload, _ := events.NewOrderCreatedPayload(id, req.UserId, req.Amount)
s.bus.Publish(ctx, events.Event{
Name: events.OrderCreatedEvent,
Payload: payload,
})
return &pb.CreateOrderResponse{Id: id}, nil
}

All of the above uses template abstractions; your code contains only business logic.
The scaffolding tool generates code that follows established patterns and best practices. Understanding these patterns helps when extending generated modules.
SQLC Return Type Handling:
SQLC generates code that returns structs (not pointers) and slices of structs (not pointers). The repository template automatically converts these to pointers to match the interface:
// Get method - converts struct to pointer
func (r *SQLRepository) GetModule(ctx context.Context, id string) (*store.Module, error) {
result, err := r.q.GetModule(ctx, id)
if err != nil {
return nil, fmt.Errorf("error getting module: %w", err)
}
return &result, nil // Convert struct to pointer
}
// List method - converts slice of structs to slice of pointers
func (r *SQLRepository) ListModules(ctx context.Context) ([]*store.Module, error) {
modules, err := r.q.ListModules(ctx)
if err != nil {
return nil, err
}
result := make([]*store.Module, len(modules))
for i := range modules {
result[i] = &modules[i] // Convert each element to pointer
}
return result, nil
}

SQLC Type Naming Conventions:
SQLC generates types with schema prefixes. Types follow the pattern {Schema}{TableName}:
- Tables in the `auth` schema generate:
  - `auth.users` → `store.AuthUser` (NOT `store.User`)
  - `auth.magic_codes` → `store.AuthMagicCode` (NOT `store.MagicCode`)
  - `auth.sessions` → `store.AuthSession`
Always use the full prefixed type names in repository interfaces, service code, and tests:
// ✅ Correct
func GetUser(ctx context.Context, id string) (*store.AuthUser, error)
func GetMagicCode(ctx context.Context, code string) (*store.AuthMagicCode, error)
// ❌ Wrong - will cause "undefined: store.User" compilation errors
func GetUser(ctx context.Context, id string) (*store.User, error)
func GetMagicCode(ctx context.Context, code string) (*store.MagicCode, error)

After running `just sqlc`, check `modules/<mod>/internal/db/store/models.go` to see the exact generated type names.
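The `{Schema}{TableName}` convention can be approximated as: title-case the schema and each underscore-separated word of the table, then singularize. The sketch below uses naive trailing-"s" singularization; SQLC's actual inflection rules are more sophisticated, so always verify against the generated `models.go`:

```go
package main

import (
	"fmt"
	"strings"
)

// typeName approximates SQLC's {Schema}{TableName} naming: schema and
// each table word are title-cased, and a trailing "s" is trimmed.
func typeName(schema, table string) string {
	var b strings.Builder
	for _, part := range append([]string{schema}, strings.Split(table, "_")...) {
		if part == "" {
			continue
		}
		b.WriteString(strings.ToUpper(part[:1]) + part[1:])
	}
	return strings.TrimSuffix(b.String(), "s")
}

func main() {
	fmt.Println(typeName("auth", "users"))       // AuthUser
	fmt.Println(typeName("auth", "magic_codes")) // AuthMagicCode
	fmt.Println(typeName("auth", "sessions"))    // AuthSession
}
```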
Transaction Handling:
The WithTx method includes proper panic and error handling:
// WithTx executes fn within a transaction. The error is a named return so the
// deferred commit/rollback logic can propagate a commit failure to the caller.
func (r *SQLRepository) WithTx(ctx context.Context, fn func(Repository) error) (err error) {
tx, err := r.db.BeginTx(ctx, nil)
if err != nil {
return fmt.Errorf("failed to begin transaction: %w", err)
}
defer func() {
if p := recover(); p != nil {
_ = tx.Rollback()
panic(p) // Re-panic after rollback
} else if err != nil {
_ = tx.Rollback()
} else {
err = tx.Commit()
}
}()
txRepo := &SQLRepository{
q: r.q.WithTx(tx),
db: r.db,
}
err = fn(txRepo)
return err
}

Telemetry Integration:
All service methods include telemetry spans and proper error recording:
func (s *Service) CreateModule(ctx context.Context, req *pb.CreateModuleRequest) (*pb.CreateModuleResponse, error) {
ctx, span := telemetry.ServiceSpan(ctx, "module", "CreateModule")
defer span.End()
// ... business logic ...
if err != nil {
telemetry.RecordError(ctx, err) // Use ctx, not span
return nil, errors.ToGRPC(errors.Internal("failed to create", err))
}
// ... build and return the response ...
}
}Important: telemetry.RecordError expects ctx context.Context, not a span. The context contains the span information automatically.
Error Handling:
The template uses the domain error system, not raw gRPC status errors:
// ✅ Correct: Use domain errors
if err != nil {
return nil, errors.ToGRPC(errors.Internal("failed to create", err))
}
// ✅ Correct: Handle specific error types
if err == sql.ErrNoRows {
return nil, errors.ToGRPC(errors.NotFound("resource not found"))
}
// ❌ Incorrect: Don't use raw status.Error
return nil, status.Error(codes.Internal, "internal error")

Telemetry Attributes:
When setting attributes, convert non-string types to strings:
// ✅ Correct: Convert int to string
telemetry.SetAttribute(ctx, "items_count", fmt.Sprintf("%d", len(req.Items)))
// ❌ Incorrect: Don't pass int directly
telemetry.SetAttribute(ctx, "items_count", len(req.Items))

The scaffolding tool ensures all generated code follows these patterns:
- Repository Layer:
  - ✅ SQLC struct-to-pointer conversion
  - ✅ Proper transaction error handling
  - ✅ Error wrapping with context
- Service Layer:
  - ✅ Telemetry spans for all operations
  - ✅ Domain error system usage
  - ✅ Event publishing for side effects
  - ✅ TypeID generation for entities
- Module Layer:
  - ✅ Complete `registry.Module` implementation
  - ✅ Proper dependency injection
  - ✅ Migration path configuration
After generating a module, verify it compiles:
# Generate code
just proto
just sqlc
# Build the module
go build ./modules/<module-name>/...
# Or build the entire project
just build

Issue: SQLC return type errors
cannot use r.q.GetModule(ctx, id) (value of struct type store.Module) as *store.Module
Solution: The template handles this automatically. If you see this error, ensure you're using the latest template version.
Issue: Telemetry errors
cannot use span as context.Context value in argument to telemetry.RecordError
Solution: Always use ctx for telemetry.RecordError, not span:
// ❌ Incorrect
telemetry.RecordError(span, err) // Wrong: span is not context.Context
// ✅ Correct
telemetry.RecordError(ctx, err) // Correct: use ctx, which carries the span

Issue: Unused imports
Solution: The template only includes necessary imports. If you add functionality that requires new imports, add them explicitly.
When extending generated modules:
- Add Repository Methods:
  - Follow the SQLC struct-to-pointer pattern
  - Use `WithTx` for transactional operations
  - Wrap errors with context
- Add Service Methods:
  - Include telemetry spans
  - Use the domain error system
  - Publish events for side effects
- Add SQL Queries:
  - Place in `modules/<name>/internal/db/query/<name>.sql`
  - Run `just sqlc` to generate code
  - Update the repository interface and implementation
- Add Proto Methods:
  - Edit `proto/<name>/v1/<name>.proto`
  - Run `just proto` to generate code
  - Implement in the service layer
A well-designed Modulith allows transitioning from a single binary (Monolith) to multiple binaries (Microservices) without changing module logic.
Each module must define its own configuration struct to avoid depending on global variables.
// modules/auth/module.go
type Config struct {
JWTSecret string `yaml:"jwt_secret"` // yaml tag required for mapping from YAML
}
func Initialize(db *sql.DB, grpcServer *grpc.Server, bus *events.Bus, cfg Config) error {
// Early validation: verify required configuration is present
if cfg.JWTSecret == "" {
return fmt.Errorf("JWT secret is empty, cannot initialize auth module")
}
// ... rest of initialization
}

The project uses a centralized loader in `internal/config` (based on yaml.v3) with the following hierarchy:
- Files per Application: A `configs/` folder with YAML files specific to each entrypoint (e.g. `configs/server.yaml`, `configs/auth.yaml`) is recommended.
- Unified Schema: Although the files differ, they all map to the central `AppConfig` struct to maintain consistency. A microservice simply ignores YAML sections that don't apply to it.
- Precedence Hierarchy: The loading order is YAML > .env > system ENV vars > defaults: YAML values have the highest priority, followed by the `.env` file, then system variables, and finally default values.
- Traceability: On startup, the application logs the source of each configuration variable, which facilitates debugging and makes it clear which value is in use.
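The precedence rule reduces to "first non-empty value wins" across the four layers. A sketch of that resolution (the function and its source labels are illustrative, not the loader's API):

```go
package main

import "fmt"

// resolve picks the first non-empty value following the documented
// precedence: YAML > .env file > system ENV > default. It also reports
// the winning source, mirroring the loader's startup traceability log.
func resolve(yamlVal, dotenvVal, envVal, def string) (value, source string) {
	switch {
	case yamlVal != "":
		return yamlVal, "yaml"
	case dotenvVal != "":
		return dotenvVal, ".env"
	case envVal != "":
		return envVal, "env"
	default:
		return def, "default"
	}
}

func main() {
	// YAML is absent, so the .env value wins over the system variable.
	v, src := resolve("", "postgres://from-env-file", "postgres://from-system", "postgres://localhost")
	fmt.Println(v, src)
}
```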
Separation is achieved by creating different entry points (cmd/) that point to their respective configuration files:
- Monolith Mode (`cmd/server/main.go`): Starts all modules, a single DB connection, and a single gRPC server.
- Microservice Mode (`cmd/auth/main.go`): Only imports and initializes the `auth` module.
When modules live in different binaries, gRPC calls that were previously in-process (direct) must now travel over the network. To make this transparent:
- A Service Discovery or internal Load Balancer is used.
- The gRPC client injected into a module must point to the external microservice address instead of `127.0.0.1`, while the client interface itself stays the same.
The project is ready to run in container environments (Docker) and orchestrators (Kubernetes) natively, with a modular approach that allows evolution from monolith to microservices without friction.
We use an optimized Dockerfile with two stages that supports dynamic building of any module:
- Builder: Compiles the binary in a Go image (Alpine). Uses `--build-arg TARGET={module}` to select what to build.
- Runner: A lightweight image (`alpine:3.20`) that contains only the binary and necessary configuration files.
All binaries are compiled in /app/bin/ and automatically consolidated.
# Build monolith server
just docker-build
# Generates: modulith-server:latest
# Build a specific module
just docker-build-module auth
# Generates: modulith-auth:latest
# Build any module
just docker-build-module payments
# Generates: modulith-payments:latest

The standard chart lives in `deployment/helm/modulith` and supports multiple deployment strategies.
Deploys everything as a single deployment with autoscaling:
helm install modulith-server ./deployment/helm/modulith \
--values ./deployment/helm/modulith/values-server.yaml \
--namespace production

Combines the monolith with independent modules for components that need to scale differently:
# Main server with core modules
helm install modulith-server ./deployment/helm/modulith \
--values values-server.yaml
# Auth module separated (higher demand)
helm install modulith-auth ./deployment/helm/modulith \
--values values-auth-module.yaml

Each module as an independent deployment:
# Each module with its own lifecycle
helm install modulith-auth ./deployment/helm/modulith \
--set deploymentType=module \
--set moduleName=auth
helm install modulith-orders ./deployment/helm/modulith \
--set deploymentType=module \
--set moduleName=orders

- ✅ Multi-Module Support: Single chart for the server and all modules
- ✅ Naming Convention: Automatically generates `modulith-{module}:tag`
- ✅ HPA and PDB: Configurable Horizontal Pod Autoscaling and Pod Disruption Budgets
- ✅ Health Checks: Liveness (`/healthz`) and Readiness (`/readyz`) probes
- ✅ Secrets: Sensitive configuration management (DB_DSN, JWT_SECRET)
- ✅ Resource Limits: CPU and memory configuration per deployment
See complete documentation at: deployment/helm/modulith/README.md
We manage base infrastructure using a modular approach with OpenTofu (Open Source Fork of Terraform) and Terragrunt to guarantee consistent and reproducible environments.
Note: IaC manages base infrastructure (VPC, EKS, RDS), while application deployments are handled with Helm Charts (see previous section).
- `deployment/opentofu/modules/`: Definition of base components (VPC, RDS, EKS).
- `deployment/terragrunt/envs/`: Environment-specific configurations (`dev`, `prod`).
- VPC (Network): Configures public (ELBs) and private (Nodes/DB) subnets with NAT Gateway.
- RDS (Database): PostgreSQL 16 instance isolated in private subnets.
- EKS (Compute): Managed Kubernetes cluster with scalable Node Groups.
Terragrunt allows us to keep code DRY (Don't Repeat Yourself) and is 100% compatible with OpenTofu. To deploy the development environment:
cd deployment/terragrunt/envs/dev
terragrunt run-all plan # Preview changes (uses tofu internally)
terragrunt run-all apply # Apply infrastructure

The project integrates an automation pipeline to guarantee stability:
Run automatically on each Push/PR:
- Checksum/Verify: Validates that dependencies haven't been altered.
The project imposes a "World Class" quality standard through a highly configured linter:
- Strict Linter: `golangci-lint` is configured to detect not only errors, but also:
  - Cyclomatic and Cognitive Complexity: Avoids unmanageable functions.
  - Nesting Level: Maximum 5 levels (`nestif` linter).
  - Documentation: Every public element MUST have Godoc comments.
  - Security: Static analysis with `gosec` on each commit.
- Configuration Validation: The configuration loader semantically validates critical variables before the application starts (Fail-Fast).
- Tests with Race Detection: Code with race conditions is not allowed (`-race`).
The project includes an advanced coverage reporting system:
# Visual report in terminal with statistics
just coverage-report
# Interactive HTML report
just test-coverage
just coverage-html

The coverage report shows:
- 📦 Coverage per package with visual indicators (🟢 >95%, 🟡 80-95%, 🟠 60-80%)
- 📈 General statistics (packages with excellent/good/medium coverage)
- 🎯 Top 10 files with best coverage
- ⚠️ Areas that need more tests
Note: Total project coverage automatically excludes generated code (*.pb.go, sqlc, etc.) to provide accurate metrics of hand-written code.
We've adopted a strict set of rules to guarantee consistency:
- wsl_v5 (Whitespace Linter): Forces whitespace to separate logical blocks (e.g. before a `return` or `if`).
- wrapcheck: Forces wrapping external errors with `fmt.Errorf("...: %w", err)` to maintain the traceability chain.
- revive: Modern replacement for `golint` for style and naming conventions.
- errcheck: Verifies that all returned errors are handled appropriately.
- goconst: Detects repeated strings that should be constants.
- cyclop: Limits cyclomatic complexity of functions (maximum 10).
- funlen: Limits function length (maximum 60 lines).
- package-comments: All packages must have documentation.
Golden Rule: NEVER modify .golangci.yaml to ignore or suppress errors. Always implement appropriate fixes.
Mandatory Process:
- Run: `just lint` after ANY modification to `.go` files.
- Iterate: Fix all errors until reaching 0 issues.
- Appropriate Fixes:
  - `errcheck`: Add error handling, or explicitly assign to `_` if the error should be intentionally ignored.
  - `goconst`: Extract repeated strings to constants with descriptive names.
  - `revive`: Rename unused parameters to `_`.
  - `wsl_v5`: Add appropriate whitespace between statements and control flow.
  - `cyclop`: Reduce complexity by extracting logic to helper functions.
  - `funlen`: Split long functions into smaller, focused functions.
- Validation: CI/CD will reject any PR with linting errors.
Refactoring Example (Complexity):
// ❌ BAD: Complex function with cyclomatic complexity > 10
func TestComplexFunction(t *testing.T) {
// 50+ lines of code with many nested if/else
}
// ✅ GOOD: Extract to helper functions
func TestComplexFunction(t *testing.T) {
t.Run("case 1", func(t *testing.T) { testCase1(t) })
t.Run("case 2", func(t *testing.T) { testCase2(t) })
}
func testCase1(t *testing.T) {
t.Helper()
// Focused logic
}

If you're using an LLM to generate or extend this project, make sure to follow this logical order to maintain integrity:
- Skeleton First: Create the folder structure and the `go.mod`, `buf.yaml`, `sqlc.yaml` files.
- Contract (Proto): Define `.proto` files and generate code with `buf generate`.
- Persistence (SQL): Create `.sql` migrations and generate the store with `sqlc generate`.
- Repository: Implement the `Repository` interface wrapping `sqlc` code.
- Service: Create business logic, generate TypeIDs, and perform gRPC error mapping.
- Wiring (Module): Export the module's `Initialize` function and register it in `cmd/server/setup/registry.go` via `RegisterModules()`.
- Injection: Ensure `db *sql.DB` and `bus *events.Bus` are passed correctly between layers.
To avoid coupling with external providers (Twilio, SendGrid, etc.), the system uses the Adapter Pattern combined with an Event-Driven approach.
- Interfaces: Defined in `internal/notifier/notifier.go` (`EmailProvider`, `SMSProvider`).
- Reactive Implementation: A `notifier.Subscriber` listens to global events (e.g. `auth.magic_code_requested`) and dispatches notifications asynchronously, without blocking.
- LogNotifier for Dev: Prints notifications in structured logs, allowing flows like "Magic Code" to be tested without configuring external APIs.
- Injection and Registration:
  - The module (e.g. `auth`) emits the event to the `Bus`.
  - The `Subscriber` registers with the `Bus` in `cmd/server/setup/registry.go` (via `CreateRegistry`), ensuring delivery logic stays completely outside the module's domain.
The system provides a cache abstraction for session storage, rate limiting, and general caching.
type Cache interface {
Get(ctx context.Context, key string) ([]byte, error)
Set(ctx context.Context, key string, value []byte, ttl time.Duration) error
Delete(ctx context.Context, key string) error
Exists(ctx context.Context, key string) (bool, error)
Close() error
}

- MemoryCache: In-memory cache with automatic cleanup of expired entries. Ideal for development and single-instance deployments.
- ValkeyCache: Stub prepared for Valkey. Add the `github.com/valkey/go-redis/v9` dependency to use it.
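A minimal in-memory TTL cache illustrating the semantics behind `MemoryCache` can be sketched as follows. For brevity the signatures here drop `ctx` and the `Set` error; the template's interface (shown above) includes both, and its implementation also runs background cleanup:

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

var errNotFound = errors.New("cache: not found")

type entry struct {
	value   []byte
	expires time.Time
}

// memoryCache is a minimal, concurrency-safe TTL cache sketch.
type memoryCache struct {
	mu      sync.RWMutex
	entries map[string]entry
}

func newMemoryCache() *memoryCache {
	return &memoryCache{entries: make(map[string]entry)}
}

// Set stores value under key with the given time-to-live.
func (c *memoryCache) Set(key string, value []byte, ttl time.Duration) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.entries[key] = entry{value: value, expires: time.Now().Add(ttl)}
}

// Get returns the value, treating expired entries as misses.
func (c *memoryCache) Get(key string) ([]byte, error) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	e, ok := c.entries[key]
	if !ok || time.Now().After(e.expires) {
		return nil, errNotFound
	}
	return e.value, nil
}

func main() {
	c := newMemoryCache()
	c.Set("session:123", []byte("session-data"), time.Minute)
	v, err := c.Get("session:123")
	fmt.Println(string(v), err)
}
```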
import "github.com/LoopContext/go-modulith-template/internal/cache"
// Create in-memory cache
mc := cache.NewMemoryCache()
// Save value with TTL
err := mc.Set(ctx, "session:123", sessionData, 30*time.Minute)
// Retrieve value
data, err := mc.Get(ctx, "session:123")
if errors.Is(err, cache.ErrNotFound) {
// Cache miss
}
// Helper for strings
sc := cache.NewStringCache(mc)
token, err := sc.Get(ctx, "token:456")

To protect the system against cascading failures, the template includes resilience patterns.
Implements the Circuit Breaker pattern for external services:
import "github.com/LoopContext/go-modulith-template/internal/resilience"
// Create circuit breaker
config := resilience.DefaultCircuitBreakerConfig()
config.MaxFailures = 5
config.Timeout = 30 * time.Second
cb := resilience.NewCircuitBreaker("payment-service", config)
// Use for external calls
err := cb.Execute(ctx, func(ctx context.Context) error {
return paymentClient.Charge(ctx, amount)
})
if errors.Is(err, resilience.ErrCircuitOpen) {
// Service is failing, use fallback
}

- Closed: Normal operation, calls pass through.
- Open: Circuit open, rejects calls immediately.
- Half-Open: Testing recovery, allows some calls.
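The three states above form a small state machine. A minimal sketch (no timers; recovery is driven by an explicit call, and all names are illustrative rather than the `internal/resilience` API):

```go
package main

import (
	"errors"
	"fmt"
)

type state int

const (
	closed state = iota
	open
	halfOpen
)

var errBoom = errors.New("boom")

// breaker is a minimal sketch of the circuit breaker state machine.
type breaker struct {
	st          state
	failures    int
	maxFailures int
}

// record updates the state after a call: success closes the circuit,
// enough consecutive failures open it.
func (b *breaker) record(err error) {
	if err == nil {
		b.failures = 0
		b.st = closed
		return
	}
	b.failures++
	if b.failures >= b.maxFailures {
		b.st = open
	}
}

// tryRecover moves an open breaker to half-open, allowing a probe call.
func (b *breaker) tryRecover() {
	if b.st == open {
		b.st = halfOpen
	}
}

func main() {
	b := &breaker{maxFailures: 2}
	b.record(errBoom)
	b.record(errBoom)
	fmt.Println(b.st == open) // circuit opens after maxFailures
	b.tryRecover()
	b.record(nil)
	fmt.Println(b.st == closed) // a successful probe closes it again
}
```

The real implementation replaces the explicit `tryRecover` call with the configured `Timeout`, after which the next `Execute` acts as the half-open probe.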
config := resilience.DefaultRetryConfig()
config.MaxAttempts = 3
config.InitialDelay = 100 * time.Millisecond
err := resilience.Retry(ctx, config, func(ctx context.Context) error {
return externalService.Call(ctx)
})

Feature flag system for gradual rollouts and A/B testing.
import "github.com/LoopContext/go-modulith-template/internal/feature"
// Create manager
fm := feature.NewInMemoryManager()
// Register flags
fm.RegisterFlag("new_checkout", "New checkout flow", false)
fm.RegisterFlag("dark_mode", "Enable dark mode", true)
// Check flag
if fm.IsEnabled(ctx, "new_checkout") {
// Use new flow
}

// Flag enabled for 20% of users
fm.SetFlag(ctx, feature.Flag{
Name: "experimental_feature",
Enabled: true,
Percentage: 20, // Only 20% of users
})
// Check for specific user
featureCtx := feature.Context{
UserID: userID,
Email: email,
}
if fm.IsEnabledFor(ctx, "experimental_feature", featureCtx) {
// User is in the 20%
}

fm.SetFlag(ctx, feature.Flag{
Name: "beta_feature",
Enabled: true,
Rules: []feature.Rule{
{
Attribute: "email",
Operator: "contains",
Value: "@beta.com",
},
},
})

Domain errors now include stable codes for API clients.
gRPC errors include the code in the message: [ERROR_CODE] message
[USER_NOT_FOUND] user with email test@example.com not found
[AUTH_TOKEN_EXPIRED] session has expired, please login again
[VALIDATION_FAILED] email format is invalid
| Code | Type | Description |
|---|---|---|
| NOT_FOUND | NotFound | Resource not found |
| ALREADY_EXISTS | AlreadyExists | Resource already exists |
| VALIDATION_FAILED | Validation | Validation error |
| AUTH_REQUIRED | Unauthorized | Authentication required |
| AUTH_TOKEN_EXPIRED | Unauthorized | Token expired |
| FORBIDDEN | Forbidden | Access denied |
| RATE_LIMITED | Forbidden | Rate limit exceeded |
import "github.com/LoopContext/go-modulith-template/internal/errors"
// Create error with specific code
err := errors.WithCode(errors.CodeUserNotFound, "user not found")
// Or use existing helpers (code assigned automatically)
err := errors.NotFound("user not found") // Code: NOT_FOUND
// Get code from an error
code := errors.GetErrorCode(err) // "NOT_FOUND"

The logging middleware records all HTTP requests with detailed information.
- HTTP method and path
- Status code and duration
- Bytes written
- Request ID (if available)
- User-Agent and Remote Address
config := middleware.LoggingConfig{
SkipPaths: []string{"/healthz", "/readyz", "/metrics"},
SlowRequestThreshold: 500 * time.Millisecond,
}
handler := middleware.Logging(config)(yourHandler)

- INFO: Successful requests (2xx, 3xx)
- WARN: Client errors (4xx) or slow requests
- ERROR: Server errors (5xx)
The template includes health check endpoints designed for integration with orchestrators (Kubernetes, Docker Swarm, etc.).
- /livez: Liveness probe - always returns 200 if the process is alive
- /readyz: Readiness probe - checks all critical dependencies
- /healthz: Legacy endpoint (backward compatibility, same as /livez)
- /healthz/ws: WebSocket connection status (active connections and connected users)
The /readyz endpoint returns JSON with the status of each dependency:
{
"status": "ready",
"checks": {
"modules": "healthy",
"database": "healthy",
"event_bus": "healthy",
"websocket": "healthy"
}
}

Response codes:
- 200 OK: All dependencies are healthy
- 503 Service Unavailable: One or more dependencies are unavailable
Checks performed:
- Modules: Executes HealthCheckAll() on all registered modules
- Database: Verifies connectivity with db.PingContext()
- Event Bus: Verifies that the event bus is initialized
- WebSocket Hub: Verifies that the WebSocket hub is initialized
The Helm chart automatically configures probes:
livenessProbe:
httpGet:
path: /livez
port: http
initialDelaySeconds: 10
periodSeconds: 10
readinessProbe:
httpGet:
path: /readyz
port: http
initialDelaySeconds: 5
periodSeconds: 5
startupProbe:
httpGet:
path: /livez
port: http
failureThreshold: 30
periodSeconds: 2

Administrative task system for maintenance and cleanup operations.
- cleanup-sessions: Cleans expired user sessions
- cleanup-magic-codes: Cleans expired magic codes
# Run an administrative task
just admin TASK=cleanup-sessions
# Or directly with the binary
./bin/server admin cleanup-sessions
./bin/server admin cleanup-magic-codes
# List available tasks
./bin/server admin

Administrative tasks implement the admin.Task interface:
package tasks
import (
"context"
"database/sql"
"fmt"
"log/slog"
"github.com/LoopContext/go-modulith-template/internal/admin"
)
type MyTask struct {
db *sql.DB
}
func (t *MyTask) Name() string {
return "my-task"
}
func (t *MyTask) Description() string {
return "Description of what this task does"
}
func (t *MyTask) Execute(ctx context.Context) error {
// Task implementation
slog.Info("Running my task")
return nil
}
// Register in internal/admin/tasks/register.go
func RegisterAllTasks(runner *admin.Runner, db *sql.DB) {
runner.Register(NewCleanupSessionsTask(db))
runner.Register(NewCleanupMagicCodesTask(db))
runner.Register(NewMyTask(db)) // New task
}

Administrative tasks run as independent commands and are useful for:
- Periodic cleanup of expired data (cron jobs)
- Database maintenance
- Data migration operations
- Audit tasks
Example with Kubernetes CronJob:
apiVersion: batch/v1
kind: CronJob
metadata:
name: cleanup-sessions
spec:
schedule: "0 2 * * *" # Daily at 2 AM
jobTemplate:
spec:
template:
spec:
containers:
- name: cleanup
image: modulith-server:latest
command: ["./service", "admin", "cleanup-sessions"]
restartPolicy: OnFailure

Middleware to limit the maximum duration of HTTP requests.
# configs/server.yaml
request_timeout: 30s # Maximum duration of a request

Or via environment variable:
REQUEST_TIMEOUT=30s

- If a request exceeds the timeout, the middleware returns 504 Gateway Timeout
- The timeout propagates to the request context, allowing handlers to cancel long operations
- The timeout is applied after other middlewares (CORS, rate limiting, etc.)
Handlers can check the context to cancel operations:
func (s *Service) LongRunningOperation(ctx context.Context) error {
// Check if context was cancelled
select {
case <-ctx.Done():
return ctx.Err() // context.DeadlineExceeded
default:
// Continue with operation
}
// Long operation...
return nil
}

Abstraction system for secrets management that allows using different providers without changing business code.
package secrets
type Provider interface {
GetSecret(ctx context.Context, key string) (string, error)
GetSecretJSON(ctx context.Context, key string, v interface{}) error
}

Reads secrets from environment variables:
provider := secrets.NewEnvProvider()
secret, err := provider.GetSecret(ctx, "DB_PASSWORD")

- VaultProvider: HashiCorp Vault
- AWSSecretsProvider: AWS Secrets Manager
- K8sSecretsProvider: Kubernetes Secrets
import "github.com/LoopContext/go-modulith-template/internal/secrets"
// Initialize provider (from configuration)
var secretProvider secrets.Provider
if cfg.Env == "prod" {
secretProvider = secrets.NewVaultProvider(cfg.VaultAddr)
} else {
secretProvider = secrets.NewEnvProvider()
}
// Get secret
dbPassword, err := secretProvider.GetSecret(ctx, "DB_PASSWORD")
if err != nil {
return fmt.Errorf("failed to get DB password: %w", err)
}
// Get JSON secret
var dbConfig struct {
Host string `json:"host"`
Port int `json:"port"`
Database string `json:"database"`
}
if err := secretProvider.GetSecretJSON(ctx, "DB_CONFIG", &dbConfig); err != nil {
return fmt.Errorf("failed to get DB config: %w", err)
}

// Get secret with default value
value, err := secrets.GetSecretOrDefault(ctx, provider, "API_KEY", "default-key")

This architecture favors compile-time safety and operational discipline. Go 1.24+ is chosen for native slog support, improved toolchain features, and performance optimizations that enable cleaner and more efficient code.
The template is designed following the stateless processes principle of the 12-factor app methodology. This ensures the application can scale horizontally without issues.
All application processes are stateless:
- No local state in the file system:
  - ✅ No temporary files are written
  - ✅ No sessions stored on disk
  - ✅ No data saved in /tmp or local directories
  - ✅ Only reading configuration files and static resources (Swagger JSON)
- Persistent state in external services:
  - ✅ Sessions: Stored in PostgreSQL (sessions table)
  - ✅ Application data: PostgreSQL
  - ✅ Cache (optional): Valkey (if configured)
  - ✅ Logs: Sent to stdout/stderr (captured by orchestrator)
- Ephemeral state in memory:
  - ⚠️ WebSocket Hub: Maintains active connections in memory
  - ⚠️ Event Bus: Subscription state in memory
  - ℹ️ Note: These are acceptable for stateless processes, but have implications for horizontal scaling (see below)
Verification performed:
# No temporary file writing
grep -r "os.Create\|ioutil.WriteFile\|/tmp" cmd/ internal/ modules/ --exclude-dir=vendor
# No state in file system
grep -r "file.*state\|local.*state" cmd/ internal/ modules/

Result: ✅ No temporary file writes or local state storage found.
- HTTP/gRPC requests: Completely stateless, any instance can handle any request
- Sessions: Stored in shared DB, any instance can validate sessions
- JWT tokens: Stateless, don't require server storage
The WebSocket Hub maintains active connections in memory. This means:
- Sticky Sessions (Recommended):
  - Configure load balancer with sticky sessions (session affinity)
  - Ensures a client always connects to the same instance
  - Implementation: Configure sessionAffinity in the Kubernetes Service
- Alternative: Shared State (Advanced):
  - For scaling without sticky sessions, consider Valkey Pub/Sub for WebSocket
  - Implement a distributed hub using internal/events/distributed.go as reference
  - Requires additional implementation (not included in base template)
Production recommendation:
- For most cases, sticky sessions are sufficient
- For high availability without sticky sessions, implement distributed hub
Startup:
- Loads configuration from environment/YAML
- Connects to external services (DB, optional Valkey)
- Runs migrations (if necessary)
- Initializes modules
- Starts HTTP/gRPC servers
- Ready to receive requests
Shutdown (Graceful):
- Stops accepting new connections
- Closes active WebSocket connections
- Waits for in-flight requests to finish (configurable timeout)
- Closes connections to external services
- Flushes telemetry (tracing/metrics)
- Terminates process
Shutdown time: Configurable via SHUTDOWN_TIMEOUT (default: 30s)
The template supports two types of processes:
- Web Process (cmd/server/main.go):
  - Handles HTTP/gRPC requests
  - Manages WebSocket connections
  - Scales horizontally (with WebSocket considerations)
- Worker Process (cmd/worker/main.go):
  - Processes asynchronous events
  - Executes scheduled tasks
  - Consumes from the event bus
  - Scales independently from the web process
Procfile (Heroku/Railway compatible):
web: go run cmd/server/main.go
worker: go run cmd/worker/main.go
Before adding new functionality, verify:
- Is any temporary file written? → NO
- Is state stored in memory that must persist between restarts? → NO (use DB/Valkey)
- Does it depend on local process state? → NO (any instance must work)
- Are sessions in shared DB? → YES
- Do logs go to stdout? → YES
- 12-Factor App: Processes
- 12-Factor App: Disposability
- See also: docs/12_FACTOR_APP.md (complete compliance guide)
The template is designed to scale horizontally through the process model of the 12-factor app methodology.
The application runs as one or more stateless processes that share nothing or share only external services (DB, Valkey).
Process Types:
- Web Process (cmd/server/main.go):
  - Handles HTTP/gRPC requests
  - Manages WebSocket connections
  - Scales horizontally
- Worker Process (cmd/worker/main.go):
  - Processes asynchronous events
  - Executes scheduled tasks
  - Scales independently
HTTP/gRPC Requests:
- ✅ Completely stateless: Any instance can handle any request
- ✅ No sticky sessions required: Load balancer can distribute requests randomly
- ✅ Automatic scaling: HPA (Horizontal Pod Autoscaler) configured in Helm chart
Scaling example:
# deployment/helm/modulith/values-server.yaml
autoscaling:
enabled: true
minReplicas: 2
maxReplicas: 10
targetCPUUtilizationPercentage: 70

Worker Processes:
- ✅ Independent scaling: Can have more workers than web processes
- ✅ Event-driven: Scales according to event volume
- ✅ No dependencies: Each worker is independent
Limitation:
- The WebSocket Hub maintains connections in memory
- A client must always connect to the same instance
Recommended Solution: Sticky Sessions
# Kubernetes Service with session affinity
apiVersion: v1
kind: Service
metadata:
name: modulith-server
spec:
sessionAffinity: ClientIP
sessionAffinityConfig:
clientIP:
timeoutSeconds: 10800 # 3 hours

Alternative: Distributed Hub (Advanced)
- Implement distributed hub using Valkey Pub/Sub
- Requires additional implementation (not included in base template)
- See internal/events/distributed.go as reference
Connection Pooling:
# configs/server.yaml
db_max_open_conns: 25 # Per instance
db_max_idle_conns: 25
db_conn_max_lifetime: 5m

Connection calculation:
- If you have 5 instances: 5 × 25 = 125 maximum connections
- Adjust DB_MAX_OPEN_CONNS according to the expected number of instances
- PostgreSQL default: 100 connections (adjust max_connections if necessary)
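In Go, these pool settings map directly onto database/sql. The helper below is illustrative (the function name and parameters are assumptions), and the arithmetic mirrors the calculation above:

```go
package main

import (
	"database/sql"
	"fmt"
	"time"
)

// applyPoolConfig maps the YAML keys above onto database/sql knobs.
func applyPoolConfig(db *sql.DB, maxOpen, maxIdle int, lifetime time.Duration) {
	db.SetMaxOpenConns(maxOpen)     // db_max_open_conns
	db.SetMaxIdleConns(maxIdle)     // db_max_idle_conns
	db.SetConnMaxLifetime(lifetime) // db_conn_max_lifetime
}

// maxDBConnections estimates the total connections the database must allow
// for a given number of application instances.
func maxDBConnections(instances, perInstance int) int {
	return instances * perInstance
}

func main() {
	// 5 instances at 25 open connections each means PostgreSQL needs a
	// max_connections of at least 125, plus headroom for other clients.
	fmt.Println(maxDBConnections(5, 25)) // 125
}
```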
In-Process Bus:
- ✅ Thread-safe with sync.RWMutex
- ✅ Multiple goroutines can publish/subscribe simultaneously
- ⚠️ Only works within a single process
Distributed Bus (Future):
- Use internal/events/distributed.go as base
- Implement with Kafka, RabbitMQ, or Valkey Pub/Sub
- Enables events between instances
1. Manual Scaling:
# Kubernetes
kubectl scale deployment modulith-server --replicas=5
# Helm
helm upgrade modulith-server ./deployment/helm/modulith \
--set replicaCount=5

2. Automatic Scaling (HPA):
# Already configured in Helm chart
autoscaling:
enabled: true
minReplicas: 2
maxReplicas: 10
targetCPUUtilizationPercentage: 70
targetMemoryUtilizationPercentage: 80

3. Metrics-Based Scaling:
# Example: Scale based on requests per second
metrics:
- type: Pods
pods:
metric:
name: http_requests_per_second
target:
type: AverageValue
averageValue: "100"

1. Resource Limits:
resources:
requests:
cpu: 500m
memory: 256Mi
limits:
cpu: 1000m
memory: 512Mi

2. Readiness Probes:
- Ensure pod is ready before receiving traffic
- Configured in Helm chart: /readyz
3. Graceful Shutdown:
- Configure an appropriate SHUTDOWN_TIMEOUT
- Default: 30 seconds
4. Health Checks:
- Liveness: /livez (process alive)
- Readiness: /readyz (ready for requests)
- Startup: /readyz (first time)
┌──────────────┐
│ Load Balancer│
└──────┬───────┘
│
┌──────────────────┼──────────────────┐
│ │ │
┌────▼────┐ ┌────▼────┐ ┌────▼────┐
│ Server │ │ Server │ │ Server │
│ Pod 1 │ │ Pod 2 │ │ Pod 3 │
└────┬────┘ └────┬────┘ └────┬────┘
│ │ │
└──────────────────┼──────────────────┘
│
┌──────▼──────┐
│ PostgreSQL │
│ (Shared) │
└─────────────┘
┌─────────┐ ┌─────────┐ ┌─────────┐
│ Worker │ │ Worker │ │ Worker │
│ Pod 1 │ │ Pod 2 │ │ Pod 3 │
└─────────┘ └─────────┘ └─────────┘
Before scaling to production:
- Configure HPA with appropriate limits
- Adjust connection pool according to number of instances
- Configure sticky sessions for WebSocket (if applicable)
- Validate that health checks work correctly
- Configure resource limits and requests
- Test manual scaling before enabling automatic
- Monitor metrics during scaling
- Document known limitations (WebSocket, etc.)