diff --git a/components/lif/api_key_auth/README.md b/components/lif/api_key_auth/README.md new file mode 100644 index 00000000..d2262385 --- /dev/null +++ b/components/lif/api_key_auth/README.md @@ -0,0 +1,20 @@ +# `api_key_auth` — Component + +Simple API-key authentication middleware for FastAPI. Validates `X-API-Key` headers against a configurable map of `key:client-name` pairs, and exposes the matched client name on `request.state.principal`. + +Used by services that want bare API-key auth (no Cognito, no HS256 JWT) — the GraphQL API in particular. For the richer MDR-side middleware that handles three principal types, see [`mdr_auth`](../mdr_auth/). + +## Public surface + +```python +from lif.api_key_auth import ApiKeyAuthMiddleware, ApiKeyConfig + +config = ApiKeyConfig.from_environment(prefix="GRAPHQL_AUTH") +if config.is_enabled: + app.add_middleware(ApiKeyAuthMiddleware, config=config) +``` + +The middleware no-ops when `__API_KEYS` is unset, making local dev simple. + +## Used by +- `bases/lif/api_graphql` — the GraphQL service's only auth path diff --git a/components/lif/auth/README.md b/components/lif/auth/README.md new file mode 100644 index 00000000..092df47f --- /dev/null +++ b/components/lif/auth/README.md @@ -0,0 +1,20 @@ +# `auth` — Component + +Lightweight HS256 JWT auth used by the Advisor and Example Data Source bases. Not the same as [`mdr_auth`](../mdr_auth/) — this one is older and simpler, with no Cognito or middleware-managed tenant routing. Demo-grade. + +## Public surface + +```python +from lif.auth.core import ( + create_access_token, create_refresh_token, decode_jwt, + verify_token, get_current_user, +) +``` + +`create_access_token` / `create_refresh_token` mint short-lived access tokens and longer-lived refresh tokens. `verify_token` is a sync validator suitable for use as a FastAPI dependency. `get_current_user` is the FastAPI `Depends(...)` callable that extracts the authenticated username from a bearer token. 
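For readers unfamiliar with HS256 tokens, the sketch below mints and validates one using only the standard library. It is an illustration of the general technique, not the `lif.auth.core` implementation: the `mint_token` / `check_token` helpers, the `sub` / `exp` claim layout, and the secret handling are all assumptions made for the example.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    # JWTs use URL-safe base64 without padding.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    # Restore the padding that _b64url stripped.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: str) -> bytes:
    return hmac.new(secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256).digest()


def mint_token(username: str, secret: str, ttl_seconds: int) -> str:
    """Hypothetical helper: build a signed token that expires after ttl_seconds."""
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"sub": username, "exp": int(time.time()) + ttl_seconds}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    return f"{signing_input}.{_b64url(_sign(signing_input, secret))}"


def check_token(token: str, secret: str) -> str:
    """Hypothetical helper: return the username, or raise ValueError."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
    except ValueError:
        raise ValueError("malformed token") from None
    expected = _sign(f"{header_b64}.{payload_b64}", secret)
    # Constant-time comparison avoids leaking signature bytes through timing.
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    payload = json.loads(_b64url_decode(payload_b64))
    if payload["exp"] < time.time():
        raise ValueError("token expired")
    return payload["sub"]
```

In this sketch an access token and a refresh token would differ only in the `ttl_seconds` passed to `mint_token`; a production service would normally rely on a maintained JWT library rather than hand-rolled signing.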
+ +## Used by +- `bases/lif/advisor_restapi` — login + per-endpoint user resolution +- `bases/lif/example_data_source_rest_api` — `verify_token` behind `x-key` header auth + +New services should default to `mdr_auth` instead unless they specifically don't want Cognito support. diff --git a/components/lif/composer/README.md b/components/lif/composer/README.md new file mode 100644 index 00000000..48264893 --- /dev/null +++ b/components/lif/composer/README.md @@ -0,0 +1,19 @@ +# `composer` — Component + +Merges LIF fragments back into a `LIFRecord`. Used by the cache to assemble the final record returned to callers from many small fragment writes (one per data source, one per orchestration job). + +## Public surface + +```python +from lif.composer import compose_with_single_fragment, compose_with_fragment_list +``` + +| Function | Purpose | +|---|---| +| `compose_with_single_fragment` | Apply one `LIFFragment` to a base record | +| `compose_with_fragment_list` | Apply many fragments at once; order matters when they overlap | + +Composition is path-driven: a fragment with `fragment_path = "person.Name"` is merged at that location. Path syntax follows the PascalCase/camelCase rules documented in [`docs/specs/data-model-rules.md`](../../../docs/specs/data-model-rules.md). + +## Used by +- `components/lif/query_cache_service` — fragments arrive piecemeal; the composer stitches them into a complete record diff --git a/components/lif/data_source_adapters/README.md b/components/lif/data_source_adapters/README.md new file mode 100644 index 00000000..bbaca72f --- /dev/null +++ b/components/lif/data_source_adapters/README.md @@ -0,0 +1,25 @@ +# `data_source_adapters` — Component + +Adapter framework for pulling LIF data from heterogeneous source systems. The orchestrator picks an adapter by id, the adapter handles the source-specific API contract (auth scheme, pagination, response shape), and returns LIF fragments that the orchestrator merges into a record. 
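The dispatch-by-id flow described above can be sketched as a small registry. Everything here is a hypothetical stand-in: `SketchAdapter`, `SKETCH_REGISTRY`, `register_sketch_adapter`, the `fetch_fragments` method, and the fragment dictionary shape are assumptions for illustration and do not reflect the actual `LIFDataSourceAdapter` contract.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Type


class SketchAdapter(ABC):
    """Hypothetical stand-in for an adapter base class."""

    @abstractmethod
    def fetch_fragments(self, person_id: str) -> List[Dict[str, Any]]:
        """Return fragments for one person from the source system."""


# Maps adapter id -> adapter class; consulted at dispatch time.
SKETCH_REGISTRY: Dict[str, Type[SketchAdapter]] = {}


def register_sketch_adapter(adapter_id: str, adapter_class: Type[SketchAdapter]) -> None:
    SKETCH_REGISTRY[adapter_id] = adapter_class


class InMemoryAdapter(SketchAdapter):
    """Toy adapter that returns canned data instead of calling a remote API."""

    def fetch_fragments(self, person_id: str) -> List[Dict[str, Any]]:
        return [{"fragment_path": "person.Name", "fragment": [{"firstName": "Ada"}]}]


def dispatch(adapter_id: str, person_id: str) -> List[Dict[str, Any]]:
    # The caller only knows the id; the registry supplies the class.
    try:
        adapter_class = SKETCH_REGISTRY[adapter_id]
    except KeyError:
        raise LookupError(f"unknown adapter id: {adapter_id}") from None
    return adapter_class().fetch_fragments(person_id)


register_sketch_adapter("in-memory", InMemoryAdapter)
```

Keeping the lookup behind an id means the caller never imports a concrete adapter module, which is the property the registry is meant to provide.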
+ +## Public surface + +```python +from lif.data_source_adapters import LIFDataSourceAdapter, ADAPTER_REGISTRY, register_adapter +``` + +`LIFDataSourceAdapter` is the abstract base class. Subclasses live in sibling directories (one per adapter): + +| Adapter id | Directory | Notes | +|---|---|---| +| `lif-to-lif` | [`lif_to_lif_adapter/`](lif_to_lif_adapter/) | Reads from another LIF GraphQL API | +| `example-data-source-rest-api-to-lif` | [`example_data_source_rest_api_to_lif_adapter/`](example_data_source_rest_api_to_lif_adapter/) | Reference impl that pulls from the bundled example data source | + +External adapters can be registered via `register_adapter(adapter_id, adapter_class)`. + +## Adding a new adapter + +See [`docs/operations/guides/creating-a-data-source-adapter.md`](../../../docs/operations/guides/creating-a-data-source-adapter.md) for the class contract and design guidelines, or [`docs/operations/guides/add-data-source.md`](../../../docs/operations/guides/add-data-source.md) for the end-to-end tutorial. + +## Used by +Pulled in by the orchestrator via configuration — the registry is consulted at adapter-dispatch time rather than by direct import from other components. diff --git a/components/lif/data_source_adapters/example_data_source_rest_api_to_lif_adapter/README.md b/components/lif/data_source_adapters/example_data_source_rest_api_to_lif_adapter/README.md new file mode 100644 index 00000000..3d08935b --- /dev/null +++ b/components/lif/data_source_adapters/example_data_source_rest_api_to_lif_adapter/README.md @@ -0,0 +1,15 @@ +# `example_data_source_rest_api_to_lif_adapter` — Adapter + +Reference adapter that pulls from the bundled [`example_data_source_rest_api`](../../../../bases/lif/example_data_source_rest_api/) base. Treat as a worked example for how to write a custom adapter against a real source API — copy this directory, rename, swap in your auth + URL conventions. 
+ +## Files + +| File | What it does | +|---|---| +| `adapter.py` | `ExampleDataSourceRestAPIToLIFAdapter` — implements the `LIFDataSourceAdapter` contract | + +## Registered as +`example-data-source-rest-api-to-lif` (see [`../README.md`](../README.md) for the registry). + +## See also +[`docs/operations/guides/add-data-source.md`](../../../../docs/operations/guides/add-data-source.md) walks through cloning this adapter as the starting point for a custom data source. diff --git a/components/lif/data_source_adapters/lif_to_lif_adapter/README.md b/components/lif/data_source_adapters/lif_to_lif_adapter/README.md new file mode 100644 index 00000000..6dde37e0 --- /dev/null +++ b/components/lif/data_source_adapters/lif_to_lif_adapter/README.md @@ -0,0 +1,16 @@ +# `lif_to_lif_adapter` — Adapter + +Reads from another LIF GraphQL API. Used when one LIF deployment needs to pull learner data from a peer LIF deployment — typically in multi-org demos where org1's orchestrator fetches data hosted by org2 or org3. + +## Files + +| File | What it does | +|---|---| +| `adapter.py` | `LIFToLIFAdapter` — implements the `LIFDataSourceAdapter` contract | +| `graphql_query_all_fields_org2.graphql` | Pre-baked query template for org2 | +| `graphql_query_all_fields_org3.graphql` | Pre-baked query template for org3 | + +The per-org `.graphql` files exist because schema field selection is hard to template dynamically against an evolving LIF data model; keeping the queries as text files makes them easy to inspect and edit. Generic adapters that target arbitrary LIF deployments will need to generate selections at runtime — that work isn't done here. + +## Registered as +`lif-to-lif` (see [`../README.md`](../README.md) for the registry). 
diff --git a/components/lif/datatypes/README.md b/components/lif/datatypes/README.md new file mode 100644 index 00000000..c01677cd --- /dev/null +++ b/components/lif/datatypes/README.md @@ -0,0 +1,19 @@ +# `datatypes` — Component + +Core Pydantic models that flow through the LIF data plane. Every service that handles LIF records, queries, or jobs imports from here. Single source of truth for the wire shapes — bases don't define their own. + +## Layout + +| File | Contents | +|---|---| +| `core.py` | `LIFRecord`, `LIFPerson`, `LIFFragment`, `LIFQuery`, `LIFQueryFilter`, `LIFUpdate`, `LIFQueryPlan*`, `LIFPersonIdentifier(s)`, `LIFQueryStatusResponse`, `LIFQueryPlanPartTranslation`, `HealthCheckResponse`, `TargetTransformationDataModel(s)DTO` | +| `identity_mapping.py` | `IdentityMapping` | +| `mdr_sql_model.py` | SQLModel-style classes used by MDR persistence | +| `orchestration.py` | `OrchestratorJob`, `OrchestratorJobDefinition`, `OrchestratorJobRequest`, request/response wrappers | + +## Naming convention + +Models follow the PascalCase/camelCase split documented in [`docs/specs/data-model-rules.md`](../../../docs/specs/data-model-rules.md): entities (containers) are PascalCase, scalars are camelCase. Many models use `populate_by_name=True` with `alias="EntityName"` so they accept either case on input but normalize internally. + +## Used by +Practically everything: every REST base, the cache and planner services, the orchestrator, the translator, and the MDR services. Changes here ripple widely; treat additions as a stable-API extension. diff --git a/components/lif/example_data_source_service/README.md b/components/lif/example_data_source_service/README.md new file mode 100644 index 00000000..e0aad51c --- /dev/null +++ b/components/lif/example_data_source_service/README.md @@ -0,0 +1,21 @@ +# `example_data_source_service` — Component + +Sample data + business logic backing the [`example_data_source_rest_api`](../../../bases/lif/example_data_source_rest_api/) base. 
Provides a small fake person/course dataset that adapters can exercise locally without standing up a real SIS or LMS. + +## Public surface + +```python +from lif.example_data_source_service.core import ( + user_info, users_info, users_info_filtered, courses_info, +) +``` + +| Function | Returns | +|---|---| +| `user_info(user_id)` | One sample person | +| `users_info()` | All sample persons | +| `users_info_filtered(filter)` | Subset matching the filter | +| `courses_info()` | Sample courses dataset | + +## Used by +- `bases/lif/example_data_source_rest_api` — only consumer; this component exists to keep that base small and stub-able. diff --git a/components/lif/exceptions/README.md b/components/lif/exceptions/README.md new file mode 100644 index 00000000..da07e70c --- /dev/null +++ b/components/lif/exceptions/README.md @@ -0,0 +1,24 @@ +# `exceptions` — Component + +LIF-wide exception types. Services raise these instead of bare `Exception` so HTTP bases can centralize translation to status codes via `@app.exception_handler` registrations. + +## Public surface + +```python +from lif.exceptions.core import ( + LIFException, + ResourceNotFoundException, + DataNotFoundException, + # ... +) +``` + +`LIFException` is the catch-all base. Sub-types convey common semantics (not-found, data-not-found, validation failure, etc.) so handlers can map them to 404 / 422 / 500 without `isinstance` chains in business code. + +## Convention + +When you add a new exception type, also wire its handler in any base that should respond differently to it. The translator base is a good template — see its `@app.exception_handler` cascade in `bases/lif/translator_restapi/core.py`. 
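As a framework-free sketch of the idea, the snippet below maps an exception hierarchy to HTTP status codes by walking the exception's class hierarchy. The classes are local stand-ins that reuse the names listed above; `SketchValidationException`, the `STATUS_BY_EXCEPTION` table, and the specific status assignments are assumptions, not the real handler wiring.

```python
from typing import Dict, Type


class LIFException(Exception):
    """Local stand-in for the catch-all base."""


class ResourceNotFoundException(LIFException):
    pass


class DataNotFoundException(LIFException):
    pass


class SketchValidationException(LIFException):
    """Hypothetical subtype used only in this example."""


# Assumed mapping; a real base decides these per handler registration.
STATUS_BY_EXCEPTION: Dict[Type[BaseException], int] = {
    ResourceNotFoundException: 404,
    DataNotFoundException: 404,
    SketchValidationException: 422,
    LIFException: 500,
}


def status_for(exc: BaseException) -> int:
    # Walk the MRO so the most specific registered type wins.
    for cls in type(exc).__mro__:
        if cls in STATUS_BY_EXCEPTION:
            return STATUS_BY_EXCEPTION[cls]
    return 500
```

Because the lookup follows the class hierarchy, a new subtype inherits its parent's status until a more specific entry is added, so business code never needs to branch on exception type.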
+ +## Used by +- Every REST base — handlers convert these to HTTP responses +- Service components — raise these from business logic diff --git a/components/lif/graphql_client/README.md b/components/lif/graphql_client/README.md new file mode 100644 index 00000000..9a534bb4 --- /dev/null +++ b/components/lif/graphql_client/README.md @@ -0,0 +1,20 @@ +# `graphql_client` — Component + +Authenticated HTTP client for calling the LIF GraphQL API. Wraps the boilerplate (auth header, error mapping, JSON shaping) into two functions so callers don't need to learn `httpx` semantics. + +## Public surface + +```python +from lif.graphql_client import graphql_query, graphql_mutation, GraphQLClientException +``` + +Both functions send `X-API-Key` from `LIF_GRAPHQL_API_KEY` (when set) as the auth header — see CLAUDE.md § "GraphQL API Key Authentication" for the server-side configuration. + +| Function | Purpose | +|---|---| +| `graphql_query(...)` | Read-side query, returns parsed data | +| `graphql_mutation(...)` | Write-side mutation, returns parsed data | +| `GraphQLClientException` | Raised on transport or GraphQL-error response | + +## Used by +- `components/lif/semantic_search_service` — calls GraphQL to fulfill MCP queries diff --git a/components/lif/identity_mapper_service/README.md b/components/lif/identity_mapper_service/README.md new file mode 100644 index 00000000..07b911d3 --- /dev/null +++ b/components/lif/identity_mapper_service/README.md @@ -0,0 +1,14 @@ +# `identity_mapper_service` — Component + +Business logic for mapping a person's identifiers across source systems. Owns the rules for creating, listing, and deleting `IdentityMapping`s without knowing how they're stored. + +## Public surface + +```python +from lif.identity_mapper_service.core import IdentityMapperService +``` + +`IdentityMapperService` is constructed with an `IdentityMapperStorage` (the interface in [`identity_mapper_storage`](../identity_mapper_storage/)) and operates against it. 
This split lets the same business logic work over an in-memory store for tests and a SQL store ([`identity_mapper_storage_sql`](../identity_mapper_storage_sql/)) in production. + +## Used by +- `bases/lif/identity_mapper_restapi` — instantiates one service per app + dispatches HTTP handlers to it diff --git a/components/lif/identity_mapper_storage/README.md b/components/lif/identity_mapper_storage/README.md new file mode 100644 index 00000000..d85cb227 --- /dev/null +++ b/components/lif/identity_mapper_storage/README.md @@ -0,0 +1,18 @@ +# `identity_mapper_storage` — Component + +Storage interface for `IdentityMapping`s. Defines the abstract contract that the identity mapper service depends on, separate from any concrete database implementation. + +## Public surface + +```python +from lif.identity_mapper_storage.core import IdentityMapperStorage +``` + +`IdentityMapperStorage` is the abstract base class. Concrete implementations live in sibling bricks; the only one in tree is [`identity_mapper_storage_sql`](../identity_mapper_storage_sql/) (SQLAlchemy/MariaDB). + +The split exists so the identity mapper service can be tested against an in-memory or fake implementation without spinning up a database — and so a future swap to a different backend (Postgres, Redis, etc.) wouldn't require touching service logic. + +## Used by +- `bases/lif/identity_mapper_restapi` — declares the interface type for lifecycle injection +- `components/lif/identity_mapper_storage_sql` — implements the interface +- `components/lif/identity_mapper_service` (transitively, via the base) diff --git a/components/lif/identity_mapper_storage_sql/README.md b/components/lif/identity_mapper_storage_sql/README.md new file mode 100644 index 00000000..4f58d01c --- /dev/null +++ b/components/lif/identity_mapper_storage_sql/README.md @@ -0,0 +1,26 @@ +# `identity_mapper_storage_sql` — Component + +SQL-backed implementation of [`identity_mapper_storage`](../identity_mapper_storage/). 
Uses SQLAlchemy against a MariaDB instance (deployed as `projects/lif_identity_mapper_mariadb/`). + +## Layout + +| File | Contents | +|---|---| +| `core.py` | `IdentityMapperSqlStorage` — the concrete `IdentityMapperStorage` impl | +| `model.py` | SQLAlchemy ORM model for the mapping table | +| `crud.py` | Low-level CRUD helpers used by `core` | +| `db.py` | Engine/session factory (`initialize_database`, `get_db_session_factory`, `dispose_db_engine`) | + +## Public surface + +```python +from lif.identity_mapper_storage_sql.core import IdentityMapperSqlStorage +from lif.identity_mapper_storage_sql.db import ( + initialize_database, get_db_session_factory, dispose_db_engine, +) +``` + +`db.py`'s lifecycle helpers are called by the base's `lifespan` handler — engine initialization is per-app, not per-request. + +## Used by +- `bases/lif/identity_mapper_restapi` — instantiates the SQL storage and threads it into the service diff --git a/components/lif/langchain_agent/README.md b/components/lif/langchain_agent/README.md new file mode 100644 index 00000000..bcc2d6d8 --- /dev/null +++ b/components/lif/langchain_agent/README.md @@ -0,0 +1,26 @@ +# `langchain_agent` — Component + +Wraps LangChain/LangGraph into a `LIFAIAgent` purpose-built for the Advisor's chat experience: per-conversation memory, structured prompt templates, and an `ask_agent(task, query)` interface that maps high-level tasks ("load_profile", "continue_conversation", "save_interaction_summary") to the right prompt + tool chain. + +## Layout + +| File | Contents | +|---|---| +| `core.py` | `LIFAIAgent` — top-level interface (`setup`, `ask_agent`) | +| `helpers.py` | Prompt + chain construction helpers | +| `memory.py` | LangGraph memory wiring (`langmem`-backed) | +| [`prompts/`](prompts/) | Text-file prompt templates loaded at runtime | + +Keeping prompts as plain text in `prompts/` (rather than f-strings in code) lets non-engineers tune wording without touching Python. 
+ +## Public surface + +```python +from lif.langchain_agent import LIFAIAgent + +agent = await LIFAIAgent.setup(config) +response = await agent.ask_agent("continue_conversation", user_message) +``` + +## Used by +- `bases/lif/advisor_restapi` — single consumer; this component exists to keep that base small. diff --git a/components/lif/langchain_agent/prompts/README.md b/components/lif/langchain_agent/prompts/README.md new file mode 100644 index 00000000..da05b98e --- /dev/null +++ b/components/lif/langchain_agent/prompts/README.md @@ -0,0 +1,15 @@ +# `prompts/` — LangChain prompt templates + +Text-file prompt templates loaded by [`langchain_agent`](../) at runtime. Kept as plain text so wording can be tuned without code changes or commits to Python files. + +## Files + +| File | Used when | +|---|---| +| `load_profile.txt` | Advisor's initial profile load — fires on `/start-conversation` | +| `continue_conversation.txt` | Each subsequent turn — fires on `/continue-conversation` | +| `summarize_interaction.txt` | Mid-conversation summary the agent uses to keep context bounded | +| `save_interaction_summary.txt` | Final summary written on `/logout` for future-session memory | +| `prompt_template_query.txt` | Base scaffold the other prompts inherit from | + +Refer to `components/lif/langchain_agent/core.py` for the `task` → prompt mapping. diff --git a/components/lif/lif_schema_config/README.md b/components/lif/lif_schema_config/README.md new file mode 100644 index 00000000..eea2e40f --- /dev/null +++ b/components/lif/lif_schema_config/README.md @@ -0,0 +1,34 @@ +# `lif_schema_config` — Component + +Centralized configuration and utility helpers for everything LIF services do with the schema: where to load it from, how to name GraphQL types from it, how to map XSD types into Python. + +Replaces scattered `os.getenv("LIF_…")` calls across services with a single `LIFSchemaConfig` that knows how to read from the environment and validate. 
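A minimal sketch of the "one validated config object instead of scattered lookups" pattern follows. `SketchSchemaConfig` and the `SKETCH_*` variable names are hypothetical, chosen so they cannot be mistaken for the real environment variables, and deriving `graphql_query_name` from the root type name is an assumption about how the two values relate.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class SketchSchemaConfig:
    root_type_name: str
    mdr_api_url: str

    @property
    def graphql_query_name(self) -> str:
        # Assumed rule: lower-case the first letter ("Person" -> "person").
        return self.root_type_name[:1].lower() + self.root_type_name[1:]

    @classmethod
    def from_environment(cls, env: Optional[Mapping[str, str]] = None) -> "SketchSchemaConfig":
        # Accepting a mapping keeps the factory testable without touching os.environ.
        source = os.environ if env is None else env
        root = source.get("SKETCH_ROOT_TYPE_NAME", "Person").strip()
        url = source.get("SKETCH_MDR_API_URL", "").strip()
        if not root:
            raise ValueError("SKETCH_ROOT_TYPE_NAME must not be empty")
        if not url.startswith(("http://", "https://")):
            raise ValueError("SKETCH_MDR_API_URL must be an http(s) URL")
        return cls(root_type_name=root, mdr_api_url=url.rstrip("/"))
```

Validating once in the factory means a misconfigured service fails at startup rather than at the first request that happens to read the bad value.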
+ +## Layout + +| File | Contents | +|---|---| +| `core.py` | `LIFSchemaConfig` — the main config class + `from_environment()` factory | +| `type_mappings.py` | XSD → Python type conversions used by schema generation | +| `naming.py` | Case conversion + GraphQL naming conventions (PascalCase / camelCase rules) | +| `openapi.py` | OpenAPI document structure helpers | +| Also exports | `DEFAULT_ATTRIBUTE_KEYS` — common attribute keys used by semantic search | + +## Public surface + +```python +from lif.lif_schema_config import LIFSchemaConfig, DEFAULT_ATTRIBUTE_KEYS + +config = LIFSchemaConfig.from_environment() +config.root_type_name # "Person" +config.graphql_query_name # "person" +config.mdr_api_url # URL of MDR API +config.query_planner_query_url +``` + +## Used by +- `bases/lif/api_graphql` +- `bases/lif/semantic_search_mcp_server` +- `components/lif/query_cache_service` +- `components/lif/openapi_to_graphql` +- `components/lif/semantic_search_service` diff --git a/components/lif/logging/README.md b/components/lif/logging/README.md new file mode 100644 index 00000000..f9b9a24f --- /dev/null +++ b/components/lif/logging/README.md @@ -0,0 +1,26 @@ +# `logging` — Component + +The standard logger factory used across (almost) all LIF services. One module, one function, opinionated defaults: ISO timestamp with milliseconds, level name, logger name, message. + +## Public surface + +```python +from lif.logging import get_logger + +logger = get_logger(__name__) +``` + +`LOG_LEVEL` env var (default `INFO`) controls verbosity. Format: + +``` +2025-10-09 14:33:21.123 INFO | my.logger | message +``` + +## When to use this vs. `mdr_utils/logger_config` + +This component is the convention for non-MDR services. The MDR internals (`mdr_utils/logger_config.py`) use a slightly different format kept for historical compatibility — but new MDR endpoint code can use either since they share Python's `logging` module under the hood. New bases outside MDR should pull from here. 
+ +See [`docs/operations/guides/adding-a-new-microservice.md`](../../../docs/operations/guides/adding-a-new-microservice.md) for the full guidance. + +## Used by +Most bases (api_graphql, advisor_restapi, query_cache_restapi, example_data_source_rest_api, semantic_search_mcp_server, orchestrator_restapi, translator_restapi, identity_mapper_restapi, query_planner_restapi) and many components. diff --git a/components/lif/mdr_client/resources/README.md b/components/lif/mdr_client/resources/README.md new file mode 100644 index 00000000..594ca96d --- /dev/null +++ b/components/lif/mdr_client/resources/README.md @@ -0,0 +1,11 @@ +# `resources/` — Bundled OpenAPI schema + +Ships the most recent known-good LIF OpenAPI schema as a static file. Used by [`mdr_client`](../) when `USE_OPENAPI_DATA_MODEL_FROM_FILE=true` (dev only) or when MDR is unreachable in deliberately-offline test setups. + +## Files + +| File | Purpose | +|---|---| +| `openapi_constrained_with_interactions.json` | Snapshot of the LIF V1.1 OpenAPI schema with interaction-mode constraints applied | + +Kept in sync with MDR's V1.1 baseline; not the source of truth. Production services should always load schema from MDR, not from this file — see CLAUDE.md § "Schema Loading Pattern" for the no-silent-fallback policy. diff --git a/components/lif/mongodb_connection/README.md b/components/lif/mongodb_connection/README.md new file mode 100644 index 00000000..b478e0c7 --- /dev/null +++ b/components/lif/mongodb_connection/README.md @@ -0,0 +1,14 @@ +# `mongodb_connection` — Component + +MongoDB client factory. Reads connection settings from environment and returns a `Database` handle ready for use. Sync and async variants are provided since the cache service uses both. + +## Public surface + +```python +from lif.mongodb_connection import get_database, get_database_async +``` + +Both honor `MONGODB_URI`, `MONGO_DB`, and `MONGO_COLLECTION` env vars. 
+ +## Used by +- `components/lif/query_cache_service` — single consumer today; this component exists to keep the Mongo coupling in one place so the eventual cache-read refactor (see [`query_cache_read`](../query_cache_read/) and siblings) has a clean swap point. diff --git a/components/lif/openapi_schema_parser/README.md b/components/lif/openapi_schema_parser/README.md new file mode 100644 index 00000000..b0e949c4 --- /dev/null +++ b/components/lif/openapi_schema_parser/README.md @@ -0,0 +1,15 @@ +# `openapi_schema_parser` — Component + +Walks a LIF OpenAPI schema and emits its "leaves" — the scalar attribute paths semantic search can match against. A leaf is a field like `person.Name.firstName` (entity-PascalCase + scalar-camelCase, per [`docs/specs/data-model-rules.md`](../../../docs/specs/data-model-rules.md)). + +## Public surface + +```python +from lif.openapi_schema_parser import load_schema_leaves +``` + +`load_schema_leaves(schema)` returns the full list of dotted paths plus their type metadata. Callers wrap the result into something searchable (typically Sentence-Transformers embeddings). + +## Used by +- `components/lif/semantic_search_service` — uses leaves as the corpus for semantic matching +- `components/lif/schema_state_manager` — caches the parsed leaves alongside the raw schema diff --git a/components/lif/openapi_to_graphql/README.md b/components/lif/openapi_to_graphql/README.md new file mode 100644 index 00000000..b9d1faf6 --- /dev/null +++ b/components/lif/openapi_to_graphql/README.md @@ -0,0 +1,27 @@ +# `openapi_to_graphql` — Component + +Generates a Strawberry GraphQL schema dynamically from a LIF OpenAPI schema. The whole point of this component is that the GraphQL API has no hand-written `.graphql` schema files — types, input filters, enums, and root queries are all constructed at runtime from whatever the MDR currently serves. 
+ +## Layout + +| File | Contents | +|---|---| +| `core.py` | `generate_graphql_schema`, `generate_graphql_root_types` — top-level entrypoints | +| `type_factory.py` | Builds Strawberry types from OpenAPI schema definitions | +| `schema_tools.py` | Schema-traversal utilities (find references, resolve `$ref`, etc.) | + +## Public surface + +```python +from lif.openapi_to_graphql import generate_graphql_schema, generate_graphql_root_types +from lif.openapi_to_graphql import schema_tools +``` + +## Notes + +- **`$ref` resolution:** MDR's `generate_openapi_schema` inlines all `$ref`s, so the `$ref` branch in `type_factory.py` exists but isn't exercised by production schemas. Don't delete it — file-based schemas (`USE_OPENAPI_DATA_MODEL_FROM_FILE=true`) still rely on it. +- **Strawberry `info` typing:** dynamic resolvers must annotate the `info` parameter as `strawberry.types.Info` (not `object` / `Any`). Strawberry 0.297+ identifies the parameter by type, not by name. +- **Field name preservation:** uses `strawberry.field(name=field_name)` so the wire shape preserves PascalCase entity / camelCase scalar conventions ([`docs/specs/data-model-rules.md`](../../../docs/specs/data-model-rules.md)). + +## Used by +- `bases/lif/api_graphql` — single consumer; the GraphQL service's whole reason for existing is this component. diff --git a/components/lif/schema_state_manager/README.md b/components/lif/schema_state_manager/README.md new file mode 100644 index 00000000..755cf585 --- /dev/null +++ b/components/lif/schema_state_manager/README.md @@ -0,0 +1,27 @@ +# `schema_state_manager` — Component + +Thread-safe schema lifecycle manager for services that need the LIF schema at runtime *and* need to refresh it without restarting. Wraps [`mdr_client`](../mdr_client/) loading and [`openapi_schema_parser`](../openapi_schema_parser/) parsing into one object with sync + async init paths. 
+ +## Public surface + +```python +from lif.schema_state_manager import SchemaStateManager, SchemaState +from lif.lif_schema_config import LIFSchemaConfig + +config = LIFSchemaConfig.from_environment() +manager = SchemaStateManager(config) + +# Sync (e.g., MCP server startup where async lifespan isn't an option) +manager.initialize_sync() + +# Async (e.g., FastAPI lifespan) +await manager.initialize() + +state = manager.state # SchemaState — leaves, filter models, embeddings, source +await manager.refresh() # re-load from MDR +``` + +`SchemaState` tracks where the schema came from (`"mdr"` or `"file"`) plus all the derived structures consumers need (parsed leaves, filter models, embeddings for semantic search). + +## Used by +- `bases/lif/semantic_search_mcp_server` — initializes one manager at startup, exposes `POST /schema/refresh` diff --git a/components/lif/semantic_search_service/README.md b/components/lif/semantic_search_service/README.md new file mode 100644 index 00000000..3e14f320 --- /dev/null +++ b/components/lif/semantic_search_service/README.md @@ -0,0 +1,20 @@ +# `semantic_search_service` — Component + +Implements the semantic search and mutation operations that the MCP server exposes as tools. Given a natural-language fragment, finds the closest matching LIF data fields (via Sentence-Transformers embeddings over schema leaves) and constructs the corresponding GraphQL query or mutation. + +## Public surface + +```python +from lif.semantic_search_service.core import run_semantic_search, run_mutation +``` + +| Function | Purpose | +|---|---| +| `run_semantic_search(query, ...)` | Translate a NL fragment into a GraphQL query, execute it, return results | +| `run_mutation(...)` | Same idea for mutations (only registered if the schema has a mutation model) | + +Both dispatch through [`graphql_client`](../graphql_client/) so the same authentication / error-handling rules apply. 
+ +## Used by +- `bases/lif/semantic_search_mcp_server` — both functions are wrapped as MCP tools (`lif_query`, `lif_mutation`) +- `components/lif/schema_state_manager` — used during initialization to build embeddings for the leaves it caches diff --git a/components/lif/string_utils/README.md b/components/lif/string_utils/README.md new file mode 100644 index 00000000..ae2f73b3 --- /dev/null +++ b/components/lif/string_utils/README.md @@ -0,0 +1,31 @@ +# `string_utils` — Component + +Case-conversion and identifier-sanitization helpers used wherever LIF's PascalCase/camelCase data model rules need to be applied programmatically — typically when building GraphQL types from an OpenAPI schema or constructing semantic-search field paths. + +## Public surface + +```python +from lif.string_utils import ( + safe_identifier, + to_pascal_case, to_snake_case, to_camel_case, + camelcase_path, + dict_keys_to_snake, dict_keys_to_camel, + convert_dates_to_strings, + to_value_enum_name, +) +``` + +| Function | What it does | +|---|---| +| `safe_identifier(s)` | Sanitize an arbitrary string into a valid Python/GraphQL identifier | +| `to_pascal_case(s)` / `to_camel_case(s)` / `to_snake_case(s)` | Self-explanatory | +| `camelcase_path(dotted)` | Apply camelCase to each segment of a dotted path | +| `dict_keys_to_snake(d)` / `dict_keys_to_camel(d)` | Recursive case conversion of dict keys | +| `convert_dates_to_strings(obj)` | Serializes `date` / `datetime` for JSON callers | +| `to_value_enum_name(value)` | Generates a stable enum member name from an arbitrary value | + +See [`docs/specs/data-model-rules.md`](../../../docs/specs/data-model-rules.md) for *which* case applies *where* (entities vs scalars vs enums). 
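To make the table concrete, here is a rough sketch of how such case conversions are commonly written. The `sketch_*` functions are illustrative stand-ins and are not the `lif.string_utils` implementations; in particular, their handling of acronyms and digits is an assumption and may differ from the real helpers.

```python
import re
from typing import List


def split_words(text: str) -> List[str]:
    # Insert a break at lower-to-upper boundaries, then split on non-alphanumerics.
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", text)
    return [word for word in re.split(r"[^A-Za-z0-9]+", spaced) if word]


def sketch_pascal(text: str) -> str:
    return "".join(word[:1].upper() + word[1:].lower() for word in split_words(text))


def sketch_camel(text: str) -> str:
    pascal = sketch_pascal(text)
    return pascal[:1].lower() + pascal[1:]


def sketch_snake(text: str) -> str:
    return "_".join(word.lower() for word in split_words(text))


def sketch_camel_path(dotted: str) -> str:
    # Convert each dot-separated segment independently.
    return ".".join(sketch_camel(segment) for segment in dotted.split("."))
```

Routing every conversion through one word-splitting step keeps the three cases consistent with each other, so a round trip such as snake to camel to snake returns the original string for simple identifiers.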
+ +## Used by +- `components/lif/openapi_to_graphql` — generating Strawberry types +- `components/lif/semantic_search_service` — building search corpus paths diff --git a/components/lif/tenant_routing/README.md b/components/lif/tenant_routing/README.md new file mode 100644 index 00000000..d05fda3e --- /dev/null +++ b/components/lif/tenant_routing/README.md @@ -0,0 +1,34 @@ +# `tenant_routing` — Component + +Pure functions that map Cognito group names to Postgres schema names + decide which schema a request should route to. The MDR auth middleware reads from here on every request to issue `SET search_path` for the DB session. + +## Public surface + +```python +from lif.tenant_routing import ( + MAX_GROUP_NAME_LEN, + SCHEMA_PREFIX, + resolve_tenant_schema, + sanitize_group_name, + tenant_schema_for_group, +) +``` + +| Symbol | Purpose | +|---|---| +| `sanitize_group_name(name)` | Strips/normalizes a Cognito group name into a valid PG identifier component (or `None` if it sanitizes to empty) | +| `tenant_schema_for_group(name)` | Returns `tenant_` followed by the sanitized group name, or `None` | +| `resolve_tenant_schema(enabled, is_service_principal, cognito_groups, service_schema)` | The full resolution logic — service principals route to `service_schema`; users route to the `tenant_`-prefixed schema for their first group; group-less users fall back to `service_schema` | +| `SCHEMA_PREFIX` | `"tenant_"` | +| `MAX_GROUP_NAME_LEN` | `128` — matches Cognito's own group name limit | + +## Why pure functions + +These rules need to match exactly between the auth middleware (Python) and the Flyway-installed `clone_lif_schema()` Postgres function. Keeping the Python side as pure, testable functions makes it easy to verify the two implementations agree. 
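The resolution rules in the table can be sketched as three small pure functions. This is an approximation under stated assumptions: the sanitization rule (lowercase, runs of other characters collapsed to `_`), the behavior when routing is disabled, and the fallback when the first group sanitizes to empty are guesses for illustration, and the `sketch_*` names are hypothetical rather than the real `lif.tenant_routing` functions.

```python
import re
from typing import Optional, Sequence

SCHEMA_PREFIX = "tenant_"


def sketch_sanitize_group_name(name: str) -> Optional[str]:
    # Assumed rule: lowercase, replace runs of non [a-z0-9] with "_", trim underscores.
    cleaned = re.sub(r"[^a-z0-9]+", "_", name.lower()).strip("_")
    return cleaned or None


def sketch_tenant_schema_for_group(name: str) -> Optional[str]:
    sanitized = sketch_sanitize_group_name(name)
    return f"{SCHEMA_PREFIX}{sanitized}" if sanitized else None


def sketch_resolve_tenant_schema(
    enabled: bool,
    is_service_principal: bool,
    cognito_groups: Sequence[str],
    service_schema: str,
) -> str:
    # Assumption: with routing disabled, everything uses the service schema.
    if not enabled or is_service_principal or not cognito_groups:
        return service_schema
    # Users route by their first group; fall back if it sanitizes to empty (assumed).
    return sketch_tenant_schema_for_group(cognito_groups[0]) or service_schema
```

Because none of these functions touch the request or the database, the same inputs can be replayed against the Postgres side to check that both implementations produce identical schema names.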
+ +## Used by +- `components/lif/mdr_auth/core.py` — middleware sets `request.state.tenant_schema` per request +- `components/lif/mdr_services/tenant_service.py` — `provision_tenant` uses this to compute the target schema before calling `clone_lif_schema` + +## See also +[`docs/design/cross-cutting/self-serve-tenant-auth.md`](../../../docs/design/cross-cutting/self-serve-tenant-auth.md) for the full schema-per-tenant story (issue #883). diff --git a/components/lif/translator/README.md b/components/lif/translator/README.md new file mode 100644 index 00000000..84bb85f8 --- /dev/null +++ b/components/lif/translator/README.md @@ -0,0 +1,22 @@ +# `translator` — Component + +Applies JSONata-based transformations to convert data between schemas. Source schema id + target schema id + raw input → translated output. Used by the orchestrator (source-system → LIF) and the Learner Data Export service (LIF → external formats like OpenBadges 3.0 or CEDS). + +## Public surface + +```python +from lif.translator.core import Translator, TranslatorConfig +from lif.translator import utils +``` + +`TranslatorConfig(source_schema_id, target_schema_id)` describes the transformation; `Translator(config).run(input_data)` executes it. The translator fetches transformation definitions from MDR via [`mdr_client`](../mdr_client/) at run-time. + +## Layout + +| File | Contents | +|---|---| +| `core.py` | `Translator`, `TranslatorConfig` | +| `utils.py` | JSONata helpers + path-resolution utilities | + +## Used by +- `bases/lif/translator_restapi` — single consumer; mounts `Translator` behind `POST /translate/source/{source_schema_id}/target/{target_schema_id}` diff --git a/cspell.json b/cspell.json index ab532ae1..c8b83d68 100644 --- a/cspell.json +++ b/cspell.json @@ -79,6 +79,7 @@ "fastapi", "fastmcp", "firstname", + "getenv", "fromjson", "frontends", "Frontmatter", @@ -93,6 +94,8 @@ "Identifiertype", "idempotently", "idxs", + "inlines", + "isinstance", "INITDB", "ilike", "initdb",