6 changes: 3 additions & 3 deletions .env.example
@@ -1,5 +1,5 @@
PROJECT_ID=ons-sdx-bob
ENV=development
DATASET_BUCKET_NAME="ons-sdx-bob-datasets"
PROJECT_ID=ons-sds-sandbox
PROFILE=dev
DATASET_BUCKET_NAME="ons-sds-sandbox-datasets"
FIRESTORE_DATABASE="test"
PUBLISH_DATASET_TOPIC_ID="projects/ons-sds-sandbox/topics/publish-dataset"
12 changes: 3 additions & 9 deletions Makefile
@@ -2,11 +2,6 @@ SHELL := bash
.ONESHELL:


PHONY: install
install: ## Install dependencies
uv sync


.PHONY: lint
lint:
@echo "Running Ruff linter..."
@@ -25,10 +20,9 @@ test:
uv run --dev pytest -v --disable-warnings tests/


.PHONY: test-parallel
test-parallel:
@echo "Running Local Tests..."
uv run --dev pytest -n auto -v --disable-warnings tests/
.PHONY: install
install: ## Install dependencies
uv sync


.PHONY: dev
49 changes: 49 additions & 0 deletions README.md
@@ -59,4 +59,53 @@ make test
docker build -t sds-loader .
```

## Profiles

sds-loader uses "profiles" to determine the concrete implementation of the abstracted services it uses.

For example, when running locally you may just want to use fake repositories to test the business logic of the application, whereas in production you will want real services such as a Firestore database and other GCP services.

The profile is selected by the `PROFILE` environment variable, which defaults to `prod` if not set. The following profiles are available:

- `prod`: Uses the real implementations of all services. This is the default profile.
- `dev`: Uses fake repositories and services that do not connect to any real services.
- `local_storage_firestore`: Uses fake repositories for all services except the `DatasetStorageRepositoryInterface`, which uses Firestore. To set up a local Firestore, see the Firestore emulator section below.
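
The profile mechanism described above can be pictured as a registry that maps each profile name to a function that wires up a container. The following is a minimal sketch only: a plain `dict` stands in for the real DI container, and `apply_prod`, `apply_dev`, `resolve_profile` and the `"storage"`/`"broadcaster"` keys are hypothetical names chosen for illustration, not the contents of `app/profiles.py`.

```python
import os
from collections.abc import Mapping
from typing import Callable, Optional

# Assumption: a plain dict stands in for the real DI container.
Container = dict


def apply_prod(container: Container) -> None:
    # Hypothetical wiring: real implementations of every service.
    container["storage"] = "firestore"
    container["broadcaster"] = "pubsub"


def apply_dev(container: Container) -> None:
    # Hypothetical wiring: fakes that never touch a real service.
    container["storage"] = "fake"
    container["broadcaster"] = "fake"


# Registry of profile name -> function that configures the container.
PROFILES: dict[str, Callable[[Container], None]] = {
    "prod": apply_prod,
    "dev": apply_dev,
}


def resolve_profile(env: Optional[Mapping[str, str]] = None) -> str:
    # Read PROFILE from the environment, defaulting to "prod" if not set.
    env = os.environ if env is None else env
    return env.get("PROFILE", "prod")


def build_container(env: Optional[Mapping[str, str]] = None) -> Container:
    profile = resolve_profile(env)
    try:
        profile_fn = PROFILES[profile]
    except KeyError:
        raise ValueError(
            f"Unknown profile '{profile}'. Available: {list(PROFILES)}"
        ) from None
    container: Container = {}
    profile_fn(container)
    return container
```

With this shape, adding a profile means registering one more function in `PROFILES`, and an unrecognised `PROFILE` value fails fast with a message listing the valid names.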

## Firestore emulator

To use Firestore locally, you will need to run the Firestore emulator, which you can do using Docker (the command is given at the end of this section).

You will need to set the environment variable `FIRESTORE_EMULATOR_HOST` to instruct the application to connect to the emulator instead of the real Firestore service:

```bash
export FIRESTORE_EMULATOR_HOST=localhost:8080
```

Ensure the profile for the application is set to `local_storage_firestore` so that it uses the Firestore emulator:

```bash
export PROFILE=local_storage_firestore

# Or add it to the .env file

PROFILE=local_storage_firestore
```

Then run the following command to start the Firestore emulator in Docker. Note that it takes a few seconds to start up, so you may want to run this command before starting the application:

```bash
docker run \
--rm \
-p=9000:9000 \
-p=8080:8080 \
-p=4000:4000 \
-p=9099:9099 \
-p=8085:8085 \
-p=5001:5001 \
-p=9199:9199 \
--env "GCP_PROJECT=${PROJECT_ID}" \
--env "ENABLE_UI=true" \
spine3/firebase-emulator &
```
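
Because the emulator takes a few seconds to start, it can help to wait until its port accepts connections before starting the application. The following is a minimal sketch using only the Python standard library; `wait_for_port` and `parse_emulator_host` are hypothetical helper names, not part of sds-loader, and the default timeout values are arbitrary assumptions.

```python
import socket
import time


def parse_emulator_host(value: str) -> tuple[str, int]:
    # Split a "host:port" string such as FIRESTORE_EMULATOR_HOST into its parts.
    host, _, port = value.rpartition(":")
    return host, int(port)


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    # Poll until a TCP connection succeeds or the timeout elapses.
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, `wait_for_port(*parse_emulator_host("localhost:8080"))` returns `True` once something is listening on port 8080, or `False` if nothing is listening after 30 seconds. A successful TCP connection only shows that the port is open; it does not prove that the emulator has finished initialising.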


1 change: 1 addition & 0 deletions app/broadcasters/fake_broadcaster.py
@@ -6,6 +6,7 @@ class FakeBroadcaster(DatasetBroadcastInterface):
"""
A fake broadcaster that doesn't actually broadcast the data
"""

def __init__(self):
self.broadcasted = []

8 changes: 2 additions & 6 deletions app/broadcasters/pubsub_broadcaster.py
@@ -16,6 +16,7 @@ class PubsubBroadcaster(DatasetBroadcastInterface):
A broadcaster that will
broadcast to pubsub
"""

def __init__(
self,
settings: PubsubBroadcastSettings,
@@ -24,9 +25,4 @@ def __init__(
self.pubsub_client = PubsubService()

def broadcast(self, dataset_metadata: DatasetMetadata) -> None:

self.pubsub_client.publish_message(
self.settings.publish_dataset_topic_id,
json.dumps(dataset_metadata),
{}
)
self.pubsub_client.publish_message(self.settings.publish_dataset_topic_id, json.dumps(dataset_metadata), {})
116 changes: 16 additions & 100 deletions app/dependencies.py
@@ -1,26 +1,13 @@
from lagom import Singleton, dependency_definition
from lagom import Singleton
from lagom.container import Container
from sds_common.publishers.gcs_schema_publisher import GcsSchemaPublisher
from sds_common.publishers.github_schema_publisher import GithubSchemaPublisher
from sdx_base.services.storage import StorageService

from app.broadcasters.fake_broadcaster import FakeBroadcaster
from app.broadcasters.pubsub_broadcaster import PubsubBroadcaster
from app.interfaces.dataset_broadcast_interface import DatasetBroadcastInterface
from app.interfaces.dataset_deletion_repository_interface import DatasetDeletionRepositoryInterface
from app.interfaces.dataset_source_repository_interface import DatasetSourceRepositoryInterface
from app.interfaces.dataset_storage_repository_interface import DatasetStorageRepositoryInterface
from app.repositories.dataset_deletion.fake_dataset_deletion_repository import FakeDatasetDeletionRepository
from app.repositories.dataset_deletion.firestore_dataset_deletion_repository import FirestoreDatasetDeletionRepository
from app.repositories.dataset_source.bucket_dataset_source_repository import BucketDatasetSourceRepository
from app.repositories.dataset_source.fake_dataset_source_repository import FakeDatasetSourceRepository
from app.repositories.dataset_storage.fake_dataset_storage_repository import FakeDatasetStorageRepository
from app.repositories.dataset_storage.firestore_dataset_storage_repository import FirestoreDatasetStorageRepository
from app.services.dataset_service import DatasetService, DatasetSettings
from app.services.schema_service import SchemaService

from app import get_logger
from app.profiles import PROFILES
from app.services.dataset_service import DatasetService, DatasetSettings
from app.settings import Settings, get_instance, QuickSettings

logger = get_logger()


class FakePublisher:
def __init__(self, name: str):
@@ -41,101 +28,30 @@ def build_container() -> Container:
# Create the DI container
container = Container()

# Determine environment
is_prod = QuickSettings().is_production()

# -----------------------------
# Core / shared dependencies
# -----------------------------
container[Settings] = lambda: get_instance()

# -----------------------------
# DatasetSourceRepositoryInterface
# -----------------------------

if is_prod:

@dependency_definition(container)
def build_bucket_dataset_source_repository() -> BucketDatasetSourceRepository:
return BucketDatasetSourceRepository(
bucket_reader=StorageService(),
settings=container[Settings]
)

container[DatasetSourceRepositoryInterface] = BucketDatasetSourceRepository

else:
container[DatasetSourceRepositoryInterface] = FakeDatasetSourceRepository

# -----------------------------
# DatasetStorageRepositoryInterface
# Apply profile
# -----------------------------
profile = QuickSettings().get_profile()

if is_prod:
try:
profile_fn = PROFILES[profile]
except KeyError:
raise ValueError(f"Unknown profile '{profile}'. Available: {list(PROFILES.keys())}")
logger.info(f"Using profile {profile}")

@dependency_definition(container)
def build_firestore_dataset_storage_repository() -> FirestoreDatasetStorageRepository:
return FirestoreDatasetStorageRepository(
settings=container[Settings]
)

container[DatasetStorageRepositoryInterface] = FirestoreDatasetStorageRepository
else:
container[DatasetStorageRepositoryInterface] = FakeDatasetStorageRepository
# Apply profile
profile_fn(container)

# -----------------------------
# DatasetDeletionRepositoryInterface
# Static Services
# -----------------------------

if is_prod:

@dependency_definition(container)
def build_firestore_dataset_deletion_repository() -> FirestoreDatasetDeletionRepository:
return FirestoreDatasetDeletionRepository(
settings=container[Settings]
)

container[DatasetDeletionRepositoryInterface] = FirestoreDatasetDeletionRepository
else:
container[DatasetDeletionRepositoryInterface] = FakeDatasetDeletionRepository

# -----------------------------
# DatasetBroadcastInterface
# -----------------------------

if is_prod:

@dependency_definition(container)
def build_pubsub_broadcaster() -> PubsubBroadcaster:
return PubsubBroadcaster(
settings=container[Settings]
)

container[DatasetBroadcastInterface] = PubsubBroadcaster
else:
container[DatasetBroadcastInterface] = FakeBroadcaster

# Settings
container[DatasetSettings] = lambda: get_instance()

# -----------------------------
# Services
# -----------------------------

# Schema Service

if is_prod:
container[SchemaService] = SchemaService(
bucket_publisher=GcsSchemaPublisher,
repository_publisher=GithubSchemaPublisher,
)
else:
container[SchemaService] = SchemaService(
bucket_publisher=FakePublisher(name="Fake bucket publisher"),
repository_publisher=FakePublisher(name="Fake github publisher"),
)

# Dataset service
container[DatasetService] = Singleton(DatasetService)

return container
4 changes: 1 addition & 3 deletions app/exceptions/__init__.py
@@ -1,6 +1,4 @@

class NonCriticalException(Exception):
...
class NonCriticalException(Exception): ...


class DatasetException(Exception):
3 changes: 1 addition & 2 deletions app/exceptions/dataset_deletion_empty_exception.py
@@ -1,5 +1,4 @@
from app.exceptions import NonCriticalException


class DatasetDeletionEmptyException(NonCriticalException):
...
class DatasetDeletionEmptyException(NonCriticalException): ...
3 changes: 1 addition & 2 deletions app/exceptions/dataset_deletion_exception.py
@@ -1,5 +1,4 @@
from app.exceptions import DatasetException


class DatasetDeletionException(DatasetException):
...
class DatasetDeletionException(DatasetException): ...
3 changes: 1 addition & 2 deletions app/exceptions/dataset_deletion_mark_exception.py
@@ -1,5 +1,4 @@
from app.exceptions import DatasetException


class DatasetDeletionMarkException(DatasetException):
...
class DatasetDeletionMarkException(DatasetException): ...
3 changes: 1 addition & 2 deletions app/exceptions/dataset_invalid_filename_exception.py
@@ -1,5 +1,4 @@
from app.exceptions import DatasetException


class DatasetInvalidFilenameException(DatasetException):
...
class DatasetInvalidFilenameException(DatasetException): ...
3 changes: 1 addition & 2 deletions app/exceptions/dataset_metadata_retrival_exception.py
@@ -1,5 +1,4 @@
from app.exceptions import DatasetException


class DatasetMetadataRetrivalException(DatasetException):
...
class DatasetMetadataRetrivalException(DatasetException): ...
3 changes: 1 addition & 2 deletions app/exceptions/dataset_not_found_exception.py
@@ -1,5 +1,4 @@
from app.exceptions import DatasetException


class DatasetNotFoundException(DatasetException):
...
class DatasetNotFoundException(DatasetException): ...
3 changes: 1 addition & 2 deletions app/exceptions/dataset_source_empty_exception.py
@@ -1,5 +1,4 @@
from app.exceptions import NonCriticalException


class DatasetSourceEmptyException(NonCriticalException):
...
class DatasetSourceEmptyException(NonCriticalException): ...
3 changes: 1 addition & 2 deletions app/exceptions/dataset_storing_exception.py
@@ -1,5 +1,4 @@
from app.exceptions import DatasetException


class DatasetStoringException(DatasetException):
...
class DatasetStoringException(DatasetException): ...
3 changes: 1 addition & 2 deletions app/exceptions/dataset_validation_exception.py
@@ -1,5 +1,4 @@
from app.exceptions import DatasetException


class DatasetValidationException(DatasetException):
...
class DatasetValidationException(DatasetException): ...
4 changes: 1 addition & 3 deletions app/exceptions/schema_source_invalid_exception.py
@@ -1,6 +1,4 @@

from app.exceptions import SchemaException


class SchemaSourceInvalidException(SchemaException):
...
class SchemaSourceInvalidException(SchemaException): ...
13 changes: 2 additions & 11 deletions app/interfaces/dataset_storage_repository_interface.py
@@ -13,11 +13,7 @@ class DatasetStorageRepositoryInterface(ABC):
"""

@abstractmethod
def get_latest_dataset_metadata(
self,
survey_id: str,
period_id: str
) -> DatasetMetadataWithoutId | None:
def get_latest_dataset_metadata(self, survey_id: str, period_id: str) -> DatasetMetadataWithoutId | None:
"""
Gets the latest dataset for a given survey and period id

@@ -49,12 +45,7 @@ def store_dataset(
...

@abstractmethod
def delete_dataset_version(
self,
survey_id: str,
period_id: str,
version: int
):
def delete_dataset_version(self, survey_id: str, period_id: str, version: int):
"""
Delete a specific version of the dataset from the repository

12 changes: 3 additions & 9 deletions app/middleware/timing.py
@@ -27,14 +27,8 @@ async def dispatch(self, request: Request, call_next):

response = await call_next(request)

process_time = round(
time.perf_counter() - start_time,
4
)

logger.info(
f"{request.method} {path} "
f"took {process_time}s"
)
process_time = round(time.perf_counter() - start_time, 4)

logger.info(f"{request.method} {path} took {process_time}s")

return response
4 changes: 2 additions & 2 deletions app/models/__init__.py
@@ -2,8 +2,8 @@


class StrictBase(BaseModel):
model_config = ConfigDict(extra='forbid', use_enum_values=True)
model_config = ConfigDict(extra="forbid", use_enum_values=True)


class AllowExtraBase(BaseModel):
model_config = ConfigDict(extra='allow', use_enum_values=True)
model_config = ConfigDict(extra="allow", use_enum_values=True)