Microservices single Job execution RFC
1. Microservices must execute only one job at a time.
2. Microservices must not have any job management logic or persistence (especially in-memory persistence).
- We do not bloat microservices with business logic and persistence logic.
- Sometimes it is hard to understand which job is running in a particular microservice. A job execution may spawn multiple processes running multiple binaries (for example, REINVENT4 runs multiple AutoDock Vina instances, and it is hard to determine programmatically which process belongs to job1 and which to job2).
- Microservices that run multiple jobs would need some level of synchronization between FastAPI workers (which leads to statement 1).
- Microservices that run multiple jobs would need persistent storage, which must be deployed outside of the container to satisfy 12-factor app practices.
- Common cloud development practice prescribes a 1 app - 1 process - 1 container relation. All side services/processes must be deployed to another container or a sidecar.
- Integrating a parallel computing framework such as Parsl would be hard to design and implement, since there would be another level of container-job management and we could not map job-id to container-id directly.
- Containers that run multiple jobs are harder to mock for testing purposes.
The container process can hold a persistent in-memory current JobId (GUID), which must be cleared using the API method /api/v1/main/state/clear. Only one JobId and state can be assigned to a container at a time. A container that has just started has no JobId or state. When the container's FastAPI process shuts down, the JobId and state are considered lost. The container still keeps a JobId so that we can be sure we are running, or reading the result state of, the same job that we started initially.
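The state rules above can be illustrated with a minimal sketch of an in-memory state holder. The names `ContainerState`, `JobConflictError`, and `finish` are hypothetical and not part of this RFC. The sketch also assumes that a new job is rejected until the state is cleared, and that a single worker process serves the container (a `threading.Lock` does not synchronize separate FastAPI worker processes).

```python
import threading
import uuid
from typing import Optional


class JobConflictError(RuntimeError):
    """Raised when a job is started while a JobId is still assigned (assumed behavior)."""


class ContainerState:
    """Holds at most one JobId and its status, in process memory only."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        # A freshly started container has no JobId.
        self._job_id: Optional[uuid.UUID] = None
        self._status = "idle"

    def start(self, job_id: uuid.UUID) -> None:
        with self._lock:
            # Assumption: the previous JobId must be cleared before a new start.
            if self._job_id is not None:
                raise JobConflictError(f"container already holds job {self._job_id}")
            self._job_id = job_id
            self._status = "running"

    def finish(self) -> None:
        # The JobId is kept so the caller can match the result to its job.
        with self._lock:
            self._status = "idle"

    def clear(self) -> None:
        with self._lock:
            self._job_id = None
            self._status = "idle"

    def snapshot(self) -> dict:
        # Mirrors the shape of the GET /api/v1/main/state response body.
        with self._lock:
            job_id = str(self._job_id) if self._job_id is not None else None
            return {"jobId": job_id, "status": self._status}
```

Because the object lives only in process memory, restarting the process yields a new `ContainerState` with no JobId, which matches the "considered lost" rule.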
POST /api/v1/main/state/clear
Clears the container state.
GET /api/v1/main/state
Response body:
```json
{
  "jobId": "Optional[UUID]",
  "status": { "oneOf": ["running", "idle"] }
}
```

POST /api/v1/main/start/{jobId}
Payload:
Any
Response body:
Any
POST /api/v1/main/stop
Any other methods
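As a rough illustration of how a caller might combine the endpoints above, the following sketch checks that a container is free, starts a job, and confirms that the container reports the same JobId. The `Transport` callable, `run_job`, and `release` are hypothetical helpers, not part of this RFC. The sketch makes no assumption about whether the start call blocks until the job finishes; it only compares the JobId.

```python
import uuid
from typing import Any, Callable, Optional

# Hypothetical transport: (method, path, json_body) -> decoded JSON response.
# In practice this would wrap an HTTP client of your choice.
Transport = Callable[[str, str, Optional[Any]], Any]

BASE = "/api/v1/main"


def run_job(call: Transport, job_id: uuid.UUID, payload: Any) -> Any:
    """Start one job on a free container and confirm the container tracks it."""
    state = call("GET", f"{BASE}/state", None)
    if state.get("jobId") is not None or state.get("status") != "idle":
        raise RuntimeError(f"container is busy or not cleared: {state}")

    result = call("POST", f"{BASE}/start/{job_id}", payload)

    # Verify that the container state belongs to the job we started.
    reported = call("GET", f"{BASE}/state", None).get("jobId")
    if reported is None or uuid.UUID(reported) != job_id:
        raise RuntimeError("container reports a different job than the one started")
    return result


def release(call: Transport) -> None:
    """Clear the container state so it can accept the next job."""
    call("POST", f"{BASE}/state/clear", None)
```

Passing the transport in as a callable keeps the flow independent of any particular HTTP library and makes the container easy to mock in tests.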