- Abstract
- Main folder structure
- Tools and libraries
- Architecture diagram
- Assumptions, requirements and considerations
- Deployment pipeline and testing with make file
- Configuration parameters
- Paths
- Networking
- Technical decisions
- Security and best practices
- Known issues
- Moving to production and possible upgrades
- Useful links
- Possible alternatives
This document describes the application workflow, setup and configuration of bucket-av-scanner. The solution automates the scanning of newly uploaded files by using bucket event notifications, a Ruby script and clamd.
- Scalability is handled by decoupling the notifications and the scanning script through a RabbitMQ queue.
- Storage is implemented in MinIO buckets.
- ClamD scanning is exposed through a REST API.
- The entire solution runs on standard public container images.
- The workflow is managed by a Makefile.
- Container deployment is defined in a docker-compose YAML manifest.
- Setup is done by custom entrypoint scripts.
- HA and resiliency are ensured by healthchecks and volumes.
- The entire solution uses open source tools.
Files and scripts have been distributed as follows:
├── Makefile _# Makefile defining the workflow for deploying, testing and wrapping up the application_
├── README.md _# project documentation_
├── bucket-av-scanner.png _# main architecture diagram used in README.md_
├── docker-compose _# folder containing the docker-compose manifest and custom entrypoint scripts_
│   ├── docker-compose.yml _# main docker-compose manifest defining the services_
│   ├── docker-compose_avscan-script_entrypoint.sh _# custom entrypoint script for the avscan-script service_
│   ├── docker-compose_minio-service-init_entrypoint.sh _# custom entrypoint script for minio-service initialization_
│   ├── docker-compose_minio-service_entrypoint.sh _# custom entrypoint script for the minio-service service_
│   └── docker-compose_mq-service-init_entrypoint.sh _# custom entrypoint script for mq-service initialization_
└── testing _# folder containing testing scripts_
    └── bucket-av-scanner_tests.sh _# testing script with clean and infected (EICAR) cases_
The following tools have been used:
- av-scan script
  - ruby [3.0.2p107] # runs the main script
  - ruby gems
    - aws-sdk-s3 [1.114.0] # downloads newly uploaded files from the MinIO bucket
    - json [2.6.2] # post-scan notification formatting
    - uri [0.11.0] # request and notification field management
    - yaml [0.2.0] # request and notification field management
    - logger [1.5.1] # script message logging
    - securerandom [0.2.0] # randomizes the filename locally on the avscan container before scanning
    - bunny [2.19.0] # MQ queue subscription and management
    - net [0.3.3] # request and notification network communication
    - rest-client [2.1.0] # clamd POST requests
- Queue management
  - RabbitMQ [3.8.34] # MQ queue implementation
- Bucket storage solution
  - MinIO [2022-05-26T05:48:41Z] # bucket implementation
- Antivirus solution
  - ClamAV [0.104.3]
- Deploy
  - docker [20.10.12] # running standard docker images
  - docker-compose [1.29.2] # defining and creating docker services
- Testing
  - aws cli [2.7.30] # uploading and listing files
  - wget [1.21.3] # downloading the EICAR signature file
- Docker container images

| image name | tag | size | usage |
|---|---|---|---|
| ajilaag/clamav-rest | latest | 263MB | antivirus scanning |
| rabbitmq | 3.8-management-alpine | 148MB | mq queue |
| minio/minio | RELEASE.2022-05-26T05-48-41Z | 376MB | bucket storage |
| minio/mc | RELEASE.2022-05-09T04-08-26Z | 158MB | bucket initialization |
| alpine/openssl | latest | 8.04MB | tls/ssl generation |
| ruby | 2.7.0 | 842MB | av-scan script execution |
Execution workflow steps are as follows:
- An external service uploads a file to the MinIO S3 bucket
- MinIO sends a PUT notification to the RabbitMQ queue
- The notification is read by the internal Ruby script subscribed to the queue
- The internal Ruby script temporarily downloads the file from the MinIO S3 bucket
- The internal Ruby script sends the file to clamd-service, which scans it using ClamAV and returns the response to avscan-script (see the request sketch below)
- The internal Ruby script notifies the content sharing service endpoint with the result (whether the file is infected or not)
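To make the scan step more concrete, the request the script issues against clamd-service can be reproduced by hand. This is a minimal sketch: the multipart field name and the exact response format are assumptions about the clamav-rest API and may differ between image versions.

```bash
# Hedged sketch: reproduce by hand the scan request avscan-script sends to clamd-service.
# The multipart field name ("name") and the response format are assumptions about the
# clamav-rest API; -k is needed because of the self-signed certificates.
curl -k -s -X POST \
  -F "name=@/tmp/file-under-test" \
  https://clamd-service:9443/scan
# Expected (assumption): a JSON body stating whether the file is clean or infected.
```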
The following 3 services are used only for initialization purposes and run only at deploy time:
- mq-service-init: initializing mq-service queue, tags, users and bindings
- minio-service-init: creating buckets and users on minio-service (see the sketch below)
- openssl-init: generating self-signed certificate for minio-service
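For illustration, the kind of setup minio-service-init performs can be sketched with the MinIO client (mc). This is a hedged sketch, not the actual entrypoint: the alias name, the AMQP port 5672 and the notification target ID "scan" are assumptions; the env vars are the ones documented in the configuration section below.

```bash
# Hedged sketch of the kind of setup minio-service-init performs with the MinIO client (mc).
# Not the actual entrypoint: alias name, AMQP port 5672 and the notification target ID "scan"
# are assumptions; the env vars are the ones documented in the configuration section below.

# Wait for minio-service instead of sleeping a fixed amount of time
until mc alias set localminio "${MINIO_SERVER_URL}" "${MINIO_ROOT_USER}" "${MINIO_ROOT_PASSWORD}" --insecure; do
  sleep 2
done

# Create the bucket and a regular, non-privileged user
mc mb --insecure --region "${MINIO_REGION}" "localminio/${MINIO_BUCKET_NAME}"
mc admin user add --insecure localminio "${MINIO_USER_NAME}" "${MINIO_USER_PASSWORD}"
mc admin policy set --insecure localminio readwrite user="${MINIO_USER_NAME}"

# Point bucket PUT notifications at RabbitMQ
mc admin config set --insecure localminio notify_amqp:scan \
  url="amqp://${RABBITMQ_REGULAR_USER_NAME}:${RABBITMQ_REGULAR_USER_PASS}@${RABBITMQ_ENDPOINT}:5672" \
  exchange="${RABBITMQ_TOPIC}" exchange_type="topic" \
  routing_key="${RABBITMQ_QUEUE_ROUTING_KEY}" durable="on"
mc admin service restart --insecure localminio
mc event add --insecure "localminio/${MINIO_BUCKET_NAME}" arn:minio:sqs::scan:amqp --event put
```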
Running the Makefile assumes:
- aws cli is installed and accessible in PATH.
- the docker daemon is running.
- docker-compose is installed and accessible in PATH.
- internet access is available (DockerHub and gem sources) for downloading public images and gem libraries.
- other tools like make, netcat and wget are installed and accessible in PATH (a quick pre-flight check is sketched below).
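A minimal sketch of a pre-flight check for these prerequisites:

```bash
# Quick pre-flight check for the prerequisites listed above.
for tool in aws docker docker-compose make nc wget; do
  command -v "$tool" >/dev/null 2>&1 || echo "missing: $tool"
done
docker info >/dev/null 2>&1 || echo "docker daemon is not running"
```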
The following stages have been defined in the Makefile:
- make all (< 15m) runs the whole workflow sequentially: "make deploy-docker", "make test" and "make clean-docker"
- make deploy-docker (< 11m) creates all the needed resources on the local docker daemon
- make test (< 10s) runs the tests sequentially by uploading a clean and an infected file (see the sketch after this list)
- make clean-docker (< 5m) deletes the docker-compose resources created during the deploy
- make logs (< 5s) shows the log output of the resources created during the deploy
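A hedged sketch of what the test stage exercises; the endpoint, credentials and the EICAR download URL are illustrative, and --no-verify-ssl is needed because of the self-signed certificates:

```bash
# Hedged sketch of the two test cases run by "make test" (endpoint, credentials and the
# EICAR download URL are illustrative; --no-verify-ssl is required by the self-signed certs).
export AWS_ACCESS_KEY_ID="${MINIO_USER_NAME:-miniouser}"
export AWS_SECRET_ACCESS_KEY="${MINIO_USER_PASSWORD:?set the regular user password}"
ENDPOINT="https://localhost:9000"   # assumes port 9000 is published to the host

# Clean case: a harmless text file
echo "clean content" > clean.txt
aws --endpoint-url "$ENDPOINT" --no-verify-ssl s3 cp clean.txt "s3://${MINIO_BUCKET_NAME:-storagebucket}/"

# Infected case: the harmless EICAR test signature
wget -q https://secure.eicar.org/eicar.com.txt
aws --endpoint-url "$ENDPOINT" --no-verify-ssl s3 cp eicar.com.txt "s3://${MINIO_BUCKET_NAME:-storagebucket}/"

# With DELETE_FILE=true the infected object should disappear; clean, scanned files get tagged
aws --endpoint-url "$ENDPOINT" --no-verify-ssl s3 ls "s3://${MINIO_BUCKET_NAME:-storagebucket}/"
```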
The most important configuration parameters are customizable as follows:
- MINIO_SERVER_URL: service name and port for accessing "minio-service" service. Default: https://minio-service:9000
- MINIO_ENDPOINT: service name for accessing "minio-service" service. Default: minio-service
- MINIO_ROOT_USER: admin user name for "minio-service" service. Default: minioadmin
- MINIO_ROOT_PASSWORD: admin user password for "minio-service" service.
- MINIO_USER_NAME: regular user name for "minio-service" service. Default: miniouser
- MINIO_USER_PASSWORD: regular user password for "minio-service" service.
- MINIO_BUCKET_NAME: bucket name created on "minio-service" service. Default: storagebucket
- MINIO_REGION: emulated bucket region on "minio-service" service. Default: eu-west-1
- RABBITMQ_ENDPOINT: service name for accessing "rabbitmq-service" service. Default: mq-service
- RABBITMQ_PORT: service port number for accessing "rabbitmq-service" service. Default: 15672
- RABBITMQ_DEFAULT_USER: admin user name for "rabbitmq-service" service. Default: rabbitadmin
- RABBITMQ_DEFAULT_PASS: admin user password for "rabbitmq-service" service.
- RABBITMQ_REGULAR_USER_NAME: regular user name for "rabbitmq-service" service. Default: rabbituser
- RABBITMQ_REGULAR_USER_PASS: regular user password for "rabbitmq-service" service.
- RABBITMQ_QUEUE_ROUTING_KEY: MQ routing key for directing MinIO notification to mq queue. Default: bucket_notifications
- RABBITMQ_QUEUE_NAME: MQ queue name where MinIO notifications are delivered. Default: s3minioqueue
- RABBITMQ_TOPIC: MQ topic created, to which MinIO notifications are redirected. Default: s3minioscan
- RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS: additional RabbitMQ server setting to set the console log level to "error".
- CLAMD_ENDPOINT: service name for accessing "clamd-service" service. Default: clamd-service
- CLAMD_PORT: service port number for accessing "clamd-service" service. Default: 9443
- DELETE_FILE: option enabling deletion of the file from the bucket if it is infected. Default: true
- TAG_FILES: option enabling tagging of scanned files on the bucket. Default: true
- TAG_KEY: tag added to scanned files if TAG_FILES is true. Default: scanned
- VOLUME_SIZE: maximum scanned file size (in GB). Default: 2
- REPORT_CLEAN: option enabling result notifications also for clean (non-infected) scans. Default: false
- PUBLISH_URL: endpoint where the scan results are sent. Default: https://some-service.example.com/notification
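As a usage example, a few of these defaults could be overridden before deploying; this is a sketch with example values, assuming the Makefile and docker-compose read these variables from the environment:

```bash
# Sketch: overriding a few defaults before deploying (example values; assumes the Makefile
# and docker-compose read these variables from the environment).
export MINIO_BUCKET_NAME=uploads
export MINIO_REGION=eu-west-1
export DELETE_FILE=false      # keep infected files for inspection
export REPORT_CLEAN=true      # also notify on clean results
make deploy-docker
```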
Paths and folders inside the different services have been distributed as follows:
| service name | path | usage |
|---|---|---|
| minio-service | /root/.minio/certs | certificates |
| minio-service | /data/<bucket_name> | bucket data |
| minio-service | /data/.minio.sys | config files |
| rabbitmq-service | /var/log/rabbitmq | rabbitmq service log |
| rabbitmq-service | /var/lib/rabbitmq | rabbitmq service queue |
| avscan-script | /opt/av-scan/worker.rb | avscan ruby script |
| avscan-script | /opt/av-scan/av-scan.conf | avscan configuration file |
| avscan-script | /opt/av-scan/av-scan.log | avscan log |
| clamd-service | /var/lib/clamav/ | clamd virus definitions |
| clamd-service | /usr/bin/entrypoint.sh | entrypoint scripts |
| clamd-service | /etc/clamav/clamd.conf | clamd configuration file |
| clamd-service | /var/log/clamav/clamd.log | clamd daemon log |
The following ports and paths are accessible in the deployment:
| service name | port and path | protocol | usage |
|---|---|---|---|
| clamd-service | 3310/ | tcp | ClamD listening port |
| clamd-service | 9443/scan | https | ClamD HTTP scan |
| clamd-service | 9443/metrics | https | ClamD Prometheus metrics |
| minio-service | 9001/ | https | MinIO Web Console |
| minio-service | 9000/ | ssl | MinIO bucket service |
| mq-service | 15672/ | tcp/http | RabbitMQ queue control plugin |
| avscan-script | 8080/ | tcp | avscan-script readiness check |
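A hedged sketch of basic connectivity checks against these endpoints; the hostnames assume execution from inside the compose network (from the host, use localhost and the published ports):

```bash
# Hedged sketch: quick connectivity checks against the exposed endpoints (hostnames assume
# execution inside the compose network; from the host, use localhost and the published ports).
nc -z clamd-service 3310 && echo "clamd TCP port up"
curl -ks  https://clamd-service:9443/metrics | head -n 5   # ClamD Prometheus metrics
curl -ksI https://minio-service:9001/ | head -n 1          # MinIO web console
curl -sI  http://mq-service:15672/ | head -n 1             # RabbitMQ management plugin
nc -z avscan-script 8080 && echo "avscan-script ready"
```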
The following decisions have been made during the implementation:
- all custom logging messages start with [SERVICE_NAME] for clarity
- using the official AWS S3 client: instead of the MinIO client for uploading/downloading files to the bucket, since the same AWS API is implemented and only basic functionality is used
- Docker images
- pinned versions, avoiding misconfigurations caused by feature deprecation
- using the "rabbitmq:3.8-management-alpine" image instead of the regular "rabbitmq" image: the required rabbitmq_management plugin is enabled by default on the management images, cannot be enabled via env vars, and the size and software differences are minimal
- using the regular "ruby" image instead of "ruby-slim": additional libraries are needed, and installing them would extend the first boot even more
- Docker volumes for data and logging:
- keeping state and container configuration after failure restarts
- making it accessible and visible for future additional components (logging sidecar)
- bind mounting the entrypoint scripts, avoiding the creation of new volumes
- Custom container entrypoints
- using dynamic "until ... do" loops instead of hardcoded sleep commands, avoiding race conditions on boot dependencies (see the sketch after this list)
- using info/error logging levels where possible, avoiding unnecessary output and speeding up boot
- using separate entrypoint script files, for easier debugging and understanding
- restart policies
- initialization services (openssl-init, minio-service-init, mq-service-init) need to run once: they just set up the other containers and exit
- stable services (minio-service, mq-service, clamd-service) store their state and logs on volumes and need to be restarted if anything wrong is detected by the healthcheck
- avscan-script readiness check
- launching netcat listening on a port at the last possible moment, as a basic way to know that the gem installation has finished
- Architecture
- file upload scalability, since multiple avscan-script and clamd containers can be deployed subscribed to the same MQ queue
- why not change the workflow so that the uploading service sends the file to clamd before storing it in the S3 bucket? The following requirements are assumed:
- the legacy application workflow must not be modified
- the same working MinIO bucket/cluster may need to be kept
- the solution does not scale unless notification, script and scan are decoupled through a queue
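The entrypoint pattern and the readiness check described above can be sketched as follows; this is a minimal sketch, not the actual entrypoint, and ports, gem list and netcat flags are illustrative (flags vary per netcat variant):

```bash
# Hedged sketch of the avscan-script entrypoint pattern described above (not the actual
# entrypoint: ports, gem list and netcat flags are illustrative and vary per netcat variant).

echo "[AVSCAN-SCRIPT] waiting for mq-service and clamd-service..."
until nc -z mq-service 5672; do sleep 2; done       # "until ... do" instead of a fixed sleep
until nc -z clamd-service 3310; do sleep 2; done

echo "[AVSCAN-SCRIPT] installing gem dependencies..."
gem install bunny aws-sdk-s3 rest-client --no-document

# Readiness signal: only opened once everything above has finished
nc -lk -p 8080 &
exec ruby /opt/av-scan/worker.rb
```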
The following security measures and best practices have been applied:
- in-transit traffic is encrypted by using SSL certificates on minio-service (port 9000) and clamd (port 9443)
- admin and default users are not used; regular non-privileged users (miniouser, rabbituser) are created instead
- only the minimal necessary ports are exposed (no plain HTTP protocols), reducing the attack surface
- only the necessary credentials and vars are exposed to each container (non-admin users)
- random filenames are used inside avscan-script (securerandom gem), and the file is deleted after every scan
- virus definition files are updated by clamd-service
The following known issues have been identified:
- dependencies between services/containers are not fully managed by docker-compose. As a workaround, each entrypoint script holds execution until the needed service/endpoint is ready
- initialization time of the avscan-script service (around 11 minutes). Installing the gem dependencies needed by the Ruby script takes time during the initial boot. I would like to keep using public docker images and a dynamic initialization script (entrypoint) for better understanding of the solution. As a workaround, a custom Docker image could be built and used with the gem dependencies already installed.
- self-signed certificates. Due to infrastructure limitations, the certificates generated at deploy time with openssl are self-signed. Even though this ensures in-transit encrypted communication, the certificates themselves are not trusted, as they do not use a valid CA/DNS domain. As a workaround, SSL verification has been disabled on the aws s3 sdk and also in Chrome (see the sketch below).
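A minimal sketch of how openssl-init could generate the self-signed certificate, placed where minio-service expects it (/root/.minio/certs, as listed in the paths table); CN/SAN values are illustrative:

```bash
# Hedged sketch: self-signed certificate generation as openssl-init could do it, placed where
# minio-service expects it (/root/.minio/certs); CN/SAN values are illustrative.
openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
  -keyout private.key -out public.crt \
  -subj "/CN=minio-service" \
  -addext "subjectAltName=DNS:minio-service,DNS:localhost"

# Client-side workaround until a trusted CA is used: skip certificate verification
aws --endpoint-url https://minio-service:9000 --no-verify-ssl s3 ls
```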
The following upgrades are proposed for moving to production:
- Ruby script
- better error and exception handling in the avscan-script Ruby script, split by error type, logging an error code for each
- read vars from the environment (docker image) in avscan-script, avoiding the configuration file
- Logging management
- forward local container logs by using a sidecar strategy and a Loki/Prometheus/Splunk agent
- Security
- encrypted MQ communication by adding SSL certificates to RabbitMQ as well (RABBITMQ_SSL_CERTFILE, RABBITMQ_SSL_KEYFILE)
- import the self-signed certificates into the local certificate store of every container, avoiding disabling the SSL check in the aws cli
- using docker secrets to store credentials
- generating random passwords at deploy time, unique to each deploy, from /dev/random (see the sketch after this list)
- Storage
- scaling the storage solution by deploying a MinIO Tenant/Cluster sharing the same bucket volume
- Deployment/architecture
- using separate env_file files for environment vars in docker-compose
- a "make status" target checking service availability and endpoint connectivity
- offline clamd updates (virus definition sidecar strategy on /var/lib/clamav/), achieving an entirely offline solution
- upgrade the software versions used (ruby, RabbitMQ, MinIO, etc.)
- Kubernetes (future)
- migrate the docker-compose file, and use Horizontal Pod Autoscaling (HPA) to autoscale the "avscan-script" and "clamd-service" services, based on queue/clamd CPU performance
- create an ingress for each service and switch SSL management to a TLS secret
- persist the MinIO data volume by using a Container Storage Interface (CSI) driver
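A minimal sketch of the per-deploy random credential generation proposed above; /dev/random matches the proposal, although /dev/urandom would avoid blocking on low entropy:

```bash
# Sketch of the per-deploy random credentials proposed above; /dev/random matches the
# proposal, although /dev/urandom would avoid blocking on low entropy.
MINIO_USER_PASSWORD="$(head -c 32 /dev/random | base64 | tr -d '=+/' | cut -c1-24)"
RABBITMQ_REGULAR_USER_PASS="$(head -c 32 /dev/random | base64 | tr -d '=+/' | cut -c1-24)"
export MINIO_USER_PASSWORD RABBITMQ_REGULAR_USER_PASS
```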
Useful links for the components used in the solution:
- minio-service
- rabbitmq-service
- avscan-script
- clamd-service
- aws s3
The following alternatives have been considered and discarded:
- ifad/clammit: discarded because it modifies the legacy application workflow and also needs the additional clamd service/container
- widdix/aws-s3-virusscan: discarded because, although open source, it is not cloud-agnostic
- awslabs/cdk-serverless-clamscan: discarded because, although open source, it is not cloud-agnostic
