
Punk API infrastructure

This project uses Terraform to build an architecture that consumes, cleans, and stores data from the Punk API. It also provides a machine learning model that can be accessed remotely, via an AWS Lambda, to predict the IBU (International Bitterness Units) of a beer.

Setup

To build this project, you must have Terraform installed and an AWS account.

Once both are configured, you can run the project. First, clone the repository and enter it:

git clone https://github.com/joaorobson/aws_beer_classification.git
cd aws_beer_classification

Create a Python virtual environment and install the dependencies:

python3.9 -m venv env
source env/bin/activate
pip install -r notebooks/requirements.txt

Prediction setup

Before building the main architecture, it is necessary to create a Lambda that will be responsible for loading a pre-trained model from an S3 bucket and making predictions remotely. This step uses container images, given the memory limitations imposed by AWS on .zip deployment packages.

This can be done with the following steps:

  1. Set some environment variables:
export AWS_REGION=us-west-2
export BUCKET_NAME="beers-linear-regressor"
export IMAGE_NAME="ibu_prediction_image"
export IMAGE_TAG="latest"
  2. Create the ECR repository to store the generated image:
terraform apply -target=aws_ecr_repository.ibu_prediction_repository
  3. Set the REGISTRY_ID and IMAGE_URI environment variables:
export REGISTRY_ID=$(aws ecr \
  describe-repositories \
  --query 'repositories[?repositoryName == `'$IMAGE_NAME'`].registryId' \
  --output text)
export IMAGE_URI=${REGISTRY_ID}.dkr.ecr.${AWS_REGION}.amazonaws.com/${IMAGE_NAME}
  4. Authenticate the Docker client to the ECR registry using your AWS account ID:
aws ecr get-login-password --region $AWS_REGION | docker login --username AWS --password-stdin [aws_account_id].dkr.ecr.$AWS_REGION.amazonaws.com
  5. Build and push the Docker image:
cd code/model/
docker build -t $IMAGE_URI .
docker push $IMAGE_URI:$IMAGE_TAG

NOTE: Currently, the Lambda function will not work properly, because it depends on a model version stored in the S3 bucket. To make it work, follow the commands in the next sections.
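For reference, the Lambda packaged in the container image might look like the sketch below. This is an assumption, not the project's actual code in code/model/: the bucket key, the feature set (abv, ebc), and the pickle format are hypothetical, and the model loader is kept injectable so the handler logic can be exercised without AWS credentials.

```python
import json

_model = None  # cached across warm invocations


def _load_model():
    """Download and unpickle the regressor from S3 (key and format are assumptions)."""
    import pickle

    import boto3

    obj = boto3.client("s3").get_object(
        Bucket="beers-linear-regressor", Key="model.pkl"  # hypothetical key
    )
    return pickle.loads(obj["Body"].read())


def handler(event, context, model=None):
    """Predict a beer's IBU from the features in the invocation payload."""
    global _model
    if model is None:
        if _model is None:
            _model = _load_model()  # cold start: fetch the model once
        model = _model
    features = [[event["abv"], event["ebc"]]]  # hypothetical feature columns
    ibu = float(model.predict(features)[0])
    return {"statusCode": 200, "body": json.dumps({"ibu": ibu})}
```

On a warm invocation the cached model is reused, which is the usual pattern for avoiding a model download on every request.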

Main architecture setup

After that, to build the architecture in AWS, run the following in the root directory of the project:

terraform apply

This command creates all the resources used by the project. The behavior is rather simple: every 5 minutes, a new beer record is retrieved and stored in S3 buckets, one with the raw data and another with a cleaned version. The cleaned data bucket can then be used to train a machine learning model locally, as exemplified by this notebook.
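The cleaning step between the raw and cleaned buckets could look roughly like the sketch below. The field names follow the Punk API schema, but which columns the project actually keeps is an assumption here.

```python
def clean_beer(record):
    """Keep fields relevant to IBU prediction; drop records missing any of them.

    `record` is one beer object as returned by the Punk API (assumed schema).
    """
    fields = ("id", "name", "abv", "ibu", "ebc")
    cleaned = {k: record.get(k) for k in fields}
    # Discard records missing the target (ibu) or a feature (abv, ebc).
    if any(cleaned[k] is None for k in ("abv", "ibu", "ebc")):
        return None
    # Normalize numeric fields so the training notebook gets consistent types.
    for k in ("abv", "ibu", "ebc"):
        cleaned[k] = float(cleaned[k])
    return cleaned
```

Dropping incomplete records at this stage keeps the training notebook free of missing-value handling.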

Model training

Now, it is possible to train a model on the data collected and stored by the architecture. To do that, start Jupyter and run the training notebook:

 ./env/bin/jupyter notebook

After running it, you can make a prediction via the previously created Lambda, either from the notebook itself or via the CLI:

cd notebooks
./invoke_predict_ibu.sh
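As an alternative to the shell script, the Lambda can also be invoked from Python with boto3. The function name "predict_ibu" and the payload shape below are assumptions for illustration; check the script and the Terraform configuration for the real values.

```python
import json


def build_payload(abv, ebc):
    """Serialize the feature payload the prediction Lambda is assumed to expect."""
    return json.dumps({"abv": abv, "ebc": ebc}).encode()


def invoke_predict_ibu(abv, ebc, region="us-west-2"):
    """Invoke the prediction Lambda synchronously and return its decoded response."""
    import boto3

    client = boto3.client("lambda", region_name=region)
    resp = client.invoke(
        FunctionName="predict_ibu",  # hypothetical function name
        Payload=build_payload(abv, ebc),
    )
    return json.loads(resp["Payload"].read())
```

This requires the same AWS credentials the CLI script uses.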
