The Car Recommendation System is a dynamic platform designed to cater to two distinct user groups: practicality seekers and automobile enthusiasts. By leveraging content-based filtering techniques such as Word Embeddings and K-means, along with the collaborative filtering technique Matrix Factorization, it constructs a robust recommendation algorithm. This algorithm, powered by machine learning models, analyzes user input and preferences through a series of tailored questions across various parameters. Whether you prioritize utility or have a passion for automobiles, this system ensures tailored recommendations to suit your needs.
- Description
- Installation
- API Documentation
- Working of Recommendation System
- Working of Web Application
- Screenshots
- Deployed Links
- Tech Stacks
- References
1. Open your terminal and clone the repository to your local machine:
   git clone https://github.com/blackchapel/car-recommendation-system.git
2. Navigate to the backend folder of the cloned repository:
   cd car-recommendation-system/backend
3. Create a .env file in the backend folder and copy the contents of .env-example into it.
   3.1. There are 4 variables in this file; 3 are pre-filled in the .env-example file.
   3.2. To generate the SECRET_KEY, run the following command in your terminal:
        openssl rand -hex 32
   3.3. Paste the generated string as the value of the SECRET_KEY variable.

NOTE: The project can be run locally by following either Step 4 or Step 5.
4. To run the project with Docker:
   ⚠️ Prerequisite: Docker must be installed.
   4.1. Navigate to the root folder (if not already in it):
        cd ..
   4.2. In the root directory of the project, run:
        docker-compose up
        ⚠️ Building the Docker containers takes around 3 to 5 minutes.
5. To run the project without Docker:
   5.1. In the .env file created above, replace the value of the MONGODB_URI variable with the URI of your own MongoDB database.
   5.2. Navigate to the backend folder (if not already in it):
        cd backend
   5.3. Run the following commands in this terminal:
        ⚠️ Prerequisite: the Python 3 version must be 3.9.
        python3 -m venv env
        source env/bin/activate
        pip install -r requirements.txt
        uvicorn main:app
        The backend server is now running in this terminal.
   5.4. Open a new terminal.
   5.5. Navigate to the frontend folder in this new terminal:
        cd frontend
   5.6. Run the following commands in this terminal:
        npm install <or> yarn install
        npm start <or> yarn start
        The frontend server is now running in this terminal.
6. Once the Docker containers or local servers are running:
   - The frontend will be available on localhost:3000
   - The backend will be available on localhost:8000
   - The database will be available
     - with Docker: through mongodb://localhost:27017/
     - without Docker: on the URI entered in step 5.1
- Once the project is running, open your web browser and enter the following URL: Population link
- Hit the API endpoint
- The local database will now be populated with the required data
- Username: johndoe
- Password: pass@123
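If openssl is not available for the SECRET_KEY step of the installation, the same kind of value can be produced with Python's standard library. This is a minimal sketch, not part of the repository; the function name `generate_secret_key` is made up for illustration. `openssl rand -hex 32` prints 32 random bytes as 64 hexadecimal characters, and `secrets.token_hex(32)` returns a string in the same format.

```python
import secrets


def generate_secret_key(num_bytes: int = 32) -> str:
    """Return num_bytes of cryptographically strong randomness as hex text."""
    return secrets.token_hex(num_bytes)


if __name__ == "__main__":
    # Copy the printed line into backend/.env
    print(f"SECRET_KEY={generate_secret_key()}")
```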
The documentation for APIs created for this application using FastAPI will be available at the following:
- Online: car-recommendation-backend.onrender.com/docs
- Local: localhost:8000/docs when running locally
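Besides the interactive page, FastAPI by default also serves the machine-readable schema at /openapi.json next to /docs. The sketch below lists the routes found in that schema. It assumes the backend keeps FastAPI's default schema URL and is running on localhost:8000; the helper names are illustrative and not taken from the repository.

```python
import json
from urllib.request import urlopen

HTTP_METHODS = {"get", "post", "put", "patch", "delete", "head", "options", "trace"}


def openapi_url(base_url: str) -> str:
    """Build the schema URL (assumes FastAPI's default /openapi.json path)."""
    return base_url.rstrip("/") + "/openapi.json"


def list_routes(schema: dict) -> list:
    """Return 'METHOD /path' strings for every operation in an OpenAPI schema."""
    routes = []
    for path, operations in sorted(schema.get("paths", {}).items()):
        for method in sorted(operations):
            if method in HTTP_METHODS:  # skip non-operation keys such as "parameters"
                routes.append(f"{method.upper()} {path}")
    return routes


def fetch_schema(base_url: str = "http://localhost:8000", timeout: float = 10.0) -> dict:
    """Download and parse the schema from a running backend."""
    with urlopen(openapi_url(base_url), timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    try:
        for route in list_routes(fetch_schema()):
            print(route)
    except (OSError, ValueError) as error:  # backend not running, or not JSON
        print(f"Could not read the schema: {error}")
```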
The recommendation system operates through a multi-step process to provide personalized vehicle suggestions:
- Data Preprocessing: The raw dataset, sourced from the all-vehicles-model dataset, undergoes preprocessing to ensure quality and relevance. This includes filling missing values, dropping unnecessary columns, removing duplicates, encoding categorical variables, scaling numerical features, and enriching each entry with additional information such as images and prices. These steps make the dataset more complete and accurate, laying the groundwork for effective recommendation modeling.
- Word Embeddings: Using the Word2Vec algorithm, textual data related to vehicles, such as descriptions, specifications, and user reviews, is transformed into high-dimensional vectors. Each word is represented as a 100-dimensional vector, capturing semantic relationships between words and enabling the system to understand the context of vehicle attributes.
- K-means Clustering: K-means clustering groups vehicles with similar features into clusters. Prior to clustering, the vehicle attributes derived from Word Embeddings are normalized so that all features contribute equally to the clustering process. The system can then recommend vehicles that belong to the same cluster as those preferred by the user. K-means clustering is used in two places: to recommend similar cars when a user is viewing a car's details, and to produce recommendations after the user answers the short questionnaire.
- Matrix Factorization: Matrix Factorization is used as a collaborative filtering technique to personalize recommendations based on user interactions. If a user has rated different vehicles, Matrix Factorization identifies latent factors underlying user preferences and vehicle characteristics. By decomposing the user-item interaction matrix, it captures hidden patterns in user behavior and generates recommendations tailored to individual preferences as well as to similarities with other users.
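The preprocessing, normalization, and clustering steps above can be sketched with the standard library alone. This is an illustrative toy, not the repository's implementation: the car names, the two features (horsepower and miles per gallon), and the choice of k=2 are assumptions, and the project itself clusters attributes derived from Word Embeddings rather than raw numbers.

```python
import math
import random


def drop_duplicates(rows):
    """Remove exact duplicate rows while keeping the original order."""
    return list(dict.fromkeys(rows))


def min_max_scale(vectors):
    """Rescale every column to [0, 1] so each feature weighs equally."""
    columns = list(zip(*vectors))
    lows = [min(col) for col in columns]
    spans = [(max(col) - min(col)) or 1.0 for col in columns]
    return [
        [(value - low) / span for value, low, span in zip(vector, lows, spans)]
        for vector in vectors
    ]


def kmeans(points, k, max_iterations=50, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    rng = random.Random(seed)
    centroids = [list(point) for point in rng.sample(points, k)]
    labels = None
    for _ in range(max_iterations):
        # Assignment step: attach every point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(point, centroids[c]))
            for point in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def similar_cars(names, labels, viewed):
    """Return the other cars that share a cluster with the viewed car."""
    target = labels[names.index(viewed)]
    return [n for n, label in zip(names, labels) if label == target and n != viewed]


# Hypothetical rows: (name, horsepower, miles per gallon)
RAW_ROWS = [
    ("city_a", 70, 40), ("city_b", 75, 38), ("city_c", 80, 36),
    ("sport_a", 400, 18), ("sport_b", 450, 16), ("sport_c", 500, 15),
    ("city_a", 70, 40),  # exact duplicate, removed during preprocessing
]

rows = drop_duplicates(RAW_ROWS)
names = [row[0] for row in rows]
features = min_max_scale([row[1:] for row in rows])
labels = kmeans(features, k=2)
print(similar_cars(names, labels, "city_b"))  # the two other city cars
```

Without the scaling step, horsepower (a range of several hundred) would dominate miles per gallon (a range of about 25) in the distance computation, which is why normalization comes before clustering.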
For detailed implementation and code, refer to the ipynb file available in the repository. The code is also accessible here.
The web application offers three key features:
- Search Functionality: Users can search for specific vehicles.
- Tailored Recommendations: Users can answer a 5-question questionnaire to receive personalized vehicle recommendations.
- Advanced Recommendation System: Users can explore further recommendations based on various parameters.
Additionally, the platform includes a ratings feature, allowing users to rate cars they like.
Account creation and management are supported, enabling users to save their ratings and receive further recommendations based on their preferences. Authentication is secured through the implementation of JSON Web Tokens (JWT) and OAuth2.0 protocols.
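To make the JWT part concrete, the sketch below signs and verifies an HS256 token using only the standard library. It is a teaching sketch under stated assumptions, not the application's code: a real FastAPI backend would normally rely on a maintained JWT library, and the HS256 algorithm, the `sub` and `exp` claims, the 30-minute lifetime, and the function names are all assumptions made for this example.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Decode base64url text, restoring the stripped padding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret_key: str) -> bytes:
    return hmac.new(secret_key.encode(), signing_input.encode("ascii"), hashlib.sha256).digest()


def create_token(username, secret_key, expires_in=1800, now=None):
    """Return header.payload.signature for the given user (assumed claims)."""
    issued = int(time.time() if now is None else now)
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"sub": username, "exp": issued + expires_in}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, payload)
    )
    return f"{signing_input}.{_b64url(_sign(signing_input, secret_key))}"


def verify_token(token, secret_key, now=None):
    """Return the username if the signature and expiry are valid, else None."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        signature = _b64url_decode(signature_b64)
        expected = _sign(f"{header_b64}.{payload_b64}", secret_key)
        if not hmac.compare_digest(expected, signature):
            return None
        header = json.loads(_b64url_decode(header_b64))
        payload = json.loads(_b64url_decode(payload_b64))
    except (ValueError, UnicodeError):
        return None  # malformed token
    if header.get("alg") != "HS256":
        return None
    current = time.time() if now is None else now
    if payload.get("exp", 0) < current:
        return None  # expired
    return payload.get("sub")


if __name__ == "__main__":
    demo_key = "replace-with-the-SECRET_KEY-from-.env"
    token = create_token("johndoe", demo_key)
    print(verify_token(token, demo_key))
```

The signature check uses `hmac.compare_digest` rather than `==` so that the comparison time does not leak how many leading bytes matched.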
To enhance the search experience, the autocomplete API implemented suggests relevant options as users type.
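One simple way to serve such suggestions is a case-insensitive prefix lookup over a sorted list of names. The sketch below shows that approach; whether the application's autocomplete API works this way is not stated in this document, so treat the technique, the function names, and the sample car names as assumptions.

```python
from bisect import bisect_left


def build_index(names):
    """Sort unique names by their lowercase form for binary search."""
    return sorted((name.lower(), name) for name in set(names))


def autocomplete(index, prefix, limit=5):
    """Return up to `limit` names that start with `prefix`, ignoring case."""
    key = prefix.lower()
    # (key,) sorts before every (key..., name) pair, so this finds the first match.
    start = bisect_left(index, (key,))
    results = []
    for lowered, original in index[start:]:
        if len(results) == limit or not lowered.startswith(key):
            break
        results.append(original)
    return results


if __name__ == "__main__":
    sample = ["Toyota Corolla", "Toyota Camry", "Tesla Model 3", "Honda Civic"]
    print(autocomplete(build_index(sample), "to"))
```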
The application leverages Word Embeddings to analyze user search queries, enabling the recommendation of further similar cars. Upon completing the questionnaire, K-means clustering is utilized to tailor recommendations. Ratings provided by users inform the Matrix Factorization model, facilitating precise recommendations based on user preferences.
This succinct description encapsulates the key functionalities and security measures of the web application.
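As a rough illustration of how word vectors can turn a search query into similar-car suggestions, the sketch below averages the vectors of the words in a text and ranks cars by cosine similarity to the query. The 3-dimensional vectors, the vocabulary, and the car descriptions are invented for the example; the application uses 100-dimensional Word2Vec vectors, and averaging followed by cosine similarity is an assumption about one reasonable way to compare texts, not a description of its code.

```python
import math

# Hypothetical 3-dimensional word vectors (a trained Word2Vec model would supply real ones).
WORD_VECTORS = {
    "electric": [0.9, 0.1, 0.0],
    "hybrid": [0.7, 0.3, 0.1],
    "sedan": [0.2, 0.8, 0.1],
    "suv": [0.1, 0.7, 0.5],
    "sports": [0.0, 0.2, 0.9],
    "coupe": [0.1, 0.3, 0.8],
}


def embed(text):
    """Average the vectors of known words; return None if no word is known."""
    vectors = [WORD_VECTORS[w] for w in text.lower().split() if w in WORD_VECTORS]
    if not vectors:
        return None
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_cars(query, descriptions):
    """Return car names ordered from most to least similar to the query."""
    query_vector = embed(query)
    if query_vector is None:
        return []
    scored = []
    for name, description in descriptions.items():
        vector = embed(description)
        if vector is not None:
            scored.append((cosine(query_vector, vector), name))
    return [name for _, name in sorted(scored, reverse=True)]


if __name__ == "__main__":
    cars = {"car_a": "electric sedan", "car_b": "sports coupe", "car_c": "hybrid suv"}
    print(rank_cars("electric hybrid", cars))
```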
Here is a user guide to help you navigate the website when you run it locally. If the previous link is not accessible, please try this link.
Recommended Cars based on questionnaire -

If you are unable to run the application locally, it is hosted at the following links:
- Frontend: https://car-recommendation.vercel.app/
- Backend: https://car-recommendation-backend.onrender.com/api
NOTE: Running the application locally is recommended. The hosted website may be slow because it runs on a free tier.
📄 Dataset: All Vehicles Model Dataset
The All Vehicles Model Dataset provides information on various vehicles, including electric vehicles (EVs), plug-in hybrid electric vehicles (PHEVs), car models, makes, fuel consumption, and CO2 emissions. The dataset is provided by the Environmental Protection Agency and is available in the public domain.
- Dataset Identifier: all-vehicles-model
- Keywords: Vehicle, Motor, EV, PHEV, Car, Model, Make, Consumption, CO2
- License: Public domain
- Publisher: Environmental Protection Agency
- Reference: Fuel Economy Website






