This repository contains a Flask web application that generates descriptive captions for user-uploaded images using a deep learning model trained from scratch. The project was developed for the Board Infinity Hackathon.
- Image Upload: Users can upload images through the web interface.
- Caption Generation: The application processes the uploaded image and generates a descriptive caption.
- Deep Learning Model: Utilizes a Long Short-Term Memory (LSTM) network trained from scratch for caption generation.
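
To illustrate how an LSTM captioner typically produces a sentence, here is a minimal sketch of greedy word-by-word decoding. It is not taken from this repository: `greedy_caption`, `toy_predictor`, the `<start>`/`<end>` tokens, and the 20-word limit are all hypothetical, and the toy predictor merely stands in for the trained model.

```python
from typing import Callable, Dict, List


def greedy_caption(
    predict_next: Callable[[List[str]], Dict[str, float]],
    max_length: int = 20,          # assumed cap on caption length
    start_token: str = "<start>",  # assumed special tokens
    end_token: str = "<end>",
) -> str:
    """Build a caption one word at a time, always taking the most likely word."""
    words = [start_token]
    for _ in range(max_length):
        probs = predict_next(words)            # word -> probability
        next_word = max(probs, key=probs.get)  # greedy choice
        if next_word == end_token:
            break
        words.append(next_word)
    return " ".join(words[1:])  # drop the start token


def toy_predictor(words: List[str]) -> Dict[str, float]:
    """Hypothetical stand-in for the LSTM: replays a fixed word sequence."""
    script = ["a", "dog", "runs", "<end>"]
    target = script[min(len(words) - 1, len(script) - 1)]
    return {target: 0.9, "the": 0.1}


print(greedy_caption(toy_predictor))
```

In a real model, `predict_next` would combine the image features with the words generated so far; the loop structure stays the same.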
To set up the application locally, follow these steps:
- Clone the Repository:

      git clone https://github.com/aayush9400/Image-Captioning-DL-Model.git
      cd Image-Captioning-DL-Model

- Create a Virtual Environment (optional but recommended):

      python3 -m venv venv
      source venv/bin/activate

- Install Dependencies:
  Ensure you have all necessary dependencies by installing them with pip:

      pip install -r requirements.txt

- Download Model Weights and Vocabulary Files:
  The model requires specific weight and vocabulary files to function correctly. Ensure that the following files are present in the repository:

      model.h5
      vocab.npy

  If these files are not included in the repository, you may need to contact the repository owner or refer to the project's documentation for instructions on obtaining them.
- Run the Flask Application:
  Start the Flask server by executing:

      python app.py

- Access the Web Interface:
  Open your web browser and navigate to http://127.0.0.1:5000/ to access the application.
- Upload an Image:
  Use the provided interface to upload an image.
- Generate Caption:
  After uploading, the application will process the image and display the generated caption.
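
An upload step like the one above is usually guarded by a filename check before the image reaches the model. The sketch below shows one common way to do that; it is an assumption, not this application's actual validation, and `ALLOWED_EXTENSIONS` and `is_allowed_image` are hypothetical names.

```python
from pathlib import PurePath

# Assumed set of accepted image types; the real app may accept others.
ALLOWED_EXTENSIONS = {".png", ".jpg", ".jpeg"}


def is_allowed_image(filename: str) -> bool:
    """Return True if the filename ends in an accepted image extension."""
    suffix = PurePath(filename).suffix.lower()  # case-insensitive comparison
    return suffix in ALLOWED_EXTENSIONS


print(is_allowed_image("holiday.JPG"))
print(is_allowed_image("notes.txt"))
```

An extension check only filters obvious mistakes; it does not prove the file is a valid image.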
The repository is organized as follows:
- app.py: Main Flask application file.
- model.h5: Trained LSTM model weights.
- vocab.npy: Vocabulary file used by the model.
- templates/: Directory containing HTML templates for the web interface.
- static/: Directory for static files (e.g., CSS, JavaScript).
- requirements.txt: List of Python dependencies.
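
Because the server cannot generate captions without `model.h5` and `vocab.npy`, it can help to confirm both are present before running `python app.py`. The following is a small sketch that assumes the files sit in the repository root next to `app.py`; the helper name `missing_files` is hypothetical.

```python
from pathlib import Path
from typing import Iterable, List

# File names come from the README; their location in the repo root is assumed.
REQUIRED_FILES = ("model.h5", "vocab.npy")


def missing_files(root: str = ".", required: Iterable[str] = REQUIRED_FILES) -> List[str]:
    """Return the required files that are not found under `root`."""
    base = Path(root)
    return [name for name in required if not (base / name).is_file()]
```

Calling `missing_files()` from the repository root returns an empty list when both files are in place, and otherwise the names that still need to be obtained.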
The application relies on the following Python libraries:
- Flask
- NumPy
- TensorFlow/Keras
- Pillow
Ensure these are installed in your environment.
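
One quick way to verify the environment is to ask Python whether each library can be found. This sketch assumes the usual import names (`flask`, `numpy`, `tensorflow`, and `PIL` for Pillow); `missing_modules` is a hypothetical helper, and the check confirms only that a package is installed, not that its version matches `requirements.txt`.

```python
from importlib.util import find_spec
from typing import Iterable, List

# Import names for the libraries listed above (Pillow is imported as PIL).
REQUIRED_MODULES = ("flask", "numpy", "tensorflow", "PIL")


def missing_modules(modules: Iterable[str] = REQUIRED_MODULES) -> List[str]:
    """Return the module names that cannot be located in this environment."""
    return [name for name in modules if find_spec(name) is None]


print(missing_modules())
```

An empty list means every dependency was found; any names printed still need to be installed with pip.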
This project was developed as part of the Board Infinity Hackathon. Special thanks to the organizers and contributors who made this project possible.
This project is licensed under the MIT License. See the LICENSE file for more details.
To report a problem or contribute, please open an issue or submit a pull request.