Welcome to the Python Programming Chatbot project repository! This project is a chatbot designed to help Python developers and learners solve programming problems in real time. It uses a fine-tuned code-generation language model to return context-aware, actionable responses to coding queries.
- Project Overview
- Features
- Technologies Used
- Dataset
- Model Training and Fine-Tuning
- Applications
- Project Structure
- Installation and Usage
- Future Work
- Contributing
- License
The Python Programming Chatbot is developed to assist users in solving Python-related queries. It is built by fine-tuning Salesforce's codegen-350M-multi model on a custom dataset of real-world Python challenges. The chatbot can be deployed in educational platforms, developer tools, and coding assistants.
- Interactive Python Programming Assistance: Responds to Python programming queries with tailored solutions.
- Real-World Problem Solving: Handles real-world coding scenarios, including debugging, optimization, and scripting.
- Developer-Friendly Interface: Seamless integration for developers needing real-time coding support.
- Scalable Backend: Built using Flask for API development and MongoDB for chat history storage.
- Python
- Flask
- Hugging Face Transformers
- Pandas
- NumPy
- Scikit-learn
- Base Model: Salesforce's codegen-350M-multi
- Fine-tuned Model: Optimized for Python coding dialogues
- MongoDB (For chat history storage)
The custom dataset for this project includes Python programming challenges in a question-answer format.
- Instruction: Describes the task or query.
- Input: Provides additional context.
- Output: Contains the expected Python code solution.
- Combined Instruction and Input into a single dialogue format: User: [Instruction + Input] Chatbot: [Output]
- Split into 80% training and 20% evaluation subsets.
- Converted to Hugging Face Dataset format.
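The preprocessing steps above can be sketched in plain Python. This is a minimal illustration, not the project's actual script: the record keys `Instruction`, `Input`, and `Output` are assumed to match the field names described above, and the helper names `format_example` and `split_examples`, the shuffle seed, and the sample records are all hypothetical. The conversion to Hugging Face Dataset format is not shown.

```python
import random


def format_example(record):
    """Merge Instruction and Input into one 'User: ... Chatbot: ...' string.

    The key names are an assumption based on the field descriptions above.
    """
    prompt = record["Instruction"]
    if record.get("Input"):
        # Append the optional context only when it is non-empty.
        prompt = f"{prompt} {record['Input']}"
    return f"User: {prompt} Chatbot: {record['Output']}"


def split_examples(examples, train_fraction=0.8, seed=42):
    """Shuffle a copy of the examples and split it into train/eval lists."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical records standing in for the real dataset.
records = [
    {"Instruction": f"Task {i}", "Input": "", "Output": f"solution_{i}()"}
    for i in range(10)
]

dialogues = [format_example(r) for r in records]
train_examples, eval_examples = split_examples(dialogues)
print(len(train_examples), len(eval_examples))
```

Fixing the seed keeps the 80/20 split reproducible between runs, which makes evaluation results easier to compare.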
The chatbot model was fine-tuned using the Hugging Face Trainer API.
- Batch Size: 4
- Learning Rate: 5e-5
- Epochs: 3
- Gradient Accumulation Steps: 8
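As a rough sketch, the hyperparameters listed above map onto keyword arguments of `TrainingArguments` in Hugging Face Transformers. The snippet assumes the batch size of 4 is a per-device value, which the list does not state, and the `output_dir` shown in the comment is a hypothetical path. It also works out the effective batch size implied by gradient accumulation.

```python
# Keyword names follow transformers.TrainingArguments; the values come from
# the list above. Treating "Batch Size" as per-device is an assumption.
hyperparams = {
    "per_device_train_batch_size": 4,
    "learning_rate": 5e-5,
    "num_train_epochs": 3,
    "gradient_accumulation_steps": 8,
}

# Gradients are accumulated over 8 mini-batches of 4 examples before each
# optimizer step, so every update sees 4 * 8 examples per device.
effective_batch_size = (
    hyperparams["per_device_train_batch_size"]
    * hyperparams["gradient_accumulation_steps"]
)
print(effective_batch_size)

# With transformers installed, the dictionary could be passed on like this
# (output_dir is a hypothetical location):
# from transformers import TrainingArguments
# training_args = TrainingArguments(output_dir="./results", **hyperparams)
```

Gradient accumulation lets a small per-step batch fit in limited GPU memory while still updating the weights on a larger effective batch.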
from transformers import Trainer

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    tokenizer=tokenizer,
    data_collator=data_collator,
)
trainer.train()

Enhances e-learning by providing detailed programming solutions and personalized guidance for Python learners.
Assists developers with debugging, error resolution, and best practice suggestions in real-time.
Boosts productivity by offering quick responses to Python coding challenges, saving time on searching and troubleshooting.
├── dataset/
│ ├── train.json
│ ├── eval.json
├── model/
│ ├── fine_tuned_model/
│ ├── tokenizer/
├── api/
│ ├── app.py
│ ├── requirements.txt
├── README.md
- dataset/: Contains training and evaluation datasets.
- model/: Stores the fine-tuned model and tokenizer.
- api/: Flask-based API files for chatbot interaction.
- Python 3.8 or higher
- MongoDB
- Clone the Repository
  git clone https://github.com/your-username/python-chatbot.git
  cd python-chatbot
- Install Dependencies
  pip install -r api/requirements.txt
- Set Up the Database
  - Install and configure MongoDB.
  - Update the connection string in app.py.
- Run the Flask App
  python api/app.py
- Interact with the Chatbot
  - Use a REST client like Postman to send queries to the chatbot API.
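As an alternative to Postman, a query can be sent from Python using only the standard library. The sketch below is illustrative: this README does not document the API's route or payload format, so the `/chat` path and the `{"query": ...}` JSON body are hypothetical, and the address assumes Flask's default development server on port 5000. Adjust all three to match `api/app.py`.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:5000"  # assumption: Flask's default dev address
CHAT_PATH = "/chat"                 # hypothetical route; check api/app.py


def build_chat_request(query, base_url=BASE_URL):
    """Build a JSON POST request carrying the user's question."""
    payload = json.dumps({"query": query}).encode("utf-8")  # hypothetical schema
    return urllib.request.Request(
        base_url + CHAT_PATH,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(query):
    """Send the question to a running server and decode the JSON reply."""
    with urllib.request.urlopen(build_chat_request(query), timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the Flask app running, calling `ask("How do I reverse a list in Python?")` would return whatever JSON the server responds with.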
- Expand the dataset to include more programming languages.
- Implement a web-based front end for easier user interaction.
- Enhance model capabilities for handling advanced coding tasks.
We welcome contributions!
- Fork the repository.
- Create a new branch:
  git checkout -b feature-name
- Make changes and commit:
  git commit -m "Add feature-name"
- Push the branch and open a pull request.
This project is licensed under the MIT License. See the LICENSE file for details.



