MAP-Student-Math-Misunderstandings (Kaggle)

NLP + Machine Learning project identifying student math misconceptions from open-ended responses.
Techniques used: TF-IDF, embeddings, logistic regression, deep learning baselines, and full model evaluation.


Project Overview

This project explores natural language processing (NLP) and machine learning models that automatically detect student math misconceptions from open-ended responses. It compares several vectorization and modeling techniques, including traditional ML and deep learning baselines, and benchmarks and explains the performance of each model.


Dataset

The project uses the MAP Charting Student Math Misunderstandings competition data from Kaggle.

Notebook

The project workflow, modeling, and evaluation are contained in the main Jupyter notebook, Term_Project_geissinger_final.ipynb.

The notebook includes data loading, preprocessing, feature engineering, modeling (TF-IDF, embeddings, logistic regression, deep learning), and evaluation.
Detailed markdown cells throughout the notebook explain each step.
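
As a rough illustration of the TF-IDF plus logistic regression step, here is a minimal sketch using scikit-learn. The example responses, the label names, and the hyperparameters are all made up for illustration; they are not taken from the notebook or the competition data, and the notebook's actual pipeline may differ.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical toy data, for illustration only (not from the competition dataset).
responses = [
    "I added the numerators and the denominators",
    "I added the tops and the bottoms together",
    "I found a common denominator first and then added",
    "I converted both fractions to a common denominator",
]
labels = ["adds_across", "adds_across", "no_misconception", "no_misconception"]

# TF-IDF turns each response into a sparse weighted bag of unigrams and bigrams;
# logistic regression then learns one linear decision function over those features.
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), lowercase=True),
    LogisticRegression(max_iter=1000),
)
model.fit(responses, labels)

new_response = ["I just added the top numbers and the bottom numbers"]
predicted = model.predict(new_response)
probabilities = model.predict_proba(new_response)  # shape: (1, number of classes)

print("Predicted label:", predicted[0])
print("Class order:", list(model.classes_))
print("Probabilities:", probabilities.round(2))
```

Wrapping both steps in one pipeline keeps the vectorizer's vocabulary fitted only on the training responses, which avoids leaking test data into the features during evaluation.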


Results

  • Main Findings:
    • (Insert your best model’s performance summary here, e.g., “The best logistic regression model achieved X% F1-score.”)
    • (Comment on key insights or interesting failure cases if desired.)
  • Sample Outputs:
    • (Optionally add images or output snippets here.)

How to Run

Locally:

  1. Clone this repository:

    git clone https://github.com/LanaGeis/MAP-Student-Math-Misunderstandings_Kaggle.git
    cd MAP-Student-Math-Misunderstandings_Kaggle
  2. (Optional) Set up a Python virtual environment.

  3. Install dependencies:

    pip install -r requirements.txt
  4. Download the dataset from the Kaggle competition page
    and place it in a data/ directory in this repository.

  5. Open the notebook:

    jupyter notebook Term_Project_geissinger_final.ipynb

    and run the cells in order.

On Kaggle:

  • Create a new Notebook on Kaggle and upload Term_Project_geissinger_final.ipynb.
  • Attach the MAP Charting Student Math Misunderstandings dataset from the “Add Data” sidebar.
  • Make sure the data-loading code references the correct Kaggle input paths (attached datasets are mounted under /kaggle/input/).
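
One way to keep the same notebook working both locally and on Kaggle is to resolve the data directory at runtime. The sketch below is an assumption about how this could be done, not the notebook's actual code: the helper name `resolve_data_dir` is hypothetical, and the file name `train.csv` is assumed rather than confirmed by this repository.

```python
from pathlib import Path


def resolve_data_dir(local_dir="data", kaggle_root="/kaggle/input"):
    """Return the first Kaggle dataset folder containing a CSV, else the local folder."""
    root = Path(kaggle_root)
    if root.is_dir():
        # On Kaggle, each attached dataset appears as a subfolder of /kaggle/input.
        for candidate in sorted(root.iterdir()):
            if candidate.is_dir() and any(candidate.glob("*.csv")):
                return candidate
    # Not on Kaggle (or nothing attached): fall back to the local data/ directory.
    return Path(local_dir)


data_dir = resolve_data_dir()
train_path = data_dir / "train.csv"  # assumed file name; adjust to the actual dataset files
print("Loading data from:", train_path)
```

If more than one dataset is attached, passing an explicit path is safer than relying on the first match.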

Requirements

See requirements.txt for all dependencies.

Key packages:

  • Python 3.8+
  • numpy, pandas, scikit-learn, matplotlib, seaborn
  • tensorflow or pytorch (for deep learning models)
  • tqdm, nltk, sentence-transformers

On Kaggle, most of these dependencies are pre-installed.
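
To check quickly whether the key packages are available in the current environment, a small script along these lines may help. The mapping from pip names to import names below is written by hand for the packages listed above, and the helper name `missing_packages` is hypothetical; requirements.txt remains the authoritative list.

```python
import sys
from importlib.util import find_spec

# pip package name -> importable module name (hand-written mapping for this list)
REQUIRED = {
    "numpy": "numpy",
    "pandas": "pandas",
    "scikit-learn": "sklearn",
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
    "tqdm": "tqdm",
    "nltk": "nltk",
    "sentence-transformers": "sentence_transformers",
}
# Only one deep learning framework is needed.
DEEP_LEARNING = {"tensorflow": "tensorflow", "pytorch": "torch"}


def missing_packages(required):
    """Return the pip names whose modules cannot be found in this environment."""
    return [name for name, module in required.items() if find_spec(module) is None]


if sys.version_info < (3, 8):
    print("Python 3.8+ is required; found", sys.version.split()[0])

missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))

# All frameworks missing means none is installed.
if len(missing_packages(DEEP_LEARNING)) == len(DEEP_LEARNING):
    print("Install either tensorflow or pytorch for the deep learning models.")
```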


Kaggle Integration

  • Dataset:
    MAP Charting Student Math Misunderstandings Competition Data
  • Notebook:
    This notebook is not yet published on Kaggle as of this release.
    To publish:
    1. Go to your Kaggle account, “Code” → “+ New Notebook”.
    2. Upload Term_Project_geissinger_final.ipynb.
    3. Attach the required dataset and run all cells.
    4. Publish/share when ready.

License

MIT License.
See LICENSE for details.


Acknowledgments

  • Professor Brett Werner, Bellevue University, for feedback and review.
