
Immunization data analysis

This project conducts a complete data analysis with two goals: developing a model that classifies children by their immunization status, and identifying the features most strongly associated with that status. The analysis uses a dataset derived from raw data collected in Ethiopia in 2022. The data were collected as part of a baseline survey designed to assess childhood immunization coverage and its associated factors among selected host and refugee populations in the Gambella Region, one of the remote regions of Ethiopia. The survey followed the World Health Organization's (WHO's) Vaccination Coverage Cluster Surveys Reference Manual. Accordingly, data were collected from a sample of 3,200 children aged 12–23 months and their mothers or caretakers. The present analysis uses a subset of 84 variables from the dataset, capturing information on demographic characteristics, socioeconomic status, health‑care access, and knowledge, attitudes, and practices (KAP) related to childhood immunization.

The analysis will be presented in two parts:

Part 1 involves data cleaning and exploration to understand the nature of the data, using descriptive statistics to summarize and present it.
Part 2 presents the application of three machine learning methods (logistic regression, random forest, and an artificial neural network) to answer the research questions.
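
As a rough illustration of what Part 2 might look like, the sketch below fits the three model types with scikit-learn and ranks features by random forest importance. It is not the project's actual code: the data are synthetic (generated with `make_classification`), the feature names are hypothetical placeholders, and the hyperparameters are arbitrary assumptions rather than the values used in the analysis.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the survey data (NOT the real dataset).
X, y = make_classification(
    n_samples=600, n_features=10, n_informative=4, random_state=0
)
# Hypothetical feature names, used only for display.
feature_names = [f"feature_{i}" for i in range(X.shape[1])]

# Hold out a stratified test set so class balance is preserved.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# The three model families named above; settings are illustrative assumptions.
models = {
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "neural_network": make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0),
    ),
}

# Fit each model and record its accuracy on the held-out set.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)
    print(f"{name}: test accuracy = {scores[name]:.3f}")

# Rank features by impurity-based importance from the random forest.
importances = models["random_forest"].feature_importances_
ranking = sorted(zip(feature_names, importances), key=lambda p: p[1], reverse=True)
for feature, importance in ranking[:5]:
    print(f"{feature}: {importance:.3f}")
```

Scaling is applied only to the logistic regression and neural network pipelines, since tree-based models are insensitive to feature scale. Impurity-based importance is one of several ways to rank features; with real survey data that mixes categorical and numeric variables, permutation importance may be a more reliable choice.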
