Data-analysis-project

Description

This project was completed as part of a computer technology workshop course. The task was to analyze arbitrary datasets and to test various statistical hypotheses. All of the work was done in Jupyter Notebook, so the final versions are provided in .ipynb and .html formats. Separate source code files are also provided.

Documentation

  • Directories:
    • 1-2 steps: main directory with the final implementations of the two project steps, where step 1 is data analysis and visualization and step 2 is the testing of various statistical hypotheses.
    • Scripts: directory with the source code of the main functions used in the two project steps.
  • Methods (Functions):
    • Python:

      • extract_sport(string): extracts the sport category;

      • grubbs_test(array): implements the Grubbs test;

      • q_dixon_test(array): implements the Dixon Q test;

      • plot_ecdf(array, label, ax): plots the ECDF with seaborn;

      • custom_ecdf(array): computes the ECDF of the data;

      • envelope method(array, n): implements the envelope method using a bootstrap algorithm and the function 'custom_ecdf(data)';

      • perform_normality_tests(array, name): runs tests of the hypothesis that the data are normally distributed;

      • f_test_variance(array_x, array_y, alpha): implements the F test of the hypothesis that two variances are equal;

      • compute_chi2_statistic(table)(tble_name): calculates the chi-square statistic;

      • fit_polynomial_regression(degree): fits a polynomial regression of the given degree.

    • R:

      • envelope_ecdf <- function(data) {...}: creates the polygon for the envelope method via the ECDF;
      • the other methods are the same as in Python.
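
To give a feel for how the ECDF and envelope functions listed above fit together, here is a minimal sketch. It is not the code from the notebooks: the function bodies, the name `envelope_method`, and the parameters `n_boot`, `level` and `seed` are assumptions made for illustration, and the envelope is a simple pointwise percentile band over bootstrap resamples.

```python
import numpy as np


def custom_ecdf(data):
    """Return the sorted values and the ECDF heights i/n for i = 1..n."""
    x = np.sort(np.asarray(data, dtype=float))
    y = np.arange(1, x.size + 1) / x.size
    return x, y


def envelope_method(data, n_boot=1000, level=0.95, seed=0):
    """Pointwise bootstrap envelope for the ECDF (hypothetical sketch)."""
    data = np.asarray(data, dtype=float)
    rng = np.random.default_rng(seed)
    grid, _ = custom_ecdf(data)  # evaluate every curve at the observed points
    curves = np.empty((n_boot, grid.size))
    for b in range(n_boot):
        # Resample with replacement and take the ECDF of the resample.
        sample = rng.choice(data, size=data.size, replace=True)
        xs, _ = custom_ecdf(sample)
        # Fraction of resampled values that are <= each grid point.
        curves[b] = np.searchsorted(xs, grid, side="right") / xs.size
    tail = (1.0 - level) / 2.0
    lower = np.quantile(curves, tail, axis=0)
    upper = np.quantile(curves, 1.0 - tail, axis=0)
    return grid, lower, upper
```

The returned `lower` and `upper` arrays can be drawn as a shaded band around the ECDF, for example with matplotlib's `fill_between`.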

The main implementations are in the .ipynb files in the 1-2 steps directory.
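
As an illustration of the hypothesis-testing step, a two-sided F test for equality of variances might look roughly like the sketch below. The signature follows `f_test_variance(array_x, array_y, alpha)` from the list above, but the body, the default `alpha=0.05`, and the returned tuple are assumptions, not the notebook code. Note that the F test itself assumes both samples are normally distributed.

```python
import numpy as np
from scipy import stats


def f_test_variance(array_x, array_y, alpha=0.05):
    """Two-sided F test of H0: var(x) == var(y) (hypothetical sketch)."""
    x = np.asarray(array_x, dtype=float)
    y = np.asarray(array_y, dtype=float)
    # Unbiased sample variances (ddof=1).
    f_stat = np.var(x, ddof=1) / np.var(y, ddof=1)
    dfn, dfd = x.size - 1, y.size - 1
    # Two-sided p-value: twice the smaller tail probability.
    cdf = stats.f.cdf(f_stat, dfn, dfd)
    p_value = 2.0 * min(cdf, 1.0 - cdf)
    reject = bool(p_value < alpha)
    return float(f_stat), float(p_value), reject
```

A call such as `f_test_variance(sample_a, sample_b, alpha=0.05)` returns the F statistic, the p-value, and whether the null hypothesis of equal variances is rejected at level `alpha`.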

Developers

Additional useful links

All files for download on Yandex Disk

Python documentation

R documentation

Test of Hypotheses using statistics

License

MIT license