Open
70 commits
b18cfac
feat: Initialize rissk_kedro project structure with pipelines and doc…
VJausovec Feb 10, 2026
ed6a615
change raw folder to parameter rather than in data catalogue, small r…
VJausovec Feb 10, 2026
7e849a7
fix: Update questionnaire version keys and handle answer_sequence typ…
VJausovec Feb 11, 2026
2011167
docs: Update README with data encryption instructions and modify test…
VJausovec Feb 11, 2026
60e3686
feat: Refactor data ingestion pipeline to separate loading of paradat…
VJausovec Feb 11, 2026
3ab3b6d
feat: Add data ingestion pipeline with extraction logic and update ca…
VJausovec Feb 11, 2026
fed9105
refactor: Simplify FeatureProcessing constructor by removing survey_i…
VJausovec Feb 11, 2026
8ca3de3
return initial folders to catalogue
VJausovec Feb 12, 2026
7dda8cc
refactoring to unzip before finding file_paths
VJausovec Feb 12, 2026
9fc9e33
Refactor feature engineering pipeline and update test ingestion notebook
VJausovec Feb 13, 2026
b1298dd
- kedro ingestion: the initial data is passed as partitioned data in …
VJausovec Feb 13, 2026
706265c
Refactor data ingestion pipeline: update partition names, enhance zip…
VJausovec Feb 15, 2026
ef24c0e
Refactor data ingestion pipeline:
VJausovec Feb 16, 2026
95ca0ce
Refactor import_utils_kedro.py to improve folder filtering logic and …
VJausovec Feb 17, 2026
9b335f6
Refactor code structure for improved readability and maintainability
VJausovec Feb 17, 2026
f2054ce
Refactor code structure for improved readability and maintainability
VJausovec Feb 17, 2026
9183550
Refactor code structure for improved readability and maintainability
VJausovec Feb 18, 2026
f2a642d
Refactor import_utils_kedro.py for timestamp processing and error han…
VJausovec Feb 23, 2026
e8074e4
Refactor import_utils_kedro.py for categories Id match; update catalo…
VJausovec Feb 26, 2026
9437bf7
Refactor read_json_questionnaire function for simplified logic
VJausovec Feb 26, 2026
ddfa051
Refactor transform_multi function for improved value normalization; u…
VJausovec Feb 28, 2026
a681d15
Refactor feature creation and pipeline configuration; add missing fea…
VJausovec Mar 1, 2026
3ad4de2
Refactor feature processing functions for consistency; rename private…
VJausovec Mar 4, 2026
5092b98
Refactor time and pause feature calculations for improved clarity; fo…
VJausovec Mar 7, 2026
750c3f8
Refactor numeric feature handling for improved robustness; add missin…
VJausovec Mar 9, 2026
54e67ca
Add Kedro pipeline for scoring functions and detection algorithms; …
VJausovec Mar 9, 2026
aa5a7cf
Implement risk scoring pipeline with item and unit score calculations…
VJausovec Mar 9, 2026
4f73c28
Refactor scoring functions and pipeline definitions; update contamina…
VJausovec Mar 9, 2026
7f1ddfa
Refactor scoring functions for improved clarity and robustness; enhan…
VJausovec Mar 11, 2026
5d45d8e
Refactor catalog.yml to reorganize feature engineering and creation D…
VJausovec Mar 11, 2026
0f21b9c
Refactor scoring pipeline and functions to improve handling of answer…
VJausovec Mar 12, 2026
a3795ab
Refactor feature processing and item processing functions to improve …
VJausovec Mar 16, 2026
e1ad49b
Refactor feature creation and processing pipelines to enhance handlin…
VJausovec Mar 16, 2026
1697b08
Refactor scoring pipeline to add separate removed_answers dataset; up…
VJausovec Mar 16, 2026
6a2ad51
Refactor code structure for improved readability and maintainability
VJausovec Mar 17, 2026
4980670
Refactor feature processing and pipeline definitions to enhance handl…
VJausovec Mar 21, 2026
5b8e6c2
Refactor event handling in add_item_time_features and feature functio…
VJausovec Mar 24, 2026
6032e73
Refactor scoring and feature processing to align with legacy behavior…
VJausovec Mar 26, 2026
c8d968a
Refactor microdata handling in data ingestion pipeline; introduce raw…
VJausovec Mar 26, 2026
15e9a76
Enhance feature processing by adding questionnaire filtering; include…
VJausovec Mar 27, 2026
43f5bb2
Refactor numeric response feature processing to apply answer value fi…
VJausovec Mar 29, 2026
0d948c3
Monkey patches to generate legacy scoring for testing
VJausovec Mar 29, 2026
75bb61f
Add consent filtering functionality to feature creation pipeline; ref…
VJausovec Mar 30, 2026
584f732
changes to logging and clean-up
VJausovec Mar 31, 2026
03dd68d
Refactor data ingestion and feature creation pipelines; remove legacy…
VJausovec Apr 2, 2026
c713607
Refactor paradata loading functions; separate raw loading logic and e…
VJausovec Apr 2, 2026
54a6efa
Enhance GPS scoring logic; introduce s__gps flag in calculate_gps_sco…
VJausovec Apr 3, 2026
087b306
Changes to how legacy scoring data is generated for testing
VJausovec Apr 5, 2026
3c646a6
Update GPS scoring logic to handle NaN values and drop feature column…
VJausovec Apr 5, 2026
8c31f0f
Enhance pause feature handling by filling NaN values for count and du…
VJausovec Apr 5, 2026
4e69ac0
Refactor calculate_unit_scores function to streamline scoring logic; …
VJausovec Apr 6, 2026
a218820
Refactor scoring functions to improve handling of edge cases; remove …
VJausovec Apr 6, 2026
9886ab0
add Kedro NiceGUI wrappers, requirements and set-up instructions
VJausovec Apr 6, 2026
2e3e692
Update Python version requirement and restructure dependencies in pyp…
VJausovec Apr 6, 2026
1122556
Enhance documentation and setup instructions; update Python version r…
VJausovec Apr 6, 2026
74dd20c
Rename first_decimal features and functions to first_decimals for cla…
VJausovec Apr 7, 2026
3b39af1
Remove legacy data configurations and update feature parameters in YA…
VJausovec Apr 7, 2026
696aceb
cleanup
VJausovec Apr 7, 2026
27cee4d
Refactor get_numeric_mask function to remove commented-out code and s…
VJausovec Apr 29, 2026
ed7afe4
Refactor Kedro pipeline dependencies and update scoring logic
VJausovec Apr 30, 2026
57f71dd
Enhance logging and error handling for data loading and processing fu…
VJausovec May 4, 2026
fb61d88
GPS - extreme outlier only if both latitude AND longitude 0
VJausovec May 6, 2026
daf7ae6
Rename unit_risk_scores to unit_rissk_scores in catalog and pipeline;…
VJausovec May 6, 2026
8f7acec
Remove deprecated configuration files and clean up unused directories…
VJausovec May 6, 2026
98674fb
Refactor project structure and update dependencies
VJausovec May 6, 2026
645191a
Add RISSK GUI for Kedro pipeline configuration and execution
VJausovec May 6, 2026
4479c33
CLEANUP - Remove legacy feature generation, item processing, plotting…
VJausovec May 6, 2026
009c83e
Refactor: Remove unused utility files and scripts
VJausovec May 6, 2026
78817ab
Update README and SETUP documentation for RISSK Kedro pipeline: enhan…
VJausovec May 6, 2026
9f6628c
Remove loguru dependency from environment.yml
VJausovec May 6, 2026
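
Commit fb61d88 above states the GPS rule only in its title: a point counts as an extreme outlier only if both latitude AND longitude are 0. A minimal sketch of that rule follows. The function name `is_extreme_gps_outlier` is hypothetical and not taken from the repository, and treating missing or NaN coordinates as "not flagged" is an assumption, since the commit titles do not say how the pipeline handles them.

```python
import math


def is_extreme_gps_outlier(latitude, longitude):
    """Return True only when both coordinates are exactly zero.

    Hypothetical helper illustrating the rule named in commit fb61d88;
    the real pipeline code may be structured differently.
    """
    # Assumption: a missing or NaN coordinate is never flagged by this rule.
    for value in (latitude, longitude):
        if value is None or math.isnan(value):
            return False
    # A single zero coordinate is a valid location (equator or prime
    # meridian), so both must be zero for the flag to be raised.
    return latitude == 0 and longitude == 0


if __name__ == "__main__":
    print(is_extreme_gps_outlier(0.0, 0.0))
    print(is_extreme_gps_outlier(0.0, 12.5))
```

Requiring both coordinates to be zero avoids flagging genuine points that lie on the equator or the prime meridian, which a check on either coordinate alone would do.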
14 changes: 13 additions & 1 deletion .gitignore
@@ -259,4 +259,16 @@ cython_debug/
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/

embedded-assets/tmpo_c6_8gw.html
.gitignore
rissk_kedro/stats.json
.vscode/mcp.json
configuration/main.yaml
rissk/prompt.md
FEATURES_SCORES.md
rissk_kedro/conf/base/catalog.yml
FEATURES_SCORES_updated.md
prompt.md
markdown_docs/
requirements_legacy.txt
rissk_kedro/notebooks/
rissk/utils/testing_utils.py
717 changes: 717 additions & 0 deletions Kedro_vs_Legacy_Changelog.md

Large diffs are not rendered by default.

100 changes: 0 additions & 100 deletions Makefile

This file was deleted.

351 changes: 343 additions & 8 deletions README.md

Large diffs are not rendered by default.

11 changes: 0 additions & 11 deletions configuration/environment/notebook_environment.yaml

This file was deleted.

101 changes: 0 additions & 101 deletions configuration/main.yaml

This file was deleted.

7 changes: 0 additions & 7 deletions env.yaml

This file was deleted.

57 changes: 27 additions & 30 deletions environment.yml
@@ -1,42 +1,39 @@
name: rissk
name: rissk_kedro
channels:
- conda-forge
- defaults
dependencies:
- python=3.9
# R and its dependencies
- r-base
- python=3.13
# R Core - 3.13 compatible binaries (not used in pipeline code)
- r-base>=4.4
- r-ggplot2
- r-dplyr
- r-tidyr
- r-shiny
- r-readr
- r-irkernel # For running R in Jupyter Notebooks
- r-stringr
# Interoperability
- rpy2 # For using R within Python
# Graphing libs
# Graphviz System Deps
- graphviz
- pygraphviz
# Other tools
- python-graphviz
- pydot
- pip
- pip:
- jupyter_contrib_nbextensions
- awscli
- botocore
- loguru==0.7.3
- tqdm==4.67.1
- pandas==2.2.2
- seaborn==0.13.2
- docutils==0.16
- openpyxl==3.1.2
- pyarrow==15.0.2
- pyod==1.1.3
- python-dotenv==1.0.1
- pythresh==0.3.6
- ploomber==0.23.3
- ipywidgets==8.1.5
- typer==0.15.1
- boto3==1.35.88
- botocore==1.35.88
# Framework
- kedro==1.2.0
- kedro-viz>=12.3.0
- kedro-datasets[pandas,s3fs,excel,files]>=9.1.0

# GUI
- nicegui>=1.4

# Core Stack
- rpy2>=3.6.4
- pandas>=2.2.3
- numpy>=2.1.0
- pyarrow>=18.0.0

# Analysis Tools
- pyod>=1.1.5
- pythresh>=1.0.3
- tqdm>=4.67.0
- boto3>=1.35.0
- python-dotenv>=1.0.1
- -e .
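
The new pip section replaces exact pins (`==`) with minimum versions (`>=`), so different machines can resolve to different installed versions. The sketch below shows one way to check an activated environment against a few of those minimums using only the standard library. The helper names (`version_tuple`, `meets_minimum`, `check_environment`) are hypothetical and not part of the repository, and the version comparison is deliberately simplified: it compares only the leading numeric components and ignores pre-release suffixes.

```python
from importlib import metadata

# Minimum versions copied from the ">=" entries in environment.yml above.
MINIMUMS = {
    "pandas": "2.2.3",
    "numpy": "2.1.0",
    "pyarrow": "18.0.0",
    "pyod": "1.1.5",
    "pythresh": "1.0.3",
}


def version_tuple(version):
    """Turn '2.2.3' into (2, 2, 3), stopping at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare two version strings after padding them to equal length."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_environment(minimums=MINIMUMS):
    """Return {package: (installed_version_or_None, meets_minimum)}."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Package is absent from the active environment.
            report[name] = (None, False)
            continue
        report[name] = (installed, meets_minimum(installed, minimum))
    return report


if __name__ == "__main__":
    for package, (installed, ok) in check_environment().items():
        print(f"{package}: installed={installed} ok={ok}")
```

Running the script inside the activated environment prints one line per package. For anything beyond a quick sanity check, a full version parser would be the safer choice.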
48 changes: 0 additions & 48 deletions main.py

This file was deleted.

Empty file removed notebooks/exploration/.gitkeep
Empty file.
29 changes: 0 additions & 29 deletions pipeline.yaml

This file was deleted.

Empty file removed pipelines/.gitkeep
Empty file.