@eugenio (Collaborator) commented Jan 30, 2026

Summary

This PR contains AI-assisted improvements to the cybersecurity attacks analysis script:

  • Fixed multiple runtime errors preventing the script from executing properly
  • Added comprehensive documentation to improve code readability
  • Enhanced visualizations with proper titles, labels, and legends

Bug Fixes

Column Transformation Guards

  • Added conditional checks for already-transformed columns throughout script.py
  • Fixed KeyError: 'Packet Type' (column was already transformed to 'Packet Type Control')
  • Added guards for: Protocol, Traffic Type, Attack Signature, Action Taken, Network Segment, Log Source, Device Information, Alert Trigger, Malware Indicators, Firewall Logs, IDS/IPS Alerts, Geo-location Data
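A minimal sketch of the guard pattern described above, assuming each transformation replaces the original column with a renamed one. The helper `transform_once` and the toy data are hypothetical and are not taken from script.py; they only show why a second run no longer raises `KeyError`.

```python
import pandas as pd


def transform_once(df, col_name, new_name, transform):
    """Replace `col_name` with `new_name` unless that was already done.

    Hypothetical helper for illustration only.
    """
    if col_name in df.columns:
        df = df.copy()
        df[new_name] = transform(df[col_name])
        df = df.drop(columns=[col_name])
    return df


# Toy data, not the real dataset.
df = pd.DataFrame({"Packet Type": ["Control", "Data", "Control"]})


def to_flag(series):
    # Encode "Control" as 1 and everything else as 0.
    return (series == "Control").astype(int)


df = transform_once(df, "Packet Type", "Packet Type Control", to_flag)
# Second call is a no-op because "Packet Type" is gone.
df = transform_once(df, "Packet Type", "Packet Type Control", to_flag)
```

The guard only matters when the same transformation can run twice on one DataFrame, for example when cells are re-executed on an already-processed dataset.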

Function Call Fixes

  • Fixed Device_type() function call (removed erroneous parentheses in .apply())
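To illustrate the difference, here is a small sketch. The body of `Device_type` below is a hypothetical stand-in (the real implementation lives in script.py), and the `Device Type` output column is an assumed name.

```python
import pandas as pd


def Device_type(user_agent):
    """Hypothetical stand-in: classify a user-agent string."""
    text = str(user_agent).lower()
    if "android" in text or "iphone" in text:
        return "Mobile"
    return "Desktop"


df = pd.DataFrame(
    {"Device Information": ["Mozilla/5.0 (iPhone)", "Mozilla/5.0 (Windows NT 10.0)"]}
)

# Wrong: Device_type() is called immediately with no argument,
# so Python raises TypeError before .apply() ever runs.
try:
    df["Device Information"].apply(Device_type())
except TypeError as exc:
    error_message = str(exc)

# Right: pass the function object; pandas calls it once per value.
df["Device Type"] = df["Device Information"].apply(Device_type)
```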

Statistical Analysis Fixes

  • Fixed chi2_contingency error by creating proper contingency table with encoded values
  • Fixed MCA ValueError by adding 1 to the one-hot encoded data so that all values are strictly positive
  • Removed invalid mode parameter from px.line() call
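As a rough sketch of the chi-square fix, one common way to build a contingency table is `pd.crosstab`. Whether script.py uses `crosstab` or another encoding is an assumption here, and the data below is made up for illustration.

```python
import pandas as pd
from scipy.stats import chi2_contingency

# Toy data, not the real dataset.
df = pd.DataFrame(
    {
        "Protocol": ["TCP", "UDP", "TCP", "ICMP", "UDP", "TCP", "ICMP", "UDP"],
        "Action Taken": [
            "Blocked", "Logged", "Blocked", "Logged",
            "Blocked", "Logged", "Blocked", "Logged",
        ],
    }
)

# chi2_contingency expects a table of observed counts,
# not the two raw categorical columns.
contingency = pd.crosstab(df["Protocol"], df["Action Taken"])

chi2, p_value, dof, expected = chi2_contingency(contingency)
```

Passing the count table (rows are one variable's categories, columns the other's) is what lets the test compute expected frequencies.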

Dependency Fix

  • Added statsmodels to pixi.toml pypi-dependencies

Documentation Improvements

Code Documentation

  • Added comprehensive docstrings to all functions
  • Added inline comments explaining data transformations
  • Added section headers with markdown formatting

Visualization Improvements

  • Added descriptive titles to all charts
  • Added axis labels (xaxis_title, yaxis_title) to all plots
  • Added hover templates with meaningful labels
  • Added legend titles and context

Test Plan

  • Script runs to completion without errors
  • All visualizations render correctly with proper labels
  • Dataset is saved correctly after processing

🤖 Generated with Claude Code

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

This commit contains changes made with AI assistance (Claude Opus 4.5).

## Bug Fixes

### Column Transformation Guards
- Added conditional checks for already-transformed columns throughout script.py
- Fixed KeyError for 'Packet Type' (already transformed to 'Packet Type Control')
- Added guards for: Protocol, Traffic Type, Attack Signature, Action Taken,
  Network Segment, Log Source, Device Information, Alert Trigger,
  Malware Indicators, Firewall Logs, IDS/IPS Alerts, Geo-location Data

### Function Call Fixes
- Fixed Device_type() function call at line ~510 (removed erroneous parentheses)
- Changed from `df[col_name].apply(Device_type())` to `df[col_name].apply(Device_type)`

### Statistical Analysis Fixes
- Fixed chi2_contingency error by creating proper contingency table with encoded values
- Fixed MCA ValueError by adding 1 to the one-hot encoded data so that all values are strictly positive
- Removed invalid 'mode' parameter from px.line() call

### Dependency Fix
- Added statsmodels to pixi.toml pypi-dependencies to fix ModuleNotFoundError

## Documentation Improvements

### Code Documentation
- Added comprehensive docstrings to all functions
- Added inline comments explaining data transformations
- Added section headers with markdown formatting
- Added explanatory comments for complex operations

### Visualization Improvements
- Added descriptive titles to all charts
- Added axis labels (xaxis_title, yaxis_title) to all plots
- Added hover templates with meaningful labels
- Added legend titles and context
- Improved color schemes and formatting

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
print( df[ col_name ].value_counts())
df = catvar_mapping( col_name , [ "Control" ] , [ "Control" ])
piechart_col( "Packet Type Control" )
if col_name in df.columns:
@KalooIna (Owner) commented Jan 30, 2026

This `if` check is useless because `Packet Type` is an original column of the dataset.

@eugenio (Collaborator, Author)

should we remove this?

col_name = "Packet Type"
"""
@KalooIna (Owner)

STOPPED HERE, MARKUP

@eugenio (Collaborator, Author)

what do you mean? I really don't understand, sorry :'(
