Welcome to the Team!

Glad to have you onboard for our predictive modeling project! While you get familiar, here are a few things I'd like you all to brush up on:

1. Linear Regression

For this project, I plan on starting with Linear Regression as it is a fundamental part of machine learning and also a lot easier to grasp and understand.

Resource: Codecademy - Simple Linear Regression Course (free!)

2. Git/GitHub

I can't stress this one enough. Understanding how to use Git is absolutely fundamental — so much that I actually want to take the time during our next standup to show you all the workflow we're looking for.

IF YOU'RE UNFAMILIAR WITH GIT: Please check out this resource, it's a great interactive tool that will teach you what "git" is and how crucial it is in our project - Learn git branching

General workflow:

create feature branch → make edits → pull from main → open pull request → merge → repeat

Don't forget to pull from main regularly to keep up to date

Important: Commit Message Standards

Please follow this format for all commit messages:

[COMMIT TYPE]: [COMMIT MESSAGE]

Commit types:

feat - New features
fix - Bug fixes
doc - Documentation changes
refactor - Code refactoring
test - Test additions or modifications

Example: feat: add data preprocessing pipeline

3. Python

Pretty obvious and not too hard. There's lots of great resources out there, but this one is a personal favorite of mine:

Resource: futurecoder.io

4. CI/CD and Code Quality Tools

We have automated code quality checks that run on every pull request. Here's what you need to know:

What Gets Checked Automatically

Every time you open a PR, GitHub Actions will automatically run:

Ruff format - Ensures code formatting consistency
Ruff lint - Checks code quality and style
mypy - Type checking for better code reliability
bandit - Security scanning
pytest - Runs all tests

Setting Up Your Local Environment

Install the development tools:

pip install ruff mypy pytest bandit pre-commit

Install pre-commit hooks (highly recommended - runs checks before each commit):

pre-commit install

Running Checks Locally

Before pushing your code, you can run these commands to catch issues early:

ruff format .          # Auto-format your code
ruff check --fix .     # Lint and auto-fix issues
mypy .                 # Check types
pytest                 # Run tests

Or run all pre-commit hooks manually:

pre-commit run --all-files

Writing Tests

All tests go in the tests/ directory. Example test structure:

def test_data_preprocessing():
    """Test that preprocessing removes null values."""
    raw_data = load_sample_data()
    processed = preprocess(raw_data)
    assert not processed.isnull().any().any()

For ML projects, focus on testing:

Data preprocessing functions
Model input/output validation
Feature engineering logic
Edge cases and data quality

This is to be updated...

5. What's Next?

TBD!

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Welcome to the Team!

1. Linear Regression

2. Git/GitHub

3. Python

4. CI/CD and Code Quality Tools

What Gets Checked Automatically

Setting Up Your Local Environment

Running Checks Locally

Writing Tests

5. What's Next?

FilesExpand file tree

README.md

Latest commit

History

README.md

File metadata and controls

Welcome to the Team!

1. Linear Regression

2. Git/GitHub

3. Python

4. CI/CD and Code Quality Tools

What Gets Checked Automatically

Setting Up Your Local Environment

Running Checks Locally

Writing Tests

5. What's Next?