ML Analytics Tools

Utilities for common analytics and machine learning workflows: Redshift, S3, Google Sheets, Slack, MLflow, model evaluation, and SQL pipelines.

The package is intentionally infrastructure-neutral. Buckets, credentials, MLflow hosts, and tokens are provided by your environment or by explicit arguments.

What Is Included

DataConnector: run Redshift SQL, load SQL files, unload/load data through S3, and create Redshift tables from DataFrames.
S3Connector: read, write, list, and delete S3 objects, and query S3 data with DuckDB.
GSheet: read, write, share, and export Google Sheets data.
SlackConnector: send messages, upload files, and manage simple Slack interactions.
ModelManager: create MLflow experiments, log models, register versions, manage aliases, and handle permissions.
model_tools: classification, regression, survival analysis, CatBoost helpers, plotting, and reporting utilities.
utils: project-root discovery, SQL file loading, logging, credentials, and YAML SQL pipelines.

Install

From PyPI, after a release is available:

uv add ml-analytics-tools

Directly from GitHub:

uv add git+https://github.com/sdaza/ml-analytics-tools

For local development:

uv sync --all-groups

Configuration

The package loads a .env file from the project root when it is imported. Set only the variables for the services you use.

# Redshift
BI_REDSHIFT_HOST=redshift-cluster.example.com
BI_REDSHIFT_DB=analytics
BI_REDSHIFT_USER=analytics_user
BI_REDSHIFT_PASSWORD=secret
BI_REDSHIFT_PORT=5439

# S3
ML_ANALYTICS_S3_BUCKET=my-analytics-bucket

# MLflow
MLFLOW_TRACKING_URI=https://mlflow.example.com
MLFLOW_TRACKING_USERNAME=user@example.com
MLFLOW_TRACKING_PASSWORD=secret

# Google Sheets
GSHEET_SPREADSHEET_ID=optional-default-sheet-id
GOOGLE_CREDENTIALS='{"type":"service_account", ...}'

# Slack
SLACK_BOT_TOKEN=xoxb-your-token

S3 buckets are never hard-coded. Pass bucket=... or s3_bucket=..., or set ML_ANALYTICS_S3_BUCKET.
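
The lookup order described above can be sketched as follows. This is a minimal illustration, not the package's code: `resolve_bucket` is a hypothetical helper, and the assumption that an explicit argument takes precedence over the environment variable is inferred from the wording above rather than confirmed.

```python
import os

ENV_VAR = "ML_ANALYTICS_S3_BUCKET"  # variable name taken from the configuration above


def resolve_bucket(bucket=None):
    """Hypothetical helper: explicit argument first, then the environment."""
    if bucket:
        return bucket
    env_bucket = os.environ.get(ENV_VAR)
    if env_bucket:
        return env_bucket
    # Nothing is hard-coded, so fail loudly instead of guessing a default.
    raise ValueError(f"No S3 bucket configured: pass bucket=... or set {ENV_VAR}")


os.environ[ENV_VAR] = "my-analytics-bucket"
print(resolve_bucket())                  # falls back to the environment
print(resolve_bucket("another-bucket"))  # explicit argument is used as given
```

Raising an error when neither source is set keeps the behaviour consistent with the "never hard-coded" rule.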

AWS Authentication

Use the CLI helper for AWS SSO:

ml-analytics-auth

You can also call it from Python:

from ml_analytics import ensure_aws_authenticated

ensure_aws_authenticated()

See the AWS Authentication and CLI Commands guides for details.

Quick Examples

Query Redshift

from ml_analytics import DataConnector

dc = DataConnector()

df = dc.sql("SELECT * FROM analytics.customer_features LIMIT 100")
df_polars = dc.sql("queries/features.sql", format="polars", country="es")
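
The second call passes a SQL file path plus a keyword argument (`country="es"`). The placeholder syntax used inside the SQL file is not shown here, so the sketch below only illustrates one plausible approach: it assumes `{name}`-style placeholders filled with `str.format`, and `render_sql` is a hypothetical helper, not a function from the package.

```python
import tempfile
from pathlib import Path


def render_sql(path, **params):
    """Hypothetical helper: read a SQL file and fill {name} placeholders."""
    template = Path(path).read_text(encoding="utf-8")
    return template.format(**params)


# Write a throwaway SQL file so the sketch runs on its own.
with tempfile.TemporaryDirectory() as tmp:
    sql_file = Path(tmp) / "features.sql"
    sql_file.write_text(
        "SELECT * FROM analytics.customer_features WHERE country = '{country}'",
        encoding="utf-8",
    )
    query = render_sql(sql_file, country="es")

print(query)
```

Plain string substitution like this is only appropriate for trusted values such as fixed country codes; it does not escape user input.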

Create A Redshift Table From A DataFrame

dc.create_table_from_dataframe(
    df,
    table="model_scores",
    schema="analytics",
    drop_existing_table=True,
)

Work With S3

from ml_analytics import S3Connector

s3 = S3Connector(bucket="my-analytics-bucket", s3_root="projects/churn")

s3.save_dataframe(df, directory="outputs", file_name="scores")

summary = s3.query(
    """
    SELECT segment, count(*) AS rows
    FROM read_parquet('s3://my-analytics-bucket/projects/churn/outputs/*.parquet')
    GROUP BY segment
    """
)
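
The glob in the query above follows the pattern bucket / s3_root / directory / *.parquet. A small sketch of building that path from its parts, so it does not have to be typed by hand, might look like this. `s3_glob` is a hypothetical helper, and the assumption that `save_dataframe` writes Parquet files under `s3_root/directory` is inferred from the example, not confirmed.

```python
import posixpath


def s3_glob(bucket, s3_root, directory, pattern="*.parquet"):
    """Hypothetical helper: build an s3:// glob from the connector's path parts."""
    # posixpath keeps forward slashes regardless of the local operating system.
    key = posixpath.join(s3_root.strip("/"), directory.strip("/"), pattern)
    return f"s3://{bucket}/{key}"


uri = s3_glob("my-analytics-bucket", "projects/churn", "outputs")

# The resulting URI can be dropped into the DuckDB query text.
sql = f"""
SELECT segment, count(*) AS rows
FROM read_parquet('{uri}')
GROUP BY segment
"""
print(uri)
```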

Read And Write Google Sheets

from ml_analytics import GSheet

gsheet = GSheet(credentials_path="gsheet_credentials.json")

df = gsheet.read_sheet(spreadsheet_id="...", sheet_name="Input")
gsheet.write_sheet(df, spreadsheet_id="...", sheet_name="Results")

Log To MLflow

from ml_analytics import ModelManager

manager = ModelManager(model_name="churn-model", user="user@example.com")

manager.start_run("training")
manager.log_metric("auc", 0.91)
manager.end_run()
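
If training raises between `start_run` and `end_run`, the run may be left open. One way to guard against that is a small context manager around the three calls shown above. This is a sketch under assumptions: `managed_run` is a hypothetical wrapper, and `FakeManager` is a stand-in used only so the example runs without an MLflow server.

```python
from contextlib import contextmanager


@contextmanager
def managed_run(manager, run_name):
    """Hypothetical wrapper: start a run and always end it, even on errors."""
    manager.start_run(run_name)
    try:
        yield manager
    finally:
        manager.end_run()


class FakeManager:
    """Stand-in that records calls; not part of the package."""

    def __init__(self):
        self.calls = []

    def start_run(self, name):
        self.calls.append(("start_run", name))

    def log_metric(self, key, value):
        self.calls.append(("log_metric", key, value))

    def end_run(self):
        self.calls.append(("end_run",))


fake = FakeManager()
with managed_run(fake, "training") as run:
    run.log_metric("auc", 0.91)
print(fake.calls)
```

With a real `ModelManager`, the same wrapper would be used as `with managed_run(manager, "training"): ...`, assuming its methods accept the arguments shown in the example above.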

Send A Slack Message

from ml_analytics import SlackConnector

slack = SlackConnector()
slack.send_message(channel="#ml-alerts", text="Training finished")

Detailed Guides

Guide	Use It For
AWS Authentication	AWS SSO setup and Python helpers
CLI Commands	Available console commands
Google Sheets	Sheets setup, sharing, exports, and examples
Slack	Slack token setup and message/file examples
Tunnel Manager	SSH tunnel configuration and CLI usage

Development

Run the standard checks before opening a PR:

uv run ruff check
uv run pytest

CI runs Ruff and pytest on Python 3.11 and 3.12.

Releases

This repository uses Release Please. Conventional commits on main create or update a release PR with the next version and changelog. When that PR is merged, the release workflow builds the package and publishes it to PyPI through Trusted Publishing using the pypi GitHub environment.
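
As a rough illustration of how conventional commits map to version bumps, the sketch below classifies a commit message using the common convention: `fix` gives a patch bump, `feat` a minor bump, and a `!` marker or a `BREAKING CHANGE:` footer a major bump. It is a simplified approximation, not Release Please's implementation; the real tool's behaviour also depends on its configuration, and `bump_for` is a hypothetical name.

```python
import re

# Header shape: type, optional (scope), optional "!", then ": description".
HEADER = re.compile(r"^(?P<type>[a-z]+)(\([^)]*\))?(?P<breaking>!)?: .+")


def bump_for(message):
    """Return 'major', 'minor', 'patch', or None for a commit message."""
    lines = message.splitlines()
    match = HEADER.match(lines[0]) if lines else None
    if match is None:
        return None  # not a conventional commit
    footers = lines[1:]
    breaking = any(
        line.startswith(("BREAKING CHANGE:", "BREAKING-CHANGE:")) for line in footers
    )
    if match["breaking"] or breaking:
        return "major"
    return {"feat": "minor", "fix": "patch"}.get(match["type"])


for msg in ["feat: add S3 listing", "fix(s3): handle empty prefix", "docs: update guide"]:
    print(msg, "->", bump_for(msg))
```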

Contributing

Keep changes small, covered by tests when behavior changes, and free of environment-specific defaults. Prefer explicit configuration over hidden infrastructure assumptions.
