111 changes: 111 additions & 0 deletions .github/workflows/update-sdk-extras.yml
@@ -0,0 +1,111 @@
name: Update SDK Extras Documentation

on:
  repository_dispatch:
    types: [pyproject-updated]
  workflow_dispatch:

jobs:
  update-sdk-extras:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout docs repository
        uses: actions/checkout@v5
        with:
          fetch-depth: 0

      - name: Checkout wandb repository
        uses: actions/checkout@v5
        with:
          repository: wandb/wandb
          path: wandb-repo
          token: ${{ secrets.GITHUB_TOKEN }}

      - name: Set up Python
        uses: actions/setup-python@v6
        with:
          python-version: '3.11'

      - name: Install dependencies
        run: |
          pip install tomli

      - name: Generate timestamp
        id: timestamp
        run: |
          TIMESTAMP=$(date -u +"%Y-%m-%d %H:%M:%S UTC")
          echo "timestamp=${TIMESTAMP}" >> "$GITHUB_OUTPUT"
          echo "Generated at: ${TIMESTAMP}"

      - name: Copy pyproject.toml to scripts directory
        run: |
          # The wandb checkout above lives in wandb-repo/, and the docs repository
          # is the workspace root, so scripts/ is addressed directly.
          echo "Copying wandb-repo/pyproject.toml to scripts/"
          cp wandb-repo/pyproject.toml scripts/pyproject.toml
          echo "File copied successfully"

      - name: Generate SDK Extras Table
        run: |
          echo "Generating SDK extras table from pyproject.toml..."
          cd scripts
          python generate_sdk_extras_table.py
          echo "SDK extras table generated"

      - name: Check for changes
        id: check-changes
        run: |
          if [ -n "$(git status --porcelain)" ]; then
            echo "has_changes=true" >> "$GITHUB_OUTPUT"
            echo "Changes detected in python-sdk-extras.mdx"
            git diff snippets/en/_includes/python-sdk-extras.mdx
          else
            echo "has_changes=false" >> "$GITHUB_OUTPUT"
            echo "No changes detected"
          fi

      - name: Create Pull Request
        if: steps.check-changes.outputs.has_changes == 'true'
        id: create-pr
        uses: peter-evans/create-pull-request@v7
        with:
          token: ${{ secrets.GITHUB_TOKEN }}
          commit-message: "chore: Update SDK extras documentation"
          title: "Update SDK extras documentation from pyproject.toml"
          draft: false
          body: |
            This PR updates the SDK extras documentation based on the latest `wandb/pyproject.toml` configuration.

            **Generated on**: ${{ steps.timestamp.outputs.timestamp }}
            **Source**: `wandb/wandb` repository `pyproject.toml`
            **Triggered by**: Repository dispatch event from wandb/wandb

            ### Changes
            - Synced latest optional dependencies from wandb package
            - Updated SDK extras table in documentation

            ### Review Checklist
            - [ ] Verify all extras are correctly listed
            - [ ] Check that package links are accurate
            - [ ] Confirm excluded extras (models, kubeflow, launch, importers, perf) are intentionally omitted

            ---
            *This PR was automatically generated by the SDK extras update workflow.*
          branch: update-sdk-extras-${{ github.run_number }}
          delete-branch: true
          labels: |
            documentation
            automated
            sdk-extras

      - name: Display Result
        run: |
          if [ "${{ steps.check-changes.outputs.has_changes }}" == "true" ]; then
            if [ -n "${{ steps.create-pr.outputs.pull-request-number }}" ]; then
              PR_URL="https://github.com/${{ github.repository }}/pull/${{ steps.create-pr.outputs.pull-request-number }}"
              echo "PR created successfully!"
              echo "::notice title=Pull Request Created::A PR has been created for SDK extras updates: $PR_URL"
            fi
          else
            echo "SDK extras documentation is up to date"
            echo "::notice title=No Updates Needed::SDK extras documentation is already up to date"
          fi
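
The workflow runs `scripts/generate_sdk_extras_table.py`, which is not part of this diff. The following is a rough sketch of what such a generator could look like, not the script itself. It assumes the extras live under `[project.optional-dependencies]` in `pyproject.toml` and that the table sits between the `python-extras-start` and `python-extras-end` marker comments. The names `package_name`, `render_table`, and `splice_table` are hypothetical. The "Install if you" column is omitted because its text cannot be derived from `pyproject.toml`.

```python
import re

try:
    import tomllib  # standard library in Python 3.11+
except ModuleNotFoundError:  # older interpreters: the workflow installs tomli
    import tomli as tomllib

# Extras the PR checklist says are intentionally left out of the table.
EXCLUDED_EXTRAS = {"models", "kubeflow", "launch", "importers", "perf"}

# Marker comments that wrap the generated table in the MDX source.
START_MARKER = "{/* python-extras-start */}"
END_MARKER = "{/* python-extras-end */}"


def package_name(requirement: str) -> str:
    """Return the distribution name from a requirement such as 'plotly>=5.18.0'."""
    match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", requirement.strip())
    if match is None:
        raise ValueError(f"cannot parse requirement: {requirement!r}")
    return match.group(0)


def render_table(pyproject_text: str) -> str:
    """Build a Markdown table of extras from pyproject.toml content."""
    # Assumption: extras are declared under [project.optional-dependencies].
    extras = tomllib.loads(pyproject_text)["project"]["optional-dependencies"]
    rows = ["| Extra | Packages included |", "|---------|---------|"]
    for extra, requirements in extras.items():
        if extra in EXCLUDED_EXTRAS:
            continue
        links = ", ".join(
            f"[{name}](https://pypi.org/project/{name}/)"
            for name in map(package_name, requirements)
        )
        rows.append(f"| `{extra}` | {links} |")
    return "\n".join(rows)


def splice_table(document: str, table: str) -> str:
    """Replace whatever sits between the two markers with a fresh table."""
    start = document.index(START_MARKER) + len(START_MARKER)
    end = document.index(END_MARKER, start)
    return f"{document[:start]}\n{table}\n{document[end:]}"
```

Keeping the markers in the document makes the rewrite idempotent: running the generator twice on unchanged input produces the same file, so the "Check for changes" step reports no diff.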
203 changes: 196 additions & 7 deletions models/ref/python.mdx
@@ -2,25 +2,28 @@
title: Python SDK 0.23.0
module:
---

The W&B Python SDK, accessible at `wandb`, enables you to train and fine-tune models, and manage models from experimentation to production.

> After performing your training and fine-tuning operations with this SDK, you can use [the Public API](/models/ref/python/public-api) to query and analyze the data that was logged, and [the Reports and Workspaces API](/models/ref/wandb_workspaces) to generate a web-publishable [report](/models/reports/) summarizing your work.

## Installation and setup

## Sign up and create an API key

To authenticate your machine with W&B, you must first generate an API key at https://wandb.ai/authorize.

## Install and import packages

Install the W&B Python SDK using `pip`:

```shell
pip install wandb
```

## Import W&B Python SDK

The following code snippet demonstrates how to import the W&B Python SDK and initialize a run. Replace `<team_entity>` with your team entity name.

```python
import wandb
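# Collapsed by the diff view (@@ -33,4 +36,190 @@). The two assignments below are
# reconstructed from the hunk context and the placeholder named above.
entity = "<team_entity>"
project = "my-awesome-project"

with wandb.init(entity=entity, project=project) as run:
    run.log({"accuracy": 0.9, "loss": 0.1})
```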

## Python SDK extras

Install optional Python extras to extend the functionality of the W&B Python SDK.

Specify the name of the extra in square brackets after `wandb`. In shells that treat square brackets as glob characters, such as zsh, quote the argument (`pip install "wandb[extra]"`). The syntax is:

```shell
pip install wandb[extra]
```

For example, to install W&B with Google Cloud Storage support, run:

```shell
pip install wandb[gcp]
```

Install more than one extra by separating their names with commas:

```shell
pip install wandb[gcp,aws,media]
```

The following table lists Python SDK extras and their suggested use cases.

{/* python-extras-start */}
| Extra | Packages included | Install if you |
|---------|---------|---------|
| `gcp` | [google-cloud-storage](https://pypi.org/project/google-cloud-storage/) | Use `gs://` artifact references. |
| `aws` | [boto3](https://pypi.org/project/boto3/), [botocore](https://pypi.org/project/botocore/) | Use `s3://` artifact references. |
| `azure` | [azure-identity](https://pypi.org/project/azure-identity/), [azure-storage-blob](https://pypi.org/project/azure-storage-blob/) | Use Azure Blob Storage artifact references. |
| `media` | [numpy](https://pypi.org/project/numpy/), [moviepy](https://pypi.org/project/moviepy/), [imageio](https://pypi.org/project/imageio/), [pillow](https://pypi.org/project/pillow/), [bokeh](https://pypi.org/project/bokeh/), [soundfile](https://pypi.org/project/soundfile/), [plotly](https://pypi.org/project/plotly/), [rdkit](https://pypi.org/project/rdkit/) | Log images, video, audio, or plots from raw data (numpy arrays, tensors). |
| `sweeps` | [sweeps](https://pypi.org/project/sweeps/) | Run a local sweep controller (`wandb.controller()`). |
| `workspaces` | [wandb-workspaces](https://pypi.org/project/wandb-workspaces/) | Programmatically manage workspaces. |
{/* python-extras-end */}
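
To see which packages from an extra are already installed in the current environment, you can query installed distribution metadata with the standard library. The following is a minimal sketch, not part of the W&B SDK. The `EXTRA_PACKAGES` mapping and the `missing_packages` helper are hypothetical names, and the package lists are copied from the table above without version pins. The `media` extra is left out for brevity.

```python
from importlib import metadata

# Distribution names copied from the table above; version pins are omitted.
EXTRA_PACKAGES = {
    "gcp": ["google-cloud-storage"],
    "aws": ["boto3", "botocore"],
    "azure": ["azure-identity", "azure-storage-blob"],
    "sweeps": ["sweeps"],
    "workspaces": ["wandb-workspaces"],
}


def missing_packages(extra: str, packages: dict[str, list[str]] = EXTRA_PACKAGES) -> list[str]:
    """Return the packages of an extra that are not installed in this environment."""
    missing = []
    for name in packages[extra]:
        try:
            metadata.version(name)  # raises if the distribution is absent
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    for extra in EXTRA_PACKAGES:
        absent = missing_packages(extra)
        status = "ok" if not absent else f"missing {', '.join(absent)}"
        print(f"wandb[{extra}]: {status}")
```

This check only confirms that a distribution is present. It does not verify that the installed version satisfies the minimum version required by the extra.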

The following tabs provide more details about each extra, including installation instructions, dependencies, and code examples.

<Tabs>
<Tab title="GCP">

Use the `gcp` extra if you add artifact references that use `gs://` URIs.

Installation:

```shell
pip install wandb[gcp]
```

Dependencies:

- `google-cloud-storage`

Example:

```python
import wandb

artifact = wandb.Artifact("my-artifact", type="dataset")
artifact.add_reference("gs://bucket/path/to/file")

with wandb.init() as run:
    run.log_artifact(artifact)
```

</Tab>
<Tab title="AWS">

Use the `aws` extra if you add artifact references that use `s3://` URIs.

Installation:

```shell
pip install wandb[aws]
```

Dependencies:
- `boto3`
- `botocore>=1.5.76`

Example:

```python
import wandb

artifact = wandb.Artifact("my-artifact", type="dataset")
artifact.add_reference("s3://bucket/path/to/file")

with wandb.init() as run:
    run.log_artifact(artifact)
```

</Tab>
<Tab title="Azure">

Use the `azure` extra if you add artifact references to Azure Blob Storage (URLs whose host ends in `.blob.core.windows.net`).

Installation:

```shell
pip install wandb[azure]
```

Dependencies:
- `azure-identity`
- `azure-storage-blob`

Example:

```python
import wandb

artifact = wandb.Artifact("my-artifact", type="dataset")
artifact.add_reference("https://<account>.blob.core.windows.net/container/blob")

with wandb.init() as run:
    run.log_artifact(artifact)
```

</Tab>
<Tab title="Media">

Use the `media` extra if you log images, video, audio, or plots from raw data such as NumPy arrays or tensors.

Installation:

```shell
pip install wandb[media]
```

Dependencies:
- `numpy`
- `moviepy>=1.0.0`
- `imageio>=2.28.1`
- `pillow`
- `bokeh`
- `soundfile`
- `plotly>=5.18.0`
- `rdkit`

The following table lists the dependencies included in the `media` extra and their use cases:

| Dependency | Use Case |
|------------------|-------------------------------------------------------------------------------------------------------|
| pillow | Logging images with `wandb.Image()` with numpy arrays or tensors (wandb/sdk/data_types/image.py) |
| moviepy, imageio | Logging videos with `wandb.Video()` with numpy arrays (wandb/sdk/data_types/video.py) |
| soundfile | Logging audio with `wandb.Audio()` with raw numpy data (wandb/sdk/data_types/audio.py) |
| plotly | Logging interactive charts with `wandb.Plotly()` (wandb/sdk/data_types/plotly.py) |
| bokeh | Logging Bokeh plots with `wandb.Bokeh()` (wandb/sdk/data_types/bokeh.py) |
| rdkit | Logging molecular structures with `wandb.Molecule.from_rdkit()` (wandb/sdk/data_types/molecule.py) |
</Tab>
<Tab title="Sweeps">

Use the `sweeps` extra if you run a local sweep controller with `wandb.controller()`.

Installation:

```shell
pip install wandb[sweeps]
```

Dependencies:
- `sweeps>=0.2.0`

<Note>
The cloud-based sweep agent (`wandb agent`) does not require this extra.
</Note>

</Tab>
<Tab title="Workspaces">

Use the `workspaces` extra if you programmatically create or edit W&B workspaces with `wandb.apis.workspaces`.

Installation:

```shell
pip install wandb[workspaces]
```

Dependencies:
- `wandb-workspaces`



</Tab>
</Tabs>
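
The storage tabs above tie each extra to a reference URI format. The following sketch expresses that mapping as a small helper, assuming only the three formats described in the tabs. `extra_for_reference` is a hypothetical name for illustration; it is not part of the W&B SDK and does not reflect how the SDK selects a storage handler internally.

```python
from typing import Optional
from urllib.parse import urlparse


def extra_for_reference(uri: str) -> Optional[str]:
    """Guess which extra an artifact reference URI needs, or None if no extra applies."""
    parsed = urlparse(uri)
    if parsed.scheme == "gs":
        return "gcp"
    if parsed.scheme == "s3":
        return "aws"
    # Azure Blob Storage references are plain HTTP(S) URLs on a known host suffix.
    host = parsed.hostname or ""
    if parsed.scheme in ("http", "https") and host.endswith(".blob.core.windows.net"):
        return "azure"
    return None


if __name__ == "__main__":
    for uri in (
        "gs://bucket/path/to/file",
        "s3://bucket/path/to/file",
        "https://myaccount.blob.core.windows.net/container/blob",
    ):
        print(f"{uri} -> wandb[{extra_for_reference(uri)}]")
```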