Documentation

Developer documentation for the Open Paws AI ecosystem: vector database, prediction models, generative language models, workflow automations, and HuggingFace datasets. This is the central reference for anyone building AI tools for animal advocacy using Open Paws infrastructure.

Quick Start

No build step -- this is a pure documentation repo. Browse by topic:

Vector database (Weaviate): Knowledge/README.md
Prediction models: Predictions/README.md
Language models (8B): Generation/README.md
n8n workflow automations: Automation/README.md

Architecture

Knowledge/       Weaviate vector-graph database docs (connection, search, RAG, schema)
Predictions/     HuggingFace text regression models (performance + preference prediction)
Generation/      8B language models (Llama 3.1 base, continual pre-training + instruct)
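Automation/      n8n workflow templates for advocacy automation
Infrastructure/  Clean-room agent architecture reference (runtime patterns, tool registry, operator controls)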
.github/         Dependabot config + CI workflows

File Descriptions

Knowledge (Weaviate Vector Database)

| File | Purpose |
| --- | --- |
| `Knowledge/README.md` | Weaviate connection details, search operations, RAG patterns, and Content schema. Primary reference for querying the Open Paws vector-graph database |

Predictions (HuggingFace Models)

| File | Purpose |
| --- | --- |
| `Predictions/README.md` | HuggingFace text regression model usage: performance and preference prediction, batch processing patterns, score clipping |
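
The batch processing and score clipping patterns mentioned above can be sketched roughly as follows. This is a minimal illustration, not the code from `Predictions/README.md`: the helper names `batched` and `clip_scores` are hypothetical, and the default score range of 0.0 to 1.0 is an assumption, so check the model card for the actual bounds.

```python
from typing import Iterable, Iterator, List


def batched(items: List[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` holding at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def clip_scores(scores: Iterable[float], low: float = 0.0, high: float = 1.0) -> List[float]:
    """Clamp each raw regression output into the closed interval [low, high].

    The default bounds are an assumption made for illustration only.
    """
    return [min(max(score, low), high) for score in scores]


if __name__ == "__main__":
    texts = ["post one", "post two", "post three", "post four", "post five"]
    # Batches would be passed to the regression model one at a time.
    for batch in batched(texts, batch_size=2):
        print(len(batch), batch)
    # Regression heads can overshoot the valid range, so clamp before reporting.
    print(clip_scores([-0.2, 0.5, 1.3]))
```

Clipping after prediction keeps downstream consumers from having to handle out-of-range values, and fixed-size batches keep memory use predictable on large inputs.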

Generation (Language Models)

| File | Purpose |
| --- | --- |
| `Generation/README.md` | 8B language model usage: Llama 3.1 base with continual pre-training and instruct tuning, generation parameters, known limitations |

Automation (n8n Workflows)

| File | Purpose |
| --- | --- |
| `Automation/README.md` | n8n workflow automation: hosting options (cloud/self-hosted), workflow import/export, activation, advocacy-specific workflow templates |

Infrastructure

| File | Purpose |
| --- | --- |
| `Infrastructure/README.md` | Clean-room agent architecture reference: shared runtime patterns, tool registry, operator controls, and safety boundaries across Open Paws repos. Canonical source for the 2026-04-01 clean-room reuse decision |

Root

| File | Purpose |
| --- | --- |
| `README.md` | Human-facing overview: quick start, architecture summary, HuggingFace dataset links |
| `CONTRIBUTING.md` | Contribution guidelines for documentation PRs |
| `.gitleaksignore` | Secret scanning exclusions. Covers read-only Weaviate API keys that appear in code examples |

External Dependencies

| Service | Purpose | Docs |
| --- | --- | --- |
| Weaviate Cloud | Vector-graph database (read-only access) | weaviate.io |
| HuggingFace | Model hosting + datasets | huggingface.co/open-paws |
| OpenAI API | Embeddings for Weaviate search | platform.openai.com |
| OpenRouter | LLM routing (used by downstream tools) | openrouter.ai |
| n8n | Workflow automation platform | docs.n8n.io |

HuggingFace Datasets

Development

Adding documentation: Create a new directory with a README.md following the existing pattern
Code examples: Python with transformers or weaviate client libraries -- keep examples copy-pasteable
Style: Each section should be self-contained with connection details, code samples, and best practices

Organizational Context

Layer: 1 | Lever: Strengthen | Integration: Reference material for platform and ecosystem

This repo documents the AI infrastructure layer that Open Paws tools are built on. It is a reference for Guild developers, bootcamp students, and coalition partners building on the platform.

Settled decisions affecting this repo:

2026-04-01: Clean-room agent architecture. The documentation repo owns the shared infrastructure note for the clean-room reuse decision. PR #7 is merged; as of 2026-04-09, three of the four rollout PRs are merged (PCC#13, platform#42, docs#7). Remaining: the Tools-Platform#1 repo name needs verification before the rollout is marked complete. See closed-decisions.md, entry 2026-04-01.

Current status (as of 2026-04-09): Active reference. Clean-room architecture PR #7 merged. The shared infrastructure note is live. Tools-Platform#1 is the outstanding item in the 4-repo rollout.

Development Standards

10-Point Review Checklist (ranked by AI violation frequency)

1. DRY: AI clones code at 4x the human rate. Search before writing anything new
2. Deep modules: Reject shallow wrappers and pass-through methods
3. Single responsibility: Each function does one thing at one level of abstraction
4. Error handling: Never use catch-all exception handlers; catch specific errors
5. Information hiding: Don't expose internal state. Mask API keys (show the last 4 characters only)
6. Ubiquitous language: Use movement terminology consistently
7. Design for change: Use abstraction layers and loose coupling
8. Legacy velocity: Write characterization tests before modifying existing code
9. Over-patterning: Use the simplest structure that works
10. Test quality: Every test must fail when the covered behavior breaks
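
As an illustration of the key-masking rule in the checklist above, here is a minimal sketch. The function name `mask_api_key` and its `visible` parameter are hypothetical and not part of any Open Paws library.

```python
def mask_api_key(key: str, visible: int = 4) -> str:
    """Return `key` with all but the last `visible` characters replaced by '*'."""
    if visible < 0:
        raise ValueError("visible must be non-negative")
    if len(key) <= visible:
        # Too short to reveal anything safely: mask the whole value.
        return "*" * len(key)
    hidden = len(key) - visible
    # Slice from an explicit index so that visible=0 reveals nothing.
    return "*" * hidden + key[hidden:]


if __name__ == "__main__":
    # A made-up value, not a real credential.
    print(mask_api_key("example-key-abcd1234"))
```

Masking at the point of logging or display, rather than at storage, keeps the full key usable by the client while keeping it out of logs and screenshots.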

Quality Gates

Desloppify: desloppify scan --path . (score must be ≥ 85)
Speciesist language: semgrep --config semgrep-no-animal-violence.yaml on all docs edits
Two-failure rule: After two failed fixes on the same problem, stop and restart

Seven Concerns — Critical for This Repo

All seven concerns apply. The ones most critical for this repo:

Privacy (critical) — API keys documented here are read-only Weaviate keys. .gitleaksignore covers these. Never commit write-access keys. Check .gitleaksignore is current before adding new examples.
Security — Code examples must use environment variables for any keys. Never hardcode credentials.
Advocacy domain — All documentation must use movement terminology. Examples should reference farmed animals and factory farms, not industry euphemisms.
Accessibility — Documentation must work for developers on low-bandwidth connections. Avoid large embedded images.
Emotional safety — If documentation examples include advocacy content (animal welfare data, investigation statistics), apply content warnings.
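
The environment-variable rule under Security above could look roughly like this in a documentation example. It is a sketch under assumptions: `require_env`, `MissingCredentialError`, and the variable name `WEAVIATE_API_KEY` are hypothetical names chosen for illustration, not names this repo defines.

```python
import os


class MissingCredentialError(RuntimeError):
    """Raised when a required credential is not set in the environment."""


def require_env(name: str) -> str:
    """Return the value of environment variable `name`, or fail with a clear message."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise MissingCredentialError(
            f"Set the {name} environment variable before running this example"
        )
    return value


if __name__ == "__main__":
    try:
        # Hypothetical variable name; use whatever the relevant README specifies.
        api_key = require_env("WEAVIATE_API_KEY")
    except MissingCredentialError as error:
        # Catch the specific error only, so unrelated failures still surface.
        print(error)
    else:
        print("Credential loaded; never print the key itself.")
```

Failing early with a named exception tells the reader exactly which variable is missing, and it avoids both hardcoded credentials and catch-all error handling.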

Advocacy Domain Language

See CONTRIBUTING.md for the full list of required movement terminology and speciesist idioms to avoid.

Structured Coding Reference

For tool-specific AI coding instructions (Claude Code rules, Cursor MDC, Copilot, Windsurf, etc.), copy the corresponding directory from structured-coding-with-ai into this project root.

Decisions Reviewed

Last reviewed: 2026-04-11
