Thank you for your interest in contributing! This document provides guidelines and instructions for contributing.
Please be respectful and constructive in all interactions.
git clone https://github.com/YOUR_USERNAME/thordata-python-sdk.git
cd thordata-python-sdk

# Create virtual environment
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
# Install with dev dependencies
pip install -e ".[dev]"

git checkout -b feature/your-feature-name
# or
git checkout -b fix/your-bug-fix

We use the following tools to maintain code quality:
# Format code with Black
black src tests
# Lint with Ruff
ruff check src tests --fix
# Type check with MyPy
mypy src

# Run all tests
pytest
# Run with coverage (recommended: use coverage CLI for reliable results)
python -m coverage run -m pytest -p no:cov -v tests
python -m coverage report -m
# Optional: HTML report
python -m coverage run -m pytest -p no:cov tests && python -m coverage html
# From repo root with script (Git Bash / Linux / macOS)
bash scripts/run_coverage.sh
# Run specific test
pytest tests/test_client.py -v -k test_serp

Before committing, ensure:
- ✅ All tests pass
- ✅ Code is formatted (black)
- ✅ No linting errors (ruff)
- ✅ Type checks pass (mypy)
- Update documentation if you've changed APIs
- Add tests for new functionality
- Update CHANGELOG.md with your changes
- Ensure CI passes on your PR
- Request review from maintainers

Commit message prefixes:

- feat: Add new feature
- fix: Fix bug description
- docs: Update documentation
- refactor: Refactor code
- test: Add tests
- chore: Update dependencies

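
The prefixes listed above can be checked mechanically before committing. The following is a minimal sketch, not part of the repository: `ALLOWED_PREFIXES` and `has_valid_prefix` are hypothetical names, and the rule that a non-empty summary must follow the colon is an assumption.

```python
# Hypothetical helper; the prefix list mirrors the commit types named in this guide.
ALLOWED_PREFIXES = ("feat", "fix", "docs", "refactor", "test", "chore")


def has_valid_prefix(message):
    """Return True if the first line looks like '<prefix>: <summary>'."""
    lines = message.splitlines()
    first_line = lines[0] if lines else ""
    prefix, separator, summary = first_line.partition(":")
    # Assumption: a non-empty summary must follow the colon.
    return bool(separator) and prefix in ALLOWED_PREFIXES and bool(summary.strip())


if __name__ == "__main__":
    for candidate in ("feat: Add new feature", "Add new feature", "fix:"):
        print(repr(candidate), has_valid_prefix(candidate))
```

A check like this could be wired into a local Git hook, though nothing in this guide requires one.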
src/thordata/
├── __init__.py # Public API exports
├── client.py # Sync client (main)
├── async_client.py # Async client
├── unlimited.py # Sync Unlimited Proxy namespace
├── async_unlimited.py # Async Unlimited Proxy namespace
├── models.py # Re-exports (ProxyConfig, etc.)
├── enums.py # Engine, TaskStatus, ProxyType, etc.
├── exceptions.py # Exception hierarchy
├── retry.py # Retry decorator and RetryConfig
├── serp_engines.py # SERP namespace (sync/async)
├── _utils.py # Internal: auth headers, parse_json, etc.
├── core/
│   ├── http_client.py # Sync HTTP session + retry
│   ├── async_http_client.py
│   └── tunnel.py # Proxy tunneling (HTTP/HTTPS/SOCKS5)
├── types/ # Request/response types (SerpRequest, ProxyConfig, etc.)
│   ├── common.py, proxy.py, serp.py, task.py, universal.py
└── tools/ # Pre-built scrapers (Amazon, YouTube, etc.)
    ├── base.py, code.py, ecommerce.py, search.py, social.py, video.py

- Write tests for all new features
- Use pytest fixtures for common setups
- Mock external API calls for unit tests
- Aim for >80% coverage on new code
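
As an illustration of mocking external calls, here is a minimal sketch that uses only the standard library's `unittest.mock`. The function under test, `fetch_titles`, and the helper `make_fake_session` are hypothetical and are not part of the SDK; the `organic` key is borrowed from the docstring example in this guide, and the URL is a placeholder.

```python
from unittest.mock import Mock


def fetch_titles(session, url):
    """Hypothetical function under test: return titles from a JSON search response."""
    response = session.get(url, timeout=10)
    response.raise_for_status()
    return [item["title"] for item in response.json().get("organic", [])]


def make_fake_session(payload):
    """Build a stand-in session whose get() returns a canned JSON payload."""
    response = Mock()
    response.json.return_value = payload
    response.raise_for_status.return_value = None
    session = Mock()
    session.get.return_value = response
    return session


def test_fetch_titles_uses_mocked_session():
    session = make_fake_session({"organic": [{"title": "A"}, {"title": "B"}]})
    assert fetch_titles(session, "https://example.invalid/search") == ["A", "B"]
    # The fake records its calls, so the test can also check how it was used.
    session.get.assert_called_once_with("https://example.invalid/search", timeout=10)
```

Because the session is passed in as an argument, the test never touches the network. In a real test module, `make_fake_session` would be a natural candidate for a pytest fixture.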

By default, pytest runs only unit tests. No .env file or Clash proxy is needed.
pytest
# or with coverage
python -m coverage run -m pytest -p no:cov -v tests
python -m coverage report -m

Integration tests are skipped unless explicitly enabled. They require a real .env with credentials.

| Env | Meaning |
|---|---|
| `THORDATA_INTEGRATION=true` | Enable integration tests (e.g. proxy protocol test) |
| `THORDATA_INTEGRATION_STRICT=true` | Fail on any proxy error; if unset, skip on likely local interference |
| `THORDATA_INTEGRATION_HTTP=true` | Include HTTP (in addition to HTTPS/SOCKS5h) in proxy test |
| `THORDATA_UPSTREAM_PROXY` | Optional. Set if you are behind the GFW or a corporate proxy (e.g. Clash Verge `http://127.0.0.1:7897`, or Clash `http://127.0.0.1:7890`). Unit tests do not use this. |

When to use Clash/upstream proxy
- Unit tests: No. They mock HTTP; no proxy needed.
- Integration tests (e.g. `test_integration_proxy_protocols.py`): Only if your network blocks Thordata. Set `THORDATA_UPSTREAM_PROXY=http://127.0.0.1:7897` (Clash Verge) or `http://127.0.0.1:7890` (Clash) in `.env` and run with `THORDATA_INTEGRATION=true`.
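
The way these variables interact can be sketched as a small decision function. This is an illustration only and does not show how the repository's tests read their configuration: `env_flag` and `integration_mode` are hypothetical names, and treating only a case-insensitive `true` as enabled is an assumption.

```python
import os


def env_flag(name, environ=None):
    """Return True when the variable is set to 'true' (case-insensitive; an assumption)."""
    environ = os.environ if environ is None else environ
    return environ.get(name, "").strip().lower() == "true"


def integration_mode(environ=None):
    """Map the documented variables to 'skip', 'lenient', or 'strict'."""
    if not env_flag("THORDATA_INTEGRATION", environ):
        return "skip"  # integration tests are not enabled
    if env_flag("THORDATA_INTEGRATION_STRICT", environ):
        return "strict"  # fail on any proxy error
    return "lenient"  # skip on likely local interference


if __name__ == "__main__":
    print(integration_mode())
```

Passing a plain dictionary as `environ` makes the logic easy to unit test without touching the real environment.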
# Run only unit tests (default)
pytest -m "not integration"
# Run integration tests (requires .env + THORDATA_INTEGRATION=true)
THORDATA_INTEGRATION=true pytest -m integration

import pytest
from thordata import ThordataClient, ProxyConfig


def test_proxy_config_validation():
    """Test that ProxyConfig validates parameters correctly."""
    with pytest.raises(ValueError, match="session_duration"):
        ProxyConfig(
            username="user",
            password="pass",
            session_duration=100,  # Invalid: max is 90
        )

- Update docstrings for API changes
- Follow Google-style docstrings
- Include examples in docstrings

def serp_search(
    self,
    query: str,
    *,
    engine: str = "google",
) -> dict[str, Any]:
    """
    Execute a SERP search.

    Args:
        query: The search keywords.
        engine: Search engine to use.

    Returns:
        Parsed JSON results from the search.

    Raises:
        ThordataAuthError: If authentication fails.
        ThordataRateLimitError: If rate limited.

    Example:
        >>> results = client.serp_search("python tutorial")
        >>> print(len(results.get("organic", [])))
    """

- Open an issue for bugs or feature requests
- Email support@thordata.com for other questions
Thank you for contributing! 🎉