Merged
62 changes: 52 additions & 10 deletions AGENTS.md
@@ -9,7 +9,7 @@ It should track the code in `main.py`, not stale assumptions from earlier iterat

- This is an OpenCTI external-import connector for Double Extortion Platform (DEP) announcements.
- The connector authenticates against DEP AWS Cognito, fetches announcement records from the DEP REST API, converts them to STIX 2.1, and sends bundles to OpenCTI with `update=True`.
- The connector scope is `incident,identity,indicator`.
- The connector scope is `report,incident,identity,indicator`.
- The implementation is concentrated in a single runtime file: `main.py`.

## Runtime and configuration truths
@@ -54,7 +54,7 @@ It should track the code in `main.py`, not stale assumptions from earlier iterat
- `sector`, `actor`, and `country` are whitespace-normalized; empty strings, `n/a`, and `none` become `None`.
- Indicator domain extraction prefers `victimDomain`, then falls back to `site`.
- Domain normalization uses `urlsplit`, extracts the hostname, and lowercases it.
- `annDescription` is URL-decoded with `urllib.parse.unquote` before the incident is created.
- `annDescription` is URL-decoded with `urllib.parse.unquote` before the report or incident is created.
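The normalization rules above can be sketched as follows. This is a minimal illustration, not the code in `main.py`: the helper names are hypothetical, and two behaviors are assumptions of the sketch (the `n/a`/`none` comparison is case-insensitive, and bare domains are given a `//` prefix so that `urlsplit` can find a hostname).

```python
import re
from urllib.parse import unquote, urlsplit

# Markers treated as "no value"; case-insensitive matching is an assumption.
_EMPTY_MARKERS = {"", "n/a", "none"}


def normalize_text(value):
    """Collapse runs of whitespace; map empty, 'n/a', and 'none' to None."""
    if value is None:
        return None
    cleaned = re.sub(r"\s+", " ", str(value)).strip()
    return None if cleaned.lower() in _EMPTY_MARKERS else cleaned


def normalize_domain(value):
    """Extract a lowercased hostname from a URL or a bare domain."""
    cleaned = normalize_text(value)
    if cleaned is None:
        return None
    # urlsplit only populates hostname when a scheme or '//' is present,
    # so bare domains are prefixed with '//' (assumption of this sketch).
    candidate = cleaned if "//" in cleaned else f"//{cleaned}"
    host = urlsplit(candidate).hostname
    return host.lower() if host else None


def pick_indicator_domain(item):
    """Prefer victimDomain, then fall back to site."""
    return normalize_domain(item.get("victimDomain")) or normalize_domain(item.get("site"))


def decode_description(item):
    """URL-decode annDescription before it is attached to the primary object."""
    raw = item.get("annDescription")
    return unquote(raw) if raw else None
```

With these helpers, `pick_indicator_domain({"victimDomain": "n/a", "site": "http://Victim.io/x"})` falls through to `site` and yields `victim.io`.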

## Filtering rules

@@ -80,14 +80,41 @@ It should track the code in `main.py`, not stale assumptions from earlier iterat
- type: `Identity`
- identity_class: `organization`
- contact: `https://doubleextortion.com/`
- Every emitted object and relationship carries the label `DigIntLab`.
- Every emitted object and relationship created from DEP content carries the label `DigIntLab`.
- Confidence is consistently taken from `DEP_CONFIDENCE`.
- Bundles are deduplicated by STIX ID before sending to OpenCTI.
- Prefer deterministic IDs for DEP-derived entities and relationships to keep re-imports idempotent.
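Deduplicating a bundle by STIX ID might look like the sketch below. It assumes objects expose their ID through `obj["id"]` (true for plain dicts) and that the first occurrence wins; the function name is hypothetical and the real connector may resolve duplicates differently.

```python
def dedupe_by_id(objects):
    """Keep the first object seen for each STIX ID, preserving input order."""
    seen = set()
    unique = []
    for obj in objects:
        stix_id = obj["id"]
        if stix_id in seen:
            continue  # a later duplicate of an already-collected object
        seen.add(stix_id)
        unique.append(obj)
    return unique
```

Because DEP-derived IDs are deterministic, the same sector, country, or intrusion set reached through several announcements collapses to a single bundle entry.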

## Data model mappings

### Incident
### Primary object

- Controlled by `DEP_PRIMARY_OBJECT` (default: `report`).
- `report`: each announcement is wrapped in a STIX `Report` container whose `object_refs` includes all correlated entities and relationships. This is the default and preferred mode for Knowledge Graph analysis.
- `incident`: each announcement is modeled as a standalone STIX `Incident` with explicit relationship edges (`targets`, `attributed-to`, `indicates`).
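One way to resolve `DEP_PRIMARY_OBJECT` is sketched below. The function name is hypothetical, and both the case-insensitive parsing and the choice to raise on an unknown value (rather than silently falling back to `report`) are assumptions, not documented behavior.

```python
import os

VALID_PRIMARY_OBJECTS = ("report", "incident")


def resolve_primary_object(env=None):
    """Return 'report' or 'incident' from DEP_PRIMARY_OBJECT (default: report)."""
    env = os.environ if env is None else env
    # Treat an unset or blank variable as the default mode.
    value = (env.get("DEP_PRIMARY_OBJECT") or "").strip().lower() or "report"
    if value not in VALID_PRIMARY_OBJECTS:
        # Assumption: fail fast instead of guessing the operator's intent.
        raise ValueError(
            f"DEP_PRIMARY_OBJECT must be one of {VALID_PRIMARY_OBJECTS}, got {value!r}"
        )
    return value
```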

### Report (default mode)

- One report is created per DEP announcement.
- The report is always created, even when no victim identity is created.
- Deterministic report ID is based on normalized DEP `hashid`:
- `report--uuid5(NAMESPACE_URL, "dep-announcement:<hashid>")`
- Report name format:
- `DEP announcement - <victim>`
- fallback to `victimDomain`
- fallback to `Unknown Victim`
- `published` is derived from the DEP `date` at `00:00:00Z`.
- `report_types`: `["threat-report"]`
- Report custom properties (when present):
- `dep_actor`
- `dep_country`
- Report labels always include `DigIntLab`, plus one label per announcement type:
- `dep:announcement-type:<lowercased enum value>`
- Report external reference prefers `annLink`; if absent, it falls back to `site`.
- `annTitle` is attached as the external reference description when present.
- `object_refs` contains all objects in the bundle (author identity, victim, indicators, intrusion set, country, sector, and all relationships between them).
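The report fields above can be derived roughly as in this sketch. The helper names are hypothetical, and several details are assumptions: `hashid` normalization is taken to be trim plus lowercase, the DEP `date` is taken to be an ISO `YYYY-MM-DD` string, and the name fallbacks are taken to substitute for `<victim>` inside the same `DEP announcement - ...` template.

```python
import uuid
from datetime import date, datetime, timezone


def report_id(hashid):
    """Deterministic ID: report--uuid5(NAMESPACE_URL, 'dep-announcement:<hashid>')."""
    normalized = str(hashid).strip().lower()  # assumed normalization
    seed = f"dep-announcement:{normalized}"
    return f"report--{uuid.uuid5(uuid.NAMESPACE_URL, seed)}"


def report_name(item):
    """Use victim, then victimDomain, then 'Unknown Victim'."""
    subject = item.get("victim") or item.get("victimDomain") or "Unknown Victim"
    return f"DEP announcement - {subject}"


def report_published(dep_date):
    """Midnight UTC on the DEP date (assumes an ISO YYYY-MM-DD string)."""
    day = date.fromisoformat(dep_date)
    return datetime(day.year, day.month, day.day, tzinfo=timezone.utc)


def report_labels(announcement_types):
    """DigIntLab plus one dep:announcement-type label per enum value."""
    return ["DigIntLab"] + [
        f"dep:announcement-type:{value.lower()}" for value in announcement_types
    ]
```

Since the ID depends only on the `hashid`, re-importing the same announcement under `update=True` targets the same report instead of creating a new one.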

### Incident (incident mode)

- One incident is created per DEP announcement.
- The incident is always created, even when no victim identity is created.
@@ -172,14 +199,30 @@ It should track the code in `main.py`, not stale assumptions from earlier iterat
- pattern: `[file:hashes.'<type>' = '<hash>']`
- Indicator IDs are deterministic because they are generated from the STIX pattern.
- Indicator `valid_from` uses current UTC processing time, so timestamps are not deterministic even though IDs are.
- Indicators are linked to incidents, not to victims.
- Indicators are also linked to the victim with `related-to`.
- In incident mode, indicators are linked to the incident with `indicates`.
- In report mode, indicators are included in the report's `object_refs` and can also have explicit `related-to -> victim` edges.
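The split between deterministic IDs and non-deterministic timestamps can be illustrated with the sketch below. It is not the connector's ID scheme: `SKETCH_NAMESPACE` and all function names are hypothetical, and the real implementation may derive indicator IDs through a library helper instead.

```python
import uuid
from datetime import datetime, timezone

# Hypothetical namespace used only by this sketch.
SKETCH_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.invalid/dep-sketch")


def file_hash_pattern(hash_type, hash_value):
    """Build the STIX pattern [file:hashes.'<type>' = '<hash>']."""
    return f"[file:hashes.'{hash_type}' = '{hash_value}']"


def indicator_id(pattern):
    """Same pattern in, same ID out."""
    return f"indicator--{uuid.uuid5(SKETCH_NAMESPACE, pattern)}"


def build_indicator(hash_type, hash_value, now=None):
    """Return a dict-shaped indicator; valid_from reflects processing time."""
    pattern = file_hash_pattern(hash_type, hash_value)
    valid_from = now or datetime.now(timezone.utc)
    return {
        "type": "indicator",
        "id": indicator_id(pattern),
        "pattern_type": "stix",
        "pattern": pattern,
        "valid_from": valid_from.strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
```

Two runs over the same hash produce the same `id` but different `valid_from` values, which is why idempotency checks should compare IDs rather than whole objects.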

## Relationships emitted

### In report mode (default)

- `victim -> sector` with `part-of`
- `victim -> country` with `located-at`
- `indicator -> victim` with `related-to`
- `intrusion-set -> sector` with `targets`
- `intrusion-set -> country` with `targets`
- `sector -> country` with `related-to`

All of the above, plus the victim, indicators, and intrusion set, are referenced in the Report's `object_refs`. There is no `attributed-to` edge from the Report itself because the Report is a container, not a relationship endpoint.

### In incident mode

- `incident -> victim` with `targets`
- `victim -> sector` with `part-of`
- `incident -> intrusion-set` with `attributed-to`
- `victim -> country` with `located-at`
- `indicator -> victim` with `related-to`
- `intrusion-set -> sector` with `targets`
- `intrusion-set -> country` with `targets`
- `sector -> country` with `related-to`
@@ -198,6 +241,7 @@ These links are created automatically when both related objects exist. There are
- `DEP_CREATE_INTRUSION_SETS`
- `DEP_CREATE_COUNTRY_LOCATIONS`
- Important non-boolean knobs:
- `DEP_PRIMARY_OBJECT` (default: `report`; valid values: `report`, `incident`)
- `DEP_DSET`
- `DEP_LOOKBACK_DAYS`
- `DEP_OVERLAP_HOURS`
@@ -217,7 +261,7 @@ These links are created automatically when both related objects exist. There are
- Keep optional enrichment behind the existing feature flags.
- Do not reintroduce removed compatibility flags for cross-entity relationships.
- If you change modeling, update `README.md`, `config.yml.sample`, and `AGENTS.md` together.
- If you touch incident or indicator generation, verify idempotency assumptions still hold under `update=True`.
- If you touch report, incident, or indicator generation, verify idempotency assumptions still hold under `update=True`.

## Validation and local workflow

@@ -234,16 +278,14 @@ These links are created automatically when both related objects exist. There are
- Run type checks:
- `task type-check`
- Main quality gate:
- `task check`
- Additional syntax check:
- `python -m compileall main.py`
- `task format check type-check`
- Docker-based runtime validation can be satisfied by either:
- building and running the connector image directly
- using `docker compose up` with the local stack when broader integration checks are needed
- Never start the connector before the OpenCTI API/platform is ready and reachable.
- During Docker-based validation, wait for OpenCTI readiness first, then start the connector.

`task check` is the canonical combined gate from `Taskfile.yml` because it runs format check, lint, and mypy.
Use `task format check type-check` for complete local checks before considering code changes done.

There is a `task test` target, but there is currently no first-party test suite in this repository. Do not assume automated test coverage exists.
For code changes, do not stop at static checks alone; perform Docker-based runtime validation as well.
50 changes: 26 additions & 24 deletions DOCKERHUB.md
@@ -9,11 +9,12 @@ An [OpenCTI](https://github.com/OpenCTI-Platform/OpenCTI) external-import connec
## What it does

- Authenticates against the DEP AWS Cognito identity provider
- Polls the DEP REST API on a configurable interval and maps each announcement to an OpenCTI **Incident**
- Polls the DEP REST API on a configurable interval and maps each announcement to an OpenCTI **Report** by default, or an **Incident** when `DEP_PRIMARY_OBJECT=incident`
- Creates **Organization** identities for victim companies
- Optionally creates **Sector** identities and links victims via a `part-of` relationship
- Optionally generates **Indicators** for victim domains and leak hash identifiers
- Attaches announcement-type labels (e.g. `dep:announcement-type:pii`) to incidents
- Links generated indicators to the victim with `related-to`
- Attaches announcement-type labels (e.g. `dep:announcement-type:pii`) to the primary object
- Maintains connector state with a configurable overlap window to capture late DEP updates

---
@@ -37,31 +38,32 @@ All values can be set via environment variables (which take precedence) or via a

### Required

| Environment variable | Description |
|---|---|
| `OPENCTI_URL` | URL of your OpenCTI platform |
| `OPENCTI_TOKEN` | OpenCTI API token |
| `DEP_USERNAME` | DEP portal username |
| `DEP_PASSWORD` | DEP portal password |
| `DEP_API_KEY` | API key issued by DEP |
| `DEP_CLIENT_ID` | AWS Cognito App Client ID |
| Environment variable | Description |
| -------------------- | ---------------------------- |
| `OPENCTI_URL` | URL of your OpenCTI platform |
| `OPENCTI_TOKEN` | OpenCTI API token |
| `DEP_USERNAME` | DEP portal username |
| `DEP_PASSWORD` | DEP portal password |
| `DEP_API_KEY` | API key issued by DEP |
| `DEP_CLIENT_ID` | AWS Cognito App Client ID |

### Optional

| Environment variable | Default | Description |
|---|---|---|
| `CONNECTOR_RUN_INTERVAL` | `3600` | Polling interval in seconds |
| `DEP_CONFIDENCE` | `70` | Confidence score on generated STIX objects |
| `DEP_LOOKBACK_DAYS` | `7` | Days to look back on first run |
| `DEP_OVERLAP_HOURS` | `72` | Overlap hours from previous run to catch late updates |
| `DEP_DSET` | `ext` | Dataset to query (e.g. `ext`, `sanctions`) |
| `DEP_EXTENDED_RESULTS` | `true` | Request extended leak information |
| `DEP_ENABLE_SITE_INDICATOR` | `true` | Create a domain indicator per victim |
| `DEP_ENABLE_HASH_INDICATOR` | `true` | Create a hash indicator when a hash is provided |
| `DEP_SKIP_EMPTY_VICTIM` | `true` | Skip items where victim name is empty or n/a |
| `DEP_CREATE_SECTOR_IDENTITIES` | `true` | Create sector identities and link victims |
| `DEP_LOGIN_ENDPOINT` | `https://cognito-idp.eu-west-1.amazonaws.com/` | Cognito login endpoint |
| `DEP_API_ENDPOINT` | `https://api.eu-ep1.doubleextortion.com/v1/dbtr/privlist` | DEP REST endpoint |
| Environment variable | Default | Description |
| ------------------------------ | --------------------------------------------------------- | ----------------------------------------------------- |
| `CONNECTOR_RUN_INTERVAL` | `3600` | Polling interval in seconds |
| `DEP_CONFIDENCE` | `70` | Confidence score on generated STIX objects |
| `DEP_LOOKBACK_DAYS` | `7` | Days to look back on first run |
| `DEP_OVERLAP_HOURS` | `72` | Overlap hours from previous run to catch late updates |
| `DEP_DSET` | `ext` | Dataset to query (e.g. `ext`, `sanctions`) |
| `DEP_PRIMARY_OBJECT` | `report` | Primary STIX object to emit: `report` or `incident` |
| `DEP_EXTENDED_RESULTS` | `true` | Request extended leak information |
| `DEP_ENABLE_SITE_INDICATOR` | `true` | Create a domain indicator per victim |
| `DEP_ENABLE_HASH_INDICATOR` | `true` | Create a hash indicator when a hash is provided |
| `DEP_SKIP_EMPTY_VICTIM` | `true` | Skip items where victim name is empty or n/a |
| `DEP_CREATE_SECTOR_IDENTITIES` | `true` | Create sector identities and link victims |
| `DEP_LOGIN_ENDPOINT` | `https://cognito-idp.eu-west-1.amazonaws.com/` | Cognito login endpoint |
| `DEP_API_ENDPOINT` | `https://api.eu-ep1.doubleextortion.com/v1/dbtr/privlist` | DEP REST endpoint |

---
