28 commits
cf96580
creating branch for redactor
Dec 3, 2025
c3501c6
finetuned pii redactor code
rudra-singh1 Dec 9, 2025
2aecab2
Merge main into llm-redactor - add PII redactor
rudra-singh1 Dec 9, 2025
d1dd98e
Merge pull request #37 from rudra-singh1/llm-redactor
philion Dec 9, 2025
59d0b25
1st attempt at redactor integration
Dec 10, 2025
cace3ec
finetuned 3B model to limit hallucination
rudra-singh1 Dec 10, 2025
4f6677f
Merge pull request #38 from rudra-singh1/llm-redactor
philion Dec 10, 2025
24ddd5e
added varied data for training
rudra-singh1 Dec 10, 2025
baf1801
Merge pull request #39 from rudra-singh1/llm-redactor
philion Dec 10, 2025
7db74f7
getting a little further
Dec 10, 2025
271da60
Refactoring the redactor for test. Adding some simple test cases.
Dec 10, 2025
7927648
Refine JSON key formatting and rules
rudra-singh1 Jan 7, 2026
ea66170
Merge pull request #40 from rudra-singh1/patch-1
philion Jan 7, 2026
d2f3918
integration complete. needs testing on the email threading side.
Jan 8, 2026
02b85e8
new model with stricter dataset
rudra-singh1 Jan 12, 2026
fc624bb
Merge pull request #41 from rudra-singh1/llm-redactor
rudra-singh1 Jan 12, 2026
f19f67e
Refine PII redaction rules and system prompt
rudra-singh1 Jan 12, 2026
d081dd9
Updating tests and prompt, to remove empty line.
Jan 12, 2026
262c999
commit current llm redactor and scn-add None-value bug
philion Mar 13, 2026
844bc73
removing finetuning dir; llm training moved to different project
philion Mar 13, 2026
a3b1f11
commit current llm redactor and scn-add None-value bug
rudra-singh1 Mar 13, 2026
30961ec
removing finetuning dir; llm training moved to different project
rudra-singh1 Mar 13, 2026
3262383
checking in latest changes
philion May 3, 2026
370ea87
merging
philion May 3, 2026
0babf0c
cleaning up redactor client
May 4, 2026
aceea0a
Linting.
May 4, 2026
dd7cdc8
fixing github action to use uv
May 4, 2026
d650efd
cleaning up dependencies, fixing tests
May 4, 2026
62 changes: 44 additions & 18 deletions .github/workflows/python-app.yml
@@ -1,6 +1,8 @@
 # This workflow will install Python dependencies, run tests and lint with a single version of Python
 # For more information see: https://docs.github.com/en/actions/automating-builds-and-tests/building-and-testing-python

+# Updating to use `uv`. Versions pinned are current as of May 4, 2026.
+
 name: Python application

 on:
@@ -18,22 +20,46 @@ jobs:
     runs-on: ubuntu-latest

     steps:
-    - uses: actions/checkout@v3
-    - name: Set up Python 3.13
-      uses: actions/setup-python@v3
+    - uses: actions/checkout@v6
+
+
+
+    - name: "Set up Python"
+      uses: actions/setup-python@v6
+      with:
+        python-version-file: ".python-version"
+
+    - name: Install uv
+      uses: astral-sh/setup-uv@08807647e7069bb48b6ef5acd8ec9567f424441b
       with:
-        python-version: "3.13"
-    - name: Install dependencies
-      run: |
-        python -m pip install --upgrade pip
-        pip install flake8 pytest
-        if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
-    - name: Lint with flake8
-      run: |
-        # stop the build if there are Python syntax errors or undefined names
-        flake8 . --count --select=E9,F63,F7,F82 --show-source --statistics
-        # exit-zero treats all errors as warnings. The GitHub editor is 127 chars wide
-        flake8 . --count --exit-zero --max-complexity=10 --max-line-length=127 --statistics
-    - name: Test with pytest
-      run: |
-        pytest
+        # Install a specific version of uv.
+        version: "0.11.8"
+
+    - name: Install the project
+      run: uv sync --locked --all-extras --dev
+
+    # Broken.
+    - name: Run tests
+      # Run test using uv
+      run: uv run -m tests
+
+
+    # OLD, for reference
+    # - name: Set up Python 3.13
+    #   uses: actions/setup-python@v5
+    #   with:
+    #     python-version: "3.13"
+    # - name: Install dependencies
+    #   run: |
+    #     python -m pip install --upgrade pip
+    #     pip install flake8 pytest
+    #     if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
+    # - name: Lint with flake8
+    #   run: |
+    #     # stop the build if there are Python syntax errors or undefined names
+    #     flake8 . --count --select=E9,F63,F7,F82 --show-source --statistics
+    #     # exit-zero treats all errors as warnings. The GitHub editor is 127 chars wide
+    #     flake8 . --count --exit-zero --max-complexity=10 --max-line-length=127 --statistics
+    # - name: Test with pytest
+    #   run: |
+    #     pytest
4 changes: 3 additions & 1 deletion .gitignore
@@ -163,4 +163,6 @@ pyrightconfig.json

 # direnv files, used by load python venv
 .direnv/
-.envrc
+.envrc
+.local.env
+.envrc
3 changes: 3 additions & 0 deletions Makefile
@@ -15,6 +15,9 @@ run:
 lint:
 	uvx ruff check .

+fix:
+	uvx ruff check --fix .
+
 test: lint
 	uv run -m tests

4 changes: 4 additions & 0 deletions README.md
@@ -40,3 +40,7 @@ A `Makefile` is provided with the following targets:
 - `htmlcov` : run the unit tests and generate a full report in htmlcov/

 Testing and coverage requires standing up a local testbed. For details, see [Design](docs/design.md).
+
+
+## Adding `llm-redactor` branch
+For the redactor feature.
4 changes: 3 additions & 1 deletion compose.yaml
@@ -7,5 +7,7 @@ services:
       - DISCORD_TOKEN=${DISCORD_TOKEN}
       - REDMINE_TOKEN=${REDMINE_TOKEN}
       - REDMINE_URL=${REDMINE_URL}
+    volumes:
+      - ./redaction_queue.json:/app/redaction_queue.json #share queue file
     restart: on-failure:1
-    network_mode: host
+    network_mode: host
208 changes: 207 additions & 1 deletion data/custom_fields.json
@@ -1 +1,207 @@
-{"custom_fields":[{"id":2,"name":"Discord ID","description":"ID used to link user to their discord account, to enable integration","customized_type":"user","field_format":"string","regexp":"","min_length":null,"max_length":null,"is_required":false,"is_filter":false,"searchable":false,"multiple":false,"default_value":"","visible":true},{"id":4,"name":"syncdata","description":"Metadata used to sync the ticket with external sources.\r\n\r\nCurrent format is thread-id|zulu-timestamp.\r\n\r\nNot recommend to edit.","customized_type":"issue","field_format":"string","regexp":"","min_length":null,"max_length":null,"is_required":false,"is_filter":true,"searchable":false,"multiple":false,"default_value":"","visible":false,"trackers":[{"id":2,"name":"Infra-Field"},{"id":4,"name":"Software-Dev"},{"id":6,"name":"Infra-Config"},{"id":8,"name":"External-Comms-Intake"},{"id":9,"name":"Outreach-Partnerships"},{"id":10,"name":"Admin"},{"id":17,"name":"Research"},{"id":18,"name":"Mutual-Aid-Action"},{"id":19,"name":"SCN-Space"}],"roles":[{"id":3,"name":"Administrator"}]},{"id":5,"name":"To/CC","description":"Contains the To and Cc headers from the email that created the ticket.","customized_type":"issue","field_format":"string","regexp":"","min_length":null,"max_length":null,"is_required":false,"is_filter":false,"searchable":true,"multiple":false,"default_value":"","visible":true,"trackers":[{"id":2,"name":"Infra-Field"},{"id":4,"name":"Software-Dev"},{"id":6,"name":"Infra-Config"},{"id":8,"name":"External-Comms-Intake"},{"id":9,"name":"Outreach-Partnerships"},{"id":10,"name":"Admin"},{"id":13,"name":"Test-Reject"},{"id":17,"name":"Research"},{"id":18,"name":"Mutual-Aid-Action"},{"id":19,"name":"SCN-Space"}],"roles":[]}]}
+{
+"custom_fields": [
+{
+"id": 2,
+"name": "Discord ID",
+"description": "ID used to link user to their discord account, to enable integration",
+"customized_type": "user",
+"field_format": "string",
+"regexp": "",
+"min_length": null,
+"max_length": null,
+"is_required": false,
+"is_filter": false,
+"searchable": false,
+"multiple": false,
+"default_value": "",
+"visible": true,
+"editable": true
+},
+{
+"id": 4,
+"name": "syncdata",
+"description": "Metadata used to sync the ticket with external sources.\r\n\r\nCurrent format is thread-id|zulu-timestamp.\r\n\r\nNot recommend to edit.",
+"customized_type": "issue",
+"field_format": "string",
+"regexp": "",
+"min_length": null,
+"max_length": null,
+"is_required": false,
+"is_filter": true,
+"searchable": false,
+"multiple": false,
+"default_value": "",
+"visible": false,
+"editable": true,
+"trackers": [
+{
+"id": 2,
+"name": "Infra-Field"
+},
+{
+"id": 4,
+"name": "Software-Dev"
+},
+{
+"id": 6,
+"name": "Infra-Config"
+},
+{
+"id": 8,
+"name": "External-Comms-Intake"
+},
+{
+"id": 9,
+"name": "Outreach-Partnerships"
+},
+{
+"id": 10,
+"name": "Admin"
+},
+{
+"id": 17,
+"name": "Research"
+},
+{
+"id": 18,
+"name": "Mutual-Aid-Action"
+},
+{
+"id": 19,
+"name": "SCN-Space"
+}
+],
+"roles": [
+{
+"id": 3,
+"name": "Administrator"
+}
+]
+},
+{
+"id": 5,
+"name": "To/CC",
+"description": "Contains the To and Cc headers from the email that created the ticket.",
+"customized_type": "issue",
+"field_format": "string",
+"regexp": "",
+"min_length": null,
+"max_length": null,
+"is_required": false,
+"is_filter": false,
+"searchable": true,
+"multiple": false,
+"default_value": "",
+"visible": true,
+"editable": true,
+"trackers": [
+{
+"id": 2,
+"name": "Infra-Field"
+},
+{
+"id": 4,
+"name": "Software-Dev"
+},
+{
+"id": 6,
+"name": "Infra-Config"
+},
+{
+"id": 8,
+"name": "External-Comms-Intake"
+},
+{
+"id": 9,
+"name": "Outreach-Partnerships"
+},
+{
+"id": 10,
+"name": "Admin"
+},
+{
+"id": 13,
+"name": "Test-Reject"
+},
+{
+"id": 17,
+"name": "Research"
+},
+{
+"id": 18,
+"name": "Mutual-Aid-Action"
+},
+{
+"id": 19,
+"name": "SCN-Space"
+}
+],
+"roles": []
+},
+{
+"id": 6,
+"name": "redacted",
+"description": "Keys and values for the redacted fields in the ticket.",
+"customized_type": "issue",
+"field_format": "string",
+"regexp": "",
+"min_length": null,
+"max_length": null,
+"is_required": false,
+"is_filter": false,
+"searchable": false,
+"multiple": false,
+"default_value": "",
+"visible": false,
+"editable": true,
+"trackers": [
+{
+"id": 2,
+"name": "Infra-Field"
+},
+{
+"id": 4,
+"name": "Software-Dev"
+},
+{
+"id": 6,
+"name": "Infra-Config"
+},
+{
+"id": 8,
+"name": "External-Comms-Intake"
+},
+{
+"id": 9,
+"name": "Outreach-Partnerships"
+},
+{
+"id": 10,
+"name": "Admin"
+},
+{
+"id": 13,
+"name": "Test-Reject"
+},
+{
+"id": 17,
+"name": "Research"
+},
+{
+"id": 18,
+"name": "Mutual-Aid-Action"
+},
+{
+"id": 19,
+"name": "SCN-Space"
+}
+],
+"roles": [
+{
+"id": 3,
+"name": "Administrator"
+}
+]
+}
+]
+}
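The `data/custom_fields.json` change adds a hidden `redacted` issue field (id 6) alongside `Discord ID`, `syncdata` and `To/CC`. As an illustration of how a file with this shape could be consumed, the sketch below maps field names to ids and lists the fields enabled for a given tracker. The helper names `custom_field_ids` and `fields_for_tracker` are hypothetical, and the sample document is a trimmed copy of the entries in this diff rather than the full file.

```python
import json

# Trimmed sample in the same shape as data/custom_fields.json (most keys omitted).
SAMPLE = """
{"custom_fields": [
  {"id": 2, "name": "Discord ID", "customized_type": "user"},
  {"id": 6, "name": "redacted", "customized_type": "issue",
   "trackers": [{"id": 10, "name": "Admin"}, {"id": 13, "name": "Test-Reject"}]}
]}
"""


def custom_field_ids(doc: dict) -> dict:
    """Map each custom field name to its numeric id."""
    return {field["name"]: field["id"] for field in doc.get("custom_fields", [])}


def fields_for_tracker(doc: dict, tracker_name: str) -> list:
    """Return names of custom fields enabled for the named tracker."""
    names = []
    for field in doc.get("custom_fields", []):
        # User-type fields such as "Discord ID" carry no "trackers" list.
        trackers = field.get("trackers", [])
        if any(t["name"] == tracker_name for t in trackers):
            names.append(field["name"])
    return names


doc = json.loads(SAMPLE)
ids = custom_field_ids(doc)
```

Looking fields up by name rather than hard-coding ids keeps calling code readable, on the assumption that the field names stay stable across deployments.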
8 changes: 8 additions & 0 deletions docs/threader.md
@@ -48,6 +48,14 @@ The `threader_job.sh` exists to make sure the `.env` file and the `venv` Python

 All output to stdout and stderr captured and logged to syslog with the tag "threader".

+## Redactor Configuration
+
+If the `redactor` is being used, it should be configured in the `.env` file in the netbot deployment:
+
+```
+REDACTOR_URL=http://192.168.20.64:8000
+```
+

 ## Threader Logs: `/var/log/syslog`

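The `Redactor Configuration` section added to `docs/threader.md` documents a single `REDACTOR_URL` setting in `.env`. The client code that reads it is not part of this excerpt, so the following is only a sketch of how such a setting might be read and validated. The function name `redactor_endpoint` and the `redact` path are assumptions made for illustration; only the variable name and the example URL come from the diff.

```python
import os
from typing import Mapping, Optional
from urllib.parse import urljoin, urlparse


def redactor_endpoint(path: str = "redact",
                      env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Build a redactor URL from REDACTOR_URL, or return None if it is unset.

    The default "redact" path is a placeholder, not a documented route.
    """
    env = os.environ if env is None else env
    base = env.get("REDACTOR_URL", "").strip()
    if not base:
        # The docs describe the redactor as optional, so unset means disabled.
        return None
    parsed = urlparse(base)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"REDACTOR_URL is not a valid http(s) URL: {base!r}")
    # Normalise slashes so the base URL and path join cleanly.
    return urljoin(base.rstrip("/") + "/", path.lstrip("/"))
```

Returning `None` when the variable is unset lets callers skip redaction without special-casing a missing configuration.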