Text classification task to identify and classify malicious prompts for LLM interactions
Updated Dec 3, 2025 - Jupyter Notebook
MalPromptSentinel (MPS) is a Claude Code skill that detects malicious prompts in uploaded files before Claude processes them. It provides two-tier scanning to identify prompt injection attacks, role manipulation attempts, privilege escalation, and other adversarial techniques.
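The two-tier idea can be illustrated with a minimal sketch. This is not MPS's actual implementation: the pattern set, the category names, and the functions `quick_scan`, `deep_scan`, and `scan` are all hypothetical, chosen only to show how a cheap first pass might be followed by a more thorough second pass that normalizes obfuscated text before matching.

```python
import re
from dataclasses import dataclass

# Hypothetical rules for illustration only; the real skill's rules are not
# described in the source.
PATTERNS = {
    "prompt_injection": re.compile(
        r"ignore (all |any )?(previous|prior) instructions", re.I
    ),
    "role_manipulation": re.compile(r"you are now|pretend to be", re.I),
    "privilege_escalation": re.compile(r"(developer|admin|root) mode", re.I),
}


@dataclass
class ScanResult:
    tier: int              # which tier produced this result (1 or 2)
    categories: list[str]  # names of the matched categories, sorted

    @property
    def flagged(self) -> bool:
        return bool(self.categories)


def _match(text: str) -> list[str]:
    """Return the sorted names of every pattern found in the text."""
    return sorted(name for name, pat in PATTERNS.items() if pat.search(text))


def quick_scan(text: str) -> ScanResult:
    """Tier 1 (assumed): match the raw text as-is."""
    return ScanResult(tier=1, categories=_match(text))


def deep_scan(text: str) -> ScanResult:
    """Tier 2 (assumed): undo simple obfuscation, then match again."""
    cleaned = re.sub(r"[\u200b\u200c\u200d]", "", text)  # drop zero-width chars
    cleaned = re.sub(r"[\s_\-\.]+", " ", cleaned)        # unify separators
    return ScanResult(tier=2, categories=_match(cleaned))


def scan(text: str) -> ScanResult:
    """Run tier 1 first and fall through to tier 2 only if nothing is found."""
    result = quick_scan(text)
    return result if result.flagged else deep_scan(text)
```

In this sketch, `scan("Please ignore previous instructions")` is flagged at tier 1, while an obfuscated variant such as `"ignore_all_previous---instructions"` passes tier 1 and is caught only after tier 2 normalizes the separators. A real scanner would need far broader rules and likely a learned classifier rather than a handful of regular expressions.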