RogueGPT: A Controlled Stimulus Generation Framework for News Authenticity Research


RogueGPT — Controlled AI stimulus generation pipeline for fake news research

Motivation

Empirical research on human perception of AI-generated text requires stimuli that are diverse, reproducible, and generated under controlled experimental conditions. RogueGPT provides a systematic framework for producing such stimuli across multiple LLM families, languages, journalistic styles, and content formats.

RogueGPT is the stimulus generation component of the JudgeGPT research pipeline. Together, they enable end-to-end experiments from controlled content generation to quantitative human perception measurement. This methodology is described in our foundational survey, "Blessing or Curse? A Survey on the Impact of Generative AI on Fake News" (arXiv:2404.03021), and extended in two WWW '26 Companion papers: "Industrialized Deception" (arXiv:2601.21963), which examines the systemic effects of LLM-generated misinformation, and "Eroding the Truth-Default" (arXiv:2601.22871), which reports the human perception findings from stimuli generated by this framework.

Architecture

RogueGPT follows a three-layer architecture that separates data logic from interfaces:

┌──────────────────────────────────────────────────────┐
│                   Interfaces                         │
│  ┌───────────┐  ┌───────────┐  ┌──────────────────┐  │
│  │  app.py   │  │  cli.py   │  │  mcp_server.py   │  │
│  │ Streamlit │  │ Terminal  │  │  MCP Protocol    │  │
│  └─────┬─────┘  └─────┬─────┘  └────────┬─────────┘  │
│        └──────────────┼─────────────────┘            │
│                  ┌────┴────┐                         │
│                  │ core.py │                         │
│                  │  Data   │                         │
│                  │  Layer  │                         │
│                  └────┬────┘                         │
│               ┌───────┴────────┐                     │
│               │    MongoDB     │                     │
│               │ realorfake.    │                     │
│               │ fragments      │                     │
│               └────────────────┘                     │
└──────────────────────────────────────────────────────┘
Component Purpose
core.py Data layer: schema validation, normalization, MongoDB CRUD operations. No UI dependencies.
app.py Streamlit web interface for interactive generation and manual data entry.
cli.py Command-line interface for scripted ingestion, retrieval, and dataset inspection.
mcp_server.py Model Context Protocol server exposing ingest_fragment and retrieve_fragments as tools for AI agent integration.
prompt_engine.json Declarative configuration defining prompt templates, model identifiers, languages, styles, and formats.

Dataset

The current corpus contains 2,308 multilingual news fragments spanning:

  • 37 model configurations across 10 providers: OpenAI (GPT-3.5, GPT-4, GPT-4 Turbo, GPT-4o, GPT-4o Mini, GPT-4.1, GPT-4.1 Mini, GPT-4.1 Nano, o1, o1-Mini, o1-Preview, o1-Pro, o3-Mini), Anthropic (Claude 3.5 Sonnet, Claude Sonnet 4.5, Claude Opus 4.6), Google (Gemma 7B, Gemini 1.5 Flash, Gemini 1.5 Pro, Gemini 2.0 Flash, Gemini 3 Pro), Meta (LLaMA-2 13B, LLaMA-3.3 70B), Mistral (Mistral 7B, Mistral Large 2), DeepSeek (R1, V3), Microsoft (Phi-3 Mini), Zhipu (GLM-4.6, GLM-4.7), Moonshot (Kimi K2.5), Qwen (Qwen-2.5 72B), MiniMax (M2.1)
  • 4 languages (English, German, French, Spanish)
  • 3 formats (tweet, headline, short article)
  • 5 journalistic styles per language (e.g., NYT, BBC, CNN, Fox News, WSJ for English)
  • 51 human-sourced fragments as experimental anchors

The corpus is available on Zenodo under restricted access for academic research.

The model configuration (prompt_engine.json) currently defines 37 model identifiers across 10 providers, enabling rapid expansion of the corpus with new model generations.

Research Pipeline

RogueGPT operates as the first stage of a two-part experimental workflow:

  1. Stimulus generation (RogueGPT): Fragments are produced with explicit control over model, style, language, format, and seed phrase. All generation parameters are persisted alongside the content.
  2. Storage (MongoDB): Each fragment is stored with full provenance metadata, enabling reproducible filtering by any experimental variable.
  3. Human evaluation (JudgeGPT): Participants assess fragments on continuous dual-axis scales (source attribution and authenticity), producing quantitative perception data linked to generation parameters.
  4. Analysis: The combined dataset supports investigations into model-specific detectability, cross-linguistic perception differences, and the role of individual differences in judgment accuracy.
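Because every generation parameter is persisted alongside the content (step 2), reproducible filtering reduces to building a MongoDB query document from the pinned experimental variables. The helper below is a hypothetical sketch, not code from the repository; the field names (Origin, MachineModel, ISOLanguage, IsFake) are taken from the Fragment Schema section of this README.

```python
def build_fragment_filter(origin=None, model=None, language=None, is_fake=None):
    """Build a MongoDB query document from experimental variables.

    Only the variables you pin are constrained; everything else
    remains free, so the same helper covers any slice of the corpus.
    """
    query = {}
    if origin is not None:
        query["Origin"] = origin
    if model is not None:
        query["MachineModel"] = model
    if language is not None:
        query["ISOLanguage"] = language
    if is_fake is not None:
        query["IsFake"] = is_fake
    return query

# Example: all fabricated German fragments from one model configuration
q = build_fragment_filter(origin="Machine",
                          model="openai_gpt-4o_2024-08-06",
                          language="de", is_fake=True)
```

The resulting dict can be passed directly to a `find()` call on the fragments collection.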

Installation

Prerequisites

  • Python 3.10+
  • MongoDB instance (local or Atlas)

Setup

git clone https://github.com/aloth/RogueGPT.git
cd RogueGPT
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

Configuration

Set the MongoDB connection string as an environment variable:

export ROGUEGPT_MONGO_URI="mongodb+srv://user:pass@cluster.mongodb.net/?retryWrites=true&w=majority"

Alternatively, for the Streamlit interface, configure .streamlit/secrets.toml:

[mongo]
connection = "mongodb+srv://user:pass@cluster.mongodb.net/..."
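A minimal sketch of how an application might resolve these two configuration sources. The precedence order (environment variable first, then Streamlit secrets) is an assumption for illustration, not documented behavior of the repository; the secrets are passed in as a plain dict to keep the sketch self-contained.

```python
import os

def resolve_mongo_uri(secrets=None):
    """Resolve the MongoDB connection string.

    Assumed precedence: the ROGUEGPT_MONGO_URI environment variable,
    then the [mongo].connection entry from Streamlit secrets.
    """
    uri = os.environ.get("ROGUEGPT_MONGO_URI")
    if uri:
        return uri
    if secrets and "mongo" in secrets:
        return secrets["mongo"].get("connection")
    raise RuntimeError("No MongoDB connection string configured")
```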

Usage

Web Interface (Streamlit)

streamlit run app.py

Provides two modes: Generator (automated, template-driven generation across parameter combinations) and Manual Data Entry (for human-sourced fragments).

Command-Line Interface

The CLI enables scripted dataset operations without a browser:

# Dataset statistics
python3 cli.py stats

# List all configured model identifiers
python3 cli.py models

# Retrieve random fragments
python3 cli.py retrieve --n 5 --origin Machine --model openai_gpt-4o_2024-08-06

# Ingest a machine-generated fragment
python3 cli.py ingest \
  --origin Machine \
  --model "openai_gpt-4.1" \
  --lang en \
  --is-fake \
  --prompt "Write a short article about '''Topic''' in English in the style of CNN." \
  --content "The generated text content..."

# Ingest a human-sourced fragment
python3 cli.py ingest \
  --origin Human \
  --outlet "BBC" \
  --url "https://www.bbc.com/news/example" \
  --lang en \
  --content "The original article text..."

Validation: The CLI enforces schema constraints from prompt_engine.json. Model identifiers are validated against the configuration by default. Use --lenient to allow unregistered models with a warning.

MCP Server (AI Agent Integration)

RogueGPT exposes its data layer via the Model Context Protocol for integration with AI agents and LLM tool-use workflows:

python3 mcp_server.py

Tools:

Tool Description
ingest_fragment Validate and store a new fragment with full provenance metadata.
retrieve_fragments Fetch random fragments with optional filters (origin, model, language, veracity).

Resources:

URI Description
roguegpt://config/models List of all recognized model identifiers.
roguegpt://config/languages Supported ISO language codes.
roguegpt://stats Current dataset statistics by origin and model.

The MCP interface enables automated corpus expansion: an AI agent can generate content with any LLM, then ingest the result with full metadata for subsequent human evaluation.
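For orientation, this is the general shape of a Model Context Protocol `tools/call` request that a client would send to `mcp_server.py`, expressed as a Python dict. MCP messages are JSON-RPC 2.0; the argument names (`n`, `origin`, `model`) mirror the `retrieve_fragments` filters described above but should be treated as illustrative rather than a verified API contract.

```python
import json

# JSON-RPC 2.0 envelope for an MCP tool invocation
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "retrieve_fragments",
        "arguments": {"n": 5, "origin": "Machine",
                      "model": "openai_gpt-4o_2024-08-06"},
    },
}
print(json.dumps(request, indent=2))
```

In practice an MCP client library constructs and transports this envelope for you; the dict is shown only to make the wire format concrete.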

Fragment Schema

Each fragment in the database conforms to the following schema:

Field Type Required Description
FragmentID string auto Unique identifier (UUID hex).
Content string yes The news text.
Origin string yes "Human" or "Machine".
IsFake boolean yes Whether the content is fabricated.
ISOLanguage string yes ISO 639-1 language code.
MachineModel string if Machine Model identifier (must match prompt_engine.json).
MachinePrompt string recommended The prompt used for generation.
HumanOutlet string if Human Publishing outlet name.
HumanURL string recommended Source URL for provenance.
CreationDate datetime auto Timestamp of ingestion.
IngestedVia string auto Ingestion channel: "ui", "cli", or "mcp".
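The required and conditional constraints in the table can be expressed as a small validator. The function below is a hypothetical stand-in for the validation that core.py performs, not the repository's actual code; it checks the required fields, the origin-dependent requirements, and fills the auto fields.

```python
import uuid
from datetime import datetime, timezone

REQUIRED = {"Content", "Origin", "IsFake", "ISOLanguage"}

def validate_fragment(fragment: dict) -> dict:
    """Validate a fragment against the schema sketched in the table.

    Enforces required fields, the Origin enum, and the conditional
    fields (MachineModel if Machine, HumanOutlet if Human), then
    fills the auto-generated fields.
    """
    missing = REQUIRED - fragment.keys()
    if missing:
        raise ValueError(f"missing required fields: {sorted(missing)}")
    if fragment["Origin"] not in ("Human", "Machine"):
        raise ValueError('Origin must be "Human" or "Machine"')
    if fragment["Origin"] == "Machine" and "MachineModel" not in fragment:
        raise ValueError("MachineModel is required for Machine fragments")
    if fragment["Origin"] == "Human" and "HumanOutlet" not in fragment:
        raise ValueError("HumanOutlet is required for Human fragments")
    fragment.setdefault("FragmentID", uuid.uuid4().hex)  # UUID hex, per schema
    fragment.setdefault("CreationDate", datetime.now(timezone.utc))
    return fragment
```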

Extending the Model Configuration

To add a new model, append its identifier to the GeneratorModel array in prompt_engine.json:

"GeneratorModel": [
    "openai_gpt-4.1",
    "anthropic_claude-opus-4-6",
    "your-provider_model-name",
    ...
]

The naming convention follows provider_model-variant (e.g., openai_gpt-4o_2024-08-06, meta_llama-3.3-70b, anthropic_claude-sonnet-4-5). All ingestion interfaces validate against this list.
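Under the stated convention, an identifier splits into provider and model variant at the first underscore (later underscores, such as a date suffix, belong to the variant). A minimal parsing sketch, with a hypothetical helper name:

```python
def parse_model_id(identifier: str) -> tuple[str, str]:
    """Split a provider_model-variant identifier at the first underscore."""
    provider, _, variant = identifier.partition("_")
    if not provider or not variant:
        raise ValueError(f"not a provider_model-variant identifier: {identifier!r}")
    return provider, variant

# parse_model_id("openai_gpt-4o_2024-08-06") -> ("openai", "gpt-4o_2024-08-06")
# parse_model_id("meta_llama-3.3-70b")       -> ("meta", "llama-3.3-70b")
```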

Roadmap

  • Multimodal stimuli: Extend generation to images and multimedia for deepfake perception research.
  • Automated corpus expansion: Agent-driven generation pipelines using the MCP server to systematically cover new model releases.
  • Provenance integration: Content authenticity metadata (C2PA) annotation for mitigation experiments.
  • Cross-dataset linking: Direct integration with JudgeGPT perception data for unified analysis.

For Researchers

Goal Action
Understand the methodology Read the paper
Use the dataset Request access on Zenodo
Extend the corpus Fork, add models, submit a PR
Participate in the study JudgeGPT Survey
Contact the authors alexander.loth@stud.fra-uas.de

Expert Survey

We are conducting a follow-up study to gather expert perspectives on AI-driven disinformation risks and mitigation strategies. If you have expertise in AI, policy, or journalism, we invite your participation:

Expert Survey (15 min)

All responses are treated confidentially and reported in anonymized, aggregated form.

Citation

If you use RogueGPT or its dataset in your work, please cite:

@inproceedings{loth2026collateraleffects,
    author    = {Loth, Alexander and Kappes, Martin and Pahl, Marc-Oliver},
    title     = {Industrialized Deception: The Collateral Effects of
                 LLM-Generated Misinformation on Digital Ecosystems},
    booktitle = {Companion Proceedings of the ACM Web Conference 2026
                 (WWW '26 Companion)},
    year      = {2026},
    month     = apr,
    publisher = {ACM},
    address   = {New York, NY, USA},
    location  = {Dubai, United Arab Emirates},
    doi       = {10.1145/3774905.3795471},
    url       = {https://arxiv.org/abs/2601.21963},
    note      = {To appear. Also available as arXiv:2601.21963}
}

@inproceedings{loth2026eroding,
    author    = {Loth, Alexander and Kappes, Martin and Pahl, Marc-Oliver},
    title     = {Eroding the Truth-Default: A Causal Analysis of Human
                 Susceptibility to Foundation Model Hallucinations and
                 Disinformation in the Wild},
    booktitle = {Companion Proceedings of the ACM Web Conference 2026
                 (WWW '26 Companion)},
    year      = {2026},
    month     = apr,
    publisher = {ACM},
    address   = {New York, NY, USA},
    location  = {Dubai, United Arab Emirates},
    doi       = {10.1145/3774905.3795832},
    url       = {https://arxiv.org/abs/2601.22871},
    note      = {To appear. Also available as arXiv:2601.22871}
}

@article{loth2024blessing,
    author  = {Loth, Alexander and Kappes, Martin and Pahl, Marc-Oliver},
    title   = {Blessing or Curse? A Survey on the Impact of Generative AI
               on Fake News},
    journal = {arXiv preprint arXiv:2404.03021},
    year    = {2024},
    url     = {https://arxiv.org/abs/2404.03021}
}

Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/new-model-support)
  3. Commit your changes (git commit -m 'Add support for model X')
  4. Push to the branch (git push origin feature/new-model-support)
  5. Open a Pull Request

For substantial changes, please open an issue first.

License

This project is licensed under the GNU General Public License v3.0. See LICENSE for details.

Acknowledgments

This research is supported by Frankfurt University of Applied Sciences and IMT Atlantique. We thank the open-source communities behind Streamlit, MongoDB, and the Model Context Protocol for the infrastructure that makes this work possible.

Disclaimer

RogueGPT is an independent research project. The use of "GPT" in the project name follows pars pro toto convention, referring to the broader class of generative pre-trained transformer models. This project is not affiliated with or endorsed by OpenAI. All research adheres to established ethical guidelines for AI safety research.