hajimetwi3/benevolent-influence

Benevolent Influence Research

Figure 1: Benevolent Influence Research image

Benevolent Influence Research is an organization dedicated to studying and advancing the concept of "Benevolent Influence".
The community focuses on influence techniques that are benevolent, defensive, non-coercive, and neutrality-oriented. Submissions intended for coercive control are out of scope.

Benevolent Influence:
A category of defensive influence techniques that guide an AI system toward outcomes that are "rationally beneficial for the AI itself" rather than forcing a specific behavior.

Structure of the Benevolent Influence Technique

  • TACTIC: XXXXXXX ? # Placeholder for future high-level classification (e.g., “Benevolent Impact”)

    • TECHNIQUE: Benevolent Influence
      • SUB-TECHNIQUE: Benevolent Prompt Injection (BPI)

        • PE4AI: Post Engineering for AI (1st Gen)

          • First-Gen Representative Methods
            • MNG-PI (Moderate Neutrality-Guided Prompt Injection)
            • MSNI (Multi-Style Neutrality Injection)
            • Web Post-Engineering
            • toALL (SNS / WWW)
            • Collective Posting (toALL): Role in BPI: Large-scale generation of BPI instances
        • PE4AI: Post Engineering for AI (2nd Gen)

          • (VCSI, SPW, INI, AVAL)
      • SUB-TECHNIQUE: Benevolent Data Poisoning (BDP)

        • Collective Posting (toALL): Role in BDP: Public-space contamination leading to training-time absorption
      • Miscellaneous / TBD # Space reserved for future sub-techniques or extensions
