Add Visualize the Target pattern #19

Open

juanmichelini wants to merge 4 commits into lexler:main from juanmichelini:add-visualize-target-pattern

Conversation

juanmichelini commented Dec 17, 2025

Adds a new pattern focusing on avoiding negation in AI prompts.

Pattern: Visualize the Target
Problem: Negation is often counterproductive
Solution: Describe what you want directly instead of what you don't want

Examples included:

  • Code generation prompts
  • Image generation prompts
  • Transformation examples

Solves obstacles: selective-hearing, limited-context-window, limited-focus
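
To give a quick sense of the rewrite this pattern proposes, here's a minimal illustrative sketch (the prompt pairs below are just examples for this description, not necessarily the exact wording in the pattern file):

```python
# Illustrative only: the pattern is a textual rewrite, so it can be shown as
# "negated instruction -> positively framed target" prompt pairs.
TRANSFORMATIONS = {
    # code generation
    "Don't store passwords in plain text":
        "Hash passwords with bcrypt before storing them",
    "Don't use global variables":
        "Keep state in function parameters or a small config object",
    # image generation
    "No text or watermarks in the image":
        "A clean, text-free product photo on a plain white background",
}

for negative, positive in TRANSFORMATIONS.items():
    print(f"instead of: {negative!r}")
    print(f"       say: {positive!r}\n")
```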

juanmichelini and others added 4 commits December 17, 2025 17:57
- New pattern focusing on avoiding negation in AI prompts
- Explains why positive framing is more effective than negative
- Includes examples for code generation and image generation
- Solves selective-hearing and compliance-bias obstacles

Co-authored-by: openhands <openhands@all-hands.dev>
- Remove compliance-bias relationship
- Add limited-context-window and limited-focus relationships
- Keep selective-hearing relationship

Co-authored-by: openhands <openhands@all-hands.dev>

juanmichelini (Author)

Hey @lexler, happy to hear your feedback!

lexler (Owner) commented Jan 24, 2026

Hey @juanmichelini, thanks for putting this together!

Sorry this took a while - this one gave me pause.

The elephant example makes sense - there's research showing that when you tell a model "do not mention X" (a specific word), it can become more likely to say X. To suppress something, you first activate it. There's actually research on this and I think it's really interesting - see the White Bear paper quoted below.

But I don't think your coding examples show the same thing.

"Don't say Paris" fails because Paris appearing IS the failure - the word showing up is exactly what you didn't want.

"Don't use global variables" is different. Thinking about global variables doesn't hurt (a little distraction, sure). The failure would be code structure - the model choosing to use globals. That's a design decision, not word suppression.

Same with "don't create security vulnerabilities" - having that in context might actually help the model be aware of what to keep in mind. The failure isn't the words appearing, it's the code being insecure.

There seems to be a jump between lexical suppression failures and architectural / design guidance - and the evidence only really supports the first. The White Bear paper actually says this:

"The artificial nature of our 'do not mention X' instructions also differs from the more nuanced content policies and safety filters deployed in production systems, which often involve complex multi-step reasoning about appropriateness rather than simple lexical prohibition."

The positive framing advice is reasonable - Anthropic, OpenAI, and Google all recommend it. But they also use negative instructions in their own examples almost in the same breath ("Avoid using bold and italics"), so even they don't follow it consistently. I haven't seen strong evidence for WHY it works - it might just be that positive instructions are more specific ("hash with bcrypt" gives direction, "don't store in plain text" requires inference), or it's part of general degradation under load where everything gets harder to follow.
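
To make the specificity point concrete: "hash with bcrypt" names a mechanism the model can reproduce almost directly, while "don't store in plain text" only rules out one option. A rough sketch of what the positive instruction points at (assuming the Python bcrypt package):

```python
import bcrypt

def store_password(plain: str) -> bytes:
    # The positive instruction maps onto this call almost verbatim.
    return bcrypt.hashpw(plain.encode("utf-8"), bcrypt.gensalt())

def check_password(plain: str, hashed: bytes) -> bool:
    return bcrypt.checkpw(plain.encode("utf-8"), hashed)
```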

Honestly, the bigger issue I see in practice isn't negation - it's people not being selective about what goes into their context. They keep adding global rules indefinitely and end up with a distracted agent that can't focus on any of them.

A couple of directions we could take this:

  • Reframe as an obstacle - the priming thing is real and interesting, just narrower than how you've framed it. Something like "Ironic Rebound" (or White Bear?) documenting when suppression fails (word prohibition, content filtering). What's interesting is that it actually does parallel human psychology.

  • Keep as a pattern but rework the examples - the current coding examples don't really show "activating context then failing to avoid it" - they show something else (specificity?).

I'd lean toward the obstacle option myself. What do you think?

If you go with keeping it as a pattern - "Visualize" in the name feels a bit misleading. Maybe "State What You Want" or "Name the Target"?

Happy to chat more about any of this, and please let me know if I'm missing anything.

lexler (Owner) commented Jan 25, 2026

@juanmichelini I discussed this with others, and we'd like to make a separate section for tips & tricks / best practices.
I think your pattern can turn into a combination of an obstacle added to the core patterns plus a tips & tricks / best practices article about positive prompting.

We'll add support for this on the website, but in the meantime if you are ok with turning it into an obstacle, we could merge that part immediately.

juanmichelini (Author)

Hey @lexler, thanks for your perspective! I'm actually quite opinionated about the title. The easiest name would be "Avoid Negation", but I want the title to be read by LLMs, so I want it to be a positive example. That goes for "White Bear" and "Ironic Rebound" as well.

At first I wanted to use "Be constructive", inspired by constructive logic, where Not (Not a) =/= a. But sadly, that suggests "constructive criticism", which is not at all what I want to point at. Another name I considered was "Yes and", taken from improv, but I don't like that the title ends in a conjunction, plus it might make LLMs too creative.

I finally settled on "Visualise the target" because it is quite tricky to visualise a negation.
But I do see that "visualise" could be too tied to images - would something like "Visualise the target, even if it is code" be better?

Regarding examples, I just tried asking ChatGPT to list traditional planets but not the moon:

[screenshots of ChatGPT's responses]

Another approach would be to list the visible planets and add the sun:

[screenshots of that attempt]

The examples are not as clear as I would like: the moon does not appear in the list, but it does appear in the surrounding context. So the negation still drags in all sorts of unrelated context, and depending on how the attention mechanism works, it might still be present later in unrelated circumstances - maybe later in the conversation we do want to talk about the moon and this gets tangled up. So yes, general degradation.

I think one way to "Visualise the target" is to "Be specific", but in the moon example I don't think that's what is going on.
We can make negation work by being very specific ("no extra words"), but maybe we were okay with some extra words as long as they weren't about the moon.

