feat: A tool to run R code by gadenbuie · Pull Request #126 · posit-dev/btw

gadenbuie · 2025-11-26T20:46:04Z

Closes #118

Summary

This PR adds btw_tool_run_r(), a tool that executes R code in the global environment and returns the results to the LLM. I've marked the tool Experimental for now.

What it captures

The tool captures and returns:

Text output from print(), cat(), etc.
Plots as inline images
Messages from message()
Warnings from warning()
Errors from stop()

When an error occurs, all output up to the error is returned.

The tool makes use of recent changes in ellmer v0.4.0 that allow tools to return lists of Content, including ContentImage types from plots. Each of the above output types is given a btw-local content type, which is also used to customize its display in shinychat.
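
To make the capture behavior concrete, here is a rough base-R sketch of collecting output, messages, warnings and errors in order. This is not the code in this PR (we use evaluate for that); capture_run() and its list-of-events return shape are hypothetical and only illustrate the idea.

```r
# Hypothetical helper, not part of btw: run code one top-level expression
# at a time and record what each expression emits.
capture_run <- function(code) {
  events <- list()
  add <- function(type, text) {
    events[[length(events) + 1]] <<- list(type = type, text = text)
  }
  env <- new.env(parent = globalenv())
  for (piece in parse(text = code)) {
    ok <- tryCatch(
      withCallingHandlers(
        {
          # Capture printed output; auto-print visible values like the console
          out <- utils::capture.output(res <- withVisible(eval(piece, env)))
          if (res$visible) out <- c(out, utils::capture.output(print(res$value)))
          if (length(out)) add("output", paste(out, collapse = "\n"))
          TRUE
        },
        message = function(m) {
          add("message", sub("\n$", "", conditionMessage(m)))
          invokeRestart("muffleMessage")
        },
        warning = function(w) {
          add("warning", conditionMessage(w))
          invokeRestart("muffleWarning")
        }
      ),
      error = function(e) {
        # Stop at the first error but keep everything recorded so far
        add("error", conditionMessage(e))
        FALSE
      }
    )
    if (!ok) break
  }
  events
}

events <- capture_run('cat("hi\\n")\nmessage("note")\n1 + 1\nstop("boom")')
vapply(events, function(e) e$type, character(1))
```

Note that this sketch only preserves ordering between top-level expressions, and output printed inside the expression that errors is lost. It also skips plots entirely.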

Security

This tool is disabled by default. It executes arbitrary code in the global environment without sandboxing or review. We recommend:

Only enable in trusted, non-public environments
Avoid prompting the model to take destructive actions
Understand the security risks before enabling

To enable the tool

The tool can be enabled via an R option (in a session, an .Rprofile, or in btw.md):

options(btw.run_r.enabled = TRUE)

---
options:
  run_r:
    enabled: true
---

Or equivalently via environment variable:

BTW_RUN_R_ENABLED=true

When this option is set, btw_tools() includes the btw_tool_run_r() tool; otherwise the tool is excluded from btw_tools().
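
For readers wondering how an option with an environment variable fallback is typically resolved, here is a minimal sketch. run_r_enabled() is a hypothetical helper, not a btw function, and the precedence shown (R option first, then environment variable) is an assumption rather than something this PR guarantees.

```r
# Hypothetical helper, not part of btw: resolve the opt-in setting.
# Assumes the R option wins over the environment variable when both are set.
run_r_enabled <- function() {
  opt <- getOption("btw.run_r.enabled")
  if (!is.null(opt)) {
    return(isTRUE(as.logical(opt)))
  }
  # Unset or unparseable values become NA, which isTRUE() treats as FALSE
  isTRUE(as.logical(Sys.getenv("BTW_RUN_R_ENABLED", unset = "")))
}

Sys.setenv(BTW_RUN_R_ENABLED = "true")
run_r_enabled()
```

Anything other than an explicit true value leaves the tool disabled, which matches the opt-in intent described above.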

In btw_tools(), you can also explicitly include the "run", "run_r" or "btw_tool_run_r" tool in tools:

btw_tools(tools = "run_r")

Or in btw.md:

---
tools:
  - run_r
---

Dependencies

This feature adds a few additional suggested dependencies.

We use evaluate for running and evaluating the LLM-written code
If fansi is available, we use it to translate ANSI colors to HTML
If ragg is available, we use it as the plot rendering device. The plot device can be customized by providing a function via the R option btw.run_r.graphics_device.
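
The "if available" wording above follows the usual pattern for suggested packages: check with requireNamespace() and fall back when the package is missing. The sketch below shows that pattern for ANSI text. ansi_to_html() is a hypothetical name, and the regex fallback that strips color codes is an assumption for illustration, not necessarily what btw does without fansi.

```r
# Hypothetical helper, not part of btw: convert ANSI-styled text to HTML
# when fansi is installed, otherwise drop the styling escape codes.
ansi_to_html <- function(x) {
  if (requireNamespace("fansi", quietly = TRUE)) {
    return(fansi::to_html(x))
  }
  # Fallback: remove SGR sequences such as ESC[31m and ESC[0m
  gsub("\033\\[[0-9;]*m", "", x)
}

ansi_to_html("\033[31mhello\033[0m")
```

Either branch returns a string with no raw escape sequences, so downstream display code does not need to know which path was taken.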

simonpcouch

So sharp. Interleaving that output is so nice. This looks great!

Models' "default" Amount Of Code Written when calling this tool still feels too long to me. This leads to issues like the one below, where the model hallucinates column names in data it hasn't seen yet because it didn't stop to examine the output of glimpse(forested). We've prompted PA/Databot/side::kick() in the same way in their analogous tools for this reason.

If you still disagree, okay with me that maybe this is a matter of preference best resolved in our btw.mds. :)

simonpcouch · 2025-12-12T14:56:15Z

Is there any way we could suppress package startup messages by default in the tool UI?

gadenbuie · 2025-12-12T15:11:14Z

Models' "default" Amount Of Code Written when calling this tool still feels too long to me. This leads to issues like the one below, where the model hallucinates column names in data it hasn't seen yet because it didn't stop to examine the output of glimpse(forested). We've prompted PA/Databot/side::kick() in the same way in their analogous tools for this reason.

@simonpcouch ah that's a great point; thanks for showing me the example. I didn't take those lines initially because there are some shinychat limitations we need to fix around how the tool UI works when you're streaming in results. So I wasn't sure if I wanted to create a situation where the model tries to run code in too small of a chunk and ends up making that problem worse. I think I'll go back and add those lines in though after seeing your example.

I'll look into suppressing package startup messages too!

gadenbuie added 27 commits November 25, 2025 07:48

feat: evaluate tool

4ee39fd

feat: Use new_output_handler() pattern

ad727b8

chore: Add icon, improve name

61aa3f3

chore: document()

b504d5d

feat: Add rich evaluation tool output UI

6be186b

rename: tool-run.R

ac9fe4e

chore: update docs and tools, etc.

3049d9f

limit height of source code block

fb9db6b

chore: use class btw-run-source instead of btw-run-code

e081bfa

chore: general fixups and review

237956b

tests: fix plot image test

290896f

fix: dial in plot sizing and device

ceec3f7

docs: fix arg

f2dd119

feat: Use fansi for ANSI-styled text, drop intermediate plots

52cbcb2

feat: Use CSS tied to Bootstrap for ANSI colors

4647aef

chore: Use .bslib-page-dashboard class

6223653

feat: Tool is opt-in

83c2b33

chore: build ignore local dev things

1836910

chore: app styles

0a1b3f2

chore: improve cli settings

8823282

tests: fix checking of all tools

900446e

chore: Add NEWS item

ec08f97

docs: Document and address security implications

57cfd2a

chore: use fansi (suggests)

d7f8b6f

Merged origin/main into feat/evaluate-tool

b52bcc6

fix(app): Fix registration of run tool in app

e3b2169

feat(tool-run): return something to the LLM if nothing is printed

535d05d

gadenbuie marked this pull request as ready for review December 9, 2025 13:54

gadenbuie added 10 commits December 11, 2025 13:33

feat: interleaved source/output

89fe30c

feat: Add copy source code button

c26abb9

chore: fix extra leading/trailing newlines,

74ff8ab

don't strip all whitespace, just the ones around the edges that make things look weird

chore: clean up tooltips if the result card is removed

5c8dcb3

this is important because the tool card is re-rendered frequently when streaming in a result

feat: copy code+result in reprex style

f7694f6

chore: copy button icon

2dd7c1f

refactor: pull out icons into a separate file

551400a

these are wasted tokens if using coding assistants to edit the main js

chore: Take suggestions about tool instructions

cb1022c

chore: hide code copy buttons in output

d3c20e3

chore: remove docs for internal function

42ad141

simonpcouch approved these changes Dec 12, 2025

tests(tool-run): Fix tests

f8f6659

gadenbuie added 11 commits December 12, 2025 11:31

chore(tool-run): Tweak tool description prompt

48ff48a

feat: copy-to-clipboard in positron

bcdb006

chore: tweak css

9de80ea

feat(tool-run): Add option to set plot dimensions/size

3aa8dab

feat(tool-run): Prevent running R code from changing working dir, options or envvars

b606290

chore(tool-run): Let model know that wd, opts, envvars are restored

ed07a84

ci: why isn't duckdb installing via binaries?

f9bc896

ci: try again

46acc25

ci: install into R_LIBS_SITE

45f6388

ci: try again

7ae8024

ci: just let it take longer

5b63e7a

gadenbuie mentioned this pull request Dec 15, 2025

feat: devtools tools #133

Merged

gadenbuie merged commit ad93e2d into main Dec 15, 2025
11 checks passed

gadenbuie deleted the feat/evaluate-tool branch December 15, 2025 15:52

gadenbuie restored the feat/evaluate-tool branch December 15, 2025 15:53

gadenbuie deleted the feat/evaluate-tool branch January 5, 2026 14:18

gadenbuie merged 50 commits into main from feat/evaluate-tool