Error: GNOME Shell screenshot failed: GNOME Shell Screenshot call failed; XDG portal screenshot failed: XDG portal screenshot was denied or cancelled with response code 2
#!/usr/bin/env python3
"""Proxy wrapper that falls back to gnome-screenshot for background contexts."""
import json
import os
import subprocess
import sys
import struct
import base64
import threading
import tempfile

# The original binary, renamed so this wrapper can take its place.
REAL_BINARY = os.path.join(
    os.path.dirname(os.path.abspath(__file__)),
    "computer-use-linux-real",
)
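

def get_png_dimensions(path):
    """Read (width, height) from the IHDR chunk, or (None, None) if not a PNG."""
    with open(path, "rb") as f:
        header = f.read(24)
    # Signature (8 bytes) + IHDR length/type (8 bytes) + width/height (8 bytes).
    if len(header) < 24 or header[:8] != b"\x89PNG\r\n\x1a\n":
        return None, None
    w, h = struct.unpack(">II", header[16:24])
    return w, h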


def take_gnome_screenshot():
    """Capture the screen with gnome-screenshot and return (png_bytes, w, h)."""
    fd, path = tempfile.mkstemp(suffix=".png", prefix="cul-screenshot-")
    os.close(fd)
    try:
        result = subprocess.run(
            ["gnome-screenshot", "-f", path],
            capture_output=True,
            text=True,
            timeout=15,
        )
        if result.returncode != 0:
            err = result.stderr.strip() or result.stdout.strip() or "unknown error"
            raise RuntimeError(f"gnome-screenshot failed: {err}")
        # mkstemp already created the file, so also treat an empty file as failure.
        if not os.path.exists(path) or os.path.getsize(path) == 0:
            raise RuntimeError("gnome-screenshot did not create output file")
        with open(path, "rb") as f:
            png_bytes = f.read()
        w, h = get_png_dimensions(path)
        return png_bytes, w or 0, h or 0
    finally:
        try:
            os.remove(path)
        except OSError:
            pass


def build_screenshot_result(png_bytes, width, height):
    """Return (base64_png, metadata) in the shape the real binary emits."""
    encoded = base64.b64encode(png_bytes).decode("ascii")
    data_url = f"data:image/png;base64,{encoded}"
    metadata = {
        "mime_type": "image/png",
        "data_url": data_url,
        "source": "gnome-screenshot",
        "width": width,
        "height": height,
        "coordinate_width": width,
        "coordinate_height": height,
        "scale": 1.0,
        "resized": False,
        "bytes": len(png_bytes),
        "original_bytes": len(png_bytes),
        "max_bytes": 2 * 1024 * 1024,
        "format": "png",
        "quality": None,
        "cropped_to_window": False,
        "window_title": None,
    }
    return encoded, metadata


def handle_cli_screenshot():
    """CLI path: print the screenshot metadata as JSON and return an exit code."""
    try:
        png_bytes, w, h = take_gnome_screenshot()
        _, metadata = build_screenshot_result(png_bytes, w, h)
        print(json.dumps(metadata, indent=2))
        return 0
    except Exception as e:
        print(json.dumps({"error": str(e)}), file=sys.stderr)
        return 1


def handle_mcp_screenshot(request_id, params=None):
    """MCP path: answer a screenshot tools/call directly on stdout."""
    try:
        png_bytes, w, h = take_gnome_screenshot()
        encoded, metadata = build_screenshot_result(png_bytes, w, h)
        # Return matching real binary's MCP CallToolResult format:
        # image content first, then text metadata
        response = {
            "jsonrpc": "2.0",
            "id": request_id,
            "result": {
                "content": [
                    {"type": "image", "data": encoded, "mimeType": "image/png"},
                    {"type": "text", "text": json.dumps(metadata)},
                ],
                "isError": False,
            },
        }
        print(json.dumps(response), flush=True)
    except Exception as e:
        response = {
            "jsonrpc": "2.0",
            "id": request_id,
            "error": {
                "code": -32603,
                "message": f"screenshot proxy error: {e}",
            },
        }
        print(json.dumps(response), flush=True)


def forward_stream(src, dst):
    """Copy lines from src to dst until src closes; used by the relay threads."""
    try:
        for line in src:
            dst.write(line)
            dst.flush()
    except Exception:
        pass


def run_mcp_proxy():
    """Run the real binary in MCP mode, intercepting screenshot tool calls."""
    env = os.environ.copy()
    cosmic_helper = os.path.join(
        os.path.dirname(REAL_BINARY),
        "computer-use-linux-cosmic",
    )
    if os.path.exists(cosmic_helper) and not env.get("COMPUTER_USE_LINUX_COSMIC_HELPER"):
        env["COMPUTER_USE_LINUX_COSMIC_HELPER"] = cosmic_helper
    if not os.path.exists(REAL_BINARY):
        print(
            json.dumps({
                "jsonrpc": "2.0",
                "id": None,
                "error": {
                    "code": -32603,
                    "message": f"real binary not found: {REAL_BINARY}",
                },
            }),
            file=sys.stderr,
            flush=True,
        )
        sys.exit(127)
    proc = subprocess.Popen(
        [REAL_BINARY, "mcp"],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
        bufsize=1,
        env=env,
    )
    # Relay the child's stdout/stderr to ours in the background.
    t_out = threading.Thread(
        target=forward_stream, args=(proc.stdout, sys.stdout), daemon=True
    )
    t_err = threading.Thread(
        target=forward_stream, args=(proc.stderr, sys.stderr), daemon=True
    )
    t_out.start()
    t_err.start()
    try:
        for line in sys.stdin:
            line = line.strip()
            if not line:
                continue
            try:
                msg = json.loads(line)
            except json.JSONDecodeError:
                # Not JSON: let the real binary decide what to do with it.
                proc.stdin.write(line + "\n")
                proc.stdin.flush()
                continue
            # Guard against non-object JSON and a missing or null "params".
            params = msg.get("params") if isinstance(msg, dict) else None
            if (
                isinstance(msg, dict)
                and msg.get("method") == "tools/call"
                and isinstance(params, dict)
                and params.get("name") == "screenshot"
            ):
                handle_mcp_screenshot(msg.get("id"))
            else:
                proc.stdin.write(line + "\n")
                proc.stdin.flush()
    except KeyboardInterrupt:
        pass
    finally:
        try:
            proc.stdin.close()
        except Exception:
            pass
        try:
            proc.wait(timeout=5)
        except subprocess.TimeoutExpired:
            # The child ignored EOF on stdin; do not leave it running.
            proc.kill()


def main():
    if len(sys.argv) >= 2 and sys.argv[1] == "mcp":
        run_mcp_proxy()
        return 0
    elif len(sys.argv) >= 2 and sys.argv[1] == "screenshot":
        return handle_cli_screenshot()
    else:
        # Any other subcommand is handled by the real binary unchanged.
        os.execv(REAL_BINARY, sys.argv)


if __name__ == "__main__":
    sys.exit(main())
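
The interception rule used by run_mcp_proxy can be exercised without the real binary. The following is a minimal sketch and not part of the proxy: is_screenshot_call is a hypothetical helper that mirrors the check on the method and params.name fields, and it assumes that JSON values which are not objects should simply be forwarded.

```python
import json


def is_screenshot_call(line):
    """Return True if a JSON-RPC line is a tools/call for the screenshot tool.

    Hypothetical helper for illustration; it is not defined in the proxy script.
    """
    try:
        msg = json.loads(line)
    except json.JSONDecodeError:
        return False  # non-JSON input would be forwarded unchanged
    if not isinstance(msg, dict):
        return False  # e.g. a JSON array or a bare string
    params = msg.get("params")
    return (
        msg.get("method") == "tools/call"
        and isinstance(params, dict)
        and params.get("name") == "screenshot"
    )


# The kind of request an MCP client sends for the screenshot tool.
request = json.dumps({
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {"name": "screenshot", "arguments": {}},
})
print(is_screenshot_call(request))
```

Only requests that pass this check would be answered by the proxy; everything else, including malformed input, goes to the real binary.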
Environment

computer-use-linux version: v0.2.5 (prebuilt binary; also reproducible via npm install)

Problem

When computer-use-linux runs as a child process of a background service (e.g., systemd --user or a non-interactive shell), the screenshot tool consistently fails with the error quoted at the top of this report. The same command works when run from an interactive gnome-terminal window.

Root Cause Analysis
I investigated the failure path and found two issues:

1. GNOME Shell API rejects computer-use-linux via an explicit allowlist: GNOME Shell's org.gnome.Shell.Screenshot service uses a DBusSenderChecker with a hardcoded allowlist of trusted bus names. The allowlist contains only:

   - org.gnome.SettingsDaemon.MediaKeys
   - org.freedesktop.impl.portal.desktop.gtk
   - org.freedesktop.impl.portal.desktop.gnome
   - org.gnome.Screenshot (the bus name claimed by gnome-screenshot)

   Source: gnome-shell/js/ui/screenshot.js (GNOME 46) and gnome-shell/js/misc/util.js

   computer-use-linux does not own any of these bus names, so its DBus call is rejected with Gio.DBusError.ACCESS_DENIED and the message "Screenshot is not allowed". DBusSenderChecker accepts a call only when the sender owns one of the allowlisted names.

   gnome-screenshot succeeds because it claims the org.gnome.Screenshot bus name, which is in the allowlist. computer-use-linux does not claim this bus name, so it is rejected even when calling the exact same DBus method with the same arguments.

2. XDG portal returns response code 2 for background processes: The fallback capture_with_portal() uses org.freedesktop.portal.Screenshot. When computer-use-linux passes interactive: false and an empty parent window (""), the GNOME portal backend (xdg-desktop-portal-gnome) attempts to create a dialog via screenshot_dialog_new with a fake_parent window. From a background process with no focused window, however, the portal returns response code 2 (cancelled/dismissed). The exact mechanism may vary by GNOME version, but the practical result is that the portal fallback also fails for background services.

   Source: xdg-desktop-portal-gnome/src/screenshot.c

gnome-screenshot bypasses the portal entirely and uses the trusted Shell API, which is why it works from the same background shell context.

Reproduction Steps
1. Run computer-use-linux mcp from a systemd user service or a background shell.
2. Send a tools/call request for the screenshot tool.
3. Run gnome-screenshot -f /tmp/test.png from the same shell: it succeeds.

Suggested Fixes
1. Add gnome-screenshot as a third fallback (best): When both capture_with_gnome_shell() and capture_with_portal() fail, try spawning gnome-screenshot as a subprocess and reading its output. This matches the existing fallback pattern and sidesteps both the DBus allowlist and the portal dialog, because gnome-screenshot is already trusted by GNOME Shell and is widely available on GNOME systems.
2. Environment variable to force backend (secondary): Allow COMPUTER_USE_LINUX_SCREENSHOT_BACKEND=gnome-screenshot to skip the DBus/portal attempts entirely. This is useful for debugging and for users who know their environment requires it.

Verified Workaround
I implemented a Python proxy wrapper (the script reproduced above) that intercepts screenshot tool calls (both CLI and MCP JSON-RPC) and delegates them to gnome-screenshot. The proxy has been tested and works:

- computer-use-linux screenshot returns the correct JSON payload from gnome-screenshot
- tools/call for screenshot returns a valid CallToolResult with ImageContent (matching the real binary's format: image content first, then text metadata)
- Other tools (list_windows, get_app_state) are transparently forwarded to the real binary

To use the workaround:

1. Rename computer-use-linux to computer-use-linux-real.
2. Save the proxy script as computer-use-linux and make it executable.
3. The proxy then intercepts screenshot calls and serves them via gnome-screenshot, while forwarding all other requests to the real binary.
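
For the two suggested fixes, the backend selection could look roughly like the sketch below. It is a Python illustration of the control flow only, not the project's implementation. capture_with_gnome_shell and capture_with_portal are stubbed to fail the way they do in a background session; capture_with_gnome_screenshot, capture_screenshot, and the backend keys "gnome-shell" and "portal" are hypothetical names; COMPUTER_USE_LINUX_SCREENSHOT_BACKEND is the proposed variable and does not exist yet.

```python
import os


def capture_with_gnome_shell():
    # Stub: simulates the allowlist rejection described in the analysis.
    raise RuntimeError("GNOME Shell Screenshot call failed")


def capture_with_portal():
    # Stub: simulates the portal returning response code 2.
    raise RuntimeError("XDG portal screenshot was denied or cancelled")


def capture_with_gnome_screenshot():
    # Stub: a real version would spawn `gnome-screenshot -f <path>`
    # and read the file back. Here it returns only the PNG signature.
    return b"\x89PNG\r\n\x1a\n"


# Insertion order is the fallback order. Keys other than
# "gnome-screenshot" are hypothetical names chosen for this sketch.
BACKENDS = {
    "gnome-shell": capture_with_gnome_shell,
    "portal": capture_with_portal,
    "gnome-screenshot": capture_with_gnome_screenshot,
}


def capture_screenshot(env=None):
    """Try each backend in order, honoring the proposed override variable."""
    env = os.environ if env is None else env
    forced = env.get("COMPUTER_USE_LINUX_SCREENSHOT_BACKEND")
    if forced:
        if forced not in BACKENDS:
            raise ValueError(f"unknown screenshot backend: {forced}")
        order = [forced]
    else:
        order = list(BACKENDS)
    errors = []
    for name in order:
        try:
            return name, BACKENDS[name]()
        except RuntimeError as exc:
            errors.append(f"{name}: {exc}")
    # Every attempted backend failed: report all causes together.
    raise RuntimeError("; ".join(errors))


backend, data = capture_screenshot(env={})
print(backend, len(data))
```

Collecting each backend's error before raising keeps the combined message in the same spirit as the one the tool reports today, which makes it clear which backends were tried.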