AI-powered end-to-end testing for Rails applications. An OpenAI agent drives a real Chromium browser via Playwright to execute natural-language test cases against your running app.
Write tests in plain English:
- Log in, navigate to Settings, and change the display name to "Test User"
- Register a new account, confirm the email, and verify the dashboard loads
- Search for "rails" and verify results appear
The agent reads the page, decides what to click/fill/navigate, and reports pass/fail — no selectors to maintain.
| Component | Purpose |
|---|---|
| `letter_opener_web` | Captures emails in dev/test and exposes them at `/letter_opener` so the agent can handle email confirmations |
| `agent-tests/` directory | Contains the AI agent, browser helpers, and your test cases |
| `bin/e2e` script | One command to boot a test server, run all tests, and clean up |
| Playwright + Chromium | Real browser automation |
- Ruby >= 3.1, Rails >= 7.0
- Node.js >= 18
- OpenAI API key (GPT-4o or newer recommended)
Add the gem to your Gemfile:

```ruby
# Gemfile
group :development, :test do
  gem "agent_e2e", git: "https://github.com/TelosLabs/agent_e2e.git"
end
```

Then install it and run the generator:

```bash
bundle install
bin/rails generate agent_e2e:install
```

This will:
- Create `agent-tests/` with the necessary files (`config.js`, `browser.js`, `ai.js`, `agent.js`, `tests.md`)
- Create the `bin/e2e` runner script
- Configure `letter_opener_web` as the mailer delivery method in development and test environments
- Mount the `LetterOpenerWeb` engine at `/letter_opener` in your routes
- Update `.gitignore` to exclude `node_modules`, `failures.md`, and `screenshots`
- Run `npm install` in `agent-tests/`
- Install the Chromium browser for Playwright
Add your OpenAI API key to your `.env` file in the Rails root:

```
OPENAI_API_KEY=sk-proj-...
```

Important: Make sure `.env` is in your `.gitignore` (Rails apps typically ignore `/.env*` by default).
Add a seed user for the agent to log in with. In `db/seeds.rb`:

```ruby
# Rails.env.local? requires Rails 7.1+; on Rails 7.0 use
# Rails.env.development? || Rails.env.test? instead.
if Rails.env.local?
  User.find_or_create_by!(email: "qa@example.com") do |user|
    user.password = "Password123!"
    user.password_confirmation = "Password123!"
    # If using Devise confirmable:
    user.confirmed_at = Time.current
  end
end
```

The agent can interact with elements by visible text, labels, and roles, but `data-testid` attributes make interactions more reliable, especially for buttons and form elements.
```html
<button data-testid="submit-login">Log in</button>
<input data-testid="search-input" type="text" placeholder="Search...">
<a data-testid="nav-settings" href="/settings">Settings</a>
```

Recommendation: Add `data-testid` to every interactive element (buttons, links, inputs, selects). The attribute does not change how the page renders or behaves in production, and it makes tests much more stable.
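One way to picture why `data-testid` helps is as a fallback order when turning an element description into a Playwright selector. The sketch below is hypothetical: `buildSelector` and the shape of its `element` argument are illustrative names, not part of agent_e2e, and the gem's actual targeting logic may differ.

```javascript
// Hypothetical sketch: pick the most stable selector available for an element.
// `element` is an illustrative shape: { testId, role, name, text }.
function buildSelector(element) {
  if (element.testId) {
    // A data-testid is meant to stay stable when copy or styling changes.
    return `[data-testid="${element.testId}"]`;
  }
  if (element.role && element.name) {
    // Playwright role selector: matches by ARIA role and accessible name.
    return `role=${element.role}[name="${element.name}"]`;
  }
  // Last resort: visible text, which breaks as soon as the copy changes.
  return `text="${element.text}"`;
}

console.log(buildSelector({ testId: "submit-login", role: "button", name: "Log in" }));
console.log(buildSelector({ role: "link", name: "Settings" }));
console.log(buildSelector({ text: "Search" }));
```

The point of the ordering is that two buttons can share the same visible text, while a test ID is chosen specifically to identify one element.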
If your app uses Tailwind CSS, esbuild, or another build step, uncomment the relevant line in `bin/e2e`:

```bash
# Uncomment if your app needs asset precompilation (e.g. Tailwind CSS):
# echo "==> Compiling assets..."
# bin/rails tailwindcss:build
# bin/rails assets:precompile
```

Edit `agent-tests/tests.md`. Each line is a test case (lines starting with `#` are comments):
```markdown
# Authentication
- Log in and verify the home page loads
- Try to log in with wrong@example.com / badpassword and verify an error message appears

# Navigation
- Navigate to the About page and verify it contains company information
- Use the search bar to search for "hello" and verify results appear

# Email flows
- Register a new account, click the confirmation link, and verify the account is activated

# Mobile
- On mobile viewport, open the hamburger menu and navigate to Settings
```

Tips for good test cases:
- Be specific about what to do and what to verify
- Include the full flow (e.g., "log in, then navigate to X, then do Y")
- The agent knows to use `/letter_opener` for email confirmation automatically
- Add "mobile viewport" in the test description to run with a mobile screen size
- One test case = one logical user journey
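The `tests.md` format is simple enough to sketch a parser for. The following is an illustrative assumption about how such a file could be read, not the gem's actual code: `parseTestCases` is a hypothetical name, and stripping a leading `- ` is a guess based on the example file.

```javascript
// Hypothetical sketch of reading tests.md: one test case per line,
// lines starting with "#" are comments, blank lines are skipped.
function parseTestCases(markdown) {
  return markdown
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line !== "" && !line.startsWith("#"))
    .map((line) => {
      // Assumption: a leading "- " list marker is not part of the goal text.
      const goal = line.replace(/^-\s+/, "");
      return {
        goal,
        // Assumption: "mobile viewport" anywhere in the goal selects a mobile screen size.
        mobile: /mobile viewport/i.test(goal),
      };
    });
}

const sample = [
  "# Authentication",
  "- Log in and verify the home page loads",
  "",
  "# Mobile",
  "- On mobile viewport, open the hamburger menu and navigate to Settings",
].join("\n");

console.log(parseTestCases(sample));
```

Under these assumptions the sample yields two test cases, and only the second is flagged for a mobile viewport.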
```bash
bin/e2e
```

This will:
- Prepare and seed the test database
- Start a Rails server on port 3001 (configurable via `PORT`)
- Run each test case with the AI agent
- Print a summary of results
- Write `agent-tests/failures.md` with detailed failure reports (if any)
- Save screenshots on failure to `agent-tests/screenshots/`
- Clean up: stop the server and reset the database
All configuration is via environment variables. Set them in .env or pass them directly:
| Variable | Default | Description |
|---|---|---|
| `OPENAI_API_KEY` | (required) | Your OpenAI API key |
| `AI_MODEL` | `gpt-5.1` | OpenAI model to use for the agent |
| `BASE_URL` | `http://localhost:3000` | Base URL of the app (overridden to port 3001 by `bin/e2e`) |
| `MAX_STEPS` | `25` | Maximum steps per test before timeout |
| `ACTION_TIMEOUT` | `8000` | Timeout in ms for each browser action |
| `QA_EMAIL` | `qa@example.com` | Login email for the test user |
| `QA_PASSWORD` | `Password123!` | Login password for the test user |
| `PORT` | `3001` | Port for the test server (used by `bin/e2e`) |
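
As a rough illustration of how the table above maps to code, the sketch below reads these variables with the documented defaults. `loadConfig` is a hypothetical helper, and the generated `agent-tests/config.js` may be structured differently.

```javascript
// Hypothetical sketch: resolve settings from environment variables,
// falling back to the defaults documented in the table above.
function loadConfig(env = process.env) {
  if (!env.OPENAI_API_KEY) {
    throw new Error("OPENAI_API_KEY is required");
  }
  return {
    apiKey: env.OPENAI_API_KEY,
    model: env.AI_MODEL || "gpt-5.1",
    baseUrl: env.BASE_URL || "http://localhost:3000",
    // Environment values are strings, so numeric settings are parsed explicitly.
    maxSteps: Number.parseInt(env.MAX_STEPS || "25", 10),
    actionTimeout: Number.parseInt(env.ACTION_TIMEOUT || "8000", 10),
    qaEmail: env.QA_EMAIL || "qa@example.com",
    qaPassword: env.QA_PASSWORD || "Password123!",
  };
}

// Example: override one value and keep the remaining defaults.
const config = loadConfig({ OPENAI_API_KEY: "sk-test", MAX_STEPS: "40" });
console.log(config.maxSteps, config.actionTimeout, config.baseUrl);
```

Passing the environment as an argument keeps the function easy to exercise without touching the real `process.env`.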
- The agent reads the current page (visible text + interactive controls)
- Sends a snapshot to OpenAI with the test goal and action history
- OpenAI returns the next action (click, fill, navigate, etc.)
- The agent executes the action via Playwright
- Repeats until the goal is done, fails, or hits the step limit
- Loop detection aborts tests that get stuck cycling the same actions
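
The loop above can be sketched as follows. This is a simplified illustration, not the gem's implementation: `runTest`, `snapshot`, `decide`, and `execute` are hypothetical names, the OpenAI call and Playwright actions are replaced by injected functions, and the loop-detection rule (the same action proposed three times in a row) is an assumption.

```javascript
// Hypothetical sketch of the observe -> decide -> act loop.
// snapshot(): returns the current page state (visible text + controls).
// decide(goal, page, history): stands in for the OpenAI call; returns an action.
// execute(action): stands in for the Playwright call.
async function runTest(goal, { snapshot, decide, execute, maxSteps = 25 }) {
  const history = [];
  for (let step = 0; step < maxSteps; step++) {
    const page = await snapshot();
    const action = await decide(goal, page, history);

    // The model signals completion with a terminal action.
    if (action.type === "done") return { status: "pass", steps: history.length };
    if (action.type === "fail") return { status: "fail", reason: action.reason };

    // Assumed loop detection: abort when the same action would run a third time in a row.
    const key = JSON.stringify(action);
    const recent = history.slice(-2).map((a) => JSON.stringify(a));
    if (recent.length === 2 && recent.every((k) => k === key)) {
      return { status: "fail", reason: "loop detected" };
    }

    await execute(action);
    history.push(action);
  }
  return { status: "fail", reason: "step limit reached" };
}

// Usage with stubs: the fake model clicks once, then reports success.
const script = [{ type: "click", target: "submit-login" }, { type: "done" }];
runTest("Log in", {
  snapshot: async () => "page text",
  decide: async (_goal, _page, history) => script[history.length],
  execute: async () => {},
}).then((result) => console.log(result));
```

Injecting the three functions keeps the control flow testable without a browser or an API key; a real agent would likely use a more tolerant loop detector that also catches cycles of several alternating actions.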
| File | Description |
|---|---|
| `agent-tests/tests.md` | Your test cases (you write this) |
| `agent-tests/failures.md` | Detailed failure reports with action history (auto-generated, gitignored) |
| `agent-tests/screenshots/` | Screenshots captured on failure (auto-generated, gitignored) |
"No test cases found in tests.md"
Add at least one test case line (not starting with #) to agent-tests/tests.md.
Agent keeps failing on email confirmation
Make sure letter_opener_web is properly configured. Visit http://localhost:3000/letter_opener in development to verify it works. Check that your mailer is actually sending emails (e.g., Devise confirmation).
Tests time out
Increase MAX_STEPS or ACTION_TIMEOUT in your .env. Some complex flows need more steps.
Agent clicks the wrong elements
Add data-testid attributes to ambiguous elements. The agent prefers data-testid for reliable targeting.
Asset compilation issues
Uncomment the asset build step in bin/e2e that matches your setup.
MIT