@@ -12,7 +12,194 @@ npm run build
 npx playwright install chromium
 ```
 
-## Quick Start
+## Quick Start: Choose Your Abstraction Level
+
+Sentience SDK offers **4 levels of abstraction** - choose based on your needs:
+
+### 💬 Level 4: Conversational Agent (Highest Abstraction) - **NEW in v0.3.0**
+
+Complete automation with natural conversation. Just describe what you want, and the agent plans and executes everything:
+
+```typescript
+import { SentienceBrowser, ConversationalAgent, OpenAIProvider } from 'sentience-ts';
+
+const browser = await SentienceBrowser.create({ apiKey: process.env.SENTIENCE_API_KEY });
+const llm = new OpenAIProvider(process.env.OPENAI_API_KEY!, 'gpt-4o');
+const agent = new ConversationalAgent({ llmProvider: llm, browser });
+
+// Navigate to starting page
+await browser.getPage().goto('https://amazon.com');
+
+// ONE command does it all - automatic planning and execution!
+const response = await agent.execute(
+  "Search for 'wireless mouse' and tell me the price of the top result"
+);
+console.log(response); // "I found the top result for wireless mouse on Amazon. It's priced at $24.99..."
+
+// Follow-up questions maintain context
+const followUp = await agent.chat("Add it to cart");
+console.log(followUp);
+
+await browser.close();
+```
+
+**When to use:** Complex multi-step tasks, conversational interfaces, maximum convenience
+**Code reduction:** 99% less code - describe goals in natural language
+**Requirements:** OpenAI or Anthropic API key
+
+### 🤖 Level 3: Agent (Natural Language Commands) - **Recommended for Most Users**
+
+Zero coding knowledge needed. Just write what you want in plain English:
+
+```typescript
+import { SentienceBrowser, SentienceAgent, OpenAIProvider } from 'sentience-ts';
+
+const browser = await SentienceBrowser.create({ apiKey: process.env.SENTIENCE_API_KEY });
+const llm = new OpenAIProvider(process.env.OPENAI_API_KEY!, 'gpt-4o-mini');
+const agent = new SentienceAgent(browser, llm);
+
+await browser.getPage().goto('https://www.amazon.com');
+
+// Just natural language commands - agent handles everything!
+await agent.act('Click the search box');
+await agent.act("Type 'wireless mouse' into the search field");
+await agent.act('Press Enter key');
+await agent.act('Click the first product result');
+
+// Automatic token tracking
+console.log(`Tokens used: ${agent.getTokenStats().totalTokens}`);
+await browser.close();
+```
+
+**When to use:** Quick automation, non-technical users, rapid prototyping
+**Code reduction:** 95-98% less code vs manual approach
+**Requirements:** OpenAI API key (or Anthropic for Claude)
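
Annotation alongside this hunk, not part of the patch: the Level 3 example issues `act` calls one after another. A minimal sketch of wrapping such a sequence so it stops at the first failed step might look like the following. The `ActCapable` type, `runSteps`, and `fakeAgent` are hypothetical names introduced for illustration; the diff does not show what `SentienceAgent.act` returns or how it reports errors, so this assumes it rejects on failure.

```typescript
// HYPOTHETICAL structural type: anything exposing act(command) fits.
// The real SentienceAgent return type is not shown in this diff.
interface ActCapable {
  act(command: string): Promise<unknown>;
}

interface StepResult {
  command: string;
  ok: boolean;
  error?: string;
}

// Run commands one at a time and stop at the first failure, so later
// steps never act on a page that is in an unexpected state.
async function runSteps(agent: ActCapable, commands: string[]): Promise<StepResult[]> {
  const results: StepResult[] = [];
  for (const command of commands) {
    try {
      await agent.act(command);
      results.push({ command, ok: true });
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      results.push({ command, ok: false, error: message });
      break;
    }
  }
  return results;
}

// Stand-in agent for illustration only: rejects any command containing "fail".
const fakeAgent: ActCapable = {
  async act(command: string): Promise<unknown> {
    if (command.indexOf('fail') !== -1) {
      throw new Error('could not perform: ' + command);
    }
    return undefined;
  },
};
```

With a real agent, the same helper would be called as `runSteps(agent, ['Click the search box', 'Press Enter key'])`, again assuming `act` rejects when a step cannot be completed.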
+
+### 🔧 Level 2: Direct SDK (Technical Control)
+
+Full control with semantic selectors. For technical users who want precision:
+
+```typescript
+import { SentienceBrowser, snapshot, find, click, typeText, press } from 'sentience-ts';
+
+const browser = await SentienceBrowser.create({ apiKey: process.env.SENTIENCE_API_KEY });
+await browser.getPage().goto('https://www.amazon.com');
+
+// Get semantic snapshot
+const snap = await snapshot(browser);
+
+// Find elements using query DSL
+const searchBox = find(snap, 'role=textbox text~"search"');
+await click(browser, searchBox!.id);
+
+// Type and submit
+await typeText(browser, searchBox!.id, 'wireless mouse');
+await press(browser, 'Enter');
+
+await browser.close();
+```
+
+**When to use:** Need precise control, debugging, custom workflows
+**Code reduction:** Still 80% less code vs raw Playwright
+**Requirements:** Only Sentience API key
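
Annotation alongside this hunk, not part of the patch: the Level 2 example passes the query string `role=textbox text~"search"` to `find`. To make the shape of such a query concrete, here is a sketch of how a string like that could be split into terms and matched. The grammar is an assumption (`key=value` for exact match, `key~value` for case-insensitive substring match), and `parseQuery`, `matches`, and `ElementLike` are hypothetical names; the SDK's actual DSL and snapshot types are not shown in this diff and may differ.

```typescript
// HYPOTHETICAL grammar for illustration: space-separated terms of the form
// key=value (exact match) or key~value (case-insensitive substring match).
// Values may be double-quoted. The SDK's real DSL may support more.
type Operator = '=' | '~';

interface Term {
  key: string;
  op: Operator;
  value: string;
}

function parseQuery(query: string): Term[] {
  const terms: Term[] = [];
  // key, operator, then either a quoted value or a run of non-space characters
  const pattern = /(\w+)([=~])(?:"([^"]*)"|(\S+))/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(query)) !== null) {
    const value = match[3] !== undefined ? match[3] : match[4];
    terms.push({ key: match[1], op: match[2] as Operator, value });
  }
  return terms;
}

// Stand-in element shape, not the SDK's snapshot element type.
type ElementLike = { [key: string]: string };

// An element matches only if every term in the query is satisfied.
function matches(element: ElementLike, terms: Term[]): boolean {
  for (const term of terms) {
    const actual = element[term.key];
    if (actual === undefined) return false;
    if (term.op === '=' && actual !== term.value) return false;
    if (term.op === '~' && actual.toLowerCase().indexOf(term.value.toLowerCase()) === -1) {
      return false;
    }
  }
  return true;
}
```

Under these assumptions, `parseQuery('role=textbox text~"search"')` yields two terms, and an element with role `textbox` and text `Search Amazon` satisfies both.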
+
+### ⚙️ Level 1: Raw Playwright (Maximum Control)
+
+For when you need complete low-level control (rare):
+
+```typescript
+import { chromium } from 'playwright';
+
+const browser = await chromium.launch();
+const page = await browser.newPage();
+await page.goto('https://www.amazon.com');
+await page.fill('#twotabsearchtextbox', 'wireless mouse');
+await page.press('#twotabsearchtextbox', 'Enter');
+await browser.close();
+```
+
+**When to use:** Very specific edge cases, custom browser configs
+**Tradeoffs:** No semantic intelligence, brittle selectors, more code
+
+---
+
+## Agent Layer Examples
+
+### Google Search (6 lines of code)
+
+```typescript
+import { SentienceBrowser, SentienceAgent, OpenAIProvider } from 'sentience-ts';
+
+const browser = await SentienceBrowser.create({ apiKey: apiKey });
+const llm = new OpenAIProvider(openaiKey, 'gpt-4o-mini');
+const agent = new SentienceAgent(browser, llm);
+
+await browser.getPage().goto('https://www.google.com');
+await agent.act('Click the search box');
+await agent.act("Type 'mechanical keyboards' into the search field");
+await agent.act('Press Enter key');
+await agent.act('Click the first non-ad search result');
+
+await browser.close();
+```
+
+**See full example:** [examples/agent-google-search.ts](examples/agent-google-search.ts)
+
+### Using Anthropic Claude Instead of GPT
+
+```typescript
+import { SentienceAgent, AnthropicProvider } from 'sentience-ts';
+
+// Swap OpenAI for Anthropic - same API!
+const llm = new AnthropicProvider(
+  process.env.ANTHROPIC_API_KEY!,
+  'claude-3-5-sonnet-20241022'
+);
+
+const agent = new SentienceAgent(browser, llm);
+await agent.act('Click the search button'); // Works exactly the same
+```
+
+**BYOB (Bring Your Own Brain):** OpenAI, Anthropic, or implement `LLMProvider` for any model.
+
+**See full example:** [examples/agent-with-anthropic.ts](examples/agent-with-anthropic.ts)
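
Annotation alongside this hunk, not part of the patch: the BYOB line says a custom model can be plugged in by implementing `LLMProvider`, but the diff does not show that interface. The sketch below therefore defines its own stand-in, `SketchLLMProvider`, with an assumed `generate(systemPrompt, userPrompt)` method; every name and signature here is hypothetical and should be checked against the SDK's real `LLMProvider` before use. It shows a canned provider that returns fixed replies, which could be handy for exercising agent wiring without network calls.

```typescript
// HYPOTHETICAL interface. Deliberately not named LLMProvider, because the
// real interface in sentience-ts is not shown in this diff and may differ.
interface SketchLLMProvider {
  readonly modelName: string;
  generate(
    systemPrompt: string,
    userPrompt: string
  ): Promise<{ content: string; totalTokens: number }>;
}

// Counts whitespace-separated words; used as a crude stand-in for token usage.
function countWords(text: string): number {
  const trimmed = text.trim();
  return trimmed === '' ? 0 : trimmed.split(/\s+/).length;
}

// Returns pre-recorded replies in order, cycling when they run out.
class CannedProvider implements SketchLLMProvider {
  readonly modelName = 'canned-v0';
  private readonly replies: string[];
  private next = 0;

  constructor(replies: string[]) {
    if (replies.length === 0) {
      throw new Error('CannedProvider needs at least one reply');
    }
    this.replies = replies;
  }

  async generate(systemPrompt: string, userPrompt: string) {
    const content = this.replies[this.next % this.replies.length];
    this.next += 1;
    const totalTokens =
      countWords(systemPrompt) + countWords(userPrompt) + countWords(content);
    return { content, totalTokens };
  }
}
```

The word count is only a placeholder for real token accounting; an actual provider would report the usage figures returned by its model API.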
+
+### Amazon Shopping (98% code reduction)
+
+**Before (manual approach):** 350 lines
+**After (agent layer):** 6 lines
+
+```typescript
+await agent.act('Click the search box');
+await agent.act("Type 'wireless mouse' into the search field");
+await agent.act('Press Enter key');
+await agent.act('Click the first visible product in the search results');
+await agent.act("Click the 'Add to Cart' button");
+```
+
+**See full example:** [examples/agent-amazon-shopping.ts](examples/agent-amazon-shopping.ts)
+
+---
+
+## Installation for Agent Layer
+
+```bash
+# Install core SDK
+npm install sentience-ts
+
+# Install LLM provider (choose one or both)
+npm install openai                 # For GPT-4, GPT-4o, GPT-4o-mini
+npm install @anthropic-ai/sdk      # For Claude 3.5 Sonnet
+
+# Set API keys
+export SENTIENCE_API_KEY="your-sentience-key"
+export OPENAI_API_KEY="your-openai-key"  # OR
+export ANTHROPIC_API_KEY="your-anthropic-key"
+```
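
Annotation alongside this hunk, not part of the patch: the setup above needs `SENTIENCE_API_KEY` plus either `OPENAI_API_KEY` or `ANTHROPIC_API_KEY` for the agent layer. A small startup check along these lines could catch a missing key before any browser is launched. `checkApiKeys` and the `Env` type are hypothetical helpers introduced here, not part of the SDK.

```typescript
// Plain record type so the check can be exercised without Node typings.
type Env = { [name: string]: string | undefined };

// Returns human-readable problems; an empty array means the keys look usable.
function checkApiKeys(env: Env): string[] {
  const problems: string[] = [];
  if (!env['SENTIENCE_API_KEY']) {
    problems.push('SENTIENCE_API_KEY is not set');
  }
  // The agent layer needs one LLM key; either provider is enough.
  if (!env['OPENAI_API_KEY'] && !env['ANTHROPIC_API_KEY']) {
    problems.push('set OPENAI_API_KEY or ANTHROPIC_API_KEY for the agent layer');
  }
  return problems;
}
```

In a Node script this would typically be called as `checkApiKeys(process.env)`, printing the returned messages and exiting if the array is non-empty.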
+
+---
+
+## Direct SDK Quick Start
 
 ```typescript
 import { SentienceBrowser, snapshot, find, click } from './src';
@@ -349,6 +536,12 @@ element.z_index // CSS stacking order
 
 See the `examples/` directory for complete working examples:
 
+### Agent Layer (Level 3 - Natural Language)
+- **`agent-google-search.ts`** - Google search automation with natural language commands
+- **`agent-amazon-shopping.ts`** - Amazon shopping bot (6 lines vs 350 lines manual code)
+- **`agent-with-anthropic.ts`** - Using Anthropic Claude instead of OpenAI GPT
+
+### Direct SDK (Level 2 - Technical Control)
 - **`hello.ts`** - Extension bridge verification
 - **`basic-agent.ts`** - Basic snapshot and element inspection
 - **`query-demo.ts`** - Query engine demonstrations