Discovery runs locally. Your page snapshots and interaction data are sent only to your configured LLM provider and to the BrowserWire CLI running on your machine. Nothing is uploaded to external BrowserWire servers.
## How a discovery session works
### Start an exploration session
After installing the Chrome extension and starting the BrowserWire CLI, open any website and click Start Exploring in the BrowserWire sidepanel. From this point, BrowserWire begins watching.
### The extension captures page snapshots
As you navigate and interact with the site, the Chrome extension captures snapshots of the live page — the DOM structure, visible text, element roles, and the interactions you perform (clicks, form fills, navigation). Each captured interaction is recorded as a trace event tied to the current session.
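To make the idea concrete, a trace event might carry fields like the following. This is a sketch only; the field names are assumptions for illustration, not BrowserWire's actual schema:

```python
from dataclasses import dataclass

# Hypothetical shape of one recorded trace event; field names are
# illustrative assumptions, not BrowserWire's real schema.
@dataclass
class TraceEvent:
    session_id: str   # the exploration session this event belongs to
    kind: str         # e.g. "click", "fill", or "navigate"
    target: str       # description of the element interacted with
    captured_at: str  # ISO 8601 timestamp

event = TraceEvent(
    session_id="sess-01",
    kind="fill",
    target="input[name=email]",
    captured_at="2025-01-01T12:00:00Z",
)
```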
### A vision LLM perceives the page
BrowserWire sends page snapshots to your configured vision LLM (OpenAI, Anthropic, Gemini, or Ollama). The LLM identifies:
- Entities — what domain objects are present (a login form, a product listing, a search bar)
- Actions — what operations are available on each entity (fill a field, click a button, submit a form)
- Views — what structured data can be extracted (a list of results, a price, a status indicator)
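Taken together, the perception result for a single snapshot might look like this. The keys mirror the three categories above, but the exact shape is an assumption, not BrowserWire's actual output format:

```python
# Illustrative perception result for one snapshot; the dict shape and
# field names are assumptions for the sake of example.
perception = {
    "entities": [
        {"name": "login_form", "description": "email/password login form"},
    ],
    "actions": [
        {"entity": "login_form", "name": "submit", "operation": "click"},
    ],
    "views": [
        {"entity": "search_results", "extracts": "list of result titles"},
    ],
}
```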
### Locators are synthesized
For each identified element, BrowserWire generates a set of locator strategies — multiple independent ways to target the same element. Strategies include ARIA role + name, CSS selector, XPath,
data-testid attribute, and DOM path. Each strategy is assigned a confidence score. Having multiple strategies means the runtime can fall back gracefully if the page layout changes slightly between sessions.

### The manifest is compiled
BrowserWire assembles all discovered entities, actions, views, and locators into a versioned manifest. Each item carries a confidence score and a provenance record so you can inspect exactly how it was learned. Once compiled, the manifest is immediately available via the REST API and OpenAPI spec.
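The graceful fallback that multiple locator strategies enable can be sketched as follows. The strategy dict shape and the `try_locate` callback are assumptions for illustration, not BrowserWire's runtime API:

```python
from typing import Callable, Optional

def resolve(strategies: list[dict],
            try_locate: Callable[[dict], Optional[object]]) -> Optional[object]:
    """Try locator strategies from highest to lowest confidence, falling
    back to the next one when a strategy no longer matches the live page."""
    ranked = sorted(strategies, key=lambda s: s["confidence"], reverse=True)
    for strategy in ranked:
        element = try_locate(strategy)  # returns None if the strategy is stale
        if element is not None:
            return element
    return None

# Example: the high-confidence CSS selector has gone stale after a layout
# change, so resolution falls back to the ARIA role + name strategy.
strategies = [
    {"kind": "css", "value": "#old-login-btn", "confidence": 0.95},
    {"kind": "aria", "value": "button[name='Log in']", "confidence": 0.80},
]
found = resolve(strategies, lambda s: "element" if s["kind"] == "aria" else None)
```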
## Confidence levels
Every discovered item — entity, action, or view — carries a confidence score reflecting how reliably BrowserWire identified it.

| Level | Score range | What it means |
|---|---|---|
| high | 0.85 – 1.0 | The item was identified with strong signal. Safe to use in automated workflows. |
| medium | 0.50 – 0.84 | The item was identified but some signals were ambiguous. Review before relying on it in production. |
| low | 0.0 – 0.49 | The item was tentatively identified. Manual review is recommended. |
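The score-to-level mapping in the table can be expressed as a small helper. This is a sketch of the published thresholds; BrowserWire's own boundary handling may differ:

```python
def confidence_level(score: float) -> str:
    """Map a confidence score to its level, using the thresholds
    from the table above (high >= 0.85, medium >= 0.50)."""
    if score >= 0.85:
        return "high"
    if score >= 0.50:
        return "medium"
    return "low"
```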
## Provenance tracking
BrowserWire records where each manifest item came from through provenance metadata. Every entity, action, and view carries:

| Field | Description |
|---|---|
| source | Who or what produced the definition: human, agent, or hybrid |
| sessionId | The exploration session that captured it |
| traceIds | The specific interaction traces from that session |
| capturedAt | When the item was captured (ISO 8601 timestamp) |
The source field takes one of three values:

- human — The item was identified from an interaction you performed directly (e.g. you filled in a form field)
- agent — The item was inferred entirely by the LLM from a page snapshot without direct user interaction
- hybrid — The item combined human interaction data with LLM inference
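The three source values follow a simple rule, sketched here. The function and its parameters are illustrative, not part of BrowserWire's API:

```python
def provenance_source(human_interaction: bool, llm_inference: bool) -> str:
    """Map how an item was learned to its provenance source value,
    following the three definitions above."""
    if human_interaction and llm_inference:
        return "hybrid"   # human interaction data combined with LLM inference
    if human_interaction:
        return "human"    # identified from a direct user interaction
    if llm_inference:
        return "agent"    # inferred entirely by the LLM from a snapshot
    raise ValueError("an item must come from at least one source")
```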
## The role of the LLM provider
BrowserWire requires a vision-capable LLM to perceive page content. The LLM is used at two stages of the pipeline: identifying what’s on the page from screenshots, and assigning semantic names to discovered items. You configure your provider once, and all discovery sessions use it:

| Provider | Default model |
|---|---|
| openai | gpt-4o |
| anthropic | claude-sonnet-4-20250514 |
| gemini | gemini-2.5-flash |
| ollama | llama3 |
Ollama runs locally and requires no API key, but discovery quality depends on the vision capabilities of the model you have installed.
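The provider-to-model defaults in the table can be captured in a small lookup. The dict contents come from the table; the helper function itself is illustrative, not BrowserWire's configuration API:

```python
# Default models per provider, as listed in the table above.
DEFAULT_MODELS = {
    "openai": "gpt-4o",
    "anthropic": "claude-sonnet-4-20250514",
    "gemini": "gemini-2.5-flash",
    "ollama": "llama3",
}

def default_model(provider: str) -> str:
    """Return the default model for a configured provider,
    rejecting names not in the table."""
    try:
        return DEFAULT_MODELS[provider]
    except KeyError:
        raise ValueError(f"unknown provider: {provider!r}") from None
```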