Using OmniScout with AI agents

Drop-in prompts and skill files for Claude Code, Cursor, Codex, and Kimi to drive OmniScout.

OmniScout is built for AI agents to drive. This page collects everything you need to make Claude Code, Cursor, Codex, Kimi, or any other shell-capable agent productive with OmniScout in under a minute.

0. Install OmniScout

pip install omniscout

1. Install the skill

omniscout install --skill

This copies `SKILL.md` (the agent-facing instructions) plus `references/operations.md` into the well-known agent skill locations:

Agent	Skill location
Claude Code	`~/.claude/skills/scout/SKILL.md`
Cursor	`~/.cursor/skills-cursor/scout/SKILL.md`
Codex	`~/.codex/skills/scout/SKILL.md`
Antigravity	`~/.gemini/config/skills/scout/SKILL.md`

The agent picks it up automatically on next session — no manual config.
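
To confirm the install landed, a check along these lines can help. It is a minimal sketch that only tests for the four paths in the table above. The function name `check_scout_skills` is made up for this example and is not part of OmniScout, and a "missing" line may simply mean that agent is not installed on this machine.

```shell
# check_scout_skills is a hypothetical helper, not an OmniScout command.
# It reports which of the documented skill locations contain SKILL.md.
check_scout_skills() {
  for dir in \
    "$HOME/.claude/skills/scout" \
    "$HOME/.cursor/skills-cursor/scout" \
    "$HOME/.codex/skills/scout" \
    "$HOME/.gemini/config/skills/scout"
  do
    if [ -f "$dir/SKILL.md" ]; then
      echo "found   $dir/SKILL.md"
    else
      echo "missing $dir/SKILL.md"
    fi
  done
}
```

Call `check_scout_skills` after `omniscout install --skill` to see one line per location.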

If an agent has no skill mechanism, you can instead paste the contents of `docs/agent-skill.md` into its system prompt.

2. Start the daemon (once)

omniscout daemon start

It's idempotent: running it again is a no-op. The daemon persists across agent sessions. Verify with `omniscout daemon status`.
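
Because the start command is idempotent, a script can call it unconditionally and then print the status. The wrapper below is a small sketch: `ensure_daemon` is a hypothetical name, and it assumes `omniscout daemon start` exits non-zero on failure, which this page does not state.

```shell
# ensure_daemon is a hypothetical wrapper around the two documented commands.
ensure_daemon() {
  # Safe to call repeatedly: a second start is documented as a no-op.
  # Assumption: a failed start returns a non-zero exit code.
  omniscout daemon start || return 1
  # Print the daemon status so the caller can confirm it is up.
  omniscout daemon status
}
```

Run `ensure_daemon` at the top of any script that drives OmniScout.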

3. Drop a prompt

These prompts assume the skill is installed. Pick the closest match to your task; the agent will translate it into the right sequence of `omniscout ...` commands.

Ask an AI to build a knowledge graph

Build a knowledge graph for "Cursor" with OmniScout. Run
`omniscout graph "Cursor" --json --data`, show the tree, and cite sources.
If I give a website URL, use site-only mode (`omniscout graph "cursor.com"` or
`-w cursor.com`).

@omniscout Map "Cursor" as a knowledge graph: company, founders, competitors,
pricing, features. Use `omniscout graph` with JSON output. For a specific
site, pass the URL or `--website`.

Use `omniscout graph "<entity>" --json` to produce a structured knowledge
graph tree. Add `--data` for sources. Use a URL entity or `--website` when
the user points at one site only.

Ask an AI to research a question

Research "state of open-source browser-using AI agents in 2026" using
OmniScout. Run `omniscout research` with at least 8 results, then summarize the
findings in 3-5 bullet points with citations to source URLs.

@omniscout Use OmniScout to research the question: "what's the practical
difference between Browserbase, Browser-Use, and Stagehand?"
Compare them in a table covering pricing model, runtime, element
addressing strategy, and best-fit use case.

You have access to OmniScout via the `omniscout` CLI. Research "FedRAMP-ready
LLM hosting providers" using `omniscout research --results 10 --json`,
extract the top 5 sources, and write a concise report to /tmp/report.md.

You have the `omniscout` CLI available. Use it to research a topic the user
gives you. Use `omniscout research "<topic>" --json` and summarize the
returned `summary` plus the top 3 passages.

Topic: <YOUR TOPIC HERE>

Ask an AI to navigate and click

Use OmniScout to open https://news.ycombinator.com, snapshot the page, and
click the top story link. Then take a screenshot saved to /tmp/hn-top.png
and tell me the page title of the article that loaded.

@omniscout Open the GitHub trending page in OmniScout, snapshot the page, click
the first repo link, and read me the README using `omniscout extract` on
the resulting URL.

Use the OmniScout CLI to:
1. Navigate to https://hn.algolia.com
2. Fill the search box with "browser agents"
3. Click search
4. Take a screenshot to /tmp/results.png
5. Use `omniscout browser snapshot --refs-only` and list the first 5 link
   refs you see.

Ask an AI to log in once and reuse

Help me set up a logged-in profile for GitHub via OmniScout. Run
`omniscout browser login https://github.com/login --profile work` to open a
headful window. Wait for me to authenticate in the browser, then I'll
tell you "done". After that, prove the profile works by running
`omniscout browser navigate https://github.com/settings --profile work` and
showing me the resulting page title.

@omniscout I want to scrape my private GitHub starred repos. First create a
profile via `omniscout profile create work`. Then run
`omniscout browser login https://github.com/login --profile work`. Pause
for me to log in. Then navigate to https://github.com/<my-username>?tab=stars
with `--profile work` and snapshot the page. Loop over the @eN refs whose
role is "link" and extract their titles.

Ask an AI to fill a form and submit

Use OmniScout to:
1. Open https://duckduckgo.com
2. Snapshot the page
3. Fill the search textbox with "local first AI agents"
4. Press Enter
5. Wait for the results to load (wait --idle)
6. Snapshot again and list the title of the first 5 results.

@omniscout Open https://google.com/forms/example, fill the visible form
fields in order with these values: name="Test User",
email="test@example.com", message="Hello from OmniScout". Then take a
screenshot and tell me which fields you filled.

Ask an AI to capture network traffic

Use OmniScout to investigate what tracking calls a site makes. Start network
capture, navigate to https://example.com, scroll to the bottom, stop
capture, then list any captured requests whose URL matches
"analytics|google|facebook|stripe". For the matching ones, run `network
detail` and tell me the response status code.

Profile the network behavior of https://vercel.com/pricing using OmniScout:
- `omniscout browser network start`
- `omniscout browser navigate https://vercel.com/pricing`
- `omniscout browser scroll down --amount 10`
- `omniscout browser network stop`
- `omniscout browser network list --filter "stripe|payment"`
Return the result as JSON.

Ask an AI to handle a CAPTCHA

Open https://site-with-captcha.example using OmniScout. If you detect a
CAPTCHA, run `omniscout browser captcha --detect-only` and tell me the
type. Then run `omniscout browser captcha` (no solver flag) so the tab
flips headful and pauses. I'll solve it manually; you continue once
detection comes back clean.

Open https://site-with-captcha.example using OmniScout. If there's a CAPTCHA,
solve it via 2Captcha — `TWOCAPTCHA_API_KEY` is set in my env. Run
`omniscout browser captcha --solver 2captcha` and continue once it returns
solved=true.

The default `--solver none` is the only mode that is truly local-first. The third-party solvers (`2captcha`, `capsolver`) send the sitekey and page URL to those services, so data leaves your machine and each solve costs money. Opt in explicitly.
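
One way to keep that opt-in explicit in a script is to gate the solver flag behind a separate switch. This is only a sketch: `captcha_solver_args` and the `ALLOW_REMOTE_CAPTCHA_SOLVER` variable are invented for this example and are not OmniScout settings. Only `--solver 2captcha` and `TWOCAPTCHA_API_KEY` come from the prompts above.

```shell
# captcha_solver_args is a hypothetical helper. It prints the extra arguments
# for `omniscout browser captcha`, and prints nothing unless a remote solver
# was explicitly allowed AND an API key is present.
captcha_solver_args() {
  # ALLOW_REMOTE_CAPTCHA_SOLVER is an assumed opt-in switch for this sketch.
  if [ "${ALLOW_REMOTE_CAPTCHA_SOLVER:-0}" = "1" ] && [ -n "${TWOCAPTCHA_API_KEY:-}" ]; then
    echo "--solver 2captcha"
  fi
}
```

Used as `omniscout browser captcha $(captcha_solver_args)` (left unquoted on purpose so the two words split), the command stays in the local, manual mode unless both variables are set.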

4. Multi-step agent loop (the killer use case)

The most useful agent loops combine several of the steps above in a single task. Here is a complete worked example you can hand to Claude Code verbatim:

You have OmniScout available via the `omniscout` CLI. Carry out this task end
to end:

1. Start the daemon if it isn't running.
2. Create a profile "research" if it doesn't exist.
3. Use OmniScout to research "vector databases benchmark 2026" with at least
   10 results. Save the report JSON to /tmp/research.json.
4. From the report, pick the 3 source URLs with the highest passage
   scores. For each one:
     a. Open it in OmniScout (profile=research).
     b. Take a screenshot to /tmp/<i>.png.
     c. Use `omniscout extract <url>` to get the clean Markdown.
     d. Save the markdown to /tmp/<i>.md.
5. Tell me the 3 source URLs and the path to each screenshot/markdown
   pair. Then summarize the *common* themes across the 3 articles in 5
   bullet points, citing which article each theme came from.
6. Close all OmniScout sessions when done.

The agent will run something like:

omniscout daemon start   # idempotent: a no-op if the daemon is already running
omniscout profile create research
OMNISCOUT_JSON=1 omniscout research "vector databases benchmark 2026" --results 10 > /tmp/research.json
# Take the first 3 source URLs (this assumes sources are listed best-first;
# otherwise sort by passage score before picking).
URLS=$(jq -r '.sources[].url' /tmp/research.json | head -n 3)
i=1
for url in $URLS; do
  omniscout browser navigate "$url" --profile research --session research
  omniscout browser screenshot --session research --out "/tmp/$i.png"
  omniscout extract "$url" > "/tmp/$i.md"
  i=$((i+1))
done
omniscout browser close --all

5. JSON contract for agents

Every command supports `--json` (or the environment variable `OMNISCOUT_JSON=1`). Output is deterministic, with logs separated to stderr. Agents should:

- Set `OMNISCOUT_JSON=1` once at the start of a session.
- Pipe stdout through `jq` for structured access.
- Treat any response with `ok: false` as potentially recoverable; use `error_kind` to decide whether to retry, re-snapshot, or surface to the user.

OMNISCOUT_JSON=1 omniscout browser snapshot 2>/dev/null | jq '.refs[] | select(.role == "button")'

Two fields you should always pay attention to:

Field	Where	What it means
`snapshot_generation`	every response's `data`	Monotonic per session. If it changed since your last `snapshot`, re-snapshot before re-using `@eN` refs.
`action_id`	every response (top level)	Stable hex ID for this exact invocation. Use it to call `omniscout daemon trace <action_id>` or `omniscout daemon replay <action_id>`.
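
The `snapshot_generation` rule in the table comes down to comparing two values. The helpers below sketch that comparison. `refs_are_stale` and `resnapshot_if_stale` are hypothetical names, and the caller is assumed to have already pulled both generation values out of the JSON responses, for example with `jq -r '.data.snapshot_generation'`.

```shell
# Hypothetical helpers, not part of OmniScout.
# $1: snapshot_generation returned by your last `snapshot`
# $2: snapshot_generation from the most recent response
# refs_are_stale succeeds (exit 0) when cached @eN refs should be discarded.
refs_are_stale() {
  [ "$1" != "$2" ]
}

# Take a fresh snapshot only when the generation has moved on.
resnapshot_if_stale() {
  if refs_are_stale "$1" "$2"; then
    omniscout browser snapshot --refs-only
  fi
}
```

An agent loop would call `resnapshot_if_stale "$last_gen" "$current_gen"` before any `click` or `fill` that uses a cached ref.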

6. Trace, replay, watch (the agent debugging loop)

# What did OmniScout just do?
omniscout daemon trace -n 5 --session demo

# Re-run a single call by ID (skips interactive verbs like login).
omniscout daemon replay 8f3a7c9e1b2d4e5f

# Re-run every replayable action for a session in the last minute.
omniscout daemon replay --session demo --since 60

# Tail live events (use --json-lines for machine-readable output).
omniscout daemon watch

When an agent fails halfway through a task, a prompt like "replay the last 30 seconds against session demo" tends to recover state faster than asking the agent to remember each step itself.

7. Stable `error_kind` values agents can branch on

`error_kind`	Meaning	Recommended action
`timeout`	Operation exceeded its budget	Retry with `--timeout-ms` larger, or `wait --idle` first
`no_such_session`	Session doesn't exist	Re-`navigate` to recreate it
`no_such_ref`	`@eN` ref expired or page changed	Re-run `snapshot` and use the new refs
`backend_unavailable`	Extension backend not connected	Drop `--backend extension` to fall back to Playwright
`invalid_args`	Bad CLI arguments	Surface to user; don't retry blindly
`internal`	Unhandled daemon error	Check `omniscout daemon logs -n 100`
`requires_user`	CAPTCHA / login needs human	Tell the user what to do and pause
`unsupported`	Backend can't do this verb	Switch to Playwright (e.g. `pdf`, `upload`)
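
The table above maps naturally onto a `case` statement. The sketch below returns a coarse next step for each `error_kind`. The function name `next_step_for` and the step labels are made up for illustration, and an agent would still have to turn each label into concrete commands.

```shell
# next_step_for is a hypothetical helper that follows the table above.
# $1: the error_kind string from a failed response
next_step_for() {
  case "$1" in
    timeout)             echo "retry-with-longer-timeout" ;;
    no_such_session)     echo "renavigate" ;;
    no_such_ref)         echo "resnapshot" ;;
    backend_unavailable) echo "drop-extension-backend" ;;
    unsupported)         echo "switch-to-playwright" ;;
    requires_user)       echo "ask-user-and-pause" ;;
    invalid_args)        echo "surface-to-user" ;;
    internal)            echo "check-daemon-logs" ;;
    # Unknown kinds are surfaced rather than retried blindly.
    *)                   echo "surface-to-user" ;;
  esac
}
```

For example, `next_step_for "$kind"` prints `resnapshot` when `kind` is `no_such_ref`. Where `error_kind` sits inside the response is not stated on this page, so the extraction step (for instance a `jq` filter) is left to the caller.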

8. Skill template for any new agent

If you're integrating OmniScout into an agent without a skill mechanism, paste this into the agent's system prompt:

You have access to OmniScout via the `omniscout` CLI. OmniScout drives a browser
locally — it can navigate, click, fill, scroll, take screenshots, capture
network traffic, detect CAPTCHAs, and run multi-step research pipelines.

Always:
- Set OMNISCOUT_JSON=1 for structured output.
- Run `omniscout browser snapshot --refs-only` to find elements; use the
  returned @eN refs for click/fill rather than guessing CSS selectors.
- After EVERY response, check data.snapshot_generation. If it differs
  from the value your last `snapshot` returned, re-snapshot before re-
  using cached @eN refs.
- Save the top-level `action_id` from each response — you can replay or
  trace by ID later if something goes sideways.
- Close sessions with `omniscout browser close --all` at the end of a task.
- Screenshots go to disk; read them via your file-read tool, never embed
  base64 in your reply.

Available verbs:
navigate, snapshot, click, fill, scroll, key, hover, upload, screenshot,
pdf, eval, wait, tab list|close|switch, network start|stop|list|detail,
login, captcha, close.

Diagnostics: omniscout daemon trace, replay, watch, status, logs.

Run `omniscout <verb> --help` if unsure of arguments.
