Probe Zero

Probe Zero is OmniScout's local answer engine for grounded, Exa-style web answers. It turns retrieved web evidence into concise, grounded sentences without a hosted synthesis API.

Product page with live charts: omniscout.xyz/probe-zero

What it does

When you run omniscout answer, OmniScout:

Retrieves live web hits (titles, snippets, URLs)
Grounds the best supporting passages
Formats a tight answer with Probe Zero (or Classic extractive fallback)
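
The three steps above can be sketched in Python as follows. This is a minimal illustration, not OmniScout's implementation: Hit, ground, and format_answer are hypothetical names, retrieval is replaced by a hard-coded list, grounding is naive word overlap, and formatting is a simple extractive stand-in.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Hypothetical container for one web hit (title, snippet, URL)."""
    title: str
    snippet: str
    url: str


def ground(query: str, hits: list[Hit], top_k: int = 2) -> list[Hit]:
    """Rank hits by word overlap with the query (stand-in for real grounding)."""
    terms = set(query.lower().split())

    def score(hit: Hit) -> int:
        return len(terms & set(hit.snippet.lower().split()))

    return sorted(hits, key=score, reverse=True)[:top_k]


def format_answer(supports: list[Hit]) -> str:
    """Extractive stand-in: first sentence of the best support, plus its URL."""
    best = supports[0]
    sentence = best.snippet.split(". ")[0].rstrip(".") + "."
    return f"{sentence} ({best.url})"


# Step 1 (retrieval) is stubbed with fixed example hits.
example_hits = [
    Hit("GPU news", "New graphics cards were announced today.", "https://example.com/b"),
    Hit("NVIDIA leadership", "Jensen Huang is the CEO of NVIDIA. He co-founded it in 1993.", "https://example.com/a"),
]
supports = ground("who is the CEO of nvidia", example_hits)
print(format_answer(supports))
```

The real engine presumably does far more at each stage; the sketch only shows how the stages hand data to one another.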

Probe Zero is tuned for complete sentences that retain the gold fact from sources — the same style Exa popularized for agent search.

Enable Probe Zero

One-shot

omniscout answer "who is the CEO of nvidia" --probe

Default engine

omniscout settings
# → Answer engine → Probe Zero

Or in config.toml:

answer_engine = "probe"

Download local weights

omniscout install --probe-0-mini

The weights are about 556 MB in FP16 and are fetched once via the public GitHub Release manifest (probe-zero-mini.manifest.json). Although the install flag is --probe-0-mini, internal paths still use probe-zero-mini for backward compatibility.

Warmup (optional)

omniscout warmup --probe-0-mini

Classic vs Probe Zero

Engine	Best for
Classic	Fast defaults — direct web hits, extractive fallback, SmolLM2 synthesis
Probe Zero	Formatted, evidence-backed one-sentence answers from search supports

Flags:

--probe — use Probe Zero for this invocation
--classic — force Classic for this invocation
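
One way to picture how the per-invocation flags interact with the default engine is the argparse sketch below. It is an assumption-laden illustration: resolve_engine is hypothetical, the string "classic" is an assumed engine name (only "probe" appears in the config example), and treating the two flags as mutually exclusive is a guess about the CLI's behavior.

```python
import argparse


def resolve_engine(argv: list[str], default_engine: str) -> str:
    """Pick the engine for one invocation; flags override the configured default."""
    parser = argparse.ArgumentParser(prog="answer-sketch")
    parser.add_argument("query")
    # Assumption: --probe and --classic cannot be combined.
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--probe", action="store_true")
    group.add_argument("--classic", action="store_true")
    args = parser.parse_args(argv)

    if args.probe:
        return "probe"
    if args.classic:
        return "classic"  # assumed name; not confirmed by this page
    return default_engine


print(resolve_engine(["who is the CEO of nvidia", "--probe"], "classic"))
```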

Gold benchmark

Probe Zero is evaluated on answer_gold.json (21 queries) with hand-validated teachers in probe-gold-training.json.

A format pass requires:

More than a one-word span
A complete sentence (≥5 words, proper punctuation)
The gold fact retained in the answer
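
The three criteria above could be approximated with a check like the one below. It is a rough sketch under stated assumptions, not the benchmark's actual scorer: format_pass is a hypothetical name, "proper punctuation" is read as ending with terminal punctuation, and the gold fact is matched as a case-insensitive substring.

```python
def format_pass(answer: str, gold_fact: str) -> bool:
    """Approximate the format-pass criteria for a single answer."""
    text = answer.strip()
    words = text.split()

    # 1. More than a one-word span.
    more_than_one_word = len(words) > 1

    # 2. A complete sentence: at least 5 words and terminal punctuation
    #    (assumed reading of "proper punctuation").
    complete_sentence = len(words) >= 5 and text.endswith((".", "!", "?"))

    # 3. The gold fact is retained (assumed: case-insensitive substring match).
    retains_gold = gold_fact.lower() in text.lower()

    return more_than_one_word and complete_sentence and retains_gold


print(format_pass("Jensen Huang is the CEO of NVIDIA.", "Jensen Huang"))
```

Under this reading, a bare span such as "Jensen Huang" fails on length even though it contains the gold fact, which is the kind of answer the format pass is meant to rule out.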

The Classic baseline uses rule-based synthesis from the same search supports (format_answer_from_supports), which matches the extractive path before the optional SmolLM2 polish.

On the public gold suite (21 queries), Probe Zero teachers score a 100% format pass rate, while Classic rule-based synthesis from the same supports scores ~90%. Classic falls short mostly in the entity (80%) and navigational (50%) categories.

Regenerate the website benchmark JSON:

cd cli
python scripts/export_probe_benchmark.py

This writes website/artifacts/omniscout/src/data/probe-zero-benchmark.json.

omniscout answer "..." --probe --data    # answer + diagnostics
omniscout benchmark answers --pipeline llm
omniscout benchmark format --dataset tests/fixtures/answer_gold.json

See Commands reference for full flag lists.
