Bright Data — Search
Find things on the web. Two commands live in this skill:
bdata search— classic keyword SERP (Google/Bing/Yandex). Best when you want "what ranks for keyword X."bdata discover— AI intent-ranked discovery with optional page content. Best when you want "pages about topic Y that match intent Z."
For structured data from a known platform (Amazon, LinkedIn, TikTok, …), stop and use data-feeds instead.
Setup gate (run first)
if ! command -v bdata >/dev/null 2>&1; then
echo "bdata CLI not installed — see bright-data-best-practices/references/cli-setup.md"
elif ! bdata zones >/dev/null 2>&1; then
echo "bdata not authenticated — run: bdata login (or: bdata login --device for SSH)"
fi
Halt and route to skills/bright-data-best-practices/references/cli-setup.md if either check fails.
Pick your path
| Situation | Action |
|---|---|
| Single keyword query, just SERP | bdata search "<query>" --engine google --json --pretty |
| Paginated SERP (more results) | loop --page 0, --page 1, … (0-indexed) |
| Multiple queries | shell loop over a queries file |
| Intent-ranked / semantic (not keyword) | bdata discover "<query>" --intent "<intent>" --num-results 20 |
| Want page bodies along with results, one pass | bdata discover ... --include-content |
| News / images / shopping SERP | bdata search "<query>" --type news (or images, shopping) |
| Want Amazon/LinkedIn/TikTok/… structured data | stop — hand off to data-feeds |
| Have URLs, want content | hand off to scrape |
Action
Core commands:
# Google SERP, structured JSON
bdata search "site:example.com privacy policy" --engine google --json --pretty
# Localized Bing (German results, German language)
bdata search "datenschutz" --engine bing --country de --language de --json
# Second page of results (0-indexed)
bdata search "machine learning papers" --page 1 --json
# Mobile SERP (rankings differ from desktop)
bdata search "best coffee shops" --device mobile --json
# News vertical
bdata search "openai" --type news --json --pretty
# Intent-ranked discovery
bdata discover "enterprise LLM platforms" \
--intent "vendor pages with pricing" \
--num-results 15 --json
# Discovery with page content in markdown
bdata discover "webhook best practices" \
--include-content --num-results 10 -o results.json
# Date-filtered discovery
bdata discover "react server components" \
--start-date 2025-01-01 --end-date 2025-12-31 --num-results 20
Full flag reference: references/flags.md.
search vs discover — pick the right one
| You want | Use |
|---|---|
| "What Google ranks for this exact keyword" | search |
| "Pages that match this meaning/intent" | discover |
| "News / images / shopping vertical SERP" | search --type <vertical> |
| "Results + page bodies in one call" | discover --include-content |
| "Dedup / semantic ranking across queries" | discover |
Verification gate
- JSON parses cleanly:
jq . <output>returns 0. - Result array non-empty — if empty, the query is legitimately zero-result; relax the query and re-run. Don't claim success on empty results without telling the user.
- Required fields present:
search: results live at.organic[]; each hastitle+linkdiscover: results live at.results[]; each hastitle+link; if--include-content, alsocontent
- For
discover --include-content: no block-page signatures in thecontentfield (same list as scrape, case-insensitive):Access DeniedJust a momentAttention RequiredChecking your browsercaptchacf-browser-verificationcloudflare(with < 2KB total body)
- Geo sanity: if the user expected country-specific results, inspect TLDs / languages of top results. If mis-localized, re-run with explicit
--countryand--language.
Red flags
- Using
searchto fetch content from Amazon, LinkedIn, TikTok, etc. whendata-feedsreturns clean structured data in one call. - Scraping every SERP result blindly — filter first (domain allowlist, keyword in title, relevance heuristic).
- Confusing
search(keyword) withdiscover(semantic). They answer different questions. - Running multiple queries without deduping URLs across result sets before scraping.
- Assuming SERP order is universal — it's personalized by geo + device. Always set
--countryand--deviceexplicitly for reproducibility. - Using
--pageas a result count — it's a page index, not a limit. Each page returns ~10 results. - Assuming SERP results are at
.results[]— forbdata searchthey live at.organic[]. (Discover uses.results[].) - Hardcoding
--num-results 100ondiscoverwithout realizing the pipeline polls until that many are found; can be slow.
References
references/flags.md— full flags forsearchanddiscoverwith when-to-use notes.references/patterns.md— multi-query dedup, SERP → filter → scrape pipeline,searchvsdiscoverdecision, legacycurlfallback, shared verification checklist.references/examples.md— (1) single Google query, (2) localized Bing, (3) batch queries + dedup into URL list, (4)discover --include-contentend-to-end.