data-feeds

Extract structured data from 40+ websites including Amazon, LinkedIn, Instagram, TikTok, Facebook, YouTube, and more. Uses Bright Data's Web Data APIs with automatic polling. Returns clean JSON with product details, profiles, reviews, posts, and comments.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.


Install skill "data-feeds" with this command: npx skills add brightdata/skills/brightdata-skills-data-feeds

Bright Data - Structured Data Feeds

Extract structured data from major websites with automatic parsing. No scraping logic needed - just provide a URL and get clean JSON data.

Setup

Environment Variables (Required)

export BRIGHTDATA_API_KEY="your-api-key"

Optional

export BRIGHTDATA_POLLING_TIMEOUT=600  # Max seconds to wait (default: 600)

Get your API key from the Bright Data Dashboard.
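A missing key is easier to diagnose before any request is sent. The sketch below is a hypothetical preflight helper: check_brightdata_env is not part of the skill's scripts, and it assumes only the two environment variables documented above.

```shell
# Hypothetical helper (not shipped with the skill): fail early when the
# required API key is missing, and apply the documented default timeout.
check_brightdata_env() {
  if [ -z "${BRIGHTDATA_API_KEY:-}" ]; then
    echo "BRIGHTDATA_API_KEY is not set" >&2
    return 1
  fi
  # Fall back to the documented default of 600 seconds when unset or empty.
  : "${BRIGHTDATA_POLLING_TIMEOUT:=600}"
  export BRIGHTDATA_POLLING_TIMEOUT
}
```

A wrapper script could call check_brightdata_env || exit 1 before invoking scripts/datasets.sh.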

Usage

bash scripts/datasets.sh <dataset_type> <url> [additional_params...]

Available Datasets

E-Commerce

| Dataset | Command | Description |
| --- | --- | --- |
| Amazon Product | datasets.sh amazon_product <url> | Product details, pricing, ratings |
| Amazon Reviews | datasets.sh amazon_product_reviews <url> | Customer reviews for a product |
| Amazon Search | datasets.sh amazon_product_search <keyword> <domain_url> | Search results |
| Walmart Product | datasets.sh walmart_product <url> | Product details from Walmart |
| Walmart Seller | datasets.sh walmart_seller <url> | Seller information |
| eBay Product | datasets.sh ebay_product <url> | eBay listing details |
| Home Depot | datasets.sh homedepot_products <url> | Home Depot product data |
| Zara | datasets.sh zara_products <url> | Zara product details |
| Etsy | datasets.sh etsy_products <url> | Etsy listing data |
| Best Buy | datasets.sh bestbuy_products <url> | Best Buy product info |

Professional Networks

| Dataset | Command | Description |
| --- | --- | --- |
| LinkedIn Person | datasets.sh linkedin_person_profile <url> | Profile data (experience, skills) |
| LinkedIn Company | datasets.sh linkedin_company_profile <url> | Company page data |
| LinkedIn Jobs | datasets.sh linkedin_job_listings <url> | Job posting details |
| LinkedIn Posts | datasets.sh linkedin_posts <url> | Post content and engagement |
| LinkedIn Search | datasets.sh linkedin_people_search <url> <first> <last> | Find people |
| Crunchbase | datasets.sh crunchbase_company <url> | Company funding, employees |
| ZoomInfo | datasets.sh zoominfo_company_profile <url> | Company profile data |

Instagram

| Dataset | Command | Description |
| --- | --- | --- |
| Profiles | datasets.sh instagram_profiles <url> | Bio, followers, following |
| Posts | datasets.sh instagram_posts <url> | Post details, likes, captions |
| Reels | datasets.sh instagram_reels <url> | Reel data and metrics |
| Comments | datasets.sh instagram_comments <url> | Post comments |

Facebook

| Dataset | Command | Description |
| --- | --- | --- |
| Posts | datasets.sh facebook_posts <url> | Post content and reactions |
| Marketplace | datasets.sh facebook_marketplace_listings <url> | Listing details |
| Reviews | datasets.sh facebook_company_reviews <url> [num] | Company reviews |
| Events | datasets.sh facebook_events <url> | Event details |

TikTok

| Dataset | Command | Description |
| --- | --- | --- |
| Profiles | datasets.sh tiktok_profiles <url> | Creator profile data |
| Posts | datasets.sh tiktok_posts <url> | Video details and metrics |
| Shop | datasets.sh tiktok_shop <url> | TikTok Shop product data |
| Comments | datasets.sh tiktok_comments <url> | Video comments |

YouTube

| Dataset | Command | Description |
| --- | --- | --- |
| Profiles | datasets.sh youtube_profiles <url> | Channel data |
| Videos | datasets.sh youtube_videos <url> | Video details and stats |
| Comments | datasets.sh youtube_comments <url> [num] | Video comments (default: 10) |

Other Social

| Dataset | Command | Description |
| --- | --- | --- |
| X (Twitter) | datasets.sh x_posts <url> | Tweet data |
| Reddit | datasets.sh reddit_posts <url> | Post and comment data |

Google Services

| Dataset | Command | Description |
| --- | --- | --- |
| Maps Reviews | datasets.sh google_maps_reviews <url> [days] | Business reviews (default: 3 days) |
| Shopping | datasets.sh google_shopping <url> | Product comparison data |
| Play Store | datasets.sh google_play_store <url> | App details and reviews |

Other

| Dataset | Command | Description |
| --- | --- | --- |
| Apple App Store | datasets.sh apple_app_store <url> | iOS app data |
| Reuters News | datasets.sh reuter_news <url> | News article content |
| GitHub | datasets.sh github_repository_file <url> | Repository file data |
| Yahoo Finance | datasets.sh yahoo_finance_business <url> | Stock and company data |
| Zillow | datasets.sh zillow_properties_listing <url> | Property listing details |
| Booking.com | datasets.sh booking_hotel_listings <url> | Hotel listing data |

Examples

Get LinkedIn Profile

bash scripts/datasets.sh linkedin_person_profile "https://www.linkedin.com/in/satyanadella/"

Get Amazon Product

bash scripts/datasets.sh amazon_product "https://www.amazon.com/dp/B09V3KXJPB"

Get Instagram Profile

bash scripts/datasets.sh instagram_profiles "https://www.instagram.com/natgeo/"

Get YouTube Comments

bash scripts/datasets.sh youtube_comments "https://www.youtube.com/watch?v=dQw4w9WgXcQ" 20

Search Amazon

bash scripts/datasets.sh amazon_product_search "wireless headphones" "https://www.amazon.com"
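The single-URL commands above can be wrapped in a loop when several pages of the same type are needed. The following is a sketch under assumptions: run_batch and the DATASETS_SCRIPT override are hypothetical names introduced here, and the loop assumes datasets.sh writes its JSON to standard output.

```shell
# Hypothetical override so the loop can also be pointed at a stub script.
DATASETS_SCRIPT="${DATASETS_SCRIPT:-scripts/datasets.sh}"

# Run one dataset type against every URL in a file (one URL per line),
# writing each result to a numbered JSON file in the output directory.
run_batch() {
  dataset_type="$1"
  url_file="$2"
  out_dir="$3"
  mkdir -p "$out_dir" || return 1
  n=0
  failed=0
  # The "|| [ -n ... ]" part also picks up a last line with no trailing newline.
  while IFS= read -r url || [ -n "$url" ]; do
    [ -z "$url" ] && continue          # skip blank lines
    n=$((n + 1))
    # </dev/null keeps the child script from consuming the URL list on stdin.
    if ! bash "$DATASETS_SCRIPT" "$dataset_type" "$url" \
        > "$out_dir/$n.json" < /dev/null; then
      echo "failed: $url" >&2
      failed=$((failed + 1))
    fi
  done < "$url_file"
  [ "$failed" -eq 0 ]
}
```

For example, run_batch amazon_product urls.txt out/ would call the script once per line of urls.txt. The calls run sequentially, so total time grows with the number of URLs.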

Output Format

The script returns structured JSON whose fields depend on the website. Example for a LinkedIn profile:

{
  "name": "Satya Nadella",
  "headline": "Chairman and CEO at Microsoft",
  "location": "Greater Seattle Area",
  "connections": "500+",
  "experience": [...],
  "education": [...],
  "skills": [...]
}

How It Works

  1. Trigger: Sends the URL to Bright Data's Web Data API
  2. Poll: Waits for data collection to complete (checks every second)
  3. Return: Outputs structured JSON when ready

The polling mechanism handles rate limits and waits for full extraction before returning, for up to BRIGHTDATA_POLLING_TIMEOUT seconds (default: 600).
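The steps above can be sketched as a generic polling loop. This is an illustration, not the skill's actual implementation: poll_until_ready and its check callback are hypothetical names, while the one-second interval and the BRIGHTDATA_POLLING_TIMEOUT default of 600 seconds come from this document.

```shell
# Sketch of the poll step. $1 names a command or function (hypothetical
# callback) that exits 0 once the data is ready and non-zero while the
# collection is still running. $2 is the interval in seconds (default 1).
poll_until_ready() {
  check_cmd="$1"
  interval="${2:-1}"
  timeout="${BRIGHTDATA_POLLING_TIMEOUT:-600}"
  elapsed=0
  while [ "$elapsed" -lt "$timeout" ]; do
    if "$check_cmd"; then
      return 0
    fi
    sleep "$interval"
    # Counts only the sleep time, not the time spent inside the check itself.
    elapsed=$((elapsed + interval))
  done
  echo "timed out after ${timeout}s" >&2
  return 1
}
```

A caller would trigger the collection first, then pass a function that queries its status. Because only sleep time is counted, the real wall-clock limit is slightly longer than the configured timeout.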

Advanced: Direct Fetch

For custom dataset IDs or advanced use cases:

bash scripts/fetch.sh <dataset_id> '<json_input>'

Example:

bash scripts/fetch.sh gd_l1viktl72bvl7bjuj0 '{"url":"https://linkedin.com/in/someone"}'
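When the URL comes from a variable, hand-written JSON breaks as soon as the value contains a double quote or backslash. A minimal sketch of a builder follows; json_url_input is a hypothetical helper, not part of the skill, and it escapes only backslashes and double quotes (not control characters).

```shell
# Hypothetical helper: print the {"url":"..."} input expected by fetch.sh,
# escaping backslashes and double quotes so the JSON stays valid.
json_url_input() {
  escaped=$(printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
  printf '{"url":"%s"}\n' "$escaped"
}
```

It could then be used as: bash scripts/fetch.sh gd_l1viktl72bvl7bjuj0 "$(json_url_input "$profile_url")", where profile_url is a variable of your own.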

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
