fetching-blocked-urls

Retrieve clean markdown from URLs when web_fetch fails. Converts pages via the Jina AI reader service with automatic retry. Use when web_fetch or curl returns a 403, a block page, a paywall, a timeout, JavaScript-rendering errors, or empty content, or when the user explicitly suggests using Jina.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Install skill "fetching-blocked-urls" with this command: npx skills add oaustegard/claude-skills/oaustegard-claude-skills-fetching-blocked-urls

Fetching Blocked URLs

Retrieve readable content from URLs that web_fetch cannot access. Jina AI's reader service renders JavaScript, bypasses soft blocks, and returns clean markdown.

Activation Triggers

Invoke this skill immediately when web_fetch returns:

  • 403 Forbidden or access denied
  • Paywall or login wall indicators
  • Empty, garbled, or truncated content
  • JavaScript-heavy SPA failures
  • Timeout errors

Core Command

curl -s --max-time 30 "https://r.jina.ai/TARGET_URL"

The service returns markdown with page title, body text, and preserved links.
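
As a minimal sketch of the core command, the reader URL is simply the target URL appended to the https://r.jina.ai/ prefix. The function names jina_reader_url and fetch_via_jina are hypothetical helpers introduced here for illustration; only the prefix and the curl flags come from the command above.

```shell
# Hypothetical helpers; only the r.jina.ai prefix and the curl flags
# (-s, --max-time 30) are taken from the core command.

# Build the reader URL by prefixing the target with the Jina reader endpoint.
jina_reader_url() {
  local target="$1"
  if [ -z "$target" ]; then
    echo "usage: jina_reader_url URL" >&2
    return 2
  fi
  printf 'https://r.jina.ai/%s\n' "$target"
}

# Fetch a page as markdown: silent mode, capped at 30 seconds per request.
fetch_via_jina() {
  local reader_url
  reader_url=$(jina_reader_url "$1") || return 2
  curl -s --max-time 30 "$reader_url"
}
```

For example, fetch_via_jina "https://example.com/article" would request https://r.jina.ai/https://example.com/article. Pass the full target URL, including its scheme.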

Retry Pattern

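Jina's backend fails intermittently on roughly 10% of requests, typically responding with an "upstream connect error" message. Retrying up to three times raises the overall success rate above 99%: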

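for attempt in 1 2 3; do
  # -s silences progress output; --max-time caps each attempt at 30 seconds
  result=$(curl -s --max-time 30 "https://r.jina.ai/TARGET_URL" 2>&1)
  # Stop on the first response that is non-empty and not Jina's transient backend error
  if [ -n "$result" ] && ! echo "$result" | grep -q "upstream connect error"; then
    echo "$result"; break
  fi
  # Pause briefly before retrying, except after the final attempt
  [ "$attempt" -lt 3 ] && sleep 1
done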
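
The loop above prints nothing and reports no error if every attempt fails. One possible way to make that case visible is to wrap the same pattern in a function that returns a non-zero status on exhaustion. This is a sketch under assumptions: fetch_with_retry and the FETCH_CMD override are hypothetical names, and FETCH_CMD exists only so the loop can be exercised without network access.

```shell
# Hypothetical wrapper around the retry pattern. FETCH_CMD is an assumption
# added for offline testing; by default the same curl invocation is used.
fetch_with_retry() {
  local url="$1" max_attempts="${2:-3}" attempt=1 result
  while [ "$attempt" -le "$max_attempts" ]; do
    # Left unquoted on purpose so the default command splits into words.
    result=$(${FETCH_CMD:-curl -s --max-time 30} "https://r.jina.ai/$url" 2>&1)
    # Accept the first response that is non-empty and not the transient error.
    if [ -n "$result" ] && ! printf '%s\n' "$result" | grep -q "upstream connect error"; then
      printf '%s\n' "$result"
      return 0
    fi
    # Pause before the next attempt, but not after the last one.
    [ "$attempt" -lt "$max_attempts" ] && sleep 1
    attempt=$((attempt + 1))
  done
  echo "fetch_with_retry: gave up after $max_attempts attempts: $url" >&2
  return 1
}
```

A caller can then branch on the exit status, for example: fetch_with_retry "https://example.com/article" > page.md || echo "fallback failed".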

Workflow Integration

  1. Primary: Use web_fetch (native tool)
  2. Fallback: This skill with retry when web_fetch fails
  3. Escalate: Request user assistance only after retry exhaustion

Attempt this fallback before asking users to copy-paste content manually.
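
The three-step order can be sketched as a single shell function. web_fetch is a native tool rather than a shell command, so this sketch assumes a direct curl request as a stand-in for the primary step. fetch_with_fallback, PRIMARY_CMD, and FALLBACK_CMD are hypothetical names introduced for illustration.

```shell
# Hypothetical sketch of the primary -> fallback -> escalate order.
# PRIMARY_CMD stands in for web_fetch; both overrides exist for offline testing.
fetch_with_fallback() {
  local url="$1" content attempt
  # 1. Primary: direct fetch (-f makes curl fail on HTTP errors such as 403).
  if content=$(${PRIMARY_CMD:-curl -sf --max-time 30} "$url") && [ -n "$content" ]; then
    printf '%s\n' "$content"
    return 0
  fi
  # 2. Fallback: Jina reader, up to three attempts.
  for attempt in 1 2 3; do
    content=$(${FALLBACK_CMD:-curl -s --max-time 30} "https://r.jina.ai/$url")
    if [ -n "$content" ] && ! printf '%s\n' "$content" | grep -q "upstream connect error"; then
      printf '%s\n' "$content"
      return 0
    fi
    [ "$attempt" -lt 3 ] && sleep 1
  done
  # 3. Escalate: ask the user for help only after retries are exhausted.
  echo "Could not retrieve $url; please paste the content manually." >&2
  return 1
}
```

The user-facing message is reached only when both the direct request and all three reader attempts have failed, which matches the escalation order above.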

Output Format

Jina returns structured markdown:

  • Title: page title
  • URL Source: original URL
  • Markdown Content: extracted body text, links preserved
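
If the fields need to be separated, a few awk filters may be enough. This sketch assumes each label starts its own line ("Title: ...", "URL Source: ...") and that the body begins on the line after "Markdown Content:". That layout is an assumption to verify against a real response, and the function names are hypothetical.

```shell
# Hypothetical field extractors; each reads a Jina response on stdin.
# Assumes labels start a line and the body follows "Markdown Content:".

# Print the value of the first "Title: " line.
jina_title() { awk 'sub(/^Title: /, "") { print; exit }'; }

# Print the value of the first "URL Source: " line.
jina_source() { awk 'sub(/^URL Source: /, "") { print; exit }'; }

# Print every line after the first "Markdown Content:" line.
jina_body() { awk 'found { print } /^Markdown Content:/ && !found { found = 1 }'; }
```

For example, piping a saved response through jina_title (jina_title < page.md) prints only the page title, while jina_body drops the header fields and keeps the markdown.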

Limitations

  • Long pages may be truncated
  • Sites that block all scrapers remain inaccessible
  • Login-required content is limited to its public portions
  • Real-time dynamic content may not render

Domain Access

r.jina.ai is whitelisted in the Claude container network configuration.

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
