DeepXiv CLI

deepxiv is a progressive-reading paper tool for open-access literature (arXiv, PMC, Semantic Scholar) with optional web search and an LLM-powered research agent.

The single most important rule: read the smallest amount of text that answers the question. Climb the ladder only as far as needed.

Progressive reading ladder

For any paper, prefer the cheapest rung that still answers the question:

Rung	Command	What you get	When to use
1	`paper <id> --brief`	Title, TLDR, keywords, citations, GitHub URL	First triage of any paper
2	`paper <id> --head`	Metadata + section list (JSON)	Decide which sections matter
3	`paper <id> --preview`	First ~10k chars (intro + early method)	Need more than TLDR but not full sections
4	`paper <id> --section <Name>`	One named section	Targeted answer (Method / Results / etc.)
5	`paper <id>` or `--raw`	Full markdown	Only when explicitly required

Never jump to rung 5 unless the user asked for a full read or the task truly needs it.
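
The ladder can be written down as a small dry-run helper. This is only a sketch: `ladder_cmd` is a hypothetical name, not part of deepxiv, and it prints the command for a rung instead of running it, so nothing is fetched by accident.

```shell
# Print the deepxiv command for a given ladder rung (1-5).
# Usage: ladder_cmd <paper_id> <rung> [section_name]
# ladder_cmd is a hypothetical helper for illustration, not a deepxiv command.
ladder_cmd() {
  case "$2" in
    1) printf '%s\n' "deepxiv paper $1 --brief" ;;
    2) printf '%s\n' "deepxiv paper $1 --head" ;;
    3) printf '%s\n' "deepxiv paper $1 --preview" ;;
    4)
      # Rung 4 needs a section name, which should come from the --head output.
      if [ -z "${3:-}" ]; then
        echo "rung 4 needs a section name" >&2
        return 2
      fi
      printf '%s\n' "deepxiv paper $1 --section \"$3\""
      ;;
    5) printf '%s\n' "deepxiv paper $1" ;;
    *)
      echo "unknown rung: $2" >&2
      return 2
      ;;
  esac
}
```

For example, `ladder_cmd 2409.05591 4 Method` prints the rung-4 command with the section name quoted, which keeps multi-word names such as Related Work intact.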

Setup

Before using deepxiv, verify it is available:

deepxiv --help

If missing, stop and tell the user — do not install it on your own. If the user asks you to install it, follow references/install.md, which has per-OS instructions, and only run install commands after the user explicitly approves them. deepxiv requires Python 3.10+.
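
A script can make the same check without installing anything. The sketch below assumes a POSIX shell; `have_cmd` and `check_deepxiv` are hypothetical helper names.

```shell
# Return 0 if the named command is on PATH (or defined as a shell function).
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Report whether deepxiv is available. Never installs anything.
check_deepxiv() {
  if have_cmd deepxiv; then
    echo "deepxiv found"
  else
    echo "deepxiv not found; ask the user before installing" >&2
    return 1
  fi
}
```

Calling `check_deepxiv || exit 1` at the top of a script stops early and leaves the install decision to the user.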

Health and diagnostics (safe to run any time):

deepxiv health     # API + token reachability check
deepxiv debug      # environment diagnostics

Decision: which command do I want?

User wants…	Start with
"Find papers about X"	`search` (with filters if narrow)
"What's hot recently in X?"	`trending`
"Explain this paper" (has ID)	`paper --brief` → `--head` → section
"Compare these N papers"	`paper --brief` for each, then targeted sections
Biomedical / PubMed paper	`pmc <PMC_ID>`
Has only Semantic Scholar ID	`sc <id>`
"Who is this author / what's this project?"	`wsearch`
"Is this paper actually getting traction?"	`paper --popularity`
Open-ended multi-step research question	`agent query` (see caveats)

Core commands

search — arXiv search with filters

deepxiv search "agent memory" --limit 5
deepxiv search "multimodal reasoning" --limit 10 --format json

Filters (combine freely):

# Category filter (arXiv categories)
deepxiv search "retrieval" --categories cs.IR,cs.CL --limit 5

# Date window
deepxiv search "diffusion" --date-from 2025-01-01 --date-to 2025-06-30

# Citation floor — useful to skip obscure preprints
deepxiv search "world model" --min-citations 50 --limit 5

# Search mode: hybrid (default), bm25 (literal), vector (semantic)
deepxiv search "chain of thought" --mode bm25 --limit 5
deepxiv search "models that can think before answering" --mode vector

Defaults:

--limit 3 to 5 for triage; raise only when explicitly needed
--format json whenever you intend to post-process (pipe to jq)
Use bm25 for exact phrasing, vector for fuzzy concepts, hybrid otherwise

paper — get an arXiv paper

deepxiv paper 2409.05591 --brief        # rung 1
deepxiv paper 2409.05591 --head         # rung 2
deepxiv paper 2409.05591 --preview      # rung 3
deepxiv paper 2409.05591 --section Method   # rung 4
deepxiv paper 2409.05591                # rung 5: full
deepxiv paper 2409.05591 --raw          # rung 5: raw markdown (full)
deepxiv paper 2409.05591 --popularity   # social impact / trending signal

Section names come from --head. Common names: Introduction, Related Work, Method, Experiments, Results, Discussion, Limitations, Conclusion. Names are paper-specific — do not guess; check --head first if unsure.
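
One way to avoid guessing is to match the requested name against the real list before calling --section. The source does not show the JSON shape of --head, so this sketch assumes the section names have already been extracted to plain text, one per line; `match_section` is a hypothetical helper.

```shell
# Read section names (one per line) on stdin and print the first one that
# equals the requested name, ignoring case. Returns 1 if nothing matches.
# How the names are extracted from --head is left open on purpose.
match_section() {
  _found=$(grep -i -x -F -- "$1" | head -n 1)
  if [ -z "$_found" ]; then
    return 1
  fi
  printf '%s\n' "$_found"
}
```

With names Introduction, Method, and Results on stdin, `match_section method` prints Method, while `match_section Methodology` prints nothing and returns 1, which is the cue to re-read the --head output instead of guessing.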

Use --popularity when the user asks "is this paper a big deal" or you need to rank by attention rather than citations.

pmc — PubMed Central / biomedical

deepxiv pmc PMC544940 --head
deepxiv pmc PMC544940

PMC currently returns JSON only. Use when the target is biomedical or a PMC ID is given.

sc — Semantic Scholar lookup

deepxiv sc 258001
deepxiv sc 258001 --json

Use when the user gives a Semantic Scholar ID, or when you need richer metadata (citation graph, author info) for an arXiv paper that you have already cross-referenced.

trending — hot papers

deepxiv trending --days 7 --limit 10 --json
deepxiv trending --days 30 --limit 5

--days accepts only 7, 14, or 30. Use for weekly digests and "what's hot" requests.
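
A wrapper script can reject other values before calling the CLI. A minimal sketch, with `valid_days` as a hypothetical helper name:

```shell
# Return 0 only for the window sizes that trending accepts: 7, 14, or 30.
valid_days() {
  case "$1" in
    7|14|30) return 0 ;;
    *)
      echo "--days must be 7, 14, or 30 (got: $1)" >&2
      return 1
      ;;
  esac
}
```

Typical use would be `valid_days "$days" && deepxiv trending --days "$days" --limit 10 --json`.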

wsearch — web search

deepxiv wsearch "karpathy"
deepxiv wsearch "DeepSeek R1 release notes" --json

Use for non-paper context: author background, project home pages, blog posts, release announcements. Cheap and broad — good for grounding before a paper read.

agent query — LLM-powered research agent

deepxiv agent query "Compare RAG vs long-context for code QA"
deepxiv agent query "Latest agent memory papers" --max-turn 10 --verbose

This is a multi-turn research agent that can search and read papers on its own. Caveats:

Requires the user to run deepxiv agent config once to set up their preferred LLM
Consumes LLM usage on the user's account
Slower and less predictable than manual search + paper flows
Prefer manual progressive reading by default; reach for agent query only when the question is genuinely open-ended and the user has agreed to the cost
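
The consent requirement can be made explicit in a wrapper. This is a sketch only: `AGENT_CONSENT` is a hypothetical environment variable invented for the example, not a deepxiv setting, and `agent_query` is a hypothetical function name.

```shell
# Run `deepxiv agent query` only after an explicit opt-in.
# AGENT_CONSENT is hypothetical; deepxiv itself does not read it.
agent_query() {
  if [ "${AGENT_CONSENT:-no}" != "yes" ]; then
    echo "agent query skipped: user has not agreed to the LLM cost" >&2
    return 1
  fi
  deepxiv agent query "$@"
}
```

Once the user has agreed, `AGENT_CONSENT=yes agent_query "Latest agent memory papers" --max-turn 10` passes all arguments through unchanged.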

JSON post-processing

When you need to slice search/trending output, prefer JSON + jq over re-running text searches:

deepxiv search "agent memory" --limit 10 --format json \
  | jq -r '.[] | "\(.arxiv_id)\t\(.citations // 0)\t\(.title)"' \
  | sort -k2 -n -r

# any(...) emits each ID once, even when several of its categories match
deepxiv trending --days 7 --limit 20 --json \
  | jq -r '.[] | select(any(.categories[]?; test("cs\\.(AI|CL|LG)"))) | .arxiv_id'

Recommended workflows

Topic exploration

"Find me recent papers on agent memory":

deepxiv search "agent memory" --limit 5 --format json (add --date-from if the user says "recent")
paper <id> --brief for each promising hit
Pick 1–2 for deeper reading
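
The first two steps can be chained in a script. The sketch below takes arXiv IDs on stdin, one per line, and runs the rung-1 read for each; `brief_each` is a hypothetical helper name.

```shell
# Read arXiv IDs (one per line) on stdin and print the --brief view of each.
brief_each() {
  while IFS= read -r id; do
    [ -n "$id" ] || continue            # skip blank lines
    echo "== $id =="
    # </dev/null keeps the CLI from consuming the remaining IDs on stdin
    deepxiv paper "$id" --brief </dev/null
  done
}
```

Assuming the search JSON is an array of objects with an `arxiv_id` field, as in the jq examples above, a pipeline could look like `deepxiv search "agent memory" --limit 5 --format json | jq -r '.[].arxiv_id' | brief_each`.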

Single paper explanation

"Explain this paper <id>":

paper <id> --brief
paper <id> --head
Read 1–2 sections most relevant to the question (or --preview if unsure)
Summarize, and say which rung you stopped at

Baseline / comparison table

"Put together the baselines for this research direction":

Narrow search with --categories and optionally --min-citations
--brief every candidate
Read only Method / Experiments / Results for top picks
Extract: paper, task, dataset, metric, score, key idea

Author / project background check

"What else has this paper's author worked on?" / "What is the background of this project?":

deepxiv wsearch "<author or project>" --json
If a related arXiv paper surfaces, climb the reading ladder on it
Optionally sc <id> for citation context

Citation-aware filtering

"Find influential related work":

deepxiv search "..." --min-citations 100 --format json
Sort by citations via jq
Triage with --brief

Hot digest

"This week's hot papers":

deepxiv trending --days 7 --limit 10 --json
--brief the top picks
Optional --popularity to rank by attention
Compact digest: theme overview → one line per paper → which to read deeper

See references/workflows.md for fuller versions of these.

Output rules

Always say which rung of the ladder informed your conclusion (e.g. "based on --brief only" vs. "after reading the Method section")
Do not make claims about sections you did not actually read
Prefer concise bullet summaries when comparing multiple papers
Keep context use low: small --limit, climb the ladder only as needed
For literature reviews, prefer iterative narrowing over one giant search

Common failure modes

Auth or rate-limit issues — run deepxiv health to check service reachability. If rate-limited or unauthorized, say so plainly and stop; do not silently retry.
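
In a script, the "say so and stop" rule might look like the sketch below. It assumes that `deepxiv health` exits non-zero when the check fails, which the source does not confirm; `require_healthy` is a hypothetical helper name.

```shell
# Stop instead of retrying when the service check fails.
# Assumption: `deepxiv health` exits non-zero on failure.
require_healthy() {
  if ! deepxiv health >/dev/null 2>&1; then
    echo "deepxiv health failed: possible auth or rate-limit problem; not retrying" >&2
    return 1
  fi
}
```

Running `require_healthy || exit 1` before a batch of requests surfaces the problem once instead of failing on every call.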

Paper not found — verify ID format and source: arXiv (2409.05591), PMC (PMC544940), Semantic Scholar (258001). If unsure which, try wsearch first.
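
A rough format check can pick the right command before any request is made. The patterns below are inferred only from the three example IDs above, so treat them as assumptions: other ID shapes, such as old-style arXiv IDs, are not covered. `classify_id` is a hypothetical helper name.

```shell
# Guess the source of an ID from its shape: arxiv, pmc, sc, or unknown.
# Patterns are inferred from the example IDs only and are not exhaustive.
classify_id() {
  if printf '%s\n' "$1" | grep -Eq '^[0-9]{4}\.[0-9]{4,5}(v[0-9]+)?$'; then
    echo arxiv        # e.g. 2409.05591, optionally with a version suffix
  elif printf '%s\n' "$1" | grep -Eq '^PMC[0-9]+$'; then
    echo pmc          # e.g. PMC544940
  elif printf '%s\n' "$1" | grep -Eq '^[0-9]+$'; then
    echo sc           # digits only, e.g. 258001
  else
    echo unknown
    return 1
  fi
}
```

A result of unknown is the cue to fall back to wsearch instead of guessing a source.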

Section name mismatch — section names are paper-specific. Run --head to list real section names before --section.

Over-reading — do not jump to full text when --brief, --preview, or one section would do.

Python 3.9 install — deepxiv may install via pip but crash on first run. Switch to a Python 3.10+ environment.
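
The version requirement can be checked before installing. A minimal sketch, with `py_version_ok` as a hypothetical helper that compares the major and minor parts numerically:

```shell
# Return 0 if a "major.minor[.patch]" version string is at least 3.10.
py_version_ok() {
  _major=${1%%.*}       # text before the first dot
  _rest=${1#*.}         # text after the first dot
  _minor=${_rest%%.*}   # text before the next dot, if any
  [ "$_major" -gt 3 ] || { [ "$_major" -eq 3 ] && [ "$_minor" -ge 10 ]; }
}
```

The numeric comparison matters because a plain string comparison would rank 3.9 above 3.10. One way to feed it the running interpreter's version is `py_version_ok "$(python3 -c 'import platform; print(platform.python_version())')"`.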

Good defaults

search --limit: 3–5 for triage, 10 max
trending --limit: 5–10
Section reads per paper: 1–2 unless asked otherwise
Full paper reads: opt-in only
agent query: only for genuinely open-ended multi-step research, with user consent
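
The search limits above can be enforced in a wrapper. A sketch, assuming an integer argument; `clamp_limit` is a hypothetical helper, and the floor of 1 is an added sanity check that the source does not mention.

```shell
# Default to 5 results and cap at 10, following the search defaults above.
# Expects an integer argument; the floor of 1 is an extra sanity check.
clamp_limit() {
  _n=${1:-5}
  if [ "$_n" -gt 10 ]; then _n=10; fi
  if [ "$_n" -lt 1 ]; then _n=1; fi
  echo "$_n"
}
```

For example, `deepxiv search "agent memory" --limit "$(clamp_limit 25)"` would run with a limit of 10.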
