ScienceDirect PDF Download
Download PDF files from ScienceDirect articles to the user's local disk.
Prerequisites
- The user must have access to the article (institutional subscription, open access, or personal subscription).
- If the article is behind a paywall and the user has no access, the download will fail.
Single Article Download
Step 1: Navigate to article page and extract PDF link
If already on the article page, skip navigation. Otherwise use navigate_page with initScript:
navigate_page({
url: "{BASE_URL}/science/article/pii/{PII}",
initScript: "Object.defineProperty(navigator, 'webdriver', {get: () => undefined})"
})
Then extract the PDF URL with evaluate_script:
async () => {
// Wait for page to load
for (let i = 0; i < 20; i++) {
if (document.querySelector('a[href*="pdfft"]') || document.querySelector('.access-options')) break;
await new Promise(r => setTimeout(r, 500));
}
const pdfLink = document.querySelector('a[href*="pdfft"][class*="accessbar"]')
|| document.querySelector('a[href*="pdfft"]');
if (pdfLink) {
return { pdfUrl: pdfLink.href };
}
const noAccess = document.querySelector('.access-options, .get-access, [class*="GetAccess"]');
if (noAccess) {
return { error: 'No access. User needs institutional or subscriber access.' };
}
return { error: 'PDF link not found on this page.' };
}
Step 2: Navigate to PDF URL
Open the PDF URL with initScript:
navigate_page({
url: "{pdfUrl}",
initScript: "Object.defineProperty(navigator, 'webdriver', {get: () => undefined})"
})
Step 3: Handle Cloudflare verification
After navigation, wait 5s then check the page state with evaluate_script:
async () => {
await new Promise(r => setTimeout(r, 5000));
return {
contentType: document.contentType,
title: document.title,
url: window.location.href.substring(0, 80)
};
}
Three possible outcomes:
A) contentType === 'application/pdf' → PDF loaded directly. Skip to Step 4.
B) Title is "Security verification" or body contains "Are you a robot" → Turnstile checkbox appeared. Auto-click:
- Use
take_snapshotto get the a11y tree. Find the checkbox inside the Cloudflare iframe:Iframe "包含 Cloudflare 安全质询的小组件" checkbox "确认您是真人" ← target uid - Use
click(uid)on the checkbox element. (Chrome DevTools MCP works at CDP protocol level and can interact with cross-origin iframes.) - Wait 5s, then re-check
document.contentType:- If
application/pdf→ success, proceed to Step 4. - If still on verification → retry click once. After 2 failed attempts, tell user: "请在浏览器中完成验证后告知我。"
- If
C) Title is "请稍候…" → JS challenge still running. Wait another 5s and re-check. Usually resolves to A or B.
Step 4: Trigger download to local disk
Once contentType === 'application/pdf', the PDF is displayed in browser but NOT saved to disk yet. Trigger the actual download with evaluate_script:
(pii) => {
const a = document.createElement('a');
a.href = window.location.href;
a.download = pii + '-main.pdf';
document.body.appendChild(a);
a.click();
document.body.removeChild(a);
return { downloaded: true, filename: pii + '-main.pdf' };
}
Pass the PII as an argument so the filename is meaningful (e.g. S0022169426003136-main.pdf).
Tell the user the PDF has been downloaded and the filename.
Batch Download from Search Results
Note: Batch download requires authentication. Checkboxes and download buttons are only visible to logged-in users.
Step 1: Select articles
On the search results page, select articles using checkboxes. Each checkbox id is the PII.
Step 2: Submit download form
Use evaluate_script:
(piis) => {
piis.forEach(pii => {
const cb = document.getElementById(pii);
if (cb && !cb.checked) cb.click();
});
const downloadBtn = document.querySelector('.download-all-link-button');
if (downloadBtn) {
downloadBtn.click();
return { success: true, message: 'Download initiated for selected articles.' };
}
const form = document.querySelector('form[action*="pdf/download"]');
if (!form) return { error: 'Download form not found. User may need to log in.' };
return { error: 'Download button not found.' };
}
PDF URL Pattern
{BASE_URL}/science/article/pii/{PII}/pdfft?md5={HASH}&pid=1-s2.0-{PII}-main.pdf
The md5 hash is unique per article and session. It cannot be constructed — it must be extracted from the page.
Notes
- This skill is set to
disable-model-invocation: true— it must be explicitly invoked with/sd-download. - Open access articles can be downloaded without authentication.
- For subscription articles, the user must be authenticated (institutional access, VPN, or personal subscription).
- After solving Cloudflare once, subsequent PDF downloads in the same session work without interruption.
- If download fails, suggest the user check their access status or try logging in.