Add Directories
Workflow
- Parse the input (URL or pasted text) into a list of directories
- Deduplicate against existing entries in directories.json
- Append new entries with required fields
- Classify by running the analysis and verification pipeline
- Discover forms for submission targets
- Submit via automation or manual browser interaction
Step 1: Parse Input
From URL
Fetch the page and extract directory entries. Look for patterns like:
- Name + URL pairs in lists, tables, or cards
- Structured data (JSON-LD, markdown tables, CSV)
- Repeated DOM patterns with links
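As one way to handle the "repeated DOM patterns with links" case, the sketch below collects every anchor on a page as a (name, URL) pair using only the standard library. The names `LinkCollector` and `extract_links` are hypothetical and not part of the project scripts; fetching the page and filtering out navigation links are left to the caller.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect (link text, absolute URL) pairs from every <a href=...>."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self._href = None   # URL of the anchor currently being read
        self._text = []     # text fragments seen inside that anchor

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page URL.
                self._href = urljoin(self.base_url, href)
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse runs of whitespace in the link text.
            name = " ".join("".join(self._text).split())
            self.links.append((name, self._href))
            self._href = None


def extract_links(html, base_url):
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links
```

Structured sources (JSON-LD, CSV, markdown tables) are usually easier to parse directly, so this fallback is best reserved for pages that offer nothing else.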
From GitHub Topics/Repos
Use gh CLI to explore curated lists:
- gh repo clone <owner>/<repo> to clone awesome-lists
- Parse README.md for directory links (markdown link format)
- Check for JSON/YAML data files with directory entries
- Can also create PRs to add your product to these lists
From Pasted Text
Parse lines/rows. Common formats:
- Name - https://url.com or Name | https://url.com
- Markdown links: [Name](https://url.com)
- Markdown tables with Name and URL columns
- Plain URLs (one per line) — derive name from domain
- CSV/TSV with headers
Extract at minimum: name and url (submission or homepage).
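The formats listed above can be handled by a single line-oriented parser. This is a minimal sketch, not the project's actual parser: `parse_line`, `parse_pasted`, and `name_from_domain` are hypothetical names, and the bare-URL pattern deliberately stops at commas and pipes so that CSV and table cells are not swallowed.

```python
import re
from urllib.parse import urlparse

# Markdown link: [Name](https://url.com)
MD_LINK = re.compile(r"\[([^\]]+)\]\((https?://[^)\s]+)\)")
# Bare URL; stops at whitespace, commas, pipes, and closing parens
# (an assumption that breaks URLs containing those characters).
BARE_URL = re.compile(r"https?://[^\s,|)]+")


def name_from_domain(url):
    """Derive a display name from the first label of the domain."""
    host = urlparse(url).netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return host.split(".")[0].capitalize()


def parse_line(line):
    """Return {"name", "url"} for one pasted line, or None if it has no URL."""
    line = line.strip()
    md = MD_LINK.search(line)
    if md:
        return {"name": md.group(1).strip(), "url": md.group(2)}
    match = BARE_URL.search(line)
    if not match:
        return None  # header rows, table separators, blank lines
    # Whatever precedes the URL is treated as the name, minus separators.
    name = line[: match.start()].strip(" \t-|,:")
    url = match.group(0)
    return {"name": name or name_from_domain(url), "url": url}


def parse_pasted(text):
    return [entry for entry in map(parse_line, text.splitlines()) if entry]
```

Quoted CSV cells are not handled here; for real CSV/TSV input with headers, the standard `csv` module is a safer choice.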
Step 2: Deduplicate
Load directories.json and check each parsed entry against existing ones by:
- Exact URL match (normalize: strip trailing slash, lowercase domain)
- Domain match (same domain = likely duplicate)
- Name match (case-insensitive)
Report duplicates to the user and skip them.
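The three checks above might be implemented along these lines. The function names are hypothetical, and treating `www.` as equivalent to the bare domain is an added assumption beyond the normalization rules stated in the list.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url):
    """Lowercase scheme and domain, strip the trailing slash from the path."""
    parts = urlsplit(url.strip())
    return urlunsplit((
        parts.scheme.lower(),
        parts.netloc.lower(),
        parts.path.rstrip("/"),
        parts.query,
        parts.fragment,
    ))


def domain_of(url):
    host = urlsplit(url.strip()).netloc.lower()
    # Assumption: www.example.com and example.com are the same directory.
    return host[4:] if host.startswith("www.") else host


def split_duplicates(parsed, existing):
    """Return (new_entries, [(duplicate_entry, reason), ...])."""
    urls, domains, names = set(), set(), set()

    def remember(entry):
        for key in ("url", "submission_url"):
            if entry.get(key):
                urls.add(normalize_url(entry[key]))
                domains.add(domain_of(entry[key]))
        names.add(entry["name"].casefold())

    for entry in existing:
        remember(entry)

    new, duplicates = [], []
    for entry in parsed:
        if normalize_url(entry["url"]) in urls:
            duplicates.append((entry, "url"))
        elif domain_of(entry["url"]) in domains:
            duplicates.append((entry, "domain"))
        elif entry["name"].casefold() in names:
            duplicates.append((entry, "name"))
        else:
            new.append(entry)
            remember(entry)  # also catches repeats inside the same batch
    return new, duplicates
```

Returning the reason alongside each duplicate makes it straightforward to report to the user why an entry was skipped.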
Step 3: Append New Entries
For each new directory, create an entry with this structure:
{
"categories": ["General"],
"description": "",
"is_active": true,
"name": "Directory Name",
"pricing_type": "free",
"slug": "directory-name",
"submission_url": "https://example.com/submit",
"url": "https://example.com/submit"
}
Field rules:
- slug: lowercase name, spaces to hyphens, strip special chars
- submission_url and url: use the submission/signup URL if available, otherwise homepage
- description: leave empty string (will be filled later or by user)
- categories: default ["General"] unless context provides a category
- pricing_type: default "free" unless explicitly marked paid
- is_active: always true for new entries
Save the updated directories.json.
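The field rules can be sketched as two small helpers. `slugify` and `build_entry` are hypothetical names, and the `"paid"` value for pricing_type is an assumption, since the rules above only name the `"free"` default.

```python
import re


def slugify(name):
    """Lowercase, spaces to hyphens, strip everything except a-z, 0-9, and '-'."""
    slug = re.sub(r"\s+", "-", name.lower().strip())
    slug = re.sub(r"[^a-z0-9-]", "", slug)
    # Collapse hyphen runs left behind by stripped characters.
    return re.sub(r"-{2,}", "-", slug).strip("-")


def build_entry(name, url, category=None, paid=False):
    """Build a new directories.json entry following the field rules."""
    return {
        "categories": [category] if category else ["General"],
        "description": "",
        "is_active": True,
        # Assumption: "paid" is the value used for paid listings.
        "pricing_type": "paid" if paid else "free",
        "name": name,
        "slug": slugify(name),
        "submission_url": url,
        "url": url,
    }
```

For example, `build_entry("Directory Name", "https://example.com/submit")` produces the structure shown in Step 3.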
Step 4: Classify
Run the pipeline scripts in order using the project venv at .venv/:
# 1. HTTP-level analysis (auth, captcha, pricing signals, dead domains)
.venv/bin/python analyze_directories.py
# 2. Cleanup obvious failures + build browser check list
.venv/bin/python cleanup_and_categorize.py
# 3. Browser verification with Playwright (10 concurrent workers)
.venv/bin/python browser_verify.py
# 4. Deep recheck any remaining unknowns
.venv/bin/python browser_verify.py --recheck-unknown
Each script reads/writes directories.json. Scripts 3 and 4 (both browser_verify.py runs) use browser_check_list.json as intermediate state, which script 2 generates.
After completion, report the summary: how many added, and the auth/status breakdown for the new entries.
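Because each script depends on the output of the previous one, it can help to chain the four commands so that a failure stops the run. The wrapper below is a hypothetical convenience, not one of the project scripts; it assumes it is launched from the project root on a POSIX system.

```python
import subprocess

PYTHON = ".venv/bin/python"

# Script name plus arguments, in the order given in Step 4.
PIPELINE = [
    ["analyze_directories.py"],
    ["cleanup_and_categorize.py"],
    ["browser_verify.py"],
    ["browser_verify.py", "--recheck-unknown"],
]


def pipeline_commands(python=PYTHON):
    """Return the full argv list for each pipeline step."""
    return [[python, *step] for step in PIPELINE]


def run_pipeline(python=PYTHON):
    for cmd in pipeline_commands(python):
        print("running:", " ".join(cmd))
        # check=True raises CalledProcessError, so later steps never run
        # against the output of a failed earlier step.
        subprocess.run(cmd, check=True)
```

Calling `run_pipeline()` runs all four steps in order.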
Step 5: Discover Forms
For directories that are active and have auth_type = none or auth_type = email_password:
# Discover form fields on submission pages
.venv/bin/python discover_forms.py
This visits each submission URL with Playwright, extracts form fields via DOM queries, and updates submission_plan.json with discovered fields and form paths.
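The eligibility rule at the start of this step amounts to a simple filter over directories.json. The sketch assumes the classification pipeline stores the result under an `auth_type` key with the string values `"none"` and `"email_password"`; `eligible_for_discovery` is a hypothetical name.

```python
ELIGIBLE_AUTH = {"none", "email_password"}


def eligible_for_discovery(directories):
    """Keep active directories whose auth type allows form discovery."""
    return [
        d for d in directories
        if d.get("is_active") and d.get("auth_type") in ELIGIBLE_AUTH
    ]
```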
Step 6: Submit
Automated Submission
Configure the PRODUCT dict in submit_directories.py with your details (search for YOUR_ placeholders), then:
# Auto-submit to all discovered directories
.venv/bin/python submit_directories.py
The script uses heuristic field mapping (matching field names/labels to product data) and handles file uploads for logo/screenshot.
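To illustrate what heuristic field mapping can look like, the sketch below matches whole words in a field's name and label against a small keyword table. It is not the code in submit_directories.py: the keyword table, the product keys, and `map_fields` are all hypothetical.

```python
import re

# Hypothetical keyword table: product key -> words that suggest it.
# More specific keys come first so "Product URL" maps to url, not name.
FIELD_HINTS = [
    ("email", ("email", "mail")),
    ("url", ("url", "website", "link", "homepage")),
    ("description", ("description", "about", "summary")),
    ("name", ("name", "title")),
]


def map_fields(fields, product):
    """Map discovered form fields to product values by keyword match."""
    mapping = {}
    for field in fields:
        text = f"{field.get('name', '')} {field.get('label', '')}".lower()
        # Compare whole words so "username" does not match "name".
        words = set(re.findall(r"[a-z]+", text))
        for key, hints in FIELD_HINTS:
            if key in product and words.intersection(hints):
                mapping[field["name"]] = product[key]
                break
    return mapping
```

Fields that match nothing are left out of the mapping, which corresponds to the no_fields_matched outcome when the mapping comes back empty.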
Manual Browser Submission (via Playwright MCP)
For directories that need manual interaction (captcha, OAuth, complex forms), use the Playwright browser tools:
- Navigate to the submission URL
- Take a snapshot to understand the page structure
- Fill form fields using browser_fill_form or browser_type
- Handle OAuth flows by switching tabs when Google login popups open
- Upload files via browser_file_upload
- Click submit and verify confirmation
GitHub PR Submissions
Some directories accept submissions via GitHub PRs to awesome-lists:
- Fork the repo: gh repo fork <owner>/<repo>
- Clone and create a branch
- Add your product entry following the repo's format
- Push and create PR: gh pr create
Notes
Pipeline Scripts
- analyze_directories.py uses ThreadPoolExecutor with plain HTTP for a fast first pass
- cleanup_and_categorize.py triages errors (dead domains, invalid URLs, Facebook groups) and builds browser_check_list.json
- browser_verify.py uses async Playwright with 10 concurrent tabs; --recheck-unknown does a deep DOM pass on active unknowns only
- discover_forms.py uses async Playwright with 10 concurrent tabs; extracts form field names, types, labels, and paths
- submit_directories.py uses async Playwright with 5 concurrent tabs; heuristic field mapping with file upload support
- All scripts are idempotent and safe to re-run
Common Submission Blockers
When evaluating or submitting to directories, watch for these issues:
| Blocker | Frequency | How to Detect |
|---|---|---|
| Paid listing required | ~20% | Look for pricing page, Stripe/PayPal links, "$" on submit page |
| reCAPTCHA / Turnstile | ~10% | iframe[src*=recaptcha] or [data-turnstile] elements |
| Broken captcha | ~2% | "Invalid site key" errors, disabled submit buttons |
| Login/account required | ~15% | Redirect to /login or /register on submit URL |
| Business email required | ~3% | Rejects gmail/yahoo domains (e.g., SoftwareSuggest) |
| Reciprocal link required | ~5% | Old web directories require backlink before listing |
| Newsletter-only forms | ~10% | Page looks like submit but is actually email signup |
| Backend API broken | ~2% | Form submits but returns GraphQL/API errors |
| Domain parked/dead | ~8% | No content, parking page, DNS failure |
| Cloudflare blocked | ~3% | Challenge page, 403 errors |
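A few of the detection hints in the table can be turned into cheap checks on an HTTP response. This is a rough sketch with hypothetical names and labels; the patterns are deliberately loose and will produce false positives (for example, any 403 is reported as Cloudflare), so results should be treated as hints for later browser verification.

```python
import re
from urllib.parse import urlsplit

PAID = re.compile(r"stripe\.com|paypal\.com|\$\s?\d", re.IGNORECASE)
CAPTCHA = re.compile(r"recaptcha|data-turnstile", re.IGNORECASE)
LOGIN_PATH = re.compile(r"/(login|register)(/|$)")


def detect_blockers(status_code, final_url, html):
    """Return a list of likely blockers for a fetched submission page.

    final_url is the URL after redirects; html is the response body.
    """
    blockers = []
    if PAID.search(html):
        blockers.append("paid")
    if CAPTCHA.search(html):
        blockers.append("captcha")
    if LOGIN_PATH.search(urlsplit(final_url).path):
        blockers.append("login_required")
    if status_code == 403:
        blockers.append("cloudflare_blocked")
    return blockers
```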
Automation Tips
- Simple HTML forms have highest auto-submit success rate
- reCAPTCHA v3 (invisible) sometimes passes; v2 (checkbox) never does automatically
- Google Forms are reliably automatable
- Rich text editors (TinyMCE, Quill) need browser_evaluate to set content
- Cloudinary/custom upload widgets often break automation; use manual browser submission instead
- Cross-origin OAuth popups: Switch tabs with browser_tabs action to handle Google login
- Combobox/select fields: Use browser_click on the dropdown, then click the option
- Multi-step forms: Take snapshot after each step to see new fields
Submission Plan Structure
Each entry in submission_plan.json contains:
{
"directory_name": "Example AI",
"submission_url": "https://example.com/submit",
"status": "discovered",
"copy": {
"title": "Product Title Variation",
"description": "Product description variation for this directory."
},
"discovered_fields": [...],
"form_path": "form#submit-form",
"credentials": {
"email": "YOUR_EMAIL",
"name": "YOUR_NAME",
"username": "YOUR_USERNAME",
"password": "YOUR_PASSWORD"
}
}
Status values: discovered, submitted, skipped, skipped_paid, timeout, no_form_found, no_fields_matched, submit_timeout, captcha, cloudflare_blocked, domain_parked, skipped_login_required, deferred.
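A quick way to review a run is to tally entries by status and flag any value outside the list above. The sketch assumes submission_plan.json loads as a list of entry dicts; `summarize_statuses` is a hypothetical helper.

```python
from collections import Counter

KNOWN_STATUSES = {
    "discovered", "submitted", "skipped", "skipped_paid", "timeout",
    "no_form_found", "no_fields_matched", "submit_timeout", "captcha",
    "cloudflare_blocked", "domain_parked", "skipped_login_required",
    "deferred",
}


def summarize_statuses(plan):
    """Return (counts per status, sorted list of unrecognized statuses)."""
    counts = Counter(entry.get("status") for entry in plan)
    unknown = sorted(str(s) for s in counts if s not in KNOWN_STATUSES)
    return counts, unknown
```

An unrecognized status usually points to a typo or to a script version that writes a status this list does not yet document.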
Best ROI Directory Types (for AI/SaaS products)
- AI tool directories with simple forms (FutureTools, SaaSHub, AItools.inc, etc.)
- Startup directories with Google Form submissions
- GitHub awesome-lists accepting PRs (free, high-quality backlinks)
- NoCode/SaaS aggregators (NoCodeList, NoCodeDevs)
- General web directories with DA≥30 (for SEO value)
Security Note
Before pushing to GitHub, ensure all personal data is stripped:
- Search for YOUR_ placeholders in submission_plan.json and submit_directories.py
- Never commit real emails, passwords, or API keys
- The .playwright-mcp/ folder may contain console logs with personal data; add it to .gitignore
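A pre-push check along these lines can catch credentials that were filled in and never reset. It is a sketch under two assumptions: submission_plan.json loads as a list of entries shaped like the example in Submission Plan Structure, and every safe credential value starts with `YOUR_`. `leaked_credentials` is a hypothetical name.

```python
def leaked_credentials(plan):
    """Return (directory_name, field) pairs holding non-placeholder values."""
    leaks = []
    for entry in plan:
        for field, value in entry.get("credentials", {}).items():
            # Empty values are harmless; anything else must be a placeholder.
            if value and not str(value).startswith("YOUR_"):
                leaks.append((entry.get("directory_name", "?"), field))
    return leaks
```

An empty result means every credential field still holds a placeholder; it does not cover secrets stored elsewhere, such as the PRODUCT dict or log files.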