User Guide Automation for Web Applications
Use this skill when the user wants to create a formal, detailed user guide from a web application, especially when the output must be a Markdown document ready for later DOCX conversion.
Core idea
This workflow is based on browser exploration, not source-code inspection. The guide is built from what the web app shows in a live browser session:
- page structure from
browser_snapshot() - visual confirmation from
browser_vision() - interactions with
browser_click(),browser_type(),browser_scroll(), andbrowser_press() - console checks from
browser_console()when needed
If the user provides a flow, follow that flow exactly. If the user does not provide a flow, explore the web app and infer the main user journey.
This skill currently supports one output mode:
- Sprint-1 / draft mode — output Markdown only, but structure it like a final document so it can be converted to DOCX later without re-authoring.
When a reference document is provided (for example a PDF user guide), extract its structural pattern first and turn it into a reusable Markdown blueprint before writing the actual guide. Preserve:
- front matter / cover pattern
- table of contents pattern
- section lettering or numbering style
- repeated subflow pattern (e.g. daftar/tambah/detail/edit/upload/hapus/arsip)
- screenshot placement rhythm
- instructional wording style
The default file naming convention is:
guide.mdfor the main documentscreenshots/for evidence assets- one nested folder per feature or workflow stage inside
screenshots/
Default requirements
- Output format: Markdown (.md) only
- Style: formal
- Detail level: detailed
- Screenshots: capture every step, not only important steps
- Audience: admin, regular user, or mixed depending on the web app
- Template: use the provided structure as a Markdown blueprint; do not generate DOCX in this skill
Slash-command intake form
When the user runs /user-guide-automation, ask for the minimum required information in a copy-ready format:
- URL:
- Lokasi output:
- Scope: full site / fitur tertentu (sebutkan fiturnya)
Optional but recommended:
- Role:
- Credential:
- Flow:
- Template path:
- Output filename:
- Catatan khusus:
Rules:
- If the user chooses a specific feature scope, stay within that feature unless related navigation is needed.
- If the user chooses full site, explore the main journeys and major feature groups.
- If credentials are provided, use them only for the requested app and flow.
Recommended workflow
1) Intake
Collect or confirm:
- target web app URL
- target user role, if relevant
- user-provided flow, if any
- DOCX template path
- output filename or folder
- whether the guide should cover one feature or the whole workflow
- screenshot output directory behavior (default: create
screenshots/inside the output folder)
If the user did not provide a flow, proceed with exploration.
When the user provides only a top-level output folder, create this structure:
{output_dir}/
├── guide.md
├── screenshots/
│ ├── dashboard/
│ ├── ai-characters/
│ └── ...
└── references/
Use one subfolder per feature, page group, or workflow stage so screenshots stay organized. Use lowercase kebab-case folder names when possible.
2) Explore the web app
Open the app and identify the real user journey. Use the browser tools in this order whenever useful:
browser_navigate(url=...)browser_snapshot()browser_vision(question=..., annotate=true)browser_click(ref=...)browser_type(ref=..., text=...)browser_scroll(direction=...)browser_press(key=...)browser_console(clear=true)after navigation or significant interaction
For pages with animations, lazy loading, or scroll-revealed content, do not judge a section from one viewport alone. Scroll in small increments, pause briefly, then re-check with browser_snapshot() and/or browser_vision() before deciding content is missing or empty.
During exploration, collect only the UI details needed for the guide:
- page title
- menu names
- button labels
- field labels
- success/error messages
- breadcrumb/path names
Do not over-extract text. Keep it lean.
3) Build the step list
Create a step list that matches the real interaction sequence. If a user flow exists, preserve the provided order. If not, infer a practical order from the app UI.
Each step should include:
- step number
- action verb
- target UI element or page
- short explanatory note if needed
- screenshot reference
For sprint-1 Markdown drafts, organize the step list into document-like sections first, then steps. Use the same top-level section order that a DOCX template would later expect, so the MD stays conversion-friendly.
4) Capture screenshots for every step
Take screenshots for all steps, including intermediate ones. Use screenshots as instruction illustrations, not as bug evidence.
If a step changes the page state, capture the new state immediately.
Practical lesson: when the workflow includes login, capture a dedicated login screenshot before entering the protected area. Do not reuse a later dashboard screenshot as evidence for the login step; keep the login step, post-login dashboard, and feature page as separate screenshots so the guide stays clear and accurate.
5) Write the guide text
Write formal Indonesian. Preferred structure:
- title/cover
- table of contents if the template supports it
- section headings
- numbered steps
- concise but detailed instructions
- screenshot captions
- notes or warnings only when needed
Keep wording instructional and neutral. Avoid QA language such as severity, issue, or defect.
6) Write the Markdown guide
If a template or blueprint is provided:
- reuse the existing Markdown structure
- preserve header/footer-equivalent sections only as text blocks
- keep the hierarchy consistent
- do not switch the output to DOCX in this skill
If the structure is blank:
- populate it cleanly with the generated guide content
7) Verify the final Markdown
Check that:
- all steps are present
- screenshots match the correct steps
- the language is formal
- the Markdown structure is complete and readable
- the output filename is correct
When to use user-provided flow vs exploration
Use the user’s flow when:
- the user already knows the exact process
- the app is sensitive or complex
- the workflow must match a SOP exactly
Use exploration when:
- the user does not know the exact steps
- the app is unfamiliar
- the user wants the guide to be discovered from the live UI
Token-saving rules
- Do not read or restate entire pages unless necessary
- Prefer
browser_snapshot()for structure and refs - Prefer
browser_vision()only when layout or visuals matter - Reuse a stable guide template
- Keep captions short
- Avoid repetitive descriptions across steps
Important distinctions from dogfood
Dogfood is QA-oriented and looks for bugs. This skill is documentation-oriented and looks for user actions.
Shared pieces borrowed from dogfood:
- browser exploration
- screenshots per step
- console checks when useful
- systematic flow execution
Differences from dogfood:
- no severity classification
- no bug reporting
- no reproduction section
- output is a user guide, not a QA report
Suggested output pattern
- If the user gives a template path, treat it as a Markdown blueprint reference rather than a DOCX target
- If the user gives a duplicateable blank template, copy its structure into Markdown first
- Save the final document as
.md - Optionally save screenshot assets alongside it if needed for review
Markdown blueprint mode
When the user wants a DOCX-like structure in Markdown first:
- start with the same cover/TOC/header order the final document will use
- keep section numbering stable
- preserve repeated subsection patterns across modules
- use short placeholder captions for screenshots
- keep the document ready for later DOCX conversion without changing the content hierarchy
Practical checklist
Before finishing, confirm:
- The guide follows the right flow
- Every step has a screenshot
- The writing is formal
- The template formatting is preserved
- The Markdown file is complete and usable