AI Task Hub

Formerly skill-hub-gateway.

Public package boundary:

Only orchestrates portal.skill.execute, portal.skill.poll, portal.skill.presentation, portal.account.connect, portal.account.balance, and portal.account.ledger.
Does not exchange api_key or userToken inside this package.
Does not handle recharge or payment flows inside this package.
Optional env hints PUBLIC_BRIDGE_ENTRY_HOST and AI_TASK_HUB_ENTRY_HOST are only host-side entry_host fallbacks, not API secrets, auth tokens, or billing credentials.
Prefers attachment URLs, and when host runtime explicitly exposes attachment bytes for the current request, forwards only that explicit attachment material through the public bridge before execution.
When the published skill is invoked directly by a third-party agent runtime, it uses POST /agent/public-bridge/invoke.
published skill persistence = disabled.
continuity owner = host_or_private_wrapper.

Data Handling Boundary

Only forwards attachment bytes that the host runtime explicitly provides for the current request.
Off-host media transfer is limited to the gateway-controlled host https://gateway-api.binaryworks.app.
Public upload handoff is limited to POST /agent/public-bridge/upload-file for the same request flow.
Does not read local paths, scan the local filesystem, or guess files outside explicit host-provided attachment material.
Does not persist uploaded bytes or credentials to local disk, and does not write skill/config state.
Host/runtime should obtain user consent before forwarding media and should avoid sending sensitive or regulated data unless the user explicitly approved that transfer.

Read This First

Do not mix connector lifecycle commands with published skill actions.
connect / status / invoke / logout are connector lifecycle commands for host/runtime installation state.
portal.account.connect / portal.skill.execute / portal.skill.poll / portal.skill.presentation / portal.account.balance / portal.account.ledger are the published skill actions.
Do not treat portal.skill.execute -> portal.skill.poll -> portal.skill.presentation as the default path for every capability. Default behavior must follow the capability delivery mode.
Public capability inventory below lists only enabled capabilities intentionally advertised to agents. Disabled or internal-only routes may still exist in backend code but are not part of the advertised package surface.

Two Operational Surfaces

Connector lifecycle commands:
- connect: start or resume browser authorization for one connector installation
- status: read whether the connector installation already has continuity
- invoke: call one published skill action through the connector runtime
- logout: clear continuity for that connector installation
Published skill actions:
- portal.account.connect: explicit account bind or connection-status check
- portal.skill.execute: submit a capability run
- portal.skill.poll: poll a submitted run
- portal.skill.presentation: fetch rendered outputs for a run
- portal.account.balance: read current points balance
- portal.account.ledger: read points ledger rows

Public Capability Inventory (Enabled And Available)

Image analysis:
- human_detect
- image_tagging
- face-detect
- body-keypoints-2d
- face-emotion-recognition
Background removal / cutout / mask:
- person-instance-segmentation
- person-semantic-segmentation
- concert-cutout
- full-body-matting
- head-matting
- product-cutout
Audio:
- asr
- tts_report
Document:
- markdown_convert
Retrieval:
- embeddings
- reranker
Generation:
- image-generation
Video:
- Video Face Generation

Delivery Mode Guidance

instant_result
- Default host behavior: prefer a result-first entrypoint and return the final result in the same interaction.
- Typical capabilities: tts_report, image analysis, retrieval.
- Default product path: use result-first host flow; portal.skill.execute is usually enough.
aggregated_short_wait
- Default host behavior: prefer a result-first entrypoint with short internal waiting, then fall back to poll only if the short wait does not complete.
- Typical capabilities: image-generation.
- Default product path: start with result-first host flow; only use portal.skill.poll when the short wait does not complete.
asset_delivery
- Default host behavior: keep presentation or rendered files as part of the formal result surface.
- Typical capabilities: segmentation, matting, cutout.
- Default product path: portal.skill.execute -> portal.skill.poll, then portal.skill.presentation when files or rendered outputs are needed.
long_running
- Default host behavior: treat the capability as an explicit async job and do not promise single-turn completion.
- Typical capabilities: asr, markdown_convert, video-face-generation.
- Default product path: portal.skill.execute with later portal.skill.poll; use portal.skill.presentation only when that capability later exposes a rendered result surface.

Minimal Architecture Rules

Default path stays connector-first. Do not invent a host-specific continuity or identity model beside connector/runtime.
Default path stays result-first. Do not teach every capability as execute -> poll -> presentation.
portal.skill.poll and portal.skill.presentation are follow-up surfaces, not universal default steps.
Published skill keeps compatibility and asset-delivery surfaces, but it does not own long-lived continuity.
Do not ask end users for manual URLs, JSON field names, local paths, or internal bridge details unless they explicitly ask for debugging.

Non-Recommended Patterns

Do not bypass connector/runtime as the default host product path when hosted connector or local bootstrap is available.
Do not describe raw published skill actions as if they were the primary UX for every capability.
Do not add host-specific auth, continuity, or capability-routing rules outside the existing connector/runtime and gateway contracts.
Do not expose debugging payload structure, bridge layering, or upload choreography to end users unless troubleshooting is explicitly requested.

Official Host Integration

Current built-in host integrations for mobileclaw, openclaw, codex, and claude require the AI Task Hub connector/runtime to own continuity from the first call.
Future public hosts should follow the same connector/runtime continuity contract instead of inventing a host-specific identity model.
For remote URL / OAuth / connection-record style hosts, use the hosted connector runtime: POST /agent/hosted-connector/install, POST /agent/hosted-connector/connect, POST /agent/hosted-connector/invoke, POST /agent/hosted-connector/status, POST /agent/hosted-connector/logout.
For local command-only hosts, use the shared connector/runtime bootstrap outside this published package.
POST /agent/public-bridge/invoke is the underlying transport used by that connector/runtime, and also the compatibility fallback when a host bypasses the connector during debugging or manual integration.
Trusted host runtime that can safely hold bridge assertion secret may still use POST /agent/skill/bridge/invoke.
published skill persistence remains disabled even when connector/runtime is present; long-lived continuity belongs in host or connector state, not in this package.

User-Facing Response Policy

When users upload images, audio, documents, or video and ask for a capability, prefer executing immediately only when the host runtime has already supplied an explicit attachment object or explicit attachment bytes for that request.
Do not explain image_url, attachment.url, storage URLs, bridge layers, host uploads, input normalization, or controlled media domain details to end users unless they explicitly ask for technical debugging.
Do not ask end users to provide manual URLs, JSON field names, or upload-chain instructions; those are internal host-to-skill mechanics.
If the runtime supports attachment handling, limit processing to the explicit attachment object supplied for the current request and keep the upload/URL handoff scoped to execute/poll/presentation for that same request.
Only when execution actually fails and the user must intervene should you mention missing processable files, incomplete authorization, or retry guidance, using user-oriented language without exposing internal layering.

Chinese documentation: SKILL.zh-CN.md

When to Use This Skill

Use this skill when the user asks to:

detect faces, human presence, body keypoints, image tags, or facial emotion from images
generate person/product segmentation, mask, cutout, or matting outputs that this public package explicitly exposes
transcribe uploaded audio into text (speech to text, audio transcription)
generate speech from text input (text to speech, voice generation)
convert uploaded files into markdown (document to markdown)
start async jobs and check status later (poll, check job status)
fetch rendered visual outputs such as overlay, mask, and cutout
run embedding or reranking tasks for retrieval workflows
check current account points balance or recent points ledger rows

Common Requests

Example requests that should trigger this skill:

"Detect faces in this image and return bounding boxes."
"Tag this image and summarize the main objects."
"Remove the background from this product photo."
"Create a clean cutout from this portrait image."
"Transcribe this meeting audio into text."
"Generate speech from this paragraph."
"Convert this PDF file into markdown."
"Start this job now and let me poll the run status later."
"Fetch overlay and mask files for run_456."
"Generate embeddings for this text list and rerank the candidates."
"Check my current points balance."
"Show my recent points ledger from 2026-03-01 to 2026-03-15."

Search-Friendly Capability Aliases

vision aliases: face detection, human detection, person detection, image tagging
background aliases: remove background, background removal, cutout, matting, product-cutout
asr aliases: speech to text, audio transcription, transcribe audio
tts aliases: text to speech, voice generation, speech synthesis
markdown_convert aliases: document to markdown, file to markdown, markdown conversion
poll aliases: check job status, poll long-running task, async run status
presentation aliases: rendered output, overlay, mask, cutout files
account.balance aliases: points balance, credits balance, remaining points
account.ledger aliases: points ledger, credits history, points statement
embeddings/reranker aliases: vectorization, semantic vectors, relevance reranking

Public discovery boundary for visual capabilities:

This published skill only advertises visual capabilities whose backing services are currently enabled for public delivery.
Disabled or internally retained legacy routes are intentionally omitted from discovery references and capability manifests even if related backend code still exists.

Runtime Contract

Default API base URL: https://gateway-api.binaryworks.app Published package policy: outbound base URL is locked to the default API base URL to reduce token exfiltration risk.

Action to endpoint mapping:

portal.skill.execute -> POST /agent/skill/execute
portal.skill.poll -> GET /agent/skill/runs/:run_id
portal.skill.presentation -> GET /agent/skill/runs/:run_id/presentation
portal.account.connect -> POST /agent/public-bridge/invoke (explicit connect/status check only)
portal.account.balance -> GET /agent/skill/account/balance
portal.account.ledger -> GET /agent/skill/account/ledger

Install Mechanism & Runtime Requirements

This skill is instruction-first and does not define a remote installer flow.
Runtime execution is limited to bundled local scripts under scripts/*.mjs.
Required runtime binary is node (as declared in metadata.openclaw.requires.bins).
No remote download-to-exec install chain is used (curl|wget ... | sh|bash|python|node is not part of this package).

Auth Contract

Third-party agent entry mode (official host integrations should route through connector/runtime):

Officially supported host integrations should install or provision the connector/runtime and let it manage entry_user_key; do not ask end users to manage continuity manually.
If a host bypasses connector/runtime and invokes the published skill directly, use POST /agent/public-bridge/invoke and persist the same entry_user_key outside this published skill package.
Do not require end users to provide any credential.
Use portal.account.connect when host/runtime wants an explicit browser-connect preflight instead of waiting for a protected action to fail.
Connector/runtime remains required for the official host integration path even when browser authorization is not yet required.
With TRIAL_ENABLED and available trial points, first-time calls may proceed without browser authorization.
On first use without an existing binding, gateway can proceed without browser authorization when TRIAL_ENABLED and trial points are available; AUTHORIZATION_REQUIRED is returned only for conditional upgrade paths (for example trial exhausted or trial-disabled rollback).
The returned authorization_url may include gateway_api_base_url; preserve it when completing browser authorization so /agent-auth/complete is posted back to the same API environment that created the auth session.
Host/runtime should show authorization_url to the user, persist entry_user_key, then retry the same action with that same entry_user_key.
Connector/runtime must preserve that same entry_user_key before and after browser authorization.
When AUTHORIZATION_REQUIRED or portal.account.connect returns connector_install, treat it as the official npm connector/runtime guidance for hosts that have not yet provisioned the supported continuity layer.
connector_install refers to the official connector package outside this published skill package.
That connector is the required continuity owner for officially supported host integrations, even though account authorization may still be deferred until trial exhaustion or policy upgrade.
For OpenClaw/MobileClaw-style local hosts, follow the connector guide referenced by connector_install.guide_url, then continue the same browser authorization_url flow with the same entry_user_key.
If gateway later returns AUTHORIZATION_REQUIRED with details.likely_cause=ENTRY_USER_KEY_NOT_REUSED, details.recovery_action=REUSE_ENTRY_USER_KEY, and details.reauthorization_required=false, host should restore the previously persisted entry_user_key and retry without sending the user through browser authorization again.

Identifier format constraints used by gateway auth:

agent_uid must match ^agent_[a-z0-9][a-z0-9_-]{5,63}$.
conversation_id must match ^[A-Za-z0-9._:-]{8,128}$.
In deployed bridge mode, host may pass its own stable runtime agent identifier and the gateway bridge will canonicalize it server-side.

Host-side token bridge (outside published package):

To keep this package compliant and low-privilege, this published runtime does not issue or accept caller-managed task tokens.
Preferred deployed bridge endpoint for third-party agent entry: connector/runtime should call POST /agent/public-bridge/invoke.
Trusted host runtime that can safely hold bridge assertion secret may continue to use POST /agent/skill/bridge/invoke.
These bridge endpoints are served by gateway runtime, not bundled into this published package, and do not require caller-managed credentials.
published skill persistence = disabled; continuity must stay in host_or_private_wrapper, not inside this published package.
Bridge request body should include action, agent_uid, conversation_id, and optional payload.
conversation_id should be a host-generated opaque session/install identifier, not a public chat ID, raw thread ID, or PII.
Public bridge should resolve a stable external user binding when available; if the binding is missing and trial conditions are satisfied, first-time onboarding can continue without browser authorization, while conditional upgrade paths return a host-owned authorization URL plus entry_user_key.
Cross-conversation account continuity requires reusing the same entry_user_key; public bridge intentionally does not accept owner overrides.
Gateway bridge will canonicalize agent_uid, repair binding when missing, issue short-lived internal task token, and run the action server-side.
portal.skill.execute through public bridge is write-capable and should send options.confirm_write=true after user confirmation; otherwise gateway may return ACTION_CONFIRMATION_REQUIRED.
base_url, gateway_api_key, api_key, user_token, agent_task_token, owner_uid_hint, and install_channel overrides are rejected by the deployed bridge endpoint.
Recommended host behavior: persist entry_user_key, normalize agent_uid, and re-run the same bridge action after authorization completes.

Host integration modes:

connector-managed interactive (recommended): connector/runtime calls POST /agent/public-bridge/invoke, surfaces the returned host-owned authorization URL to the user when needed, persists returned entry_user_key, and retries after authorization completes.
trusted host bridge (secondary): a trusted backend you control may call POST /agent/skill/bridge/invoke with its own bridge assertion secret.
Published skill package itself does not open browser, persist credentials, or perform OAuth/token exchange flows.
The authorization URL above is owned by deployed gateway/admin-web pages, not by this skill package runtime.
Successful public bridge responses add data.agent_guidance.bridge_auth with continuity_owner=host_or_private_wrapper, published_skill_persistence=disabled, and the returned bridge_context.
Public bridge failures that include entry context add error.details.bridge_auth so host/runtime can recover continuity outside the published skill package.

Compatibility and Debug Transport Reference

Default product path for official hosts remains connector-first and result-first.

Use the raw transport reference below only when:

a host is in compatibility mode and can preserve the same entry_user_key itself
a trusted backend is integrating directly
debugging requires checking the raw bridge contract

Preferred raw transport for third-party agent entry (normally owned by connector/runtime, and also usable as a compatibility fallback when a host can preserve continuity itself):

Deployed bridge API:

{
  "entry_host": "<host_runtime>",
  "action": "portal.account.balance",
  "agent_uid": "support_assistant",
  "conversation_id": "host_session_20260316_opaque_001",
  "payload": {}
}

Send that body to POST /agent/public-bridge/invoke.
entry_host must match the active host runtime. Current built-in examples are mobileclaw, openclaw, codex, claude; future lowercase host slugs can follow the same contract when connector/runtime is configured for them.
Example mappings: MobileClaw -> mobileclaw, OpenClaw -> openclaw, Codex -> codex, Claude -> claude.
If the host cannot pass entry_host explicitly, export PUBLIC_BRIDGE_ENTRY_HOST or AI_TASK_HUB_ENTRY_HOST, or install the package under the host-specific runtime path so host inference stays correct.
This is the underlying production transport for connector-managed third-party integration.
With TRIAL_ENABLED and available trial points, first-time onboarding can complete without browser authorization.
On first use, gateway may return AUTHORIZATION_REQUIRED with authorization_url and entry_user_key only when conditional authorization upgrade is required (for example trial exhausted).
Persist entry_user_key and retry with the same value after user authorization completes.
Preserve any gateway_api_base_url embedded in the authorization flow so the completion request lands on the same gateway API environment.
agent_uid should be your host-defined stable runtime agent identifier.
conversation_id should be your host-generated opaque session/install identifier; it is not tied to Telegram or any single tool and does not determine account ownership.
Use the same entry_user_key across conversations when those conversations should share one account.

Trusted host runtime secondary mode:

If you control the upstream backend and it can safely hold bridge assertion secret, use POST /agent/skill/bridge/invoke.
This path is for trusted host runtime only, not OpenClaw / MobileClaw / Codex / Claude style third-party entry.

Action payload templates (same for public bridge and trusted host bridge mode):

portal.skill.execute

{
  "capability": "human_detect",
  "input": { "image_url": "https://files.example.com/demo.png" },
  "request_id": "optional_request_id"
}

portal.skill.poll

{ "run_id": "run_123" }

portal.skill.presentation

{ "run_id": "run_123", "channel": "web", "include_files": true }

portal.account.connect

{ "connect_mode": "browser", "auth_session_id": "optional_existing_auth_session" }

portal.account.balance

{}

portal.account.ledger

{ "date_from": "2026-03-01", "date_to": "2026-03-15" }

Agent-side decision flow:

For official host integration, prefer connector install/connect/invoke/status/logout or the hosted connector lifecycle over direct published-skill invocation.
Only bypass connector/runtime and call POST /agent/public-bridge/invoke directly when you are in a compatibility or debugging path and can preserve the same entry_user_key yourself.
Default capability path must follow delivery mode, not a universal three-step recipe:
- instant_result: prefer result-first flow; use portal.skill.execute only when dropping to published skill actions.
- aggregated_short_wait: prefer result-first flow; fall back to portal.skill.poll only if the short wait does not complete.
- asset_delivery: use portal.skill.presentation only when the capability needs formal rendered assets.
- long_running: keep explicit async expectations and use portal.skill.poll as the normal follow-up.
Explicit account linking: call portal.account.connect, surface the returned authorization_url when present, and keep reusing the same entry_user_key.
Account query: call portal.account.balance or portal.account.ledger directly.
Keep conversation_id as session context only; do not use it as the account key.
For cross-conversation continuity in third-party entry mode, persist and reuse the same entry_user_key; do not pass owner_uid_hint to the public bridge endpoint.
If AUTHORIZATION_REQUIRED is returned, show authorization_url, persist entry_user_key, then retry the same action after user authorization completes.
If AUTHORIZATION_REQUIRED includes details.likely_cause=ENTRY_USER_KEY_NOT_REUSED, do not open a new auth flow yet; first restore the previously persisted entry_user_key and retry the same bridge call.
Treat details.reauthorization_required=false as a recovery hint that browser re-login is unnecessary for this failure mode.
If AUTH_UNAUTHORIZED + agent_uid claim format is invalid: use canonical agent_uid (agent_...) instead of a short host alias (assistant, planner).
If SYSTEM_NOT_FOUND + agent binding not found: restart the same bridge flow once and let gateway repair binding.

Output parsing contract:

Always parse standard gateway envelope: request_id, data, error.
Treat non-empty error as failure even when HTTP tooling hides status code.

Visualization Playbooks (Agent Guidance)

For successful visual actions (portal.skill.execute, portal.skill.poll, portal.skill.presentation), the script enriches responses with data.agent_guidance.visualization.playbook.
Playbook mapping covers the visual capabilities currently exposed by this published skill (detection/classification/keypoints/segmentation/matting families).
For image-generation, user delivery should be image-first: present the generated image itself and omit structured fields unless the user explicitly asks for source data or debugging details.
Global rendering guardrail for all visual capabilities:
Must use skill-native rendered assets first (overlay/mask/cutout/view_url) when available.
Manual local drawing fallback is disabled by default (allow_manual_draw=false) to avoid inconsistent agent-side rendering.
If rendered assets are missing, fallback is summary-only from structured output (raw/visual.spec), not local drawing.
Example special rule:
body-contour-63pt -> when both rendered assets and geometry are absent, playbook marks status=degraded and recommends fallback capability body-keypoints-2d.

Payload Contract

portal.skill.execute: payload requires capability and input.
payload.request_id is optional and passed through.
portal.skill.poll and portal.skill.presentation: payload requires run_id.
portal.skill.presentation supports include_files (defaults to true).
portal.account.connect: payload may include connect_mode and optional auth_session_id when host/runtime is checking an existing browser bind.
portal.account.balance: payload is optional and ignored.
portal.account.ledger: payload may include date_from + date_to (YYYY-MM-DD, must be provided together).

Attachment normalization:

Prefer explicit image_url / audio_url / file_url / video_url.
attachment.url is mapped to target media field by capability.
When host runtime exposes attachment bytes, this published package forwards only that explicit attachment material through the public bridge and injects the returned URL before execute.
There is no separate portal.upload action in this package; for third-party agent entry, callers should keep using portal.skill.execute, and the bundled runtime will only forward explicit attachment bytes already supplied by the host for the current request.
If a host bypasses the bundled auto-upload helper and implements upload itself, use POST /agent/public-bridge/upload-file for third-party/public entry, not POST /agent/skill/bridge/upload-file.
Local file_path handling is disabled in the published public skill.
The runtime does not scan the local filesystem, guess file locations, expand directories/globs, or read local paths from payload.file_path, input.file_path, attachment.path, or attachment.file_path.
Arbitrary unmanaged local filesystem access remains unsupported; hosts should provide bytes or a bridge-managed URL instead.
Example host upload endpoint: /agent/public-bridge/upload-file.
Video Face Generation requires 2 uploaded files from the user before execution:
- source video -> input.video_url
- merge face image -> input.merge_infos[0].merge_face_image.url
If either required file is missing, agent should ask the user to upload both files first.
Prefer a short source video for testing or smoke runs because these video-generation jobs are asynchronous and slower than image-only tasks.
Do not rely on a single attachment.url auto-mapping for Video Face Generation; host must pass both structured URL fields explicitly.

Error Contract

Preserve gateway envelope: request_id, data, error.
Preserve POINTS_INSUFFICIENT and pass through error.details.recharge_url.

Bundled Files

scripts/skill.mjs
scripts/agent-task-auth.mjs
scripts/base-url.mjs
scripts/attachment-normalize.mjs
scripts/telemetry.mjs (compatibility shim)
references/capabilities.json
references/openapi.json
SKILL.zh-CN.md

ai-task-hub

Safety Notice

Copy this and send it to your AI assistant to learn