Weaviate Database Operations

This skill provides comprehensive access to Weaviate vector databases including search operations, natural language queries, schema inspection, data exploration, filtered fetching, collection creation, and data imports.

Weaviate Cloud Instance

If the user does not have an instance yet, direct them to the Weaviate Cloud console to register and create a free sandbox instance.

Environment Variables

Required:

WEAVIATE_URL - Your Weaviate Cloud cluster URL
WEAVIATE_API_KEY - Your Weaviate API key

External Provider Keys (auto-detected): Set only the keys your collections use. Refer to Environment Requirements for more information.
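
A quick pre-flight check can catch a missing variable before any script runs. The sketch below is only an illustration: `missing_env_vars` is a hypothetical helper, not part of the skill's scripts, and it checks only the two required variables named above.

```python
import os

# Names taken from the "Required" list above.
REQUIRED_VARS = ("WEAVIATE_URL", "WEAVIATE_API_KEY")


def missing_env_vars(env=None):
    """Return the required variable names that are unset or blank.

    Hypothetical helper for illustration; pass a dict to check something
    other than the current process environment.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("Required Weaviate environment variables are set.")
```

External provider keys are left out on purpose, since which ones matter depends on the vectorizers your collections use.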

Script Index

Search & Query

Query Agent - Ask Mode: Use when the user wants a direct answer to a question based on collection data. The Query Agent synthesizes information from one or more collections and returns a structured response with source citations (collection name and object ID).
Query Agent - Search Mode: Use when the user wants to explore or browse raw objects across one or more collections. Unlike ask mode, this returns the actual data objects rather than a synthesized answer.
Hybrid Search: Default choice for most searches. Provides a good balance of semantic understanding and exact keyword matching. Use this when you are unsure which search type to pick.
Semantic Search: Use for finding conceptually similar content regardless of exact wording. Best when the intent matters more than specific keywords.
Keyword Search: Use for finding exact terms, IDs, SKUs, or specific text patterns. Best when precise keyword matching is needed rather than semantic similarity.
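
To build intuition for why hybrid search is a reasonable default, here is a toy sketch of blending keyword and vector scores with a single weight. It is not Weaviate's fusion implementation; the function names, the min-max normalization, and the sample scores are all assumptions made for illustration. The `alpha` convention (0 favors keyword matches, 1 favors vector similarity) is assumed to mirror the hybrid search weighting.

```python
def normalize(scores):
    """Min-max scale a {doc_id: score} mapping into the range [0, 1]."""
    if not scores:
        return {}
    low, high = min(scores.values()), max(scores.values())
    if high == low:
        # All scores equal: treat every hit as equally strong.
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - low) / (high - low) for doc_id, s in scores.items()}


def blend(keyword_scores, vector_scores, alpha=0.5):
    """Blend two result sets; alpha=0 is keyword only, alpha=1 is vector only."""
    kw, vec = normalize(keyword_scores), normalize(vector_scores)
    combined = {
        doc_id: (1 - alpha) * kw.get(doc_id, 0.0) + alpha * vec.get(doc_id, 0.0)
        for doc_id in set(kw) | set(vec)
    }
    # Highest blended score first; ties are broken by id for a stable order.
    return sorted(combined.items(), key=lambda item: (-item[1], item[0]))


# Made-up scores: "sku-123" is an exact keyword hit, "doc-a" is a semantic match.
keyword_hits = {"sku-123": 4.0, "doc-a": 1.0}
vector_hits = {"doc-a": 0.9, "doc-b": 0.7}

if __name__ == "__main__":
    for alpha in (0.0, 0.5, 1.0):
        print(alpha, [doc_id for doc_id, _ in blend(keyword_hits, vector_hits, alpha)])
```

With the weight at either extreme the ranking collapses to pure keyword or pure semantic ordering, which is the trade-off the three search entries above describe.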

Collection Management

List Collections: Use to discover what collections exist in the Weaviate instance. This should typically be the first step before performing any search or data operation.
Get Collection Details: Use to understand a collection's schema — its properties, data types, vectorizer configuration, replication factor, and multi-tenancy status. Helpful before running searches or imports.
Explore Collection: Use to analyze data distribution, top values, and inspect actual content in a collection. Helpful for understanding what data looks like before querying.
Create Collection: Use to create new collections with custom schemas before importing data. Do not specify a vectorizer unless the user explicitly requests one (the default text2vec_weaviate is used).

Data Operations

Fetch and Filter: Use to retrieve specific objects by ID or strictly filtered subsets of data. Best for precise data retrieval rather than search.
Import Data: Use to bulk import data into an existing collection from CSV, JSON, or JSONL files.
Create Example Data: Use to create example data for immediate use with other skills when no data is available or the user requests toy data.

Recommendations

Start by listing collections if you don't know what's available:
```
uv run scripts/list_collections.py
```
If no collections are available, ask the user whether they want to create example data. Otherwise continue.
```
uv run scripts/example_data.py
```

Get collection details to understand the schema:

```
uv run scripts/get_collection.py --name "COLLECTION_NAME"
```

Explore collection data to see values and statistics:

```
uv run scripts/explore_collection.py "COLLECTION_NAME"
```

Import data to populate a new collection (if needed):

```
uv run scripts/import.py "data.csv" --collection "CollectionName"
```
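
If the data starts out as Python objects rather than a file, it can be written as JSONL first. This is a minimal sketch: `to_jsonl` and the sample rows are hypothetical, and it assumes the importer expects one JSON object per line with keys matching the collection's property names.

```python
import json


def to_jsonl(records):
    """Serialize dicts as JSONL text: one JSON object per line."""
    return "".join(json.dumps(record, ensure_ascii=False) + "\n" for record in records)


# Hypothetical rows; the keys are assumed to match the collection's property names.
rows = [
    {"title": "First article", "body": "Hello"},
    {"title": "Second article", "body": "World"},
]

if __name__ == "__main__":
    # Print the JSONL text; redirect it to a file such as data.jsonl to import it.
    print(to_jsonl(rows), end="")
```

The resulting file can then be passed to the import command in place of "data.csv".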

Do not specify a vectorizer when creating collections unless requested:

```
uv run scripts/create_collection.py Article \
  --properties '[{"name": "title", "data_type": "text"}, {"name": "body", "data_type": "text"}]'
```
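
Hand-writing the `--properties` JSON inside shell quotes is easy to get wrong. As a sketch, the string can be generated instead; `properties_arg` and `create_collection_command` are hypothetical helpers, and the only property keys assumed are the `name` and `data_type` keys shown in the command above.

```python
import json
import shlex


def properties_arg(fields):
    """Build the JSON string for --properties from (name, data_type) pairs."""
    return json.dumps([{"name": name, "data_type": data_type} for name, data_type in fields])


def create_collection_command(collection, fields):
    """Return a shell-quoted command line mirroring the example above."""
    argv = [
        "uv", "run", "scripts/create_collection.py", collection,
        "--properties", properties_arg(fields),
    ]
    return shlex.join(argv)


if __name__ == "__main__":
    print(create_collection_command("Article", [("title", "text"), ("body", "text")]))
```

No vectorizer argument is added, in line with the recommendation to rely on the default unless the user asks for a specific one.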

Choose the right search type:
- Get AI-powered answers with source citations across multiple collections → ask.py
- Get raw objects from multiple collections → query_search.py
- General search → hybrid_search.py (default)
- Conceptual similarity → semantic_search.py
- Exact terms/IDs → keyword_search.py
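
The mapping above can be expressed as a small lookup with hybrid search as the fallback. The script names come from the list; the intent labels and the `pick_search_script` helper are hypothetical and exist only to illustrate the decision.

```python
# Script names come from the list above; the intent labels are hypothetical.
SCRIPT_BY_INTENT = {
    "answer": "ask.py",                # synthesized answer with source citations
    "browse": "query_search.py",       # raw objects from multiple collections
    "general": "hybrid_search.py",     # default choice
    "conceptual": "semantic_search.py",
    "exact": "keyword_search.py",      # exact terms, IDs, SKUs
}


def pick_search_script(intent):
    """Map an intent label to a script, falling back to hybrid search."""
    return SCRIPT_BY_INTENT.get(intent, "hybrid_search.py")


if __name__ == "__main__":
    for label in ("answer", "exact", "not-sure"):
        print(label, "->", pick_search_script(label))
```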

Output Formats

All scripts support:

Markdown tables (default and recommended)
JSON (--json flag)
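
When another program consumes the results, the JSON output is easier to handle than Markdown tables. The sketch below shows one way to call a script with `--json` from Python. `script_argv` and `run_json` are hypothetical helpers, placing `--json` at the end of the command is an assumption, and the shape of the returned JSON is not documented here, so it is parsed generically.

```python
import json
import subprocess


def script_argv(script, *args, as_json=False):
    """Build the argv for a skill script, appending --json on request."""
    argv = ["uv", "run", f"scripts/{script}", *args]
    if as_json:
        argv.append("--json")
    return argv


def run_json(script, *args):
    """Run a script with --json and parse its stdout.

    The structure of the parsed value depends on the script and is not
    specified in this document.
    """
    result = subprocess.run(
        script_argv(script, *args, as_json=True),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(result.stdout)


if __name__ == "__main__":
    # Show the command that would be run, without executing it.
    print(" ".join(script_argv("list_collections.py", as_json=True)))
```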

Error Handling

Common errors:

WEAVIATE_URL not set → Set the environment variable
Collection not found → Use list_collections.py to see available collections
Authentication error → Check API keys for both Weaviate and vectorizer providers
