cloudflare-r2

Comprehensive assistance with cloudflare-r2 development, generated from official documentation.

Safety Notice

This listing is imported from skills.sh public index metadata. Review the upstream SKILL.md and repository scripts before running them.

Copy the command below and send it to your AI assistant to install this skill:

Install skill "cloudflare-r2" with this command: npx skills add serendipityoneinc/srp-claude-code-marketplace/serendipityoneinc-srp-claude-code-marketplace-cloudflare-r2

Cloudflare-R2 Skill

Comprehensive assistance with cloudflare-r2 development, generated from official documentation.

When to Use This Skill

This skill should be triggered when:

  • Working with cloudflare-r2

  • Asking about cloudflare-r2 features or APIs

  • Implementing cloudflare-r2 solutions

  • Debugging cloudflare-r2 code

  • Learning cloudflare-r2 best practices

Quick Reference

Common Patterns

R2 Data Catalog - Apache Iceberg Integration

Enable Data Catalog on bucket (Wrangler CLI)

npx wrangler r2 bucket catalog enable my-bucket

Alternatively, enable via the Dashboard: R2 > Bucket > Settings > R2 Data Catalog > Enable

Note the Warehouse and Catalog URI values shown after enabling; clients need both to connect.

Python - PyIceberg Catalog Connection

from pyiceberg.catalog.rest import RestCatalog

Connection configuration

catalog = RestCatalog(
    name="my_catalog",
    warehouse="<WAREHOUSE_ID>",  # From catalog settings
    uri="<CATALOG_URI>",         # From catalog settings
    token="<API_TOKEN>",         # Admin Read & Write token
)

Create namespace

catalog.create_namespace_if_not_exists("default")
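
The table-creation example below passes df.schema without defining df; these snippets assume df is an existing PyArrow table. A minimal sketch (column names and values are purely illustrative):

import pyarrow as pa

# Hypothetical sample data; any PyArrow table with an explicit schema works
df = pa.table({
    "id": pa.array([1, 2, 3], type=pa.int64()),
    "name": pa.array(["Alice", "Bob", "Carol"]),
})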

Create table with schema

test_table = ("default", "people")
table = catalog.create_table(test_table, schema=df.schema)

Append data

table.append(df)

Query data

result = table.scan().to_arrow()
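
Scans can also be filtered before materializing results; a sketch using PyIceberg's expression API (the column name is illustrative):

from pyiceberg.expressions import EqualTo

# Push the predicate into the scan instead of filtering the Arrow table afterwards
filtered = table.scan(row_filter=EqualTo("name", "Alice")).to_arrow()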

Drop table

catalog.drop_table(test_table)

Data Catalog - API Token Setup

1. Navigate to: R2 > Manage API tokens > Create API token

2. Select "Admin Read & Write" permission (required for catalog access)

3. Save token for authentication

Token must grant both R2 and catalog permissions for Iceberg clients
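
To keep the token out of source code, one option is to read it from an environment variable when building the catalog connection; a minimal sketch (the variable name R2_CATALOG_TOKEN is an arbitrary choice, not an R2 convention):

import os

from pyiceberg.catalog.rest import RestCatalog

catalog = RestCatalog(
    name="my_catalog",
    warehouse="<WAREHOUSE_ID>",
    uri="<CATALOG_URI>",
    token=os.environ["R2_CATALOG_TOKEN"],  # hypothetical variable name
)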

S3-Compatible SDK Usage (boto3)

import boto3

Configure S3 client for R2

s3_client = boto3.client(
    's3',
    endpoint_url='https://<ACCOUNT_ID>.r2.cloudflarestorage.com',
    aws_access_key_id='<ACCESS_KEY_ID>',
    aws_secret_access_key='<SECRET_ACCESS_KEY>',
    region_name='auto',
)

Upload object

s3_client.put_object(
    Bucket='my-bucket',
    Key='path/to/file.txt',
    Body=b'File contents',
)

Download object

response = s3_client.get_object(Bucket='my-bucket', Key='path/to/file.txt')
data = response['Body'].read()

List objects

response = s3_client.list_objects_v2(Bucket='my-bucket', Prefix='path/')
for obj in response.get('Contents', []):
    print(obj['Key'])
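
list_objects_v2 returns at most 1,000 keys per response; for larger listings, boto3's paginator follows continuation tokens automatically:

paginator = s3_client.get_paginator('list_objects_v2')
for page in paginator.paginate(Bucket='my-bucket', Prefix='path/'):
    for obj in page.get('Contents', []):
        print(obj['Key'])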

Delete object

s3_client.delete_object(Bucket='my-bucket', Key='path/to/file.txt')
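
For bulk deletes, delete_objects removes up to 1,000 keys in a single request; a sketch assuming keys_to_delete is a list of key strings:

keys_to_delete = ['path/a.txt', 'path/b.txt']  # illustrative keys
s3_client.delete_objects(
    Bucket='my-bucket',
    Delete={'Objects': [{'Key': k} for k in keys_to_delete]},
)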

Presigned URLs

Generate presigned URL for upload (expires in 1 hour)

presigned_url = s3_client.generate_presigned_url(
    'put_object',
    Params={'Bucket': 'my-bucket', 'Key': 'uploads/file.txt'},
    ExpiresIn=3600,
)

Generate presigned URL for download

download_url = s3_client.generate_presigned_url(
    'get_object',
    Params={'Bucket': 'my-bucket', 'Key': 'path/to/file.txt'},
    ExpiresIn=3600,
)
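
Clients use these URLs with plain HTTP and need no R2 credentials of their own; a sketch using the requests library against the URLs generated above:

import requests

# Upload through the presigned PUT URL
requests.put(presigned_url, data=b'File contents').raise_for_status()

# Download through the presigned GET URL
data = requests.get(download_url).content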

Multipart Upload

Initiate multipart upload

multipart = s3_client.create_multipart_upload(
    Bucket='my-bucket',
    Key='large-file.bin',
)
upload_id = multipart['UploadId']

Upload parts

parts = []
for i, chunk in enumerate(file_chunks, start=1):
    part = s3_client.upload_part(
        Bucket='my-bucket',
        Key='large-file.bin',
        PartNumber=i,
        UploadId=upload_id,
        Body=chunk,
    )
    parts.append({'PartNumber': i, 'ETag': part['ETag']})

Complete multipart upload

s3_client.complete_multipart_upload(
    Bucket='my-bucket',
    Key='large-file.bin',
    UploadId=upload_id,
    MultipartUpload={'Parts': parts},
)
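
If the upload fails partway, the parts already uploaded keep consuming storage until the upload is aborted; a hedged error-handling sketch around the flow above:

try:
    # ... upload_part calls as above ...
    s3_client.complete_multipart_upload(
        Bucket='my-bucket',
        Key='large-file.bin',
        UploadId=upload_id,
        MultipartUpload={'Parts': parts},
    )
except Exception:
    # Clean up the incomplete upload so orphaned parts don't accrue charges
    s3_client.abort_multipart_upload(
        Bucket='my-bucket',
        Key='large-file.bin',
        UploadId=upload_id,
    )
    raise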

Workers Integration

export default {
  async fetch(request, env) {
    const bucket = env.MY_BUCKET; // R2 bucket binding

    // Upload to R2
    await bucket.put('key', 'value', {
      httpMetadata: {
        contentType: 'text/plain',
      },
      customMetadata: {
        user: 'example',
      },
    });

    // Retrieve from R2
    const object = await bucket.get('key');

    if (object === null) {
      return new Response('Object Not Found', { status: 404 });
    }

    // Return object with metadata
    return new Response(object.body, {
      headers: {
        'Content-Type': object.httpMetadata.contentType,
        'ETag': object.httpEtag,
      },
    });
  },
};
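
The env.MY_BUCKET binding assumed above must be declared in the Worker's configuration; in wrangler.toml the declaration looks like this (binding and bucket names are illustrative):

# wrangler.toml
[[r2_buckets]]
binding = "MY_BUCKET"
bucket_name = "my-bucket"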

Bucket CORS Configuration

# Set CORS policy via the S3 SDK (boto3, using the s3_client configured above)
cors_config = {
    'CORSRules': [
        {
            'AllowedOrigins': ['https://example.com'],
            'AllowedMethods': ['GET', 'PUT', 'POST', 'DELETE'],
            'AllowedHeaders': ['*'],
            'MaxAgeSeconds': 3000,
        },
    ],
}

s3_client.put_bucket_cors(
    Bucket='my-bucket',
    CORSConfiguration=cors_config,
)
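
To confirm the policy took effect, the matching read call is get_bucket_cors:

rules = s3_client.get_bucket_cors(Bucket='my-bucket')['CORSRules']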

Object Metadata

Upload with custom metadata

s3_client.put_object(
    Bucket='my-bucket',
    Key='document.pdf',
    Body=file_data,
    Metadata={
        'author': 'John Doe',
        'department': 'Engineering',
        'classification': 'internal',
    },
    ContentType='application/pdf',
)

Retrieve metadata without downloading object

response = s3_client.head_object(Bucket='my-bucket', Key='document.pdf')
metadata = response['Metadata']
content_type = response['ContentType']
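
Metadata on S3-compatible stores is fixed at write time; the standard way to change it on an existing object is a self-copy with MetadataDirective='REPLACE'. A sketch, assuming R2's CopyObject support (new metadata values are illustrative):

s3_client.copy_object(
    Bucket='my-bucket',
    Key='document.pdf',
    CopySource={'Bucket': 'my-bucket', 'Key': 'document.pdf'},
    Metadata={'author': 'Jane Doe', 'classification': 'public'},
    MetadataDirective='REPLACE',  # replace, rather than copy, the metadata
    ContentType='application/pdf',
)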

Data Catalog - Apache Spark Integration

from pyspark.sql import SparkSession

Configure Spark with R2 Data Catalog

spark = (
    SparkSession.builder
    .config("spark.sql.catalog.r2", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.r2.catalog-impl", "org.apache.iceberg.rest.RESTCatalog")
    .config("spark.sql.catalog.r2.uri", "<CATALOG_URI>")
    .config("spark.sql.catalog.r2.warehouse", "<WAREHOUSE_ID>")
    .config("spark.sql.catalog.r2.token", "<API_TOKEN>")
    .getOrCreate()
)

Create table

spark.sql(""" CREATE TABLE r2.default.events ( event_id STRING, timestamp TIMESTAMP, user_id STRING, action STRING ) USING iceberg """)

Insert data

spark.sql(""" INSERT INTO r2.default.events VALUES ('evt123', current_timestamp(), 'user456', 'login') """)

Query data

df = spark.sql("SELECT * FROM r2.default.events WHERE action = 'login'")
df.show()

R2 SQL - Serverless Analytics Query Engine

Query R2 Data Catalog tables with R2 SQL (Wrangler CLI)

npx wrangler r2 sql query "YOUR_WAREHOUSE_NAME" "SELECT * FROM default.table LIMIT 10"

Authentication setup (required before querying)

export WRANGLER_R2_SQL_AUTH_TOKEN="YOUR_API_TOKEN"

API token needs: Admin Read & Write + R2 SQL Read permissions

Create at: R2 > Manage API tokens > Create API token

R2 SQL - Basic Query Patterns

Select with filtering

npx wrangler r2 sql query "warehouse-123" \
"SELECT user_id, event_type, product_id, amount FROM default.ecommerce WHERE event_type = 'purchase' LIMIT 10"

Aggregation queries

npx wrangler r2 sql query "warehouse-123" \
"SELECT user_id, COUNT(*) as transaction_count, SUM(amount) as total_spent FROM default.transactions GROUP BY user_id HAVING total_spent > 1000 ORDER BY total_spent DESC"

Time-based filtering

npx wrangler r2 sql query "warehouse-123" \
"SELECT * FROM default.events WHERE timestamp >= '2024-01-01' AND timestamp < '2024-02-01' ORDER BY timestamp DESC"

Join operations

npx wrangler r2 sql query "warehouse-123" \
"SELECT u.user_id, u.name, t.transaction_id, t.amount FROM default.users u JOIN default.transactions t ON u.user_id = t.user_id WHERE t.fraud_flag = false"

R2 SQL - Stream Processing

Insert from stream to sink table (continuous processing)

npx wrangler r2 sql query "warehouse-123" \
"INSERT INTO ecommerce_sink SELECT * FROM ecommerce_stream"

Filtered stream transformation

npx wrangler r2 sql query "warehouse-123" \
"INSERT INTO high_value_transactions SELECT transaction_id, user_id, amount, timestamp FROM transaction_stream WHERE amount > 10000"

R2 SQL - Advanced Analytics

Window functions for ranked queries

npx wrangler r2 sql query "warehouse-123" \
"SELECT user_id, product_id, amount, ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY amount DESC) as purchase_rank FROM default.purchases QUALIFY purchase_rank <= 5"

Time-series aggregations

npx wrangler r2 sql query "warehouse-123" \
"SELECT DATE_TRUNC('hour', timestamp) as hour, COUNT(*) as event_count, AVG(response_time_ms) as avg_response_time FROM default.api_logs GROUP BY hour ORDER BY hour DESC LIMIT 24"

Fraud detection pattern

npx wrangler r2 sql query "warehouse-123" \
"SELECT user_id, COUNT(*) as transaction_count, SUM(CASE WHEN fraud_flag THEN 1 ELSE 0 END) as fraud_count, AVG(amount) as avg_amount FROM default.transactions WHERE timestamp >= CURRENT_DATE - INTERVAL '7' DAY GROUP BY user_id HAVING fraud_count > 0 ORDER BY fraud_count DESC"

Reference Files

This skill includes comprehensive documentation in references/:

  • api.md - API documentation

  • buckets.md - Buckets documentation

  • getting_started.md - Getting Started documentation

  • other.md - Other documentation

Use view to read specific reference files when detailed information is needed.

Working with This Skill

For Beginners

Start with the getting_started.md reference file for foundational concepts.

For Specific Features

Use the appropriate category reference file (api.md, buckets.md, etc.) for detailed information.

For Code Examples

The quick reference section above contains common patterns extracted from the official docs.

Resources

references/

Organized documentation extracted from official sources. These files contain:

  • Detailed explanations

  • Code examples with language annotations

  • Links to original documentation

  • Table of contents for quick navigation

scripts/

Add helper scripts here for common automation tasks.

assets/

Add templates, boilerplate, or example projects here.

Notes

  • This skill was automatically generated from official documentation

  • Reference files preserve the structure and examples from source docs

  • Code examples include language detection for better syntax highlighting

  • Quick reference patterns are extracted from common usage examples in the docs

Updating

To refresh this skill with updated documentation:

  • Re-run the scraper with the same configuration

  • The skill will be rebuilt with the latest information

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
