How to Stream OTLP to Cloudflare R2 Data Catalog

Use the duckdb-otlp Docker image in r2-data-catalog mode to stream OTLP/HTTP exports into an Iceberg catalog hosted by Cloudflare R2 Data Catalog.

The container initializes DuckDB, loads the required extensions, attaches the R2 Data Catalog warehouse, starts the ingest server, and commits accepted rows in batches.

Choose R2 Data Catalog when you want Cloudflare to host an Iceberg REST catalog for an R2 bucket. To write Iceberg metadata and Parquet files to a regular R2 bucket through the S3-compatible API, use a different catalog setup.

Live ingestion uses OTLP/HTTP on port 4318. WASM builds do not include the ingest server.

Choose a bucket name, and use a Cloudflare account and API token that can create R2 buckets and enable R2 Data Catalog:

Terminal window
export CLOUDFLARE_ACCOUNT_ID=<account-id>
export CLOUDFLARE_API_TOKEN=<r2-admin-read-write-token>
export R2_BUCKET_NAME="duckdb-otlp-r2catalog-${CLOUDFLARE_ACCOUNT_ID}"

Do not paste the API token into logs or source files. The token needs R2 storage read/write and R2 Data Catalog read/write permissions. Cloudflare’s R2 Admin Read & Write token includes both.
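
Before running the Wrangler commands, it can help to confirm that the variables above are set without echoing their values. The following is a minimal sketch; require_vars is a hypothetical helper name, not part of Wrangler or duckdb-otlp.

```shell
# Report which required variables are unset or empty. Only the names are
# printed, so secret values never reach the terminal or a log file.
require_vars() {
  missing=0
  for name in "$@"; do
    # Indirect lookup of the variable called $name (POSIX-compatible).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  unset value
  return "$missing"
}
```

Call it as require_vars CLOUDFLARE_ACCOUNT_ID CLOUDFLARE_API_TOKEN R2_BUCKET_NAME; it returns non-zero if any of them is missing.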

Create the bucket:

Terminal window
wrangler r2 bucket create "$R2_BUCKET_NAME"

Enable R2 Data Catalog on the bucket:

Terminal window
wrangler r2 bucket catalog enable "$R2_BUCKET_NAME"

Wrangler prints the catalog values DuckDB needs:

Catalog URI: 'https://catalog.cloudflarestorage.com/<account-id>/<bucket-name>'
Warehouse: '<account-id>_<bucket-name>'

Save them in your shell:

Terminal window
export R2_CATALOG_URI="https://catalog.cloudflarestorage.com/${CLOUDFLARE_ACCOUNT_ID}/${R2_BUCKET_NAME}"
export R2_WAREHOUSE="${CLOUDFLARE_ACCOUNT_ID}_${R2_BUCKET_NAME}"

Enable R2 Data Catalog table maintenance before sustained ingest. Live OTLP ingest commits rows in batches, so without Cloudflare-managed maintenance a table accumulates many small data files and snapshots.

Terminal window
wrangler r2 bucket catalog compaction enable "$R2_BUCKET_NAME" \
  --target-size 128 \
  --token "$CLOUDFLARE_API_TOKEN"
wrangler r2 bucket catalog snapshot-expiration enable "$R2_BUCKET_NAME" \
  --token "$CLOUDFLARE_API_TOKEN" \
  --older-than-days 7 \
  --retain-last 10

Compaction combines small files into larger files for faster queries; catalog-level compaction applies retroactively to existing tables. Snapshot expiration removes old Iceberg snapshots and unreferenced files according to the retention policy. Snapshot expiration requires Wrangler 4.56.0 or newer.

You also need an R2 S3-compatible access key pair that can write objects to the bucket. Save those values as CLOUDFLARE_ACCESS_KEY_ID and CLOUDFLARE_SECRET_ACCESS_KEY in the next step.

Create cloudflare.env:

DUCKDB_MODE=r2-data-catalog
DUCKDB_OTLP_TOKEN=dev-token-123456
DUCKDB_CATALOG=r2catalog
DUCKDB_SCHEMA=otlp
DUCKDB_QUACK_ENABLED=1
DUCKDB_QUACK_ADDR=0.0.0.0:9494
DUCKDB_QUACK_TOKEN=dev-quack-token-123456
CLOUDFLARE_ACCOUNT_ID=<account-id>
CLOUDFLARE_R2_BUCKET=<bucket-name>
CLOUDFLARE_ACCESS_KEY_ID=<r2-s3-access-key-id>
CLOUDFLARE_SECRET_ACCESS_KEY=<r2-s3-secret-access-key>
CLOUDFLARE_CATALOG_URI=https://catalog.cloudflarestorage.com/<account-id>/<bucket-name>
CLOUDFLARE_CATALOG_TOKEN=<r2-admin-read-write-token>
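
If the values from the earlier steps are already in your shell, the file can be generated rather than typed by hand. This is a sketch under assumptions: write_cloudflare_env is a hypothetical helper, and it expects CLOUDFLARE_ACCESS_KEY_ID and CLOUDFLARE_SECRET_ACCESS_KEY to be set in the shell alongside the variables defined above. The demo tokens are the ones used throughout this guide.

```shell
# Write cloudflare.env from shell variables. The body runs in a subshell so
# the umask change and the early exit do not affect the calling shell.
write_cloudflare_env() (
  out="${1:-cloudflare.env}"
  for name in CLOUDFLARE_ACCOUNT_ID R2_BUCKET_NAME R2_CATALOG_URI \
    CLOUDFLARE_API_TOKEN CLOUDFLARE_ACCESS_KEY_ID CLOUDFLARE_SECRET_ACCESS_KEY; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      exit 1
    fi
  done
  # A newly created file is readable and writable by the owner only.
  umask 077
  # Values are written unquoted because docker --env-file reads them literally.
  cat > "$out" <<EOF
DUCKDB_MODE=r2-data-catalog
DUCKDB_OTLP_TOKEN=dev-token-123456
DUCKDB_CATALOG=r2catalog
DUCKDB_SCHEMA=otlp
DUCKDB_QUACK_ENABLED=1
DUCKDB_QUACK_ADDR=0.0.0.0:9494
DUCKDB_QUACK_TOKEN=dev-quack-token-123456
CLOUDFLARE_ACCOUNT_ID=${CLOUDFLARE_ACCOUNT_ID}
CLOUDFLARE_R2_BUCKET=${R2_BUCKET_NAME}
CLOUDFLARE_ACCESS_KEY_ID=${CLOUDFLARE_ACCESS_KEY_ID}
CLOUDFLARE_SECRET_ACCESS_KEY=${CLOUDFLARE_SECRET_ACCESS_KEY}
CLOUDFLARE_CATALOG_URI=${R2_CATALOG_URI}
CLOUDFLARE_CATALOG_TOKEN=${CLOUDFLARE_API_TOKEN}
EOF
)
```

Run write_cloudflare_env with no arguments to create cloudflare.env in the current directory. The umask only applies when the file is created, so an existing file keeps its current permissions.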

Start the container:

Terminal window
docker run --rm --name duckdb-otlp \
  --env-file cloudflare.env \
  -p 4318:4318 \
  -p 9494:9494 \
  ghcr.io/smithclay/duckdb-otlp:latest

The container creates these Iceberg tables in the R2 Data Catalog namespace if they do not exist:

  • r2catalog.otlp.otlp_logs
  • r2catalog.otlp.otlp_traces
  • r2catalog.otlp.otlp_metrics_gauge
  • r2catalog.otlp.otlp_metrics_sum
  • r2catalog.otlp.otlp_metrics_histogram
  • r2catalog.otlp.otlp_metrics_exp_histogram
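
Startup includes attaching the remote catalog, so the ingest port may not accept connections immediately. A small wait loop, sketched below, avoids posting too early. wait_for_ingest is a hypothetical helper, and the check only assumes that the server answers HTTP on port 4318 once it is listening; it does not verify that the tables exist.

```shell
# Poll the ingest port until it answers HTTP or the attempts run out.
wait_for_ingest() {
  url="${1:-http://localhost:4318/}"
  attempts="${2:-30}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # Without -f, curl exits 0 on any HTTP response (even 404) and
    # non-zero when the connection is refused or times out.
    if curl -s -o /dev/null --max-time 2 "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "ingest server not reachable at $url" >&2
  return 1
}
```

With the defaults, wait_for_ingest retries once per second for up to 30 attempts.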

POST a log record

In another terminal:

Terminal window
curl -sS http://localhost:4318/v1/logs \
-H 'Authorization: Bearer dev-token-123456' \
-H 'Content-Type: application/json' \
-d '{"resourceLogs":[{"resource":{"attributes":[{"key":"service.name","value":{"stringValue":"r2-data-catalog-demo"}},{"key":"deployment.environment","value":{"stringValue":"docs"}}]},"scopeLogs":[{"scope":{"name":"duckdb-otlp-guide"},"logRecords":[{"timeUnixNano":"1704067200000000000","observedTimeUnixNano":"1704067200123456789","severityNumber":9,"severityText":"INFO","body":{"stringValue":"hello from Cloudflare R2 Data Catalog"},"attributes":[{"key":"guide","value":{"stringValue":"stream-to-r2-data-catalog"}}]}]}]}]}'

Response:

{"status":"buffered","rows":1,"batches":1}

Rows are accepted before they are durable. They commit automatically in the background, on graceful shutdown, or immediately after an explicit flush.
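
The timeUnixNano value in the request above is nanoseconds since the Unix epoch, sent as a string; 1704067200000000000 is 2024-01-01T00:00:00Z. To post a record stamped with the current time instead, a value can be built from the shell. This sketch uses whole-second resolution, and now_unix_nano is a hypothetical helper name.

```shell
# Print the current time as nanoseconds since the Unix epoch.
now_unix_nano() {
  # Whole seconds since the epoch, padded with nine zeros for the
  # nanosecond part. Works with both GNU and BSD date.
  printf '%s000000000\n' "$(date +%s)"
}
```

Substitute "$(now_unix_nano)" for the timeUnixNano string when building the JSON payload.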

Query committed rows

The server image is distroless and has no shell or DuckDB CLI, so do not use docker exec ... sh -c for inspection SQL. The examples in this guide enable Quack and publish port 9494 for this purpose.

Flush and query through Quack from a host DuckDB process:

Terminal window
duckdb <<'SQL'
INSTALL quack;
LOAD quack;
FROM quack_query(
'quack:localhost:9494',
'SELECT * FROM otlp_flush(''otlp:0.0.0.0:4318'')',
token = 'dev-quack-token-123456'
);
FROM quack_query(
'quack:localhost:9494',
$$
SELECT service_name, severity_text, body
FROM r2catalog.otlp.otlp_logs
WHERE service_name = 'r2-data-catalog-demo'
ORDER BY time_unix_nano DESC
LIMIT 5
$$,
token = 'dev-quack-token-123456'
);
SQL
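
For repeated inspection queries, the Quack boilerplate can be wrapped in a shell function. This is a sketch that reuses only the quack_query call shape shown above; quack_sql and the host-side DUCKDB_QUACK_TOKEN variable are hypothetical names introduced here.

```shell
# Run one SQL statement on the server through Quack.
# $1 is the SQL text; it must not contain "$$", which delimits the string.
quack_sql() {
  duckdb <<SQL
INSTALL quack;
LOAD quack;
FROM quack_query(
  'quack:localhost:9494',
  \$\$
$1
\$\$,
  token = '${DUCKDB_QUACK_TOKEN:-dev-quack-token-123456}'
);
SQL
}
```

For example, quack_sql "SELECT count(*) FROM r2catalog.otlp.otlp_logs" sends that statement to the server and prints the result.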

Stop cleanly

If you plan to delete the R2 Data Catalog resources immediately, skip this step and use Clean up instead.

Terminal window
docker stop duckdb-otlp

The image sends otlp_stop('otlp:0.0.0.0:4318') during shutdown, so remaining buffered rows are committed before the process exits.
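
By default, docker stop waits 10 seconds after sending the stop signal before it kills the container. If a final commit to R2 might take longer than that, a longer grace period is a reasonable precaution. The sketch below assumes 60 seconds, which is an arbitrary value rather than a measured one; stop_ingest is a hypothetical helper name.

```shell
# Stop the container with a longer grace period than the Docker default.
stop_ingest() {
  # -t sets how many seconds Docker waits before force-killing the container.
  docker stop -t "${1:-60}" duckdb-otlp
}
```

Pass a different number of seconds as the first argument, for example stop_ingest 120.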

Clean up

Drop the Iceberg tables before disabling the catalog and deleting the R2 bucket:

Terminal window
duckdb <<'SQL'
INSTALL quack;
LOAD quack;
FROM quack_query(
'quack:localhost:9494',
$$
SELECT status FROM otlp_stop('otlp:0.0.0.0:4318');
DROP TABLE IF EXISTS r2catalog.otlp.otlp_logs;
DROP TABLE IF EXISTS r2catalog.otlp.otlp_traces;
DROP TABLE IF EXISTS r2catalog.otlp.otlp_metrics_gauge;
DROP TABLE IF EXISTS r2catalog.otlp.otlp_metrics_sum;
DROP TABLE IF EXISTS r2catalog.otlp.otlp_metrics_histogram;
DROP TABLE IF EXISTS r2catalog.otlp.otlp_metrics_exp_histogram;
DETACH r2catalog;
$$,
token = 'dev-quack-token-123456'
);
SQL
docker stop duckdb-otlp

Then disable the catalog and delete the bucket:

Terminal window
wrangler r2 bucket catalog disable "$R2_BUCKET_NAME"
wrangler r2 bucket delete "$R2_BUCKET_NAME"

If bucket deletion reports that the bucket is not empty, delete the remaining catalog objects and retry. For a bucket you created for this guide, those objects are the Iceberg metadata and data files under __r2_data_catalog/:

Terminal window
export R2_OBJECTS_API="https://api.cloudflare.com/client/v4/accounts/${CLOUDFLARE_ACCOUNT_ID}/r2/buckets/${R2_BUCKET_NAME}/objects"
curl -fsS \
  -H "Authorization: Bearer ${CLOUDFLARE_API_TOKEN}" \
  "$R2_OBJECTS_API" |
  jq -r '.result[]?.key' |
  while IFS= read -r key; do
    encoded="$(node -e 'process.stdout.write(encodeURIComponent(process.argv[1]))' "$key")"
    curl -fsS -X DELETE \
      -H "Authorization: Bearer ${CLOUDFLARE_API_TOKEN}" \
      "${R2_OBJECTS_API}/${encoded}" >/dev/null
  done
wrangler r2 bucket delete "$R2_BUCKET_NAME"

Troubleshooting

  • If the container cannot attach the catalog at startup, confirm CLOUDFLARE_CATALOG_URI, CLOUDFLARE_CATALOG_TOKEN, CLOUDFLARE_ACCOUNT_ID, and CLOUDFLARE_R2_BUCKET all refer to the same bucket.
  • If the container attaches the catalog but cannot write files, confirm CLOUDFLARE_ACCESS_KEY_ID and CLOUDFLARE_SECRET_ACCESS_KEY can write objects to the R2 bucket.
  • If no rows appear after a 202 response, the rows are still buffered and not yet committed. Run the otlp_flush command from Query committed rows before querying.
  • If bucket deletion reports that the bucket is not empty, delete the remaining catalog objects as shown in Clean up, then retry wrangler r2 bucket delete.
  • R2 Data Catalog supports R2 buckets in the default jurisdiction.
  • R2 Data Catalog stores live ingest timestamp columns with microsecond precision because the Iceberg catalog does not accept DuckDB TIMESTAMP_NS columns.
  • If DuckDB reports unsupported catalog checkpointing, no action is required; ingest, flush, and stop durability behavior stays the same.
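
To illustrate the microsecond precision noted above, the arithmetic below converts the observedTimeUnixNano value from the sample request to microseconds. It assumes the sub-microsecond digits are truncated; whether ingest truncates or rounds is not stated in this guide. ns_to_us is a hypothetical helper and needs a shell with 64-bit integer arithmetic.

```shell
# Convert nanoseconds since the epoch to microseconds.
ns_to_us() {
  # Integer division drops the last three digits (the sub-microsecond part).
  echo $(( $1 / 1000 ))
}

ns_to_us 1704067200123456789   # prints 1704067200123456
```

Under that assumption, two records whose timestamps differ by less than one microsecond can end up with the same stored value.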