Skip to content

Live Ingest Server

You can run an embedded server that accepts live OpenTelemetry exports and streams them into DuckDB. Point any OpenTelemetry exporter at the server. The server buffers rows, then commits them in batches to the connection’s default catalog, a DuckLake lakehouse, or another attached writable catalog such as an Iceberg REST catalog.

The server speaks three wire protocols across two scheme-bound functions:

FunctionSchemeTransportWire protocol
otlp_serve (default)otlp: (port 4318)HTTPOTLP/HTTP (JSON, NDJSON, protobuf)
otlp_serve(transport := 'grpc')otlp:gRPCOTLP/gRPC unary Export
otap_serveotap: (port 4317)gRPCOTAP/Arrow bidirectional streaming

All three share one buffering/seal core, the same parameters, catalog targeting, token auth, and lifecycle functions — only the wire differs. The transport is not encoded in the scheme: otlp_serve picks HTTP vs gRPC via the transport parameter, and otap_serve is always gRPC. Each function rejects the other’s scheme.

Native extension builds include the server; WASM builds omit it entirely (no HTTP and no gRPC). See gRPC transport for the OTLP/gRPC and OTAP/Arrow service contracts.

For a runnable walkthrough, see the Live Ingest Quickstart. For lakehouse examples, see Stream to Local DuckLake, Stream to Remote DuckLake, Stream to Amazon S3 Tables, and Stream to Cloudflare R2 Data Catalog. For plain files or object storage, see Stream to Parquet. For the implementation model, see Architecture.

The extension registers six server functions (two to start a server, two lifecycle, two diagnostic):

FunctionWhat it does
otlp_serve([uri], ...)Start an OTLP server (OTLP/HTTP, or OTLP/gRPC with transport := 'grpc') and create/validate target tables. Returns one row describing the listener.
otap_serve([uri], ...)Start an OTAP/Arrow gRPC streaming server and create/validate target tables. Same parameters and output as otlp_serve. Returns one row describing the listener.
otlp_flush(uri)Force a synchronous commit of buffered rows when readers need fresh data. Returns commit stats. It leaves catalog maintenance alone.
otlp_stop(uri)Stop the server listening on uri (commits remaining rows first). Returns a status string.
otlp_server_list()List all running servers with live counters, buffer state, and health.
otlp_seal_list()List recent seal attempts with append, commit, row, byte, and error telemetry.

otlp_flush, otlp_stop, otlp_server_list, and otlp_seal_list are transport-agnostic — they dispatch by the scheme-aware canonical listen URI, so an otap: server and an otlp: server are distinct entries managed by the same lifecycle functions.

Starts an OTLP/HTTP ingest server bound to uri. The uri argument is optional; with no argument it defaults to otlp:localhost:4318.

-- Stream into an attached catalog
SELECT * FROM otlp_serve('otlp:localhost:4318', catalog := 'lake', token := 'my-dev-token-123456');

Parameters:

ParameterTypeDefaultDescription
uri (positional)VARCHARotlp:localhost:4318Listen URI. See URI scheme.
catalogVARCHAR(default catalog)Name of the target catalog. Empty means the connection’s default catalog (in-memory or file). Set this to an attached writable catalog such as DuckLake or an Iceberg REST catalog to stream OTLP into a lakehouse. See Catalog targeting.
tokenVARCHAR(random, see below)Auth token clients must present. Must be at least 16 characters. If you omit it, otlp_serve generates a random 32-hex-character token and returns it in auth_token. Ignored when disable_auth := true.
disable_authBOOLEANfalseAccept every request without checking the token. No token is generated or validated, and auth_token comes back empty. Opt-in for trusted local networks and for producers that cannot attach a bearer token (e.g. the otel-arrow OTAP exporter). See Authentication.
schemaVARCHARmainSchema (within the catalog) that holds the target tables.
parquet_export_pathVARCHAR(none)Plain Parquet export root. When set, each seal writes the sealed rows to <root>/<table>/year=YYYY/month=MM/day=DD/*.parquet as the only durable store (no local table copy); a read-only view per signal is created over the files for inspection. Mutually exclusive with a catalog target. Export is at-least-once (a COPY cannot be rolled back).
create_tablesBOOLEANtrueCreate the six target tables if they don’t exist. When false, the tables must already exist with the expected columns or otlp_serve fails fast.
allow_other_hostnameBOOLEANfalseAllow binding to a non-localhost host. By default otlp_serve permits only localhost, 127.0.0.1, and ::1.
max_body_bytesUBIGINT16777216 (16 MiB)Reject request bodies larger than this with 413. Must be greater than zero.
http_threadsUBIGINThost-based bounded defaultWorker threads for concurrent HTTP requests. Must be greater than zero when set.
max_buffered_bytesUBIGINT536870912 (512 MiB)Backpressure cap. POSTs that would exceed this return 503.
seal_target_bytesUBIGINT134217728 (128 MiB)Request an asynchronous seal when admitted, uncommitted request bytes reach this threshold. Larger values write fewer, larger files at the cost of a larger in-memory crash-loss window (still bounded by seal_max_age_ms). Must be greater than zero.
seal_max_age_msBIGINT5000Request an asynchronous seal when the oldest buffered row reaches this age. Must be greater than zero.
target_file_sizeUBIGINT268435456 (256 MiB)DuckLake only. Output Parquet file size the post-seal CHECKPOINT merge bin-packs toward; bounds compaction write amplification (files already at target are left alone). Distinct from seal_target_bytes, which is admitted input bytes. Must be greater than zero.
maintenance_retention_msBIGINT900000 (15 min)DuckLake only. How old snapshots and unused data files must be before the post-seal CHECKPOINT expires and deletes them (expire_older_than / delete_older_than). Keep it longer than your longest read; time-travel below this window is unavailable. Must be greater than zero.

Output columns (one row):

ColumnTypeDescription
listen_uriVARCHARThe otlp: URI the server is bound to.
listen_urlVARCHARThe equivalent http:// base URL (POST endpoints hang off this).
auth_tokenVARCHARThe token clients must present (the value you passed, or the generated one).
schema_nameVARCHARSchema holding the target tables.
logs_tableVARCHARotlp_logs
traces_tableVARCHARotlp_traces
metrics_gauge_tableVARCHARotlp_metrics_gauge
metrics_sum_tableVARCHARotlp_metrics_sum
metrics_histogram_tableVARCHARotlp_metrics_histogram
metrics_exp_histogram_tableVARCHARotlp_metrics_exp_histogram
catalog_nameVARCHARTarget catalog. Empty for the connection’s default catalog.

Starting a second server on the same URI fails (OTLP server already exists). The DuckDB DatabaseInstance owns the server lifetime: DuckDB stops all servers when the database closes, but it does not commit their buffers at that point (see Durability below). Call otlp_stop before closing the database to avoid losing buffered rows.

Starts an OTAP/Arrow gRPC streaming server bound to uri. The uri argument is optional; with no argument it defaults to otap:localhost:4317. otap_serve is gRPC-only — transport must be 'grpc' or omitted (OTAP/Arrow is always gRPC); any other value is rejected.

-- Stream OTAP/Arrow into an attached DuckLake catalog
SELECT listen_uri, listen_url FROM otap_serve('otap:localhost:4317', catalog := 'lake', token := 'my-dev-token-123456');

It takes the same named parameters and returns the same output columns as otlp_serve; only the transport, scheme, and default port differ. It serves the canonical OTAP/Arrow bidirectional-streaming services (Arrow{Logs,Traces,Metrics}Service) for all six signals, sharing the same catalog/Parquet targeting, token auth, buffered group-commit, and lifecycle functions. See gRPC transport for the service and RPC contract.

Two parameter notes for the gRPC path:

  • max_body_bytes becomes the gRPC server’s maximum decoding message size (the largest single BatchArrowRecords/Export message accepted), rather than an HTTP body cap.
  • http_threads is ignored — it tunes only the HTTP worker pool; the gRPC path uses its own async runtime.

Because the wire is gRPC (HTTP/2), an otap: server exposes no HTTP endpoints — the /v1/* POST paths and the /healthz / /readyz probes are HTTP-only. The listen_url output column is still populated with a derived http://host:port string for display, but it is not a usable HTTP base URL; clients connect with a gRPC client, and a liveness probe should use a TCP connect rather than /readyz. The same applies to an otlp_serve(transport := 'grpc') listener.

Forces a synchronous commit: the server writes its in-memory buffer to the target in one transaction before the function returns. Normal ingest can rely on background commits, and otlp_stop performs a final commit. Use otlp_flush when readers need the latest accepted rows while the server stays running. otlp_flush handles durability and read freshness; it leaves compaction and other catalog maintenance alone.

-- Force a commit
SELECT * FROM otlp_flush('otlp:localhost:4318');

Parameters:

ParameterTypeDefaultDescription
uri (positional)VARCHAR(required)Listen URI of the server to flush.

Output columns (one row):

ColumnTypeDescription
statusVARCHARsealed on success, or No server found listening on <uri>. The literal success value means the batch commit completed.
sealed_rowsUBIGINTRows committed by this call.
seals_totalUBIGINTTotal batch commits performed by this server since startup.
errorVARCHARCommit error detail, or NULL if none.

Stops the server listening on uri and frees the port. Any buffered rows are committed first, so a graceful stop loses no data.

SELECT status FROM otlp_stop('otlp:localhost:4318');

Output column:

ColumnTypeDescription
statusVARCHARStopped listening on <uri> if a server was stopped, or No server found listening on <uri> if none matched.

Lists running OTLP servers with live counters and buffer state. Takes no arguments.

SELECT
listen_uri,
catalog_name,
total_rows,
buffered_rows,
last_seal_age_ms AS last_commit_age_ms,
is_listening
FROM otlp_server_list();

Output columns (one row per running server):

ColumnTypeDescription
listen_uriVARCHARThe otlp: listen URI.
listen_urlVARCHARThe http:// base URL.
hostVARCHARBound host.
portUSMALLINTBound port.
catalog_nameVARCHARTarget catalog. Empty for the default catalog.
schema_nameVARCHARSchema holding the target tables.
active_requestsUBIGINTRequests in progress.
total_requestsUBIGINTRequests handled since startup (includes failures).
total_rowsUBIGINTRows accepted (buffered) since startup. Once the buffer drains, this equals the rows committed. A /v1/metrics request counts rows across all four metric tables.
buffered_rowsUBIGINTRows in the buffer that the writer has not committed.
admitted_bytesUBIGINTEncoded request bytes admitted but not yet released by a successful seal.
buffered_bytesUBIGINTApproximate decoded heap held by the in-memory buffers. Unlike admitted_bytes (which bounds encoded input via max_buffered_bytes), this reflects the real memory footprint and grows unbounded while a seal is stuck — watch it to detect backpressure-vs-OOM risk.
seal_target_bytesUBIGINTConfigured size trigger for requesting a seal.
seal_max_age_msBIGINTConfigured age trigger for requesting a seal.
oldest_buffered_age_msBIGINTAge (ms) of the oldest buffered row, or NULL when empty.
last_seal_age_msBIGINTAge (ms) since the last successful batch commit, or NULL if none has completed.
seals_totalUBIGINTBatch commits performed since startup.
committed_rows_totalUBIGINTRows committed since startup.
seal_failures_totalUBIGINTFailed batch commits since startup.
is_listeningBOOLEANfalse once the listener has fallen over (e.g. an error after a successful bind).
last_errorVARCHARLast fatal listener error, or NULL if none.
seal_last_errorVARCHARLast batch commit error, or NULL if none.
maintenance_runs_totalUBIGINTSuccessful post-seal catalog maintenance (CHECKPOINT) passes since startup. Stays 0 for the default catalog and for catalogs where maintenance is unsupported/disabled.
maintenance_failures_totalUBIGINTFailed maintenance passes since startup.
last_maintenance_age_msBIGINTAge (ms) since the last successful maintenance pass, or NULL if none has run.
maintenance_last_errorVARCHARLast maintenance error, or NULL if none.

Use is_listening / last_error to detect a dead listener. Use seal_last_error to inspect writer failures, such as catalog conflicts. Use maintenance_runs_total / last_maintenance_age_ms to confirm compaction is keeping up.

Lists the bounded in-memory history of recent seal attempts for all running servers. duration_ms covers the complete seal attempt. For transaction-backed targets, append_duration_ms measures appending buffered chunks into the destination tables and commit_duration_ms measures COMMIT, including catalog and object-storage work. The remaining duration is buffer swapping, transaction setup, and bookkeeping. Plain Parquet export does not have a transaction commit, so both phase fields are zero.

The target of a server is <catalog>.<schema>.<table>. The live-ingest tables keep the same column names as the file readers, but the nanosecond timestamp columns (time_unix_nano, start_time_unix_nano, …) are stored as DuckDB TIMESTAMP (microsecond) for catalog compatibility, where the file readers expose TIMESTAMP_NS — so a query that mixes a live table with read_otlp_* sees two types and loses sub-microsecond precision on the live side (see the Schema Reference).

  • Default catalog (catalog omitted): rows land in the connection’s default catalog, either an in-memory database or the file you opened DuckDB with. Use this zero-setup path when you do not need a lakehouse catalog. The server still buffers ingest (a POST returns 202); rows become durable in that database at the next background commit.
  • DuckLake catalog (catalog := '<attached_db>'): rows stream into a DuckLake lakehouse with Parquet data files on local or object storage, tracked by a catalog. Attach the catalog first, then name it:
INSTALL ducklake; LOAD ducklake;
ATTACH 'ducklake:metadata.ducklake' AS lake (DATA_PATH 'otlp_data/');
CALL otlp_serve('otlp:localhost:4318', catalog := 'lake');
-- rows buffer and commit into Parquet under otlp_data/
SELECT count(*) FROM lake.main.otlp_logs;

Each batch commit writes one Parquet data file per signal plus one DuckLake snapshot. After a conservative number of successful automatic row-seals, duckdb-otlp runs best-effort catalog-native maintenance with DuckDB’s non-force CHECKPOINT lake when recent ingest rate and pending bytes leave ample admission headroom. On a DuckLake catalog, CHECKPOINT merges adjacent files and expires/cleans old snapshots and data files in one pass — turning the many small per-seal files into compacted, query-efficient files. At startup the server sets the DuckLake options CHECKPOINT reads so this is bounded: target_file_size caps the merge output (files already at target are left alone, so re-compaction is O(new), not O(total)), and expire_older_than / delete_older_than (from maintenance_retention_ms) gate how old snapshots/files must be before reclaim. See Durability and background commits.

  • Iceberg REST catalog (catalog := '<attached_db>'): rows stream into tables in an attached writable Iceberg REST catalog. Attach the catalog with DuckDB’s iceberg extension, create the target schema, then pass the catalog and schema to otlp_serve; see Stream to Amazon S3 Tables and Stream to Cloudflare R2 Data Catalog for managed provider paths. DuckDB’s Iceberg REST catalog docs have no useful CHECKPOINT maintenance path today. The internal maintenance probe uses generic CHECKPOINT <catalog> only. If the catalog reports checkpointing as unsupported, the server disables automatic maintenance for that server and ingest durability continues normally.

  • Plain Parquet export (parquet_export_path := '/data/otlp-parquet' or parquet_export_path := 's3://bucket/prefix'): each seal writes the sealed rows straight to <path>/<table>/year=YYYY/month=MM/day=DD/*.parquet. The Parquet dataset is the only durable store — no local table copy is kept (a read-only view per signal is created over the files for inspection), and it is mutually exclusive with a catalog target. Because a COPY is a file write and cannot be rolled back, export is at-least-once: a signal that already exported is never re-written, but a seal whose COPY fails part-way can re-export that signal’s rows on retry, so deduplicate downstream if you need exactly-once. Use this when you want partitioned Parquet without a lakehouse catalog; see Stream to Parquet.

The scheme binds the URI to a serve function (no mixing): otlp: is otlp_serve, the OTLP server (OTLP/HTTP by default on port 4318, or OTLP/gRPC unary with transport := 'grpc'); otap: is otap_serve, the OTAP/Arrow gRPC server (port 4317). Each function rejects the other’s scheme.

FormExampleServer
otlp:host:portotlp:localhost:4318OTLP/HTTP (or OTLP/gRPC with transport := 'grpc')
otlp://host:portotlp://127.0.0.1:4318OTLP/HTTP
otap:host:portotap:localhost:4317OTAP/Arrow (gRPC)
IPv6 (host in brackets)otlp:[::1]:4318OTLP/HTTP

The scheme is part of the canonical key, so otap:host:4317 and otlp:host:4318 are distinct servers in otlp_server_list / otlp_stop / otlp_flush. By default, otlp_serve/otap_serve allow only localhost, 127.0.0.1, and ::1. To bind to any other host (for example 0.0.0.0 to accept remote exporters), pass allow_other_hostname := true; non-localhost hosts are rejected before a socket is bound.

The scalar function otlp_uri_parser(uri) parses an otlp:/otap: URI and returns a STRUCT(host VARCHAR, port USMALLINT, ipv6 BOOLEAN, url VARCHAR) — the same parsing the serve functions use, useful for validating a URI up front.

The http:// base URL from listen_url exposes:

MethodPathDescription
POST/v1/logsIngest logs into otlp_logs.
POST/v1/tracesIngest traces into otlp_traces.
POST/v1/metricsIngest metrics. Fans out across all four metric tables: otlp_metrics_gauge, otlp_metrics_sum, otlp_metrics_histogram, otlp_metrics_exp_histogram.
GET/healthzLiveness probe. Returns 200 with {"status":"ok"}. No auth required.
GET/readyzReadiness probe. Returns 200 with {"status":"ready"} once the listener is bound, and 503 with {"status":"degraded"} when buffered rows are not committing (a seal has failed, rows are still buffered, and the last successful seal is absent or several seal cycles old). No auth required.

Tables live in <catalog>.<schema>, chosen by otlp_serve(catalog := ..., schema := ...).

The server picks a parser from the request Content-Type:

Content-TypeFormat
application/json, application/otlp+jsonOTLP/JSON
application/x-ndjsonnewline-delimited OTLP/JSON (JSONL)
application/x-protobuf, application/protobuf, application/otlpOTLP/protobuf

Any other content type returns 415.

Content-Encoding: identity (or no header), gzip, and deflate are accepted. Any other content encoding returns 415.

Every POST must present the configured token through one of these headers:

  • Authorization: Bearer <token> (case-insensitive scheme)
  • x-api-key: <token>

The server checks the two headers independently, so a malformed Authorization header does not mask a valid x-api-key. A missing or invalid token returns 401. The server compares tokens with a constant-time check.

Tokens must be at least 16 characters. Auto-generated tokens (when token is omitted) are 32 hex characters (128 bits of entropy).

disable_auth := true turns auth off entirely: the server accepts every request without checking any header, mints no token, and returns an empty auth_token. The same flag applies to the gRPC transports (otlp_serve(transport := 'grpc') and otap_serve), which otherwise reject a bad token with UNAUTHENTICATED.

This exists for two cases: trusted, network-isolated deployments, and OTLP/OTAP producers that have no way to attach a bearer token — notably the otel-arrow OTAP exporter, which has no header/auth configuration. It is off by default; only enable it when the listener is otherwise protected (loopback bind, private network, or a sidecar that terminates auth), since anyone who can reach the port can write to your tables.

There are two gRPC entry points, each bound to its own scheme and serving a disjoint service family, for all six signals:

  • otlp_serve('otlp:...', transport := 'grpc') — standard OTLP/gRPC unary Export. otlp_serve defaults to otlp:localhost:4318 (HTTP); point it at the conventional gRPC port for this, e.g. otlp:localhost:4317.
  • otap_serve('otap:...') — canonical OTAP/Arrow bidirectional streaming. Defaults to otap:localhost:4317; gRPC-only.

Both share everything below the wire: the same parameters, catalog/Parquet targeting, token auth, buffered group-commit (“seal”) path, backpressure cap, and lifecycle functions (otlp_flush / otlp_stop / otlp_server_list / otlp_seal_list). Because the service sets are disjoint, calling the other family on a listener returns UNIMPLEMENTED.

ServerServiceRPCWire format
otlp_serve(transport := 'grpc')opentelemetry.proto.collector.{logs,trace,metrics}.v1.{Logs,Trace,Metrics}ServiceExport (unary)Standard OTLP/gRPC
otap_serveopentelemetry.proto.experimental.arrow.v1.Arrow{Logs,Traces,Metrics}ServiceArrow{Logs,Traces,Metrics} (bidirectional streaming)OTAP/Arrow (stream BatchArrowRecordsstream BatchStatus)

Notes:

  • Auth is the bearer token in the gRPC authorization metadata (Bearer <token>); a bad token is rejected with UNAUTHENTICATED, unless disable_auth is set. Backpressure surfaces as RESOURCE_EXHAUSTED (the gRPC equivalent of HTTP 503).
  • OTAP streaming keeps one stateful decoder per stream, so later messages can reuse the Arrow dictionaries/schemas established by earlier ones. The server returns one BatchStatus per received BatchArrowRecords. A message that fails to decode is nacked and the stream is closed (the decoder is poisoned); a message nacked for backpressure leaves the stream open.
  • Metrics decode into up to four shapes per message, each buffered independently — a backpressure nack partway through a metrics message can leave earlier shapes buffered.
  • The gRPC stack (tokio + tonic) is statically linked into the extension; it adds no new shared-library dependencies and, like the HTTP server, is native-only (absent from WASM builds).

A successful POST returns 202 Accepted after the server buffers rows:

{"status":"buffered","rows":42,"batches":1}

A 202 means the server validated, converted, and accepted the rows into the in-memory buffer. The rows are not yet durable (see below).

Errors return JSON shaped like {"error":"<reason>","message":"<detail>"}:

StatusWhen
400OTLP body failed to parse (or other invalid input).
401Missing or invalid auth token.
413Body larger than max_body_bytes.
415Unsupported Content-Type or Content-Encoding.
503Buffer admission full (request would exceed max_buffered_bytes). Retry with backoff.
500Internal error (also written to duckdb_logs).

The server buffers ingest and commits rows in batches for each target. Batch commits avoid per-request tiny files and write conflicts: a single serialized writer prevents concurrent catalog writes from the ingest server.

The flow:

  1. A POST reserves admission bytes, parses, converts, and appends rows into the relevant per-signal in-memory buffer, then returns 202. The bounded worker pool does this concurrently; append locks only the target signal buffer.
  2. A single background writer commits the buffer to the target in one transaction when any trigger fires:
    • admitted request-body bytes reach the internal size threshold, 128 MiB today,
    • the oldest buffered row reaches the internal age limit, about 5 seconds today, or
    • an explicit otlp_flush.
  3. For a DuckLake target, each batch commit writes one Parquet data file per signal plus one snapshot.
  4. For named catalogs, the server may follow successful automatic row-seals with best-effort, non-force CHECKPOINT <catalog> outside the ingest transaction when recent ingest rate and pending bytes leave ample admission headroom. Treat this as internal scheduling. The server skips the default catalog. The hook also skips explicit otlp_flush, sustained high ingest, high pending buffered bytes, and shutdown drains; unsupported checkpoint implementations log once and disable the hook for that server.

Durability contract:

  • A 202 is not durable. Rows become durable at the next background commit, on otlp_stop, or on otlp_flush.
  • A crash or hard kill loses buffered-but-uncommitted rows (at-most-once for that window).
  • otlp_stop and otlp_flush commit remaining rows before returning, so those calls lose no accepted rows. A plain database/connection close does NOT commit buffered rows. The drain runs after DuckDB tears down the instance, when it can no longer write, so DuckDB can drop buffered rows. Prefer otlp_stop before closing the database. Use otlp_flush when the server should keep running but readers need durable rows now.

The project tracks a future durable raw-spool journal for at-least-once delivery.

Backpressure: if admitting a request would exceed max_buffered_bytes (default 512 MiB) across in-flight and uncommitted accepted payloads, the POST returns 503 before parse/transform work. Clients should retry with backoff.

Seal cadence: seal_target_bytes and seal_max_age_ms are size and age triggers for the single asynchronous writer. They control batching latency and file/transaction size; they do not raise durable write throughput. max_buffered_bytes remains the separate admission cap.

Keeping DuckLake tidy: each batch commit leaves one Parquet file per signal, so a high seal cadence creates many small files. For DuckLake, the post-seal CHECKPOINT merges those into larger files and reclaims old snapshots/files when recent ingest leaves enough admission headroom. The merge is bounded by the target_file_size option the server sets at startup (files already at target are skipped, so re-compaction cost scales with new data, not total data), and reclaim is gated by maintenance_retention_ms (expire_older_than / delete_older_than). The hook skips per-seal maintenance and stays outside the ingest transaction; a maintenance failure leaves committed rows intact. otlp_flush still forces only ingest durability and leaves compaction to catalog maintenance.

  • The server runs a bounded httplib worker pool. Workers parse, convert, and buffer requests concurrently; each signal table has its own buffer lock. In the daemon, set DUCKDB_OTLP_HTTP_THREADS to override the host-based default.
  • A single background writer thread writes to the target catalog. Serial writes let DuckLake, which uses optimistic concurrency, avoid conflict retries and tiny-file churn.

make test cannot issue HTTP POSTs, so the SQL logic tests cover only the lifecycle functions. To exercise the ingest hot path, run the manual concurrency harness. It covers auth, content-type handling, the metrics fan-out, buffering and batch commits, and Arrow → DuckDB conversion under concurrency:

Terminal window
uv run --script test/manual/otlp_serve_concurrency.py
# Override the payload / concurrency:
OTLP_PAYLOAD=test/data/logs_simple.jsonl OTLP_CONCURRENCY=64 \
uv run --script test/manual/otlp_serve_concurrency.py
# Exercise the DuckLake path (writes Parquet under the given dir and checks the
# automatic catalog-maintenance event):
OTLP_DUCKLAKE_DIR=/tmp/otlp_lake \
uv run --script test/manual/otlp_serve_concurrency.py

It covers auth and validation errors, low-buffer backpressure, metrics fanout, stop-under-load, concurrent flush/stop, and the optional local DuckLake maintenance checkpoint event, then reconciles accepted rows against committed row counts. Run it against a TSan/ASan build to catch races.