Logical Replication Setup & Management

Running PostgreSQL logical replication in production is less about the initial CREATE PUBLICATION statement and more about governing the Write-Ahead Log (WAL) lifecycle, replication slot retention, and consumer idempotency that keep a Change Data Capture (CDC) pipeline from silently drifting or filling a disk at 3 a.m. This guide is the operational reference for database engineers, data platform teams, Python ETL developers, and DevOps operators running PostgreSQL 15, 16, or 17: it covers how the publisher-subscriber contract is defined, how slot and LSN state is persisted and invalidated, which system views to alert on, and how the pipeline behaves under failover, consumer restarts, and schema drift. Every section assumes you are already operating a live cluster and need exact parameters, thresholds, and remediation steps rather than an introduction.

The four operational domains that follow map to the topics you configure in sequence: defining what data leaves the primary with a publication, reserving the WAL cursor by initializing replication slots, performing the initial subscription sync, and closing the loop with asynchronous monitoring integration. The mechanics underneath all four are documented in the logical replication architecture fundamentals; this page is the management layer that turns those mechanics into a supportable production service.

Core Architecture

Logical replication decodes committed WAL records into a stream of row-level change events (INSERT, UPDATE, DELETE, TRUNCATE) and delivers them to subscribers, in contrast to physical streaming replication, which ships the raw WAL stream for block-level replay. That distinction is the entire reason the setup surface exists: because decoding is selective and schema-aware, you must explicitly declare the exposure boundary, the transport, and the retention policy. The prerequisite is non-negotiable: the primary must run with wal_level = logical, which requires a restart and increases WAL volume for UPDATE/DELETE because old-tuple identity is now logged. The increase is modest under the default replica identity, which logs only the key columns, and can be large under REPLICA IDENTITY FULL, which logs the entire old row.

sql

-- Publisher prerequisites. wal_level, max_replication_slots, and max_wal_senders
-- only take effect after a server restart.
ALTER SYSTEM SET wal_level = 'logical';
ALTER SYSTEM SET max_replication_slots = 10;   -- one per subscriber + headroom
ALTER SYSTEM SET max_wal_senders = 12;         -- >= slots, plus physical standbys
-- Cap unbounded WAL retention so a stalled consumer cannot exhaust the disk.
ALTER SYSTEM SET max_slot_wal_keep_size = '20GB';  -- PG 13+, applied on reload
SELECT pg_reload_conf();  -- picks up max_slot_wal_keep_size; restart for the rest

The decoding path is: a backend commits a transaction; the WAL sender attached to a logical slot invokes an output plugin (pgoutput for native subscriptions, or a custom plugin such as wal2json/decoderbufs for third-party consumers); the plugin serializes the change set honoring transaction boundaries and relation metadata; the change stream advances the slot’s restart_lsn only when the consumer acknowledges via confirmed_flush_lsn. The precise batching, memory, and ordering behavior of that path is covered in WAL stream mechanics; the practical consequence for management is that the slot is a durable, disk-backed cursor whose lag directly controls how much WAL the primary must retain.

Version-specific behavior shapes topology decisions. PostgreSQL 16 introduced parallel apply for subscriptions (streaming = parallel), which lets large in-progress transactions be applied by multiple background workers instead of a single serial apply worker, sharply reducing catch-up latency on high-churn tables. PostgreSQL 16 also enabled logical decoding on physical standbys, allowing a slot to be created on a replica and survive promotion. PostgreSQL 17 added the ability to fail over logical slots to a standby (failover = true plus sync_replication_slots), refined slot invalidation semantics, and added a worker_type column to pg_stat_subscription. Choosing where slots live and how many apply workers a subscriber runs are architecture decisions you make once and pay for continuously.
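
Because these features arrive in different major versions, deployment tooling often gates on the server version before choosing subscription options. The following is a minimal Python sketch of that gate, assuming the caller has already read server_version_num (for example via SHOW server_version_num). The feature names and function names are hypothetical labels for this example, and the row-filter entry reflects the PostgreSQL 15 behavior described later in this guide.

```python
from __future__ import annotations

# Minimum major version for each feature discussed in this guide.
# The keys are illustrative labels, not PostgreSQL identifiers.
FEATURE_MIN_MAJOR = {
    "row_filters_and_column_lists": 15,
    "parallel_apply": 16,
    "decoding_on_standby": 16,
    "slot_failover": 17,
}


def major_version(server_version_num: int) -> int:
    """Convert server_version_num (e.g. 160004) to a major version (PG 10+)."""
    return server_version_num // 10000


def available_features(server_version_num: int) -> set[str]:
    """Return the feature labels available on a server of this version."""
    major = major_version(server_version_num)
    return {name for name, minimum in FEATURE_MIN_MAJOR.items() if major >= minimum}


def subscription_streaming_mode(server_version_num: int) -> str:
    """Pick 'parallel' where parallel apply exists, otherwise fall back to 'on'."""
    if "parallel_apply" in available_features(server_version_num):
        return "parallel"
    return "on"
```

The sketch checks a single version number; in a mixed-version topology you would likely apply it to the older of the publisher and subscriber.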

Declarative Configuration Model

The management surface is declarative: three object types define the entire contract. On the publisher you create a publication that names the tables and DML operations to expose. On the subscriber you create a subscription that references the publisher’s connection string and publication, which implicitly creates (or reuses) a named logical slot on the publisher. Everything else — filtering, identity, durability — is a parameter on those objects.

sql

-- Publisher: expose an explicit table set, not FOR ALL TABLES.
CREATE PUBLICATION orders_pub
  FOR TABLE public.orders, public.order_items
  WITH (publish = 'insert,update,delete');   -- omit truncate if targets differ

-- PG 15+: row filters and column lists shrink egress at the source.
CREATE PUBLICATION eu_orders_pub
  FOR TABLE public.orders (id, customer_id, total, status)
    WHERE (region = 'EU');

sql

-- Subscriber: bind to the publication; this creates slot "orders_sub" on the publisher.
CREATE SUBSCRIPTION orders_sub
  CONNECTION 'host=pub.internal port=5432 dbname=app user=repl sslmode=verify-full'
  PUBLICATION orders_pub
  WITH (copy_data = true, streaming = 'parallel', binary = true);  -- streaming=parallel: PG 16+

Prefer explicit table enumeration over FOR ALL TABLES: the latter amplifies WAL decoding cost, streams tables you never intended to expose, and turns every future CREATE TABLE into an unplanned replication change. Row filters (WHERE) and column lists are evaluated on the publisher, so they reduce network payload at the cost of extra CPU during decoding — a good trade when the filter is selective. The full topology reasoning for fan-out, cascading, and partial datasets lives in publication and subscription models; the operational rules for building the publication itself, including REPLICA IDENTITY and sequence handling, are in creating publications.

The parameters below are the ones that most often decide whether a deployment is stable or fragile:

Parameter	Object	Default	Logical-replication behavior
`wal_level`	server	`replica`	Must be `logical`; enables tuple-level decoding. Restart required.
`max_replication_slots`	server	`10`	Hard ceiling on concurrent slots; exceeding it aborts new subscriptions.
`max_wal_senders`	server	`10`	Must cover every logical slot plus physical standbys.
`max_slot_wal_keep_size`	server	`-1` (unlimited)	Caps WAL a slot can pin; slot is invalidated past the cap instead of filling disk (PG 13+).
`copy_data`	subscription	`true`	Governs the initial snapshot; `false` when the target is pre-seeded.
`streaming`	subscription	`off`	`on` streams in-progress txns; `parallel` applies them concurrently (PG 16+).
`synchronous_commit`	subscription/server	`off` (subscription), `on` (server)	The subscription option defaults to `off`, which is usually safe and faster; on the publisher the server setting defines data-loss tolerance.
`binary`	subscription	`false`	Sends values in binary; faster but requires matching types across versions.

State Persistence & Lifecycle

The replication slot is the single most important piece of persisted state in the whole system, and mismanaging it is the most common way to take down a primary. A slot stores two LSNs that matter: restart_lsn, the oldest WAL position the primary must retain for this consumer, and confirmed_flush_lsn, the last position the consumer durably acknowledged. The primary cannot recycle any WAL older than the minimum restart_lsn across all slots. If a consumer stops acknowledging — a crashed Python worker, a paused subscription, a wedged Kafka connector — restart_lsn freezes and pg_wal grows without bound until max_slot_wal_keep_size invalidates the slot or the filesystem fills. Choosing the right slot flavor for each consumer (persistent, temporary, or failover-enabled) is covered in replication slot types.

sql

-- Create a persistent logical slot explicitly (decouples slot life from subscription).
SELECT * FROM pg_create_logical_replication_slot('orders_sub', 'pgoutput');

-- Measure exactly how much WAL each slot is pinning, in bytes.
SELECT slot_name,
       active,
       wal_status,                                   -- reserved | extended | unreserved | lost
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal
FROM pg_replication_slots
ORDER BY pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) DESC;
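
A Python collector that reads restart_lsn and pg_current_wal_lsn() as text can reproduce the same byte arithmetic client-side. This is a small sketch, assuming LSNs arrive in their usual textual form (two hexadecimal numbers separated by a slash); the function names are illustrative.

```python
def parse_lsn(lsn: str) -> int:
    """Convert a textual LSN such as '16/B374D848' to an absolute byte position."""
    high, low = lsn.split("/")
    # The part before the slash is the upper 32 bits, the part after is the lower 32.
    return (int(high, 16) << 32) | int(low, 16)


def lsn_diff(newer: str, older: str) -> int:
    """Byte distance between two LSNs, mirroring pg_wal_lsn_diff(newer, older)."""
    return parse_lsn(newer) - parse_lsn(older)


def retained_wal_bytes(current_wal_lsn: str, restart_lsn: str) -> int:
    """WAL bytes a slot is pinning, given the current write position."""
    return lsn_diff(current_wal_lsn, restart_lsn)
```

For example, retained_wal_bytes("0/3000000", "0/2000000") is 16777216, that is, 16 MiB of WAL still held for the slot.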

Pre-allocating slots before a rollout, rather than letting CREATE SUBSCRIPTION create them implicitly, prevents a transient network partition during setup from orphaning a slot or losing the reservation point; the full procedure is in initializing replication slots. The lifecycle then proceeds through the initial snapshot: with copy_data = true the subscriber copies each table’s current state before applying incremental changes, a phase that can saturate I/O on large tables and should be scheduled during low-traffic windows or split into batches by adding tables to the publication incrementally and running ALTER SUBSCRIPTION ... REFRESH PUBLICATION after each batch. Watching srsubstate in pg_subscription_rel transition through i (initialize) → d (data copy) → f (finished copy) → s (synchronized) → r (ready) is how you confirm the snapshot completed; the sequencing and failure recovery for this phase are detailed in subscription sync procedures.
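
A rollout script can turn those single-letter srsubstate codes into a readable progress check. The sketch below assumes the caller has already fetched a mapping of table name to state code from pg_subscription_rel; the function names are hypothetical.

```python
from __future__ import annotations

# srsubstate codes as stored in pg_subscription_rel.
SYNC_STATES = {
    "i": "initialize",
    "d": "data copy",
    "f": "finished table copy",
    "s": "synchronized",
    "r": "ready",
}

# States in which the table has caught up with the main apply stream.
DONE_STATES = {"s", "r"}


def describe(code: str) -> str:
    """Human-readable name for a state code, or 'unknown' for anything else."""
    return SYNC_STATES.get(code, "unknown")


def pending_tables(rel_states: dict[str, str]) -> list[str]:
    """Tables whose initial sync has not yet reached synchronized/ready."""
    return sorted(t for t, code in rel_states.items() if code not in DONE_STATES)


def snapshot_complete(rel_states: dict[str, str]) -> bool:
    """True only when at least one table is tracked and none are pending."""
    return bool(rel_states) and not pending_tables(rel_states)
```

Calling pending_tables({"orders": "r", "order_items": "d"}) returns ["order_items"], which is the list to keep polling on during a re-seed.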

wal_status is the field that tells you how close a slot is to disaster: reserved is healthy; extended means the slot is retaining WAL beyond max_wal_size; unreserved means the slot has already passed max_slot_wal_keep_size and the WAL it needs will be removed at the next checkpoint (it can still return to reserved or extended if the consumer catches up first); and lost means required WAL is gone, the slot has been invalidated, and the consumer must be re-seeded from a fresh snapshot. Treat extended as the early warning and page on unreserved: by the time you see lost, data continuity is already broken.

Security & Privilege Boundaries

Logical replication moves production data across a network, so least-privilege and transport security are part of setup, not an afterthought. The role in the subscriber’s CONNECTION string needs the REPLICATION attribute plus SELECT on every replicated table for the initial copy, and nothing more. On the subscriber side, PG 16+ lets a non-superuser create subscriptions through membership in pg_create_subscription together with CREATE on the database. Do not reuse a superuser for the replication role; a dedicated, narrowly scoped account limits blast radius if the connection string leaks.

sql

-- Dedicated, least-privilege replication role on the publisher.
CREATE ROLE repl WITH LOGIN REPLICATION PASSWORD 'use-a-secret-manager';
GRANT SELECT ON public.orders, public.order_items TO repl;

code

# pg_hba.conf — require verified TLS for the replication role only, scoped to the subscriber subnet.
hostssl  app   repl   10.20.0.0/24   scram-sha-256   clientcert=verify-full

Always use sslmode=verify-full in the connection string so the subscriber validates the publisher’s certificate hostname and CA chain, defeating man-in-the-middle interception of the change stream. Store the replication password in a secrets manager (Vault, AWS Secrets Manager, or Kubernetes secrets) rather than inline in CREATE SUBSCRIPTION; rotate it with ALTER SUBSCRIPTION ... CONNECTION. Be aware that before PostgreSQL 16 the apply worker always runs with the subscription owner’s privileges, so a superuser-owned subscription bypasses row-level security and can write rows a normal user never could. On PG 16+ changes are applied as each table’s owner by default (run_as_owner = false); keep that default rather than setting ALTER SUBSCRIPTION ... SET (run_as_owner = true). On every version the apply worker runs with session_replication_role = replica, so ordinary triggers on the subscriber do not fire unless they are marked ENABLE REPLICA or ENABLE ALWAYS. The broader privilege and topology security model, including cross-database and cross-tenant isolation, is covered in security boundaries and permissions.

Observability & Diagnostics

You cannot operate what you cannot measure, and replication failures are silent by design — the primary keeps accepting writes while a stalled slot quietly retains WAL. Effective monitoring watches three signals: slot lag (bytes of WAL pinned), apply lag (time or LSN distance between what was sent and what was applied), and consumer liveness (active state transitions). The canonical queries below are the ones to wire into a collector and alert on; the end-to-end pattern for exporting them to Prometheus, Datadog, or OpenTelemetry and closing the alerting loop is in asynchronous monitoring integration.

sql

-- Slot lag and health, per consumer. Alert: retained_bytes > 10 GB or wal_status <> 'reserved'.
SELECT slot_name, active, wal_status,
       pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)        AS retained_bytes,
       pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn) AS unconfirmed_bytes
FROM pg_replication_slots
WHERE slot_type = 'logical';

-- Send/apply lag from the publisher side. Alert: write_lag or flush_lag > 60 s sustained.
SELECT application_name, state,
       pg_wal_lsn_diff(sent_lsn, replay_lsn) AS apply_backlog_bytes,
       write_lag, flush_lag, replay_lag
FROM pg_stat_replication;

sql

-- Subscriber side: worker liveness and receive progress.
SELECT subname, pid, received_lsn, latest_end_lsn,
       pg_wal_lsn_diff(received_lsn, latest_end_lsn) AS unreported_bytes
FROM pg_stat_subscription;                      -- one row per apply/sync worker; NULL pid = worker down

-- PG 15+: durable error stats survive worker restarts.
SELECT subname, apply_error_count, sync_error_count, stats_reset
FROM pg_stat_subscription_stats;

Set concrete thresholds rather than watching graphs: page when retained_bytes exceeds 10 GB (or 50% of your max_slot_wal_keep_size), when wal_status leaves reserved, when any expected slot shows active = false for more than 300 seconds, and when flush_lag stays above 60 seconds. A slot at active = false with growing retained_bytes is the highest-priority signal on the whole system: it is the precursor to disk exhaustion and a forced failover. On PostgreSQL 14 and later, pg_stat_progress_copy reports the COPY operations that drive initial table synchronization, so you can watch initial-copy progress in near-real time, which is invaluable during large re-seeds.
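
Those paging rules are simple enough to encode directly in a collector. The following Python sketch evaluates one slot sample against them. The SlotSample fields and reason labels are hypothetical names for this example, and inactive_seconds is assumed to be tracked by the collector itself rather than read from a PostgreSQL column.

```python
from __future__ import annotations

from dataclasses import dataclass

GIB = 1024 ** 3


@dataclass
class SlotSample:
    """One collector sample; field names are illustrative, not a PostgreSQL API."""
    slot_name: str
    active: bool
    wal_status: str
    retained_bytes: int
    inactive_seconds: float   # time since the collector last saw active = true
    flush_lag_seconds: float


def page_reasons(sample: SlotSample, keep_size_bytes: int | None = None) -> list[str]:
    """Return the paging conditions from the text that this sample violates."""
    # 10 GiB by default, or 50% of max_slot_wal_keep_size when the cap is known.
    retained_limit = keep_size_bytes // 2 if keep_size_bytes else 10 * GIB
    reasons = []
    if sample.retained_bytes > retained_limit:
        reasons.append("retained_wal")
    if sample.wal_status != "reserved":
        reasons.append("wal_status")
    if not sample.active and sample.inactive_seconds > 300:
        reasons.append("inactive")
    if sample.flush_lag_seconds > 60:
        reasons.append("flush_lag")
    return reasons
```

An empty list means the slot is healthy by these rules; a stalled slot typically returns several reasons at once, which is a useful signal for alert severity.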

Resilience Patterns & Failure Modes

Production replication is defined by how it behaves when something breaks. The recurring failure modes each have a distinct signature and remediation.

WAL exhaustion from a stalled consumer. Signature: pg_wal growing steadily, one slot with a frozen restart_lsn and active = false. Root cause: the consumer (Python worker, Debezium task, or paused subscription) stopped acknowledging. Remediation: restart the consumer to resume acknowledgment, or if it is unrecoverable, SELECT pg_drop_replication_slot('name') to release the WAL and re-seed the consumer from a fresh snapshot. Prevention is max_slot_wal_keep_size, which trades a re-seed for a full disk — almost always the right trade on a primary.
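
When a consumer has stalled, it helps to know roughly how long you have before the cap invalidates the slot. The sketch below is a back-of-the-envelope estimate, assuming a steady WAL generation rate that you measure yourself (for example by sampling pg_current_wal_lsn() twice); the function name and parameters are illustrative. Actual invalidation happens at a checkpoint, so treat the result as approximate.

```python
import math


def seconds_until_invalidation(retained_bytes: int,
                               cap_bytes: int,
                               wal_bytes_per_second: float) -> float:
    """Estimate seconds until a stalled slot reaches max_slot_wal_keep_size."""
    if cap_bytes < 0:
        # -1 means no cap: the slot is never invalidated and the disk fills instead.
        return math.inf
    if retained_bytes >= cap_bytes:
        return 0.0
    if wal_bytes_per_second <= 0:
        # No WAL is being generated, so the retained amount is not growing.
        return math.inf
    return (cap_bytes - retained_bytes) / wal_bytes_per_second
```

With 10 GiB already retained, a 20 GiB cap, and 10 MiB/s of WAL, the estimate is 1024 seconds, or about 17 minutes to restart the consumer.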

Failover with orphaned slots. Signature: after promoting a standby, subscribers cannot reconnect because the slot only ever existed on the old primary. Root cause: before PostgreSQL 17, logical slots did not replicate to standbys. Remediation on PG 17+: create subscriptions with failover = true and set sync_replication_slots = on on the standby so slots are synchronized to it and survive promotion. On older versions, script slot recreation into the promotion runbook and plan to re-seed the affected subscribers, because a freshly created slot on the new primary cannot replay changes committed before it existed.

Schema drift / DDL propagation. Signature: apply worker errors with column ... does not exist or a type mismatch; apply_error_count climbs. Root cause: logical replication does not replicate DDL, so an ALTER TABLE on the publisher is invisible to subscribers. Remediation: order DDL so the subscriber always has every column the publisher sends. Add new columns on the subscriber first and then on the publisher; drop columns on the publisher first and then on the subscriber. Then let the stream resume. For Python consumers driving their own decoding, adopt a schema registry so transformation logic adapts to relation metadata changes rather than crashing.
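
A pre-deployment check can catch this class of drift before the apply worker does. The Python sketch below compares the columns a publication sends with the columns present on the subscriber table, and encodes the DDL ordering rule; it assumes both column lists have already been read from the respective catalogs, and all names are hypothetical.

```python
from __future__ import annotations

# Which side receives the DDL first for each kind of change.
DDL_ORDER = {
    "add_column": ("subscriber", "publisher"),
    "drop_column": ("publisher", "subscriber"),
}


def missing_on_subscriber(published: list[str], subscriber: list[str]) -> list[str]:
    """Published columns the subscriber table lacks, in publication order."""
    present = set(subscriber)
    return [col for col in published if col not in present]


def ddl_order(change: str) -> tuple[str, str]:
    """Return (first, second) side to run the DDL on for this change type."""
    return DDL_ORDER[change]


def safe_to_stream(published: list[str], subscriber: list[str]) -> bool:
    """Extra subscriber-only columns are fine; missing ones are not."""
    return not missing_on_subscriber(published, subscriber)
```

For instance, missing_on_subscriber(["id", "total", "coupon_code"], ["id", "total", "loaded_at"]) returns ["coupon_code"], while the subscriber-only loaded_at column is harmless.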

Replica identity mismatch. Signature: UPDATE and DELETE statements on a published table fail on the publisher itself with cannot update table ... because it does not have a replica identity, so the application sees write errors rather than silent divergence. Root cause: a table without a primary key, and without REPLICA IDENTITY FULL or a replica identity index, cannot describe which row changed. Remediation: ALTER TABLE ... REPLICA IDENTITY FULL (or USING INDEX on a unique, non-partial, non-deferrable index over NOT NULL columns), understanding that FULL logs the entire old row and increases WAL volume.

Large-transaction memory pressure and at-least-once duplicates. Signature: apply worker OOMs on a bulk load, or downstream rows appear twice after a consumer restart. Root cause: logical replication provides at-least-once, not exactly-once, delivery, and a resumed consumer replays from the last confirmed LSN. Remediation: enable streaming = 'on'/'parallel' (PG 14+/16+) so large in-progress transactions are sent to the subscriber in chunks once they exceed logical_decoding_work_mem instead of being held back until commit, and make every consumer write idempotent with INSERT ... ON CONFLICT DO UPDATE keyed on the primary key. The durability side of this trade, namely how synchronous_commit on the publisher bounds the data-loss window, is worked through in tuning synchronous_commit for logical replication.
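
The idempotent-write rule can be demonstrated end to end in a few lines. The sketch below uses Python's built-in sqlite3 module purely as a stand-in so the example runs without a PostgreSQL server (an assumption of this example, not something the pipeline uses); the upsert statement has the same shape you would send to PostgreSQL. The table, event format, and function names are hypothetical.

```python
import sqlite3

# Same shape as the PostgreSQL statement: insert, or overwrite on key conflict.
UPSERT = """
INSERT INTO orders (id, status, total)
VALUES (:id, :status, :total)
ON CONFLICT (id) DO UPDATE
SET status = excluded.status, total = excluded.total
"""


def apply_events(conn, events):
    """Apply a batch of change events in order, inside one transaction."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(UPSERT, events)


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT, total REAL)")
    batch = [
        {"id": 1, "status": "new", "total": 10.0},
        {"id": 1, "status": "paid", "total": 10.0},
        {"id": 2, "status": "new", "total": 25.0},
    ]
    apply_events(conn, batch)
    apply_events(conn, batch)  # simulated replay after a consumer restart
    return conn.execute("SELECT id, status FROM orders ORDER BY id").fetchall()
```

Running demo() leaves exactly two rows, with order 1 in its final paid state, even though every event was delivered twice. The replay converges because it re-applies events in the original order through to the end of the batch.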

For downstream event-driven consumers, these same guarantees flow into the streaming layer: idempotency and offset management move from the SQL apply worker to the Kafka event routing integration, where the Debezium connector reuses the exact slot and publication objects described here.

Conclusion

Logical replication in modern PostgreSQL is production-ready, but its reliability is entirely a function of operational discipline: treat slots as first-class infrastructure with hard retention caps, keep publications explicit and narrowly scoped, make every consumer idempotent, and alert on slot lag and worker liveness with concrete byte and time thresholds rather than dashboards you have to remember to look at. The version you run changes what is possible: PostgreSQL 15 gave you row filters, column lists, and durable subscription error stats; PostgreSQL 16 added parallel apply, decoding on standbys, and the pg_create_subscription role; PostgreSQL 17 delivered slot failover to standbys. Anchor your runbooks to the version you actually run, script the failure remediations above before you need them, and the pipeline will scale across regions and consumers without the 3 a.m. surprises.

Related guides

Creating publications — define the exact table, column, and row-filter exposure boundary.
Initializing replication slots — pre-allocate durable WAL cursors before rollout.
Subscription sync procedures — drive and recover the initial snapshot safely.
Asynchronous monitoring integration — export slot and apply-lag metrics with SLO alerting.
PostgreSQL logical replication architecture & fundamentals — the decoding, WAL, and slot internals underneath this management layer.
CDC pipeline implementation with Python & Debezium — extend these slots and publications into a full event-streaming pipeline.
