Publication/Subscription Models

The publication/subscription architecture serves as the operational backbone for PostgreSQL’s logical replication framework, enabling granular, row-level…

The publication/subscription architecture serves as the operational backbone for PostgreSQL’s logical replication framework, enabling granular, row-level change data capture (CDC) without imposing full-instance synchronization overhead. For database engineers and data platform teams, this model decouples source transactional workloads from downstream analytical, event-driven, or polyglot persistence systems. Before deploying production CDC pipelines, teams must internalize the architectural constraints and state management principles detailed in PostgreSQL Logical Replication Architecture & Fundamentals, as misconfigured routing boundaries frequently introduce unbounded replication lag or silent schema drift.

flowchart TD
  PUB["Publication<br/>analytics_cdc_pub"]
  PUB --> S1(["Slot: warehouse"])
  PUB --> S2(["Slot: search"])
  PUB --> S3(["Slot: cache"])
  S1 --> SUB1["Warehouse subscriber"]
  S2 --> SUB2["Search indexer"]
  S3 --> SUB3["Cache invalidator"]

Idempotent Publication Configuration

A publication defines the logical boundary of replicated data. Production deployments require idempotent, declarative configurations that survive infrastructure-as-code (IaC) re-runs and cluster failovers. CREATE PUBLICATION does not support IF NOT EXISTS, so guard the definition with an explicit catalog check inside a DO block rather than relying on ad-hoc migration statements.

sql
-- Idempotent publication creation with explicit filtering.
-- CREATE PUBLICATION has no IF NOT EXISTS, so guard it with a catalog check.
DO $$
BEGIN
  IF NOT EXISTS (SELECT 1 FROM pg_publication WHERE pubname = 'analytics_cdc_pub') THEN
    CREATE PUBLICATION analytics_cdc_pub
    FOR TABLE
      public.orders (id, customer_id, total, status, updated_at),
      public.inventory (sku, warehouse_id, quantity, last_synced)
    WITH (publish = 'insert, update, delete', publish_via_partition_root = true);
  END IF;
END $$;

Key production considerations:

  • Row/Column Filtering: Use FOR TABLE ... (col1, col2) to minimize network payload and reduce apply worker CPU on subscribers.
  • Partition Routing: Enable publish_via_partition_root = true to ensure consistent schema mapping across dynamically created child partitions.
  • Transaction Boundaries: Logical replication preserves transactional atomicity. Partial commits are never streamed, but large transactions will block downstream apply workers until fully decoded.

Subscription Lifecycle & Slot Management

Subscriptions establish persistent connections to publishers and automatically provision logical replication slots. Slot retention guarantees are non-negotiable for data integrity but require strict operational oversight. As documented in Replication Slot Types, logical slots retain WAL segments until the subscriber explicitly acknowledges receipt. During network partitions or consumer stalls, unacknowledged slots will prevent WAL recycling, eventually triggering disk full conditions on the primary.

Implement automated slot monitoring and safe decommissioning workflows:

sql
-- Monitor slot lag and retention pressure
SELECT slot_name, restart_lsn, confirmed_flush_lsn,
       pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) AS wal_retained_bytes
FROM pg_replication_slots
WHERE slot_type = 'logical';

Idempotent Subscription Setup:

sql
-- CREATE SUBSCRIPTION supports neither IF NOT EXISTS nor execution inside a
-- transaction block while it connects and creates a slot. Enforce idempotency
-- in deployment tooling by checking pg_subscription before issuing this command.
CREATE SUBSCRIPTION analytics_cdc_sub
CONNECTION 'host=pub-primary port=5432 dbname=source_db user=replicator'
PUBLICATION analytics_cdc_pub
WITH (
  copy_data = false,
  create_slot = true,
  slot_name = 'analytics_cdc_slot',
  synchronous_commit = off
);

Always set copy_data = false for existing tables to avoid initial snapshot bottlenecks, and explicitly name slots to prevent orphaned auto-generated identifiers during IaC drift.

WAL Decoding & Performance Boundaries

When a transaction commits, the publisher routes modified tuples through the pgoutput logical decoding plugin, which translates binary WAL records into a consumable stream. This transformation pipeline relies heavily on WAL Stream Mechanics to guarantee ordered delivery, crash recovery, and transactional consistency.

DevOps teams must account for measurable CPU and I/O overhead introduced by logical decoding. Filtering large tables, applying row-level security (RLS) policies, or streaming high-throughput UPDATE workloads will increase primary node load. Mitigate this by:

  • Isolating replication connections via dedicated replication roles and pg_hba.conf CIDR restrictions.
  • Tuning wal_level = logical and max_replication_slots during cluster provisioning.
  • Monitoring pg_stat_replication for write_lag, flush_lag, and replay_lag spikes.

Debugging & Troubleshooting Workflows

Production CDC pipelines require deterministic debugging procedures. Common failure modes include apply worker crashes, schema mismatch errors, and slot exhaustion.

1. Apply Worker Stalls & Conflicts

sql
-- Check subscription state and apply-worker status.
-- subname/subenabled/subconninfo live on pg_subscription; runtime LSNs on pg_stat_subscription.
SELECT s.subname, s.subenabled, st.pid, st.received_lsn, st.latest_end_lsn
FROM pg_subscription s
LEFT JOIN pg_stat_subscription st ON st.subid = s.oid;

If a worker crashes due to a constraint violation or missing column, PostgreSQL logs the exact failing transaction ID and LSN. Use pg_replication_origin_status to identify the stuck origin, then either patch the downstream schema or use pg_replication_origin_advance() to skip irrecoverable transactions (use with extreme caution).

2. Slot Exhaustion & Disk Pressure When restart_lsn falls significantly behind pg_current_wal_lsn(), WAL archives accumulate. Implement a runbook that:

  • Alerts when wal_retained_bytes exceeds 20% of available disk.
  • Archives pending WAL to object storage via pg_waldump or wal-g.
  • Safely drops orphaned slots only after verifying downstream consumers have migrated or are permanently decommissioned.

Strategic Integration & Migration Paths

Engineering teams frequently evaluate logical versus physical replication for CDC pipelines. The trade-offs are comprehensively outlined in Logical vs physical replication differences. For modern data platforms requiring heterogeneous targets, in-flight transformations, or selective table routing, logical replication remains the default. Physical replication is reserved for read-scaling, high-availability failover, and exact binary copies.

Time-Series & Hypertable Integration: When replicating partitioned time-series workloads, ensure the subscriber schema matches the publisher’s partitioning strategy. Detailed guidance on handling hypertable metadata and continuous aggregates is available in Integrating logical replication with TimescaleDB. Always validate that publish_via_partition_root aligns with your retention policies to prevent duplicate inserts during partition swaps.

Physical-to-Logical Migration: Transitioning from streaming replication to logical CDC requires careful state synchronization. Follow the established procedures in Migrating from physical to logical replication safely to prevent data divergence. The standard approach involves:

  1. Taking a consistent base backup.
  2. Restoring to the target cluster.
  3. Creating a logical slot at the exact LSN of the backup.
  4. Initializing the subscription with copy_data = false to resume streaming from the backup point.

Security Boundaries & Fallback Routing

Replication credentials must adhere to the principle of least privilege. Grant only SELECT on published tables and REPLICATION at the role level. Never use superuser accounts for subscription connections. For failover scenarios, implement fallback routing strategies that redirect subscribers to a promoted standby once the new primary assumes the pgoutput role. Configure connection parameters with target_session_attrs=read-write and implement exponential backoff in Python ETL consumers using libraries like psycopg or asyncpg to gracefully handle transient network partitions.

For protocol-level message formatting and stream parsing, consult the official PostgreSQL Logical Replication Protocol documentation. When building custom consumers, validate message boundaries against the PostgreSQL Logical Replication Reference to ensure idempotent apply logic and correct transaction commit sequencing.