Connection Pooling: What It Is and Why It Matters
#databases
#performance
#backend
Connection pooling is one of the simplest ways to make backend systems faster, cheaper, and more reliable. If your application talks to a database, a cache, a message broker, or external HTTP APIs, a well-tuned connection pool can reduce latency, increase throughput, and protect your services from connection storms and resource exhaustion.
This article explains what connection pooling is, why it matters, how it works, and how to size and tune pools with practical tips and examples.
What is a connection and why is it expensive?
Establishing a new network connection often involves:
- DNS resolution
- TCP handshake (and possibly TLS handshake)
- Authentication and protocol negotiation
- Backend-side allocation of memory, sockets, and session state
Doing this for every request adds latency, burns CPU, and increases the chance of failures under load.
What is connection pooling?
Connection pooling keeps a set of established, ready-to-use connections. Instead of establishing a new connection for every operation:
- The app borrows a connection from the pool
- Performs its work
- Returns the connection to the pool for reuse
Pools track active and idle connections, validate health, and close or recycle connections that are idle too long or reach a max lifetime.
Where pooling helps
- Databases: PostgreSQL, MySQL, SQL Server, Oracle
- Caches and brokers: Redis, RabbitMQ, Kafka (client connections)
- HTTP(S) clients: Reuse TCP/TLS sessions with keep-alive and per-host pools
- Microservices: Internal service-to-service calls over HTTP/1.1 keep-alive or HTTP/2 multiplexing
Note: HTTP/2 multiplexes streams over a single connection, reducing the need for large pools, but long-lived connections and sensible limits still matter.
How a pool works (typical flow)
- App requests a connection.
- Pool returns an idle connection or creates a new one if under max size.
- If the pool is saturated, the requester waits up to a configurable timeout or fails fast.
- On return, the pool resets the connection state (rollback transaction, clear session variables) and marks it idle.
- Background tasks validate and evict idle or old connections.
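The flow above can be sketched as a minimal in-memory pool. This is illustrative only; production pools such as pg-pool or HikariCP add health validation, idle eviction, and fairness. The `createConn` factory is an assumption supplied by the caller.

```javascript
// Minimal connection-pool sketch: borrow with timeout, return, cap on size.
// `createConn` is a caller-supplied connection factory (illustrative assumption).
class SimplePool {
  constructor(createConn, { max = 10, borrowTimeoutMs = 500 } = {}) {
    this.createConn = createConn;
    this.max = max;
    this.borrowTimeoutMs = borrowTimeoutMs;
    this.idle = [];    // ready-to-use connections
    this.active = 0;   // currently borrowed connections
    this.waiters = []; // callers blocked waiting for a connection
  }

  async acquire() {
    if (this.idle.length > 0) {   // reuse an idle connection
      this.active++;
      return this.idle.pop();
    }
    if (this.active < this.max) { // under max: create a new one
      this.active++;
      return this.createConn();
    }
    // Saturated: wait up to borrowTimeoutMs, then fail fast.
    return new Promise((resolve, reject) => {
      const waiter = {
        resolve,
        timer: setTimeout(() => {
          this.waiters = this.waiters.filter(w => w !== waiter);
          reject(new Error('pool borrow timeout'));
        }, this.borrowTimeoutMs),
      };
      this.waiters.push(waiter);
    });
  }

  release(conn) {
    const waiter = this.waiters.shift();
    if (waiter) {             // hand the connection directly to a waiter
      clearTimeout(waiter.timer);
      waiter.resolve(conn);
    } else {
      this.active--;
      this.idle.push(conn);   // mark idle for reuse
    }
  }
}
```

Note how saturation produces a bounded wait rather than unbounded connection creation; that bound is what protects the backend during spikes.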
Benefits
- Lower latency: Avoid repeated handshakes and auth.
- Higher throughput: Less CPU spent on connection setup.
- Resource efficiency: Fewer sockets and server processes.
- Stability under load: Smooths spikes and prevents thundering herds.
- Cost control: Less compute and fewer backend connections.
Key settings to know
- Max connections: Upper bound of concurrent connections the pool will create.
- Min/initial connections: How many to keep warm. In serverless, you may prefer 0.
- Idle timeout: Close idle connections after this duration.
- Connection timeout: How long borrowers wait when the pool is at max and no connection is idle.
- Max lifetime: Recycle connections periodically to avoid long-lived issues or upstream idle timeouts.
- Health checks/validation: Test on borrow or in the background.
Sizing your pool
Right-size pools to match backend capacity and app concurrency. Oversized pools can hurt performance and reliability.
Practical starting points:
Databases
- Determine the database’s safe max connections (DB_max).
- Reserve headroom for admin and other clients (20% is a common starting point).
- If you have N app instances, per-instance pool size:
- per_instance_max ≈ floor((DB_max × 0.8) / N)
- For OLTP workloads, a per-instance pool size of 10–30 is often sufficient. Start small and scale with data.
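The sizing rule above can be written as a small helper. The 20% headroom is the article's starting point, not a universal constant, so it is left as a parameter:

```javascript
// Per-instance pool size from the database's safe max connections.
// `headroom` reserves capacity for admin tools and other clients.
function perInstanceMax(dbMax, instances, headroom = 0.2) {
  return Math.floor((dbMax * (1 - headroom)) / instances);
}

// e.g. a database allowing 200 connections, shared by 8 app instances:
// perInstanceMax(200, 8) → 20
```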
HTTP clients
- Use per-host connection limits.
- For HTTP/1.1: 10–50 connections per host per instance is a common range for high-throughput services.
- For HTTP/2: You often need far fewer connections (1–5) because streams multiplex.
Serverless and autoscaling
- Beware of bursty concurrency creating too many connections.
- Use managed proxies (e.g., RDS Proxy, Cloud SQL Auth Proxy, PgBouncer) or strict per-instance caps.
- Prefer minConnections = 0 and strong max caps to prevent cold-start storms.
Tune using measurements:
- Track pool wait time (how often and how long callers wait).
- If wait time is frequently non-zero and throughput is bounded, consider increasing max.
- If wait time is near zero and backend shows high CPU/context switching or too many connections, decrease max.
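One way to encode this measurement-driven loop is an advisory check. The thresholds below (50 ms p95 wait, 0.9 saturation) are illustrative assumptions, not recommendations for every system:

```javascript
// Suggest a pool-size adjustment from observed metrics.
// p95WaitMs: 95th-percentile borrow wait; saturation: active / max.
function tuningAdvice({ p95WaitMs, saturation, backendOverloaded }) {
  if (p95WaitMs > 50 && saturation > 0.9 && !backendOverloaded) {
    return 'increase max'; // callers queue while the backend has headroom
  }
  if (p95WaitMs < 1 && backendOverloaded) {
    return 'decrease max'; // pool is idle but the backend is struggling
  }
  return 'hold';           // no clear signal; keep measuring
}
```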
Timeouts and lifetimes
- Connection timeout: Keep it short to fail fast under saturation (e.g., 100–1000 ms).
- Idle timeout: Close idle connections to free resources (e.g., 30–300 s).
- Max lifetime: Set below any upstream idle or connection lifetime limits to avoid mid-request resets. Example:
- If your load balancer closes idle connections after 60 minutes, set max lifetime to 55 minutes with a small jitter.
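The lifetime-below-upstream rule can be computed with a little jitter so recycled connections don't all expire in the same instant. The 5-minute margin and 1-minute jitter range here are illustrative choices:

```javascript
// Pick a max lifetime safely below an upstream idle limit, with jitter
// so connections don't all recycle at once.
function maxLifetimeMs(upstreamLimitMs, marginMs = 5 * 60 * 1000, jitterMs = 60 * 1000) {
  const base = upstreamLimitMs - marginMs;            // e.g. 60 min → 55 min
  return base - Math.floor(Math.random() * jitterMs); // subtract up to jitterMs
}
```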
Transaction and session hygiene
- Always return connections in a finally/defer block to prevent leaks.
- Ensure each borrow starts with a clean state:
- Roll back any open transaction.
- Reset session variables.
- With PgBouncer transaction pooling, this is automatic between transactions.
- Keep transactions short. Long transactions tie up connections and increase contention.
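The always-return rule is easiest to enforce with a small wrapper so the release happens in `finally` no matter what. This sketch assumes a pool exposing `connect()`/`release()` in the style of node-postgres, as used later in this article:

```javascript
// Borrow a connection, run the work, and always release — even on error.
// `pool` is assumed to expose connect()/release() like node-postgres.
async function withConnection(pool, work) {
  const client = await pool.connect();
  try {
    return await work(client);
  } finally {
    client.release(); // runs on success and on error alike — no leaks
  }
}
```

Centralizing borrow/release in one helper means a forgotten `release()` can only happen in one place instead of every call site.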
Monitoring and alerting
Track at least:
- Pool size, active, idle
- Borrow wait time and saturation ratio (active/max)
- Connection creation rate and failures
- Max lifetime/idle evictions
- Backend metrics: connection count, CPU, locks, slow queries
Alert when:
- Borrow wait time exceeds a threshold (e.g., p95 > 50 ms)
- Saturation ratio stays near 1.0
- Connection errors spike
Common pitfalls and fixes
- Oversized pools: Too many connections can reduce DB cache effectiveness and increase context switching. Reduce pool size or add a proxy like PgBouncer.
- Connection storms on deploy or cold start: Stagger warm-up, use small minConnections, and set strict max.
- Leaked connections: Enforce returns in finally blocks; add leak detection in the pool.
- Long-lived transactions: Audit code paths; add timeouts or max transaction duration.
- Serverless fan-out: Use managed proxies and low per-instance caps; consider connection multiplexers.
- TLS overhead on short-lived HTTP calls: Enable keep-alive, HTTP/2, and per-host pooling.
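The leak detection mentioned above can be approximated by timestamping borrows and flagging long holds. The 30-second threshold is an illustrative assumption; tune it to your slowest legitimate operation:

```javascript
// Flag connections held longer than holdLimitMs — likely leaks.
// `borrows`: array of { id, borrowedAtMs } for currently active connections.
function findLeaks(borrows, nowMs, holdLimitMs = 30_000) {
  return borrows
    .filter(b => nowMs - b.borrowedAtMs > holdLimitMs)
    .map(b => b.id);
}
```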
Quick examples
PostgreSQL with Node.js (pg)
const { Pool } = require('pg');
const pool = new Pool({
connectionString: process.env.DATABASE_URL,
max: 15, // per-instance cap
idleTimeoutMillis: 30000,
connectionTimeoutMillis: 500
});
async function getUser(id) {
const client = await pool.connect();
try {
const res = await client.query('SELECT * FROM users WHERE id = $1', [id]);
return res.rows[0];
} finally {
client.release();
}
}
Java with HikariCP (JDBC)
HikariConfig config = new HikariConfig();
config.setJdbcUrl(System.getenv("JDBC_URL"));
config.setUsername(System.getenv("DB_USER"));
config.setPassword(System.getenv("DB_PASS"));
config.setMaximumPoolSize(20);
config.setMinimumIdle(2);
config.setIdleTimeout(30000);
config.setMaxLifetime(55 * 60 * 1000); // 55 minutes
config.setConnectionTimeout(500);
HikariDataSource ds = new HikariDataSource(config);
HTTP keep-alive in Node.js
const https = require('https');
const agent = new https.Agent({
keepAlive: true,
maxSockets: 50, // per-host concurrent sockets
maxFreeSockets: 10,
timeout: 30_000
});
// Use with fetch or request libraries that accept a custom agent
Special notes for PostgreSQL
- PgBouncer can dramatically reduce backend connections and setup cost.
- Session pooling: Reuses backend connections per client session.
- Transaction pooling: Reuses between transactions; requires each transaction to be self-contained.
- If you use PgBouncer, set smaller application-level pools and let PgBouncer manage backend connections.
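A minimal transaction-pooling setup might look like the sketch below. The host, database name, and sizes are illustrative assumptions; consult the PgBouncer documentation for your environment.

```ini
[databases]
appdb = host=127.0.0.1 port=5432 dbname=appdb

[pgbouncer]
pool_mode = transaction   ; reuse backend connections between transactions
max_client_conn = 1000    ; client-side connections PgBouncer will accept
default_pool_size = 20    ; backend connections per database/user pair
```

With this shape, many application connections share a small, fixed set of backend connections, which is what keeps the database's connection count low.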
A simple checklist
- Enable pooling or keep-alive in your client libraries.
- Set a conservative per-instance max and validate against backend limits.
- Add short connection and borrow wait timeouts.
- Keep transactions short; always return connections.
- Monitor pool wait times, saturation, and connection errors.
- Revisit pool sizes after load testing and as traffic patterns change.
Connection pooling is not just an optimization. It is often the difference between a system that collapses under real-world spikes and one that stays fast, predictable, and cost-effective.