DBLab Engine

DBLab Engine enables you to create instant, full-size thin clones of your PostgreSQL databases. Use them for development, testing, CI/CD pipelines, and query optimization -- all with minimal extra storage.

What is DBLab Engine?

DBLab Engine is a thin cloning technology that creates independent copies of your PostgreSQL database in seconds, regardless of database size. Instead of duplicating all the data, DBLab uses copy-on-write semantics to share unchanged data blocks between clones and the source.

Key characteristics:

  • Instant provisioning — clones are created in seconds, even for multi-terabyte databases
  • Full-size copies — each clone is a fully functional PostgreSQL instance with all data, schemas, and indexes
  • Minimal storage overhead — clones only consume additional space for data that has been modified within the clone
  • Complete isolation — changes in one clone do not affect other clones or the source data
  • Independent lifecycle — each clone can be created, used, and destroyed independently

How It Works

DBLab Engine leverages filesystem-level snapshot capabilities to provide instant cloning.

ZFS and LVM Snapshots

Under the hood, DBLab Engine uses ZFS or LVM to create point-in-time snapshots of the PostgreSQL data directory. These snapshots are nearly instantaneous because they do not copy any data -- they simply record a reference point.

Copy-on-Write Mechanism

When a clone modifies a data block, only the changed block is written to new storage. The original blocks remain shared with the snapshot. This means:

  • A 1 TB database clone initially uses close to zero additional disk space
  • Storage grows only proportionally to the amount of data modified within the clone
  • Multiple clones can coexist efficiently on the same host
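The copy-on-write behavior described above can be illustrated with a small simulation. This is a conceptual sketch only (block IDs and counts are made up, and real ZFS bookkeeping is far more involved), but it shows why a fresh clone costs almost nothing and why storage grows only with modifications:

```python
# Conceptual sketch of copy-on-write: not DBLab internals.

class Snapshot:
    """A point-in-time snapshot: references to existing blocks, no copies."""
    def __init__(self, blocks):
        self.blocks = blocks  # block_id -> data, shared with all clones

class Clone:
    """A clone reads shared snapshot blocks; writes go to a private overlay."""
    def __init__(self, snapshot):
        self.snapshot = snapshot
        self.overlay = {}  # only blocks modified in this clone live here

    def read(self, block_id):
        return self.overlay.get(block_id, self.snapshot.blocks[block_id])

    def write(self, block_id, data):
        self.overlay[block_id] = data  # new storage is allocated only now

    @property
    def extra_blocks(self):
        return len(self.overlay)

# A large source with many blocks; the clone starts at zero extra blocks.
source = Snapshot({i: f"block-{i}" for i in range(100_000)})
clone = Clone(source)
assert clone.extra_blocks == 0          # instant, near-zero-cost clone
clone.write(42, "changed")
assert clone.extra_blocks == 1          # storage grows only with changes
assert clone.read(42) == "changed"
assert source.blocks[42] == "block-42"  # the source is unaffected
```

The same logic explains why many clones coexist cheaply: they all read the same shared blocks and pay only for their own overlays.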

Clone Lifecycle

  1. Snapshot — DBLab takes a periodic snapshot of the source database (configurable interval)
  2. Clone creation — a new clone is provisioned from the latest snapshot in seconds
  3. Usage — the clone operates as a standard PostgreSQL instance on its own port
  4. Destruction — when no longer needed, the clone is destroyed and its storage is reclaimed instantly

Use Cases

Development Environments

Give every developer their own full copy of the production database. No more shared staging databases with stale data. Each developer gets an isolated, up-to-date environment they can freely modify.

CI/CD Testing

Spin up a fresh database clone for each test run in your CI/CD pipeline. Run integration tests and end-to-end tests against real data, then discard the clone when the pipeline finishes. Clones are created in seconds, so they add minimal overhead to your pipeline.
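As a sketch, a CI job can create a throwaway clone, run the tests against it, and destroy it in a teardown step. The job layout, variable names, and test command below are placeholders, not a verbatim configuration:

```yaml
# Illustrative GitLab CI job; names and variables are placeholders.
integration-tests:
  script:
    - dblab clone create --username ci --password "$CI_DB_PASS" --id "ci-$CI_PIPELINE_ID"
    - ./run-integration-tests.sh   # placeholder: your test entry point
  after_script:
    # after_script runs even when tests fail, so the clone is always reclaimed.
    - dblab clone destroy "ci-$CI_PIPELINE_ID"
```

Putting the destroy step in teardown is the important part: it guarantees clones do not accumulate when pipelines fail.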

Query Optimization Testing

Test query changes, new indexes, and configuration tuning on a full copy of production data without any risk. Compare performance before and after changes using realistic data distributions and volumes.

Schema Migration Validation

Run database migrations against a clone first to verify they complete successfully, measure execution time, and catch potential issues before applying them to production.

Getting Started with DBLab

Setting up DBLab Engine involves three steps:

1. Install DBLab Engine

DBLab Engine runs as a Docker container or directly on the host. The recommended approach is Docker:

docker run -d --name dblab \
  -v /var/lib/dblab:/var/lib/dblab \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -p 2345:2345 \
  dbasaas/dblab-engine:latest

2. Configure the Data Source

Point DBLab at your PostgreSQL data. You can use one of several data source modes:

  • Physical — copies the PostgreSQL data directory at the file level (fastest; for self-managed databases where such access is possible)
  • Logical — uses pg_dump / pg_restore to populate the data pool (works with any PostgreSQL, including managed services)
  • RDS Snapshot — restores from an Amazon RDS snapshot (see the Amazon RDS Integration section below)
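For example, a logical-mode source might be described in the DBLab configuration file roughly as follows. Key names here are illustrative and vary by DBLab version, so consult the configuration reference for the exact schema:

```yaml
# Illustrative sketch only; real key names depend on the DBLab version.
retrieval:
  jobs:
    - logicalDump      # pg_dump from the source
    - logicalRestore   # restore into the DBLab data pool
  spec:
    logicalDump:
      options:
        source:
          connection:
            host: source-db.example.com
            port: 5432
            dbname: postgres
            username: dblab_reader
```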

3. Create Your First Clone

Use the DBLab CLI or API to create a clone:

dblab clone create --username dev_user --password dev_pass --id my-first-clone

The clone will be available on a dynamically assigned port within seconds. Connect to it using any standard PostgreSQL client.
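Because a clone is an ordinary PostgreSQL instance, any libpq-compatible client (psql, psycopg, JDBC, and so on) connects with a standard connection URI; only the host and port differ per clone. A small sketch of assembling such a URI (host, port, and credentials are placeholders):

```python
def clone_dsn(host, port, user, password, dbname="postgres"):
    """Build a standard PostgreSQL connection URI for a clone.

    Clones behave like any PostgreSQL server, so nothing
    DBLab-specific is needed on the client side.
    """
    return f"postgresql://{user}:{password}@{host}:{port}/{dbname}"

# Example: the port is whatever DBLab assigned to the clone.
dsn = clone_dsn("dblab-host.example.com", 6000, "dev_user", "dev_pass")
assert dsn == "postgresql://dev_user:dev_pass@dblab-host.example.com:6000/postgres"
```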

Amazon RDS Integration

DBLab Engine integrates with Amazon RDS to enable thin cloning of managed PostgreSQL databases.

How It Works with RDS

  1. RDS Snapshot — DBLab creates or uses an existing RDS snapshot of your database
  2. Restore to EBS — the snapshot is restored to an EBS volume attached to the DBLab instance
  3. ZFS Pool — the restored data is imported into a ZFS pool for snapshot and clone management
  4. Continuous Sync — optional logical replication keeps the data pool up to date with production

Logical Replication Setup

For near-real-time data freshness, configure logical replication from your RDS instance to the DBLab data pool:

  • Enable the rds.logical_replication parameter in your RDS parameter group (requires an instance reboot to take effect)
  • Create a replication user with the rds_replication role
  • Configure the DBLab sync section to use logical replication as the refresh method

This keeps your clone pool within seconds of production with minimal impact on RDS performance.
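On the RDS side, the replication setup might look like the SQL below. The role and publication names are placeholders, and note that rds.logical_replication itself is enabled through the parameter group, not through SQL:

```sql
-- Role and publication names are placeholders.
CREATE ROLE dblab_repl WITH LOGIN PASSWORD 'change-me';
GRANT rds_replication TO dblab_repl;   -- RDS-specific replication grant

-- Publish the tables DBLab should receive:
CREATE PUBLICATION dblab_pub FOR ALL TABLES;
```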

Best Practices

Clone Naming Conventions

Adopt a consistent naming scheme for clones to make them easy to identify and manage:

  • Include the purpose: dev-alice, ci-pipeline-1234, migration-test-v2.5
  • Include a timestamp or build number for CI/CD clones
  • Use prefixes to group clones by team or project

Cleanup Policies

Unneeded clones should be removed promptly to reclaim resources. Configure automatic cleanup:

  • TTL (Time to Live) — set a maximum lifetime for clones (e.g. 24 hours for CI, 7 days for development)
  • Idle timeout — destroy clones that have had no active connections for a specified period
  • Max clones limit — cap the total number of concurrent clones to prevent resource exhaustion
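The TTL and idle-timeout policies above combine into a simple decision rule. The sketch below mirrors that logic for illustration; it is not DBLab's actual implementation, and the default durations are examples:

```python
from datetime import datetime, timedelta

def should_destroy(created_at, last_active_at, now,
                   ttl=timedelta(hours=24),
                   idle_timeout=timedelta(hours=4)):
    """Destroy a clone when it has outlived its TTL or sat idle too long.

    Illustrative policy logic only; durations are example defaults.
    """
    expired = now - created_at >= ttl
    idle = now - last_active_at >= idle_timeout
    return expired or idle

now = datetime(2024, 1, 2, 12, 0)
# Fresh and recently used: keep.
assert not should_destroy(now - timedelta(hours=2), now - timedelta(minutes=5), now)
# Past its TTL: destroy even if still active.
assert should_destroy(now - timedelta(hours=30), now - timedelta(minutes=5), now)
# Idle too long: destroy even though it is young.
assert should_destroy(now - timedelta(hours=2), now - timedelta(hours=6), now)
```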

Resource Limits

Each clone runs as an independent PostgreSQL process. To prevent resource contention:

  • Set shared_buffers and work_mem appropriately for clone workloads (typically lower than production)
  • Limit the maximum number of connections per clone
  • Monitor disk space usage and configure alerts when the ZFS pool reaches 80% capacity
  • Use CPU and memory cgroup limits when running clones in containers
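As a starting point, clone-level PostgreSQL settings might look like the fragment below. The values are illustrative only; size them to your host capacity and expected clone count:

```
# Illustrative per-clone settings; tune to host capacity and clone count.
shared_buffers = 256MB   # typically far below the production value
work_mem = 16MB
max_connections = 20
```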