Restoring from backup

Audience

PLANA staff. Customer-facing recovery information is at Plana extras → Backups.

PLANA takes a daily logical pg_dump of every tenant DB plus a tarball of its filestore, and uploads both to Exoscale SOS. This runbook covers restoring a tenant from one of those backups.

When to use this

Data corruption on the live tenant
Accidental deletion that the customer can't recover via Odoo's undo
Pre-upgrade smoke test on a clone
Forensic investigation after an incident

Backup layout in SOS

s3://plana-pulse-backups/
└── {subdomain}/
    ├── backup-YYYYMMDDTHHMMSS.dump.gz      # logical pg_dump
    ├── filestore-YYYYMMDDTHHMMSS.tar.gz    # filestore tarball
    └── insurance/                          # pre-destructive operations
        └── backup-YYYYMMDDTHHMMSS.dump.gz

Retention:

Tier	Daily backups	Insurance backups
Starter	30 days	7 days minimum
Pro	90 days	7 days minimum
Enterprise	1 year	7 days minimum
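
Before picking a restore point it helps to know how far back a tier's daily backups should go. The sketch below maps the table above to a number of days. The function name `retention_days` is hypothetical, and "1 year" is assumed to mean 365 days.

```bash
# Hypothetical helper: daily-backup retention in days, per tier (values from the table above).
# Assumption: "1 year" is treated as 365 days.
retention_days() {
  case "$1" in
    starter)    echo 30 ;;
    pro)        echo 90 ;;
    enterprise) echo 365 ;;
    *)          echo "unknown tier: $1" >&2; return 1 ;;
  esac
}
```

With GNU date, `date -u -d "-$(retention_days pro) days" +%Y%m%d` prints the oldest day that should still have a daily backup for a Pro tenant.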

List available backups for a tenant:

bash

aws s3 ls "s3://plana-pulse-backups/acme/" \
  --endpoint-url=https://sos-bg-sof-1.exo.io

Two restore paths

Path	Use when	DB target
In-place restore	Tenant is broken; replace its data	The tenant's own DB on pg01
Side-by-side clone	Investigation, pre-upgrade staging, no downtime acceptable	A new staging DB on pg01

In-place is destructive. Always start with side-by-side unless the customer has explicitly accepted the data-loss window between "last good backup" and "now".

Side-by-side clone (the default)

1. Pick the backup

bash

# List backups newest first
aws s3 ls "s3://plana-pulse-backups/acme/" \
  --endpoint-url=https://sos-bg-sof-1.exo.io | sort -r | head -10

Pick the timestamp you want to restore from. Example: backup-20260529T020000.dump.gz.
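
The listing mixes DB dumps, filestore tarballs, and the insurance/ prefix, so a small filter makes the newest DB dump easier to spot. This is a sketch only: `newest_dump` is a hypothetical helper name, and it assumes keys follow the backup-YYYYMMDDTHHMMSS.dump.gz pattern shown in the layout above, so that a reverse text sort is also newest-first.

```bash
# Hypothetical helper: read `aws s3 ls` output on stdin and print the newest DB dump key.
# The last whitespace-separated field of each listing line is the object name.
newest_dump() {
  awk '{ print $NF }' |
    grep -E '^backup-[0-9]{8}T[0-9]{6}\.dump\.gz$' |
    sort -r |
    head -n 1
}
```

Usage: pipe the listing command from this step into it, for example `aws s3 ls "s3://plana-pulse-backups/acme/" --endpoint-url=https://sos-bg-sof-1.exo.io | newest_dump`.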

2. Apply the EnvironmentRestore CR

PLANA's restore path is declarative — Crossplane's EnvironmentRestore XR emits a Job that streams the backup from SOS to pg01.

yaml

apiVersion: planapulse.com/v1alpha1
kind: EnvironmentRestore
metadata:
  name: acme-staging-20260529
spec:
  sourceSlug: acme
  sourceBackup: backup-20260529T020000.dump.gz
  targetDb: acme-staging.planapulse.app
  targetNamespace: plana-odoo-18
  restoreFilestore: true

bash

kubectl apply -f acme-staging-restore.yaml
kubectl get environmentrestore acme-staging-20260529 -w
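
To avoid hand-editing the manifest for each restore, the CR can be rendered from a few parameters. The sketch below reuses only the fields shown in the example above. The function name `render_restore` and the rule that derives metadata.name from the slug and the backup date are assumptions made for illustration.

```bash
# Hypothetical helper: print an EnvironmentRestore manifest for a side-by-side clone.
# Args: <slug> <backup key> <target DB> <namespace>
# The body runs in a subshell ( ... ) so the variables stay local to the function.
render_restore() (
  slug=$1 backup=$2 target_db=$3 namespace=$4
  stamp=${backup#backup-}   # strip the prefix: 20260529T020000.dump.gz
  stamp=${stamp%%T*}        # keep only the date part: 20260529
  cat <<EOF
apiVersion: planapulse.com/v1alpha1
kind: EnvironmentRestore
metadata:
  name: ${slug}-staging-${stamp}
spec:
  sourceSlug: ${slug}
  sourceBackup: ${backup}
  targetDb: ${target_db}
  targetNamespace: ${namespace}
  restoreFilestore: true
EOF
)
```

Usage: `render_restore acme backup-20260529T020000.dump.gz acme-staging.planapulse.app plana-odoo-18 | kubectl apply -f -`.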

3. Watch the Job

bash

kubectl -n plana-odoo-18 get job -l environmentrestore=acme-staging-20260529
kubectl -n plana-odoo-18 logs -l environmentrestore=acme-staging-20260529 -f

The Job streams the dump directly from SOS into psql --create without staging it on disk, so tenants whose dump is larger than the worker's /tmp can still be restored. Typical runtime: 1–3 minutes per GB of dump.
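
The runtime figure can be turned into a rough window before starting. This sketch takes the object size in bytes (the size column of `aws s3 ls`) and rounds up to whole GiB. It assumes "GB of dump" refers to the stored .dump.gz object, which the runbook does not state explicitly, and `estimate_restore_minutes` is a hypothetical name.

```bash
# Hypothetical helper: print "<min>-<max>" minutes for a dump of the given size in bytes,
# using the 1 to 3 minutes per GB rule of thumb from this step.
estimate_restore_minutes() (
  bytes=$1
  gib=$(( (bytes + 1073741823) / 1073741824 ))   # round up to whole GiB
  echo "$(( gib * 1 ))-$(( gib * 3 ))"
)
```

For example, `estimate_restore_minutes 5368709120` (a 5 GiB object) prints `5-15`.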

4. Bring up a temporary HTTPRoute

Apply a one-off HTTPRoute pointing acme-staging.planapulse.app at worker-odoo:8069 in the same namespace. Once Crossplane has reconciled, the staging tenant is reachable at https://acme-staging.planapulse.app/web/login.
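
The runbook does not show the HTTPRoute itself, so here is a minimal sketch using the standard Gateway API v1 fields. The route name, the helper name `render_httproute`, and the Gateway name and namespace passed for parentRefs are all placeholders; substitute whatever Gateway actually serves *.planapulse.app.

```bash
# Hypothetical helper: print a one-off HTTPRoute for a staging clone.
# Args: <hostname> <namespace> <gateway name> <gateway namespace>
# The gateway arguments are placeholders; the runbook does not name the real Gateway.
render_httproute() (
  host=$1 namespace=$2 gateway=$3 gateway_ns=$4
  cat <<EOF
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: ${host%%.*}
  namespace: ${namespace}
spec:
  parentRefs:
    - name: ${gateway}
      namespace: ${gateway_ns}
  hostnames:
    - ${host}
  rules:
    - backendRefs:
        - name: worker-odoo
          port: 8069
EOF
)
```

Usage: `render_httproute acme-staging.planapulse.app plana-odoo-18 <gateway> <gateway-namespace> | kubectl apply -f -`. A rule with no matches routes every path on that hostname to the backend.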

5. Smoke and hand off

Test the restored tenant. When done with the staging clone:

bash

kubectl delete environmentrestore acme-staging-20260529
# The Composition cleans up: drops the staging DB, removes the HTTPRoute,
# clears the filestore subdirectory.

In-place restore

This is the dangerous one. Always take an insurance backup first, even if you are restoring because the live tenant is broken.

1. Insurance backup of the live (possibly broken) DB

bash

# Capture the Job name once so the logs command targets the same Job
JOB="acme-pre-restore-$(date +%Y%m%d%H%M%S)"
kubectl -n backup create job --from=cronjob/acme-backup "$JOB"
kubectl -n backup logs "job/$JOB" -f

Confirm it landed in s3://plana-pulse-backups/acme/insurance/.
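
One way to make that confirmation mechanical is to check the insurance/ listing for a dump stamped at or after the moment the Job was started. This is a sketch: `insurance_landed` is a hypothetical helper, and it assumes the key timestamps use the same clock and timezone as the start time you pass in (UTC in the usage below), which the runbook does not specify.

```bash
# Hypothetical helper: read `aws s3 ls .../insurance/` output on stdin.
# Exits 0 if any dump has a timestamp at or after $1 (format YYYYMMDDTHHMMSS), else 1.
insurance_landed() {
  awk -v min="$1" '
    $NF ~ /^backup-[0-9]+T[0-9]+\.dump\.gz$/ {
      stamp = $NF
      sub(/^backup-/, "", stamp)
      sub(/\.dump\.gz$/, "", stamp)
      # Concatenating "" forces a string comparison; fixed-width stamps sort chronologically.
      if ((stamp "") >= (min "")) found = 1
    }
    END { exit !found }
  '
}
```

Usage: record `START=$(date -u +%Y%m%dT%H%M%S)` before creating the Job, then run `aws s3 ls "s3://plana-pulse-backups/acme/insurance/" --endpoint-url=https://sos-bg-sof-1.exo.io | insurance_landed "$START" && echo "insurance backup present"`.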

2. Freeze the tenant

bash

kubectl -n plana-odoo-18 scale deploy worker-odoo --replicas=0

Wait for pods to terminate. This stops writes during the restore. Note that this affects every tenant in the namespace — coordinate the maintenance window accordingly. For a single-tenant restore on a shared worker pool, prefer the side-by-side clone instead.

3. Drop and recreate the DB

bash

psql -h pg01.planapulse.com -U plana -c \
  "DROP DATABASE IF EXISTS \"acme.planapulse.app\" WITH (FORCE)"

This command only drops the database. The restore in step 4 recreates it.

4. Restore from SOS

Use the same EnvironmentRestore CR pattern, but with the target DB equal to the live DB:

yaml

spec:
  sourceSlug: acme
  sourceBackup: backup-20260529T020000.dump.gz
  targetDb: acme.planapulse.app    # the live name
  targetNamespace: plana-odoo-18
  restoreFilestore: true

bash

kubectl apply -f acme-inplace-restore.yaml

5. Clear the live filestore subdirectory

bash

# Run rm through a shell inside the pod so the * glob expands there, not locally
kubectl -n plana-odoo-18 exec deploy/worker-odoo -- \
  sh -c 'rm -rf /var/lib/odoo/filestore/acme.planapulse.app/*'

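kubectl exec needs a running worker-odoo pod; if step 2 left none, run the same rm from any pod that mounts the filestore volume. Then let the EnvironmentRestore Composition extract the filestore tarball back into place.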

6. Verify and unfreeze

bash

# Confirm DB is back
psql -h pg01 -U plana -d acme.planapulse.app -c "SELECT COUNT(*) FROM res_users"

# Unfreeze (use the replica count the Deployment had before step 2)
kubectl -n plana-odoo-18 scale deploy worker-odoo --replicas=3
kubectl -n plana-odoo-18 rollout status deploy worker-odoo

# Confirm filestore is back (kubectl exec needs a running pod, so check after the rollout)
kubectl -n plana-odoo-18 exec deploy/worker-odoo -- \
  ls /var/lib/odoo/filestore/acme.planapulse.app | wc -l

7. Smoke

bash

curl -sI "https://acme.planapulse.app/web/health"
curl -s  "https://acme.planapulse.app/web/login" | grep -q 'Log in' && echo OK

8. Notify the customer

Use the workspace's Matrix room. Include:

The backup timestamp used (so they know what window of data was lost)
The window of downtime (start time, end time)
The retention of the insurance backup (7 days minimum)
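
The backup timestamp in the notice can be derived from the key instead of retyped. The sketch below converts a key such as backup-20260529T020000.dump.gz into a readable form and fills a short message. Both function names and the message wording are hypothetical, and the runbook does not say which timezone the key timestamps use, so state it explicitly when sending.

```bash
# Hypothetical helper: turn a backup key into "YYYY-MM-DD HH:MM:SS".
# Keys that do not match the expected pattern are printed unchanged.
backup_time() {
  printf '%s\n' "${1##*/}" |
    sed -E 's/^backup-([0-9]{4})([0-9]{2})([0-9]{2})T([0-9]{2})([0-9]{2})([0-9]{2})\.dump\.gz$/\1-\2-\3 \4:\5:\6/'
}

# Hypothetical notice template covering the three points listed above.
# Args: <backup key> <downtime start> <downtime end>
restore_notice() {
  printf 'Restored from the backup taken at: %s\n' "$(backup_time "$1")"
  printf 'Downtime: %s to %s\n' "$2" "$3"
  printf 'An insurance backup of the pre-restore state is kept for at least 7 days.\n'
}
```

Usage: `restore_notice backup-20260529T020000.dump.gz "10:00" "10:25"` prints three lines ready to paste into the Matrix room.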

Restoring a deleted tenant

If the tenant's PLANAClient CR has been deleted (rare, since we usually soft-delete), the restore takes three steps:

Recreate the PLANAClient CR (or TenantEnvironment if more fine-grained).
Wait for the new tenant DB to be created (it will be empty).
Apply an EnvironmentRestore CR pointing at the SOS backup with target = the new live DB name.

Common pitfalls

1. Restoring to a DB that already exists

Forgetting to DROP before the in-place restore causes psql --create to fail or to import into an existing schema, leaving you with a hybrid broken state. Always DROP (or restore side-by-side first).

2. Filestore tarball missing

Older backups may not have a filestore tarball (the filestore-backup CronJob was added later than the DB-backup CronJob). If restoreFilestore: true and the tarball is missing, the Job will fail. Drop the flag or recover the filestore from the latest snapshot you DO have.
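
Checking for the tarball before applying the CR avoids a failed Job. The sketch below reads the tenant listing and prints the value to use for restoreFilestore. `restore_filestore_flag` is a hypothetical helper, and it assumes the tarball carries exactly the same timestamp as the dump, as the layout diagram suggests; if the two CronJobs stamp their output independently, match on the date instead.

```bash
# Hypothetical helper: read `aws s3 ls` output on stdin and print "true" if a filestore
# tarball with the same timestamp as the given dump key exists, else "false".
# The body runs in a subshell ( ... ) so the variables stay local to the function.
restore_filestore_flag() (
  backup=$1
  stamp=${backup#backup-}
  stamp=${stamp%.dump.gz}
  if awk '{ print $NF }' | grep -qxF "filestore-${stamp}.tar.gz"; then
    echo true
  else
    echo false
  fi
)
```

Usage: `aws s3 ls "s3://plana-pulse-backups/acme/" --endpoint-url=https://sos-bg-sof-1.exo.io | restore_filestore_flag backup-20260529T020000.dump.gz`.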

3. Asset bundle 500s after restore

If you restore a v18 backup but the worker is running v18 with a newer base image, asset hashes diverge and /web/assets/* returns 500. Force asset regeneration:

bash

# Asset bundle attachments are identified by their url column, not name
psql -h pg01 -U plana -d acme.planapulse.app -c \
  "DELETE FROM ir_attachment WHERE url LIKE '/web/assets/%'"
kubectl -n plana-odoo-18 rollout restart deploy worker-odoo

The worker rebuilds the assets on first request.

Where to read more

Provisioning a tenant
Upgrading a tenant
Architecture → Data stores — backup locations and retention
Plana extras → Backups — customer-facing version of this content
