Apache Superset Docker Deployment
Deploy Apache Superset with Docker for full-featured, self-hosted BI, with no third-party services required.
Superset is a powerful, open-source data visualization and business intelligence platform maintained by the Apache Foundation. It’s ideal for teams who want modern dashboards without handing data over to third parties. After experimenting with tools like Metabase, I landed on Superset for its extensibility, better dashboard customisation, and alerting/reporting features.
I struggled to find a comprehensive guide for Docker-based deployments beyond the basic setup, so I'll walk through my deployment process for Superset, from setting up the host and containers to branding and email alerts.
Why Superset?
Before diving into deployment, a quick comparison.
| Feature | Superset | Metabase |
|---|---|---|
| Data engine support | Wide (SQLAlchemy-compatible) | Limited but growing |
| Chart/dash flexibility | Advanced (custom CSS, D3, JS plugins) | Simple and opinionated |
| Alerts & reports | Built-in with scheduling (via Celery) | Available on paid plans |
| Embedding options | Fine-grained (auth, iframe, SSO) | Basic iframe embedding |
| Community | Apache-backed, active dev scene | Also solid, with a user-friendly vibe |
Metabase is simpler and easier to get started with, especially for business users. But if you're comfortable with SQL and want granular control, Superset offers a lot more power.
Deploying Superset with Docker Compose
I’m running this on a remote Linux server via Docker. We’ll use a persistent folder for Superset configs and database volumes.
Step 1: Prepare the Host
SSH into your server and run:
sudo mkdir -p /opt/superset/{home,db}
sudo chown -R 1000:1000 /opt/superset
This creates persistent storage for Superset's configuration and the PostgreSQL database.
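The compose file below also attaches every service to an external Docker network called shared_network, which is handy if a reverse proxy runs in a separate Compose project. If that network doesn't exist on your host yet, create it now:

docker network create shared_network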
Step 2: Docker Compose Setup (Full Version with Commands)
The complete Compose stack includes Superset, Postgres, Redis, a Celery worker for async tasks, and a beat scheduler for scheduled reports.
version: "3.8"

services:
  superset:
    image: apache/superset:latest
    container_name: superset
    ports:
      - "8088:8088"
    environment:
      - SUPERSET_ENV=production
      - SUPERSET_LOAD_EXAMPLES=no
      - SUPERSET_SECRET_KEY=your-secret-key
      - SUPERSET_REDIS_HOST=redis
      - SQLALCHEMY_DATABASE_URI=postgresql+psycopg2://superset:superset@db:5432/superset
      - CELERY_BROKER_URL=redis://redis:6379/0
    volumes:
      - /opt/superset/home:/app/superset_home
      - /opt/superset/home/superset_config.py:/app/pythonpath/superset_config.py
    depends_on:
      - db
      - redis
    restart: always
    networks:
      - shared_network
    command: >
      /bin/bash -c "
      pip install --no-cache-dir --no-warn-script-location psycopg2-binary prophet gevent openpyxl pandas-gbq pymysql elasticsearch-dbapi snowflake-connector-python cryptography flask-mail &&
      export FLASK_APP=superset &&
      superset db upgrade &&
      if ! superset fab list-users | grep -q admin; then
        superset fab create-admin --username admin --firstname Superset --lastname Admin --email [email protected] --password 'YourSecurePassword';
      fi &&
      superset init &&
      gunicorn --workers 4 --worker-class gthread --timeout 120 -b 0.0.0.0:8088 'superset.app:create_app()'
      "

  db:
    image: postgres:13
    container_name: superset_db
    environment:
      - POSTGRES_DB=superset
      - POSTGRES_USER=superset
      - POSTGRES_PASSWORD=superset
    ports:
      - "6125:5432"
    volumes:
      - /opt/superset/db:/var/lib/postgresql/data
    restart: always
    networks:
      - shared_network

  redis:
    image: redis:latest
    container_name: superset_redis
    restart: always
    networks:
      - shared_network

  worker:
    image: apache/superset:latest-dev
    container_name: superset_worker
    command: celery --app=superset.tasks.celery_app:app worker --pool=gevent --concurrency=4
    environment:
      - SUPERSET_ENV=production
      - SUPERSET_SECRET_KEY=your-secret-key
      - SQLALCHEMY_DATABASE_URI=postgresql+psycopg2://superset:superset@db:5432/superset
      - CELERY_BROKER_URL=redis://redis:6379/0
      - CELERY_RESULT_BACKEND=db+postgresql://superset:superset@db:5432/superset
    volumes:
      - /opt/superset/home:/app/superset_home
      - /opt/superset/home/superset_config.py:/app/pythonpath/superset_config.py
    depends_on:
      - redis
      - db
    restart: always
    networks:
      - shared_network

  beat:
    image: apache/superset:latest-dev
    container_name: superset_beat
    command: >
      /bin/bash -c "mkdir -p /app/superset_home && chmod -R a+rwX /app/superset_home && celery
      --app=superset.tasks.celery_app:app beat
      --pidfile=
      --schedule=/app/superset_home/celerybeat-schedule
      --loglevel=info"
    environment:
      - SUPERSET_ENV=production
      - SUPERSET_SECRET_KEY=your-secret-key
      - SQLALCHEMY_DATABASE_URI=postgresql+psycopg2://superset:superset@db:5432/superset
      - CELERY_BROKER_URL=redis://redis:6379/0
      - CELERY_RESULT_BACKEND=db+postgresql://superset:superset@db:5432/superset
    volumes:
      - /opt/superset/home:/app/superset_home
      - /opt/superset/home/superset_config.py:/app/pythonpath/superset_config.py
    depends_on:
      - redis
      - db
    restart: always
    networks:
      - shared_network

networks:
  shared_network:
    external: true
Save this as docker-compose.yml. If running this in production, use secrets and do not embed the credentials in the compose file.
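With docker-compose.yml and superset_config.py in place, bring the stack up and watch the first boot (the pip installs and db upgrade run inside the command block, so the first start takes a while):

docker compose up -d
docker compose logs -f superset   # wait for gunicorn to start listening on 8088

Then browse to http://your-server:8088 and log in with the admin account created by the command block above.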
Custom Branding
Tweak your /opt/superset/home/superset_config.py to rebrand the UI:
APP_NAME = "My BI Platform" # Top-left title and browser tab title
# Where clicking the logo takes users (e.g., your dashboard homepage)
LOGO_TARGET_PATH = "https://yourdomain.com"
# Replace the Superset logo (top-left) with your own
APP_ICON = "https://yourdomain.com/static/logo.svg"
# Favicon (browser tab icon)
FAVICONS = [
    {
        "rel": "icon",
        "type": "image/png",
        "sizes": "32x32",
        "href": "https://yourdomain.com/static/favicon-32x32.png",
    },
    {
        "rel": "icon",
        "type": "image/png",
        "sizes": "16x16",
        "href": "https://yourdomain.com/static/favicon-16x16.png",
    },
]
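Because superset_config.py is bind-mounted into the container (see the volumes section of the compose file), branding changes only need a restart, not a rebuild. A quick way to apply and sanity-check them, using the service name from the compose file above:

docker compose restart superset
docker compose logs --tail=50 superset   # config errors show up here on boot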
Enabling Email Reports
Add this to the same config file to enable Superset's alert and report emails (the snippet reads the SMTP password from the environment, so make sure import os appears at the top of the file):
EMAIL_NOTIFICATIONS = True
SMTP_HOST = "smtp.yourprovider.com"
SMTP_PORT = 587
SMTP_STARTTLS = True
SMTP_SSL = False
SMTP_USER = "[email protected]"
SMTP_PASSWORD = os.getenv("SMTP_PASSWORD", "")
SMTP_MAIL_FROM = "[email protected]"
ALERT_REPORTS_SUPERSET_WEBDRIVER = {
    "auth_type": "AUTH_FORM",
    "auth_user": "admin",
    "auth_password": os.getenv("SUPERSET_ADMIN_PW", ""),
    "login_url": "https://yourdomain.com/login/",
}
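Before blaming Superset for missing emails, it's worth confirming the host can even reach your SMTP endpoint over STARTTLS. A quick connectivity check from the Docker host (swap in your provider's hostname and port):

openssl s_client -starttls smtp -connect smtp.yourprovider.com:587 -crlf
# a successful handshake prints the server's certificate chain; a timeout points to firewall/egress issues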
What’s Next?
- Hook up your databases (see the example below)
- Build dashboards and slice charts
- Embed dashboards internally or in apps
- Define roles and access rules
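Hooking up a database is mostly a matter of pasting a SQLAlchemy URI into the Superset UI (the exact menu location varies by version). The URI below is purely illustrative; the host, user, and database names are placeholders:

postgresql+psycopg2://readonly_user:[email protected]:5432/warehouse

The matching driver (psycopg2, pymysql, snowflake-connector-python, and so on) has to be installed inside the Superset container; that's what the long pip install line in the command block above is for.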
Final Thoughts
Superset can be a bit heavy out of the gate, but once it’s running, it becomes a versatile and professional-grade BI platform, all while keeping your data under your control.
UPDATE 11/08/2025
Since writing the post above, I’ve upgraded my Superset stack to do two things:
- Real-time async queries via WebSockets: charts render the moment they’re ready (no browser polling).
- Headless screenshots & email reports without a custom Dockerfile: the Celery worker uses the latest-dev image, which bundles a headless browser/driver.
What changed
- WebSockets for GAQ (browser <-> WS <-> Redis): previously, the browser polled /api/v1/async_query/... every X ms. With WS, the server pushes “query finished” events the instant Celery writes status to Redis. Lower latency, fewer HTTP requests, faster dashboards.
- Split images: stable web, -dev worker: the main app stays on the small, stable image. Only the worker uses *-dev, which bundles headless Firefox + geckodriver. Reports/thumbnails work without baking a custom image; the web container remains lean and safer.
- Correct webdriver scoping:
WEBDRIVER_BASEURL and WEBDRIVER_BASEURL_USER_FRIENDLY (plus email links) now point at the external HTTPS URL, so the worker's headless browser logs in on the same domain that serves the WSS endpoint and its session cookies carry over reliably. Avoids cookie domain mismatches and "can't log in" screenshot failures.
- Celery wired up: imports=("superset.sql_lab", "superset.tasks",) ensures all Superset tasks (reports, thumbnails, cache-warmup) are registered. Beat has the same GAQ secret, so it can start without exploding.
- Performance headroom:
  - Gunicorn on gevent with higher worker connections = better I/O concurrency.
  - Redis caches for results, form state, thumbnails.
docker-compose.yml
version: "3.8"

services:
  superset:
    image: apache/superset:latest
    container_name: superset
    ports: ["8088:8088"]
    environment:
      - SUPERSET_ENV=production
      - SUPERSET_LOAD_EXAMPLES=no
      - SUPERSET_REDIS_HOST=redis
      - JWT_COOKIE_NAME=async-token
      - CACHE_CONFIG={"CACHE_TYPE":"RedisCache","CACHE_DEFAULT_TIMEOUT":3600,"CACHE_KEY_PREFIX":"superset_","CACHE_REDIS_HOST":"redis","CACHE_REDIS_PORT":6379}
      - DATA_CACHE_CONFIG={"CACHE_TYPE":"RedisCache","CACHE_DEFAULT_TIMEOUT":3600,"CACHE_KEY_PREFIX":"superset_","CACHE_REDIS_HOST":"redis","CACHE_REDIS_PORT":6379}
      - RATELIMIT_ENABLED=false
      - RATELIMIT_STORAGE_URL=redis://redis:6379
      - SQLALCHEMY_DATABASE_URI=postgresql+psycopg2://superset:superset@db:5432/superset
      - CELERY_BROKER_URL=redis://redis:6379/0
      - CELERY_RESULT_BACKEND=redis://redis:6379/1
      - GLOBAL_ASYNC_QUERIES_JWT_SECRET=${GLOBAL_ASYNC_QUERIES_JWT_SECRET}
      - ASYNC_QUERIES_JWT_ALGO=HS256
      - SUPERSET_SECRET_KEY=${SUPERSET_SECRET_KEY}
      # SMTP + webdriver creds used by “Test email” and thumbnails
      - SMTP_PASSWORD=${SMTP_PASSWORD}
      - SUPERSET_ADMIN_USER=${SUPERSET_ADMIN_USER}
      - SUPERSET_ADMIN_PW=${SUPERSET_ADMIN_PW}
      # Browsers (and now the worker browser) connect via public WSS
      - GAQ_WS_URL=wss://bi.yourdomain.tld/ws/
    volumes:
      - /opt/superset/home:/app/superset_home
      - /opt/superset/home/superset_config.py:/app/pythonpath/superset_config.py
      - /opt/superset/logo.png:/app/superset/static/assets/images/logo.png
      - /opt/superset/logo.png:/app/superset/static/assets/images/favicon.png
    restart: always
    depends_on: [db, redis]
    networks: [shared_network]
    deploy:
      resources:
        limits:
          cpus: "4.0"
          memory: 6G
        reservations:
          cpus: "2.0"
          memory: 2G
    command: >
      /bin/bash -c "
      pip install --no-cache-dir --no-warn-script-location psycopg2-binary pillow prophet gevent openpyxl pandas-gbq pymysql elasticsearch-dbapi snowflake-connector-python cryptography flask-mail &&
      export FLASK_APP=superset &&
      superset db upgrade &&
      if ! superset fab list-users | grep -q ${SUPERSET_ADMIN_USER}; then
        superset fab create-admin --username ${SUPERSET_ADMIN_USER} --firstname Superset --lastname Admin --email [email protected] --password '${SUPERSET_ADMIN_PW}';
      fi &&
      superset init &&
      gunicorn --workers 4 --threads 6 --worker-class gthread --timeout 180 -b 0.0.0.0:8088 'superset.app:create_app()'
      "

  db:
    image: postgres:13
    container_name: superset_db
    environment:
      - POSTGRES_DB=superset
      - POSTGRES_USER=superset
      - POSTGRES_PASSWORD=superset
    ports: ["6125:5432"]
    volumes:
      - /opt/superset/db:/var/lib/postgresql/data
    restart: always
    networks: [shared_network]

  redis:
    image: redis:latest
    container_name: superset_redis
    restart: always
    networks: [shared_network]

  # Use the DEV image ONLY for the worker so we get a built-in browser/driver
  worker:
    image: apache/superset:latest-dev
    container_name: superset_worker
    working_dir: /app/superset_home  # ensure geckodriver.log is writable
    command: >
      /bin/bash -c "
      pip install --no-cache-dir pillow &&
      celery --app=superset.tasks.celery_app:app worker --pool=gevent --concurrency=4
      "
    environment:
      - SUPERSET_ENV=production
      - SUPERSET_REDIS_HOST=redis
      - SUPERSET_SECRET_KEY=${SUPERSET_SECRET_KEY}
      - SQLALCHEMY_DATABASE_URI=postgresql+psycopg2://superset:superset@db:5432/superset
      - CELERY_BROKER_URL=redis://redis:6379/0
      - CELERY_RESULT_BACKEND=redis://redis:6379/1
      - GLOBAL_ASYNC_QUERIES_JWT_SECRET=${GLOBAL_ASYNC_QUERIES_JWT_SECRET}
      - ASYNC_QUERIES_JWT_ALGO=HS256
      # Needed for email sending + webdriver login
      - SMTP_PASSWORD=${SMTP_PASSWORD}
      - SUPERSET_ADMIN_USER=${SUPERSET_ADMIN_USER}
      - SUPERSET_ADMIN_PW=${SUPERSET_ADMIN_PW}
      # NOTE: no GAQ_WS_URL override here — the page uses Superset's value
    volumes:
      - /opt/superset/home:/app/superset_home
      - /opt/superset/home/superset_config.py:/app/pythonpath/superset_config.py
    depends_on: [redis, db, superset]
    restart: always
    networks: [shared_network]
    deploy:
      resources:
        limits:
          cpus: "3.0"
          memory: 3G
        reservations:
          cpus: "1.0"
          memory: 1G

  beat:
    image: apache/superset:latest
    container_name: superset_beat
    command: >
      /bin/bash -c "mkdir -p /app/superset_home && chmod -R a+rwX /app/superset_home && celery
      --app=superset.tasks.celery_app:app beat
      --pidfile=
      --schedule=/app/superset_home/celerybeat-schedule
      --loglevel=info"
    environment:
      - SUPERSET_ENV=production
      - SUPERSET_REDIS_HOST=redis
      - SUPERSET_SECRET_KEY=${SUPERSET_SECRET_KEY}
      - SQLALCHEMY_DATABASE_URI=postgresql+psycopg2://superset:superset@db:5432/superset
      - CELERY_BROKER_URL=redis://redis:6379/0
      - CELERY_RESULT_BACKEND=redis://redis:6379/1
      - GLOBAL_ASYNC_QUERIES_JWT_SECRET=${GLOBAL_ASYNC_QUERIES_JWT_SECRET}
      - ASYNC_QUERIES_JWT_ALGO=HS256
      - SMTP_PASSWORD=${SMTP_PASSWORD}
      - GAQ_WS_URL=wss://bi.yourdomain.tld/ws/
    volumes:
      - /opt/superset/home:/app/superset_home
      - /opt/superset/home/superset_config.py:/app/pythonpath/superset_config.py
    depends_on: [redis, db]
    restart: always
    networks: [shared_network]

  websocket:
    image: apache/superset:latest-websocket
    container_name: superset_websocket
    environment:
      - JWT_SECRET=${GLOBAL_ASYNC_QUERIES_JWT_SECRET}
      - JWT_COOKIE_NAME=async-token
      - REDIS_HOST=redis
      - REDIS_PORT=6379
      - REDIS_DB=0
      # - LOG_LEVEL=info
    depends_on: [redis]
    restart: always
    networks: [shared_network]

networks:
  shared_network:
    external: true
Put secrets in a .env file next to the compose file:
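For example (all values are placeholders; generate your own long random secrets, e.g. with openssl rand -base64 42):

SUPERSET_SECRET_KEY=change-me-long-random-string
GLOBAL_ASYNC_QUERIES_JWT_SECRET=change-me-another-long-random-string
SMTP_PASSWORD=your-smtp-password
SUPERSET_ADMIN_USER=admin
SUPERSET_ADMIN_PW=your-admin-password

You can confirm the variables are being interpolated with docker compose config, which prints the fully resolved file.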
superset_config.py
# superset_config.py
import os
from celery.schedules import crontab
from cachelib.redis import RedisCache

# ---------- Celery ----------
class CeleryConfig:
    broker_url = "redis://redis:6379/0"
    result_backend = "redis://redis:6379/1"
    imports = ("superset.sql_lab", "superset.tasks",)
    worker_prefetch_multiplier = 1
    task_acks_late = True
    task_soft_time_limit = 300
    task_time_limit = 360
    beat_schedule = {
        "reports.scheduler": {"task": "reports.scheduler", "schedule": crontab(minute="*", hour="*")},
        "reports.prune_log": {"task": "reports.prune_log", "schedule": crontab(minute=0, hour=0)},
    }

CELERY_CONFIG = CeleryConfig

# ---------- Branding ----------
APP_NAME = "My BI Platform"
APP_ICON = "/static/assets/images/logo.png"
APP_ICON_WIDTH = 200
LOGO_TARGET_PATH = "https://bi.yourdomain.tld"
FAVICON = "/static/assets/images/logo.png"

# ---------- Metadata DB ----------
SQLALCHEMY_DATABASE_URI = "postgresql+psycopg2://superset:superset@db:5432/superset"
SQLALCHEMY_ENGINE_OPTIONS = {
    "pool_size": 25,
    "max_overflow": 50,
    "pool_pre_ping": True,
    "pool_recycle": 300,
}

# ---------- Caching ----------
RESULTS_BACKEND = RedisCache(host="redis", port=6379, key_prefix="superset_results_")
CACHE_CONFIG = {
    "CACHE_TYPE": "RedisCache",
    "CACHE_DEFAULT_TIMEOUT": 3600,
    "CACHE_KEY_PREFIX": "superset_",
    "CACHE_REDIS_HOST": "redis",
    "CACHE_REDIS_PORT": 6379,
}
DATA_CACHE_CONFIG = dict(CACHE_CONFIG)
FILTER_STATE_CACHE_CONFIG = {
    "CACHE_TYPE": "RedisCache",
    "CACHE_DEFAULT_TIMEOUT": 86400,
    "CACHE_KEY_PREFIX": "filter_state_",
    "CACHE_REDIS_HOST": "redis",
    "CACHE_REDIS_PORT": 6379,
}

# ---------- Rate limiting ----------
RATELIMIT_ENABLED = False
RATELIMIT_STORAGE_URI = "redis://redis:6379"
RATELIMIT_STORAGE_BACKEND = RedisCache(host="redis", port=6379, key_prefix="ratelimit_")

# ---------- Core timeouts ----------
SUPERSET_WEBSERVER_TIMEOUT = 180
ENABLE_TIME_ROTATE = True
ROW_LIMIT = 10000

# ---------- Email / Alerts & Reports ----------
EMAIL_NOTIFICATIONS = True
SMTP_HOST = "smtp.office365.com"
SMTP_PORT = 587
SMTP_STARTTLS = True
SMTP_SSL = False
SMTP_USER = "[email protected]"
SMTP_PASSWORD = os.getenv("SMTP_PASSWORD", "")
SMTP_MAIL_FROM = "[email protected]"

# IMPORTANT: use external HTTPS so cookies + WSS match for the worker browser
WEBDRIVER_BASEURL = "https://bi.yourdomain.tld/"
WEBDRIVER_BASEURL_USER_FRIENDLY = "https://bi.yourdomain.tld/"

EMAIL_REPORTS_CTA = "Explore in BI"
EMAIL_REPORTS_SUBJECT_PREFIX = "[BI REPORT] "
ALERT_REPORTS_NOTIFICATION_DRY_RUN = False
ALERT_REPORTS_SUPERSET_WEBDRIVER = {
    "auth_type": "AUTH_FORM",
    "auth_user": os.getenv("SUPERSET_ADMIN_USER", "admin"),
    "auth_password": os.getenv("SUPERSET_ADMIN_PW", ""),
    "login_url": "https://bi.yourdomain.tld/login/",
}

FEATURE_FLAGS = {
    "ALERT_REPORTS": True,
    "EMBEDDED_SUPERSET": True,
    "PLAYWRIGHT_REPORTS_AND_THUMBNAILS": False,  # Selenium route
    "ALLOW_FULL_CSV_EXPORT": True,
    "DASHBOARD_VIRTUALIZATION": True,
    "DASHBOARD_LAZY_RENDERING": True,
    "DASHBOARD_NATIVE_FILTERS": True,
    "DASHBOARD_NATIVE_FILTERS_SET": True,
    "SHARE_QUERIES_VIA_KV_STORE": True,
    "EMBEDDABLE_CHARTS": True,
    "GLOBAL_ASYNC_QUERIES": True,
}

# ---------- Screenshots / Selenium (Firefox in latest-dev worker) ----------
SCREENSHOT_SELENIUM_DRIVER = "firefox"
WEBDRIVER_TYPE = "firefox"
WEBDRIVER_OPTION_ARGS = ["--headless", "--disable-gpu", "--no-sandbox", "--disable-dev-shm-usage"]
# Log to a writable path inside the worker
WEBDRIVER_CONFIGURATION = {"service_log_path": "/app/superset_home/geckodriver.log"}
# Heavier dashboards benefit from more generous waits
SCREENSHOT_LOAD_WAIT = 120
SCREENSHOT_SELENIUM_ANIMATION_WAIT = 30
SCREENSHOT_SELENIUM_HEADSTART = 10
SCREENSHOT_LOCATE_WAIT = 120
EMAIL_PAGE_RENDER_WAIT = 45
SCREENSHOT_SELENIUM_WAIT = 45
WEBDRIVER_WINDOW = {"dashboard": (1300, 2000), "slice": (1300, 1200), "pixel_density": 1}

# ---------- GAQ (Async Queries) ----------
GLOBAL_ASYNC_QUERIES_JWT_SECRET = os.getenv("GLOBAL_ASYNC_QUERIES_JWT_SECRET")
ASYNC_QUERIES_JWT_ALGO = os.getenv("ASYNC_QUERIES_JWT_ALGO", "HS256")
GLOBAL_ASYNC_QUERIES_CACHE_BACKEND = {
    "CACHE_TYPE": "RedisCache",
    "CACHE_KEY_PREFIX": "gaq_",
    "CACHE_REDIS_HOST": "redis",
    "CACHE_REDIS_PORT": 6379,
    "CACHE_DEFAULT_TIMEOUT": 0,
}
GLOBAL_ASYNC_QUERIES_REDIS_CONFIG = {"host": "redis", "port": 6379, "db": 0, "password": None, "ssl": False}
GLOBAL_ASYNC_QUERIES_TRANSPORT = "ws"
GLOBAL_ASYNC_QUERIES_WEBSOCKET_URL = "wss://bi.yourdomain.tld/ws/"
GLOBAL_ASYNC_QUERIES_POLLING_DELAY = 300  # ms (only used if falling back to polling)
GLOBAL_ASYNC_QUERIES_JWT_COOKIE_NAME = "async-token"
GLOBAL_ASYNC_QUERIES_JWT_COOKIE_SECURE = True

# ---------- Extra caches ----------
EXPLORE_FORM_DATA_CACHE_CONFIG = {
    "CACHE_TYPE": "RedisCache",
    "CACHE_KEY_PREFIX": "explore_",
    "CACHE_REDIS_HOST": "redis",
    "CACHE_REDIS_PORT": 6379,
    "CACHE_DEFAULT_TIMEOUT": 300,
}
THUMBNAIL_CACHE_CONFIG = {
    "CACHE_TYPE": "RedisCache",
    "CACHE_KEY_PREFIX": "thumb_",
    "CACHE_REDIS_HOST": "redis",
    "CACHE_REDIS_PORT": 6379,
    "CACHE_DEFAULT_TIMEOUT": 3600,
}
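After editing the config, restart the containers that mount it and check that the worker picked up the report and thumbnail tasks. One way to do that, using the service and container names from the compose file:

docker compose restart superset worker beat
docker exec superset_worker celery --app=superset.tasks.celery_app:app inspect registered

The registered-task list should include entries under reports.* and sql_lab; if it comes back empty, the imports tuple in CeleryConfig isn't being loaded.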
Reverse proxy (NPM / Nginx Proxy Manager)
Keep your main proxy host pointing to http://superset:8088.
Add a Custom Location:
- Location: /ws/
- Forward to: http://websocket:8080
- Enable Websockets Support
Advanced tab:
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
proxy_set_header Host $host;
proxy_read_timeout 600s;
proxy_send_timeout 600s;
Make sure NPM is on the same Docker network as this stack.
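If NPM lives in its own Compose project, attaching it is a one-liner (assuming the NPM container is literally named npm; adjust to whatever yours is called):

docker network connect shared_network npm

Once attached, NPM can resolve superset and websocket by their service names.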
How to confirm it works
- WebSocket: open DevTools > Network > filter WS. You should see wss://bi.yourdomain.tld/ws/ with 101 Switching Protocols and messages flowing.
- Reports: create a test report and hit "Run now". You'll see activity in the superset_worker logs and should receive an email (CLI checks below).
- Speed: charts appear as soon as each query finishes (no synchronized pop after a poll tick).
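The same checks from the command line, using the container names from the compose file:

docker logs -f superset_websocket            # websocket server logs
docker logs -f superset_worker               # report/screenshot activity during "Run now"
docker exec superset_redis redis-cli ping    # broker reachable -> PONG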