Monitoring Celery: My Walk with Flower, Prometheus and Grafana
First, what is Celery and why am I walking with it?
Celery is a background task manager that helps you run tasks asynchronously in your applications. Imagine you’re a busy CEO with many tasks that keep your work running — scheduling meetings, generating reports and processing large amounts of data. Celery, like a solid personal assistant, handles the important repetitive tasks while you do the more important ones.
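To make the assistant metaphor concrete, here's a toy, stdlib-only sketch of the pattern Celery implements: a broker (here a plain queue) holds task messages, and a worker pulls them off and runs them. None of this is Celery's actual API — `submit` and `worker` are made-up names — it's just the shape of the idea.

```python
import queue
import threading

# Toy model of what Celery does: a broker (here a simple queue) holds
# task messages, and a worker thread pulls and executes them.
tasks = queue.Queue()
results = []

def worker():
    while True:
        func, args = tasks.get()
        if func is None:          # sentinel: shut the worker down
            break
        results.append(func(*args))
        tasks.task_done()

def submit(func, *args):
    tasks.put((func, args))       # roughly what task.delay(...) does

t = threading.Thread(target=worker)
t.start()
submit(lambda x, y: x + y, 4, 6)  # analogous to add.delay(4, 6)
tasks.put((None, ()))             # stop the worker
t.join()
print(results)                    # [10]
```

Celery's real value is that the queue lives in a broker like Redis and the workers run in separate processes, possibly on separate machines, but the enqueue-and-process loop is the same.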
However, having a distributed task queue system is one thing. Monitoring it to see what’s going on under the hood is just as important. I set up a team to monitor Celery here:
- Flower offers visibility into the distributed task processing system (Celery), which feeds into…
- Prometheus plays the role of metrics collector and record keeper.
- And then there’s Grafana. She’s the visual one — you’ll see all the data about your Celery tasks in bite-size, easy-on-the-eye (and easy on your cognitive load) formats to help you understand what’s going on and take action where necessary.
In this article, I walk you through my process of setting these tools up to monitor Celery. Of course, I’m going to share most of my mistakes too so you can learn from them. In the end, we get there.
Here is what I set out to achieve:
- Set up Celery with a Redis broker
- Install and configure Flower for real-time task monitoring
- Integrate Prometheus to collect metrics
- Visualize everything with Grafana dashboards
- Containerize our setup with Docker (because who doesn’t love containers?)
Let’s dive in.
Getting the Work Done: Setting Up and Monitoring Celery
1. Setting Up Celery with Redis as the Broker
First, I set up Celery itself and set it in motion to do those background tasks. I used Redis for my setup like so:
- Install Celery and Redis:
pip install celery redis
- Configure Celery: In my Django app, I configured Celery with Redis as the broker in a celery.py file:
from celery import Celery

app = Celery('my_app_name',
             broker='redis://localhost:6379/0',
             backend='redis://localhost:6379/0',
             include=['my_app_name.tasks'])
- Create a tasks.py file: Here, I defined the tasks I wanted and integrated Prometheus metrics to report on them:
from celery import Celery
from prometheus_client import start_http_server, Counter, Gauge

app = Celery('tasks', broker='redis://localhost:6379/0')

# Define Prometheus metrics
task_counter = Counter('celery_tasks_total', 'Total number of Celery tasks')
task_in_progress = Gauge('celery_tasks_in_progress', 'Number of Celery tasks in progress')

@app.task
def add(x, y):
    task_counter.inc()
    task_in_progress.inc()
    result = x + y
    task_in_progress.dec()
    return result

# Start the Prometheus HTTP server to expose metrics, queue a task,
# then run the worker in-process (app.start() blocks, so enqueue first)
if __name__ == '__main__':
    start_http_server(8000)
    add.delay(4, 6)
    app.start()
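Under the hood, the Counter and Gauge used above are essentially thread-safe numbers that the metrics server exposes. A minimal stdlib model of their behaviour (ToyMetric and ToyGauge are my illustrative names, not prometheus_client classes):

```python
import threading

# Minimal stdlib model of prometheus_client's Counter and Gauge: both
# are thread-safe numbers; a Counter only ever goes up, while a Gauge
# can go up and down. Illustration only, not the real library.
class ToyMetric:
    def __init__(self):
        self._value = 0.0
        self._lock = threading.Lock()

    def inc(self, amount=1):
        with self._lock:
            self._value += amount

    @property
    def value(self):
        return self._value

class ToyGauge(ToyMetric):
    def dec(self, amount=1):
        self.inc(-amount)

task_counter = ToyMetric()
task_in_progress = ToyGauge()

task_counter.inc()        # a task started
task_in_progress.inc()
# ... task runs ...
task_in_progress.dec()    # and finished

print(task_counter.value, task_in_progress.value)  # 1.0 0.0
```

This is why the task above increments the counter once but increments *and* decrements the gauge: the counter tracks lifetime totals, the gauge tracks the current in-flight count.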
2. Running Celery Workers
- Start the Celery worker:
celery -A my_app_name worker --loglevel=info
- Make sure the workers are running: Check the status of your workers by running:
celery -A my_app_name status
3. Setting Up Flower for Monitoring Celery
Next, we set up Flower for Celery. Flower is the tool that gives you a visual dashboard for Celery.
- Install Flower:
pip install flower
- Run Flower once you’re sure the Celery workers are active:
celery -A my_app_name flower --port=5555
- Access Flower: navigate to http://localhost:5555 in your browser to monitor your Celery tasks.
4. Integrating Prometheus for Metrics Collection
- Install Prometheus Client:
pip install prometheus_client
- Expose Metrics in Your App: I added a /metrics endpoint to the main Python file to record request timings:
from prometheus_client import start_http_server, Summary
import time
import random

REQUEST_TIME = Summary('request_processing_seconds', 'Time spent processing request')

@REQUEST_TIME.time()
def process_request(t):
    time.sleep(t)

if __name__ == '__main__':
    start_http_server(8000)
    while True:
        process_request(random.random())
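The @REQUEST_TIME.time() decorator is doing something you can sketch in plain Python: time each call and record the duration. A rough stdlib stand-in (ToySummary is illustrative, not the real Summary class):

```python
import functools
import time

# Rough stdlib equivalent of @REQUEST_TIME.time(): a Summary keeps a
# count of observations and their running total, and the decorator
# times every call to the wrapped function.
class ToySummary:
    def __init__(self):
        self.count = 0
        self.total = 0.0

    def observe(self, seconds):
        self.count += 1
        self.total += seconds

    def time(self):
        def decorator(func):
            @functools.wraps(func)
            def wrapper(*args, **kwargs):
                start = time.perf_counter()
                try:
                    return func(*args, **kwargs)
                finally:
                    # record the duration even if the call raises
                    self.observe(time.perf_counter() - start)
            return wrapper
        return decorator

REQUEST_TIME = ToySummary()

@REQUEST_TIME.time()
def process_request(t):
    time.sleep(t)

process_request(0.01)
print(REQUEST_TIME.count)  # 1
```

From count and total, Prometheus can later derive averages and rates, which is exactly what request_processing_seconds gives you on the /metrics endpoint.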
- Prometheus Configuration: I then configured Prometheus to scrape metrics from the application in a prometheus.yml file (create a new file called prometheus.yml):
global:
  scrape_interval: 15s
  evaluation_interval: 15s
scrape_configs:
  - job_name: prometheus
    static_configs:
      - targets: ['localhost:9090']
  - job_name: flower
    static_configs:
      - targets: ['localhost:5555']
- Start Prometheus:
prometheus --config.file=prometheus.yml
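One thing worth noting: the tasks.py earlier exposes its metrics on port 8000 via start_http_server, but the config so far only scrapes Prometheus itself and Flower. Assuming the exporter is reachable at localhost:8000, a third scrape job along these lines (the job name is my choice) would pick those metrics up:

```yaml
scrape_configs:
  # ...existing prometheus and flower jobs...
  - job_name: celery_app            # any label you like
    static_configs:
      - targets: ['localhost:8000'] # start_http_server(8000) in tasks.py
```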
5. Setting Up Grafana for Visualisation
Let’s make the data look pretty with Grafana (trust me, it’s prettier than Flower).
- Install Grafana: Download and install Grafana for your OS, following the instructions on the Grafana website.
- Add Prometheus as a Data Source: Navigate to Grafana’s web interface at http://localhost:3000, then go to Configuration > Data Sources and add Prometheus with the URL http://localhost:9090.
- Create Dashboards: Create custom dashboards using the metrics exposed by Prometheus.
6. Setting Up Sentry for Error Tracking
Errors are all too important for understanding what is going on when things are not working as they ought to. Let’s use Sentry to track them.
- Install Sentry SDK:
pip install sentry-sdk==2.11.0
- Integrate Sentry in Your App: Initialize Sentry in your app’s main entry point (settings.py for Django, in my case):
import sentry_sdk
from sentry_sdk.integrations.django import DjangoIntegration

sentry_sdk.init(
    dsn="YOUR_SENTRY_DSN",
    integrations=[DjangoIntegration()],
    traces_sample_rate=1.0,
    send_default_pii=True
)
7. Running Everything Concurrently with Docker
- Dockerize the Setup: Give each service (Celery, Redis, Flower, Prometheus, Grafana) its own container by creating a docker-compose.yml file to manage them together.
version: '3'
services:
  redis:
    image: "redis:alpine"
  web:
    build: .
    command: python manage.py runserver 0.0.0.0:8000
    volumes:
      - .:/code
    ports:
      - "8000:8000"
    depends_on:
      - redis
  celery:
    build: .
    command: celery -A my_app_name worker --loglevel=info
    volumes:
      - .:/code
    depends_on:
      - redis
  flower:
    build: .
    command: celery -A my_app_name flower --port=5555
    ports:
      - "5555:5555"
    depends_on:
      - redis
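The compose file above covers Redis, the web app, the worker, and Flower. To get Prometheus and Grafana into the same environment, services roughly like these can be added under `services:` (the images are the official ones; the prometheus.yml mount path is an assumption about your project layout):

```yaml
  prometheus:
    image: prom/prometheus
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
    ports:
      - "9090:9090"
  grafana:
    image: grafana/grafana
    ports:
      - "3000:3000"
    depends_on:
      - prometheus
```

Note that inside a compose network, scrape targets like localhost:5555 in prometheus.yml should become service names, e.g. flower:5555, since each container has its own localhost.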
- Start the Environment with this command:
docker-compose up
- Monitor and Troubleshoot: Here are a few ways to monitor and troubleshoot:
- Use multiple terminals to monitor logs, metrics, and services.
- Regularly check Prometheus and Grafana dashboards for system health.
- Monitor Celery tasks through Flower’s web interface.
This setup covers the primary steps needed to integrate Celery with Flower, Prometheus, Grafana, and Sentry to make for a robust and observable environment.
Lessons Learned: My Troubleshooting Diary
Alright, I have been itching to tell you the gist of it — I find that the bulk of my learning happened in the mistakes I made or the errors I ran into. Here are some of them:
Redis ghosted me
You know that moment when you’ve set up everything perfectly, you hit “run,” and then nothing works? That was me — I’d spent ages configuring Celery and its components, and Redis had stopped running.
What I learned: Always, always check if Redis is actually running. If you’re using Docker (like I did), spin up a Redis container with:
docker run -p 6379:6379 redis
Pro tip: Make sure Redis is alive on port 6379 before you start pulling your hair out.
Grafana and Prometheus were not getting along…
The way integrations work is that all the tools should depend on each other to give you the value you’re looking for. Prometheus typically feeds Grafana with data and it just wasn’t working. I may or may not have screamed at my three screens a few times — I’d set everything up correctly! (Or not).
What I learned:
a. Verify Prometheus is Running: First, check if Prometheus is active at http://localhost:9090. If it’s not, that’s your starting point.
b. Add Prometheus as a Data Source: In Grafana, go to Configuration > Data Sources > Add Prometheus and ensure the URL is set to http://localhost:9090.
c. Check Network Connectivity: If Grafana still can’t connect, there might be network issues or firewall rules blocking communication. Verify that both tools are on the same network and that no firewalls are interfering.
d. Restart and Recheck: Sometimes, a simple restart of the Docker containers for both Prometheus and Grafana can resolve connection issues. If you still have issues, inspect logs for any errors.
Pro Tip: Be almost paranoid about your configurations pointing to the right URLs and ports. A small typo can sometimes be the culprit.
Docker was hugging my CPU
Docker is usually a lifesaver, but not when it makes your laptop sound like a jet engine. I was in the middle of my environment setup when my CPU was maxed out, my fan was blasting, and my machine was overheating. Not fun.
Here’s what I did:
a. Stop Unnecessary Containers: First, check what’s running. Use docker ps to list all containers and docker stop <container_id> for any you don’t need. This freed up some resources right away.
b. Manage Container Restarts: If Docker keeps restarting, it could be due to auto-restart settings. Check Docker’s settings to ensure only necessary containers have restart policies.
c. Manually Stop Docker: Sometimes, you need to take more drastic measures. I found that manually stopping Docker Desktop from the application itself helped calm things down.
Pro Tip: Keep an eye on Docker’s resource usage (the activity monitor on your Mac is your friend). If your CPU starts spiking, it’s usually because of a runaway container. Docker is powerful, but it can easily overrun your system if left unchecked.
VSCode couldn’t find my Python Packages
I had just installed a Python package and VS Code insisted it couldn’t find it when I tried to deploy my configuration. Wahala!
Here’s what was really happening:
a. Virtual Environment Confusion: VS Code was using a different Python interpreter than the one in my active virtual environment.
b. Cached Python Interpreter: VS Code sometimes caches the Python interpreter path, which might not reflect recent changes to your environment. Despite all my changes, it refused to acknowledge the new packages.
c. Workspace Settings: My VS Code workspace settings pointed to an outdated Python interpreter, probably from an old project I’d forgotten about. Pointing to the wrong Python interpreter can happen if you have conflicting configurations.
The solution: Make sure your virtual environment is not just created but activated. When in doubt:
pip install -r requirements.txt
And if VS Code is still stubborn, try reopening your project in a containerized environment. It will give your code editor a fresh start.
Pro Tip: Always double-check which Python interpreter VS Code is using. It’s in the bottom left corner of the VS Code window. If it’s not pointing to your virtual environment, that’s your culprit!
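A quick way to settle interpreter arguments with VS Code is to ask Python directly, from the same terminal VS Code opens: which executable is running, and can it see the package? A small stdlib sketch (is_installed is a helper name I made up):

```python
import importlib.util
import sys

# Print the interpreter actually running this script; if it is not the
# one inside your virtual environment (e.g. .venv/bin/python), VS Code
# and pip are talking to different Pythons.
print(sys.executable)

# Check whether a package is importable from *this* interpreter,
# without actually importing it. 'json' stands in for any package.
def is_installed(name):
    return importlib.util.find_spec(name) is not None

print(is_installed("json"))            # True: stdlib is always there
print(is_installed("not_a_real_pkg"))  # False
```

If sys.executable doesn't point into your virtual environment, that mismatch, not the package, is the problem.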
Flower Power… or Lack Thereof
I had finished deploying Celery and was ready for Flower to monitor the tasks, but it just wasn’t working.
What worked for me:
a. Check your Celery Workers to make sure they’re up and running by running this command:
celery -A <my_app_name> status
Replace <my_app_name> with the name of your Celery application. This command will list all active workers and their status, showing whether they are connected and processing tasks.
If the workers are running, you’ll see their names and details like the number of tasks each has handled. If you see an error or no workers listed, that means they aren’t running, and you may need to start them with:
celery -A <my_app_name> worker --loglevel=info
This command will start the worker processes, allowing them to handle tasks.
b. Then, with the confidence of someone who definitely hasn’t made this mistake before, run:
celery -A tasks flower --port=5555
c. Check the Flower URL after starting Flower. Make sure you’re accessing the correct URL — it should be something like http://localhost:5555.
The Ghost of Sentry Past
While setting up Sentry for error monitoring, I encountered persistent errors about a missing sentry_sdk.
What I did: After some investigation, I realized that the sentry_sdk package wasn't listed in my requirements.txt file. Since it wasn't installed, my app couldn't use Sentry, leading to the errors. To resolve this, I added sentry-sdk==2.11.0 to requirements.txt and ran:
pip install -r requirements.txt
This installed the SDK and cleared up the errors immediately.
Pro Tip: Always double-check that all required packages are listed in requirements.txt to avoid these kinds of issues.
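You can also catch this class of problem early with a small stdlib check that walks a requirements list and reports anything that isn't installed. The list is inline here for illustration; in practice you'd read it from requirements.txt:

```python
from importlib import metadata

# Check that every package pinned in a requirements list is actually
# installed in the current interpreter. The entries are examples.
requirements = ["sentry-sdk==2.11.0", "definitely-not-installed==0.0"]

missing = []
for req in requirements:
    name = req.split("==")[0]          # drop the version pin
    try:
        metadata.version(name)         # raises if not installed
    except metadata.PackageNotFoundError:
        missing.append(name)

print(missing)
```

Running something like this at startup (or in CI) turns a mysterious mid-deploy import error into an explicit "these packages are missing" message.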
Docker Network Conflicts
During the setup, I ran into networking issues where ports were already in use (this happened because I’d played around with the same ports from another task), and containers couldn’t communicate as expected. These conflicts can be tricky to diagnose and disrupt the entire workflow, especially when trying to connect multiple services like Redis, Celery, and Flower.
Untangling the mess:
- Before you docker run, make sure the coast is clear. Use docker ps to see what's running.
- If a port is taken, don’t be afraid to use a different one. For example, consider switching from -p 8000:8000 to -p 8080:8000 if 8000 is busy with some other tool you’ve played around with.
Pro Tip: Regularly monitor Docker’s running containers to avoid network conflicts, especially when reusing ports.
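You can also check whether a port is already taken before handing it to docker run, straight from Python's stdlib (port_in_use is a helper name I made up):

```python
import socket

# Check whether a TCP port on localhost is already bound, before
# handing it to `docker run -p`. connect_ex returns 0 when something
# accepts the connection, i.e. the port is taken.
def port_in_use(port, host="127.0.0.1"):
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        return sock.connect_ex((host, port)) == 0

# Demonstrate a "taken" port by binding a listener ourselves.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))     # port 0: let the OS pick a free one
server.listen(1)
taken_port = server.getsockname()[1]

print(port_in_use(taken_port))    # True: we are listening on it
server.close()
```

Calling port_in_use(6379) or port_in_use(5555) before docker-compose up tells you in half a second whether Redis or Flower will collide with something already running.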
The Multi-Terminal Juggling Act
As the complexity of the setup grew, so did the number of terminals I needed open at once. Managing multiple processes like Celery workers, Flower, Redis, and Grafana required keeping track of different commands in separate terminal windows, which can quickly become overwhelming.
What I learned:
1. Use Multiple Terminals: Open separate terminals for each service (e.g., Celery, Flower, Redis).
2. Stay Organized: Keep track of what each terminal is doing to avoid confusion.
This one made me feel like a proper hacker from a 90s movie — remember the one where Angelina Jolie had that red leather top on with spiky hair?