Monitoring Celery: My Walk with Flower, Prometheus and Grafana
First, what is Celery and why am I walking with it?
Celery is a background task manager that helps you run tasks asynchronously in your applications. Imagine you’re a busy CEO with many tasks that keep your work running — scheduling meetings, generating reports and processing large amounts of data. Celery, like a solid personal assistant, handles the important repetitive tasks while you do the more important ones.
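To make the assistant metaphor concrete, here's a toy, stdlib-only sketch of the pattern Celery implements: a broker (here a plain queue) holds task messages, and a worker pulls them off and runs them. None of this is Celery's actual API — `submit` and `worker` are made-up names — it's just the shape of the idea.

```python
import queue
import threading

# Toy model of what Celery does: a broker (here a simple queue) holds
# task messages, and a worker thread pulls and executes them.
tasks = queue.Queue()
results = []

def worker():
    while True:
        func, args = tasks.get()
        if func is None:          # sentinel: shut the worker down
            break
        results.append(func(*args))
        tasks.task_done()

def submit(func, *args):
    tasks.put((func, args))       # roughly what task.delay(...) does

t = threading.Thread(target=worker)
t.start()
submit(lambda x, y: x + y, 4, 6)  # analogous to add.delay(4, 6)
tasks.put((None, ()))             # stop the worker
t.join()
print(results)                    # [10]
```

Celery's real value is that the queue lives in a broker like Redis and the workers run in separate processes, possibly on separate machines, but the enqueue-and-process loop is the same.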
However, having a distributed task queue system is one thing. Monitoring it to see what’s going on under the hood is just as important. I set up a team to monitor Celery here:
- Flower offers visibility into the distributed task processing system (Celery), which feeds into…
- Prometheus plays the role of metrics collector and record keeper.
- And then there’s Grafana. She’s the visual one — you’ll see all the data about your Celery tasks in bite-size, easy-on-the-eye (and easy on your cognitive load) formats to help you understand what’s going on and take action where necessary.
In this article, I walk you through my process of setting these tools up to monitor Celery. Of course, I’m going to share most of my mistakes too so you can learn from them. In the end, we get there.
Here is what I set out to achieve:
- Set up Celery with a Redis broker
- Install and configure Flower for real-time task monitoring
- Integrate Prometheus to collect metrics
- Visualize everything with Grafana dashboards
- Containerize our setup with Docker (because who doesn’t love containers?)
Let’s dive in.
Getting the Work Done: Setting Up and Monitoring Celery
1. Setting Up Celery with Redis as the Broker
First, I set up Celery itself and set it in motion to do those background tasks. I used Redis for my setup like so:
- Install Celery and Redis:
pip install celery redis
- Configure Celery: In my Django app, I configured Celery with Redis as the broker in a celery.py file:
from celery import Celery

app = Celery('my_app_name',
             broker='redis://localhost:6379/0',
             backend='redis://localhost:6379/0',
             include=['my_app_name.tasks'])
- Create a tasks.py file: Here, I defined the tasks I wanted and integrated Prometheus metrics to report on them:
from celery import Celery
from prometheus_client import start_http_server, Counter, Gauge

app = Celery('tasks', broker='redis://localhost:6379/0')

# Define Prometheus metrics
task_counter = Counter('celery_tasks_total', 'Total number of Celery tasks')
task_in_progress = Gauge('celery_tasks_in_progress', 'Number of Celery tasks in progress')

@app.task
def add(x, y):
    task_counter.inc()
    task_in_progress.inc()
    result = x + y
    task_in_progress.dec()
    return result

# Start the Prometheus HTTP server to expose metrics, queue a task,
# then run the worker in-process (app.start() blocks, so enqueue first)
if __name__ == '__main__':
    start_http_server(8000)
    add.delay(4, 6)
    app.start()
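Under the hood, the Counter and Gauge used above are essentially thread-safe numbers that the metrics server exposes. A minimal stdlib model of their behaviour (ToyMetric and ToyGauge are my illustrative names, not prometheus_client classes):

```python
import threading

# Minimal stdlib model of prometheus_client's Counter and Gauge: both
# are thread-safe numbers; a Counter only ever goes up, while a Gauge
# can go up and down. Illustration only, not the real library.
class ToyMetric:
    def __init__(self):
        self._value = 0.0
        self._lock = threading.Lock()

    def inc(self, amount=1):
        with self._lock:
            self._value += amount

    @property
    def value(self):
        return self._value

class ToyGauge(ToyMetric):
    def dec(self, amount=1):
        self.inc(-amount)

task_counter = ToyMetric()
task_in_progress = ToyGauge()

task_counter.inc()        # a task started
task_in_progress.inc()
# ... task runs ...
task_in_progress.dec()    # and finished

print(task_counter.value, task_in_progress.value)  # 1.0 0.0
```

This is why the task above increments the counter once but increments *and* decrements the gauge: the counter tracks lifetime totals, the gauge tracks the current in-flight count.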
2. Running Celery Workers
- Start the Celery worker:
celery -A my_app_name worker --loglevel=info
- Make sure the workers are running: Check the status of your workers by running:
celery -A my_app_name status
3. Setting Up Flower for Monitoring Celery
Next, we set up Flower for Celery. Flower is the tool that gives you a visual dashboard for Celery.
- Install Flower:
pip install flower
- Run Flower once you’re sure the Celery workers are active:
celery -A my_app_name flower --port=5555
- Access Flower: navigate to http://localhost:5555 in your browser to monitor your Celery tasks.
4. Integrating Prometheus for Metrics Collection
- Install Prometheus Client:
pip install prometheus_client
- Expose Metrics in Your App: I added a /metrics endpoint to the main Python file to record request timings:
from prometheus_client import start_http_server, Summary
import time
import random

REQUEST_TIME = Summary('request_processing_seconds', 'Time spent processing request')

@REQUEST_TIME.time()
def process_request(t):
    time.sleep(t)

if __name__ == '__main__':
    start_http_server(8000)
    while True:
        process_request(random.random())
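The @REQUEST_TIME.time() decorator is doing something you can sketch in plain Python: time each call and record the duration. A rough stdlib stand-in (ToySummary is illustrative, not the real Summary class):

```python
import functools
import time

# Rough stdlib equivalent of @REQUEST_TIME.time(): a Summary keeps a
# count of observations and their running total, and the decorator
# times every call to the wrapped function.
class ToySummary:
    def __init__(self):
        self.count = 0
        self.total = 0.0

    def observe(self, seconds):
        self.count += 1
        self.total += seconds

    def time(self):
        def decorator(func):
            @functools.wraps(func)
            def wrapper(*args, **kwargs):
                start = time.perf_counter()
                try:
                    return func(*args, **kwargs)
                finally:
                    # record the duration even if the call raises
                    self.observe(time.perf_counter() - start)
            return wrapper
        return decorator

REQUEST_TIME = ToySummary()

@REQUEST_TIME.time()
def process_request(t):
    time.sleep(t)

process_request(0.01)
print(REQUEST_TIME.count)  # 1
```

From count and total, Prometheus can later derive averages and rates, which is exactly what request_processing_seconds gives you on the /metrics endpoint.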
- Prometheus Configuration: I then configured Prometheus to scrape metrics from the application in a prometheus.yml file (create a new file called prometheus.yml):
global:
  scrape_interval: 15s
  evaluation_interval: 15s
scrape_configs:
  - job_name: prometheus
    static_configs:
      - targets: ['localhost:9090']
  - job_name: flower
    static_configs:
      - targets: ['localhost:5555']
- Start Prometheus:
prometheus --config.file=prometheus.yml
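One thing worth noting: the tasks.py earlier exposes its metrics on port 8000 via start_http_server, but the config so far only scrapes Prometheus itself and Flower. Assuming the exporter is reachable at localhost:8000, a third scrape job along these lines (the job name is my choice) would pick those metrics up:

```yaml
scrape_configs:
  # ...existing prometheus and flower jobs...
  - job_name: celery_app            # any label you like
    static_configs:
      - targets: ['localhost:8000'] # start_http_server(8000) in tasks.py
```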
5. Setting Up Grafana for Visualisation
Let’s make the data look pretty with Grafana (trust me, it’s prettier than Flower).
- Install Grafana: Download and install Grafana for your OS, following the instructions on the Grafana website.
- Add Prometheus as a Data Source: Navigate to Grafana’s web interface at http://localhost:3000, then go to Configuration > Data Sources and add Prometheus with the URL http://localhost:9090.
- Create Dashboards: Create custom dashboards using the metrics exposed by Prometheus.
6. Setting Up Sentry for Error Tracking
Errors are all too important for understanding what is going on when things are not working as they ought to. Let’s use Sentry to track them.
- Install Sentry SDK:
pip install sentry-sdk==2.11.0
- Integrate Sentry in Your App: Initialize Sentry in your app’s main entry point (settings.py for Django, in my case):
import sentry_sdk
from sentry_sdk.integrations.django import DjangoIntegration

sentry_sdk.init(
    dsn="YOUR_SENTRY_DSN",
    integrations=[DjangoIntegration()],
    traces_sample_rate=1.0,
    send_default_pii=True
)
7. Running Everything Concurrently with Docker
- Dockerize the Setup: Give each service (Celery, Redis, Flower, Prometheus, Grafana) its own container by creating a docker-compose.yml file to manage them together.
version: '3'
services:
  redis:
    image: "redis:alpine"
  web:
    build: .
    command: python manage.py runserver 0.0.0.0:8000
    volumes:
      - .:/code
    ports:
      - "8000:8000"
    depends_on:
      - redis
  celery:
    build: .
    command: celery -A my_app_name worker --loglevel=info
    volumes:
      - .:/code
    depends_on:
      - redis
  flower:
    build: .
    command: celery -A my_app_name flower --port=5555
    ports:
      - "5555:5555"
    depends_on:
      - redis
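The compose file above covers Redis, the web app, the worker, and Flower. To get Prometheus and Grafana into the same environment, services roughly like these can be added under `services:` (the images are the official ones; the prometheus.yml mount path is an assumption about your project layout):

```yaml
  prometheus:
    image: prom/prometheus
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
    ports:
      - "9090:9090"
  grafana:
    image: grafana/grafana
    ports:
      - "3000:3000"
    depends_on:
      - prometheus
```

Note that inside a compose network, scrape targets like localhost:5555 in prometheus.yml should become service names, e.g. flower:5555, since each container has its own localhost.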
- Start the Environment with this command:
docker-compose up
- Monitor and Troubleshoot: Here are a few ways to monitor and troubleshoot:
- Use multiple terminals to monitor logs, metrics, and services.
- Regularly check Prometheus and Grafana dashboards for system health.
- Monitor Celery tasks through Flower’s web interface.
This setup covers the primary steps needed to integrate Celery with Flower, Prometheus, Grafana, and Sentry to make for a robust and observable environment.
Lessons Learned: My Troubleshooting Diary
Alright, I have been itching to tell you the gist of it — I find that the bulk of my learning happened in the mistakes I made or the errors I ran into. Here are some of them:
Redis ghosted me
You know that moment when you’ve set up everything perfectly, you hit “run,” and then nothing works? That was me — I’d spent ages configuring Celery and its components, and Redis had stopped running.
What I learned: Always, always check if Redis is actually running. If you’re using Docker (like I did), spin up a Redis container with:
docker run -p 6379:6379 redis
Pro tip: Make sure Redis is alive on port 6379 before you start pulling your hair out.
Grafana and Prometheus were not getting along…
The way integrations work is that all the tools should depend on each other to give you the value you’re looking for. Prometheus typically feeds Grafana with data and it just wasn’t working. I may or may not have screamed at my three screens a few times — I’d set everything up correctly! (Or not).
What I learned:
a. Verify Prometheus is Running: First, check if Prometheus is active at http://localhost:9090. If it’s not, that’s your starting point.
b. Add Prometheus as a Data Source: In Grafana, go to Configuration > Data Sources > Add Prometheus and ensure the URL is set to http://localhost:9090.
c. Check Network Connectivity: If Grafana still can’t connect, there might be network issues or firewall rules blocking communication. Verify that both tools are on the same network and that no firewalls are interfering.
d. Restart and Recheck: Sometimes, a simple restart of the Docker containers for both Prometheus and Grafana can resolve connection issues. If you still have issues, inspect logs for any errors.
Pro Tip: Be almost paranoid about your configurations pointing to the right URLs and ports. A small typo can sometimes be the culprit.
Docker was hugging my CPU
Docker is usually a lifesaver, but not when it makes your laptop sound like a jet engine. I was in the middle of my environment setup when my CPU was maxed out, my fan was blasting, and my machine was overheating. Not fun.
Here’s what I did:
a. Stop Unnecessary Containers: First, check what’s running. Use docker ps to list all containers and docker stop <container_id> for any you don’t need. This freed up some resources right away.
b. Manage Container Restarts: If Docker keeps restarting, it could be due to auto-restart settings. Check Docker’s settings to ensure only necessary containers have restart policies.
c. Manually Stop Docker: Sometimes, you need to take more drastic measures. I found that manually stopping Docker Desktop from the application itself helped calm things down.
Pro Tip: Keep an eye on Docker’s resource usage (the activity monitor on your Mac is your friend). If your CPU starts spiking, it’s usually because of a runaway container. Docker is powerful, but it can easily overrun your system if left unchecked.
VSCode couldn’t find my Python Packages
I had just installed a Python package and VS Code insisted it couldn’t find it when I tried to deploy my configuration. Wahala!
Here’s what was really happening:
a. Virtual Environment Confusion: VS Code was using a different Python interpreter than the one in my active virtual environment.
b. Cached Python Interpreter: VS Code sometimes caches the Python interpreter path, which might not reflect recent changes to your environment. Despite all my changes, it refused to acknowledge the new packages.
c. Workspace Settings: My VS Code workspace settings pointed to an outdated Python interpreter, probably from an old project I’d forgotten about. Pointing to the wrong Python interpreter can happen if you have conflicting configurations.
The solution: Make sure your virtual environment is not just created but activated. When in doubt:
pip install -r requirements.txt
And if VS Code is still stubborn, try reopening your project in a containerized environment. It will give your code editor a fresh start.
Pro Tip: Always double-check which Python interpreter VS Code is using. It’s in the bottom left corner of the VS Code window. If it’s not pointing to your virtual environment, that’s your culprit!
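A quick way to settle interpreter arguments with VS Code is to ask Python directly, from the same terminal VS Code opens: which executable is running, and can it see the package? A small stdlib sketch (is_installed is a helper name I made up):

```python
import importlib.util
import sys

# Print the interpreter actually running this script; if it is not the
# one inside your virtual environment (e.g. .venv/bin/python), VS Code
# and pip are talking to different Pythons.
print(sys.executable)

# Check whether a package is importable from *this* interpreter,
# without actually importing it. 'json' stands in for any package.
def is_installed(name):
    return importlib.util.find_spec(name) is not None

print(is_installed("json"))            # True: stdlib is always there
print(is_installed("not_a_real_pkg"))  # False
```

If sys.executable doesn't point into your virtual environment, that mismatch, not the package, is the problem.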
Flower Power… or Lack Thereof
I had finished deploying Celery and was ready for Flower to monitor the tasks, but it just wasn’t working.
What worked for me:
a. Check your Celery Workers to make sure they’re up and running by running this command:
celery -A <my_app_name> status
Replace <my_app_name> with the name of your Celery application. This command will list all active workers and their status, showing whether they are connected and processing tasks.
If the workers are running, you’ll see their names and details like the number of tasks each has handled. If you see an error or no workers listed, that means they aren’t running, and you may need to start them with:
celery -A <my_app_name> worker --loglevel=info
This command will start the worker processes, allowing them to handle tasks.
b. Then, with the confidence of someone who definitely hasn’t made this mistake before, run:
celery -A tasks flower --port=5555
c. Check the Flower URL after starting Flower. Make sure you’re accessing the correct URL — it should be something like http://localhost:5555.
The Ghost of Sentry Past
While setting up Sentry for error monitoring, I encountered persistent errors about a missing sentry_sdk.
What I did: After some investigation, I realized that the sentry_sdk package wasn't listed in my requirements.txt file. Since it wasn't installed, my app couldn't use Sentry, leading to the errors. To resolve this, I added sentry-sdk==2.11.0 to requirements.txt and ran:
pip install -r requirements.txt
This installed the SDK and cleared up the errors immediately.
Pro Tip: Always double-check that all required packages are listed in requirements.txt to avoid these kinds of issues.
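You can also catch this class of problem early with a small stdlib check that walks a requirements list and reports anything that isn't installed. The list is inline here for illustration; in practice you'd read it from requirements.txt:

```python
from importlib import metadata

# Check that every package pinned in a requirements list is actually
# installed in the current interpreter. The entries are examples.
requirements = ["sentry-sdk==2.11.0", "definitely-not-installed==0.0"]

missing = []
for req in requirements:
    name = req.split("==")[0]          # drop the version pin
    try:
        metadata.version(name)         # raises if not installed
    except metadata.PackageNotFoundError:
        missing.append(name)

print(missing)
```

Running something like this at startup (or in CI) turns a mysterious mid-deploy import error into an explicit "these packages are missing" message.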
Docker Network Conflicts
During the setup, I ran into networking issues where ports were already in use (this happened because I’d played around with the same ports from another task), and containers couldn’t communicate as expected. These conflicts can be tricky to diagnose and disrupt the entire workflow, especially when trying to connect multiple services like Redis, Celery, and Flower.
Untangling the mess:
- Before you docker run, make sure the coast is clear. Use docker ps to see what's running.
- If a port is taken, don’t be afraid to use a different one. For example, consider switching from -p 8000:8000 to -p 8080:8000 if 8000 is busy with some other tool you’ve played around with.
Pro Tip: Regularly monitor Docker’s running containers to avoid network conflicts, especially when reusing ports.
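You can also check whether a port is already taken before handing it to docker run, straight from Python's stdlib (port_in_use is a helper name I made up):

```python
import socket

# Check whether a TCP port on localhost is already bound, before
# handing it to `docker run -p`. connect_ex returns 0 when something
# accepts the connection, i.e. the port is taken.
def port_in_use(port, host="127.0.0.1"):
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        return sock.connect_ex((host, port)) == 0

# Demonstrate a "taken" port by binding a listener ourselves.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))     # port 0: let the OS pick a free one
server.listen(1)
taken_port = server.getsockname()[1]

print(port_in_use(taken_port))    # True: we are listening on it
server.close()
```

Calling port_in_use(6379) or port_in_use(5555) before docker-compose up tells you in half a second whether Redis or Flower will collide with something already running.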
The Multi-Terminal Juggling Act
As the complexity of the setup grew, so did the number of terminals I needed open at once. Managing multiple processes like Celery workers, Flower, Redis, and Grafana required keeping track of different commands in separate terminal windows, which can quickly become overwhelming.
What I learned:
1. Use Multiple Terminals: Open separate terminals for each service (e.g., Celery, Flower, Redis).
2. Stay Organized: Keep track of what each terminal is doing to avoid confusion.
This one made me feel like a proper hacker from a 90s movie — remember the one where Angelina Jolie had that red leather top on with spiky hair?