Beanstalkd, Pheanstalk and Daemontools on Ubuntu

20th March, 2013 - Posted by david

On the website I work for, when a user uploads an image for an ad, we generally keep 3 versions of that image, each a different size, simply referred to as ‘small’, ‘main’ or ‘large’. At the moment, these resized images (I’ll call them ‘thumbnails’ for simplicity) are generated the first time they are requested by a client (then cached), so that the script that handles the upload can return its ‘success’ response as early as possible, instead of taking extra time to generate the thumbnails.

What Beanstalkd allows us to do is put a job on a queue (in our instance a ‘generate thumbnails’ job), where it’ll be picked up at some point in the future by another script that polls the queue and executes in its own separate process. So, my uploading script is only delayed by, say, the 0.1 seconds it takes to put a job on the queue, as opposed to the 1 second it takes to execute the job (i.e. generate the thumbnails). This blog post describes how I got the whole thing to work on an Ubuntu 12.04 server, using PHP.

This post was largely inspired by an article on the blog Context With Style, which was written for a Mac. I’m also going to use their example of a queue filler script, which populates the queue, and a worker script, which pulls jobs from the queue and processes them. I recommend you read that post for a better idea of how the pieces fit together.

One other thing: most of these UNIX commands need to be run as root, so I’ll assume you’re in super-user mode.

Beanstalkd

Installing Beanstalkd is pretty straightforward:

apt-get install beanstalkd

We don’t need to start it just yet, but for reference, to run it you can do

beanstalkd -l 127.0.0.1 -p 11300

Pheanstalk

Pheanstalk is a PHP package for interfacing with a beanstalkd daemon. I downloaded the zip from GitHub and extracted it to a ‘pheanstalk’ folder in my main include folder. Then, to use it, I simply do

require_once 'pheanstalk/pheanstalk_init.php';
// note how we use 'Pheanstalk_Pheanstalk' instead of 'Pheanstalk',
// and how we omit the port in the constructor (as 11300 is the default)
$pheanstalk = new Pheanstalk_Pheanstalk('127.0.0.1');

Going by the examples in the Context With Style article, we’ll call the script under the section “Pushing things into the queue” fill_queue.php, and the script under “Picking up things from the queue” worker.php. They’re good guides to how to put stuff in and get stuff out of Beanstalkd via Pheanstalk.

So, the idea is we’ll have our worker.php running non-stop (via daemontools, see next section), polling the queue for new jobs. Once we know our worker.php is ready, we can manually run fill_queue.php from the command line to populate the queue. The worker should then go through the queue, writing the data it reads to a log file in ./log/worker.txt. There may be some permissions issues here, depending on how you have the permissions for your project set up.
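If you’re curious what actually goes over the wire when a job is put on the queue, beanstalkd speaks a simple text protocol over TCP. Here’s a minimal shell sketch that builds a put command by hand. The function name beanstalk_put_frame and the priority, delay and time-to-run values are just illustrative choices of mine, not something taken from the scripts mentioned above.

```shell
#!/bin/sh
# Build the wire format of a beanstalkd "put" command:
#   put <priority> <delay> <ttr> <bytes>\r\n<body>\r\n
# Arguments: body [priority] [delay] [ttr]
# (the defaults below are illustrative assumptions)
beanstalk_put_frame() {
    body=$1
    pri=${2:-1024}
    delay=${3:-0}
    ttr=${4:-60}
    # beanstalkd needs the exact byte length of the job body
    bytes=$(printf '%s' "$body" | wc -c | tr -d '[:space:]')
    printf 'put %s %s %s %s\r\n%s\r\n' "$pri" "$delay" "$ttr" "$bytes" "$body"
}

# Show the frame for a 5-byte job body
beanstalk_put_frame 'hello'
```

Piping that output into something like nc 127.0.0.1 11300 (assuming netcat is installed; the flags needed to make it exit afterwards vary between versions) should get back a line starting with INSERTED, followed by the job id.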

Daemontools

First up we need to install daemontools, which is done with

apt-get install daemontools

You don’t actually interact with a single daemontools process; you use programs that begin with ‘sv’, such as svscan or svscanboot. These work by looking in a folder called /etc/service/, which you have to create, and scanning it for project folders you add yourself. Once svscan detects that a project folder has been created in /etc/service, it adds a supervise folder to it; you in turn create a shell script called run in the project folder, which daemontools will run and monitor for you. Don’t worry, all these steps are outlined below!
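To make that layout a bit more concrete, here’s a rough sketch that builds the same structure under a throwaway temp directory rather than the real /etc/service, so it’s safe to try. The demo_root variable is made up for this example, my-project is the example project name used later in this post, and the supervise folder is left out because daemontools creates that itself.

```shell
#!/bin/sh
# Build a stand-in for /etc/service under a temp dir (demo_root is a
# hypothetical name) so nothing on the real system is touched.
demo_root=$(mktemp -d)
mkdir -p "$demo_root/service/my-project"

# The run script is what daemontools would execute and keep alive.
cat > "$demo_root/service/my-project/run" <<'EOF'
#!/bin/sh
beanstalkd -l 127.0.0.1 -p 11300 &
php /var/www/my-project/worker.php
EOF
chmod 755 "$demo_root/service/my-project/run"

# List the resulting layout relative to the temp dir:
#   /service
#   /service/my-project
#   /service/my-project/run
find "$demo_root/service" | sed "s|^$demo_root||" | sort

# Clean up afterwards with: rm -rf "$demo_root"
```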

Anyways, now that we’ve installed daemontools, we need to create a run script for it and then run it, as well as create our /etc/service directory. Some of these tips are thanks to this post.

# create the config file for svscan:
cd /etc/init
touch svscan.conf
# add some commands into it:
echo "start on runlevel [2345]" > svscan.conf
echo "" >> svscan.conf
echo "expect fork" >> svscan.conf
echo "respawn" >> svscan.conf
echo "exec svscanboot" >> svscan.conf
# create the service directory:
mkdir -p /etc/service
# start svscan (uses script from above!):
service svscan start

Hopefully, now if you do a ps aux | grep sv, you’ll see at least svscan running.
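One small annoyance with grepping the ps output is that the grep process itself can show up in the results. As an alternative, here’s a little sketch of a helper that matches on the exact command name instead. The is_running function is my own invention for this post, not part of daemontools.

```shell
#!/bin/sh
# Succeeds if a process whose command name is exactly $1 is running.
# Some systems print a full path in the comm column, so only the last
# path component is compared.
is_running() {
    ps -e -o comm= 2>/dev/null | awk -v name="$1" '
        { n = split($1, parts, "/"); if (parts[n] == name) found = 1 }
        END { exit !found }'
}

if is_running svscan; then
    echo "svscan is running"
else
    echo "svscan is not running"
fi
```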

Next, I’m going to create my run file, which is a shell script that’ll start Beanstalkd and our worker script. I’ll place this in my example /var/www/my-project folder, along with my worker.php, fill_queue.php and log/worker.txt files. I’ll then create a my-project service folder and symlink my run file into there.

cd /var/www/my-project
touch run
# must be executable:
chmod 755 run
# single quotes stop an interactive bash from treating the '!' as history expansion:
echo '#!/bin/sh' > run
# to start beanstalkd process:
echo "beanstalkd -l 127.0.0.1 -p 11300 &" >> run
# to start our worker process (worker.php lives in the project folder):
echo "php /var/www/my-project/worker.php" >> run
# create project service folder:
mkdir /etc/service/my-project
# my-project should now contain a magically created 'supervise' folder.
# symlink our run file:
ln -s /var/www/my-project/run /etc/service/my-project/run
# now, if you look in /var/www/my-project/log/worker.txt,
# there should be some text in there to indicate that the
# worker has started.
# run the fill queue script:
php fill_queue.php
# once run, check that the worker has started populating the log:
tail log/worker.txt

Hopefully when you do the tail, you’ll see data that corresponds with the output from fill_queue.php. This will indicate that your worker is running, polling the queue for new jobs. If you re-run fill_queue.php, your log file should expand accordingly.
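Another way to check on things is to ask beanstalkd itself how many jobs are still waiting, via its stats command, which returns a list of "name: value" lines. Here’s a rough sketch of pulling one field out of that text. The stats_field helper is hypothetical and the sample values are made up purely for illustration; only field names like current-jobs-ready come from beanstalkd.

```shell
#!/bin/sh
# Print the value of one field from the text of a beanstalkd "stats"
# response read on stdin.
stats_field() {
    awk -v key="$1:" '$1 == key { print $2 }' | tr -d '\r'
}

# Made-up sample of what a stats response might contain
sample_stats='---
current-jobs-urgent: 0
current-jobs-ready: 3
current-jobs-reserved: 1'

# prints 3
printf '%s\n' "$sample_stats" | stats_field current-jobs-ready
```

Against a live daemon, the input would come from sending stats followed by a carriage return and newline to port 11300, for example with netcat, instead of from the sample text. If the worker is keeping up, the ready count should drop back to 0 shortly after fill_queue.php finishes.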

Tags: beanstalkd daemontools linux pheanstalk php
