On the website I work for, when a user uploads an image for an ad, we generally keep three versions of that image, each a different size, referred to simply as ‘small’, ‘main’ and ‘large’. At the moment, these resized images (I’ll call them ‘thumbnails’ for simplicity) are generated the first time they are requested by a client (then cached), so that the script that handles the upload can return its ‘success’ response as early as possible instead of taking extra time to generate the thumbnails. What Beanstalkd allows us to do is put a job on a queue (in our case a ‘generate thumbnails’ job), where it’ll be picked up at some point in the future by another script that polls the queue and executes in its own separate process. So my uploading script is only delayed by, say, the 0.1 seconds it takes to put a job on the queue, as opposed to the 1 second it takes to execute the job (i.e. generate the thumbnails). This blog post is about how I got the whole thing to work on an Ubuntu 12.04 server, using PHP.
This post was largely inspired by an article on the blog Context With Style, which was written for a Mac. I’m also going to use their examples of a queue filler script, to populate the queue, and a worker script, to pull jobs from the queue and process them. I recommend you read that post for a fuller picture.
One other thing: most of these UNIX commands need to be run as root, so I’ll assume you’re in super-user mode.
Beanstalkd
Installing Beanstalkd is pretty straightforward:
    apt-get install beanstalkd
We don’t need to start it just yet, but for reference, to run it you can do
    beanstalkd -l 127.0.0.1 -p 11300
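If you do start it by hand, a quick way to confirm it is answering is to send it the protocol's stats command and pull one value out of the reply. This is only a rough sketch: stat_value is a helper name I made up, and the commented usage line assumes a netcat build that supports -q (as on Ubuntu), so adjust it for your nc.

```shell
#!/bin/sh
# stat_value KEY: read a beanstalkd "stats" reply on stdin and print the
# value for KEY. (stat_value is a hypothetical helper, not part of beanstalkd.)
stat_value() {
    # The reply is a list of "key: value" lines; strip any trailing CR.
    awk -F': ' -v key="$1" '$1 == key { sub(/\r$/, "", $2); print $2 }'
}

# Usage against a running daemon (assumes nc supports -q):
#   printf 'stats\r\n' | nc -q 1 127.0.0.1 11300 | stat_value current-jobs-ready
```

On a freshly started daemon I would expect the ready-job count to be zero, and to go up after the queue has been filled and before the worker drains it.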
Pheanstalk
Pheanstalk is a PHP package for talking to a Beanstalkd daemon. I downloaded the zip from GitHub and extracted it to a ‘pheanstalk’ folder in my main include folder. To use it, I simply do
    require_once 'pheanstalk/pheanstalk_init.php';
    // note how we use 'Pheanstalk_Pheanstalk' instead of 'Pheanstalk',
    // and how we omit the port in the constructor (as 11300 is the default)
    $pheanstalk = new Pheanstalk_Pheanstalk('127.0.0.1');
Going by the examples in the Context With Style article, we’ll call the script under the section “Pushing things into the queue” fill_queue.php, and the script under “Picking up things from the queue” worker.php. They act as good guides to how to put stuff into and get stuff out of Beanstalkd via Pheanstalk.
So, the idea is that we’ll have our worker.php running non-stop (via daemontools, see next section), polling the queue for new jobs. Once we know worker.php is ready, we can manually run fill_queue.php from the command line to populate the queue. The worker should then go through the queue, writing the data it reads to a log file in ./log/worker.txt. There may be some permissions issues here; it probably depends on how permissions are set up for your project.
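For what it's worth, a job can also be pushed onto the queue without PHP at all, which can help when working out whether a problem lies in the worker or in the filler script. The sketch below builds the raw protocol lines for a single put. The helper name put_frame, the priority/delay/TTR values and the netcat usage line are all my own assumptions, and the worker will only see the job if it is watching the same tube.

```shell
#!/bin/sh
# put_frame PAYLOAD [TUBE]: print the raw beanstalkd protocol lines that
# queue PAYLOAD as one job. (put_frame is a hypothetical helper name.)
put_frame() {
    payload=$1
    tube=${2:-}
    # The protocol wants the body length in bytes, not characters.
    bytes=$(printf '%s' "$payload" | wc -c | tr -d ' ')
    # Without a "use" line the job lands in the tube named "default".
    if [ -n "$tube" ]; then
        printf 'use %s\r\n' "$tube"
    fi
    # put <priority> <delay> <ttr> <bytes>, then the body.
    # The numbers here are arbitrary illustrative values.
    printf 'put 1024 0 60 %s\r\n%s\r\n' "$bytes" "$payload"
}

# Usage against a running daemon (assumes nc supports -q):
#   put_frame 'hello from the shell' | nc -q 1 127.0.0.1 11300
```

If the daemon accepts the job it should answer with a line starting with INSERTED, followed by the new job's id.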
Daemontools
First up, we need to install daemontools:
    apt-get install daemontools
You don’t actually interact with a daemontools process; you use things that begin with ‘sv’, such as svscan or svscanboot. These work by looking in a folder called /etc/service/, which you have to create, and scanning it for project folders you add yourself. Once svscan detects that a project folder has been created in /etc/service, it adds a supervise folder to it; you in turn create a shell script called run in the project folder, which daemontools will run and monitor for you. Don’t worry, all these steps are outlined below!
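To make that layout concrete, here is a throwaway sketch that builds the same folder structure in a scratch directory rather than in /etc/service, so it can be run without root and without daemontools. SERVICE_ROOT and the sleep placeholder are my own stand-ins, not anything daemontools defines.

```shell
#!/bin/sh
# Sketch only: SERVICE_ROOT is a scratch stand-in for /etc/service, so this
# is safe to run without root and without daemontools installed.
SERVICE_ROOT=${SERVICE_ROOT:-$(mktemp -d)}
PROJECT=my-project

# One folder per service...
mkdir -p "$SERVICE_ROOT/$PROJECT"

# ...containing an executable 'run' script. The command is a placeholder;
# exec makes it replace the shell, so the supervisor watches it directly.
cat > "$SERVICE_ROOT/$PROJECT/run" <<'EOF'
#!/bin/sh
exec sleep 60
EOF
chmod 755 "$SERVICE_ROOT/$PROJECT/run"

# Show what was built. Under a real /etc/service, a 'supervise' folder
# would also appear here once the service is picked up.
ls -l "$SERVICE_ROOT/$PROJECT"
```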
Anyway, now that we’ve installed daemontools, we need to create an Upstart config file for svscan and start it, as well as create our /etc/service directory. Some of these tips are thanks to this post.
    # create the config file for svscan:
    cd /etc/init
    touch svscan.conf
    # add some commands into it:
    echo "start on runlevel [2345]" > svscan.conf
    echo "" >> svscan.conf
    echo "expect fork" >> svscan.conf
    echo "respawn" >> svscan.conf
    echo "exec svscanboot" >> svscan.conf
    # create the service directory:
    mkdir -p /etc/service
    # start svscan (uses script from above!):
    service svscan start
Hopefully, if you now do a ps aux | grep sv, you’ll see at least svscan running.
Next, I’m going to create my run file, which is a shell script that’ll start Beanstalkd and our worker script. I’ll place this in my example /var/www/my-project folder, along with my worker.php, fill_queue.php and log/worker.txt files. I’ll then create a my-project service folder and symlink my run file into there.
    cd /var/www/my-project
    touch run
    # must be executable:
    chmod 755 run
    # single quotes, so an interactive bash doesn't try to expand the '!':
    echo '#!/bin/sh' > run
    # to start beanstalkd process:
    echo "beanstalkd -l 127.0.0.1 -p 11300 &" >> run
    # to start our worker process:
    echo "php /var/www/my-project/worker.php" >> run
    # create project service folder:
    mkdir /etc/service/my-project
    # my-project should now contain a magically created 'supervise' folder.
    # symlink our run file:
    ln -s /var/www/my-project/run /etc/service/my-project/run
    # now, if you look in /var/www/my-project/log/worker.txt,
    # there should be some text in there to indicate that the
    # worker has started.
    # run the fill queue script:
    php fill_queue.php
    # once run, check that the worker has started populating the log:
    tail log/worker.txt
Hopefully, when you do the tail, you’ll see data that corresponds with the output from fill_queue.php. This indicates that your worker is running and polling the queue for new jobs. If you re-run fill_queue.php, your log file should grow accordingly.
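If the log stays empty, the svstat command that comes with daemontools reports whether the service is up and for how long. An uptime that never gets past a second or two usually means run keeps exiting and being restarted. Here is a small sketch for pulling that number out. It assumes svstat prints its usual "DIR: up (pid N) S seconds" line, and svstat_uptime is a name I made up.

```shell
#!/bin/sh
# svstat_uptime: read svstat output on stdin and print the number of seconds
# the service has been up. Prints nothing if it is not reported as up.
# (svstat_uptime is a hypothetical helper, not part of daemontools.)
svstat_uptime() {
    sed -n 's/^.*: up (pid [0-9][0-9]*) \([0-9][0-9]*\) seconds.*$/\1/p'
}

# Usage on the real service (needs daemontools and root):
#   svstat /etc/service/my-project | svstat_uptime
```

Running it a few times in a row and watching whether the number climbs is a cheap way to tell a stable worker from one that is stuck in a restart loop.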