Run PHP asynchronously in its own threads
When you start a PHP script, it blocks until it is done. Lately my customers and I were facing a huge problem: pictures were uploaded to a gallery web app, and the bottleneck was the thumbnail generation.
The thumbnails were generated when someone opened the gallery (frontend). This resulted in long page loads and sometimes strange errors. Sometimes the page didn't load at all because Apache was working heavily on the thumbnails. It got even worse when the gallery was visited by many people at the same time, since the gallery app created 4 thumbnails in different sizes out of every image.
There are better ways to handle this. I could generate the thumbnails in the backend and show them on the frontend once they are ready. The problem is that I am using Owncloud as the backend, and it is a pain in the ass to extend Owncloud… but that's another story.
So, how did I manage to get rid of the problem and make the thumbnail generation fast and non-blocking?
Did I extend Owncloud? Nah… it was really simple: you don't need to install any software, and it even works with PHP 5.3 (although I recommend using PHP 7.x today when possible).
The solution: let a queue script run all the time, which checks whether something needs to be done and starts the workers in their own processes (without blocking anything).
What you need:
Linux, and you need to be allowed to use the exec() or system() functions from PHP. I tested it with Debian/Ubuntu and Gentoo Linux. It may work on Windows servers too, but would probably need some modifications.
Workflow
Here is how it works:
I run the queue.php script by hand in its own process; it runs all the time and checks the database every 5 seconds for available jobs. One idea was to skip this step and use a cronjob instead, but a cronjob can only run every minute, which is too long for time-critical tasks.
We run queue.php in its own process by simply calling it like this:
php queue.php &
That's all; we can now close the terminal window. Make sure to disable the script timeout by using set_time_limit(0).
The script checks in the database whether there is something to do. I simply execute a resource-friendly SQL query like this one:
SELECT id FROM queue WHERE status=0 LIMIT 1
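For completeness, here is a minimal sketch of how that check could look from PHP, assuming a PDO connection and a queue table where status = 0 means "not processed yet" (the connection details and column semantics are my assumptions):

// minimal sketch: check whether at least one open job exists
// assumes a MySQL queue table where status = 0 means "open" (my assumption)
$db  = new PDO('mysql:host=localhost;dbname=gallery', 'user', 'password');
$row = $db->query('SELECT id FROM queue WHERE status=0 LIMIT 1')->fetch(PDO::FETCH_ASSOC);
$hasWork = ($row !== false); // fetch() returns false when the result set is empty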
If we get at least one task we could start jobcentre.php, BUT we should also check whether jobcentre.php is already running so we don't start it multiple times. I do this by running the following check from PHP:
$command = "ps -cax | grep 'jobcentre\|worker' | grep -o '^[ ]*[0-9]*'";
exec($command, $output, $return);

if ($return != 0) {
    // no running workers found, start the jobcentre now:
    $cmd = 'nohup nice -n 10 php jobcentre.php 2> /dev/null > mylogs.log & printf "%u" $!';
    $pid = shell_exec($cmd);
}
As you can see, I use ps -cax and grep. It simply checks whether jobcentre.php or worker.php is running (you need to change these names according to what you use). BTW: the name "worker" might be used by some other Linux process, so I recommend renaming the script to something more unique or making the grep call more specific. In my own case the script name is thumbgen.php.
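If you want the check to be more specific, one option (my variant, not what the setup above uses) is pgrep -f, which matches against the full command line:

// alternative check with pgrep -f (my variant): matches the full command line,
// so unrelated processes that merely contain the word "worker" are not picked up
exec("pgrep -f 'jobcentre.php|worker.php'", $output, $return);
$alreadyRunning = ($return === 0); // pgrep exits with 0 if at least one process matched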
When the script finds at least one job to do, it then starts jobcentre.php in the background, which then starts the workers.
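Putting the pieces together, the main loop of queue.php could look roughly like this (a sketch under my assumptions; the database credentials, the mylogs.log file and the 5-second interval are placeholders):

// queue.php - sketch of the endless dispatcher loop described above
set_time_limit(0); // this script should never time out

$db = new PDO('mysql:host=localhost;dbname=gallery', 'user', 'password');

while (true) {
    // 1. is there at least one open job?
    $job = $db->query('SELECT id FROM queue WHERE status=0 LIMIT 1')->fetch();

    if ($job !== false) {
        // 2. is the jobcentre (or one of its workers) already running?
        $output = array();
        exec("ps -cax | grep 'jobcentre\|worker' | grep -o '^[ ]*[0-9]*'", $output, $return);

        if ($return != 0) {
            // 3. nothing is running yet - start the jobcentre in its own process
            $cmd = 'nohup nice -n 10 php jobcentre.php 2> /dev/null > mylogs.log & printf "%u" $!';
            $pid = shell_exec($cmd);
        }
    }

    sleep(5); // check again in 5 seconds
}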
The “magical” part is where jobcentre.php is launched in its own process. I use the Linux nohup nice -n 10 … method. I tried other methods, but this one worked best for me.
The file jobcentre.php is just what the name says: it is similar to a real job centre. It knows about the available jobs and can hand them to workers who want to work. But you can name it whatever you like. Or rather, it doesn't actually hand the jobs to the workers; it simply starts the workers, and the workers look for available jobs on their own and assign them to themselves.
It is similar to queue.php, but this script starts worker.php – also in its own process, without blocking anything. It starts as many workers as you need; how many really depends on your system resources. Start with maybe 3 workers and then try more. Personally I use 8 workers for one of our bigger projects.
So here is what jobcentre.php could look like:
$workers_to_start = 8;
$last_worker_id   = 0; // initialize the worker id counter

for ($worker = 0; $worker < $workers_to_start; $worker++) {
    $last_worker_id++;
    echo "Starting Worker " . ($worker + 1);

    $workerDir    = __DIR__ . "/";
    $workerScript = $workerDir . "worker.php";
    $logFile      = $workerDir . "log_worker_{$last_worker_id}.txt";

    // launch the worker in its own process, pass it its id and collect its PID
    $cmd = 'nohup nice -n 10 php ' . $workerScript . ' -- -i' . $last_worker_id . ' > ' . $logFile . ' & printf "%u" $!';
    //echo $cmd . "\n";
    $pid = shell_exec($cmd);
    sleep(1);
    echo " - PID=" . $pid . "\n";
}
The worker.php script can do whatever you want. In my case it generated the thumbnails. It consists of an endless loop which picks one unassigned task from the queue. If there are no more tasks available, it just exits.
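The original post doesn't show worker.php, so here is only a sketch of that claim-a-task loop, assuming MySQL and a hypothetical worker_id column next to the status flag; the single UPDATE makes sure two workers never claim the same row:

// worker.php - sketch of the endless claim-a-task loop (column names are assumptions)
// the worker id is passed on the command line by jobcentre.php (-i<id>)
$workerId = 0;
foreach ($argv as $arg) {
    if (strpos($arg, '-i') === 0) {
        $workerId = (int) substr($arg, 2);
    }
}

$db = new PDO('mysql:host=localhost;dbname=gallery', 'user', 'password');

while (true) {
    // atomically claim one open task - only one worker can win this UPDATE
    $claimed = $db->exec("UPDATE queue SET status=1, worker_id={$workerId} WHERE status=0 LIMIT 1");
    if ($claimed < 1) {
        break; // no open tasks left - exit; queue.php will start us again when needed
    }

    $task = $db->query("SELECT * FROM queue WHERE status=1 AND worker_id={$workerId} LIMIT 1")
               ->fetch(PDO::FETCH_ASSOC);

    // ... do the actual work here, e.g. generate the thumbnails (see the convert example below) ...

    $db->exec('UPDATE queue SET status=2 WHERE id=' . (int) $task['id']);
}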
In my case I used ImageMagick's convert to generate the thumbnails. In my experience, convert is way faster than any PHP image-processing library I have tested, and it works with multiple threads. So the worker simply executed the convert command using system() or shell_exec(), but this time synchronously: it waited for the command to finish and saved the return code and output to the database.
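That step could be as simple as the following sketch ($db and $task come from the worker loop above; the 200x200 size and the result/return_code columns are my assumptions, the original only states that the return code and output end up in the database):

// sketch of one thumbnail job, assuming ImageMagick's convert is installed
// and that $task['image_path'] holds the source image (hypothetical column)
$source = escapeshellarg($task['image_path']);
$thumb  = escapeshellarg($task['image_path'] . '_200.jpg');

// run convert synchronously this time - here we WANT to wait for the result
exec("convert {$source} -thumbnail 200x200 {$thumb} 2>&1", $output, $return);

// save the return code and output to the database (result/return_code are assumed columns)
$stmt = $db->prepare('UPDATE queue SET result = ?, return_code = ? WHERE id = ?');
$stmt->execute(array(implode("\n", $output), $return, $task['id']));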
TLDR
Execute PHP or any other Linux script in the background, without blocking the main thread, with:
$workerScript = "/full/path/to/longTask.php";
$logFile = "mylogfile.log";
$cmd = 'nohup nice -n 10 php ' . $workerScript . ' > ' . $logFile . ' & printf "%u" $!';
$pid = shell_exec($cmd);
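Since the snippet returns the PID of the background process, you can later check whether it is still alive; on Linux a simple way (my addition, not part of the snippet above) is to look into /proc:

// optional: check later whether the background process is still running (Linux only)
$pid = (int) trim($pid);
$stillRunning = ($pid > 0) && file_exists('/proc/' . $pid);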
Conclusion
This is a simple method to execute anything in the background. There might be better ways of doing this; I found a framework just for this, called Kraken, but I haven't tested it yet. You can use this approach for long-running work in the background, for example thumbnail generation or an e-mail queue.
PS: I might put some example code on GitHub if anyone is interested.