William Shallum

Using memcache to get results from a gearman background job

Posted May 23 2010, 08:11 by William Shallum [updated Jul 24 2010, 07:30]

So we would like to run a long-running task separately from processing a web request. Gearman is a good tool for that: create a background task, and done! But what do we do when we want the task’s results? Background tasks only return a job handle. Completion notifications (what we would get had we submitted a normal, non-background task) require the task requester to stay connected to the gearman daemon, which is hard when the requester is written in PHP, since the PHP script disconnects after serving the page to the user. (Strictly speaking, we can avoid this by using long-lived connections à la Comet, but that’s not very simple, is it?)

We solve this by having the worker store the task’s result in memcached when it completes. The result is keyed by the job handle, which should be good enough. The monitor script can then poll the gearman daemon for the job’s status using the job handle; once the daemon no longer knows the job, the job is complete and the monitor can fetch the result from memcached. A sample implementation is given below:

Worker

<?php

// Connect to the gearman daemon (default: localhost:4730) and register the task.
$gmworker = new GearmanWorker();
$gmworker->addServer();
$gmworker->addFunction("long_running_task", "long_running_task_fn");


print "Waiting for job...\n"; 
while($gmworker->work()) { 
    if ($gmworker->returnCode() != GEARMAN_SUCCESS) { 
      echo "return_code: " . $gmworker->returnCode() . "\n"; 
      break; 
    } 
} 


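// Computes n! one step per second, reporting progress along the way.
function long_running_task_fn($job) {
    $mc = memcache_connect('localhost', 11211);
    $result = 1;
    $n = (int) $job->workload();
    for ($i = 1; $i <= $n; $i++) {
        $result *= $i;
        // Report progress as numerator / denominator.
        $job->sendStatus($i, $n);
        sleep(1);
    }
    // Store the result under the job handle so the client can find it.
    memcache_set($mc, $job->handle(), $result);
    memcache_close($mc);
}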

Client

<?php

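// Start a new background job, then redirect to the monitoring view.
if (!empty($_POST['start'])) {
        $gmc = new GearmanClient();
        $gmc->addServer();
        // doBackground() returns immediately with the job handle.
        $handle = $gmc->doBackground('long_running_task', '10');
        header('Location: /client.php?handle='.urlencode($handle));
        // Stop here so the rest of the page is not rendered after the redirect.
        exit;
}

// Monitoring view: ask the gearman daemon for the job's status.
if (!empty($_GET['handle'])) {
        $handle = $_GET['handle'];
        $gmc = new GearmanClient();
        $gmc->addServer();
        // Returns array(known, running, numerator, denominator).
        $status = $gmc->jobStatus($handle);
}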

function get_result($handle) {
        $mc = memcache_connect('localhost', 11211);
        $reply = memcache_get($mc, $handle);
        memcache_close($mc);
        return $reply;
}

?>
<html>
        <head><title>Gearman long-running-task monitor</title></head>
        <body>
        <?php if (!isset($handle)) { ?>
        <form action="" method="POST">
                <fieldset>
                        <legend>Start Job</legend>
                        <input type="submit" name="start" value="Start">
                </fieldset>
        </form>
        <?php } else { ?>
        <h1>Monitoring Job <?php echo htmlspecialchars($handle); ?></h1>

        <p><?php echo htmlspecialchars(sprintf('Known: %s, Running: %s, %s/%s finished', $status[0] ? 'Y' : 'N', $status[1] ? 'Y' : 'N', $status[2], $status[3])); ?></p>


        <?php if (!$status[0]){ ?>
        <p>Result: <?php echo htmlspecialchars(get_result($handle)); ?></p>
        <form action="" method="POST">
                <fieldset>
                        <legend>Start Job</legend>
                        <input type="submit" name="start" value="Start">
                </fieldset>
        </form>
        <?php } else { ?>
        <script>
                var data = <?php echo json_encode(array('handle' => $handle)); ?>;
                window.setTimeout(function() {
                        var tm = (new Date()).getTime();
                        window.location = '/client.php?handle=' + encodeURIComponent(data.handle) + '&tm=' + tm;
                }, 1000);
        </script>
        <?php } ?>

        <?php } ?>
        </body>
</html>

Of course, this skimps on security: the handle is clearly visible in the URL and can be altered by the user. This is probably fine in an internal app, but it would be better to keep the handle in the session, or to encode it somehow. Progress reports (in case numerator / denominator isn’t enough) can also be passed via memcached; the worker and the monitor just need to agree on a key.
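
One way to make the handle tamper-evident without a session is to append an HMAC to it before putting it in the URL, and to check that HMAC before passing the handle to jobStatus(). The following is only a sketch under assumptions: the function names sign_handle and verify_handle and the HANDLE_SECRET constant are made up for this example and are not part of Gearman or of the code above. Only hash_hmac() and hash_equals() are standard PHP.

```php
<?php

// Hypothetical secret; in a real deployment this would come from configuration.
const HANDLE_SECRET = 'change-me';

// Append an HMAC-SHA256 tag to the handle so changes to it can be detected.
function sign_handle(string $handle, string $secret = HANDLE_SECRET): string {
    return $handle . '.' . hash_hmac('sha256', $handle, $secret);
}

// Return the original handle if the tag matches, or null otherwise.
function verify_handle(string $token, string $secret = HANDLE_SECRET): ?string {
    // The tag is hex and contains no dots, so split on the last dot;
    // the handle itself may contain dots (for example in a host name).
    $pos = strrpos($token, '.');
    if ($pos === false) {
        return null;
    }
    $handle = substr($token, 0, $pos);
    $tag = (string) substr($token, $pos + 1);
    $expected = hash_hmac('sha256', $handle, $secret);
    // hash_equals() compares in constant time.
    return hash_equals($expected, $tag) ? $handle : null;
}
```

In the client, the redirect would then carry urlencode(sign_handle($handle)), and the monitoring branch would call verify_handle($_GET['handle']) and refuse to continue when it returns null. This only stops the user from altering the handle; it does not hide it.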