Define worker priority in Resque


I have an app that uses Resque to enqueue jobs that handle video processing (similar to YouTube). I have a Docker Swarm environment with multiple nodes. Some nodes run one or more Resque workers, but some nodes are more powerful than others (i.e., more RAM, CPU, GPU), so I want the workers on the more powerful nodes to have more "priority" to pick up enqueued jobs when they are available. I've read in the documentation that it is possible to create multiple queues and set priorities on the queues, but I couldn't find anything about "worker priority".

Answer by Todd A. Jacobs

TL;DR

Run demanding queues on high-performance nodes. That's the "Resque way." Otherwise, you'll need to fool around with introspecting queues and node performance in a job hook such as #before_dequeue or #before_perform, but you'll be spending time doing queue management instead of using Resque (or an alternative) in the intended way.

Run Demanding Queues on High-Performance Nodes

Resque doesn't handle numerical queue priorities. It's simply not designed for it. There are some old (and possibly unmaintained) gems that try to layer name-based queuing priorities on top of Resque; resque-queue-priority is simply a random example from search results. There are others, but all appear old and largely inactive.

Resque itself is pretty clear that you'll have to adjust your workflow rather than Resque to do what you want, as that kind of node-based affinity is simply not how it was designed to work. The Resque maintainers suggest that you should choose Resque over Delayed Job if:

You don't care [about] / dislike numeric priorities

In Resque, queue priority is really more about which queues get checked first when multiple queues are considered, and is not directly related to node affinities or providing fine-grained priority settings. The project page includes some considerations about when to choose Resque vs. Delayed Job. Other systems will provide other trade-offs, as each has different design and performance goals baked in.
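To illustrate what "which queues get checked first" means, here is a toy model in plain Ruby. It is only a sketch of ordered polling, not Resque's actual implementation; the method name next_job and the queue contents are made up for the example.

```ruby
# Toy model of ordered queue polling (an illustration, not Resque's code).
# `queues` maps a queue name to an array of pending jobs.
# `order` is the list of queue names a worker was started with.
def next_job(queues, order)
  order.each do |name|
    # Take the oldest job from the first queue in the list that has one.
    job = queues.fetch(name, []).shift
    return [name, job] if job
  end
  nil # every listed queue was empty
end

# Example data: one heavy job and two light ones.
example_queues = {
  "high" => ["encode-4k"],
  "low"  => ["thumbnail-1", "thumbnail-2"]
}
example_order = %w[high low]
```

A worker polling with example_order keeps draining "high" before it ever touches "low". Nothing in this model knows how powerful the node running the worker is, which is why queue order alone does not give you worker priority.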

The general expectation for Resque is that you will assign workers to the appropriate queues for that group of workers or on that node. In other words, you should be assigning high-resource items to high-resource queues, and then running those high-resource queues on workers or nodes where you want them run rather than just running QUEUE="*" everywhere. You could then run other queues with lower resource requirements elsewhere, or even allow your high-performance nodes to consider low-resource queues when the high-resource queues are empty, so long as the resource-intensive jobs don't have to be serviced immediately.
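A minimal sketch of that layout might look like the following. The class names (TranscodeVideo, GenerateThumbnail) and queue names (video_heavy, video_light) are hypothetical; only the @queue convention, Resque.enqueue, and the QUEUES variable for the resque:work rake task come from Resque itself.

```ruby
# Hypothetical job classes; class and queue names are illustrative only.
class TranscodeVideo
  # Resque reads the target queue from the @queue class instance variable.
  @queue = :video_heavy

  def self.perform(video_id)
    # Resource-intensive transcoding would go here.
  end
end

class GenerateThumbnail
  @queue = :video_light

  def self.perform(video_id)
    # Cheap work that any node can handle.
  end
end

# Enqueue as usual (needs the resque gem and a Redis connection):
#   Resque.enqueue(TranscodeVideo, 42)
#
# Then start workers per node, listing queues in the order to check them:
#   QUEUES=video_heavy,video_light rake resque:work   # high-performance node
#   QUEUES=video_light rake resque:work               # ordinary node
```

With this arrangement, heavy jobs can only ever land on the powerful nodes, and those nodes pick up light jobs only when the heavy queue is empty.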

If you really want finer-grained control of priorities, Resque may not be the right tool. You can look at Delayed Job, Sidekiq, or Rocket Job to see if they fit your needs better. If you have to (or want to) use Resque, then you will likely need to rethink how and where you run your queues, rather than trying to make Resque do something outside its intended design.

Using Job Hooks to Complicate Your Life

If you insist on doing so anyway, you can attempt to inspect a job, or the capabilities of the current worker or node, in a job hook before trying to #perform it, and raise Resque::Job::DontPerform if you want to skip the job. You would also have to make sure that the skipped job doesn't get treated as a failed job and doesn't get repeatedly grabbed again by the wrong workers or nodes. I would consider this approach an error-prone anti-pattern.
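For completeness, a rough sketch of such a hook is shown here. The NODE_TIER environment variable, the job class, and the hook name suffix are all assumptions made for illustration; the stand-in definition at the top exists only so the snippet runs without the resque gem loaded.

```ruby
# Stand-in so this sketch runs without the resque gem; when the gem is
# loaded, Resque::Job::DontPerform already exists and this is skipped.
unless defined?(Resque::Job::DontPerform)
  module Resque
    class Job
      DontPerform = Class.new(StandardError)
    end
  end
end

class TranscodeVideo
  @queue = :video

  # Resque runs class methods whose names start with before_perform
  # before calling perform. NODE_TIER is a hypothetical variable that
  # each node would have to set for itself.
  def self.before_perform_require_fast_node(*_args)
    return if ENV.fetch("NODE_TIER", "standard") == "high"

    # Skips perform without recording a failure. The job has already
    # been taken off the queue at this point, so it is gone unless
    # something puts it back.
    raise Resque::Job::DontPerform
  end

  def self.perform(video_id)
    "transcoded #{video_id}"
  end
end
```

Note how much is left unsolved: the skipped job still has to be put back somewhere, and nothing stops another slow worker from picking it up next.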

Assign resource-intensive jobs to the right queues in the first place, and then run only specific queues on each node based on its capacity or performance characteristics. That way, you reduce the problem to queue assignment for jobs and workers instead of creating job complexity at the code and queue management levels. With that in mind, think carefully before going down the (perhaps tempting) garden path of abusing hooks for capacity management or performance purposes.