Can NSBlockOperation cancel itself while executing, thus canceling dependent NSOperations?

I have a chain of many NSBlockOperations with dependencies. If an operation early in the chain fails, I want the later operations not to run. According to the docs, this should be easy to do from the outside: if I cancel an operation, all dependent operations should automatically be cancelled.

However, if only the execution block of my operation "knows", while executing, that it failed, can it cancel its own work?

I tried the following:

    NSBlockOperation *op = [[NSBlockOperation alloc] init];
    __weak NSBlockOperation *weakOpRef = op;
    [op addExecutionBlock:^{
        LOGInfo(@"Say Cheese...");
        if (some_condition == NO) { // for some reason we can't take a photo
            [weakOpRef cancel];
            LOGError(@"Photo failed");
        }
        else {
            // take photo, process it, etc.
            LOGInfo(@"Photo taken");
        }
    }];

However, when I run this, other operations dependent on op execute even though op was cancelled. Since they are dependent, they certainly don't start before op finishes, and I verified (in the debugger and via logs) that op's isCancelled state is YES before the block returns. Still, the queue executes them as if op had finished successfully.

I then challenged the docs further, like this:

    NSOperationQueue *myQueue = [[NSOperationQueue alloc] init];
    
    NSBlockOperation *op = [[NSBlockOperation alloc] init];
    __weak NSBlockOperation *weakOpRef = op;
    [op addExecutionBlock:^{
        NSLog(@"Say Cheese...");
        if (weakOpRef.isCancelled) { // Fail every once in a while...
            NSLog(@"Photo failed");
        }
        else {
            [NSThread sleepForTimeInterval:0.3f];
            NSLog(@"Photo taken");
        }
    }];
    
    NSOperation *processPhoto = [NSBlockOperation blockOperationWithBlock:^{
        NSLog(@"Processing Photo...");
        [NSThread sleepForTimeInterval:0.1f]; // Process  
        NSLog(@"Processing Finished.");
    }];
    
    // setup dependencies for the operations.
    [processPhoto addDependency: op];
    [op cancel];    // cancelled even before dispatching!!!
    [myQueue addOperation: op];
    [myQueue addOperation: processPhoto];
    
    NSLog(@">>> Operations Dispatched, Wait for processing");
    [myQueue waitUntilAllOperationsAreFinished];
    NSLog(@">>> Work Finished");

But I was horrified to see the following output in the log:

2020-11-05 16:18:03.803341 >>> Operations Dispatched, Wait for processing
2020-11-05 16:18:03.803427 Processing Photo...
2020-11-05 16:18:03.813557 Processing Finished.
2020-11-05 16:18:03.813638+0200 TesterApp[6887:111445] >>> Work Finished

Pay attention: the cancelled op never ran, yet the dependent processPhoto executed anyway, despite its dependency on op.

Ideas anyone?

There are 2 answers

Motti Shneor (BEST ANSWER)

OK. I think I solved the mystery. I just misunderstood the [NSOperation cancel] documentation.

It says:

In macOS 10.6 and later, if an operation is in a queue but waiting on unfinished dependent operations, those operations are subsequently ignored. Because it is already cancelled, this behavior allows the operation queue to call the operation’s start method sooner and clear the object out of the queue. If you cancel an operation that is not in a queue, this method immediately marks the object as finished. In each case, marking the object as ready or finished results in the generation of the appropriate KVO notifications.

I thought that if operation B depends on operation A, then if A is cancelled (hence A didn't finish its work) B should be cancelled as well, because semantically it can't start until A completes its work.

Apparently, that was just wishful thinking...

What the documentation says is different. When you cancel operation B (which depends on operation A), then despite being dependent on A, B won't wait for A to finish before it's removed from the queue. If operation A has started but hasn't finished yet, cancelling B removes B immediately from the queue, because B now ignores its dependencies (the completion of A).

Soooo... to accomplish my scheme, I will need to introduce my own "dependencies" mechanism. The straightforward way is to introduce a set of boolean properties like isPhotoTaken, isPhotoProcessed, isPhotoColorAnalyzed etc. Then an operation dependent on these pre-processing actions needs to check, in the preamble of its execution block, whether all the required previous operations actually finished successfully, and cancel itself otherwise. A sketch of this appears right below.
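
As an illustration, here is a minimal sketch of that preamble check. The PhotoContext object and its photoTaken/photoProcessed flags are hypothetical stand-ins for the boolean properties described above:

    // Hypothetical shared context; atomic properties give synced access.
    @interface PhotoContext : NSObject
    @property (atomic) BOOL photoTaken;
    @property (atomic) BOOL photoProcessed;
    @end
    @implementation PhotoContext
    @end

    PhotoContext *ctx = [PhotoContext new];
    NSBlockOperation *processPhoto = [[NSBlockOperation alloc] init];
    [processPhoto addExecutionBlock:^{
        // Preamble: verify the prerequisite operation actually succeeded.
        if (!ctx.photoTaken) {
            NSLog(@"Photo was never taken - skipping processing");
            return; // bail out; the operation still finishes normally
        }
        // ... process the photo ...
        ctx.photoProcessed = YES;
    }];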

However, it may be worth subclassing NSBlockOperation and overriding start to skip straight to finished if any of the dependencies has been cancelled!

Initially I thought this was a long shot and might be hard to implement, but fortunately I wrote this quick subclass, and it seems to work fine. Of course deeper inspection and stress tests are due:

    @interface MYBlockOperation : NSBlockOperation
    @end

    @implementation MYBlockOperation
    - (void)start {
        // A KVC collection operator sums the boolean `cancelled` over all
        // dependencies; a sum greater than 0 means at least one dependency
        // was cancelled, so this operation cancels itself before starting.
        if ([[self valueForKeyPath:@"dependencies.@sum.cancelled"] intValue] > 0)
            [self cancel];
        [super start]; // start on a cancelled operation just marks it finished
    }
    @end

When I substitute NSBlockOperation with MYBlockOperation in the original question (and in my other tests), the behaviour is the one I described and expected.
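
For example, repeating the cancel-before-dispatch test from the question with the subclass (myQueue as in the earlier snippet):

    MYBlockOperation *op = [[MYBlockOperation alloc] init];
    [op addExecutionBlock:^{ NSLog(@"Say Cheese..."); }];

    MYBlockOperation *processPhoto = [[MYBlockOperation alloc] init];
    [processPhoto addExecutionBlock:^{ NSLog(@"Processing Photo..."); }];

    [processPhoto addDependency:op];
    [op cancel];
    [myQueue addOperation:op];
    [myQueue addOperation:processPhoto];
    // "Processing Photo..." is never logged: processPhoto's start sees its
    // cancelled dependency and cancels itself before executing the block.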

skaak

If you cancel an operation, you merely hint that it is done; especially in long-running tasks you have to implement the cancellation logic yourself. If you cancel something, its dependents will consider the task finished and run without complaint.

So what you need to do is have some kind of shared variable that you set and get in a synced fashion, and that should capture your logic. Your running operations should check that variable periodically and at critical points, and exit of their own accord. Please don't use an actual global; use some common variable that all the operations can access (see the sketch below). I presume you will be comfortable implementing this?
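
A minimal sketch of such a shared flag, using an atomic property for the synced get/set (the JobContext class and `failed` name are illustrative):

    // Illustrative shared object; `atomic` gives synchronised get/set.
    @interface JobContext : NSObject
    @property (atomic) BOOL failed;
    @end
    @implementation JobContext
    @end

    JobContext *ctx = [JobContext new];
    NSOperationQueue *queue = [[NSOperationQueue alloc] init];

    [queue addOperationWithBlock:^{
        if (ctx.failed) return;   // check at critical points...
        // ... do some work; on error: ctx.failed = YES; return;
    }];
    [queue addOperationWithBlock:^{
        if (ctx.failed) return;   // ...and periodically in later steps
        // ... work that should run only if nothing upstream failed ...
    }];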

Cancel is not a magic bullet that stops the operation from running; it is merely a hint to the scheduler that allows it to optimise things. The actual cancelling you must do yourself.

That is the explanation; I could give a sample implementation, but I think you are able to do that on your own, looking at the code?

EDIT

If you have a lot of dependent blocks that execute sequentially, you do not even need an operation queue, or you need only a serial queue (one operation at a time). If the blocks execute sequentially but are very different, then you should rather work on the logic of NOT adding new blocks once the condition fails, as sketched below.
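
As a sketch of that second idea: enqueue the next block only after the current one has succeeded, on a serial queue so order is preserved (takePhoto/processPhoto are hypothetical helpers):

    NSOperationQueue *serialQueue = [[NSOperationQueue alloc] init];
    serialQueue.maxConcurrentOperationCount = 1; // serial: one block at a time

    [serialQueue addOperationWithBlock:^{
        BOOL ok = takePhoto();   // hypothetical step
        if (!ok) return;         // on failure, simply never add the next block
        [serialQueue addOperationWithBlock:^{
            processPhoto();      // scheduled only because step 1 succeeded
        }];
    }];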

EDIT 2

Just some ideas on how I suggest you tackle this. Of course the details matter, but this is also a nice and direct way of doing it. This is pseudo code of sorts, so don't get lost in the syntax.

    // Do it all in one class if possible, not a subclass of NSOperationQueue
    class A

      // Members
      queue

      // job 1
      synced state cancel1    // e.g. triggered by the UI
      synced state counter1
      state calc1 that job 1 calculates (and job 2 needs)

      // job 2
      synced state cancel2
      synced state counter2
      state calc2 that job 2 calculates (and job 3 needs)
      ...

    start
      start on queue

        schedule job1.1 on (any) queue
           periodically check cancel1 and exit early
           update calc1
           when done or on early exit, increase counter1

        schedule job1.2 on (any) queue
           same
        schedule job1.3
           same

      wait for counter1 to reach the number of job1.x tasks scheduled
      check cancel1 and exit early

      // When you get here nothing has been cancelled and
      // all that job2 needs is calculated and ready as
      // calc1 in the class.
      // This is why calc1 need not be synced: it is
      // (potentially) written by job1 and read by job2,
      // so there is no concurrent access.

        schedule job2.1 on (any) queue

      and so on

This is, to me, the most direct way of doing it, and the one most ready for future development. Easy to maintain and understand, and so on. A concrete rendering follows below.
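
For concreteness, here is a compressed Objective-C rendering of the pseudo code above. The names mirror the sketch; using a dispatch group to stand in for the hand-rolled counter is my assumption, not part of the original outline:

    #import <Foundation/Foundation.h>

    @interface A : NSObject
    @property (atomic) BOOL cancel1; // synced state, e.g. set from the UI
    @property (nonatomic, strong) NSOperationQueue *queue;
    @end

    @implementation A
    - (instancetype)init {
        if ((self = [super init])) {
            _queue = [[NSOperationQueue alloc] init]; // concurrent by default
        }
        return self;
    }

    - (void)start {
        [self.queue addOperationWithBlock:^{
            dispatch_group_t group1 = dispatch_group_create();
            for (int i = 0; i < 3; i++) {         // job1.1, job1.2, job1.3
                dispatch_group_enter(group1);     // "increase counter1"
                [self.queue addOperationWithBlock:^{
                    if (!self.cancel1) {
                        // ... compute this slice of calc1, checking
                        // self.cancel1 periodically and exiting early ...
                    }
                    dispatch_group_leave(group1); // done, or bailed out
                }];
            }
            // "wait on counter1" - safe here because the queue is concurrent,
            // so waiting does not starve the job1.x blocks.
            dispatch_group_wait(group1, DISPATCH_TIME_FOREVER);
            if (self.cancel1) return;             // exit early, skip job2
            // calc1 is now complete and safe to read without syncing;
            // schedule job2.1, job2.2, ... here in the same fashion.
        }];
    }
    @end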

EDIT 3

The reason I like and prefer this is that it keeps all your interdependent logic in one place, and it is easy to later add to it or calibrate it if you need finer control.

The reason I prefer this to, e.g., subclassing NSOperation is that there you spread this logic out into a number of already complex subclasses, and you also lose some control. Here you only schedule work after you've tested some condition and know that the next batch needs to run. In the alternative you schedule everything at once and need additional logic in all the subclasses to monitor the progress of the task or the state of the cancel, so it mushrooms quickly.

Subclassing NSOperation I'd do if the specific operation that runs in that subclass needs calibration, but subclassing it to manage the interdependencies adds complexity, I reckon.

(Probably final) EDIT 4

If you made it this far, I am impressed. Now, looking at my proposed piece of (pseudo) code, you might see that it is overkill and that you can simplify it considerably. That is because, as presented, the different components of the whole task (task 1, task 2 and so on) appear to be disconnected. If that is the case, there are indeed a number of different and simpler ways to do this. In the reference I give a nice way of doing it if all the tasks are the same or very similar, or if you have only a single subsubtask (e.g. 1.1) per subtask (e.g. 1), or only a single (sub or subsub) task running at any point in time.

However, for real problems you will probably end up with a much less clean and linear flow between these. In other words, after task 2 you may kick off task 3.1, which is not required by task 4 or 5 but only by task 6. Then the cancel-and-exit-early logic already becomes tricky, and the reason I do not break this one class up into smaller and simpler bits is that the logic can (easily) span those subtasks, and because this class A represents a bigger whole, e.g. clean the data, or take pictures, or whatever big problem you are trying to solve.

Also, if you work on something that is really slow and you need to squeeze out performance, you can do so by figuring out the dependencies between the (sub and subsub) tasks and kicking them off as early as possible. This kind of calibration is where (real-life) problems that took way too long for the UI become doable: you can break them up and (non-linearly) piece them together in such a way that you traverse them in the most efficient order.

I've had a few such problems, and in one in particular that I am thinking of now, the code became extremely fragile and the logic difficult to follow, but this way I was able to bring the solution time down from an unacceptable more-than-a-minute to just a few seconds, agreeable to the users.

(This time really almost the final) EDIT 5

Also, the way it is presented here, the junctures between, say, task 1 and 2, or between 2 and 3, are the places where you can update your UI with progress and with parts of the full solution as they trickle in from the various (sub and subsub) tasks.

(The end is coming) EDIT 6

If you work on a single core then, except for the interdependencies between tasks, the order in which you schedule all those sub and subsub tasks does not matter, since execution is linear. The moment you have multiple cores, you need to break the solution up into subtasks that are as small as possible, and schedule the longer-running ones as early as possible, for performance. The performance gain can be significant, but it comes at the cost of increasingly complex flow between all the small subtasks and in the way you handle the cancel logic.