Coroutine Cancellation and Timeouts

This section covers coroutine cancellation and timeouts.

Cancelling coroutine execution

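In a long-running application you may need fine-grained control over your background coroutines. The launch function returns a Job object that can be used to cancel the running coroutine.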

import kotlinx.coroutines.*

fun main() = runBlocking {
    val job = launch {
        repeat(1000) { i ->
            println("job: I'm sleeping $i ...")
            delay(500L)
        }
    }
    delay(1300L) // delay a bit
    println("main: I'm tired of waiting!")
    job.cancel() // cancels the job
    job.join() // waits for job's completion
    println("main: Now I can quit.")
}

The program produces output like this:

job: I'm sleeping 0 ...
job: I'm sleeping 1 ...
job: I'm sleeping 2 ...
main: I'm tired of waiting!
main: Now I can quit.

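As soon as main calls job.cancel(), we see no more output from the other coroutine, because it has been cancelled. Instead of calling cancel and join separately, you can also call cancelAndJoin, which combines the two.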

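Coroutine cancellation is cooperative. All suspending functions in kotlinx.coroutines are cancellable: they check whether the coroutine has been cancelled and throw CancellationException if it has. However, a coroutine that is busy with a computation and never checks for cancellation cannot be cancelled.
Consider the following code: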

import kotlinx.coroutines.*

fun main() = runBlocking {
    val startTime = System.currentTimeMillis()
    val job = launch(Dispatchers.Default) {
        var nextPrintTime = startTime
        var i = 0
        while (i < 5) { // computation loop, just wastes CPU
            // print a message twice a second
            if (System.currentTimeMillis() >= nextPrintTime) {
                println("job: I'm sleeping ${i++} ...")
                nextPrintTime += 500L
            }
        }
    }
    delay(1300L) // delay a bit
    println("main: I'm tired of waiting!")
    job.cancelAndJoin() // cancels the job and waits for its completion
    println("main: Now I can quit.")    
}

The output looks like this:

job: I'm sleeping 0 ...
job: I'm sleeping 1 ...
job: I'm sleeping 2 ...
main: I'm tired of waiting!
job: I'm sleeping 3 ...
job: I'm sleeping 4 ...
main: Now I can quit.

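The job keeps printing even after cancellation has been requested, and only stops once the loop finishes on its own after five iterations.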

Making computation code cancellable

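There are two ways to make a coroutine that runs a computation cancellable. The first is to periodically call a suspending function that checks for cancellation; yield is a good choice for this. The second is to check the cancellation status explicitly. Let us try both.
The first approach: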

import kotlinx.coroutines.*
fun main() = runBlocking {
    val startTime = System.currentTimeMillis()
    val job = launch(Dispatchers.Default) {
        var nextPrintTime = startTime
        var i = 0
        while (i < 5) { // computation loop, just wastes CPU
            yield() // suspension point: throws CancellationException once the job is cancelled
            // print a message twice a second
            if (System.currentTimeMillis() >= nextPrintTime) {
                println("job: I'm sleeping ${i++} ...")
                nextPrintTime += 500L
            }
        }
    }
    delay(1300L) // delay a bit
    println("main: I'm tired of waiting!")
    job.cancelAndJoin() // cancels the job and waits for its completion
    println("main: Now I can quit.")    
}

The second approach:

import kotlinx.coroutines.*

fun main() = runBlocking {
    val startTime = System.currentTimeMillis()
    val job = launch(Dispatchers.Default) {
        var nextPrintTime = startTime
        var i = 0
        while (isActive) { // cancellable computation loop
            // print a message twice a second
            if (System.currentTimeMillis() >= nextPrintTime) {
                println("job: I'm sleeping ${i++} ...")
                nextPrintTime += 500L
            }
        }
    }
    delay(1300L) // delay a bit
    println("main: I'm tired of waiting!")
    job.cancelAndJoin() // cancels the job and waits for its completion
    println("main: Now I can quit.")    
}

Both approaches produce output like this:

job: I'm sleeping 0 ...
job: I'm sleeping 1 ...
job: I'm sleeping 2 ...
main: I'm tired of waiting!
main: Now I can quit.

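Compared with the previous example, the loop is now cancelled. yield() checks the cancellation status, and isActive is an extension property available inside the coroutine through its CoroutineScope.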
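A third option, not used in the examples above, is ensureActive(), which kotlinx.coroutines provides as an extension on CoroutineScope. It throws CancellationException when the job is no longer active, so it works like an isActive check that exits the loop by exception. The following is a minimal sketch; the helper name cancelBusyLoop is hypothetical and the 100 ms delay is an arbitrary choice.

```kotlin
import kotlinx.coroutines.*

// Hypothetical helper: starts a busy loop, cancels it, and reports whether
// the job ended up in the cancelled state.
suspend fun cancelBusyLoop(): Boolean = coroutineScope {
    val job = launch(Dispatchers.Default) {
        while (true) {
            // Throws CancellationException as soon as the job is cancelled,
            // which is the only way out of this otherwise endless loop.
            ensureActive()
        }
    }
    delay(100L)          // let the loop spin for a moment
    job.cancelAndJoin()  // returns only because the loop cooperates
    job.isCancelled
}

fun main() = runBlocking {
    println("cancelled: ${cancelBusyLoop()}") // prints "cancelled: true"
}
```

Without the ensureActive() call, cancelAndJoin() would never return here, because the loop contains no other cancellation check.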

The yield function

Yields a thread (or thread pool) of the current coroutine dispatcher to other coroutines to run. If the coroutine dispatcher does not have its own thread pool (like Dispatchers.Unconfined) then this function does nothing, but checks if the coroutine Job was completed. This suspending function is cancellable. If the Job of the current coroutine is cancelled or completed when this suspending function is invoked or while this function is waiting for dispatching, it resumes with CancellationException.
(A note on CancellationException: you can think of it as a signal that the coroutine machinery catches on your behalf. It only indicates that the task was cancelled; no further handling is needed, and it is treated as a normal way for the coroutine to end.)

Closing resources with finally

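Cancellable suspending functions throw CancellationException on cancellation, so a cancelled coroutine can release its resources in the usual way with a try {…} finally {…} block.
For example: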

import kotlinx.coroutines.*

fun main() = runBlocking {
    val job = launch {
        try {
            repeat(1000) { i ->
                println("job: I'm sleeping $i ...")
                delay(500L)
            }
        } finally {
            println("job: I'm running finally")
        }
    }
    delay(1300L) // delay a bit
    println("main: I'm tired of waiting!")
    job.cancelAndJoin() // cancels the job and waits for its completion
    println("main: Now I can quit.")    
}

Both join() and cancelAndJoin() wait for the code in the finally block to finish, so the example above produces the following output:

job: I'm sleeping 0 ...
job: I'm sleeping 1 ...
job: I'm sleeping 2 ...
main: I'm tired of waiting!
job: I'm running finally
main: Now I can quit.

Running a non-cancellable block

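In the previous example, any attempt to call a suspending function inside the finally block throws CancellationException, because the coroutine running that code has been cancelled. Usually this is not a problem: well-behaved cleanup operations (closing a file, cancelling a job, or closing any kind of communication channel) are normally non-blocking and do not call suspending functions. In the rare case where you do need to suspend in a cancelled coroutine, use the withContext function with the NonCancellable context and wrap the code in withContext(NonCancellable) {…}, as shown below: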

import kotlinx.coroutines.*

fun main() = runBlocking {
    val job = launch {
        try {
            repeat(1000) { i ->
                println("job: I'm sleeping $i ...")
                delay(500L)
            }
        } finally {
            withContext(NonCancellable) {
                println("job: I'm running finally")
                delay(1000L)
                println("job: And I've just delayed for 1 sec because I'm non-cancellable")
            }
        }
    }
    delay(1300L) // delay a bit
    println("main: I'm tired of waiting!")
    job.cancelAndJoin() // cancels the job and waits for its completion
    println("main: Now I can quit.")    
}

The output looks like this:

job: I'm sleeping 0 ...
job: I'm sleeping 1 ...
job: I'm sleeping 2 ...
main: I'm tired of waiting!
job: I'm running finally
job: And I've just delayed for 1 sec because I'm non-cancellable
main: Now I can quit.

Timeout

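The most common practical reason to cancel a coroutine is that it has run longer than some timeout. You could keep a reference to the corresponding Job yourself and launch a separate coroutine to cancel it after a delay, but the ready-made withTimeout() function does all of this for you.
Look at the following example: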

import kotlinx.coroutines.*

fun main() = runBlocking {
    withTimeout(1300L) {
        repeat(1000) { i ->
            println("I'm sleeping $i ...")
            delay(500L)
        }
    }
}

It produces the following output:

I'm sleeping 0 ...
I'm sleeping 1 ...
I'm sleeping 2 ...
Exception in thread "main" kotlinx.coroutines.TimeoutCancellationException: Timed out waiting for 1300 ms

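The TimeoutCancellationException thrown by withTimeout is a subclass of CancellationException. We have not seen a stack trace on the console before, because inside a cancelled coroutine CancellationException is considered a normal reason for the coroutine to complete. In this example, however, withTimeout() is called directly inside the main function.
Because cancellation is just an exception, all resources are closed in the usual way. If you need to do something extra specifically when a timeout occurs, wrap the code that uses the timeout in a try {…} catch (e: TimeoutCancellationException) {…} block. Alternatively, use withTimeoutOrNull(), which is similar to withTimeout but returns null on timeout instead of throwing an exception.
See the following example: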

import kotlinx.coroutines.*

fun main() = runBlocking {
    val result = withTimeoutOrNull(1300L) {
        repeat(1000) { i ->
            println("I'm sleeping $i ...")
            delay(500L)
        }
        "Done" // will get cancelled before it produces this result
    }
    println("Result is $result")
}

The output looks like this:

I'm sleeping 0 ...
I'm sleeping 1 ...
I'm sleeping 2 ...
Result is null

As you can see, no exception is thrown this time.
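The try {…} catch (e: TimeoutCancellationException) {…} alternative mentioned earlier can be sketched in the same way. This is only an illustration: the helper name sleepWithTimeout and the "Timed out" fallback string are assumptions made for this example, not part of the library.

```kotlin
import kotlinx.coroutines.*

// Hypothetical helper: runs the sleeping loop under a timeout and maps a
// timeout to a fallback value instead of letting the exception propagate.
suspend fun sleepWithTimeout(timeoutMillis: Long): String =
    try {
        withTimeout(timeoutMillis) {
            repeat(1000) { i ->
                println("I'm sleeping $i ...")
                delay(500L)
            }
            "Done" // never reached with a 1300 ms timeout
        }
    } catch (e: TimeoutCancellationException) {
        // Extra timeout-specific handling would go here.
        "Timed out"
    }

fun main() = runBlocking {
    println("Result is ${sleepWithTimeout(1300L)}") // prints "Result is Timed out"
}
```

The catch block sits outside withTimeout, so it runs in the calling coroutine, which is not itself cancelled by the timeout.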

The Job documentation (worth a read)

public interface Job : CoroutineContext.Element
A background job. Conceptually, a job is a cancellable thing with a life-cycle that culminates in its completion.
Jobs can be arranged into parent-child hierarchies where cancellation of a parent leads to immediate cancellation of all its children. Failure or cancellation of a child with an exception other than CancellationException immediately cancels its parent. This way, a parent can cancel its own children (including all their children recursively) without cancelling itself.
The most basic instances of Job are created with the launch coroutine builder or with a Job() factory function.
By default, a failure of any of the job's children leads to an immediate failure of its parent and cancellation of the rest of its children. This behavior can be customized using SupervisorJob.
Conceptually, an execution of the job does not produce a result value. Jobs are launched solely for their side-effects. See the Deferred interface for a job that produces a result.
A job has the following states:
| State                          | isActive | isCompleted | isCancelled | 
| New (optional initial state)   | false    | false       | false       | 
| Active (default initial state) | true     | false       | false       |
| Completing (transient state)   | true     | false       | false       |
| Cancelling (transient state)   | false    | false       | true        |
| Cancelled (final state)        | false    | true        | true        |
| Completed (final state)        | false    | true        | false       |
    
Usually, a job is created in the active state (it is created and started). However, coroutine builders that provide an optional start parameter create a coroutine in the new state when this parameter is set to CoroutineStart.LAZY. Such a job can be made active by invoking start or join.
A job is active while the coroutine is working. Failure of the job with an exception makes it cancelling. A job can be cancelled at any time with the cancel function, which forces it to transition to the cancelling state immediately. The job becomes cancelled when it finishes executing its work.
    
The transitions between the job states are shown below:
                                          wait children
    +-----+ start  +--------+ complete   +-------------+  finish  +-----------+
    | New | -----> | Active | ---------> | Completing  | -------> | Completed |
    +-----+        +--------+            +-------------+          +-----------+
                     |  cancel / fail       |
                     |     +----------------+
                     |     |
                     V     V
                 +------------+                           finish  +-----------+
                 | Cancelling | --------------------------------> | Cancelled |
                 +------------+                                   +-----------+
A Job instance in the coroutineContext represents the coroutine itself.
A job can have a parent job. A job with a parent is cancelled when its parent is cancelled. A parent job waits in the completing or cancelling state for all its children to complete before finishing. Note that the completing state is purely internal to the job. For an outside observer a completing job is still active, while internally it is waiting for its children.
Normal cancellation of a job is distinguished from its failure by the type of its cancellation exception cause. If the cause of cancellation is CancellationException, then the job is considered to be cancelled normally. This usually happens when cancel is invoked without additional parameters. If the cause of cancellation is a different exception, then the job is considered to have failed. This usually happens when the code of the job encounters some problem and throws an exception.
All functions on this interface and on all interfaces derived from it are thread-safe and can be safely invoked from concurrent coroutines without external synchronization.
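The state table above can be checked with a short experiment. The sketch below prints the three public flags of a job in the New, Active, Cancelled and Completed states; the extension function flags() is a hypothetical helper defined only for this example. The transient Completing and Cancelling states are not shown, since they are hard to observe reliably from outside.

```kotlin
import kotlinx.coroutines.*

// Hypothetical helper: formats the three public state flags of a Job.
fun Job.flags(): String =
    "isActive=$isActive isCompleted=$isCompleted isCancelled=$isCancelled"

fun main() = runBlocking {
    // LAZY start leaves the job in the New state until start() or join().
    val job = launch(start = CoroutineStart.LAZY) { delay(10_000L) }
    println("New:       ${job.flags()}") // false / false / false

    job.start()
    println("Active:    ${job.flags()}") // true / false / false

    job.cancelAndJoin()
    println("Cancelled: ${job.flags()}") // false / true / true

    // A job that finishes normally ends in the Completed state.
    val done = launch { }
    done.join()
    println("Completed: ${done.flags()}") // false / true / false
}
```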

The withTimeoutOrNull function

kotlinx.coroutines (TimeoutKt.class)
public suspend fun <T> withTimeoutOrNull(
    timeMillis: Long,
    block: suspend CoroutineScope.() -> T
): T?
Runs a given suspending block of code inside a coroutine with a specified timeout and returns null if this timeout was exceeded.
The code that is executing inside the block is cancelled on timeout, and the active or next invocation of a cancellable suspending function inside the block throws TimeoutCancellationException.
The sibling function that throws an exception on timeout is withTimeout. Note that a timeout action can be specified for a select invocation with the onTimeout clause.
Implementation note: how exactly time is tracked is an implementation detail of the CoroutineDispatcher in the context.
Params:
timeMillis - timeout time in milliseconds.
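To illustrate the T? return type, here is a small sketch that exercises both outcomes. The helper fetchWithin and its workMillis parameter are hypothetical names invented for this example; the work is simulated with delay.

```kotlin
import kotlinx.coroutines.*

// Hypothetical helper: simulates work that takes workMillis and returns
// "Done", or null if the timeout expires first.
suspend fun fetchWithin(timeoutMillis: Long, workMillis: Long): String? =
    withTimeoutOrNull(timeoutMillis) {
        delay(workMillis) // cancellable suspension point
        "Done"
    }

fun main() = runBlocking {
    // Work finishes before the timeout: the block's value is returned.
    println(fetchWithin(timeoutMillis = 1000L, workMillis = 100L)) // prints "Done"
    // Work outlasts the timeout: the block is cancelled and null is returned.
    println(fetchWithin(timeoutMillis = 100L, workMillis = 1000L)) // prints "null"
}
```

Because the result is nullable, the caller can combine it with the elvis operator, for example fetchWithin(100L, 1000L) ?: "fallback".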