Skip to content

Long-running operations

Occasionally, a service may need to expose an operation that takes a significant amount of time to complete. In these situations, it is often a poor user experience to simply block while the task runs; rather, it is better to return some kind of promise to the user, and allow the user to check back in later.

The long-running request pattern is roughly analogous to a Future in Python or Java, or a Node.js Promise. Essentially, the user is given a token that can be used to track progress and retrieve the result.

Guidance

Operations that might take a significant amount of time to complete should return a 202 Accepted response along with an Operation resource that can be used to track the status of the request and ultimately retrieve the result.

Any single operation defined in an API surface must either always return 202 Accepted along with an Operation, or never do so. A service must not return a 200 OK response with the result if it is “fast enough”, and 202 Accepted if it is not fast enough, because such behavior adds significant burdens for clients.

Operation representation

The response to a long-running request must be an Operation.

Protocol buffer APIs must use the common component aep.api.Operation.

Querying an operation

The service must provide an endpoint to query the status of the operation, which must accept the operation path and should not include other parameters:

GET /v1/operations/{operation} HTTP/2
Host: library.example.com
Accept: application/json

The endpoint must return a Operation as described above.

Standard methods

APIs may return an Operation from the Create, Update, or Delete standard methods if appropriate. In this case, the response field must be the standard and expected response type for that standard method.

When creating or deleting a resource with a long-running request, the resource should be included in List and Get calls; however, the resource should indicate that it is not usable, generally with a state enum.

Parallel requests

A resource may accept multiple requests that will work on it in parallel, but is not obligated to do so:

  • Resources that accept multiple parallel requests may place them in a queue rather than work on the requests simultaneously.
  • Resource that does not permit multiple requests in parallel (denying any new request until the one that is in progress finishes) must return 409 Conflict if a user attempts a parallel request, and include an error message explaining the situation.

Expiration

APIs may allow their operation resources to expire after sufficient time has elapsed after the request completed.

Errors

Errors that prevent a long-running request from starting must return an [error response][AEP-193], similar to any other method.

Errors that occur over the course of a request may be placed in the metadata message. The errors themselves must still be represented with a canonical error object.

Interface Definitions

When using protocol buffers, the common component aep.api.Operation is used.

rpc WriteBook(WriteBookRequest) returns (aep.api.Operation) {
option (google.api.http) = {
post: "/v1/{parent=publishers/*}/books:write"
body: "*"
};
option (aep.api.operation_info) = {
response_type: "WriteBookResponse"
metadata_type: "WriteBookMetadata"
};
}
  • The response type must be aep.api.Operation. The Operation proto definition should not be copied into individual APIs; prefer to use a single copy (in monorepo code bases), or remote dependencies via a tool like [Buf][buf.build].

    • The response must not be a streaming response.
  • The method must include a aep.api.operation_info annotation, which must define both response and metadata types.

    • The response and metadata types must be defined in the file where the RPC appears, or a file imported by that file.
    • If the response and metadata types are defined in another package, the fully-qualified message name must be used.
    • The response type should not be google.protobuf.Empty (except for Delete methods), unless it is certain that response data will never be needed. If response data might be added in the future, define an empty message for the RPC response and use that.
    • The metadata type is used to provide information such as progress, partial failures, and similar information on each GetOperation call. The metadata type should not be google.protobuf.Empty, unless it is certain that metadata will never be needed. If metadata might be added in the future, define an empty message for the RPC metadata and use that.
  • APIs with messages that return Operation must implement the GetOperation method of the Operations service, and may implement the other methods defined in that service. Individual APIs must not define their own interfaces for long-running operations to avoid inconsistency.