Workflow error analysis

Registered by Renat Akhmerov on 2016-11-09


When a workflow fails, it can currently be hard to find the root cause quickly.

From the CLI, the only way (without creating a new execution) is to run a sequence of commands like:
* run 'mistral task-list <workflow execution id>' and see which task executions are in ERROR
* for each failed task execution, run 'mistral action-execution-list' and see which action executions are in ERROR
* for each failed action execution, run 'mistral action-execution-get-output <id>' and read the description of the error
* for each failed task execution of type Workflow, find the sub-workflow execution ID and go back to the first bullet.
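
The manual procedure above amounts to a depth-first walk over the execution tree. The following is a minimal sketch of that walk, written against plain in-memory dictionaries rather than the real Mistral client or REST API. The field names (`state`, `tasks`, `actions`, `sub_workflow`, `output`) and the sample data are assumptions made for illustration only.

```python
def collect_errors(wf_ex, path=()):
    """Return (path, error_output) pairs for every failed action reachable
    from the given workflow execution, including nested sub-workflows."""
    errors = []

    for task in wf_ex.get("tasks", []):
        if task["state"] != "ERROR":
            continue

        task_path = path + (task["name"],)

        # Failed actions of this task: what one would otherwise find by
        # listing action executions and fetching their output by hand.
        for action in task.get("actions", []):
            if action["state"] == "ERROR":
                errors.append((task_path + (action["name"],), action["output"]))

        # A task of type Workflow: descend into its sub-workflow execution.
        sub_wf = task.get("sub_workflow")
        if sub_wf is not None:
            errors.extend(collect_errors(sub_wf, task_path))

    return errors


# Hypothetical sample data; a real report would load this from Mistral.
sample_execution = {
    "state": "ERROR",
    "tasks": [
        {"name": "prepare", "state": "SUCCESS", "actions": []},
        {
            "name": "call_api",
            "state": "ERROR",
            "actions": [
                {"name": "std.http", "state": "ERROR", "output": "Invalid URL"},
            ],
        },
        {
            "name": "deploy",
            "state": "ERROR",
            "sub_workflow": {
                "state": "ERROR",
                "tasks": [
                    {
                        "name": "configure",
                        "state": "ERROR",
                        "actions": [
                            {
                                "name": "my_action",
                                "state": "ERROR",
                                "output": "YAQL expression failed",
                            },
                        ],
                    },
                ],
            },
        },
    ],
}

for error_path, output in collect_errors(sample_execution):
    print(" > ".join(error_path), "::", output)
```

Carrying the path of task names along with each error is one way to show how a failed action relates to the top-level execution, which is the part that is hardest to reconstruct from separate CLI calls.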

It is also possible to create and execute a workflow with a "publish" of all tasks and all sub-workflow tasks recursively (and also filter by tasks in error state). Example:

The goal:

Mistral should provide a single command that shows a report on failed actions and how they affected the entire workflow execution. This report should also account for nested workflows.

Solution ideas/steps:
* Write a spec
* It could be implemented on the client side or on the server side. The latter is faster because it avoids making many REST requests.


* Functional tests that imitate workflow failures and make sure that we get the right report.

Error examples:

* YAQL expression failed:
* HTTP action failed because of an invalid URL:


* One of the current problems is the clarity of error information. It's not easy to understand what the precise error is even when we see it.
* Idea: split the actual error info from contextual information (e.g. stack trace)
* Idea: give an option to report inbound context and outbound context for each task
* Idea: use some sort of classification for all possible errors
* Idea: have a separate REST API endpoint to build reports on the current status of the execution and/or error analysis
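
Two of the ideas above, separating the actual error from its context and classifying errors, can be sketched as follows. This is only an illustration under stated assumptions: it assumes the raw error text is either a standard Python traceback ending in a single-line exception message or a plain message, and the class names and matching patterns are hypothetical, not an existing Mistral classification.

```python
import re


def split_error(raw):
    """Split raw error text into (summary, context).

    For a Python traceback the last line is taken as the summary and the
    rest as context; otherwise the first line is the summary.
    """
    lines = raw.strip().splitlines()

    if not lines:
        return "", ""

    if lines[0].startswith("Traceback (most recent call last):"):
        return lines[-1].strip(), "\n".join(lines[:-1])

    return lines[0].strip(), "\n".join(lines[1:])


# Hypothetical classification rules, checked in order.
ERROR_CLASSES = [
    ("EXPRESSION", re.compile(r"yaql|jinja", re.IGNORECASE)),
    ("HTTP", re.compile(r"url|http|connection", re.IGNORECASE)),
]


def classify(summary):
    """Return the name of the first matching error class, or UNKNOWN."""
    for name, pattern in ERROR_CLASSES:
        if pattern.search(summary):
            return name
    return "UNKNOWN"


raw_error = (
    "Traceback (most recent call last):\n"
    '  File "action.py", line 10, in run\n'
    "ValueError: Invalid URL 'htp://example'"
)

summary, context = split_error(raw_error)
print(classify(summary), "|", summary)
```

A report could then show only the summary and class by default and include the context on request.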


* Write a spec first
* Add a new endpoint to generate "Workflow error analysis" reports. The same endpoint can also generate a report on the current progress of a workflow that has not necessarily failed yet. It can be used, for example, by a UI to track the current state of the execution.
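
As a rough illustration of what a progress report from such an endpoint might contain, the sketch below counts task states across an execution and its sub-workflows. It is not the endpoint's actual response format: the dictionary layout, the report keys, and the notion of progress as the share of tasks in SUCCESS or ERROR are all assumptions.

```python
from collections import Counter


def summarize(wf_ex):
    """Count task states across a workflow execution and its sub-workflows."""
    counts = Counter()

    for task in wf_ex.get("tasks", []):
        counts[task["state"]] += 1

        sub_wf = task.get("sub_workflow")
        if sub_wf is not None:
            counts.update(summarize(sub_wf))

    return counts


def build_report(wf_ex):
    """Build a small progress report (hypothetical structure)."""
    counts = summarize(wf_ex)
    total = sum(counts.values())
    finished = counts["SUCCESS"] + counts["ERROR"]

    return {
        "state": wf_ex["state"],
        "total_tasks": total,
        "tasks_by_state": dict(counts),
        "progress_percent": round(100 * finished / total) if total else 0,
    }


# Hypothetical execution that is still running.
running_execution = {
    "state": "RUNNING",
    "tasks": [
        {"name": "a", "state": "SUCCESS"},
        {
            "name": "b",
            "state": "RUNNING",
            "sub_workflow": {
                "state": "RUNNING",
                "tasks": [
                    {"name": "c", "state": "SUCCESS"},
                    {"name": "d", "state": "RUNNING"},
                ],
            },
        },
    ],
}

print(build_report(running_execution))
```

Building the summary on the server side in one pass is what would let a UI poll a single endpoint instead of issuing one request per task.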

Blueprint information

Renat Akhmerov
Renat Akhmerov
Renat Akhmerov
Series goal:
Accepted for stein
Milestone target:
stein-3
Started by
Renat Akhmerov on 2019-01-30
Completed by
Renat Akhmerov on 2019-03-27

Whiteboard - the bug created for the same purpose, now closed so that it does not duplicate this blueprint. It has some additional information that may still be useful.


patches related to it:

Gerrit topic: bp/mistral-error-analysis

Addressed by:
    WIP: add a workflow execution report endpoint


Work Items
