Add option to clean up only resources created by Rally

Registered by Wataru Takase

Rally deletes all resources in the tenants that are used by a test.
To avoid deleting unintended resources, it would be better to have a cleanup option that deletes only the resources created by Rally instead of deleting all resources in those tenants.

Also, when pre-defined users and/or pre-defined tenants are used for a test, it would be better to keep the existing resources instead of deleting all resources in those tenants.

The ideas are as follows:
1) The user selects the cleanup method in the scenario file:
   - cleanup_all: True -> delete all resources (default)
   - cleanup_all: False -> delete only resources created by Rally
2) Rally collects the objects/IDs of created resources during the test and deletes them after the test using the collected list.
3) When a timeout or some other exception occurs during resource creation, those resources may not be cleaned up because Rally cannot get the resource information.
4) We don't need to care about xxx-and-delete scenarios because the created resources will be deleted by the test itself.
5) We need to consider how to handle resources created in a context: whether the context deletes them or not.

Blueprint information

Status: Not started
Approver: None
Priority: Undefined
Drafter: Wataru Takase
Direction: Needs approval
Assignee: Wataru Takase
Definition: Drafting
Series goal: None
Implementation: Unknown
Milestone target: None

Whiteboard

Sorry, I didn't get any notification about this BP...

Let me first share what we are working on:
1) First of all, we finished refactoring the cleanup engine, so now it's much more production ready:
https://github.com/stackforge/rally/tree/master/rally/benchmark/context/cleanup

So now we specify ResourceManagers, each of which knows how to get and delete some specific resource. Here is the base resource, and the listing of resources that should be deleted:
https://github.com/stackforge/rally/blob/master/rally/benchmark/context/cleanup/base.py#L104-L106

So if we use name patterns -> we will just filter all resources by name.startswith("rally_name_pattern")
or in other words we will delete only Rally-created resources.
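
The name-based filter described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the `Resource` class, the `RALLY_PREFIX` value and the `filter_rally_resources` helper are hypothetical names and are not taken from Rally's cleanup code.

```python
from dataclasses import dataclass

# Hypothetical prefix; the naming pattern Rally really uses may differ.
RALLY_PREFIX = "rally_"


@dataclass
class Resource:
    """Stand-in for any cloud resource that has an id and a name."""
    id: str
    name: str


def filter_rally_resources(resources, prefix=RALLY_PREFIX):
    """Return only the resources whose name starts with the given prefix."""
    return [r for r in resources if r.name and r.name.startswith(prefix)]


# A tenant that mixes Rally-created and pre-existing resources.
tenant_resources = [
    Resource("1", "rally_server_abc"),
    Resource("2", "production-db"),
    Resource("3", "rally_volume_xyz"),
]

# Only the two resources with the Rally prefix are selected for deletion.
to_delete = filter_rally_resources(tenant_resources)
print([r.name for r in to_delete])
```

Because the selection depends only on the resource name, it needs no list of IDs collected during the test.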

> 1) User selects cleanup method in scenario file:
> - cleanup_all: True -> deleting all resources (default)
> - cleanup_all: False -> deleting resources only created by Rally

The world is already too complicated to add more parameters. This can be resolved without any extra argument. We should have only the second option, and it should work without any extra configuration.

> 2) Rally collects created resource object/id during test and deletes them after test using collected list.

This is not aligned with the use cases that Rally should cover. The following points conflict with it:
1) Collecting IDs makes the system complicated and adds a lot of overhead.
2) Distributed load generator -> that will make collecting IDs super complicated.
3) We should provide disaster cleanup functionality. It should delete all (and only) the resources created by Rally, even if we lose the Rally DB and the instance running Rally.

> 3) When timeout or some exception occurs during to create resources, those resources may not be cleaned up because Rally cannot get resource information.

Hm? I didn't get this. In case of a timeout, Rally will LOG that some resource couldn't be deleted because of the timeout. This is the most you can do in such a case. So I'm not sure what this point means.

> 4) We don't care xxx-and-delete scenario because created resources will be deleted by test itself.

Believe me, we care! Under load you can get random errors from the cloud: the resource is actually created, but Rally gets a 50x/40x error and thinks that the resource was not created. So it won't run the "delete" part of the scenario. And Rally is actually positioned as a tool that can be run against production clouds.
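
To make the failure mode concrete, here is a small self-contained simulation. Everything in it is hypothetical: `FakeCloud` is a toy in-memory stand-in (not an OpenStack client), and `create_and_delete_server` and `cleanup_by_prefix` are illustrative names rather than Rally functions.

```python
class FakeCloud:
    """Toy in-memory cloud; not a real OpenStack client."""

    def __init__(self):
        self.servers = {}            # server id -> server name
        self.fail_next_create = False

    def create_server(self, name):
        server_id = str(len(self.servers) + 1)
        self.servers[server_id] = name   # the resource is created...
        if self.fail_next_create:
            self.fail_next_create = False
            # ...but the caller only sees an error response.
            raise RuntimeError("503 Service Unavailable")
        return server_id

    def delete_server(self, server_id):
        del self.servers[server_id]


def create_and_delete_server(cloud, name):
    """Hypothetical xxx-and-delete scenario."""
    server_id = cloud.create_server(name)
    cloud.delete_server(server_id)


def cleanup_by_prefix(cloud, prefix="rally_"):
    """Delete every server whose name starts with the prefix."""
    leaked = [sid for sid, name in cloud.servers.items()
              if name.startswith(prefix)]
    for sid in leaked:
        cloud.delete_server(sid)
    return leaked


cloud = FakeCloud()
cloud.create_server("production-db")     # pre-existing resource
cloud.fail_next_create = True
try:
    create_and_delete_server(cloud, "rally_server_1")
except RuntimeError:
    pass  # the scenario never reaches its delete step

# The server exists although the scenario saw an error; the
# name-based cleanup still finds and removes it.
leaked_ids = cleanup_by_prefix(cloud)
print(leaked_ids, cloud.servers)
```

In this sketch the scenario never learns the ID of the leaked server, so only a cleanup that lists resources by name can remove it.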

> 5) We need to consider how to handle resources created in a context: the context deletes them or not.

We have a "persistence context" feature in our road map: https://github.com/stackforge/rally/blob/master/doc/feature_request/persistence_benchmark_env.rst

So a context that is created once can be used multiple times => it can be used in multiple tasks => the cleanup context (which runs every time) shouldn't touch resources from that context. But disaster cleanup should.
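
One possible way to express this split is sketched below. It assumes, purely for illustration, that resources belonging to a persistent context carry their own name prefix; the blueprint does not say how they would be told apart, and `RALLY_PREFIX`, `CONTEXT_PREFIX` and `select_for_cleanup` are hypothetical names.

```python
RALLY_PREFIX = "rally_"         # assumed prefix for everything Rally creates
CONTEXT_PREFIX = "rally_ctx_"   # assumed prefix for persistent-context resources


def select_for_cleanup(names, disaster=False):
    """Pick the resource names that a cleanup run should delete.

    Regular cleanup skips persistent-context resources;
    disaster cleanup takes everything created by Rally.
    """
    rally_names = [n for n in names if n.startswith(RALLY_PREFIX)]
    if disaster:
        return rally_names
    return [n for n in rally_names if not n.startswith(CONTEXT_PREFIX)]


names = ["rally_ctx_network", "rally_server_1", "customer-vm"]

regular = select_for_cleanup(names)                  # per-task cleanup
disaster = select_for_cleanup(names, disaster=True)  # disaster cleanup
print(regular, disaster)
```

Neither mode selects `customer-vm`, since it does not carry the Rally prefix.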

-- boris-42

> So if we use name patterns -> we will just filter all resources by name.startswith("rally_name_pattern")
> or in other words we will delete only Rally-created resources.

I agree with you.
For cleanup, filtering resources by name seems good and simple!!

>> 3) When timeout or some exception occurs during to create resources, those resources may not be cleaned up because Rally cannot get resource information.
>
> Hm? I didn't get this. In case of timeout rally will LOG that some resource wasn't able to delete cause of timeout. This is maximum that you can do in such case. So not sure what this point means..

That can only happen if Rally collects resource information during the test and cleans up using the collected information (which the current Rally does not do). When a timeout occurs during resource creation, Rally cannot get the resource information, so the resource may not be deleted.
If we use resource name filtering for cleanup, we don't need to worry about this problem.

I will try to implement the filter for cleanup.

-- wtakase

Gerrit topic: https://review.openstack.org/#q,topic:bp/add-cleanup-option,n,z

Addressed by: https://review.openstack.org/139643
    Add name pattern filter for resource cleanup
