Ceilometer

Event Triggers

Registered by Monsyne Dragon on 2013-08-02

When processing notification events, it is often useful to aggregate data from notifications that are emitted over a wide span of time. These may arrive at different collectors (if there are multiple collector workers) or even in arbitrary order, due to the vagaries of RabbitMQ or to different components within an OpenStack project (like nova) working at varying rates. Triggers allow for such aggregation.
A trigger has a definition consisting of a pattern, firing criteria, a ttl, a triggered notification pipeline, and an optional expiration notification pipeline.
The pattern is a specification for event types and notification data values that match the trigger. When the first notification that matches a trigger definition's pattern is detected, a trigger is created. The trigger keeps a list of the notifications that have matched its pattern. When the criteria are met, the trigger fires, sending all of the collected notifications to the associated pipeline as a batch for processing. If the ttl expires without the trigger firing, the collected notifications are sent to the expiration pipeline, if there is one. Once fired or expired, the trigger is deleted.
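The lifecycle described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Ceilometer code: the names TriggerDefinition, Trigger, TriggerManager and key_fn are hypothetical, the pattern is reduced to a glob on event_type plus a grouping key taken from the notification data, and state lives in an in-process dict (a real deployment with several collector workers would need shared storage).

```python
import fnmatch
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Notification = Dict[str, object]
Pipeline = Callable[[List[Notification]], None]


@dataclass
class TriggerDefinition:
    """Hypothetical definition: pattern, criteria, ttl and two pipelines."""
    event_type_pattern: str                          # glob such as "compute.instance.create.*"
    criteria: Callable[[List[Notification]], bool]   # True once the trigger should fire
    ttl: float                                       # seconds a trigger may stay open
    fire_pipeline: Pipeline
    expire_pipeline: Optional[Pipeline] = None

    def matches(self, notification: Notification) -> bool:
        event_type = str(notification.get("event_type", ""))
        return fnmatch.fnmatchcase(event_type, self.event_type_pattern)


@dataclass
class Trigger:
    """One live trigger: the notifications collected so far for one key."""
    created_at: float
    notifications: List[Notification] = field(default_factory=list)


class TriggerManager:
    """Keeps live triggers in memory (an assumption made for this sketch)."""

    def __init__(self, definition: TriggerDefinition,
                 key_fn: Callable[[Notification], object]):
        self.definition = definition
        self.key_fn = key_fn          # groups notifications, e.g. by instance uuid
        self.active: Dict[object, Trigger] = {}

    def on_notification(self, notification: Notification, now: float) -> None:
        if not self.definition.matches(notification):
            return
        key = self.key_fn(notification)
        trigger = self.active.get(key)
        if trigger is None:
            # The first matching notification creates the trigger.
            trigger = Trigger(created_at=now)
            self.active[key] = trigger
        trigger.notifications.append(notification)
        if self.definition.criteria(trigger.notifications):
            # Fire: hand the whole batch to the pipeline and delete the trigger.
            del self.active[key]
            self.definition.fire_pipeline(trigger.notifications)

    def expire(self, now: float) -> None:
        # Intended to be called periodically; expires triggers older than the ttl.
        for key, trigger in list(self.active.items()):
            if now - trigger.created_at >= self.definition.ttl:
                del self.active[key]
                if self.definition.expire_pipeline is not None:
                    self.definition.expire_pipeline(trigger.notifications)
```

Passing the current time in explicitly (now) rather than reading a clock inside the manager is a choice made here to keep the sketch easy to test.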
Example:
You want to meter average build time statistics.
A trigger for build_time:$instance_uuid is defined, with a pattern matching compute.instance.create.* events for a given instance_uuid. The criteria are that there is at least one compute.instance.create.start event and at least one compute.instance.create.end event. The ttl is 4 hours (after which the user considers the build failed). There is a notification pipeline that filters out extra .start events (which can be generated by nova's rescheduling mechanism) and generates a timedelta sample between the timestamps of the .start and .end events, to drive an average build time meter. If the ttl expires, the expiration pipeline generates a sample to increment a cumulative build failure meter.
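As a rough sketch of the two pipelines in this example, the functions below compute the build-time sample and the failure sample from a batch of collected notifications. Several details are assumptions rather than part of the blueprint: the meter names "instance.build_time" and "instance.build_failures", the sample dictionaries, the use of an ISO 8601 "timestamp" field, and the choice to keep the earliest .start event when rescheduling has produced several.

```python
from datetime import datetime

BUILD_TTL_SECONDS = 4 * 60 * 60  # the 4 hour ttl from the example

START = "compute.instance.create.start"
END = "compute.instance.create.end"


def _timestamp(notification):
    # Assumes an ISO 8601 string in the "timestamp" field.
    return datetime.fromisoformat(notification["timestamp"])


def build_criteria(notifications):
    """True once at least one .start and at least one .end have been seen."""
    seen = {n["event_type"] for n in notifications}
    return START in seen and END in seen


def build_time_sample(notifications):
    """Triggered pipeline: timedelta between the .start and .end events."""
    starts = [n for n in notifications if n["event_type"] == START]
    ends = [n for n in notifications if n["event_type"] == END]
    # Extra .start events are dropped by keeping only the earliest one
    # (an assumption; arrival order is not relied upon).
    start = min(_timestamp(n) for n in starts)
    end = max(_timestamp(n) for n in ends)
    return {
        "name": "instance.build_time",   # hypothetical meter name
        "unit": "s",
        "volume": (end - start).total_seconds(),
    }


def build_failure_sample(notifications):
    """Expiration pipeline: one increment for a cumulative failure meter."""
    return {
        "name": "instance.build_failures",  # hypothetical meter name
        "type": "cumulative",
        "volume": 1,
    }
```

Averaging the build_time samples is left to whatever consumes the meter; the pipeline only has to emit one sample per completed build.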

Blueprint information

Status: Complete

Approver: None

Priority: Undefined

Drafter: Monsyne Dragon

Direction: Needs approval

Assignee: None

Definition: Obsolete

Series goal: None

Implementation: Unknown

Milestone target: next

Completed by: gordon chung on 2018-02-14

Whiteboard

Looks great Dragon. Here's a use-case to consider (from StackTach):

1. Trigger on a new Request ID.
2. Use an "all events" (*) wildcard for related events (they could come from anywhere).
3. Determine the start if the service is an API node (likely extracted from the event name).
4. Programmatically determine how long to wait for the last operation and mark the time difference, or stop on something specific like compute.instance.run_instance.end for that particular operation.

More on #4: the "wait" time would be, let's say, 5 minutes for a tiny instance but could be 5 hours for a large Windows image. We may not know what the last event we'll see is, so we have to wait it out.
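One way to picture the programmatic wait in #4 is a ttl computed from the first notification instead of a fixed value. The sketch below is purely illustrative: the function ttl_for, the lookup table, and the payload fields "instance_type" and "os_type" are assumptions, and only the 5 minute and 5 hour figures come from the comment above.

```python
FIVE_MINUTES = 5 * 60
FIVE_HOURS = 5 * 60 * 60

# Hypothetical table mapping (flavor, OS type) to a wait time in seconds.
TTL_TABLE = {
    ("m1.tiny", "linux"): FIVE_MINUTES,
    ("m1.xlarge", "windows"): FIVE_HOURS,
}


def ttl_for(first_notification, ttl_table=TTL_TABLE, default_ttl=FIVE_HOURS):
    """Pick how long to wait based on the notification that opened the trigger.

    Unknown combinations fall back to the longest wait, since the last
    event is not known in advance and has to be waited out.
    """
    payload = first_notification.get("payload", {})
    key = (payload.get("instance_type"), payload.get("os_type"))
    return ttl_table.get(key, default_ttl)
```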

closing this as it's super idle. that said, making the notification agents stateful is probably a bad idea as it adds quite a bit of complexity and will probably impact scalability in addition to durability. this seems better handled by storage drivers where it's more durable and can provide similar response time i believe. for example, gnocchi can capture build time easily rather than spawning a thread to watch for specific events. please reopen as wishlist bug if still interested -- gordc (2018.02)

Subscribers

Adrian Otto

John Herndon

Monsyne Dragon

Patrick Petit

Swann Croiset

yuntongjin