Period-spanning statistics

Registered by Eoghan Glynn on 2014-03-11

Currently, within the innards of the statistics API, we treat each period as totally ring-fenced: effectively a standalone bucket of datapoints.

The aggregation functions are all written to consume only those data originating from each individual period as input, with no good way of allowing knowledge from previous periods to be taken into account.

However, there is a class of useful statistical techniques that naturally span periods, ranging from simple moving averages to more sophisticated exponential smoothing, with the potential for forecasting and prediction bands.
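
As a rough illustration of the two ends of that range, the sketch below computes a simple moving average and a basic exponential smoothing over a list of per-period averages. The function names, the window size, and the smoothing factor alpha are assumptions made for this example; none of them come from the statistics API.

```python
from collections import deque


def simple_moving_average(period_values, window):
    """Yield the mean of the most recent `window` per-period values."""
    recent = deque(maxlen=window)  # older periods fall off automatically
    for value in period_values:
        recent.append(value)
        yield sum(recent) / len(recent)


def exponential_smoothing(period_values, alpha):
    """Yield s_t = alpha * x_t + (1 - alpha) * s_(t-1), seeded with x_0."""
    smoothed = None
    for value in period_values:
        if smoothed is None:
            smoothed = value  # the first period has no history to blend with
        else:
            smoothed = alpha * value + (1 - alpha) * smoothed
        yield smoothed


# Hypothetical per-period averages, one value per period.
period_averages = [10.0, 12.0, 11.0, 15.0]
sma = list(simple_moving_average(period_averages, window=2))
ewma = list(exponential_smoothing(period_averages, alpha=0.5))
print(sma)   # [10.0, 11.0, 11.5, 13.0]
print(ewma)  # [10.0, 11.0, 11.0, 13.0]
```

Both functions need values from earlier periods, which is exactly what the current per-period aggregation cannot supply.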

The detailed design would have to consider issues such as whether to calculate via a sliding über-period, or by allowing previous results to feed back into the calculation for the next period.
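
The two options can be contrasted with a minimal sketch, assuming each period is available as a plain list of datapoints. The names sliding_uber_period and feed_back, the span parameter, and the choice of exponential smoothing as the feedback rule are all hypothetical and only serve to show the difference in shape.

```python
def sliding_uber_period(buckets, span):
    """Option (a): re-aggregate the raw datapoints of the last `span` periods."""
    results = []
    for i in range(len(buckets)):
        start = max(0, i - span + 1)
        window = [x for bucket in buckets[start:i + 1] for x in bucket]
        results.append(sum(window) / len(window) if window else None)
    return results


def feed_back(buckets, alpha):
    """Option (b): carry only the previous result into the next period."""
    results = []
    previous = None
    for bucket in buckets:
        if bucket:
            current = sum(bucket) / len(bucket)
            if previous is None:
                previous = current
            else:
                previous = alpha * current + (1 - alpha) * previous
        # An empty period simply repeats the previous result.
        results.append(previous)
    return results


# Three hypothetical periods, each a bucket of raw datapoints.
buckets = [[1.0, 3.0], [5.0], [2.0, 5.0]]
print(sliding_uber_period(buckets, span=2))  # [2.0, 3.0, 4.0]
print(feed_back(buckets, alpha=0.5))         # [2.0, 3.5, 3.5]
```

The sliding approach has to touch the raw datapoints of several periods for every result, while the feedback approach keeps only a small piece of state between periods, which matters for how each storage backend would implement it.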

From an implementation point of view, one of the challenges would be ensuring that an efficient, tailored approach is taken with each of a range of distinct data manipulation methods (e.g. doing this via map-reduce on MongoDB would take quite a different way of looking at the problem, as compared to, say, the most natural SQLAlchemy query for achieving the same thing).

The potential use cases for these period-spanning techniques would include dynamic thresholds for alarming (e.g. alarm when the observed values fall outside prediction bands).
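
A minimal sketch of such a dynamic threshold follows, assuming the band is built from an exponentially smoothed level plus or minus a multiple of an exponentially smoothed absolute deviation. The function name band_alarms and the parameters alpha and width are assumptions for illustration, not part of any existing alarm definition.

```python
def band_alarms(period_values, alpha=0.5, width=2.0):
    """Return the indices of periods whose value falls outside the band
    predicted from earlier periods (level +/- width * deviation)."""
    alarms = []
    level = None      # smoothed estimate of the expected value
    deviation = 0.0   # smoothed absolute error of past predictions
    for index, value in enumerate(period_values):
        if level is None:
            level = value  # first period only seeds the estimate
            continue
        error = abs(value - level)
        # No alarm is possible until some deviation has been observed.
        if deviation > 0 and error > width * deviation:
            alarms.append(index)
        # Update the band state for the next period.
        deviation = alpha * error + (1 - alpha) * deviation
        level = alpha * value + (1 - alpha) * level
    return alarms


# Hypothetical per-period averages with one obvious spike.
observed = [10.0, 11.0, 10.0, 11.0, 30.0, 11.0]
print(band_alarms(observed))  # [4]
```

In this sketch the spike also widens the band for later periods; whether anomalous values should be allowed to update the band is a design choice the blueprint leaves open.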

Blueprint information

Status: Complete
Approver: None
Priority: Undefined
Drafter: Eoghan Glynn
Direction: Needs approval
Assignee: None
Definition: Obsolete
Series goal: None
Implementation: Unknown
Milestone target: None
Completed by: gordon chung on 2017-06-20

Whiteboard

deprecated api -- gordc (2017.06)
