Pagination

Registered by Roland Hochmuth

Add support for pagination to all requests that return arrays.

Blueprint information

Status: Complete
Approver: Roland Hochmuth
Priority: Essential
Drafter: Roland Hochmuth
Direction: Needs approval
Assignee: Deklan Dieterly
Definition: Drafting
Series goal: None
Implementation: Implemented
Milestone target: None
Started by: Roland Hochmuth
Completed by: Deklan Dieterly

Whiteboard

For measurements, this can be addressed by adding limit and marker query parameters, similar to what has already been done in Ceilometer. The limit parameter specifies the maximum number of elements returned in an array. The marker parameter specifies the sequence ID of the measurement from which to start returning values.

Problem Statements:
1. Current metrics, measurements and statistics queries can result in memory overflow.
2. The metrics query returns an array of metric objects, each with a name and dimensions. This can overflow when too many metrics are returned.
3. The measurements query returns an array of metric objects, each with a name, dimensions and an array of measurements. Two quantities can overflow:
    1. The number of metrics returned.
    2. The number of measurements returned per metric.
4. The statistics query returns an array of metric objects, each with a name, dimensions and an array of statistics. Two quantities can overflow:
    1. The number of metrics returned.
    2. The number of statistics returned per metric.

Goals:
1. Add support for pagination
2. Don't require server side state management.
3. Keep it Simple Sally (KISS)

Proposals & Analysis:
1. Metrics query:
Add marker_name, marker_dimensions and limit query parameters. The marker_name and marker_dimensions together identify the last metric of the previous page, i.e. position (offset - 1) in the list of metrics, and the query resumes after it. The combination of marker_name and marker_dimensions must uniquely identify a metric. If the last query overflowed, use the name and dimensions of the last metric in the response as the marker_name and marker_dimensions in the next query. Limit specifies the maximum number of metrics returned.
Pros:
Current API doesn't change. Only query parameters are added.
Cons:
InfluxDB doesn't have a way of specifying that the series name is greater than some value.
Maybe we can get InfluxDB to add this capability, or we could address it ourselves?
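
A minimal sketch of the metrics proposal above, assuming the API does the filtering in its own code because InfluxDB cannot. The names metric_key and page_metrics and the in-memory list are hypothetical; the ordering (metric name, then sorted dimensions) is also an assumption, chosen only so that a marker identifies one position in the list.

```python
def metric_key(name, dimensions):
    """Sort key: metric name, then dimensions as sorted (key, value) pairs."""
    return (name, tuple(sorted(dimensions.items())))


def page_metrics(metrics, limit, marker_name=None, marker_dimensions=None):
    """Return up to `limit` metrics that sort strictly after the marker.

    `metrics` is a list of {"name": str, "dimensions": dict} objects.
    With no marker, paging starts from the first metric.
    """
    ordered = sorted(metrics, key=lambda m: metric_key(m["name"], m["dimensions"]))
    if marker_name is not None:
        marker = metric_key(marker_name, marker_dimensions or {})
        ordered = [m for m in ordered
                   if metric_key(m["name"], m["dimensions"]) > marker]
    return ordered[:limit]


metrics = [
    {"name": "cpu.user_perc", "dimensions": {"hostname": "b"}},
    {"name": "cpu.idle_perc", "dimensions": {"hostname": "a"}},
    {"name": "cpu.user_perc", "dimensions": {"hostname": "a"}},
]
first = page_metrics(metrics, limit=2)
last = first[-1]
# The last metric of the previous page becomes the marker for the next one.
second = page_metrics(metrics, 2, last["name"], last["dimensions"])
```

No server-side state is kept between the two calls: the marker carried by the client is the only thing that links one page to the next.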

2. Measurements options:
2.1 Return only a single metric.
The Monasca API would be modified to return the measurements for only a single metric. The query would need to uniquely specify the metric name and dimensions for a metric.
A marker and limit would be added as query parameters to specify the (offset - 1) in the list of measurements to start the query from and the number to return.
Pros:
Simplicity. Only a single marker and limit would be needed
Cons:
Must query for each metric independently.
This might be acceptable for a UI, but could impact integration with analytics, third-party applications or archival. For example, to run analytics on all the cpu.user_perc metrics you would need to fetch each one individually.
2.2 Return measurements for multiple metrics:
This is what the Monasca API currently supports
Add marker_name, marker_dimensions, metric_limit, measurement_marker and measurements_limit.
The marker_name, marker_dimensions and metric_limit are similar to the metrics query.
The measurement_marker specifies the (offset - 1) to start from for measurements, and the measurements_limit specifies the maximum number of measurements to return.
Pros:
Can return multiple series in a single query.
Matches the metrics query
Cons:
It doesn't appear that InfluxDB can limit the number of series that are returned. There is no way to ask for all the series starting at a given name. The series are ordered alphabetically. This implies that the limiting would have to be done in client code right now. If a client needed to do repeated queries for pagination purposes, this would hit the InfluxDB server hard, as it would have to return everything and the Monasca API would have to filter it down.
It is difficult to determine which parameter overflowed. Is it the number of metrics or the number of measurements?
It isn't exact. Between the first query and subsequent queries the state could change.

3. Statistics:
3.1 Return statistics for a single metric
Modify the Monasca API to return statistics for only a single metric.
Use start_time, end_time and period to do pagination. The client needs to compute the start_time, end_time and period appropriately to avoid any limits.
3.2 Return statistics for multiple metrics.
Add metric marker_name, marker_dimensions and metric_limit for paginating metrics.
start_time, end_time and period are used to do pagination for statistics.
Pros
Similar to measurements.
Cons
Similar to measurements.
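
One way a client could compute start_time and end_time so that a statistics query stays under a limit is sketched below. The function name time_windows and the max_points parameter are hypothetical, not part of the Monasca API; the sketch only splits a time range into windows of at most max_points periods each.

```python
from datetime import datetime, timedelta


def time_windows(start_time, end_time, period_seconds, max_points):
    """Yield (window_start, window_end) pairs covering [start_time, end_time).

    Each window spans at most `max_points` periods, so a statistics query
    over one window returns at most `max_points` values per metric.
    """
    if period_seconds <= 0 or max_points <= 0:
        raise ValueError("period_seconds and max_points must be positive")
    step = timedelta(seconds=period_seconds * max_points)
    window_start = start_time
    while window_start < end_time:
        window_end = min(window_start + step, end_time)
        yield window_start, window_end
        window_start = window_end


# One day of 5-minute statistics, at most 100 values per request.
windows = list(time_windows(datetime(2015, 1, 1), datetime(2015, 1, 2),
                            period_seconds=300, max_points=100))
```

Each pair would be sent as the start_time and end_time of one request, with the same period on every request.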

> Question: Regarding backwards compatibility, if no limit is specified on the request, is pagination disabled?

A nice description of how this works with Swift, and how I think this would work with Monasca, is at https://docs.hpcloud.com/publiccloud/api/object-storage/.

When doing a GET request against an account or container, the service returns a maximum of 10,000 names per request. To retrieve subsequent names, you must make another request with a marker parameter. The marker indicates where the last list left off; the system returns names greater than this marker, up to 10,000 again. Note that the marker value should be URL-encoded prior to sending the HTTP request.

If 10,000 is larger than desired, a limit parameter may be given.

If the number of names returned equals the limit given (or 10,000 if no limit is given), it can be assumed there are more names to be listed. If the name list is exactly divisible by the limit, the last request has no content.

For example, let's use a listing of five names

apples
bananas
kiwis
oranges
pears

We'll use a limit of two to show how things work:

  GET /v1/1234567891012345?limit=2

apples
bananas

Since two items were received, you can assume there are more names to list, so you make another request with a marker of the last item returned:

  GET /v1/1234567891012345?limit=2&marker=bananas

kiwis
oranges

Again, two items are returned; there may be more:

  GET /v1/1234567891012345?limit=2&marker=oranges

pears

With this one-item response we received less than the limit number of names, indicating that this is the end of the list.
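
The client side of this exchange can be sketched as a loop that stops when a page is shorter than the limit. Here list_names is a hypothetical in-memory stand-in for the GET request, not a real Swift client call.

```python
NAMES = ["apples", "bananas", "kiwis", "oranges", "pears"]


def list_names(limit, marker=None):
    """Stand-in for the server: names greater than the marker, up to limit."""
    after = [n for n in sorted(NAMES) if marker is None or n > marker]
    return after[:limit]


def list_all(limit):
    """Client loop: keep requesting until a page is shorter than the limit."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    names, marker = [], None
    while True:
        page = list_names(limit, marker)
        names.extend(page)
        if len(page) < limit:
            return names
        # A full page may be followed by more names; resume after the last one.
        marker = page[-1]
```

With a limit of 5 the first page is full, so the loop issues one more request and receives an empty page, which is the "exactly divisible" case described above.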

Also, take a look at https://docs.hpcloud.com/publiccloud/api/compute/#Pagination-jumplink-span

Paginated Collections

To reduce load on the service, list operations limit the number of items that can be returned by a single call.

To navigate a collection, the parameters limit and marker can be set in the URI (e.g. ?limit=100&marker=1234). The marker parameter is the ID of the last item in the previous list. Items are sorted by create time in descending order. When a create time is not available they are sorted by ID. The limit parameter sets the page size. A maximum of 1000 items are returned by a single call. Setting limit to a value greater than 1000 has no effect. A marker with an invalid ID will return a badRequest (400) fault.
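
The rules in this paragraph can be sketched as follows. The names list_page and BadRequest and the item layout are hypothetical; the exception merely stands in for the HTTP 400 fault, and the fallback to sorting by ID is left out for brevity.

```python
MAX_LIMIT = 1000


class BadRequest(Exception):
    """Stands in for an HTTP 400 badRequest fault."""


def list_page(items, limit=MAX_LIMIT, marker=None):
    """Return the page following `marker` from items sorted newest first."""
    limit = min(limit, MAX_LIMIT)  # a larger limit has no effect
    ordered = sorted(items, key=lambda item: item["created"], reverse=True)
    start = 0
    if marker is not None:
        ids = [item["id"] for item in ordered]
        if marker not in ids:
            raise BadRequest("unknown marker: %s" % marker)
        start = ids.index(marker) + 1
    return ordered[start:start + limit]


items = [{"id": str(i), "created": i} for i in range(5)]
# Newest first: 4, 3, 2, 1, 0. The page after marker "3" starts at "2".
page = list_page(items, limit=2, marker="3")
```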

PROPOSAL BY DEKLAN

For each request that currently returns an unbounded number of elements in the Monasca API, add a ‘links’ element at the top of the JSON that has ‘prev’ and ‘next’ links with an ‘offset’ query parameter. The offset query parameter contains the sequence number at which to start retrieving elements for that particular request. Each request will be limited to returning a predefined number of elements. Something like 100 or 1000 could be used. If no offset is present in the URL, then 0 will be assumed.

The following endpoints will need to be modified to add the new ‘links’ element.

1. Alarm-definition-list
2. Alarm-history
3. Alarm-history-list
4. Alarm-list
5. Measurement-list
6. Metric-list
7. Notification-list

The measurement-list is unique in that it is a list of lists. For that, the inner lists of measurements will have a ‘links’ element that will allow clients to navigate that internal list. A default number of elements in the internal list of measurements will be used. Something like 1000. If users want more measurements for a particular metric, then they will be required to drill down on that particular metric.

To make this possible, an auto-increment ‘sequence_no’ column will be added to the following tables in the MySQL DB:
alarm_definition
alarm
notification_method
The offset and the MySQL LIMIT clause will be used to retrieve the next set of elements from the DB, e.g. ‘select … where sequence_no >= n order by sequence_no limit 1000’.
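
A minimal sketch of the sequence_no query, using Python's built-in sqlite3 module as a stand-in for MySQL so that it runs without a server. The table layout is reduced to two columns and fetch_page is a hypothetical helper; only the shape of the WHERE, ORDER BY and LIMIT clauses is taken from the proposal.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE alarm_definition ("
    " sequence_no INTEGER PRIMARY KEY AUTOINCREMENT,"
    " name TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO alarm_definition (name) VALUES (?)",
    [("definition-%d" % i,) for i in range(25)],
)


def fetch_page(conn, offset, limit):
    """Return up to `limit` rows whose sequence_no is >= offset."""
    cursor = conn.execute(
        "SELECT sequence_no, name FROM alarm_definition"
        " WHERE sequence_no >= ? ORDER BY sequence_no LIMIT ?",
        (offset, limit),
    )
    return cursor.fetchall()


page = fetch_page(conn, offset=11, limit=10)
```

In this sketch the sequence numbers start at 1. Because deleted rows leave gaps, the offset here is a sequence value rather than a count of rows to skip.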

For InfluxDB we will need the 0.9.0 release, available in early Jan 2015. The details of how to make this work with InfluxDB 0.9.0 are still not clear at this point.

Example:

Get alarm definitions 1000 through 1999 for a tenant.

curl -i -X GET -H 'X-Auth-User: mini-mon' -H 'X-Auth-Token: 20e4b0c89e3042b5a623b798680fa0b7' -H 'X-Auth-Key: password' -H 'Accept: application/json' -H 'User-Agent: python-monascaclient' -H 'Content-Type: application/json' 'http://192.168.10.4:8080/v2.0/alarm-definitions?offset=1000'

{
    "links": [
        {
            "rel": "next",
            "href": "http://192.168.10.4:8080/v2.0/alarm-definitions?offset=2000"
        },
        {
            "rel": "prev",
            "href": "http://192.168.10.4:8080/v2.0/alarm-definitions?offset=0"
        }
    ],
    "elements": [
        {
            "id": "4a735f31-650d-42fb-af50-be2067ee886f",
            "links": [
                {
                    "rel": "self",
                    "href": "http://192.168.10.4:8080/v2.0/alarm-definitions/4a735f31-650d-42fb-af50-be2067ee886f"
                }
            ],
            "name": "Disk Inode Usage",
            "description": "",
            "expression": "disk.inode_used_perc > 90",
            "match_by": [
                "hostname"
            ],
            "severity": "LOW",
            "actions_enabled": true,
            "alarm_actions": [

            ],
            "ok_actions": [

            ],
            "undetermined_actions": [

            ]
        },
        {
            "id": "6f79ff76-7a75-4b2c-b8bb-02ca91905284",
            "links": [
                {
                    "rel": "self",
                    "href": "http://192.168.10.4:8080/v2.0/alarm-definitions/6f79ff76-7a75-4b2c-b8bb-02ca91905284"
                }
            ],
            "name": "High CPU usage",
            "description": "",
            "expression": "avg(cpu.idle_perc) < 10 times 3",
            "match_by": [
                "hostname"
            ],
            "severity": "LOW",
            "actions_enabled": true,
            "alarm_actions": [

            ],
            "ok_actions": [

            ],
            "undetermined_actions": [

            ]
        }
    ]
}
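
How the ‘next’ and ‘prev’ links in this example could be derived from the offset is sketched below. build_links is a hypothetical helper, and omitting ‘next’ on a short page and ‘prev’ on the first page are assumptions that the proposal does not spell out.

```python
def build_links(base_url, offset, limit, returned):
    """Build the 'links' element for one page.

    `returned` is how many elements this page holds; a full page is taken
    to mean that a next page may exist.
    """
    links = []
    if returned == limit:
        links.append({"rel": "next",
                      "href": "%s?offset=%d" % (base_url, offset + limit)})
    if offset > 0:
        links.append({"rel": "prev",
                      "href": "%s?offset=%d" % (base_url, max(offset - limit, 0))})
    return links


links = build_links("http://192.168.10.4:8080/v2.0/alarm-definitions",
                    offset=1000, limit=1000, returned=1000)
```

For offset=1000 and a page size of 1000 this yields the same two hrefs as the example response: offset=2000 for ‘next’ and offset=0 for ‘prev’.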

Update 2/24/2015

InfluxDB 0.9 RC3 has been released. Some decisions have been made so that pagination works with the facilities available in InfluxDB.

1. Measurement list and statistics queries will be limited to returning elements for a single metric.

2. Grafana has to change to handle only a single metric returned in measurement queries.

3. The API will return an error if more than one metric could be returned for a measurement list or statistics query.

4. The JSON will not be changed to handle a single metric. This will allow for multiple metrics in the future if we decide to use Vertica.

5. Start and end time will be used for measurement list and statistics to limit the number of elements returned for a single metric.

6. All responses will be changed to JSON objects. Currently some responses are lists.

7. We will use offset and limit for the other queries, i.e. everything except measurement list and statistics.

Gerrit topic: https://review.openstack.org/#q,topic:feature/pagination,n,z

Addressed by: https://review.openstack.org/145624
    Add pagination

Addressed by: https://review.openstack.org/146575
    Add pagination

Once pagination is supported in the API as described above, it should also be exposed in the Python client.
Here are the changes made to other services to support pagination in their Python clients:
Novaclient: https://github.com/openstack/python-novaclient/blob/master/novaclient/v1_1/servers.py#L571
Troveclient:
http://docs.openstack.org/developer/python-troveclient/usage.html#listing-instances-and-pagination
https://github.com/openstack/python-troveclient/blob/6f8b0d09bdcd0f627b8b5e6227e2d6232d9dbf78/troveclient/base.py#L65
Glanceclient:
https://github.com/openstack/python-glanceclient/blob/9a4d8580e890c3c55c2d02904f5f6983bd06bd1c/glanceclient/v2/metadefs.py#L91
Heatclient:
https://github.com/openstack/python-heatclient/blob/78af7de8e590cfcae80958c62fe9176bee0ffda5/heatclient/v1/stacks.py#L86

The Monasca UI will then need to change to use the Horizon support for pagination, as described in
https://horizon-openstack-dashboard.readthedocs.org/en/latest/ref/tables.html#class-based-views

Gerrit topic: https://review.openstack.org/#q,topic:pagination,n,z

Addressed by: https://review.openstack.org/152190
    First pass of pagination support for list queries

Addressed by: https://review.openstack.org/160029
    Add pagination support
