Robust range handling in http GET requests

Registered by Vincent Ladeuil

To minimize http transfers, bzr issues ranged requests: instead of downloading whole files, it requests only parts of them by issuing GET requests with a header specifying which ranges it is interested in.
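As an illustration, a multi-range request carries a `Range` header listing inclusive byte offsets. The helper below is a minimal sketch of how such a header value could be built; the function name `range_header` and the `(start, end)` tuple convention are assumptions made for this example, not bzr's actual code.

```python
def range_header(ranges):
    """Build a Range header value from (start, end) byte offsets.

    Both offsets are inclusive, as in the HTTP byte-range syntax.
    The name and signature are hypothetical, for illustration only.
    """
    specs = ",".join("%d-%d" % (start, end) for start, end in sorted(ranges))
    return "bytes=" + specs


# Two ranges requested in a single GET request.
header_value = range_header([(200, 299), (0, 99)])
```

Here `header_value` is the string a client would send as the value of the `Range` header.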

Some http servers and proxies do not implement this feature, or implement it badly (see bugs #62029 and #62276).

Bzr can be made more robust by implementing the following scheme:

- initially the http transport will try to issue multi-range requests,

- when the transport detects that a ranged GET request is returning bogus results, it will issue a new request with a single range. With the ranges sorted, that single range starts at the start of the first range and ends at the end of the last range (i.e. a single range enclosing all the requested ranges),

- when issuing a single range request, if bogus results are detected, the transport will issue a GET request for the whole file and process the ranges locally.
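The two fallback steps above can be sketched as follows. This is only an illustration under stated assumptions: the helper names `enclosing_range` and `extract_ranges` are hypothetical, ranges are `(start, end)` tuples with inclusive offsets, and the end of the enclosing range is taken as the largest end offset so that overlapping ranges are also covered.

```python
def enclosing_range(ranges):
    """Return the single (start, end) range covering all requested ranges.

    Hypothetical helper: offsets are inclusive. The largest end offset is
    used so that overlapping or nested ranges are still fully enclosed.
    """
    ordered = sorted(ranges)
    start = ordered[0][0]
    end = max(range_end for _, range_end in ordered)
    return start, end


def extract_ranges(data, data_offset, ranges):
    """Slice the requested ranges out of a larger body, locally.

    `data` is the body actually received and `data_offset` is the file
    offset of its first byte (0 when the whole file was downloaded).
    """
    return [data[start - data_offset:end - data_offset + 1]
            for start, end in ranges]


wanted = [(7, 8), (2, 4)]
single = enclosing_range(wanted)                      # one range to request
whole_file = b"abcdefghij"                            # stand-in for a full GET
pieces = extract_ranges(whole_file, 0, wanted)        # ranges processed locally
```

The same `extract_ranges` call serves both fallbacks: with a single enclosing range, `data_offset` is the start of that range; with a whole-file GET, it is 0.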

Both fallback steps are persistent: once a transport has established that a server does not properly support multi-range or single-range requests, it will never issue that kind of request to that server again.

If the transport is cloned (for connection sharing, for example), it will transmit that information to the cloned transport.

If the connection has to be closed to handle an error and is then reopened against the same server, that information should be preserved as well.
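The persistence rules can be sketched as a small piece of state carried by the transport. This is a minimal illustration, not the code from the branch: the class `SketchTransport`, its attribute `range_hint`, and the methods `degrade`, `clone` and `reconnect` are all hypothetical names chosen for this example.

```python
class SketchTransport:
    """Hypothetical transport remembering what range support a server has."""

    def __init__(self, base, from_transport=None):
        self.base = base
        self.connected = True
        if from_transport is not None:
            # A cloned transport inherits what is already known.
            self.range_hint = from_transport.range_hint
        else:
            # Optimistic default: try multi-range requests first.
            self.range_hint = "multi"

    def degrade(self):
        """Step down one level; return False when nothing is left to try."""
        if self.range_hint == "multi":
            self.range_hint = "single"
            return True
        if self.range_hint == "single":
            self.range_hint = "none"  # whole-file GET, ranges handled locally
            return True
        return False

    def clone(self):
        """Create a transport for the same server, keeping the hint."""
        return SketchTransport(self.base, from_transport=self)

    def reconnect(self):
        """Close and reopen the connection; the hint is deliberately kept."""
        self.connected = False
        self.connected = True


transport = SketchTransport("http://example.com/repo/")
transport.degrade()          # the server returned bogus multi-range results
clone = transport.clone()    # the clone starts directly with single ranges
transport.reconnect()        # an error forced a new connection
```

In this sketch the hint is copied at clone time only; whether a later degradation on a clone should also be reported back to the original transport is left open here.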

An implementation is available in the bzr.urllib.keepalive branch (that branch is needed for the tests, but the code may be simple enough to be backported to bzr.dev).

Blueprint information

Status:
Complete
Approver:
John A Meinel
Priority:
High
Drafter:
Vincent Ladeuil
Direction:
Needs approval
Assignee:
Vincent Ladeuil
Definition:
Approved
Series goal:
None
Implementation:
Implemented
Milestone target:
0.13
Started by
Vincent Ladeuil
Completed by
Vincent Ladeuil
