apt-get hashsum/size mismatch because s3 mirrors don't support http pipelining correctly

Bug #948461 reported by Ben Howard
This bug affects 7 people
Affects              Status        Importance  Assigned to  Milestone
apt (Ubuntu)         Fix Released  Low         Unassigned
  Hardy              Won't Fix     Undecided   Unassigned
  Lucid              Won't Fix     Undecided   Unassigned
  Maverick           Won't Fix     Undecided   Unassigned
  Natty              Won't Fix     Undecided   Unassigned
  Oneiric            Won't Fix     Undecided   Unassigned
  Precise            Won't Fix     High        Unassigned
cloud-init (Ubuntu)  Fix Released  High        Unassigned
  Hardy              Won't Fix     High        Unassigned
  Lucid              Fix Released  High        Unassigned
  Maverick           Fix Released  High        Unassigned
  Natty              Fix Released  High        Unassigned
  Oneiric            Fix Released  High        Unassigned
  Precise            Fix Released  High        Unassigned

Bug Description

[Problem]
When using S3 mirrors, the apt client pipelines requests for files A, B, C, D on the same connection, but S3 responds with files B, C, D, A or some other arbitrary ordering. Apt then saves the files under the wrong file names.

This is due to a bug in Amazon AWS S3, where HTTP pipelining does not work correctly, in violation of RFC 2068. apt currently uses HTTP pipelining, so this prevents apt from using mirrors hosted on S3.
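
To make the failure mode concrete, here is a toy simulation (not apt's actual code). It assumes the client attributes the Nth response on a connection to the Nth request it sent, which is what a pipelining client has to do because HTTP/1.1 responses carry no request identifier. The function name pair_responses and the file names are made up for illustration.

```shell
#!/bin/sh
# Toy model of HTTP pipelining: the Nth response on a connection is
# attributed to the Nth request. All names here are illustrative.

# pair_responses "<requests>" "<responses>"
# Both arguments are space-separated lists in wire order.
pair_responses() {
    requests=$1
    responses=$2
    for req in $requests; do
        # Take the first remaining response and drop it from the list.
        resp=${responses%% *}
        case $responses in
            *" "*) responses=${responses#* } ;;
            *)     responses="" ;;
        esac
        if [ "$req" = "$resp" ]; then
            echo "$req.deb <- content of $resp (ok)"
        else
            echo "$req.deb <- content of $resp (MISNAMED)"
        fi
    done
}

# Requests sent as A,B,C,D; server answers B,C,D,A as described above.
pair_responses "A B C D" "B C D A"
```

With the ordering from the description, every file ends up holding another file's content, even though each individual download is complete.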

[Impact]
Blocks use of S3 mirrors for providing Ubuntu updates. Affects Hardy, Lucid, Maverick, Oneiric, Natty, and Precise.

Use of S3 mirrors is considered as a way of enhancing our mirror services. Currently mirrors have moments of instability, so having the option of switching to S3 mirrors is very important for update reliability.

[Development Fix]
Fixed in Precise's cloud-init as of these commits, which add an apt_pipelining option (pipelining is turned off by default, but configurable).
http://bazaar.launchpad.net/~cloud-init-dev/cloud-init/trunk/revision/536
http://bazaar.launchpad.net/~cloud-init-dev/cloud-init/trunk/revision/537
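
A minimal sketch of how the option might be set explicitly through cloud-config user-data. Only the option name apt_pipelining comes from this report; the value "False" is assumed to mean "disable pipelining", and the output path is an arbitrary example. Other accepted values should be checked against the cloud-init documentation for the release in use.

```shell
#!/bin/sh
# Write a minimal cloud-config user-data file that asks cloud-init to disable
# apt HTTP pipelining. The output path is an arbitrary example.

write_user_data() {
    out_file=$1
    cat > "$out_file" <<'EOF'
#cloud-config
# apt_pipelining is the option named in this bug; "False" is assumed to
# mean "disable pipelining" (Pipeline-Depth 0).
apt_pipelining: False
EOF
}

write_user_data "${TMPDIR:-/tmp}/user-data-no-pipelining.yaml"
```

On images carrying the fixed cloud-init this should already be the default, so setting it explicitly mainly documents the intent.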

[Stable Fix]
TODO

[Test Case]
1) Launch an instance in EC2
2) Run: sed -i "s,ec2.archive.ubuntu.com,ec2.archive.ubuntu.com.s3.amazonaws.com,g" /etc/apt/sources.list
3) Set your apt debug levels to whatever you want
4) Run "apt-get -y update"
5) Run "apt-get -y install firefox xorg libreoffice > /tmp/install.log 2>&1"
Broken Behavior: Cached files are being misnamed
Fixed Behavior: Cached files are properly named

The S3 EC2 mirrors seem to reproduce it most reliably. The downloaded files are whole and match valid package checksums, but each is saved under another package's file name, so its content does not match its name.
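
The "misnamed" check in the test case can be sketched as follows, assuming cached files follow the usual <name>_<version>_<arch>.deb naming and that dpkg-deb is available. The helper names pkg_from_filename and check_dir are hypothetical.

```shell
#!/bin/sh
# Flag cached .deb files whose control "Package" field disagrees with the
# package name encoded in the file name (<name>_<version>_<arch>.deb).

# Print the package name implied by a .deb file name.
pkg_from_filename() {
    base=${1##*/}          # strip any directory part
    echo "${base%%_*}"     # keep everything before the first underscore
}

# Check every .deb in a directory; needs dpkg-deb, so bail out if it is absent.
check_dir() {
    dir=$1
    command -v dpkg-deb >/dev/null 2>&1 || { echo "dpkg-deb not found" >&2; return 2; }
    rc=0
    for deb in "$dir"/*.deb; do
        [ -e "$deb" ] || continue
        want=$(pkg_from_filename "$deb")
        have=$(dpkg-deb -f "$deb" Package 2>/dev/null)
        if [ "$want" != "$have" ]; then
            echo "MISNAMED: $deb contains '$have'"
            rc=1
        fi
    done
    return $rc
}
```

Usage after a failed install would be: check_dir /var/cache/apt/archives/partial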

[Regression Potential]
HTTP pipelining is an optimization technique in apt which allows multiple requests to be sent on one connection without waiting for each response. When it works, it provides a small performance boost, but when it doesn't, the failures are arbitrary and not adequately explained.

Thus, the expected regression is slightly slower apt behavior for non-S3 cloud users. However, testing shows the impact is less than a few seconds, so it will likely go unnoticed. In the off chance that it does inadvertently cause a non-trivial performance impact, the change is configurable on a per-instance basis.

[Original Report]
[SRU: Cloud-Init]
r536 and r537 contain a fix that:
  1.) Disables APT pipelining on first boot for new Cloud Images instances
  2.) Disables APT pipelining on installation for existing cloud-images.

The rationale for this SRU is the mirror situation for EC2 Cloud images. Currently, the mirrors are proving to have moments of instability that are causing both end-user and development pain. In order to address this, we built out Amazon AWS S3 mirrors. Unfortunately, due to a bug in S3, apt HTTP pipelining does not work with S3. As a result, in order to provide enhanced availability for the Cloud Image repositories in EC2, apt HTTP pipelining needs to be disabled ahead of flipping the new S3 mirrors as the active mirrors.

The performance impact of this change is unnoticeable and, in testing, has proven to have no ill effect. HTTP pipelining allows multiple requests to be made at the same time. The apt changelog states that HTTP pipelining is a "micro-enhancement", and within the EC2 environment it shaves a few seconds off many requests. When apt HTTP pipelining works, at best it saves a few seconds; when it fails against a buggy HTTP/1.1 server or behind a proxy that does not support it, the failures are arbitrary and not adequately explained. This change will enhance the Cloud Images by making apt request packages sequentially instead of in batches and prevent arbitrary failures caused by non-compliant web servers. Cloud image users, both on EC2 and outside, will benefit from this fix.

http://bazaar.launchpad.net/~cloud-init-dev/cloud-init/trunk/revision/536
http://bazaar.launchpad.net/~cloud-init-dev/cloud-init/trunk/revision/537

_______________________________________________________________________________________
apt-get appears to be mangling the local cache file names, which shows up as a hashsum or size mismatch.

Evidence below:

_______________________________________________________________________________________
Output of "apt-get -y install firefox":
Failed to fetch http://us-east-1.ec2.archive.ubuntu.com.s3.amazonaws.com/ubuntu/pool/main/p/python-defer/python-defer_1.0.2+bzr481-1_all.deb Size mismatch
Failed to fetch http://us-east-1.ec2.archive.ubuntu.com.s3.amazonaws.com/ubuntu/pool/main/a/aptdaemon/python-aptdaemon_0.43+bzr769-0ubuntu1_all.deb Size mismatch
Failed to fetch http://us-east-1.ec2.archive.ubuntu.com.s3.amazonaws.com/ubuntu/pool/main/a/aptdaemon/aptdaemon_0.43+bzr769-0ubuntu1_all.deb Size mismatch
Failed to fetch http://us-east-1.ec2.archive.ubuntu.com.s3.amazonaws.com/ubuntu/pool/main/u/ubufox/xul-ext-ubufox_2.0-0ubuntu1_all.deb Size mismatch

_______________________________________________________________________________________

apt-get -y install firefox with debug for HTTP turned on:
Get:63 http://us-east-1.ec2.archive.ubuntu.com.s3.amazonaws.com/ubuntu/ precise/main python-defer all 1.0.2+bzr481-1 [10.9 kB]
HTTP/1.1 200 OK^M
x-amz-id-2: EXccd6GVMfE6ly4SYdwy313VK42d/gI3ncyxmaotuVIMbBBi6FJkuDYWrzLw7vXE^M
x-amz-request-id: 7572BAB5E6FFFAC0^M
Date: Tue, 06 Mar 2012 20:06:13 GMT^M
Last-Modified: Fri, 03 Feb 2012 08:54:27 GMT^M
ETag: "6c07f8db615cc32657b1cc57450a1608"^M
Accept-Ranges: bytes^M
Content-Type: application/x-debian-package^M
Content-Length: 15142^M
Server: AmazonS3^M
^M
Get:64 http://us-east-1.ec2.archive.ubuntu.com.s3.amazonaws.com/ubuntu/ precise/main python-aptdaemon all 0.43+bzr769-0ubuntu1 [80.3 kB]
HTTP/1.1 200 OK^M
x-amz-id-2: IcNECY1OtzHOXnTnTW3fqoJGyfdODq+qRXXGYTDEOJxfw0CP9UH8EUSllVR3Gf3F^M
x-amz-request-id: 5DEEEE629E3BA012^M
Date: Tue, 06 Mar 2012 20:06:13 GMT^M
Last-Modified: Fri, 02 Mar 2012 20:55:40 GMT^M
ETag: "501373631a241847cf76d13aea9264b3"^M
Accept-Ranges: bytes^M
Content-Type: application/x-debian-package^M
Content-Length: 56824^M
Server: AmazonS3^M
^M
Get:65 http://us-east-1.ec2.archive.ubuntu.com.s3.amazonaws.com/ubuntu/ precise/main aptdaemon all 0.43+bzr769-0ubuntu1 [15.1 kB]
HTTP/1.1 200 OK^M
x-amz-id-2: HHNSOZKH74D4I1XCaOiwDOzIWxIVJ88ibfKxvIp/4NpkczzPmsOhMqxrTj2T02qO^M
x-amz-request-id: A6FD3143E6A0BA1A^M
Date: Tue, 06 Mar 2012 20:06:34 GMT^M
Last-Modified: Sat, 21 Jan 2012 00:48:55 GMT^M
ETag: "3653165af1f20a437a14632ce0a2e6c2"^M
Accept-Ranges: bytes^M
Content-Type: application/x-debian-package^M
Content-Length: 10902^M
Server: AmazonS3^M

_______________________________________________________________________________________
Evaluation of the MD5sums of downloaded failed items:
root@ip-10-6-85-206:/var/cache/apt/archives/partial# md5sum *
501373631a241847cf76d13aea9264b3 aptdaemon_0.43+bzr769-0ubuntu1_all.deb
6c07f8db615cc32657b1cc57450a1608 python-aptdaemon_0.43+bzr769-0ubuntu1_all.deb
566f7c92a9fcd0c5c40d08a977176291 python-defer_1.0.2+bzr481-1_all.deb
3653165af1f20a437a14632ce0a2e6c2 xul-ext-ubufox_2.0-0ubuntu1_all.deb

_______________________________________________________________________________________
Snippets from Packages.bz2
From Meta-data:
  Package: aptdaemon
  MD5sum: 6c07f8db615cc32657b1cc57450a1608

  Package: xul-ext-ubufox
  MD5sum: 501373631a241847cf76d13aea9264b3

  Package: python-aptdaemon
  MD5sum: 566f7c92a9fcd0c5c40d08a977176291

  Package: python-defer
  MD5sum: 3653165af1f20a437a14632ce0a2e6c2
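
The two listings above can be cross-referenced mechanically. The sketch below uses only the checksums quoted in this report; the function name identify_contents is made up, and the input format ("md5 name" per line) is an assumption of the sketch.

```shell
#!/bin/sh
# Cross-reference the md5sums of the cached files with the MD5sum fields from
# Packages.bz2 (both copied from the evidence above) to show which package
# each file actually contains.

identify_contents() {
    # Argument 1: "md5 filename" lines; argument 2: "md5 package" lines.
    files=$1
    index=$2
    echo "$files" | while read -r sum name; do
        real=$(echo "$index" | awk -v s="$sum" '$1 == s { print $2 }')
        echo "$name contains ${real:-unknown}"
    done
}

FILES='501373631a241847cf76d13aea9264b3 aptdaemon_0.43+bzr769-0ubuntu1_all.deb
6c07f8db615cc32657b1cc57450a1608 python-aptdaemon_0.43+bzr769-0ubuntu1_all.deb
566f7c92a9fcd0c5c40d08a977176291 python-defer_1.0.2+bzr481-1_all.deb
3653165af1f20a437a14632ce0a2e6c2 xul-ext-ubufox_2.0-0ubuntu1_all.deb'

INDEX='6c07f8db615cc32657b1cc57450a1608 aptdaemon
501373631a241847cf76d13aea9264b3 xul-ext-ubufox
566f7c92a9fcd0c5c40d08a977176291 python-aptdaemon
3653165af1f20a437a14632ce0a2e6c2 python-defer'

identify_contents "$FILES" "$INDEX"
```

Every cached file resolves to a different package than its name suggests, for example the aptdaemon file holds xul-ext-ubufox.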

_______________________________________________________________________________________
"dpkg -I" on downloaded packages

root@ip-10-6-85-206:/var/cache/apt/archives/partial# dpkg -I aptdaemon_0.43+bzr769-0ubuntu1_all.deb
 new debian package, version 2.0.
 size 56824 bytes: control archive= 1520 bytes.
      23 bytes, 1 lines conffiles
    1023 bytes, 29 lines control
    1804 bytes, 21 lines md5sums
 Package: xul-ext-ubufox
 Source: ubufox
 Version: 2.0-0ubuntu1
 Architecture: all
 Maintainer: Ubuntu Mozilla Team <email address hidden>
 Installed-Size: 383
 Depends: aptdaemon, libglib2.0-0 (>= 2.26)
 Recommends: firefox (>= 4.0~b6)
 Enhances: firefox
 Breaks: ubufox (<< 0.9~rc2-0ubuntu3)
 Replaces: ubufox (<< 0.9~rc2-0ubuntu3)
 Provides: firefox-ubufox, ubufox
 Section: web
 Priority: optional
 Homepage: https://launchpad.net/ubufox
 Description: Ubuntu-specific configuration defaults and apt support for Firefox
  Adds Ubuntu-specific modifications to Firefox.
  .
  Integrates the browser with Ubuntu to:
   * Enable searching for missing plugins from Ubuntu software catalog
   * Add the following options to the Help menu
     - Get help on-line
     - Help translating Firefox
     - Ubuntu Release Notes
   * Set homepage to Ubuntu Start Page
   * Display a restart notification after upgrading Firefox
   * Add ask.com to the search engines.
  .
  You can uninstall this if you prefer to use a pristine Firefox install.

root@ip-10-6-85-206:/var/cache/apt/archives/partial# dpkg -I python-aptdaemon_0.43+bzr769-0ubuntu1_all.deb
 new debian package, version 2.0.
 size 15142 bytes: control archive= 1508 bytes.
      68 bytes, 2 lines conffiles
    1390 bytes, 30 lines control
    1088 bytes, 15 lines md5sums
 Package: aptdaemon
 Version: 0.43+bzr769-0ubuntu1
 Architecture: all
 Maintainer: Ubuntu Developers <email address hidden>
 Installed-Size: 188
 Depends: python, python-aptdaemon (= 0.43+bzr769-0ubuntu1), python-gi, gir1.2-glib-2.0
 Breaks: software-center (<< 1.1.21)
 Section: admin
 Priority: extra
 Homepage: https://launchpad.net/aptdaemon
 Description: transaction based package management service
  Aptdaemon allows normal users to perform package management tasks, e.g.
  refreshing the cache, upgrading the system, installing or removing software
  packages.
  .
  Currently it comes with the following main features:
  .
   - Programming language independent D-Bus interface, which allows one to
     write clients in several languages
   - Runs only if required (D-Bus activation)
   - Fine grained privilege management using PolicyKit, e.g. allowing all
     desktop user to query for updates without entering a password
   - Support for media changes during installation from DVD/CDROM
   - Support for debconf (Debian's package configuration system)
   - Support for attaching a terminal to the underlying dpkg call
  .
  This package contains the aptd script and all the data files required to run
  the daemon. Moreover it contains the aptdcon script, which is a command
  line client for aptdaemon. The API is not stable yet.
 Original-Maintainer: Julian Andres Klode <email address hidden>

ProblemType: Bug
DistroRelease: Ubuntu 12.04
Package: apt 0.8.16~exp12ubuntu4
ProcVersionSignature: Ubuntu 3.2.0-18.28-virtual 3.2.9
Uname: Linux 3.2.0-18-virtual x86_64
ApportVersion: 1.94-0ubuntu1
Architecture: amd64
Date: Tue Mar 6 21:21:17 2012
Ec2AMI: ami-0400dd6d
Ec2AMIManifest: (unknown)
Ec2AvailabilityZone: us-east-1c
Ec2InstanceType: m1.large
Ec2Kernel: aki-825ea7eb
Ec2Ramdisk: unavailable
SourcePackage: apt
UpgradeStatus: No upgrade log present (probably fresh install)

Revision history for this message
Ben Howard (darkmuggle-deactivatedaccount) wrote :
description: updated
description: updated
Ben Howard (darkmuggle-deactivatedaccount) wrote :

Apt log of import

Ben Howard (darkmuggle-deactivatedaccount) wrote :
Ben Howard (darkmuggle-deactivatedaccount) wrote :

Steps to reproduce (reliably):

1) Launch an instance in EC2
2) Run: sed -i "s,ec2.archive.ubuntu.com,ec2.archive.ubuntu.com.s3.amazonaws.com,g" /etc/apt/sources.list
3) Set your apt debug levels to whatever you want
4) Run "apt-get -y update"
5) Run "apt-get -y install firefox xorg libreoffice > /tmp/install.log 2>&1"

The S3 EC2 mirrors seem to reproduce it more reliably for some reason. I thought that this was an issue with S3, but in attempting to prove to Amazon that it is S3, I found the evidence indicating that the cached files are being misnamed.

Ben Howard (darkmuggle-deactivatedaccount) wrote :
tags: added: apt-get mirrors
Ben Howard (darkmuggle-deactivatedaccount) wrote :

Confirmed on Natty:

Fetched 13.7 MB in 12s (1,106 kB/s)
W: Failed to fetch bzip2:/var/lib/apt/lists/partial/us-east-1.ec2.archive.ubuntu.com.s3.amazonaws.com_ubuntu_dists_natty-updates_main_binary-i386_Packages Hash Sum mismatch

W: Failed to fetch bzip2:/var/lib/apt/lists/partial/us-east-1.ec2.archive.ubuntu.com.s3.amazonaws.com_ubuntu_dists_natty-updates_universe_binary-i386_Packages Hash Sum mismatch

W: Failed to fetch http://us-east-1.ec2.archive.ubuntu.com.s3.amazonaws.com/ubuntu/dists/natty-updates/main/i18n/Index No Hash entry in Release file /var/lib/apt/lists/partial/us-east-1.ec2.archive.ubuntu.com.s3.amazonaws.com_ubuntu_dists_natty-updates_main_i18n_Index

root@ip-10-117-77-162:/var/lib/apt/lists/partial# md5sum *
66b1e003087af95e4c3ecb2595888723 us-east-1.ec2.archive.ubuntu.com.s3.amazonaws.com_ubuntu_dists_natty-updates_main_binary-i386_Packages
 -- Matches universe/source/Sources.bz2
6c0170a630ab2c2526fce1a15a557cc7 us-east-1.ec2.archive.ubuntu.com.s3.amazonaws.com_ubuntu_dists_natty-updates_main_binary-i386_Packages.decomp.FAILED
 -- Matches universe/sources/Sources
27bb09fcfc7a4dee731a84d32ab59561 us-east-1.ec2.archive.ubuntu.com.s3.amazonaws.com_ubuntu_dists_natty-updates_main_i18n_Index
 -- Matches universe/binary-i386/Packages.bz2
5238567412ee16c16aa4dfb1282e4baa us-east-1.ec2.archive.ubuntu.com.s3.amazonaws.com_ubuntu_dists_natty-updates_universe_binary-i386_Packages
 -- Matches main/binary-i386/Packages.bz2
57a2cf0d9cf10366d3905285f3507227 us-east-1.ec2.archive.ubuntu.com.s3.amazonaws.com_ubuntu_dists_natty-updates_universe_binary-i386_Packages.decomp.FAILED
 -- Matches main/binary-i386/Packages

From natty-updates Apt Sources:
 5238567412ee16c16aa4dfb1282e4baa 431291 main/binary-i386/Packages.bz2
 c757bc649fcb9fd50d9c5b895727daa8 554975 main/binary-i386/Packages.gz
 d7f12a4b835c3837b80f214b1dd6753e 102 main/binary-i386/Release
 57a2cf0d9cf10366d3905285f3507227 3333025 main/binary-i386/Packages
 f12dc36fa5dd364ae0b7776ba4bafa28 183259 universe/binary-i386/Packages.gz
 9f930c4d57ab8f70aab2d36f06a27b3d 779085 universe/binary-i386/Packages
 0951480c2c33548ec3618ab1082b77d9 106 universe/binary-i386/Release
 27bb09fcfc7a4dee731a84d32ab59561 138336 universe/binary-i386/Packages.bz2
 6c0170a630ab2c2526fce1a15a557cc7 149525 universe/source/Sources
 46cd339686c8e38c23a541190b436a29 46951 universe/source/Sources.gz
 66b1e003087af95e4c3ecb2595888723 40485 universe/source/Sources.bz2
 a4245d47dc44f65f4ce1786652c27c3d 108 universe/source/Release

And Curl headers:
root@ip-10-117-77-162:/var/lib/apt/lists/partial# curl -I http://us-east-1.ec2.archive.ubuntu.com.s3.amazonaws.com/ubuntu/dists/natty-updates/main/binary-i386/Packages.bz2
HTTP/1.1 200 OK
x-amz-id-2: HmP/KlL95l4732MGdCO2Yuz/Sx40eMJ0Z3XYspztQra8J9PjkNxajSKT/l4ZcQan
x-amz-request-id: CA9A66000053719F
Date: Tue, 06 Mar 2012 22:42:08 GMT
Last-Modified: Tue, 06 Mar 2012 20:56:01 GMT
ETa...

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in apt (Ubuntu):
status: New → Confirmed
Ben Howard (darkmuggle-deactivatedaccount) wrote :

Bug confirmed for Oneiric too:

root@ip-10-64-61-11:/var/cache/apt/archives/partial# dpkg -I xserver-xorg-core_2%3a1.10.4-1ubuntu4.2_i386.deb
 new debian package, version 2.0.
 size 90230 bytes: control archive= 720 bytes.
     709 bytes, 18 lines control
     255 bytes, 3 lines md5sums
 Package: xserver-xorg-video-mach64
 Version: 6.9.0-1
 Architecture: i386
 Maintainer: Ubuntu Developers <email address hidden>
 Original-Maintainer: Debian X Strike Force <email address hidden>
 Installed-Size: 268
 Depends: libc6 (>= 2.4), xorg-video-abi-10, xserver-xorg-core (>= 2:1.10.0-0ubuntu1~)
 Provides: xorg-driver-video
 Section: x11
 Priority: optional
 Description: X.Org X server -- ATI Mach64 display driver
  This driver for the X.Org X server (see xserver-xorg for a further description)
  provides support for the ATI Mach64 series.
  .
  More information about X.Org can be found at:
  <URL:http://www.X.org>
  .
  This package is built from the X.org xf86-video-mach64 driver module.
root@ip-10-64-61-11:/var/cache/apt/archives/partial#

root@ip-10-64-61-11:/var/cache/apt/archives/partial# dpkg -I xserver-xorg-video-mach64_6.9.0-1_i386.deb
 new debian package, version 2.0.
 size 6706 bytes: control archive= 977 bytes.
    1206 bytes, 25 lines control
     217 bytes, 3 lines md5sums
 Package: xserver-xorg-video-ati
 Version: 1:6.14.99~git20110811.g93fc084-0ubuntu1
 Architecture: i386
 Maintainer: Ubuntu Developers <email address hidden>
 Installed-Size: 96
 Depends: libc6 (>= 2.1.3), libpciaccess0, xorg-video-abi-10, xserver-xorg-core (>= 2:1.10.0-0ubuntu1~), xserver-xorg-video-r128, xserver-xorg-video-mach64, xserver-xorg-video-radeon
 Provides: xorg-driver-video
 Section: x11
 Priority: optional
 Description: X.Org X server -- AMD/ATI display driver wrapper
  This package provides the 'ati' driver for the AMD/ATI Mach64, Rage128,
  Radeon, FireGL, FireMV, FirePro and FireStream series. This driver is
  actually a wrapper that loads one of the 'mach64', 'r128' or 'radeon'
  sub-drivers depending on the hardware.
  These sub-drivers are brought through package dependencies.
  .
  Users of Rage, Mach, or Radeon boards may remove this package only if
  they use Driver "r128", "mach64", or "radeon" in /etc/X11/xorg.conf
  instead of relying on autodetection.
  .
  More information about X.Org can be found at:
  <URL:http://www.X.org>
  .
  This package is built from the X.org xf86-video-ati driver module.
 Original-Maintainer: Debian X Strike Force <email address hidden>

root@ip-10-64-61-11:/var/cache/apt/archives/partial# dpkg -I xserver-xorg-video-ati_1%3a6.14.99~git20110811.g93fc084-0ubuntu1_i386.deb
 new debian package, version 2.0.
 size 12712 bytes: control archive= 785 bytes.
     793 bytes, 20 lines control
     316 bytes, 4 lines md5sums
 Package: xserver-xorg-video-fbdev
 Version: 1:0.4.2-3ubuntu6
 Architecture: i386
 Maintainer: Ubuntu Developers <email address hidden>
 Installed-Size: 96
 Depends: libc6 (>= 2.1.3), xorg-video-abi-10, xserver-xorg...

Ben Howard (darkmuggle-deactivatedaccount) wrote :

I've confirmed this bug exists on Hardy, Lucid, Maverick, Oneiric, Natty and Precise.

tags: added: server
Scott Moser (smoser) wrote :

Ben,
  Can you see this fail on mirrors other than the new s3 backed mirrors?
  I've tried several times and have been unable to reproduce on the us-east-1.ec2.archive.ubuntu.com mirrors.
  However, I can reproduce fairly regularly against s3 mirrors

  See log at http://paste.ubuntu.com/872478/
  The command I was using was:

 ( sudo apt-get clean; sudo apt-get update --assume-yes ; sudo apt-get --download-only -y install -o=Debug::Acquire::http=true --assume-yes firefox ; echo $? ) </dev/null 2>&1 | tee /tmp/apt-download.log

  r=0; log=/tmp/apt-download.log; pkg="ubuntu-desktop";
  while [ "$r" = "0" ]; do
    ( sudo apt-get clean; sudo apt-get update --assume-yes &&
      sudo apt-get --download-only -y install -o=Debug::Acquire::http=true --assume-yes $pkg ;
      echo $? ) </dev/null 2>&1 | tee $log
    tail -n 1 $log > out; read r < out ;
    echo "$(date): finished, $r";
   done

Then, I've got this running in a loop from a precise m1.large in us-east-1:

r=0; log=/tmp/apt-download.log;
while [ $r -eq 0 ]; do
  ( sudo apt-get clean; sudo apt-get update --assume-yes && sudo apt-get --download-only -y install -o=Debug::Acquire::http=true --assume-yes ubuntu-desktop ; echo $? ) </dev/null 2>&1 | tee $log ;
  tail -n 1 $log > out; read r < out ;
  echo "$(date): finished, $r"; done

  Until we can see it fail against a more traditional http server, we almost have to suspect failure of S3 mirrors. Anything

Scott Moser (smoser) wrote :

Some more data, after discussing with slangasek.
The attached script will run seemingly forever (100 times run) as it is, however if we remove the flag:
  -o=Acquire::http::Pipeline-Depth=0
to apt, then it will generally fail < 10 times.
(data collected on m1.large "right now", with mirrors set to http://us-east-1.ec2.archive.ubuntu.com.s3.amazonaws.com/ubuntu/).

It would seem to me that the Pipeline-Depth is the source of the issue. See [1] for apt.conf description, and [2] for a quick google of 's3' 'http' 'pipeline' that gives some evidence that Pipeline and S3 aren't friends.

--
[1] http://manpages.ubuntu.com/manpages/precise/en/man5/apt.conf.5.html
[2] http://stackoverflow.com/questions/7752802/does-s3-support-http-pipeling
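
The attached script is not reproduced in this report. As a stand-in, here is a rough sketch of the same idea: re-run a download command until it fails or a cap is reached, and report how many runs succeeded. The function name run_until_failure is hypothetical, and the commented apt-get invocation is an assumption modelled on the commands quoted earlier in this thread.

```shell
#!/bin/sh
# Re-run a command until it fails or a maximum number of attempts is reached,
# and print how many runs succeeded. A stand-in for the attached script,
# which is not reproduced in this report.

run_until_failure() {
    max=$1
    shift
    count=0
    while [ "$count" -lt "$max" ]; do
        "$@" >/dev/null 2>&1 || break
        count=$((count + 1))
    done
    echo "$count"
}

# Example (needs root and network access, so it is left commented out):
# run_until_failure 100 sh -c 'apt-get clean && apt-get update -y &&
#     apt-get --download-only -y -o Acquire::http::Pipeline-Depth=0 install ubuntu-desktop'
```

Running it once with the Pipeline-Depth option and once without gives a simple count to compare between the two configurations.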

Ben Howard (darkmuggle-deactivatedaccount) wrote :

Scott,

I can confirm that "-o=Acquire::http::Pipeline-Depth=0" appears to fix the problem. We'll need to look at putting that configuration option into the cloud-images.

Thanks Scott and Steve for digging on this.
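
One way the option could be made persistent is an apt configuration fragment. This is a sketch only: the file name 99-disable-pipelining is an arbitrary choice, and the function write_pipelining_conf is made up. The setting name and its apt.conf syntax are the ones documented in apt.conf(5).

```shell
#!/bin/sh
# Write an apt configuration fragment that disables HTTP pipelining.
# The file name 99-disable-pipelining is an arbitrary choice for this sketch.

write_pipelining_conf() {
    conf_dir=${1:-/etc/apt/apt.conf.d}
    depth=${2:-0}
    printf 'Acquire::http::Pipeline-Depth "%s";\n' "$depth" \
        > "$conf_dir/99-disable-pipelining"
}

# Example: write into a scratch directory rather than the live apt config.
scratch=$(mktemp -d)
write_pipelining_conf "$scratch"
cat "$scratch/99-disable-pipelining"
```

Calling write_pipelining_conf with no arguments (as root) would target /etc/apt/apt.conf.d, which has the same effect as passing the -o flag on every apt-get invocation.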

Scott Moser (smoser) wrote : Re: [Bug 948461] Re: apt-get hashsum/size mismatch due caused by swapped local file names

On Wed, 7 Mar 2012, Ben Howard wrote:

> Scott,
>
> I can confirm that "-o=Acquire::http::Pipeline-Depth=0" appears to fix
> the problem. We'll need to look at putting that configuration option
> into the cloud-images.

Well.... that's not really a complete fix though.
 * existing images do not have that setting in them, and they will have to
   'apt-get update && apt-get upgrade' to *get* that setting (which can
   fail).
 * other images out there would not have this setting, and would just
   start to fail when we switch DNS records to S3 mirrors.
 * other people building images would need to know they need to do this.
   (this could be alleviated by it being a file laid down by a common
   package like cloud-init, but even then, we'd have to SRU the fix
   to old releases.)

I'm really not sure what to do. We don't want EC2 laden with "gotchas"
that only work if you use our special images. While I certainly want our
images to work, I don't want to just make life difficult for others
who are not using them.
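
For anyone wanting to tell whether a given image already carries the setting, a rough check could look like the following. It only greps configuration fragments in the flat "Acquire::http::Pipeline-Depth" form; it does not ask apt for its effective configuration and would miss the nested brace syntax. The function name pipelining_disabled is made up.

```shell
#!/bin/sh
# Report whether any apt configuration fragment in a directory sets
# Acquire::http::Pipeline-Depth to 0 (flat syntax only).

pipelining_disabled() {
    conf_dir=${1:-/etc/apt/apt.conf.d}
    # -q: no output, -s: stay quiet about missing or unreadable files.
    grep -Eqs 'Acquire::http::Pipeline-Depth[[:space:]]+"0"' "$conf_dir"/*
}

if pipelining_disabled; then
    echo "pipelining disabled"
else
    echo "pipelining not explicitly disabled"
fi
```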

Ben Howard (darkmuggle-deactivatedaccount) wrote : Re: apt-get hashsum/size mismatch due caused by swapped local file names

Regarding comment #13, I think the concern is moot right now. I was able to replicate the issue with "-o=Acquire::http::Pipeline-Depth=0" turned on. Which means that pipelining may contribute but doesn't appear to be the sole cause.

Failed to fetch http://us-east-1.ec2.archive.ubuntu.com.s3.amazonaws.com/ubuntu/pool/main/libx/libxinerama/libxinerama1_1.1.1-3build1_amd64.deb Size mismatch
Failed to fetch http://us-east-1.ec2.archive.ubuntu.com.s3.amazonaws.com/ubuntu/pool/main/libx/libxrandr/libxrandr2_1.3.2-2_amd64.deb Size mismatch
Failed to fetch http://us-east-1.ec2.archive.ubuntu.com.s3.amazonaws.com/ubuntu/pool/main/s/shared-mime-info/shared-mime-info_1.0-0ubuntu1_amd64.deb Size mismatch
Failed to fetch http://us-east-1.ec2.archive.ubuntu.com.s3.amazonaws.com/ubuntu/pool/main/g/gtk+2.0/libgtk2.0-0_2.24.10-0ubuntu5_amd64.deb Size mismatch
Failed to fetch http://us-east-1.ec2.archive.ubuntu.com.s3.amazonaws.com/ubuntu/pool/main/x/xorg/x11-common_7.6+10ubuntu1_all.deb Size mismatch
Failed to fetch http://us-east-1.ec2.archive.ubuntu.com.s3.amazonaws.com/ubuntu/pool/main/libi/libice/libice6_1.0.7-2build1_amd64.deb Size mismatch
Failed to fetch http://us-east-1.ec2.archive.ubuntu.com.s3.amazonaws.com/ubuntu/pool/main/p/policykit-1/libpolkit-agent-1-0_0.104-1_amd64.deb Size mismatch
Fetched 28.1 MB in 8s (3,261 kB/s)
E: Some files failed to download

root@ip-10-80-113-210:/var/cache/apt/archives/partial# dpkg -I libxinerama1_2%3a1.1.1-3build1_amd64.deb
 new debian package, version 2.0.
 size 456104 bytes: control archive= 1639 bytes.
     805 bytes, 18 lines control
    1330 bytes, 16 lines md5sums
     479 bytes, 20 lines * postinst #!/bin/sh
     376 bytes, 19 lines * postrm #!/bin/sh
     194 bytes, 14 lines * prerm #!/bin/sh
      34 bytes, 1 lines triggers
 Package: shared-mime-info
 Version: 1.0-0ubuntu1
 Architecture: amd64
 Maintainer: Ubuntu Developers <email address hidden>
 Installed-Size: 2188
 Depends: libc6 (>= 2.3), libglib2.0-0 (>= 2.24.0), libxml2 (>= 2.7.4)
 Conflicts: libglib2.0-0 (<< 2.17.2), libgnomevfs2-0 (<< 1:2.24.0), tracker (<< 0.6.90)
 Section: misc
 Priority: optional
 Multi-Arch: foreign
 Homepage: http://freedesktop.org/wiki/Software/shared-mime-info
 Description: FreeDesktop.org shared MIME database and spec
  This is the shared MIME-info database from the X Desktop Group. It is required
  by any program complying to the Shared MIME-Info Database spec, which is also
  included in this package.
  .
  At this time at least ROX, GNOME, KDE and XFCE use this database.
 Original-Maintainer: Sebastian Dröge <email address hidden>

Steve Langasek (vorlon) wrote :

           One setting is provided to control the pipeline depth in cases
           where the remote server is not RFC conforming or buggy (such as
           Squid 2.0.2). Acquire::http::Pipeline-Depth can be a value from 0
           to 5 indicating how many outstanding requests APT should send. A
           value of zero MUST be specified if the remote host does not
           properly linger on TCP connections - otherwise data corruption will
           occur. Hosts which require this are in violation of RFC 2068.

I guess an alternative to trying to SRU this everywhere it affects images would be to ask Amazon to support RFC 2068?

summary: - apt-get hashsum/size mismatch due caused by swapped local file names
+ apt-get hashsum/size mismatch because s3 mirrors don't support http
+ pipelining correctly
Ben Howard (darkmuggle-deactivatedaccount) wrote :

Scratch my previous comment in #14, it was a test error.

Paul Belanger (pabelanger) wrote :

I'm also seeing this on precise.

Paul Belanger (pabelanger) wrote :

I clicked 'Post Comment' too fast. I'm seeing this when I do an apt-get update.

W: Failed to fetch bzip2:/var/lib/apt/lists/partial/mirror.csclub.uwaterloo.ca_ubuntu_dists_precise_main_source_Sources Hash Sum mismatch

W: Failed to fetch bzip2:/var/lib/apt/lists/partial/mirror.csclub.uwaterloo.ca_ubuntu_dists_precise_universe_binary-amd64_Packages Hash Sum mismatch

W: Failed to fetch bzip2:/var/lib/apt/lists/partial/mirror.csclub.uwaterloo.ca_ubuntu_dists_precise_main_binary-i386_Packages Hash Sum mismatch

W: Failed to fetch bzip2:/var/lib/apt/lists/partial/mirror.csclub.uwaterloo.ca_ubuntu_dists_precise_universe_binary-i386_Packages Hash Sum mismatch

Ben Howard (darkmuggle-deactivatedaccount) wrote :

Steve,

We've been able to prove that disabling HTTP pipelining fixes the problem, but the question remains:

How come package-A.deb has package-B's contents? The downloaded files are whole and match the proper checksums, but they have the wrong content. If I understand comment #15 right, shouldn't the content fail to checksum properly?

Changed in apt (Ubuntu Precise):
importance: Undecided → High
Paul Belanger (pabelanger) wrote :

Apparently, my problems have cleared up. Not sure why.

Ben Howard (darkmuggle-deactivatedaccount) wrote :

The problem here is that S3 is broken regarding HTTP pipeline responses. What is happening is that apt is requesting files A, B, C, D in the same pipelined batch of requests and S3 appears to be responding with files B, C, D, A (or some similarly arbitrary order). The result is that apt saves the data under the wrong file name.

Here is a summary of a packet trace (edited for readability). S3 uses the "ETag" header to store the MD5 of the file, so using the ETag, you can determine which file's contents a response carries.

GET /ubuntu/pool/main/libx/libxdamage/libxdamage1_1.1.3-2build1_amd64.deb HTTP/1.1
GET /ubuntu/pool/main/libx/libxfixes/libxfixes3_5.0-4ubuntu1_amd64.deb HTTP/1.1
GET /ubuntu/pool/main/libx/libxxf86vm/libxxf86vm1_1.1.1-2build1_amd64.deb HTTP/1.1
GET /ubuntu/pool/main/m/mesa/libgl1-mesa-glx_8.0.1-0ubuntu2_amd64.deb HTTP/1.1
GET /ubuntu/pool/main/g/gtk+2.0/libgtk2.0-common_2.24.10-0ubuntu5_all.deb HTTP/1.1
GET /ubuntu/pool/main/libx/libxcomposite/libxcomposite1_0.4.3-2build1_amd64.deb HTTP/1.1
GET /ubuntu/pool/main/libx/libxcursor/libxcursor1_1.1.12-1_amd64.deb HTT
GET /ubuntu/pool/main/libx/libxi/libxi6_1.5.99.3-0ubuntu2_amd64.deb HTTP/1.1
GET /ubuntu/pool/main/libx/libxinerama/libxinerama1_1.1.1-3build1_amd64.deb HTTP/1.1
        6e773d7e05527fa35a201209660b517c - libxfixes3
            x-amz-id-2: EgVy5vizxmUOYRxfyjFXJzt7k30EVGtn2gyZ1q0P/jN3ROZS+3KjUEwSzX3JFJKE
            x-amz-request-id: 8CD6DF060D72CFDA
        c06221dc919029cf8482198e724fa111 - libxxf86vm1
            x-amz-id-2: xr56YqhmQzov1YP+Oev9lnpKQtOYX98DkrJD7aWkSwQq8Ke77+4PD4yHiRvcBo6w
            x-amz-request-id: 67B263265E35A521
GET /ubuntu/pool/main/g/gtk+2.0/libgtk2.0-0_2.24.10-0ubuntu5_amd64.deb HTTP/1.1
        a47eb17218d742a4ceae59cc08e80d8c - libgl1-mesa-glx
            x-amz-id-2: Ou92TrRA7dfHyF97UEfO0TvQ9XI9vOxYj0LgWCR3GXFHMhAZ7nzQ71cGUqMQHI1X
            x-amz-request-id: C9393E43C43C6471
        90787d373defad09a6dd99db3c3e7614 - libgtk2.0-common
            x-amz-id-2: ze4hg1y4hT8+ZXXzY7JHRclypybdYj+HIQEG6r2eulAt+twaZn/GG+y6ifi9MnL8
            x-amz-request-id: 5DDC52A0301600FC
        77440ea3c635a3243cc3f6bc98305f22 - libxcomposite1
            x-amz-id-2: +hzkmvonwW+1nh4gcsQ/qXsmXUq2CUj4hAQ69EjqAEb6MVOpxHvPnh8l7Qh2Wc5Q
            x-amz-request-id: 54FDE3289C1021F3
        f5754520b029296a6f0a5a14a1dd4b5a - libxcursor1
            x-amz-id-2: WvhZ9o1R3kMvsFEmERvI2V/eOY+0ksHnvs1suVCX1Es8aa/XxyU7drgvj0Bj9zHj
            x-amz-request-id: 9F408ACCBFFA66E5
        728ef8819d6f3ec4b827d887951a86ce - libxi6
            x-amz-id-2: Tgqse2GgWGu+OEVzv/atXhjP6FdszvjVeFGic3gq5LxGUXDFVS5YdF+CGFtF7dkC
            x-amz-request-id: C57783745F1192CE
        90b84d263ad2b7f76868ce02967c2107 - libxinerama1
            x-amz-id-2: 1Mz7B1r13S1NnNIingOxj2iOlQq8a5arP7K/Wt/zgd+NL78Q73/MRDzFhPf4kXFL
            x-amz-request-id: DB666BE3004C6CD9
        652e1f2550726fc72fa432413e6e287b - libxdamage1
            x-amz-id-2: X5TQqCJey0xtne7jKZYlX+8c5AqRM2aTVD1cvEHWPIx4qcc8L9rvyqlG0FJcPT4g
            x-amz-request-id: 74A2EB9093B4F00A
        c8f474321079048af432d47d66a4afb4 - libgtk2.0-0
            x-amz-id-2: Fgkgv/JY7WnSojk+pBeQ430gNCvjRvXG1KZzrbQfEG2j/jqPvvcFVRS07ZBMsFa9
  ...
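
The ETag comparison used in the trace can be sketched as a small helper that pulls the ETag out of response headers so it can be compared with a local md5sum. The function name etag_from_headers is made up, and the sketch assumes a quoted ETag value. As far as I know, the ETag equals the MD5 only for objects uploaded to S3 in a single part, so treat a mismatch with some caution.

```shell
#!/bin/sh
# Extract the ETag value from HTTP response headers on stdin so it can be
# compared with a local md5sum. Carriage returns from the wire format are
# stripped first.

etag_from_headers() {
    tr -d '\r' \
        | sed -n 's/^[Ee][Tt][Aa][Gg]:[[:space:]]*"\([^"]*\)".*$/\1/p' \
        | head -n 1
}

# Example with a header copied from the debug output earlier in this report.
printf 'HTTP/1.1 200 OK\r\nETag: "6c07f8db615cc32657b1cc57450a1608"\r\nServer: AmazonS3\r\n\r\n' \
    | etag_from_headers
```

Against a live mirror, the headers could come from curl -sI on the package URL, and the result could be compared with the first field of md5sum for the cached file.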

Steve Langasek (vorlon) wrote :

Note that this issue with HTTP pipelining is known to apt upstream, and documented in the apt.conf(5) manpage - this is exactly why the commandline option exists to toggle the behavior, basically.

           One setting is provided to control the pipeline depth in cases
           where the remote server is not RFC conforming or buggy (such as
           Squid 2.0.2). Acquire::http::Pipeline-Depth can be a value from 0
           to 5 indicating how many outstanding requests APT should send. A
           value of zero MUST be specified if the remote host does not
           properly linger on TCP connections - otherwise data corruption will
           occur. Hosts which require this are in violation of RFC 2068.
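
To illustrate why a pipeline depth of zero avoids the corruption, here is a minimal Python sketch of the failure mode. It is a toy model, not apt's actual code: the names `serve_pipelined`, `fetch`, and `mismatches` are hypothetical, the "server" simply rotates the responses of each pipelined batch (A,B,C,D comes back as B,C,D,A), and the "client" pairs responses with requests purely by position.

```python
import hashlib


def serve_pipelined(paths, store):
    """Toy server: answers a pipelined batch out of order (B,C,D,A for A,B,C,D)."""
    bodies = [store[p] for p in paths]
    return bodies[1:] + bodies[:1]


def fetch(paths, store, depth):
    """Toy client: pairs the i-th response with the i-th outstanding request."""
    batch_size = max(depth, 1)  # depth 0 is modelled as one request at a time
    saved = {}
    for start in range(0, len(paths), batch_size):
        batch = paths[start:start + batch_size]
        for path, body in zip(batch, serve_pipelined(batch, store)):
            saved[path] = body
    return saved


def mismatches(saved, expected):
    """Return the paths whose saved body does not match the expected digest."""
    return sorted(p for p, body in saved.items()
                  if hashlib.sha256(body).hexdigest() != expected[p])


if __name__ == "__main__":
    # Hypothetical file names and contents, for illustration only.
    store = {"pool/pkg%d.deb" % i: ("contents of pkg%d" % i).encode()
             for i in range(8)}
    expected = {p: hashlib.sha256(b).hexdigest() for p, b in store.items()}
    print(mismatches(fetch(list(store), store, depth=0), expected))
    print(mismatches(fetch(list(store), store, depth=5), expected))
```

With one request outstanding at a time there is nothing for the server to reorder, so every file is saved under its own name; with several outstanding requests the position-based pairing saves bodies under the wrong names and the hash check fails.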

Scott Moser (smoser)
Changed in cloud-init (Ubuntu Precise):
importance: Undecided → High
status: New → Fix Released
Changed in cloud-init (Ubuntu Oneiric):
importance: Undecided → High
status: New → Triaged
Changed in cloud-init (Ubuntu Natty):
importance: Undecided → High
status: New → Triaged
Changed in cloud-init (Ubuntu Maverick):
importance: Undecided → High
status: New → Triaged
Changed in cloud-init (Ubuntu Lucid):
importance: Undecided → High
status: New → Triaged
Changed in cloud-init (Ubuntu Hardy):
importance: Undecided → High
status: New → Triaged
description: updated
Revision history for this message
Scott Moser (smoser) wrote :

Ben and I did backports, and I've uploaded to lucid, maverick, natty, oneiric.
We're waiting on approval for population into -proposed.

The plan from there is to get testing and population into -updates.
Then,
  * release new images including this modification
  * give some time for existing instances to populate
  * email to different lists announcing need for the update/change
  * switch S3 mirrors into production.

Bryce Harrington (bryce)
description: updated
Revision history for this message
Scott Moser (smoser) wrote :
Scott Moser (smoser)
description: updated
Bryce Harrington (bryce)
description: updated
Revision history for this message
Bryce Harrington (bryce) wrote :

I notice this bug is reported against both apt and cloud-init, but the patches are only for cloud-init. Are there any changes required of apt? If not, perhaps those tasks can all be closed out?

Revision history for this message
Clint Byrum (clint-fewbar) wrote : Please test proposed package

Hello Ben, or anyone else affected,

Accepted cloud-init into oneiric-proposed. The package will build now and be available in a few hours. Please test and give feedback here. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you in advance!

Changed in cloud-init (Ubuntu Oneiric):
status: Triaged → Fix Committed
tags: added: verification-needed
Revision history for this message
Clint Byrum (clint-fewbar) wrote :

Hello Ben, or anyone else affected,

Accepted cloud-init into natty-proposed. The package will build now and be available in a few hours. Please test and give feedback here. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you in advance!

Changed in cloud-init (Ubuntu Natty):
status: Triaged → Fix Committed
Revision history for this message
Clint Byrum (clint-fewbar) wrote :

Hello Ben, or anyone else affected,

Accepted cloud-init into maverick-proposed. The package will build now and be available in a few hours. Please test and give feedback here. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you in advance!

Changed in cloud-init (Ubuntu Maverick):
status: Triaged → Fix Committed
Changed in cloud-init (Ubuntu Lucid):
status: Triaged → Fix Committed
Revision history for this message
Clint Byrum (clint-fewbar) wrote :

FYI, I accepted the maverick one, even though it is EOL very soon, thinking that it may be useful for generating one last AMI which people will use to migrate to natty. If no new cloud image will be produced, I don't see any reason to actually move that to maverick-updates.

Steve Langasek (vorlon)
Changed in apt (Ubuntu Hardy):
status: New → Won't Fix
Changed in apt (Ubuntu Lucid):
status: New → Won't Fix
Changed in apt (Ubuntu Maverick):
status: New → Won't Fix
tags: added: rls-mgr-p-tracking
Steve Langasek (vorlon)
Changed in apt (Ubuntu Natty):
status: New → Won't Fix
Steve Langasek (vorlon)
Changed in apt (Ubuntu Oneiric):
status: New → Won't Fix
Revision history for this message
Ben Howard (darkmuggle-deactivatedaccount) wrote :

Re #29: I plan on spinning a final Maverick on its EOL for historical reasons. The other reason why I would like to see Maverick get this to updates is for people who upgrade from Maverick to something newer.

Re #27: I've confirmed the fixes in proposed work. We're good to go here.

tags: added: verification-done
removed: verification-needed
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package cloud-init - 0.5.10-0ubuntu1.7

---------------
cloud-init (0.5.10-0ubuntu1.7) lucid-proposed; urgency=low

  * add ability to configure Acquire::http::Pipeline-Depth via
    cloud-config setting 'apt_pipelining' (LP: #948461)
  * debian/cloud-init.postinst: address population of apt_pipeline
    setting on installation.
 -- Scott Moser <email address hidden> Fri, 16 Mar 2012 14:32:50 -0400

Changed in cloud-init (Ubuntu Lucid):
status: Fix Committed → Fix Released
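
For readers wondering how the 'apt_pipelining' setting relates to the apt option, the following is a rough Python sketch of one possible mapping. It is not cloud-init's implementation: the function name `pipelining_conf` is hypothetical, and I am assuming that a false value means "pipelining off" (depth 0) and that an integer is taken as the depth directly. The 0 to 5 range comes from the apt.conf(5) text quoted earlier in this report; the exact values cloud-init accepts are not stated here.

```python
def pipelining_conf(apt_pipelining=False):
    """Return apt.conf text for a cloud-config 'apt_pipelining' value.

    Assumed mapping (hypothetical, see lead-in):
      False       -> Pipeline-Depth "0" (pipelining disabled)
      int in 0..5 -> that depth
    Anything else is rejected.
    """
    if apt_pipelining is False:
        depth = 0
    elif type(apt_pipelining) is int and 0 <= apt_pipelining <= 5:
        depth = apt_pipelining
    else:
        raise ValueError("unsupported apt_pipelining value: %r" % (apt_pipelining,))
    return 'Acquire::http::Pipeline-Depth "%d";\n' % depth


if __name__ == "__main__":
    # Default: pipelining turned off, matching the option described in this bug.
    print(pipelining_conf(), end="")
    print(pipelining_conf(3), end="")
```

The returned line has the same form as the one-line apt.conf fragment used elsewhere in this report, so it could be written to a file under /etc/apt/apt.conf.d/.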
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package cloud-init - 0.6.1-0ubuntu8.1

---------------
cloud-init (0.6.1-0ubuntu8.1) natty-proposed; urgency=low

  * add ability to configure Acquire::http::Pipeline-Depth via
    cloud-config setting 'apt_pipelining' (LP: #948461)
  * debian/cloud-init.postinst: address population of apt_pipeline
    setting on installation.
 -- Ben Howard <email address hidden> Fri, 16 Mar 2012 11:05:48 -0600

Changed in cloud-init (Ubuntu Natty):
status: Fix Committed → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package cloud-init - 0.6.1-0ubuntu22.1

---------------
cloud-init (0.6.1-0ubuntu22.1) oneiric-proposed; urgency=low

  * add ability to configure Acquire::http::Pipeline-Depth via
    cloud-config setting 'apt_pipelining' (LP: #948461)
  * debian/cloud-init.postinst: address population of apt_pipeline
    setting on installation.
 -- Ben Howard <email address hidden> Fri, 16 Mar 2012 15:44:50 -0600

Changed in cloud-init (Ubuntu Oneiric):
status: Fix Committed → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package cloud-init - 0.5.15-0ubuntu3.1

---------------
cloud-init (0.5.15-0ubuntu3.1) maverick-proposed; urgency=low

  * add ability to configure Acquire::http::Pipeline-Depth via
    cloud-config setting 'apt_pipelining' (LP: #948461)
  * debian/cloud-init.postinst: address population of apt_pipeline
    setting on installation.
 -- Scott Moser <email address hidden> Fri, 16 Mar 2012 14:36:07 -0400

Changed in cloud-init (Ubuntu Maverick):
status: Fix Committed → Fix Released
Revision history for this message
Steve Langasek (vorlon) wrote :

This has been mitigated by applying a fix to the EC2 images.

Changed in apt (Ubuntu Precise):
status: Confirmed → Won't Fix
Changed in apt (Ubuntu):
importance: High → Low
tags: added: rls-p-nottracking
removed: rls-mgr-p-tracking
Revision history for this message
T[m] (t-w-sch) wrote :

apt (Ubuntu Precise)

Running Ubuntu in VMware through a proxy and a secure web gateway.

Disabling HTTP pipelining when doing a dist-upgrade did the trick for me:
(temporary) sudo apt-get -o Acquire::http::Pipeline-Depth="0" -y dist-upgrade
(permanent) echo 'Acquire::http::Pipeline-Depth "0";' | sudo tee /etc/apt/apt.conf.d/99-no-pipelining
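
To check afterwards which fragment sets the option, something like the Python sketch below may help. The helper `find_pipeline_depth` is hypothetical and only recognises the one-line form shown above; apt.conf also allows a nested block syntax, which this sketch does not parse. As far as I know apt reads the files in apt.conf.d in sorted order, so the last match listed is the one I would expect to take effect.

```python
import re
from pathlib import Path

# Matches the one-line form: Acquire::http::Pipeline-Depth "0";
_DEPTH_RE = re.compile(
    r'^\s*Acquire::http::Pipeline-Depth\s+"(\d+)"\s*;', re.MULTILINE)


def find_pipeline_depth(conf_dir="/etc/apt/apt.conf.d"):
    """Return (file name, depth) pairs for each Pipeline-Depth line found."""
    found = []
    for path in sorted(Path(conf_dir).iterdir()):
        if not path.is_file():
            continue
        text = path.read_text(errors="replace")
        for match in _DEPTH_RE.finditer(text):
            found.append((path.name, int(match.group(1))))
    return found


if __name__ == "__main__":
    for name, depth in find_pipeline_depth():
        print("%s: Pipeline-Depth %d" % (name, depth))
```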

Revision history for this message
Rolf Leggewie (r0lf) wrote :

Hardy has seen the end of its life and is no longer receiving any updates. Marking the Hardy task for this ticket as "Won't Fix".

Changed in cloud-init (Ubuntu Hardy):
status: Triaged → Won't Fix
Revision history for this message
Julian Andres Klode (juliank) wrote :

This was fixed in 1.1 or later, not entirely sure about the exact version.

Changed in apt (Ubuntu):
status: Confirmed → Fix Released