maverick kernel 2.6.35-22 panics when booting on Dell Precision T3500

Bug #653238 reported by matt_hargett
This bug affects 3 people
Affects          Status    Importance   Assigned to   Milestone
Linux            Unknown   Unknown
linux (Ubuntu)   New       Undecided    Unassigned

Bug Description

On my Dell Precision T3500 workstation, Ubuntu 10.04 worked wonderfully. When installing 10.10, either via upgrade or a fresh install with a re-format, I get a kernel panic on boot with 2.6.35-22. Booting into the 2.6.32-21 kernel works fine.

Attached is the dmesg log captured from the busybox shell after the kernel panicked. The lspci output was generated under the 2.6.32-21 kernel, as I can't run that command in the panicked kernel.

This is a critical bug, as this is the standard workstation hardware used by my company.

Tags: kj-triage
matt_hargett (matt-use) wrote :
matt_hargett (matt-use) wrote :
tags: added: kj-triage
matt_hargett (matt-use) wrote :

This is a known kernel regression:
https://bugzilla.kernel.org/show_bug.cgi?id=16228

The aforementioned kernel bug link contains a patch to fix the regression.

Avi Carmi (avi-carmi) wrote :

I have the same problem as Matt (and Trigger, see: https://bugs.launchpad.net/ubuntu/+bug/659149).

Same DELL Precision T3500

There is a new A08 BIOS dated 9/17/10 available from Dell, which I'll install as soon as I am done posting.

http://support.dell.com/support/downloads/download.aspx?fileid=416864

However, I also found this in https://bugzilla.kernel.org/show_bug.cgi?id=16228

> FYI: I just updated my T3500 to the latest BIOS A08 from Dell. It still doesn't
> boot 2.6.34 without the proposed patches. So still no BIOS fix...

I'll have to recall how to apply patches and compile the kernel. The last time I did it was years ago, on RHEL 5 or 6; I'm spoiled now by Ubuntu.

I found this in yet another thread: https://bugs.edge.launchpad.net/ubuntu/+source/linux/+bug/647043/comments/14

which has a link to a patched test kernel, with the same list of patches:

> I had Lisa do a series of tests. We subsequently discovered a series of patches that have been submitted upstream but are a bit
> invasive (ie. they won't land in upstream 2.6.36, and will likely have to wait for 2.6.37).
>
> http://marc.info/?l=linux-kernel&m=128476278029918&w=2
> Patch : https://patchwork.kernel.org/patch/189182/
> Patch : https://patchwork.kernel.org/patch/189232/
> Patch : https://patchwork.kernel.org/patch/189242/
> Patch : https://patchwork.kernel.org/patch/189252/
>
> I've subsequently built a test kernel with the above patches applied. Lisa has confirmed this fixes the issue and the
> USB ports work with this test kernel. The test kernel can be found at the following:
>
> http://people.canonical.com/~ogasawara/lp647043/i386/
>
> I'll have to discuss with the Ubuntu Kernel SRU team if they will qualify for a Stable Release Update to Maverick.

-avi

Avi Carmi (avi-carmi) wrote :

The A08 BIOS update did not resolve the problem.

pci=nocrs does work with the Ubuntu kernel.

I have not yet tried the patched test kernel (got to get some real work done...)

-avi

Cristian Măgherușan-Stanciu (cristi) wrote :

I'm also affected by this bug on my T3500.

Is there any other side effect of the "pci=nocrs" option besides fixing this bug?

Trigger (triggerds) wrote :

Hi,

After adding the option "pci=nocrs", I have not seen any real side effects. The only strange thing I can see is some text briefly displayed at startup and shutdown of Ubuntu, but nothing that affects the user.
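For anyone who wants the option to persist across reboots, here is a minimal sketch of the edit involved. It assumes a stock GRUB 2 setup where the default kernel parameters live in the GRUB_CMDLINE_LINUX_DEFAULT line of /etc/default/grub. The function name add_kernel_param and the sample text are made up for illustration; the sketch only transforms a string and does not touch any file.

```python
import re


def add_kernel_param(grub_text: str, param: str = "pci=nocrs") -> str:
    """Return grub_text with param present in GRUB_CMDLINE_LINUX_DEFAULT.

    Hypothetical helper: works on the text of /etc/default/grub only.
    """
    pattern = re.compile(
        r'^(GRUB_CMDLINE_LINUX_DEFAULT=")([^"]*)(")', re.MULTILINE
    )

    def patch(match):
        params = match.group(2).split()
        if param not in params:  # keep the edit idempotent
            params.append(param)
        return match.group(1) + " ".join(params) + match.group(3)

    new_text, count = pattern.subn(patch, grub_text)
    if count == 0:
        # No such line yet: append one instead of guessing at other settings.
        new_text = grub_text.rstrip("\n") + f'\nGRUB_CMDLINE_LINUX_DEFAULT="{param}"\n'
    return new_text


# Illustrative sample, not taken from an actual T3500.
sample = (
    'GRUB_DEFAULT=0\n'
    'GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"\n'
    'GRUB_CMDLINE_LINUX=""\n'
)
print(add_kernel_param(sample))
```

After changing the real file, the GRUB configuration has to be regenerated (on Ubuntu, by running update-grub as root) before the option takes effect on the next boot.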

Bjorn Helgaas (bjorn-helgaas) wrote :

This is a different problem from bug #647043.

In this problem, the Dell T3500 panics when we move a USB device into an e820-reserved area. There's a patch upstream that fixes this (commit 4dc2287c1805e7fe8a7cb90bbcd44abee8cdb914, see https://bugzilla.kernel.org/show_bug.cgi?id=16228).

In bug #647043, on a Dell 1536, we also move USB devices, but the new address is not in an e820-reserved area, and the machine doesn't panic. The USB devices just don't work. We don't understand this problem yet (see https://bugzilla.kernel.org/show_bug.cgi?id=31602).
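To make the distinction concrete, here is a rough sketch of how one might check from a dmesg log whether a given device address range lands in an e820-reserved area. It is not the kernel's logic. It assumes the "BIOS-e820: <start> - <end> (<type>)" line format printed by kernels of this era, with the end address treated as exclusive. The function names and the sample addresses are made up for illustration.

```python
import re

# Assumed dmesg line shape, e.g.:
#  BIOS-e820: 00000000bf780000 - 00000000bf790000 (reserved)
E820_RE = re.compile(
    r"BIOS-e820:\s+([0-9a-fA-F]+)\s+-\s+([0-9a-fA-F]+)\s+\((\w[\w ]*)\)"
)


def reserved_ranges(dmesg_text):
    """Collect (start, end) pairs for every e820 entry marked reserved."""
    ranges = []
    for m in E820_RE.finditer(dmesg_text):
        start, end, kind = int(m.group(1), 16), int(m.group(2), 16), m.group(3)
        if kind == "reserved":
            ranges.append((start, end))
    return ranges


def overlaps_reserved(bar_start, bar_size, ranges):
    """True if [bar_start, bar_start + bar_size) intersects any reserved range."""
    bar_end = bar_start + bar_size
    return any(bar_start < end and start < bar_end for start, end in ranges)


# Hypothetical log excerpt; the addresses are illustrative only.
SAMPLE = """\
 BIOS-e820: 0000000000000000 - 000000000009f000 (usable)
 BIOS-e820: 00000000bf780000 - 00000000bf790000 (reserved)
 BIOS-e820: 00000000bf790000 - 00000000bf79e000 (ACPI data)
"""

ranges = reserved_ranges(SAMPLE)
print(overlaps_reserved(0xBF788000, 0x1000, ranges))
print(overlaps_reserved(0xBF790000, 0x1000, ranges))
```

The first call asks about a range inside the sample reserved entry, the second about a range that starts exactly where that entry ends.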

Coincidentally, the same workaround, "pci=nocrs", works for both bugs.
