Sending a signature to the crash DB before the full core dump

Registered by Evan

Investigate whether we can work around two problems: core dumps cannot be retraced on the local system, and ASLR makes it difficult to generate an accurate retrace without function names. If we could create a stack trace from addresses alone, we could submit that small chunk of data to the crash database before sending the full data set. The crash database could then decide whether the crash is already known or whether it needs the full core dump, so that it can retrace it and create a new crash bucket.
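One way to picture an address-only stack trace that survives ASLR is to rewrite each frame address as an offset into the file that is mapped there. The sketch below is only an illustration under stated assumptions, not the format apport uses: the Mapping type, the address_signature function and the "path+offset" layout are hypothetical names, and each file is assumed to be mapped as a single region starting at its load base.

```python
import bisect
from typing import List, NamedTuple, Optional


class Mapping(NamedTuple):
    """Hypothetical record for one mapped region of the crashed process."""
    start: int  # first address of the region (the randomized load base)
    end: int    # one past the last address of the region
    path: str   # file backing the region, e.g. a shared library


def address_signature(frames: List[int], mappings: List[Mapping]) -> Optional[str]:
    """Rewrite absolute frame addresses as "path+offset" entries.

    Returns None if any frame lies outside every known mapping, since a
    partial signature could match unrelated crashes.
    """
    ordered = sorted(mappings)              # sorts by start address first
    starts = [m.start for m in ordered]
    parts = []
    for addr in frames:
        # Find the last mapping whose start is <= addr.
        i = bisect.bisect_right(starts, addr) - 1
        if i < 0 or addr >= ordered[i].end:
            return None
        m = ordered[i]
        # Assumption: the region starts at file offset 0, so the
        # offset is simply the distance from the load base.
        parts.append(f"{m.path}+{addr - m.start:x}")
    return ":".join(parts)


# Two runs of the same program with different (randomized) libc bases.
run_a = [Mapping(0x400000, 0x500000, "/usr/bin/app"),
         Mapping(0x7F0000000000, 0x7F0000100000, "/lib/libc.so.6")]
run_b = [Mapping(0x400000, 0x500000, "/usr/bin/app"),
         Mapping(0x7F1230000000, 0x7F1230100000, "/lib/libc.so.6")]
sig_a = address_signature([0x7F0000001234, 0x400042], run_a)
sig_b = address_signature([0x7F1230001234, 0x400042], run_b)
```

Because the load base is subtracted out, both runs produce the same string even though the absolute addresses differ, which is what would let the crash database compare signatures from different machines.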

Google's Breakpad is able to do this through Microsoft's Minidump format. We may be able to mimic this behavior in apport without having to link every application to Breakpad.

Blueprint information

Status:
Started
Approver:
Steve Langasek
Priority:
Medium
Drafter:
Evan
Direction:
Approved
Assignee:
Evan
Definition:
Approved
Series goal:
Accepted for precise
Implementation:
Started
Milestone target:
precise-alpha-2
Started by
Evan

Whiteboard

[vorlon] N.B.: Breakpad integration should probably be discussed with the security team, since arbitrary ptrace support is disabled by default on Ubuntu as a security measure. They can probably advise on how best to implement this to maximize the utility without accidentally compromising security.

Work items:
[pitti] Change the infrastructure to keep around the stack traces: DONE
[pitti] Duplicate signature: DONE
[pitti] add client-side API, query for duplicate checking, and corresponding UI behaviour: DONE
[pitti] update Apport CrashDatabase duplicate DB for matching/bucketing crashes with different client signatures: DONE
Investigate how and when we can attach breakpad to the dying process or to the core dump (preferred) so that we can generate a breakpad report: POSTPONED
Is there information that Breakpad is collecting that we're not collecting, and how can we collect that? Breakpad currently ptraces the process, which might give it information that we cannot get at because we do not ptrace: POSTPONED
Does breakpad give us better reports than the simple address list? If breakpad has magic to get symbol names anyway, then we wouldn't have to do the delta matching of stack frame lists: POSTPONED
[ev] Integrate crash signature into Cassandra work. Take signature, report back with a "yes we have that.": DONE
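The signature-first exchange that these work items describe can be sketched roughly as follows. This is a minimal in-memory illustration, not the apport CrashDatabase API or the Cassandra schema: SignatureIndex, submit_crash and the upload_core callback are hypothetical names, and upload_core stands in for "send the full core dump, have it retraced, and get a bucket back".

```python
from typing import Callable, Dict, Optional, Tuple


class SignatureIndex:
    """Hypothetical server-side map from address signature to crash bucket."""

    def __init__(self) -> None:
        self._bucket_for: Dict[str, int] = {}

    def lookup(self, signature: str) -> Optional[int]:
        """Return the bucket for a known signature, or None if unknown."""
        return self._bucket_for.get(signature)

    def register(self, signature: str, bucket_id: int) -> None:
        # Several client signatures may point at the same bucket.
        self._bucket_for[signature] = bucket_id


def submit_crash(index: SignatureIndex, signature: str,
                 upload_core: Callable[[], int]) -> Tuple[int, bool]:
    """Send the small signature first; upload the core only if it is unknown.

    Returns (bucket_id, uploaded), where uploaded says whether the full
    core dump had to be sent.
    """
    bucket = index.lookup(signature)
    if bucket is not None:
        return bucket, False          # "yes we have that"
    bucket = upload_core()            # full data set, retraced server-side
    index.register(signature, bucket)
    return bucket, True


# Usage: the second report with the same signature skips the upload.
uploads = []

def fake_upload() -> int:
    uploads.append("core")
    return 1001  # made-up bucket id

index = SignatureIndex()
first = submit_crash(index, "/usr/bin/app+42", fake_upload)
second = submit_crash(index, "/usr/bin/app+42", fake_upload)
```

Keeping the lookup keyed on the signature alone means a crash that is reported with a new signature can later be attached to an existing bucket through register, without changing the client.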

work items for ubuntu-12.04-beta-1:
[pitti] investigate whether it is worth restricting address signature matching to a common prefix: DONE

pitti, 2012-01-10: Address signatures have been in use for 6 weeks now, and we have a large enough sample size. 679 (87%) of crashes have just one address signature, 771 (99.2%) have <= 5, and only one crash has 10 signatures. This shows that we do not need to worry about prefix matching right now; the system works well enough (and also more correctly) with full signature matching.
SELECT count(signature) AS numsigs, crash_id FROM address_signatures GROUP BY crash_id ORDER BY numsigs;
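To turn that query into the kind of percentages quoted above, the per-crash counts can be folded into a histogram. The following is a sketch against an in-memory SQLite table, assuming only the two columns the query names; the helper names and the sample rows are made up for illustration and are not the real crash data.

```python
import sqlite3
from collections import Counter


def signatures_per_crash(conn: sqlite3.Connection) -> Counter:
    """Map 'number of signatures' -> 'number of crashes with that many'."""
    rows = conn.execute(
        "SELECT count(signature) AS numsigs, crash_id "
        "FROM address_signatures GROUP BY crash_id ORDER BY numsigs"
    ).fetchall()
    return Counter(numsigs for numsigs, _crash_id in rows)


def share_at_most(histogram: Counter, limit: int) -> float:
    """Fraction of crashes that have at most `limit` signatures."""
    total = sum(histogram.values())
    within = sum(n for sigs, n in histogram.items() if sigs <= limit)
    return within / total if total else 0.0


# Made-up sample data: crash 1 has two signatures, crashes 2 and 3 have one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE address_signatures (signature TEXT, crash_id INTEGER)")
conn.executemany(
    "INSERT INTO address_signatures VALUES (?, ?)",
    [("sig-a", 1), ("sig-b", 1), ("sig-c", 2), ("sig-d", 3)],
)
hist = signatures_per_crash(conn)
single = share_at_most(hist, 1)   # share of crashes with exactly one signature
```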
