Builds regularly hang at the very end of build process
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Linaro Android Infrastructure | Won't Fix | Medium | Unassigned | | |
Bug Description
https:/
Actually, on second thought, the SSH transfer is just one build step, and the entire build process should be covered by the Build Timeout plugin, so that plugin evidently doesn't work that reliably (I tested that it does work in the general sense, of course). Also, the SSH plugin has its own timeouts for networking operations; they don't help either (or the hang is not in network access).
So, issues should be reported upstream to both plugins. However, it's clear that we need another, last-resort stop-gap measure to kill runaway build slaves. Such a measure was requested back in May 2011, and its implementation has been slipping since then, because the issues come in irregular waves: it hits, we're concerned; it subsides, we think we fixed it, and other pressing projects take over. Well, we can't fix it: it's all a stream of random errors rooted in the complexity of the systems we use. The system has many layers, so we should fight errors on many levels to make it robust.
Suggestion: create a BP for this and schedule it for immediate execution (12.03).
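As an illustration of what a last-resort watchdog might look like at the process level, here is a minimal sketch. It is not the mechanism used by Jenkins, the Build Timeout plugin, or the SSH plugin; the names `run_with_deadline`, `deadline_s` and `grace_s` are hypothetical, and it assumes a POSIX host where the build step can be started as a child process.

```python
import os
import signal
import subprocess


def run_with_deadline(cmd, deadline_s, grace_s=5.0):
    """Run cmd; return its exit code, or None if it had to be killed.

    Hypothetical helper, POSIX only: it relies on process groups.
    """
    # Start the command in its own session so that the whole process
    # group (the command plus any children it spawns) can be signalled.
    proc = subprocess.Popen(cmd, start_new_session=True)
    try:
        return proc.wait(timeout=deadline_s)
    except subprocess.TimeoutExpired:
        pass

    # Deadline passed: ask politely with SIGTERM, then force with SIGKILL.
    for sig in (signal.SIGTERM, signal.SIGKILL):
        try:
            os.killpg(proc.pid, sig)
        except ProcessLookupError:
            break  # the group is already gone
        try:
            proc.wait(timeout=grace_s)
            break
        except subprocess.TimeoutExpired:
            continue
    return None
```

A wrapper like this sits outside the build step it guards, so it would still fire if a timeout inside the step never triggers. Whether the real fix belongs at this layer or at the level of the build slave instance is left open here.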
Related branches
- Stevan Radaković: Pending requested
- Linaro Infrastructure: Pending requested
- Diff: 13 lines (+4/-0), 1 file modified: utils/mangle-jobs/builders.xml (+4/-0)
summary:
- Build regularly hang at the very end of build process
+ Builds regularly hang at the very end of build process
+ /mnt/jenkins/workspace/linaro-android_vexpress-ics-gcc46-armlt-stable-open/build-tools/build-scripts/post-build-lava.py
Don't know how to test this board. Skip testing.
SSH: Connecting from host [ip-10-243-34-224]
SSH: Connecting with configuration [snapshots.linaro.org] ...
SSH: Disconnecting configuration [snapshots.linaro.org] ...
SSH: Transferred 0 file(s)
SSH: Connecting from host [ip-10-243-34-224]
SSH: Connecting with configuration [snapshots.linaro.org file-move] ...
SSH: EXEC: STDOUT/STDERR from command [reshuffle-files linaro-android_vexpress-ics-gcc46-armlt-stable-open/101] ...
WARNING: Expected directory /home/android-build-linaro/android/.tmp/linaro-android_vexpress-ics-gcc46-armlt-stable-open/101 does not exist
SSH: EXEC: completed after 201 ms
SSH: Disconnecting configuration [snapshots.linaro.org file-move] ...
SSH: Transferred 0 file(s)
So, it happened here on the 2nd batch of transfers, when we transfer lava-job-info. That transfer completed; 0 files were transferred because we still don't have support for the vexpress LAVA config in android-build. But it never moved on to the next step, calling reshuffle-files.