Linaro Android Infrastructure

Experiment with separation of "infrastructure" vs "compile" failures in Jenkins

Registered by Paul Sokolovsky on 2011-11-08

Currently, there is no separation between "infrastructural" errors during a build (e.g., checkout failed due to a network error) and actual compile errors. This causes a lot of false negatives in CI. Try to separate the different types of failures, preferably in a way that is easily visible in Jenkins (e.g., using the "unstable" vs "failed" build statuses).

Blueprint information

Status: Complete

Approver: Данило Шеган

Priority: Medium

Drafter: Paul Sokolovsky

Direction: Approved

Assignee: Paul Sokolovsky

Definition: Approved

Series goal: None

Implementation: Implemented

Milestone target: 11.11

Started by: Paul Sokolovsky on 2011-11-18

Completed by: Paul Sokolovsky on 2011-11-23

Whiteboard

Notes:
[pfalcon 2011-11-10] From mail:
Ok, I did some Jenkins research, and here's the difference between a
"failed" and an "unstable" build:
https://wiki.jenkins-ci.org/display/JENKINS/Terminology

I.e. if a build failed to compile (that includes the checkout phase
too), it is "failed" (red ball). If it compiled, but there is a special
publisher which can run tests, and those tests fail, it is "unstable"
(yellow ball).

That's not exactly what we want, but we could try to shift the semantics
of "failed" to "there was an infrastructural issue with setting up the
build" and of "unstable" to "compilation error". I'm not sure that's
actually possible, but I've got an idea to try. Its cornerstone is the
ability of a build to set its own status from within the slave using the
set-build-result CLI command: https://android-build.linaro.org/jenkins/cli .
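
The shift in semantics described in this mail can be sketched as a small lookup. This is only an illustration in Python, not anything Jenkins itself provides: the phase names and the helper `status_for` are hypothetical, and only the status names "failed" and "unstable" come from the Jenkins terminology page.

```python
# Illustrative sketch only. Phase names and status_for() are hypothetical;
# the status strings follow the Jenkins terminology quoted in the mail.

# Stock Jenkins meaning, as described in the mail.
STOCK = {"checkout": "failed", "compile": "failed", "test": "unstable"}

# Proposed meaning: setup problems stay "failed", compile errors become "unstable".
PROPOSED = {"checkout": "failed", "compile": "unstable"}


def status_for(failed_phase, semantics):
    """Return the build status for the phase that failed, or "success" if none did."""
    if failed_phase is None:
        return "success"
    return semantics[failed_phase]
```

With this mapping, a network error during checkout and a compiler error produce different statuses under `PROPOSED`, while under `STOCK` both are reported as "failed".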

[pfalcon 2011-11-23] I tried jenkins-cli and, well, of course it requires auth. And auth is rather cumbersome: it first goes over HTTP to get the Java JNLP port, which it then uses for the actual communication. Consequently, that port needs to be open in the firewall (not our case). It can use HTTP transport as an alternative, but that doesn't work with Apache proxying. In the latest Jenkins version, they also throw in a bit of SSH keys to "improve security". Bottom line? Jenkins CLI is a big, big mess. And even if we wade through it, we'll need to store credentials on the slaves, which we try to avoid as much as possible.

What we really need is a very simple thing: the "Shell script" build runner should not treat every non-zero exit code as "build failed", but should recognize some special exit codes as meaning other build statuses. I finally took a good amount of time to review the whole wealth of Jenkins plugins (found a few we really should leverage, e.g. lp:890860), but didn't find such a thing! So I went on to prototype such a plugin and got it running. So that's where it stands: there's a strategy for a solution, and there's a (local) prototype. That constitutes successful completion of this BP; further work includes turning this prototype into a product (set up git, clean up and commit stuff, test, propose a plan for deployment, deploy).
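
The exit-code idea can be illustrated with a minimal sketch. It is written in Python for brevity and is not the prototype plugin itself; the value 42 for `EXIT_UNSTABLE` and the helper names are assumptions made up for this example, not codes the plugin is known to use.

```python
import subprocess
import sys

# Assumed convention, for illustration only: a build script exits with this
# code to report a compile error rather than an infrastructure problem.
EXIT_UNSTABLE = 42


def result_from_exit_code(code):
    """Translate a build step's exit code into a build result name."""
    if code == 0:
        return "SUCCESS"
    if code == EXIT_UNSTABLE:
        return "UNSTABLE"
    # Any other non-zero code is treated as an infrastructural failure.
    return "FAILURE"


def run_step(argv):
    """Run one build step as a subprocess and classify its outcome."""
    completed = subprocess.run(argv)
    return result_from_exit_code(completed.returncode)


if __name__ == "__main__":
    # Stand-in for a real build script that hit a compile error.
    print(run_step([sys.executable, "-c", "import sys; sys.exit(42)"]))
```

The build script stays in control of the classification: it only has to pick its exit code, so no credentials or CLI access are needed on the slave.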

Headline:
Experiment with separating infrastructural and compile failures using the existing Jenkins build statuses "failed" and "unstable".

Acceptance:
Either prove that reusing "failed"/"unstable" for our case is not feasible, or set up at least one build that visibly separates infrastructural and compile failures using those status values.

Work Items

Work items:
Figure out suitable Jenkins API to mark build as "unstable": DONE
Prototype failure types separation: DONE
Lay out plan for further work: DONE
