Mir/XMir Quality and Performance Benchmarking for Saucy

Registered by Francis Ginther

To keep up with Mir/XMir development, we need to focus on improving the QA strategy and on assessing quality through benchmarking and other testing. This includes:
- broadening test cases
  - performance (more performance test suites)
  - unit/integration

- improving the performance dashboard
  - what views provide the most readable data
  - showing trends over time
  - better separation of systems/benchmarks, ...

Blueprint information

Status: Complete
Approver: Francis Ginther
Priority: Undefined
Drafter: Francis Ginther
Direction: Approved
Assignee: Canonical Platform QA Team
Definition: Obsolete
Series goal: Proposed for saucy
Implementation: Unknown
Milestone target: None
Completed by: Gema Gomez

Whiteboard

Goals:
- broadening test cases
  - performance (more performance test suites)
  - unit/integration
- improving the performance dashboard
  - what views provide the most readable data
  - showing trends over time
  - better separation of systems/benchmarks, ...

Current State of Testing:
- Performance benchmarking on the XMir test ring
- Stress testing as part of the -ci job
- On-demand testing of PPAs using the XMir test ring

Needs:
- Multi-monitor testing
  - May be able to utilize cert lab
  - Believed to have the hardware in the QA Lab to do this
    - Call for testing plan: https://wiki.ubuntu.com/Mir/MultiMonitorTesting
    - Unity 7 had multi-monitor tests
    - Need to test physical hot-plugging (priority 1); see the detection sketch after this list
    - Settings changes: mirror/extend (priority 2)
    - Change resolution (priority 3) (already automated by checkbox)
    - Change orientation (priority 3) (already automated by checkbox)
    - Use Unity 7 as the baseline for XMir
    - Multi-monitor demo client
- Screen blanking, suspend/resume testing
- Valve/Steam testing (add)
  - Run a specific set of games to cover Valve's needs
  - Evaluate test cases for Valve and revisit these as additional performance benchmarks
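
A minimal sketch of how the hot-plug detection step could be automated, assuming a Python harness. Polling /sys/class/drm/*/status is a real kernel interface, but the 30-second window, the file name, and everything else here are illustrative assumptions, not an existing test:

    # drm_hotplug_check.py - poll DRM connector status for a hot-plug event.
    import glob
    import time

    def connector_states():
        # Each connector exposes "connected" or "disconnected" in sysfs.
        states = {}
        for path in glob.glob("/sys/class/drm/card*-*/status"):
            with open(path) as f:
                states[path] = f.read().strip()
        return states

    before = connector_states()
    print("Plug or unplug a monitor within 30 seconds...")
    time.sleep(30)  # assumed window for the operator or test rig to act
    after = connector_states()
    changed = sorted(p for p in before if before[p] != after.get(p))
    print("Hot-plug detected on: %s" % (changed or "nothing"))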

Testing Gates:
- Upstream:
  - build, unit and stress test
- Daily Release:
  - No specific testing, tested through higher layers
  - Are there Mir- or XMir-specific integration tests?
    - No; use the existing Unity 7 tests and augment them with power and FPS collection
    - For native Mir, we would like to test the Qt/Mir backend. A plan existed, but it was never implemented.
      - Power testing: we do have tests from the cert lab that we can utilize
        - Need to make sure we don't have severe regressions in battery life.
      - Performance: relying on the stated benchmarks
        - Want to focus on this for Mir; XMir is less of a concern since it uses the X stack
        - Want to collect a baseline now
        - Will monitor the performance benchmarks to determine whether performance has dropped below a threshold (see the regression sketch after this list)
- Image:
  - Desktop:
    - None at this time
    - Once we hit feature freeze, XMir will be the default and will be tested through higher layers
  - Touch Image:
    - None at this time
    - Need to add input tests separate from application tests
    - Instrument frame rate measurements:
      - Add to LTTng reporting?
      - Could be impacted by underlying changes
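
A minimal sketch of the regression gate described above, assuming frame timestamps are already being collected (e.g. via the LTTng instrumentation mentioned earlier) and dumped to JSON. The file layouts, the 10% tolerance, and all names are assumptions, not an existing tool:

    # fps_regression.py - flag runs whose mean FPS falls below the baseline.
    import json
    import sys

    TOLERANCE = 0.10  # assumed: fail if more than 10% below baseline

    def fps_samples(timestamps):
        # Convert frame timestamps (in seconds) into per-interval FPS values.
        deltas = [b - a for a, b in zip(timestamps, timestamps[1:])]
        return [1.0 / d for d in deltas if d > 0]

    def check(run_file, baseline_file):
        with open(run_file) as f:
            timestamps = json.load(f)   # e.g. [0.000, 0.016, 0.033, ...]
        with open(baseline_file) as f:
            baseline = json.load(f)     # e.g. {"mean_fps": 60.2}
        samples = fps_samples(timestamps)
        assert samples, "need at least two frame timestamps"
        mean_fps = sum(samples) / len(samples)
        floor = baseline["mean_fps"] * (1 - TOLERANCE)
        print("mean FPS %.1f (baseline %.1f, floor %.1f)"
              % (mean_fps, baseline["mean_fps"], floor))
        return mean_fps >= floor

    if __name__ == "__main__":
        sys.exit(0 if check(sys.argv[1], sys.argv[2]) else 1)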

Current State of Performance Benchmarking:
- http://reports.qa.ubuntu.com/graphics/openarena/

Performance Benchmarking Needs:
- Comparison of X vs. XMir for different machine/test combinations, e.g. I need to be able to see that XMir was 7% slower than X on the intel/openarena tests (see the sketch after this list).
  - Mock-up (thanks Thomi): https://plus.google.com/110637625325308955144/posts/9C6JmySiXGZ
- A trend of performance for XMir (and maybe X as well) over time. I need to be able to see that XMir FPS has improved over the last month's worth of test runs.
- Categorization of data:
  - Comparison by GPU manufacturer and/or chipset
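
To make the first need concrete, a sketch of the percent-difference calculation; the FPS figures are made up to match the 7% example above:

    # Percent difference between X and XMir for one machine/test combination.
    def percent_slower(x_fps, xmir_fps):
        """How much slower XMir is than X, as a percentage of the X result."""
        return (x_fps - xmir_fps) / x_fps * 100.0

    x_mean = 61.0     # assumed mean FPS under plain X (intel/openarena)
    xmir_mean = 56.7  # assumed mean FPS under XMir
    print("XMir is %.0f%% slower than X" % percent_slower(x_mean, xmir_mean))
    # -> XMir is 7% slower than X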

What are the questions we're trying to answer with the graphs:
 - Has XMir performance changed over the past week?
 - How does X compare with XMir?

What would be helpful:
 - Keep the tabs per benchmark, then separate the graphs by chipset
 - Use line graphs for trend analysis (priority 2); see the plotting sketch after this list
 - Show a single set of x/xmir results to show percent difference (priority 1)
 - Look at what http://openbenchmarking.org/ is doing
 - Investigate the use of more phoronix-test-suite benchmarks
 - Some 'free' benchmarks from the cert lab
 - Provide test coverage for Mesa (drive requirements, then create a plan for the next UDS)
 - Benchmarking of Unity itself?
   - This has been discussed, but we have yet to see performance concerns in shell components.
   - No work is planned.
 - Graphical Corruption
   - Tests exist in phoronix-test-suite
   - Need to address for 14.04
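
A sketch of the per-chipset trend graph suggested above, assuming the dashboard data can be exported as CSV; the file name and the date/chipset/fps column names are illustrative, since the real dashboard layout may differ:

    # trend_graph.py - one FPS trend line per chipset for a single benchmark tab.
    import csv
    from collections import defaultdict
    import matplotlib.pyplot as plt

    series = defaultdict(lambda: ([], []))   # chipset -> (dates, fps values)
    with open("openarena_xmir.csv") as f:    # assumed export of dashboard data
        for row in csv.DictReader(f):
            dates, fps = series[row["chipset"]]
            dates.append(row["date"])
            fps.append(float(row["fps"]))

    for chipset, (dates, fps) in sorted(series.items()):
        plt.plot(dates, fps, marker="o", label=chipset)
    plt.xlabel("test run date")
    plt.ylabel("mean FPS")
    plt.title("openarena: XMir FPS trend by chipset")
    plt.legend()
    plt.savefig("openarena_trend.png")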

Action Items:
 - [thomi, thomas.v, robert, kevin] Define the top priorities for additions to the dashboard
 - [qa team] Automation of multi-monitor testing

Application Support
 - Support for X applications will continue after the switch from XMir to Mir
   - Legacy support for X
   - Native support for Qt, etc.
 - Support for mixed applications (Qt, but with some X)
   - Need to mine some data to determine which apps these are and how many there are
   - Would most likely be run in the legacy container

 - We do have a list of applications that need to be supported

Work Items