Magnum Troubleshooting Guide

Registered by Ton Ngo

Users and new contributors frequently ask for help on the Magnum IRC channel when debugging various problems. When no experienced developer is available to help, the user can be left frustrated with little recourse. This can also inhibit adoption of Magnum by cloud providers. Since many of these troubleshooting sessions follow similar patterns, it would be useful to document common scenarios in a troubleshooting guide.

At the contributor meeting at the Tokyo Design Summit, we decided to put together a skeleton for a Troubleshooting Guide so that contributors can fill it in with content over time.

Initially the following scenarios will be considered:

- What to do when a Bay create fails

- Heat client examples for debugging failed heat stacks

- How to introspect k8s (when heat works and k8s does not)

- How to check on a swarm cluster (see membership information, view master/agent containers)

- Cluster networking issues (whoops I don't have internet access!)

- TLS Issues

- Debugging Barbican issues

- etcd

- Docker CLI

Blueprint information

Adrian Otto
Ton Ngo
Ton Ngo
Series goal:
Accepted for newton
Milestone target:
Started by
Ton Ngo
Completed by
Adrian Otto


12-15-2016 - suro-patz
Swarm bay creation failed on devstack due to the default value of docker-volume-size. Hongbin helped me troubleshoot and fix it. You can either refer to the IRC log to include this case or, if there is a shared doc/file, let me know so that I can edit it and add this case.

Gerrit topic: bp/magnum-troubleshooting-guide

Addressed by:
    Skeleton for Troubleshooting Guide
    Add initial documentation for troubleshooting gate
    Add troubleshooting for network
    Troubleshooting Kubernetes networking
    Add Flannel troubleshooting
    Add etcd troubleshooting
    Add troubleshooting steps for trustee creation


Work Items

Initial Outline: DONE
[tango] Heat stacks: DONE
Barbican service: TODO
[tango] Cluster internet access: DONE
[tango] Kubernetes Networking: DONE
[tango] etcd service: INPROGRESS
[tango] flannel service: INPROGRESS
Kubernetes services: TODO
Swarm services: TODO
Mesos services: TODO
Barbican issues: TODO
Docker CLI: TODO
Request volume size: TODO
Heat software resource scripts: TODO
[thomas-maddox] Troubleshooting Gate: DONE
[tango] Debugging unit tests: INPROGRESS
[dimalg] Gate logs: TODO
Database migration: TODO
