Magnum Troubleshooting Guide
Users and new contributors frequently ask for help on the Magnum IRC channel with debugging various problems. When no experienced developer is available to help, the user can be left frustrated with little recourse, which can also inhibit adoption of Magnum by cloud providers. Since many of these troubleshooting sessions follow similar patterns, it would be useful to document common scenarios in a troubleshooting guide.
At the contributor meeting at the Tokyo Design Summit, we decided to put together a skeleton for a Troubleshooting Guide that contributors can fill in with content over time.
Initially the following scenarios will be considered:
- What to do when a Bay create fails
- Heat client examples for debugging failed heat stacks
- How to introspect k8s (when heat works and k8s does not)
- How to check on a swarm cluster (see membership information, view master/agent containers)
- Cluster networking issues (whoops I don't have internet access!)
- TLS Issues
- Debugging Barbican issues
- etcd
- Docker CLI
Blueprint information
- Status: Complete
- Approver: Adrian Otto
- Priority: High
- Drafter: Ton Ngo
- Direction: Approved
- Assignee: Ton Ngo
- Definition: Approved
- Series goal: Accepted for newton
- Implementation: Implemented
- Milestone target: None
- Started by: Ton Ngo
- Completed by: Adrian Otto
Related bugs
- Bug #1548371: Document how to resolve the error "Multiple head revisions are present for given argument 'head'" (Fix Released)
Whiteboard
12-15-2016 - suro-patz
Swarm bay creation failed on devstack because of the default value of docker-volume-size. Hongbin helped me troubleshoot and fix it. You can refer to the IRC log to include this case, or, if there is a shared doc/file, let me know and I can add it myself.
http://
Gerrit topic: https:/
Addressed by: https:/
Skeleton for Troubleshooting Guide
Addressed by: https:/
Add initial documentation for troubleshooting gate
Addressed by: https:/
Add troubleshooting for network
Addressed by: https:/
Troubleshooting Kubernetes networking
Addressed by: https:/
Add Flannel troubleshooting
Addressed by: https:/
Add etcd troubleshooting
Addressed by: https:/
Add troubleshooting steps for trustee creation
Work Items
Initial Outline: DONE
[tango] Heat stacks: DONE
TLS: TODO
Barbican service: TODO
[tango] Cluster internet access: DONE
[tango] Kubernetes Networking: DONE
[tango] etcd service: INPROGRESS
[tango] flannel service: INPROGRESS
Kubernetes services: TODO
Swarm services: TODO
Mesos services: TODO
Barbican issues: TODO
Docker CLI: TODO
Request volume size: TODO
Heat software resource scripts: TODO
[thomas-maddox] Troubleshooting Gate: DONE
[tango] Debugging unit tests: INPROGRESS
[dimalg] Gate logs: TODO
Database migration: TODO