Solum

Logging Architecture

Registered by Paul Montgomery on 2013-10-14

Solum would benefit from a centralized logging solution which can provide useful information about the system to audiences ranging from customers to Solum administrators. The recommendation is to utilize Project Meniscus as the basis for centralized Solum logging.

From past experience, there are some security aspects of logging to consider for the project. For example, it is feasible to imagine a scenario where exception data may contain Solum administrative information which should not be displayed to a customer (example: a database exception could contain credential information that a customer should not be able to access). With this in mind, these are some goals to consider for Solum logging:

* The ability to easily tie customer-visible logs with the administrator-only components of the log (more detailed information) to enable customer support and root causing of issues
* Make the code highlight confidential vs customer-visible log data (and have an API to support this)
* A way to retain automatic logging of file/module and line number for enhanced Solum developer debugging/troubleshooting
* Support any Python data type as a log component
* Capability to bind information to a future log call to trace state and carry information forward (similar to StructLog)
* Maintain bind information and log state across modules
* Look/act like Python logging module where possible
* JSON log output
* Single code location to make formatting or content changes
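
As a rough illustration of how several of these goals could fit together, here is a minimal sketch. It is not the code proposed for Solum or anything from Oslo: the `TraceData` class, its `bind` and `to_json` methods, and the field names are all hypothetical. It shows a shared request ID tying the customer-visible and administrator-only parts of a log entry together, bound fields carried forward to later log calls, and JSON output produced in one place.

```python
import json
import logging


class TraceData:
    """Hypothetical container that keeps customer-visible and
    administrator-only log data apart."""

    def __init__(self, request_id):
        # The request ID ties the public and private parts of an entry together.
        self.request_id = request_id
        self.public = {}
        self.private = {}

    def bind(self, private=False, **fields):
        """Attach fields that every later log entry will carry forward."""
        target = self.private if private else self.public
        target.update(fields)
        return self

    def to_json(self, message, include_private=True):
        """Render one log entry as JSON (the single place to change format).

        default=str lets any Python object be logged via its string form.
        """
        entry = {
            "request_id": self.request_id,
            "message": message,
            "public": self.public,
        }
        if include_private:
            entry["private"] = self.private
        return json.dumps(entry, default=str, sort_keys=True)


trace = TraceData(request_id="req-123")
trace.bind(assembly="demo-app")               # customer-visible
trace.bind(private=True, db_host="10.0.0.5")  # administrator-only

admin_line = trace.to_json("deploy failed")
customer_line = trace.to_json("deploy failed", include_private=False)

# The standard logging module still records module and line number.
logging.getLogger("solum.example").error(admin_line)
```

Because both renderings share the same `request_id`, a support engineer could look up the administrator-only entry that corresponds to whatever the customer sees.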

Propose updating HACKING.rst to include logging rules such as:
* Log administrator-only data (anything that customers should not see) in 'private' log locations only (which may be filtered out before sending to customers)
* Do not log plain text private keys or passwords
* Only log the minimal amount of customer PII required for debugging/troubleshooting
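
To make the first two rules concrete, here is a small sketch of filtering applied before a log entry is sent to a customer. The `redact` and `customer_view` helpers, the `private` section name, and the list of sensitive key names are assumptions made for illustration; they are not part of Solum or Oslo.

```python
import json

# Assumed key names; a real deployment would maintain its own list.
SENSITIVE_KEYS = {"password", "private_key", "token"}
MASK = "***"


def redact(value):
    """Return a copy of value with sensitive dict entries masked."""
    if isinstance(value, dict):
        return {
            key: MASK if str(key).lower() in SENSITIVE_KEYS else redact(item)
            for key, item in value.items()
        }
    if isinstance(value, (list, tuple)):
        return [redact(item) for item in value]
    return value


def customer_view(entry):
    """Drop the administrator-only 'private' section, then mask the rest."""
    visible = {key: item for key, item in entry.items() if key != "private"}
    return redact(visible)


entry = {
    "message": "database connection failed",
    "public": {"user": "alice", "password": "hunter2"},
    "private": {"dsn": "mysql://solum:secret@db/solum"},
}
safe = customer_view(entry)
print(json.dumps(safe, sort_keys=True))
```

This only masks values by key name. A secret embedded in a free-form message string would pass through untouched, which is one reason to keep administrator-only data in a separate location in the first place.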

Note: There is example code for a deeper dive on this topic if there is interest.

Note: Please leverage/improve code in Oslo:
http://git.openstack.org/cgit/openstack/oslo-incubator/tree/openstack/common/log.py

Blueprint information

Status:: Complete

Approver:: Adrian Otto

Priority:: Medium

Drafter:: Paul Montgomery

Direction:: Approved

Assignee:: Paul Montgomery

Definition:: Obsolete

Series goal:: Accepted for icehouse

Implementation:: Not started

Milestone target:: 2014.1.1

Completed by: Adrian Otto on 2015-06-11

Related branches

Related bugs

Sprints

Whiteboard

paulmo:
https://review.openstack.org/#/c/71970/ has been merged and includes a trace/log architecture which enables up-front identification of data confidentiality.

===== paulmo added =====

I am proposing the following steps:

* Split logging into two blueprints
* The M1 blueprint would include:
  - Identification of confidential data in logs by following the rules in https://wiki.openstack.org/wiki/Solum/Logging
  - Structured logging
  - Addition of tenant/project ID and any unique user identification needed in logs to filter for each user appropriately
    - This may need discussion, as full authentication and RBAC aren't planned for M1
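
One way the tenant/project ID step might look, using only the standard library: a `logging.LoggerAdapter` attaches the tenant ID to every record, and a handler-level filter keeps only one tenant's records. This is a sketch under assumptions. `TenantFilter`, `ListHandler`, and the `tenant_id` attribute name are hypothetical, and how the tenant ID would be obtained without full authentication is left open.

```python
import logging


class ListHandler(logging.Handler):
    """Collects records in memory; stands in for a per-tenant log stream."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


class TenantFilter(logging.Filter):
    """Passes only records tagged with the given tenant ID."""

    def __init__(self, tenant_id):
        super().__init__()
        self.tenant_id = tenant_id

    def filter(self, record):
        return getattr(record, "tenant_id", None) == self.tenant_id


base = logging.getLogger("solum.example.tenant")
base.setLevel(logging.INFO)
base.propagate = False

handler = ListHandler()
handler.addFilter(TenantFilter("tenant-a"))
base.addHandler(handler)

# LoggerAdapter copies its 'extra' dict onto every record it emits.
log_a = logging.LoggerAdapter(base, {"tenant_id": "tenant-a"})
log_b = logging.LoggerAdapter(base, {"tenant_id": "tenant-b"})

log_a.info("build started")
log_b.info("build started")

tenants_seen = [record.tenant_id for record in handler.records]
```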

Depending on investigation into current Oslo log updates in flight, there may be another >= M2 logging blueprint which would incorporate any Oslo log changes desired by Solum.

---------

Since the oslo logging code has been integrated, should we call logging implemented at least as a first step for milestone-1? --russellb

I am not convinced that we have achieved the spirit of this blueprint yet. We want to make Solum easy for our target persona to understand and debug. I think the Oslo-style logging definitely helps OpenStack operators (the service providers), but I don't think it really helps the individual Application Developers who will be consuming Solum. I'd like a way for them to easily get a log stream of events relevant to their own deployments (both in real time and retroactively), in a way that allows for suitable security masking of PII and secrets. The sooner we land this capability, the better, because as we add functionality and features, we can incrementally add the trace logging calls. This is more about "user logs" than "system logs". For example: where can users check to see that the gate tests in their CI failed? If we did not have spare capacity in the cloud to deploy their app, how do we express that to them in a context they will understand? --aotto

Gerrit topic: https://review.openstack.org/#q,topic:bp/logging,n,z

Addressed by: https://review.openstack.org/55102
Import logging from oslo-incubator

Gerrit topic: https://review.openstack.org/#q,topic:bp/rest-api-base,n,z

=== Note added by Devdatta Kulkarni =============

1) Paul M, is it correct to say that the mask_password functionality in oslo/log.py is similar in spirit to the point about highlighting confidential data vs. customer-visible data?
While mask_password is a very specific thing, I believe what you are suggesting is a generic way to mask and/or differentiate data that should not be exposed in the logs from data that is okay to expose. Is that correct?

2) I haven't used oslo/log, but based on the docstring at the top of the code, it seems like the main purpose of oslo/log.py is to decorate logging messages with a specific formatting context that the consumer can specify. Paul M, if I understand correctly, what you are suggesting as part of this blueprint is the ability to bind logging data and carry it forward from one module to another. I believe this is similar to the 'context object' in the current oslo/log, but carrying more than just formatting information.

=== Note added by Devdatta Kulkarni =============

I am not familiar with Oslo log so I can't really answer right now.
1) Yes. A password is only one of MANY pieces of confidential information.
2) I am not sure what capabilities it has. Testing if information persists across modules and many other things would need to be attempted before knowing these kinds of answers. StructLog did NOT persist across modules even though it appeared to do so from my experimentation.

>> Cool.
Regarding #1, I think what you are proposing above is a generic logging framework that would provide a flexible way to specify and store different kinds of confidential information, in a manner that lets it be easily filtered out of the information available to a client.

Regarding #2, I think the ability to persist log information across modules seems like a nice-to-have feature as well.

If there are any code examples that you can share about this, that would be awesome!!
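
For what it's worth, here is one possible shape for carrying bound log information across module boundaries. It is only a sketch, not the example code mentioned in the blueprint, and it assumes Python 3.7+ for the standard-library `contextvars` module. The `bind` and `render` helpers are hypothetical, and the two functions stand in for code living in separate modules.

```python
import contextvars
import json

# Bound fields live in a context variable rather than on a logger object,
# so any module that imports these helpers sees the same state.
_bound = contextvars.ContextVar("bound_log_fields", default={})


def bind(**fields):
    """Add fields that all later log entries in this context will carry."""
    merged = dict(_bound.get())  # copy, so the shared default is never mutated
    merged.update(fields)
    _bound.set(merged)


def render(message):
    """Render a JSON log entry containing the message and all bound fields."""
    return json.dumps(dict(_bound.get(), message=message),
                      sort_keys=True, default=str)


def api_layer():
    # Imagine this function living in one module...
    bind(request_id="req-42")
    return worker_layer()


def worker_layer():
    # ...and this one in another; the request_id bound above is still present.
    bind(step="build")
    return render("starting")


line = api_layer()
print(line)
```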

=== Note added by Kurt Griffiths =============

I just wanted to add my +1 for doing this sooner rather than later. Retroactively addressing security is a PITA, and only increases the risk of zero-day exploits. This doesn't preclude doing security in Solum in an iterative fashion; we can break up the critical work for the first Solum release across several milestones, and the less critical work can leak into subsequent releases.

Given recent security bugs[1] wrt logging in several OS projects, we should be striving to raise the bar by contributing code and ideas upstream into Oslo and the OSSG.

[1]: An ironic example: https://bugs.launchpad.net/bugs/1004114

(asalkeld) some thoughts

Heat's equivalent of this (it's a bit old) https://blueprints.launchpad.net/heat/+spec/user-visible-logs
Let's do something that is consistent with all the other projects please. I can imagine operators getting really pi** off if each project did something different.
There is some great work going on in Ceilometer that we could potentially make use of simply by sending notifications rather than logs (maybe notice and above)?

https://blueprints.launchpad.net/ceilometer/+spec/stacktach-integration
https://blueprints.launchpad.net/ceilometer/+spec/notifications-triggers

Gerrit topic: https://review.openstack.org/#q,topic:logging,n,z

Addressed by: https://review.openstack.org/68307
Change oslo log imports to be consistent

Addressed by: https://review.openstack.org/71970
WIP: Trace data class for enhanced Solum logging
