Re: [boost] [log] Boost.Log Review...

The above wasn't explicitly distilled in the documentation in a single place, but I think future developers would find it very useful to see it spelled out explicitly. Given the extremely modular design of the library, I understand why the docs are the way they are, and I know that some of this is touched on in the Design Overview, but until I became very competent with the library, the documentation seemed too abstract. Despite the importance of logging, becoming a logging expert was tangential to the problem I was trying to solve, and I would personally have found the above summary very useful.
Well, that summary may describe your use case very well, but it is by no means complete with regard to Boost.Log as a whole. I would even say it's misleading in some parts - in particular: "Loggers are collections of Sinks" and "Think of channels as a Sink-level observer".
Fair response. I think a more general way for me to classify my response/concern (instead of providing a loosely worded example) would be: "Which class does what, and how do the classes relate to each other?" Right now answering that question requires a decent amount of exploration of Boost.Log's code and documentation. It would have been useful to me, and likely to future developers, to have the classes that comprise Boost.Log summarized so that a conceptual map could be formed more easily, without having to dig into the actual headers. Something similar to the contents of the Definitions page, but with architectural clues:

http://boost-log.sourceforge.net/libs/log/doc/html/log/defs.html

As an example, expanding on the Sinks definition from above:
Log Sink: A target, to which all log records are fed after being collected from user's application. It is sink's nature that defines where and how the log records are going to be stored.
Maybe include something as brief as:
With a Log Sink, developers can modify the filtering and add additional backends.
I consider myself a modestly competent user of Boost, but I am still getting parts of this wrong after more than three months of using and digging into Boost.Log.
A hindsight 20/20 architecture and terminology for Boost.Log could be simplified in terms of:
What is "20/20"? I'll continue my answers assuming that the following was your understanding of the current Boost.Log architecture.
Correct. The contents of this email have evolved since sometime around mid-January, or whenever I learned that Boost.Log was going to be reviewed for acceptance. It also crudely mirrors my understanding of the library as I went through the process of developing with it.
*) Log Record pre-conditions. Sinks control whether or not a Log Record is to be generated
Looks like you're confusing sinks with filters. Also, don't forget about the global filter.
I became sloppy with my terminology. Filters are stored in Sinks and the Global core. Speaking of filters, yesterday I stumbled across a feature request:
Program received signal SIGABRT, Aborted.
0x00007fff81de2fe6 in __kill ()
(gdb) bt
#0  0x00007fff81de2fe6 in __kill ()
#1  0x00007fff81e83e32 in abort ()
#2  0x00007fff81e03de2 in _Unwind_Resume ()
#3  0x00000001000a3b1c in main (argc=1, argv=0x7fff5fbff338) at /tmp/foo.cc:267
If I wrap everything in a boost exception, I get:
exception: /usr/local/include/boost/log/filters/attr.hpp(134): Throw in function (unknown)
Dynamic exception type: N5boost16exception_detail10clone_implINS0_19error_info_injectorINS_12log_mt_posix13missing_valueEEEEE
std::exception::what: Requested attribute value not found
It'd be nice if the error message could somehow include the string representation of the value (attribute) it wasn't able to find. It turns out it was a typo, so I had an unregistered attribute, but the typo would have been very easy to identify if it had been included in the error message someplace (instead it took a fair amount of manual investigation and backtracking to figure out that a filter was searching for an attribute that didn't exist). [snip useful responses/corrections]
Observers could then be placed on Sinks that would stamp Log Records with Severities, or Channels, or in whatever hodge-podge configuration the developer wanted and such examples would be very cookbook.
Now that you presented a code snippet, your suggestion makes more sense to me. Let me express my thoughts on it in two parts:
1. Custom log records. Yes, the model of "mandatory" attributes looks promising. It could at least partially solve the problem of possible library misconfiguration that was highlighted in another review. I imagine that some kind of log record prototype (or factory) would have to be defined by the user in order to construct log records with a specific set of attribute values. Filters and formatters could then request these mandatory attribute values without having to worry about their absence. Lookup performance may also be higher, although I doubt it will be compile-time. The idea is quite vague now, but after some thinking over it might become a good addition to the library (see the sketch after these two points).
2. Designing sinks (and the core, I presume) after the log records. No, that looks unacceptable to me, at least for what Boost.Log tries to achieve. This would introduce too much coupling between the logging side of the library (which tends to be exposed to many parts of the application) and the consuming side (which tends to be configured in a single place of the application). To my mind, these sides of the library should be maximally decoupled. Also, it's not clear how to support multiple log record types in such a design.
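To make the first point above a bit more concrete, here is a purely illustrative sketch of the prototype idea. This is plain C++, not Boost.Log API; every name in it is hypothetical:

#include <iostream>
#include <map>
#include <string>

enum severity_level { info, warning, error };

// Hypothetical record type: mandatory attributes are plain members, so a
// filter or formatter can read them with no missing-value lookup.
struct log_record
{
    severity_level severity;                     // mandatory
    std::string channel;                         // mandatory
    std::map< std::string, std::string > extra;  // optional attributes
    std::string message;
};

// Hypothetical prototype/factory: every record it makes carries the
// mandatory attribute set by construction.
struct record_prototype
{
    severity_level severity;
    std::string channel;

    log_record make(std::string const& msg) const
    {
        log_record rec;
        rec.severity = severity;
        rec.channel = channel;
        rec.message = msg;
        return rec;
    }
};

int main()
{
    record_prototype stats = { info, "stats" };
    log_record rec = stats.make("tick");
    rec.extra["pid"] = "1234"; // optional attributes stay lookup-based
    std::cout << rec.channel << ": " << rec.message << std::endl;
}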
*) Global loggers vs. per-object references to singletons
Not sure I understand.
Is it better to have global objects that reference singletons:
typedef boost::log::sources::severity_channel_logger_mt<levels::severity_level> logger_t;
BOOST_LOG_DECLARE_GLOBAL_LOGGER(default_log, logger_t);
BOOST_LOG_DECLARE_GLOBAL_LOGGER_CTOR_ARGS(stats_log, logger_t,
    (boost::log::keywords::channel = "stats"));
or per object references to logging singletons:
typedef boost::shared_ptr<
    boost::log::sinks::synchronous_sink< boost::log::sinks::text_ostream_backend >
> console_sink_t;
typedef boost::shared_ptr< boost::log::core > logger_core_t;
struct my_obj {
    my_obj(logger_core_t& logger) : lg(logger) {}
    void foo() { BOOST_LOG(lg) << "log msg"; }
    logger_core_t& lg;
};
int main() {
    console_sink_t cerr_sink;
    cerr_sink = boost::log::init_log_to_console(
        std::cerr,
        boost::log::keywords::filter =
            boost::log::filters::attr< int >("Severity") >= warning,
        boost::log::keywords::format = "%TimeStamp%: %_%",
        boost::log::keywords::auto_flush = true);
    my_obj o(boost::log::core::get());
    o.foo();
}
... I think that example's right.
*) Channel severity loggers example
That example wouldn't be much different from the ones in sections about severity_logger or channel_logger. But I'll add a code sample.
Where I spent time with this was figuring out how multiple channel severity loggers interacted with the core, and how my app was supposed to interact with the distinct loggers, which were linked into the core behind the scenes.
typedef boost::log::sources::severity_channel_logger_mt<levels::severity_level> logger_t;
BOOST_LOG_DECLARE_GLOBAL_LOGGER(default_log, logger_t);
BOOST_LOG_DECLARE_GLOBAL_LOGGER_CTOR_ARGS(lg1, logger_t,
    (boost::log::keywords::channel = "chan1"));
BOOST_LOG_DECLARE_GLOBAL_LOGGER_CTOR_ARGS(lg2, logger_t,
    (boost::log::keywords::channel = "chan2"));

#define LG1_WARN BOOST_LOG_SEV(lg1::get(), log_severities::warn)
#define LG1_INFO BOOST_LOG_SEV(lg1::get(), log_severities::info)
...
LG1_INFO << "info msg";
*) Setting up an initial console logger as line one in main()
Hmm. Ok, I'll add something about that to the tutorial.
Easy to do (see above), but I didn't see anything that said, "it would be wise to do this first and then remove your console sink after you set up your logging environment." Initially I disregarded the init_log_to_console() convenience functions while reading the docs. It wasn't until I started dealing with error messages that weren't being logged to a file, because my logging setup routines hadn't completed (or had failed for some reason due to FS perms), that I suddenly took note of needing to get console logging working first.
*) Changing a backend filter at a later date
Just call set_filter on the backend, that's all.
This was one of the areas where there were lots of ways that I could do it, but I couldn't tell which was the right way. libs/log didn't suggest a preferred mechanism or right way to do it.
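For illustration, reusing the cerr_sink frontend from the earlier snippet, a later reconfiguration might be as simple as this (a sketch; it assumes the set_filter call mentioned above and an error severity constant):

cerr_sink->set_filter(
    boost::log::filters::attr< int >("Severity") >= error);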
*) Using async logging for INFO level logs, but switching to sync for CRIT errors.
You can register two sinks - one synchronous and one asynchronous, and set their filters accordingly.
Ah, clever way of handling that.
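For others following along, here is roughly what I understood that to mean (a sketch, not verified against the library; the severity constants are placeholders, and each frontend deliberately gets its own backend instance):

namespace sinks = boost::log::sinks;
namespace flt = boost::log::filters;

// Asynchronous sink for routine records below the critical level.
boost::shared_ptr< sinks::asynchronous_sink< sinks::text_ostream_backend > >
    async_sink(new sinks::asynchronous_sink< sinks::text_ostream_backend >);
async_sink->set_filter(flt::attr< int >("Severity") < critical);
boost::log::core::get()->add_sink(async_sink);

// Synchronous sink so that critical records are written out immediately.
boost::shared_ptr< sinks::synchronous_sink< sinks::text_ostream_backend > >
    sync_sink(new sinks::synchronous_sink< sinks::text_ostream_backend >);
sync_sink->set_filter(flt::attr< int >("Severity") >= critical);
boost::log::core::get()->add_sink(sync_sink);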
*) Obtaining the severity of a particular logger
You mean, the default severity? There's no public API for this, but there's a protected function default_severity in the logger feature. Why would you need it?
Internally you're using Spirit for log record filtering, so from what I can tell it's impossible to do that inside of Boost.Log. It would be nice to have a way of poking at a sink to obtain information about its runtime status. Using the above console sink as an example:
typedef boost::shared_ptr<
    boost::log::sinks::synchronous_sink< boost::log::sinks::text_ostream_backend >
> console_sink_t;

console_sink_t cerr_sink;
cerr_sink = boost::log::init_log_to_console(
    std::cerr,
    boost::log::keywords::filter =
        boost::log::filters::attr< int >("Severity") >= warning,
    boost::log::keywords::format = "%TimeStamp%: %_%",
    boost::log::keywords::auto_flush = true);

std::cout << "Current Severity Level: "
          << /* or something like this */ cerr_sink->filter_severity()
          << std::endl;
I ended up adding this to my application's surrounding code.
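Something along these lines is what I settled on (illustrative only; console_severity is a plain application variable, not a Boost.Log facility):

// Keep the threshold next to the sink, since the composed filter
// cannot be queried back out of the sink.
int console_severity = warning;
cerr_sink->set_filter(
    boost::log::filters::attr< int >("Severity") >= console_severity);
std::cout << "Current Severity Level: " << console_severity << std::endl;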
*) How do you remove log events for a given severity level at compile time? Currently this isn't explicitly covered in the documentation the way it is in Google's glog library:
http://google-glog.googlecode.com/svn/trunk/doc/glog.html#strip
The library has no direct means for that. Users can conditionally define their logging macros, based on BOOST_LOG_SEV or the pure C++ interface, and then enable or disable particular logging statements in the source.
*nods* I just didn't see anything in the docs to do this, which is why I ended up with the practice:
#ifdef RELEASE
#  define LG1_WARN BOOST_LOG(null_log::get())
#else
#  define LG1_WARN BOOST_LOG_SEV(lg1::get(), log_severities::warn)
#endif
null_log always filters out everything sent to it. Probably less than ideal, but I couldn't think of a better way to handle this.
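A possible alternative sketch, assuming (as is noted later in this thread) that BOOST_LOG_SEV expands to a single for() statement: prefixing it with while (false) turns the whole statement into dead code, so the streamed operands are never evaluated and the optimizer can drop the statement entirely, with no null logger needed:

#ifdef RELEASE
   // Dead-code prefix: the for() expansion and the streamed operands
   // after LG1_WARN are never executed, and are typically eliminated.
#  define LG1_WARN while (false) BOOST_LOG_SEV(lg1::get(), log_severities::warn)
#else
#  define LG1_WARN BOOST_LOG_SEV(lg1::get(), log_severities::warn)
#endif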
*) Is there a way to set a bit in a Log Record that overrides a Sink's filter? For instance, there are certain events that I want to make sure go to the console 100% of the time without changing the severity of the log message. Do I just log twice? It'd be keen if Log Records could have a bit that could be set and would override a filter.
bool log_to_console = true;
BOOST_LOG_SEV_CONS(lg, warning, log_to_console) << "my message";
You can set an attribute (say, "Emergency") for that record, and setup filters so that whenever this attribute is present, the record is passed on to the console backend.
You mean a scope attribute?
boost::shared_ptr< attrs::mutable_constant< bool > > console_attr(
    new attrs::mutable_constant< bool >(false));
lg.add_attribute("Console", console_attr);
BOOST_LOG(lg) << "Console is false by default";

{
    BOOST_LOG_SCOPED_THREAD_ATTR(lg, "Console", true);
    BOOST_LOG(lg) << "Console is true";
}
BOOST_LOG(lg) << "Console is false again";
Like that?
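If so, the console sink's filter would also need a clause for it. A sketch (the severity clause comes from the earlier console example; since the "Console" attribute above is always registered with a false default, the filter tests its value rather than just its presence):

cerr_sink->set_filter(
    boost::log::filters::attr< int >("Severity") >= warning ||
    boost::log::filters::attr< bool >("Console") == true);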
*) It should be noted that the internal async log message queuing mechanism uses a vector, and during very bursty logging activity the vector can grow large and never shrinks. This is a known limitation; the vector was chosen to facilitate atomic append operations for log records. A std::deque would be preferable for bursty applications so that space could be reclaimed more easily, but as Andrey pointed out, it would require a full mutex, which would reduce the performance of the library.
There is no vector in the async frontends. The problem with the current queuing mechanism is not the storage strategy but the lack of queue boundaries. But I agree, I should have drawn attention to this aspect in the docs. Will do.
*nods* I brought this up in a private email with you and was satisfied with your answer, but thought it worth noting for others. As for the frontends having their own queues: with the way that async frontends work and their feed_records() call, I'd assumed the backend used a time-based mechanism to come by and pick up the log records stored in per-frontend TSS queues, something where frontends atomically pushed records on, and the backend would atomically swap() the list of records off for sorting and processing by the backend. Just a point of confusion after reading: http://boost-log.sourceforge.net/libs/log/doc/html/log/detailed/sink_fronten... and I never dove into the source to investigate further.
*) I find the argument against Lazy Streaming of attributes to be weak.
Currently BOOST_LOG() expands to a for() loop whose contents aren't executed unless a Sink in a Logger accepts the record.
So why can't Log Records have compile-time attributes that can be populated with typed attributes in a stream?
And then be able to extract the compile-time attribute in a backend:
Just as it is said in the rationale - the streaming expression is _not_ executed until _after_ the filtering is done. And filtering already involves attribute values, so there's no way to use these attribute values you specified in the streaming expression. Besides, after the view of attribute values is constructed, it is immutable, so nothing can be added after the filtering is done.
That's why I make the distinction between "log record preconditions," attributes that can be used in a filter, and "attributes," which is a catch-all term for data stored in a Log Record. I agree that it is impossible for streamed attributes to be used as preconditions, but streamed attributes would act as an extremely convenient (from a syntax perspective) way of passing data to a backend for processing.
*) Dovetailing on the above two points, a flush bit for a given Log Record would be very nice as well. 99.999% of the time async logging is desired, but in the 0.001% of the time when a Log Record must get out to disk, chances are that one Log Record every blue moon is disproportionately valuable compared to the normal Log Record. Having a flush() that doesn't return until the backend has fsync(2)'ed the data would be really spiffy.
BOOST_LOG_SEV(lg, warning) << "My message" << lg::flush;
As I already noted, your estimate is rather biased, to say the least. I, personally, tend to use synchronous logging. But I do recognize that some kind of flushing ability would be nice in the case of async logging.
I like the idea of using both async and sync sinks. It leaves me with a few questions, however: How does that work if they're both using the same text file backend? Will log entries still be ordered?
typedef boost::log::sinks::text_file_backend backend_t;
typedef boost::log::sinks::ordering_asynchronous_sink< backend_t > sink_t;

boost::shared_ptr< sink_t > sink(new sink_t(
    boost::make_shared< backend_t >(),
    keywords::order = logging::make_attr_ordering(
        "LineId", std::less< unsigned int >()),
    keywords::ordering_window = boost::posix_time::seconds(1)));
Thanks again in advance. -sc

--
Sean Chittenden
sean@chittenden.org

On 03/14/2010 05:36 AM, Sean Chittenden wrote:
I think a more general way for me to classify my response/concern (instead of providing a loosely worded example) would be:
"Which class does what and how do the classes relate to each other?"
Right now answering that question requires a decent amount of exploration of Boost.Log's code and documentation. It would have been useful to me, and likely to future developers, to have the classes that comprise Boost.Log summarized so that a conceptual map could be formed more easily, without having to dig into the actual headers. Something similar to the contents of the Definitions page, but with architectural clues.
http://boost-log.sourceforge.net/libs/log/doc/html/log/defs.html
As an example, expanding on the Sinks definition from above:
Log Sink: A target, to which all log records are fed after being collected from user's application. It is sink's nature that defines where and how the log records are going to be stored.
Maybe include something as brief as:
With a Log Sink, developers can modify the filtering and add additional backends.
I consider myself a modestly competent user of Boost, but I am still getting parts of this wrong after more than three months of using and digging into Boost.Log.
The Definitions section is rather detached from the library implementation; it simply helps the user understand the terms used in the rest of the docs. The information you seek is probably given in the Design Overview. Having read that section, you should have an idea of the major components of the library and their relations. The Detailed Features Description then maps that knowledge onto the concrete tools (down to classes and functions).
It'd be nice if the error message could somehow include the string representation of the value (attribute) it wasn't able to find. It turns out it was a typo, so I had an unregistered attribute, but the typo would have been very easy to identify if it had been included in the error message someplace (instead it took a fair amount of manual investigation and backtracking to figure out that a filter was searching for an attribute that didn't exist).
Yes, I've been requested to improve exception messages in another review. Will do.
*) Global loggers vs. per-object references to singletons
Not sure I understand.
Is it better to have global objects that reference singletons:
typedef boost::log::sources::severity_channel_logger_mt<levels::severity_level> logger_t;
BOOST_LOG_DECLARE_GLOBAL_LOGGER(default_log, logger_t);
BOOST_LOG_DECLARE_GLOBAL_LOGGER_CTOR_ARGS(stats_log, logger_t,
    (boost::log::keywords::channel = "stats"));
or per object references to logging singletons:
typedef boost::shared_ptr<
    boost::log::sinks::synchronous_sink< boost::log::sinks::text_ostream_backend >
> console_sink_t;
typedef boost::shared_ptr< boost::log::core > logger_core_t;
struct my_obj {
    my_obj(logger_core_t& logger) : lg(logger) {}
    void foo() { BOOST_LOG(lg) << "log msg"; }
    logger_core_t& lg;
};
int main() {
    console_sink_t cerr_sink;
    cerr_sink = boost::log::init_log_to_console(
        std::cerr,
        boost::log::keywords::filter =
            boost::log::filters::attr< int >("Severity") >= warning,
        boost::log::keywords::format = "%TimeStamp%: %_%",
        boost::log::keywords::auto_flush = true);
    my_obj o(boost::log::core::get());
    o.foo();
}
... I think that example's right.
The latter example is not correct, as the logging core is not a logger, and cannot be used with logging macros. Also, you save a dangling reference in my_obj, as core::get returns shared_ptr by value. I take it that you're trying to ask if acquiring a reference to the logger defined with BOOST_LOG_DECLARE_GLOBAL_LOGGER* macros is expensive. No, it's not, at least in the long run. Simply put, the macros define a module-wide reference to the logger, which is initialized upon the first request for the logger. All subsequent requests will return the initialized reference. So there's no point in keeping a reference to global loggers in your classes.
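So the following is perfectly fine, and cheap after the first call (a sketch using the logger_t and default_log declarations from earlier in the thread):

void do_work()
{
    // get() returns the module-wide reference; initialization happens
    // only on the first request, all later calls just return it.
    logger_t& lg = default_log::get();
    BOOST_LOG(lg) << "no need to cache this reference in a class member";
}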
*) Channel severity loggers example
That example wouldn't be much different from the ones in sections about severity_logger or channel_logger. But I'll add a code sample.
Where I spent time with this was figuring out how multiple channel severity loggers interacted with the core, and how my app was supposed to interact with the distinct loggers, which were linked into the core behind the scenes.
As you can see, the workflow with severity_channel_logger is quite the same as with severity_logger. Only logger construction is different - which is in line with channel_logger. I'd say, the resulting workflow is quite natural for the fused logger, don't you think?
*) Setting up an initial console logger as line one in main()
Hmm. Ok, I'll add something about that to the tutorial.
Easy to do (see above), but I didn't see anything that said, "it would be wise to do this first and then remove your console sink after you set up your logging environment." Initially I disregarded the init_log_to_console() convenience functions while reading the docs. It wasn't until I started dealing with error messages that weren't being logged to a file, because my logging setup routines hadn't completed (or had failed for some reason due to FS perms), that I suddenly took note of needing to get console logging working first.
Well, I wouldn't say that registering a console sink as the first thing in "main" is the solution for everyone (or recommended, for that matter). If logging fails to initialize, you will probably get an exception, in which case the simplest and most reliable thing to do is to write its what() to std::cerr and bail out with an error code. In other, more complex cases, when logging may occur before the initialization, yes, it might be reasonable to register a console sink first. But I'd say it's an individual choice.
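A sketch of that simple path (init_logging here is a hypothetical application function that registers the sinks and may throw):

#include <exception>
#include <iostream>

void init_logging(); // hypothetical: registers sinks; may throw on FS perms, etc.

int main()
{
    try
    {
        init_logging();
    }
    catch (std::exception& e)
    {
        // Logging is not available yet, so report directly and bail out.
        std::cerr << "failed to initialize logging: " << e.what() << std::endl;
        return 1;
    }
    // ... rest of the application ...
}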
*) Changing a backend filter at a later date
Just call set_filter on the backend, that's all.
This was one of the areas where there were lots of ways that I could do it, but I couldn't tell which was the right way. libs/log didn't suggest a preferred mechanism or right way to do it.
Are there other ways to set the filter? I don't remember those. :)
*) Obtaining the severity of a particular logger
You mean, the default severity? There's no public API for this, but there's a protected function default_severity in the logger feature. Why would you need it?
Internally you're using spirit for log record filtering, so it's impossible from what I can tell to do that inside of Boost.Log. Having a way of poking at a sink to obtain information about its runtime status. Using the above console sink as an example:
[snip]
I ended up adding this to my application's surrounding code.
Well, the filter is self-contained. In most cases you can't extract its criteria into a variable, because it can be very complex. For example, how would you do it in the case of this filter:

attr< level >("Severity") == Dump ||
attr< level >("Severity") >= High ||
has_attr("Emergency")

This filter may be very real in some server-side application, if raw packets (received from or sent to clients) are logged at the Dump level, all errors are logged with High severity, and occasionally Emergency records appear.
You can set an attribute (say, "Emergency") for that record, and setup filters so that whenever this attribute is present, the record is passed on to the console backend.
You mean a scope attribute?
Yes. Or create a global logger with this attribute. Whichever suits you best.
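The global-logger option might look like this (a sketch; attrs::constant keeps the attribute permanently attached to this logger, so a has_attr("Console") clause in the console sink's filter would match every record it emits):

logger_t console_lg;
console_lg.add_attribute("Console",
    boost::shared_ptr< attrs::constant< bool > >(
        new attrs::constant< bool >(true)));
BOOST_LOG(console_lg) << "this record always carries the Console attribute";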
As for the frontends having their own queues: with the way that async frontends work and their feed_records() call, I'd assumed the backend used a time-based mechanism to come by and pick up the log records stored in per-frontend TSS queues, something where frontends atomically pushed records on, and the backend would atomically swap() the list of records off for sorting and processing by the backend. Just a point of confusion after reading:
http://boost-log.sourceforge.net/libs/log/doc/html/log/detailed/sink_fronten...
and I never dove into the source to investigate further.
What part of that section made you think so? No, backends are absolutely passive (at least, for now). There are no TSS queues. The feeding thread is spawned by the frontend, and it feeds records from the very same queue that is used to store records in the logging threads.
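A simplified model of that arrangement, as described above (an illustrative sketch drawn from this thread, not Boost.Log source; strings stand in for log records):

#include <queue>
#include <string>
#include <boost/bind.hpp>
#include <boost/thread.hpp>

class async_frontend_model
{
    std::queue< std::string > queue_; // the single shared queue
    boost::mutex mutex_;
    boost::condition_variable cond_;
    bool done_;
    boost::thread feeder_; // declared last: it uses the members above

public:
    async_frontend_model() :
        done_(false),
        feeder_(boost::bind(&async_frontend_model::run, this))
    {
    }

    ~async_frontend_model()
    {
        {
            boost::lock_guard< boost::mutex > lock(mutex_);
            done_ = true;
        }
        cond_.notify_one();
        feeder_.join();
    }

    // Called from any logging thread.
    void push_record(std::string const& rec)
    {
        boost::lock_guard< boost::mutex > lock(mutex_);
        queue_.push(rec);
        cond_.notify_one();
    }

private:
    // The feeding thread drains the very same queue the loggers push to.
    void run()
    {
        boost::unique_lock< boost::mutex > lock(mutex_);
        for (;;)
        {
            while (queue_.empty() && !done_)
                cond_.wait(lock);
            if (queue_.empty() && done_)
                break;
            std::string rec = queue_.front();
            queue_.pop();
            lock.unlock();
            consume(rec); // the backend is passive: it only runs when called
            lock.lock();
        }
    }

    void consume(std::string const&) { /* write to the backend here */ }
};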
*) I find the argument against Lazy Streaming of attributes to be weak.
Just as it is said in the rationale - the streaming expression is _not_ executed until _after_ the filtering is done. And filtering already involves attribute values, so there's no way to use these attribute values you specified in the streaming expression. Besides, after the view of attribute values is constructed, it is immutable, so nothing can be added after the filtering is done.
That's why I make the distinction between "log record preconditions," attributes that can be used in a filter, and "attributes," which is a catch-all term for data stored in a Log Record. I agree that it is impossible for streamed attributes to be used as preconditions, but streamed attributes would act as an extremely convenient (from a syntax perspective) way of passing data to a backend for processing.
Well, there's no such distinction in the library now, and I doubt it would be a good idea to add one. It would lead to confusion about which attribute is which, and eventually to more errors in setting them correctly.
I like the idea of using both async and sync sinks. It leaves me with a few questions, however:
How does that work if they're both using the same text file backend? Will log entries still be ordered?
It will probably crash, since the backend will be accessed from different threads. The backend has to be used from only one frontend.