
Summary: I was disappointed by the amount of code needed to get the most basic logging working. The use of macros to declare loggers seems
Oh really, like - http://torjo.com/log2/doc/html/getting_started.html Is this a lot?
like a bad design choice. The absence of support for syslog was
Macros are a bad choice? You'd like to always write code like: if (is_logger_enabled) logger << ...; ?
surprising. I encountered many compiler warnings. The documentation
About syslog - I don't think this is the end of the world. Next thing, someone will reject my lib because it can't write to the Event Log... First, I showed you how easy it is to add it - and I actually added it, so it's as if it's there already. Sorry about the many compiler warnings - I did not compile with -Wall; you saw how easy that was to fix.
was unsatisfactory. Many of the features that I would have liked to see were missing. Despite all this, John seems to be enthusiastic and
Like what? I assume I'll see below...
responsive: there is a real chance that he will be able to fix many of these problems.
Well, I am enthusiastic, but this is slowly going away when I see that based on some minor details my library gets rejected...
These techniques are all very simple to use, and I feel that a logging library needs to be equally straightforward for basic operations:
    #include <boost/logging.hpp>
    log << "starting to frobnicate";
... which is already there...
Of course it needs to offer more features than that - but the most basic use needs to be simple. (If it's not that simple, then potential users will instead just write to cerr as above; as their needs slowly grow they'll add their own ad-hoc logging functions; it will then be too late to take the leap to a logging library.)
So I started by reading the main_intro.html and getting_started.html docs. I eventually got to "Example 1", which described how I had to add two new files to my project (21 lines) and call an initialisation function from main(). I could then log by changing my code as follows:
    if (Configuration::singleton().get_bool("imapd_log_imap",false)) {
        //syslog(LOG_MAIL|LOG_DEBUG,"{%d} S: * %s",num,msg.c_str());
        L_ << "{" << num << "} S: * " << msg;
    }
Actually, that should have been just L_ << "{" << num << "} S: * " << msg; - the macro should have contained the filter as well; the filter would be enabled when "imapd_log_imap" is true, or some such.
Having fixed one error in the example (g_l_filter vs. g_log_filter) and copious warnings I was able to see log output on the console (twice?) and in a file. This was all a lot more effort than I was expecting, but it did seem to work.
About the output to the console - twice, sorry, my bad - it's because you logged messages to cout & debug, and for Linux, debug = cout.
I found the use of macros in example 1 unsatisfactory. Specifically, the logger "object" which I declare in the header and to which I send my log messages (L_ in the code above) is actually a macro, not an object. This means that I have to handle it with great care. For example, say I currently have:
    void do_something(ostream& log_to, .......) {
        log_to << "starting to do something";
        ...
    }
and I want to convert this to use this library. I write:
    void do_something(logger& log_to, .......) {
        log_to << "starting to do something";
        ...
    }
This makes me wonder what you're actually trying to do. Why do you want to pass the ostream to a function, when you already have the L_ macro? Having to pass the logger as an argument to functions is way too cumbersome - what if you find yourself in a (member) function that has no such logger argument? Then you'd have to make sure that all functions that call it get such an argument, and the functions that call those functions too, and you'd end up with almost all functions having a logger argument. Because of these problems, you need a way to locate the right logger globally and easily. Thus, macros come into play. When defining your macro, you can give it arguments; those arguments decide:
- whether the logger is enabled
- and, optionally, which logger is used
This fails - and probably with an incomprehensible error message.
The motivation for the macro is sensible: it evaluates to something like
    if (l.level > current_level) l
Actually, it's your call how you want to *filter* your logging. If you want to use levels, then yes, you'll have something like the above. Or you can have something that is just an on/off switch (like filter::no_ts): http://torjo.com/log2/doc/html/workflow.html#workflow_filter
so that time-consuming functions in the log line are not evaluated if they will not actually be stored. But I would much prefer to see this optimisation made more explicit, so that the user can tell what's going on and avoid it if they want or need to:
Why would you want to do logging if a logger is not enabled? I can tell you right now - if I were to do this, I'd have at least 10 times more people telling me - hey, I want this to happen automatically.
I think that what's practically needed to make this work is to bundle together the current filter and logger objects into a single _object_, rather than bundling them together using a macro.
I do agree with you here - I do want to allow for this - in fact, it's on my TODO list.
Next I tried to use the library's "filter" mechanism in place of the if-condition you can see above. I looked at "Step 1: Filtering the message" at workflow.html, expecting to find a filter that I could initialise with my condition (as a value, or by reference to a variable, or passing a functor or function-pointer). But it seems that all of the provided filter classes are things to provide various sorts of thread-safety. Writing my own is not difficult, though it might be more conventional to use a functor (i.e. operator()()) rather than a class with an is_enabled member [this would allow you to use things like lambda expressions, I think]. So I wrote:
Fair enough - yes this is a very good idea - to use functors instead of an is_enabled member function. I'll put that on my TODO list.
    struct log_imap_filter {
        bool is_enabled() const {
            return Configuration::singleton().get_bool("imapd_log_imap",false);
        }
    };
Using this involved more macros:
    // log.hh:
    typedef log_imap_filter filter_type;
    BOOST_DECLARE_LOG_FILTER(g_log_filter, filter_type)
    // log.cc:
    BOOST_DEFINE_LOG_FILTER(g_log_filter, filter_type)
I don't know why these need to be macros. It seems to obfuscate what is going on to me.
Well, in fact, please read: http://torjo.com/log2/doc/html/defining_your_logger_filter.html There I stated that you can choose to declare/define your loggers/filters manually - or you can use the macros, which do help quite a bit. For instance:
- the macros make sure the logger is created before its first use
- the macros make sure the logger is destroyed last (that is, you won't run into the "using the logger after it's been destroyed" scenario)
The last thing that I needed to replace my syslog call was to specify the log level, which I noticed in workflow.html under "Step 2A: Gathering the message". (The fact that I couldn't find any description of this fairly central functionality anywhere else, e.g. in any of the pages in the tutorial section, seems typical of the documentation). Based on that example, I tried:
    L_(dbg) << "happily debugging";
That should be: L_(debug)
The final thing that I looked at in detail was scoped logging. When I'm writing more detailed logging code (e.g. for debugging) I generally make it scoped, i.e. reporting interesting function entries and, normally, exits. I like to indent each scope by a few characters in the log output, e.g.
    void foo() {
        cerr << "foo starting\n";
        for(int i=0; i<5; ++i) { blah(i); }
        cerr << "foo done.\n";
    }
    void blah(int i) {
        cerr << "  blah " << i << "\n"; // note indentation
        ...
    }
I read the scoped_logs.html page, and it seems that the BOOST_SCOPED_LOG[_CTX] macro deals with the starting/ending business. Compared to my explicit start/done output above, it has the feature that the done message will be generated if the function returns using an explicit return statement or if it throws an exception. This may be an advantage or a disadvantage: with my code, the absence of the done message would suggest that an exception might have occurred. Ideally the end-of-scope log message would indicate whether it was invoked during exception handling; could that be done using compiler-magic?
Yes that can be done - it just wasn't asked by anyone before.
But why does it have to be a macro? And there doesn't seem to be any indentation or other way of matching up the start and end messages. If they were matched up, then a log file viewer tool could provide collapse/expand tree view functionality (does Purify still do something like that?).
Well again - no one asked for this until now.
I was also surprised that no use was made of __LINE__, __FUNCTION__ etc. in the scoped logging or other macros.
In fact this is logger-dependent. Have tags (fileline, function) in your logger, and they'll be in your scoped log as well.
I also don't see any way to couple the scoped logging features with enabling and disabling logging. Specifically, I might detect that
I don't understand what you're saying. Note that if the logger is disabled, the scoped log will not log anything.
something of interest is happening and want to increase the log level within that scope:
    void foo() {
        scoped_log sl(log);
        for (int i=0; i<100; ++i) {
            scoped_enabled_logging_if s(log, i==42);
If I understand what you want correctly:
- first, I don't really see the point of ever doing something like this
- second, a filter decides whether a logger is enabled or not. The filter can have several arguments in order to decide whether it is enabled.
            blah(i); // logging enabled in here only if i is 42
        }
    }
I didn't look in much detail at the "destinations" functionality, but I'd like to note some of the features that syslog offers for files that it writes to:
- Flushing of files after each message is optional. Not flushing helps performance, but it might mean that critical messages are lost before a crash.
Try destination::file - it offers this
- Enormous numbers of identical messages within a short period are compressed to a single line, "last message repeated 193247 times".
I don't see any reason for this - if your application logs 2342 identical messages, I think you need to rewrite it.
- Periodic "MARK" lines can be generated to show that the application is still running.
I'd assume your application could do that.
Other features that I've seen in applications that do their own logging to files include:
- Creating new files for each sub-process or thread, by appending a thread identifier to the filename.
- Truncating individual messages to some limit (e.g. to avoid logging a rogue char* that points to a huge block of stuff with a \0 a megabyte later), or individual fields within the log message.
Does my lib prohibit you from doing this?
In my quick review I didn't see any of that functionality.
Does my lib prohibit you from doing this? last message repeated 2 times
The rolling_file feature looks useful for long-running applications where it's necessary to move the log files (e.g. to compress them) without interrupting the process. I suggest referring to the logrotate utility for how this sort of thing is often handled on Unix servers; for example, it can be configured to send a signal to a process to tell it to close and re-open its log file with the same filename.
I don't understand what you want - using a rolling_file does that automatically - you just keep logging to it.
Finally I'd like to look at how the user can control the logging. I don't see any built-in functionality for this. I have a couple of suggestions: environment variables and command-line options.
How about formatters and destinations?
An environment variable could be used as follows. The code would do something like this:
    g_l()->writer().write(::getenv("MYAPP_LOG_FMT"), ::getenv("MYAPP_LOG_DEST"));
The programmer's choice of format and destination may or may not override the user's choice via the variables. The user could then control the format and destination when invoking the program:
    $ export MYAPP_LOG_FMT='%time%($hh:$mm.$ss.$mili) [%idx%] |\n'
    $ export MYAPP_LOG_DEST='file(debug.log)'
    $ myapp
That is completely up to you - they can be command line arguments, a configuration file, whatever. You can build a layer on top of my lib, which will provide just that.
For command-line integration, I suggest providing an initialisation function that processes argc and argv looking for debug options, and strips them:
    int main(int argc, char* argv[]) {
        log.init(argc, argv); // modifies argc and argv
This can be a simple global function, which anyone can write, based on his/her needs.
DirectFB has something like that. Again you could use this to define formats and destinations.
You could also give the user control over scoped filtering via a command-line interface. For example:
$ myapp --log:default=none --log:parser=debug --log:scheduler=info
Note that you can choose whatever way you want, in order to configure your application. I don't want to enforce any on you.
Conclusion:
* What is your evaluation of the design?
Breaks the problem up into a sensible set of components. Basic usage requires too much effort. Lacks many features that I would find useful.
Again, basic usage I find rather easy: http://torjo.com/log2/doc/html/getting_started.html
* What is your evaluation of the potential usefulness of the library?
Everyone needs logging. If they do it using this library, they'll have to write a lot of code themselves to implement their own filters, destinations and so on. In practice they won't bother and will write their own ad-hoc code.
I disagree. What filters do you need to implement? What about destinations - as I've said, I added syslog.

Best,
John

--
http://John.Torjo.com -- C++ expert
http://blog.torjo.com
... call me only if you want things done right