Re: [boost] Boost.Logging: formal review

14 Nov 2005

      BTW before I start, I recalled couple more issues I forgot to mention in 
original post:

1. Design: Lightweight interface
Log library brings a lot of dependencies. In some case I do not want this if 
I need to use the library in header. IMO log needs to support/implement some 
lightweight debug abstract interface Something like:

struct DebugLog {
    virtual void log( string const& ) = 0;
    virtual void log( int ) = 0;
};

struct Log : DebugLog { ... };

Now I could use DebugLog interface without need to know an implementation. 
It comes with the price of virtual call. I primarily use this for debug only 
logging.

2. Design: Enter/Exit function scope support
I believe it's required for log library to provide some kind of support for 
scope tracking. It could be an addon to core functionality to it needs to be 
configurable.

3. Design: hexdump
I believe log library needs to provide support for dumping memory in hex 
format on log file. It could be an addon to core functionality.

4. Design: logged assert
I believe log library needs to provide support for custom assert. Such 
assert would log into log file and invoke system assert depends in assert 
level. It could be an addon to core functionality.

5. Implementation/Design: log entry data copying.
Modifiers implemented by applying modifier for string that represent entry 
data. This model is inefficient in following regards:
  a) you need to make a copy of original entry data for every output.
  b) you need to prepend the strings in most cases - which is inefficient 
operation.
  c) you couldn't share the work for different outputs - you need to repeat 
it for every string

IMO better model would be to apply modifiers to outputs instead.
...
...
I. Design Flaws
------------------
1. Filtering support
I understand that author strived to minimize the solution, but IMO it 
lead
to minimal value. IMO proper log system should support arbitrary user
defined filtering of log entries. At the bare minimum it should support:
entry level - levels set is ordered set of values indication importance 
of
information provided. Filtering is based on threshold. Examples: DEBUG,
INFO, MAJOR
How's this different from levels, in the library?
Not much. But this is just one of the many *custom* filters library needs to 
support and supply.
...
...
entry category - categories set is a set of unique values indicating kind 
of
information provided
Filtering is based on masking. Examples: ARGS, RETURN_VALUE, ERROR_LOG,
DATA_FLOW, PROG_FLOW
What? I'm really curious how a line of code for logging would look, in
your wishful lib?
Actually since macros are not primary interface in different subsystems log 
may look different. Couple examples:

1. LOG( INFO, PROG_FLOW, "SocketLib" ) << "Connection esteblished to: " << 
address;
2. LOG( DEBUG, RETURN_VALUE ) << "Fetched value: " << v;
Here a LOG macro refer to some keyword by name. Usually I have keyword per 
file, per library or per class.
3. DEBUG_LOG( DATA_FLOW ) << "Input msg: " < msg;
Here DEBUG_LOG automatically apply DEBUG level and refer to keyword by name
...
...
entry keyword - keyword set is set of user defined keywords (most 
frequently
strings) identifying area of the program. Filtering is based on match of
keywords. Keywords usually are used to mark specific part of application.
Example: "NetworkLevel", "UI", "ABC" to mark class abc, "dfg.cpp"
In my lib, this is actually the log itself. You have a log, and can add
appenders/modifiers to it.
It has nothing to do whatsoever with log name. I mean single log. And 
entries are filtered by string (or not string) keyword. Actually all above 
were an examples of what library should provide as possible filters. 
Acceptable solution would allow custom filter specification.
...
...
Also multithreaded application should support filtering based on thread 
id.
And what's stopping you from making an appender that does just this?
// trivial example
struct only_for_thread_id {
  only_for_thread_id(int tid, appender_func forward_to)
    : tid(tid), forward_to(forward_to) {}
  void operator()(const std::string&,const std::string & msg) {
    if ( ::GetCurrentThreadId() == tid) forward_to(msg);
  }
  int tid;
  appender_func forward_to;
}
You must be kidding, right? I want filtering on log level. So that entry 
appear or not appear in all outputs.
...
...
The best solution would allow to plug-in at compile time any user defined
Filter (in a form of Policy). This way I could have a log that filter
nothing (no filters defined) or log that filter based on time of the day 
if
I want.
See above.
I see nothing.
...
...
2. Configuration support
Submitted solution supports configuration in a form of arbitrary 
modifiers
that user could register in log. While this looks like very inflexible 
it's
also as very inconvenient. In most cases I prefer log to be configured at
run time from within configuration file. his would be very cumbersome 
with
modifiers model. Not only each modifier needs to be added separately. I 
also
Did you take a look at Darryl's proposal?
http://torjo.com/code/lib/alternate_manipulating.html)
And how that has anything to do with what I am talking about? It's just a 
twist of the same unacceptable interface.
...
...
need to keep track of it's name to be able to remove it and it's index to
know how modifiers are ordered. This ability to configure log remotely in
many locations makes it impossible to really know what and how is written
into log.
   My preferred way to configure the log is with use of configuration
string. First of all it's simplify an interface and you immediately see 
how
your log entries going to look like. Here is an example:
"keyword=*,-ACD;categ=prog_flow,return
value,-details;l=debug;track=on;roll=10000;prefix=file ( line ) - time
:;timeformat=%s:%m".
You can implement Configuration Support on top of the current library.
I do not see how it possible and why should I do this on top of useless API.
...
And besides, if you want to configure logging, you'll still need to
identify each log, appender and manipulator by a string.
Why is that? I do not want or need to know log name to configure it. I have 
an instance.
...
...
3. Per output configuration.
Log may have multiple appenders attached (BTW - I very much prefer term 
used
by iostreams library - sink). what if I want different entry
formatting/prefixing/filtering in different outputs. I may have
errors_output which contains only errors. I may have end_user_log that
contains only major information for end user with detailed time and date 
of
the entry. And I may have developer_output that contains performance
warnings with file and line information. Submitted library have no 
support
for this.
I don't recall anyone else asking for this, but anyway, if other people
wish it, I can implement it.
As of now you don't support this.
...
And besides, you can implement it even now (not very efficiently), if
you truly wish:
[...]

That is completely different from what I need. I don't like an idea of 
modifiers in a first place. Now you want to propagate this inconvenience and 
inefficiency into every output.
...
...
4. Internationalization support
I do see some vague traces in code that indicate that author new this 
needs
to be supported. But since documentation have exactly zero information 
and I
couldn't figure out how to do it myself, I assume that library doesn't
support this. IMO it's required for industrial strength solution.
How many times have you needed internationalization support logging?
Me, personally - never. But may be the fact that I work in domestic 
department of American company could be accounted for that. But for the 
boost level library this is a showstopper. Internationalization is not an 
afterthought.
...
...
6. Misfeature: pre-init cashing
Library starts with cashing turned on to support pre-init logging. IMO 
this
is too dangerous. Now developer needs to remember how many log entries 
are
printed. Otherwise some of them will just start disappearing (could be
different in every run). And it may start happening only at the user 
site.
Much more reliable solution is to use some kind of startup log that does 
not
require configuration.
And where would that log write to? What if it's very important data?
To console? What if you don't have console?
I think I covered that in my last statement.
...
And lets face it -- when one starts using a logging lib, will at least
trace one call to the debug lib. If he sees the line doesn't show up
anywhere, he'll most likely read the docs. If you take a look at the
I don't see how reading the docs will help. Solution is dangerous in it's 
nature.
...
Basic Usage, you'll see the 'flush_log_cache()' call.
Besides, even if you don't call it and you write 256 messages to a log,
the caching will automatically turn itself off and flush. The same
And what if log is not configured yet? What will happened? We will loose all 
those 256 messages or program will crash?
...
happens if the application ends, and caching was still on.
...
7. Misfeature: log hierarchies
I personally never have a need for that and have some difficulties to
imagine why would anyone have. You could probably have it on top of 
existent
solution as add-on. If it doesn't require any new interfaces (or some
Have you heard of modules, and sub-modules? What if each
module/sub-module was to have a log to write to?
I primarily use one file for everything. Why would I want 10 different 
files? So to figure out what happened I will need to jump through 10 
different output comparing timestamps? (*if* they are printed). I do 
*rarely* use more than additional logs for information dedicated to 
different target audience (operation departments, end users). But this is 
not driven by module/submodule hierarchy. To mark the modules I use much 
better idiom - keyword (all within bounds of same log).
...
...
8. Misfeature: compile-time log support
As it clear from above IMO log is (almost) never always enabled (and in 
any
case there is a way to implement this without any additional support). 
And
always disabled log could be implemented on macro level. Based on this I
Please enlighten me: how can you implement a disabled log at macro level?
Numerous ways. Here is from top of my head:

struct nil_stream {}

template<typename T>
nil_stream const& operator<<( nil_stream const& ns, T const&) { return ns; }

#define LOG( .... ) if(true) {} else nil_stream() <<

In my practice though I never do that. I found that single if( log_is_on ) 
is good enough even in most performance tight places.
...
...
believe all the compile_tile machinery employed by library is unnecessary
and just complicate implementation
It was asked by users of the lib. I never use it, but they need it.
Well it's just wrong solution then.
...
...
9. Interface
IMO primary interface advertised by library should be macro-less.
Please provide such an implementation, so that the following works:
some_log << some_lengthy_function();
Which, if some_log is disabled, some_length_function() is not called -- 
without the use of macros.
I never said Interface should look like above. But users need to be able to 
write these helper macros themselves based on known public API.
...
...
Documentation does need to cover how these macros needs to be written, 
but
with user defined filters user will need to do the job oneself. Also in 
my
experience I found that on some systems Even if( ... ) <actual log here>
could be unsatisfactory from performance standpoint. I employ one global
Boolean is_log_on flag to guard all my log entries. and it should be a 
part
of these macros
As I recall, noone else asked for this -- but it's not hard to implement.
There is nothing to implement  here (for library author). Library just need 
to advertise macroless interface and users will write macros.
...
...
II. Implementation flaws
------------------
2. Multithreading support implementation
  a) For some unclear reason author chose to reimplement some of
Boost.Threads functionality in some places (and in some places 
Boost.Thread
is used always). I believe log library shouldn't do this.
It was requested -- some users wanted to remove dependency on 
Boost.Thread.
Well it still wrong. You couldn't justify that by user's request. BTW you do 
depend on Boost.Threads still.
...
...
b) I noticed some fishy construct (m_destroyed data member). It opens
some thread safety issues IMO.
It's used for the global logs (the ones defined with BOOST_DEFINE_LOG),
which return a static reference to a boost::logging::logger - to know if
the log is used after its destruction. I'm looking for better 
alternatives.
...
c) static manager_type & manager() doesn't look thread safe.
It's there only for function overloading -- the returned reference is
never used
I am not an expert. But it's still fishy.
...
...
3 . enabled_logger::m_stream
For some reason enabled_logger allocates m_stream dynamically for each 
log
entry. I do not see why we need to do this. Even more: why not reuse a
single instance per thread?
Yes, in the future, I can re-implement it as a single instance per thread.
Well in a future we could reconsider this proposal.
...
...
4. collection in use
appender_array and modifier_array are better be implemented using lists
Have you done some tests, or is this just a gut feeling?
I did not provide any tests, did you? And it's not a gut feeling. In places 
where it's important you never need random access.
...
...
5. exception support
try {
    LOG << "aaa" << foo()
}
catch(...) {
    LOG << "exception"
}
This construct doesn't seems generate what is expected if foo() throws an
exception
What is expected?
Try to run this and you will see.
...
...
5. appender_array
appender_array seems to be defined in 2 placed
Yes, once in the log_impl's class definition, and once in
namespace boost { namespace logging { namespace {
Why would you give the same name to 2 different things?
...
...
6. Appenders copying.
...
From code it seems that each logger has its own copy of each appender. 
Why
would we need that?
Because if we had one appender for all loggers, we could run into
thread-safety issues.
But I may have different logger for every log entry. It's unacceptable 
design. You need to lock on different level.
...
...
If we do why all the appenders are so heavy?
I don't understand what you mean.
So costly to copy.
...
...
8. shared_memory.hpp
This header refer to the library that is not part of boost yet. I don't
think it should be part of this submission.
Why not? First of all, you don't need to include it, if you don't use it.
Because it brings a wrong message to boost users. you refer to 
boost/shmem/... and it's not there.
...
...
9. Time string cashing
prepend_time and prepend_time_strf calls are not cashed. It's a lot of 
work
for nothing.
Please explain.
You may have 1000 messages per second. And you format time string for every 
single one.
...
...
10. write_msg locking
write_msg shouldn't lock on appenders writing. Each appender should be
locked independently instead.
It's because appenders/modifiers could be added/deleted while someone is
writing a message to the log. Thus, it needs to be thread-safe.
You need to find an alternative solution. Every output has to be locked 
independently. This is one of the "premature pessimization" you force on the 
user.
...
Best,
John
Regards,

Gennadiy

Re: [boost] Boost.Logging: formal review

Gennadiy Rozental