Re: [boost] [Review] The review of the Boost.Logging library startstoday, November 7th (long)

9 Nov 2005

      ...
today starts the review of the Boost.Logging library written by John 
Torjo.
http://torjo.com/code/logging.zip
I took detailed look on the library and IMHO it
still needs a lot of work. It would be better
not to accept it as is now and wait until current
problems get resolved.

The overall concept feels sound and sufficient
to me but the details should be more polished.

What needs more work:

* API: in [9] I suggest even more simple
  (but IMHO sufficient) interface to manipulate
  logs. Some other suggestions are speread throughout
   notes. Few new features are proposed.

* Documentation. Overcommented examples are badly needed.
  Reference documentation is missing. Many details
  are not clear. See my comments bellow.

* Efficiency. More attention could be devoted
  to make the tool fast. For too many people
  logging is hog that needs to be trimmed down the
  metal. See [25] for example.

* Thread handling: I am not sure the flushing thread
  works as expected in EXE/DLL combinations [10].
  The thread locking may be broken [5].

Once again I like the library but am annoyed by
the not finished details. When finished the result
will be both powerful and easy to use and will get
widely used.

/Pavel

_____________________________________________________
1. detail/defs.hpp:

the

#if (defined _WIN32) || ....

could be replaced by:

#if (defined BOOST_WINDOWS) && !(defined BOOST_DISABLE_WIN32)

-------------------

the check for DLL:

#if defined(_DLL) || defined(_USRDLL)
#ifndef BOOST_LOG_DYN_LINK
#define BOOST_LOG_DYN_LINK
#endif
#endif

would be better omitted.
Logger may be used internally within DLL only
and not exported. If user wants Boost-dynlink
he should say it explicitly.

If the code is kept anyway then it should handle
other compilers too,
e.g. for BCB it is __DLL__.

--------------------

The name BOOST_LOG_SOURCE is not very enlightening.
It is also not mentioned in documentation.

_____________________________________________________
2. log_fwd.hpp: none of #include seems to be
   really needed here, just

#include <string>
#include <sstream>

should be enough.

_____________________________________________________
3. log_manager.hpp: since this file is not intended
   for include by user it should be moved into detail/
   directory.

-------------------

Name DEFAULT_CACHE_LIMIT should be
DefaultCacheLimit or so to avoid preprocessor clashes.

-------------------

Since the functions here are in namespace "logging"
the word "log" doesn't need to be repeated in many
function names.

_____________________________________________________
4. detail/ts_win32.hpp: the functionality here is
   possibly covered by Boost.Thread (in header only
   form). If really so, it should be reused.

--------------

The #include <windows.h> may be replaced by few simple
forward defines and declaration to speed up the
compilation.

_____________________________________________________
5. detail/ts_posix.hpp: the same advice related
   to Boost.Thread as above.

--------------

assert() -->> BOOST_ASSERT()

--------------

I am suspicious of MT safety on multiprocessor machines
due to use of unprotected "m_count" value.
This value may be cached and changes to it not visible.

If possible, Boost.Thread should be reused to avoid
such doubts.

_____________________________________________________
6. basic_usage.html: the code snippets would look better
   if syntax highlighted.

-------------

Code snippets for header and implementation file should
be more visually separated.

--------------

Code snippets may have more comments. Overcommenting
is positive thing, especially for first few examples.

--------------

Code snippet: functions add_appender() and add_modifier()
sometimes take pointer and sometimes value. I guess this
is mistake in docs.

_____________________________________________________
7. logs.html: there should be hyperlinks so one can
   check out what appenders/modifiers interfaces are.

----------------

Instead of "appender" word "target" may be better.
Instead of "modifier" perhaps "formatter" sounds more common.

----------------
"much like the directory tree on an Operating System"
==>>
"much like the directory tree on an filesystem"

---------------

There should be mentioned whether

"xyz*" matches "xyz" or not

-------------------

It should be said log name is ASCIIZ.

---------------------

The code snippet with

BOOST_DEFINE_LOG(gui, "app.gui")

should say what the 'gui' is - object,
"string name" or "C++ name"?

---------------------

Possible feature of logger: it would be nice
to have ability to automatically append newline
(if not already at the end of a log).

Loggers I wrote do this and it helps since I
do not need to search and fix source if I forget
newline somewhere.

-----------------------

using namespace boost::logging;

is missing in code snippets.

In first code snippet you may use
full qualification to give reader clue
what namespaces are used where.

_____________________________________________________
8. modifier_logger.html: the header name

#include <boost/log/functions.hpp>

is very unintuitive. Perhaps
"modifiers_and_appenders.hpp" would be better.

---------------------

There should be table of all available modifiers
and all available appenders at this moment.
This table should contain one-line mini-examples.

---------------------

An idea for another modifier: limit the text
to gives maximal length. It may happend the dump is
unexpectedly large and would
fill disk/overwrite cache/take too much time/be unreadable.

Limiting one log size would help here a lot
and would relieve end user of doing it manually.

-----------------------

Another idea for a modifier: limit length
of line in dump and automatically insert newline
after N characters.

-----------------------

Modifier API:
void modifier_func(const std::string & log_name, std::string &msg);

Is it really needed to use std::string for
log_name? Woudn't it be enough to have const char*?

I ask because it may help to avoid one dynamic allocation
per log and this may be quite important in RT applications.

-----------------------

Order in which appenders and modifiers are called is
not specified.

_____________________________________________________
9. manipulating_logs.html:

"When manipulating logs, you alway manipulate a hierarchy of logs."
(typo "alway")
==>>
"You can manipulate with one or more logs at the same time."

-------------------

DEFAULT_INDEX /cannot/ be uppercase (clashes with
preprocessor).

-------------------

What if I do:

manipulate_logs("*").xyz(....).abc(...);

manipulate_logs("*").abc(....).xyz(...);

Will the last call overwrite the previous or will
they accumulate? Should be explained here.

----------------------

The rules where to use ampersand for appenders/modifiers
should be the same. It is not acceptable sometimes to
have it and sometimes not. This is really, really confusing.

----------------------

In
.add_modifier( func, index, name);

it should be explained whatr is "name" and what is "index".
It should be clear whether name has something in common
with "C++ name" or "string name" and whether "index"
relates to "level".

----------------------

Seeing term proliferation I suggest to add page "Term ditionary"
and link the terms when used for the first time in docs and
where it may be useful.

--------------------

"insures" ==>> "ensures"?

--------------------

typo "amongst"

--------------------

Perhaps using "index" to sort modifiers is overengineering.

Intuitively I would expect sorting being derived from order
of add_appender()/add_modifier() within manipulate_logs().

If you remove the index the API would get pleasantly smaller
w/o loosing feature.

With such functionality you can get rid of del_modifier()
API as well. Latest manipulate_logs() would define current
state fully and would overwrite any previous settings.

I very much suggest to use this quite simpler, more intuitive
but sufficiently powerful way to manage logs. It is also easier
than the Darryl Green's solution.

--------------------

"default_" should be "default_level".
Similarly other names. I suggest to get rid of
"level" sub-namespace and use it as suffix.

--------------------

Is it reasonable to use UINT_MAX for disable_all
level? The value is 64 bits.

Woudn't it be better to use:

std::numeric_limits<unsigned>::max()

?

----------------------

The last code snippet doesn't say whether
BOOST_LOG_DEFINE_LEVEL can be in header.

It also seems to put the value into
boost::logging::level namespace which is wrong.

If the macro is already within a namespace
it would insert new subnamespace boost/.... here
making big mess in code.

Namespacing should be left to the user:

BOOST_LOG_DEFINE_LEVEL(my_namespace::xyz, 1111)

or

namespace my_namespace {
  BOOST_LOG_DEFINE_LEVEL(xyz, 1111)
}

----------------------------

You may obfuscate a little bit your email
in docs to keep spambots out.

_____________________________________________________
10. predefined.html:

append_enter ==>> append_newline

----------------

prepend_thread_id should exists also for POSIX
(there's one simple function for it).

--------------------
Missing escape sequences:

$d - day with 1 or 2 digits
$M - month with 1 or 2 digits
$MON - month as Jan/Feb/....
$ms - millisecond (it is quite useful for many common cases)
$HH - like "03 AM"

--------------------

write_to_dbg_wnd ==>> write_to_debug_window
ts_appender ==>> longer, more intuitive name

----------------------

The docs for "ts_appender" suggests other
ways are /not/ MT safe. This should be cleared out
(I hope all ways are MT safe, no matter what I specify).

The later code snippets do not clear my confusion.

The other documentation for "ts_appender" is very confusing,
I was not able to understand it.

-----------------------

"appender_array" - confusing documentation, no idea what is says.

-----------------------

rolling_file_appender - example needed, picture would help.

Name should be "rotating....".

-----------------------

periodic_file_appender - should have parameters
to delete old log files (by time, by count or by size).

Example is missing, docs is insufficient.

The name should be changed.

-------------------------

shared_memory_appender - insufficient docs,
example missing.

It should be explicitly said here there are
no delays or caching when writing into shmem.

Docs doesn't says what happend when shmem is full.
I think the only reasonable behavior is to overwrite
last logs in FIFO manner.

Is it possible to use several shmems? Docs should say
and should have couple of examples.

--------------------

Dedicated thread used for logging: does it work
if both application executable and a DLLs do log into
the same file? It is quite tricky to ensure singleton
in such circumstances. Jason Hise works on library "Singleton"
where he had solved this (after a lot of effort).

I would suggest to investigaet the problem and if real then
to use Singleton library (or extract Jason's solution).

If this is too much of work then the documentation should
warn user that each executable (exe, dll, so) would have
one separate thread and they cannot deal with the same file/shmem.

----------------------

Static destruction: currently the library cannot
reliably log from destructors of static objects.
The Jason's library contains support for this situation.

--------------------------

It may be possible to design the BOOST_DECLARE_LOG
and BOOST_DEFINE_LOG so that the log get initialized
on first use: then the cache would not be needed.

This way should be investigated since possible
simplification of design and API would be rather high.

_____________________________________________________
11. Feature request: quite often I am not interested
    not in absolute time (2005/12/11 16:33) but in
    time offset from some event.

I suggest to add API:

manipulate_logs(...).set_time_start_point(
   bool either_in_all_threads_or_just_in_current_thread
   );

and escape sequences:

$Xms   - e.g. "12 ms", "78321 ms" (only milliseconds are used)
$Xsec  - e.g. "2 s, 23 ms" (only seconds and millis)
$Xmin  - e.g. "3 m, 21 s, 10 ms"
$Xhour - e.g. "1 h, 33 m, 20 s, 54 ms"
$Xday  - e.g. "2 d, 1 h, 33 m, 20 s, 54 ms"

This would allow automatically see and compare
app performance characteristics and check timeouts.

Different threads may use different time start points
(e.g. useful for long lasting worker threads).

_____________________________________________________
12. Feature request: thread ID (both Win32 and POSIX)
    is of low use. I suggest to add API to specify
    string name as alternative to number:

.prepend_thread_name()

manipulate_logs(...).set_current_thread_name("gui thread");

If not specified number ID would be default.

_____________________________________________________
13. Feature request: Boost.Filesystem should be supported
    (for specifying file names). The library is stable
    and likely in next Standard.

_____________________________________________________
14. efficiency.html:

level is bigger ==>> level is higher

--------------

This is what was likely discussed all over: for
some apps (e.g. embedded with tight constraints)
I may wish to remove all logging code completely:

BOOST_LOG(app, (<< "x = " << x));

Some compilers may not optimize strings and logging support
aways from the code - they won't be called but they will
be present in the executable. A way to surely get rid of them
should exists.

_____________________________________________________
15. caching.html: the docs doesn't say whether flushed
    logs follow the rules given by manipulate_logs().

--------------

It doesn't say whether manual flush() is needed and what
is default.

Complete example is sorely needed.

--------------

The docs doesn't say what
   int cache_limit = 256
means.

What is 256? Bytes, logs, lines?

--------------

I do not like very much functions as
flush_log_cache() are standalone functions.

I would prefere use of singleton or monostate
pattern here, just for better feeling.

_____________________________________________________
16. thread_safe.html: this page is messy and I am not
    able to understand what it is for and what it
    tries to explain.

What is very missing is getting ensured that ALL logs no matter
what are MT safe.

_____________________________________________________
17. scoped.html: cannot BOOST_SCOPEDLOG be somehow merged
    with BOOST_LOG? Having two macros is confusing and
    error prone.

    Perhaps some Boost.Preprocessor trick could be
    employed here.

_____________________________________________________
18. compile_time_logs.html: the fact that "level"
    is ignored is strange and counterintuitive.

If implementation cannot be changed here the docs should
stress this strange behaviour more.

Perhaps my suggestion in [14] (about completely
compiling out any logging support) could be integrated
with "compile time logging".

----------------

The term "compile time logs" is misleading,
"compile time log enabling" or so would be better.

--------------

The example should show mix of runtime and
compile time loggers working together.

_____________________________________________________
19. examples.html: every example could be linked here,
    together with short description.

_____________________________________________________
20. Feature: since each log may span to several lines
    a modifier could be added to the library:

   .add_modifier(log_ending("###"))

where every log will be ended with [optional \n if needed]###\n.

This would allow easier create tools that manipulate
with logs.

Similar modifier:
   .add_modifier(log_starting("@@@"))

where every log starts with @@@ (no newlines)
may be considered but this one is not that urgent.

_____________________________________________________
21. view.html: several examples of real output may
    be here, from simplest to most complicated ones.

_____________________________________________________
22. Source files: it may be considered to use only
    one *.cpp file to simplify needed changes in makefiles.

_____________________________________________________
23. The documentation may contain more tables, listing
    all available options for features + tiny examples.

--------------------

The documentation lacks reference for some important
objects like manipulate_logs(). At least list of all
standalone functions should exists.

--------------------

The documentation may contains some short overview
of internal structure of the library. This is just nice
to have, not absolute requirement.

--------------------

The documentation doesn't say whether and exception
can be throwns out of it and what happens in such case.

It is possible to play a lot increasing exception safety
(for example to use internally allocator that may fall back
to static memory block if bad_alloc). This is /a lot/ of work
to do it completely and right, though.

The documentation should at least warn that in low memory
situation results are not predictable.

--------------------

Exceptions thrown from the library (e.g. failure to create
lazy thread for flushing) should be documented and
also be present in log_fwd.hpp.

--------------------

The documentation should contain complete example how to write
a) custom appender and modifier
b) custom back end (e.g. my own shmem). This one is
   very important.

--------------------

Nice to have docs wish: a page discussing stranghts
of this logger vs other common ones (log4cpp etc).

--------------------

Documentation about performance of the library is missing
- absolute times, # of dynamic allocations per typical log, etc.

--------------------

Docs should have table of all macros (visible to user)
and info whether they can be defined by user.

For example now BOOST_LOG_NO_LIB is not mentioned anywhere
and it is unclear what it is for.

_____________________________________________________
24. Boost license should be used.

_____________________________________________________
25. src/functions.cpp: function write_to_dbg_wnd()
    should be specialised for char and for wchar_t
    and OutputDebugStringA() and OutputDebugStringW()
    used respectively in each specialisation.

--------------------

Functions like prepend_prefix() may use some
preallocated buffer to write their output.

Just this function does 3 dynamic allocations
and 3-4 copyings of complete log data.

I have seen people doing a lot of work to make logs
as fast as possible, even writing their own printf()s
or delaying their evaluation, fiddling with stack etc.

Manipulating with one log can use single preallocated
buffer and fall back to heap if needed. Even if it would
complicate API for appenders/modifiers it feels as worth.

It may be worth to create your own "preallocated_string"
class and work with it instead of with std::string.

---------------------

Docs saying that thread ID is available only in Win32
is wrong - I see POSIX implemented.

--------------------

It may be worth to measure whether GetCurrentThreadId()
is fast enough or whether thread local variable
storing the ID is faster. (No guess from me.)

----------------

prepend_time::prepend_time()

array indexes;

may use reserve() to avoid needless
reallocations. Even static array would be
far then enough for the few sequences.

The whole code feels underoptimized to me.

----------------

prepend_time_strf(): this function does
two system calls (time() and localtime()).

Perhaps separate thread may be considered.
This thread would sleep for cca 1 second then
it would find local time and save it into
certain data structures.

prepend_time_strf() would simply fetch
in these data structures w/o any system calls.

This may have positive effect inside tight loops
of performace sensitive apps.

If milliseconds are needed (and implemented)
then prepend_time_strf() needs to use
different algorithm (but still single system call
should be enough).

_____________________________________________________
26. Similary to OutputDebugString() on Win32 with POSIX
    saving to syslog may be implemented.

    On Win32 EventLog suppot may be added (very handy
    for errors).

_____________________________________________________
27. There should be way to disable thread locking
    manually when useful (e.g. via macro). For some
    people with single thread app such overhead may
    feel too high.

_____________________________________________________
28. src/log_manager.cpp: name_matches_spec() may be more optimised:

  if ( (spec == "*") || spec.empty() )

==>>

  if (spec.empty() || strcmp(spec.c_str(), "*") == 0)

-> one dynamic allocation less.

------------

"searches" arrar may be reserve()d before.

---------------

Does the matching works with pattern like:
  "abc.*.xyz"
?
The docs doesn't say.

----------------

The code may be modified to deal with exception
support switched off.
File <boost/detail/no_exceptions_support.hpp>
contains few helpers.

---------------

Length of lines may be limited.
Some lines have over 240 characters and that
very hard to read.
I'd say that 90-100 chars is reasonable limit.

_____________________________________________________
29. log.hpp: class logger: the member m_destroyed
    looks like having no real effect at all.

_____________________________________________________
30. Compiling examples with BCB 6.4:

functions.hpp needs:
#include <ctime>

and in "struct prepend_time_strf":

    time_t m_t;
    tm m_tm;

==>>

    std::time_t m_t;
    std::tm m_tm;

-----------------------

The file log_impl.hpp should be wrapped by:

#include <boost/detail/workaround.hpp>

#if BOOST_WORKAROUND(__BORLANDC__, BOOST_TESTED_AT(0x564))
#  pragma warn -8026 // some functions are not expanded inline
#endif

.... file contents ....

#if BOOST_WORKAROUND(__BORLANDC__, BOOST_TESTED_AT(0x564))
#  pragma warn .8026
#endif

The same wrapping should be in log/functions.hpp.

File src/log_manager.cpp should have

#if BOOST_WORKAROUND(__BORLANDC__, BOOST_TESTED_AT(0x564))
#  pragma warn -8026 // some functions are not expanded inline
#endif

on top

-----------------------
Files:
* src/functions.cpp
* examples/fast_compile/manipulate_logs.cpp:

sprintf ==>> std::sprintf

_____________________________________________________
31. log.hpp: the code to find out release mode

#ifndef NDEBUG
#define BOOST_LOG_IS_DEBUG_MODE 1
#else
#define BOOST_LOG_IS_DEBUG_MODE 0
#endif

doesn't work correctly on BCB. Here IDE uses
by default _DEBUG macro. I suggest:

#if BOOST_WORKAROUND(__BORLANDC__, BOOST_TESTED_AT(0x564))
// IDE of BCB uses _DEBUG to indicate debug mode, it doesn't
// (by default) use NDEBUG.
#  ifdef _DEBUG
#    define BOOST_LOG_IS_DEBUG_MODE 1
#  else
#    define BOOST_LOG_IS_DEBUG_MODE 0
#  endif
#else
#  ifdef NDEBUG
#    define BOOST_LOG_IS_DEBUG_MODE 0
#  else
#    define BOOST_LOG_IS_DEBUG_MODE 1
#  endif
#endif

_____________________________________________________
32. the example examples/shmem_example/writer/writer.cpp
    should also clean up the shmem memory when the
    data are read out sucessfully.

Issue of MT safety: you write two different items:
"shared_log_object" and "shared_occupied_size".
This is MT unsafe though unlikely in practice.

I would recommend to overhaul the shmem completely
since current implementation is not very practical.

Better implementation would allow:
* reading and writing items into shmem in paralel,
  at any time and repeatedly.
* overwriting oldest logs if shmem is full.

_____________________________________________________
EOF

Re: [boost] [Review] The review of the Boost.Logging library startstoday, November 7th (long)

Pavel Vozenilek