Re: [boost] [logging] Interest check on logging library.

Andrey Semashev wrote:
Sorry, couldn't resist. I think my library fits these requirements quite well. Why would you start your own implementation from scratch? Is there something that doesn't suit you in my implementation?
It started because I thought yours was no longer in development. I have no idea why I thought that, since looking at your cvs logs you have been consistently working on it. It was largely a mistake.

I'm writing an internet spaceship game. There's a client and a cluster of servers of different types. The protocol is simple message passing, with the serialization code shared between client and server. This leads to a very simple example like:

    if(message.size > packet.size)
    {
        log << severity::warn()
            << "Packet was larger than the buffer that contains it!"
            << build_target::server()
            << " Packet source= " << connection
            << ", account = " << user << "\n"
            << dump_hex(packet);
        // error handling
    }

On the client side, we note that the packet was too big and move on. On the server side, we actually log the packet, since evil hackers are always trying to steal my space bucks. Since the template generates empty code at compile time, both gcc and msvc are smart enough to remove the " Packet source=" and ", account = " strings from the binary, providing less of a clue in the client as to what the server is watching for. On the server side, I probably want this record to go into a database, so having the entire thing as a single record is very important. That precludes items of the form:

    log << severity::warn() << "...";
    if(server) // or #ifdef
        log << "Packet source = ";
    //...

I also really don't like having the conditional in the logging. Looking at the serialization code, one shouldn't have to wade through the logic of logging.

I think I could add this to your library using features. That's not a lightweight thing to add; based upon what I see, features need to be instantiated. In my scheme, since the if() test is built into the log record instantiation, there's a single point to change to affect all appropriate loggers. With your scheme I'd have to change the logger type, the if()s that wrap them or the macros that contain the ifs, and possibly client code as well.

Switching to a different scheme, I could do:

    log << severity::warn()
        << "Packet was larger than the buffer that contains it!"
        << auditing::full()
        << " Packet source= " << connection
        << ", account = " << user << "\n"
        << dump_hex(packet);

Now, on the client side I can turn off full auditing at compile time, and on the server side I can turn off packet logging at run time and have a condition that turns it on when I suspect a problem. There is a separation of concerns between the audit level and the severity of the record.

The whole thing could be rewritten with scope variables for connection and user, which would allow them to be first-class denizens of the database or be fed into an event processing system without having to do entity extraction on the record. In addition, immediately insertable variables avoid the overhead of setting scope variables when they are not needed:

    log << severity::warn()
        << "Packet was larger than the buffer that contains it!"
        << auditing::full()
        << set("User", user) << set("connection", connection)
        << " Packet source= " << connection
        << ", account = " << user << "\n"
        << dump_hex(packet);

---
Jason Wagner
jason@nialscorva.net

Jason Wagner wrote:
It started because I thought yours was no longer in development. I have no idea why I thought that, since looking at your cvs logs you have been consistently working on it. It was largely a mistake.
The library is very much alive, although for various reasons I didn't touch it for the last few weeks.
I'm writing an internet spaceship game. There's a client and a cluster of servers of different types. The protocol is simple message passing, with the serialization code shared between client and server. This leads to a very simple example like:
    if(message.size > packet.size)
    {
        log << severity::warn()
            << "Packet was larger than the buffer that contains it!"
            << build_target::server()
            << " Packet source= " << connection
            << ", account = " << user << "\n"
            << dump_hex(packet);
        // error handling
    }
On the client side, we note that the packet was too big and move on. On the server side, we actually log the packet, since evil hackers are always trying to steal my space bucks. Since the template generates empty code at compile time, both gcc and msvc are smart enough to remove the " Packet source=" and ", account = " strings from the binary, providing less of a clue in the client as to what the server is watching for. On the server side, I probably want this record to go into a database, so having the entire thing as a single record is very important. That precludes items of the form:
    log << severity::warn() << "...";
    if(server) // or #ifdef
        log << "Packet source = ";
    //...
I also really don't like having the conditional in the logging. Looking at the serialization code, one shouldn't have to wade through the logic of logging.
I see two issues with your approach:

1. Although compilers may be smart enough to optimize away writing constants and simple variables into a dummy stream (which, I assume, log is when logging is disabled), I'm not so sure about more complex expressions, like writing the result of a function call, such as dump_hex in your example. I didn't investigate this, but I suspect the call will still be present in the compiled code, which, AFAICT, is not what you intended.

2. I'm not sure how you would implement filtering. More on this below.

From where I'm standing, I would suggest using macros. It's not that bad in this particular area and would not require changing your code a lot:

    if(message.size > packet.size)
    {
        LOG() << ...;
    }

where LOG can be defined differently for client and server builds. In connection with my library, it could be:

    #ifdef SERVER
    #  define LOG(logger) BOOST_LOG(logger)
    #else // client
    #  define LOG(logger) while (false) BOOST_LOG(logger)
    #endif

That will drop unnecessary code for sure on practically any compiler.
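To make point 1 concrete, here is a minimal sketch of such a dummy stream (invented for illustration, not code from either library):

    struct null_stream {};

    // Insertions are inline no-ops, so literals and simple variables
    // are trivially discarded by the optimizer...
    template< typename T >
    inline null_stream& operator<< (null_stream& s, T const&) { return s; }

    null_stream log;

    // ...but dump_hex(packet) must still be evaluated to produce the
    // argument; the compiler may drop the call only if it can prove
    // the call has no side effects.
    // log << "text" << dump_hex(packet);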
I think I could add this to your library using features. That's not a lightweight thing to add; based upon what I see, features need to be instantiated. In my scheme, since the if() test is built into the log record instantiation, there's a single point to change to affect all appropriate loggers. With your scheme I'd have to change the logger type, the if()s that wrap them or the macros that contain the ifs, and possibly client code as well.
Switching to a different scheme, I could do:
    log << severity::warn()
        << "Packet was larger than the buffer that contains it!"
        << auditing::full()
        << " Packet source= " << connection
        << ", account = " << user << "\n"
        << dump_hex(packet);
The problem is that you ignore run time filtering here, which should happen before formatting the record (i.e. before the streaming statement you mention above). If you are fine with it, your approach is sufficient. Otherwise you'll eventually come to something more complex, maybe something like my lib. :)

If all you need is to globally enable or disable logging at compile time, you can define macros as I've shown above. There's no need to play with loggers or complicate your code with conditionals. If you need run time filtering, the following constraints appear:

1. You cannot mix the attributes (sorry, I don't know the equivalent term for your library) into the streaming expression. The reason is obvious: attributes participate in filtering, and filtering takes place before the streaming.

2. You will have to write an "if" around the streaming statement. That "if" will check the filters at run time and decide whether the record will be processed or not. You can wrap it in a macro, but it will be there, and it will not be zero-cost in terms of performance and code size.

3. All attributes you want to put to log have to be set up before that "if" statement. My lib provides scoped attributes (or tags) for that purpose, although they may not look as neat as your manipulators in the streaming statement. In your use case the scoped attributes can be set in the macros I've shown above, so that they will not be present in the client code.
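To sketch what these constraints look like at the call site (the identifiers here are invented for the sketch, not my exact interface):

    // 3. Attributes are set up first, outside the streaming expression...
    scoped_attribute tag = set_scoped_attribute(logger, "User", user);

    // 2. ...then the "if" consults the filters once per record...
    if (record rec = logger.open_record())
    {
        // 1. ...and only a record that passed filtering is formatted.
        rec.stream() << "Packet was larger than the buffer!";
        logger.push_record(rec);
    }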
Now, on the client side I can turn off full auditing at compile time, and on the server side I can turn off packet logging at run time and have a condition that turns it on when I suspect a problem. There is a separation of concerns between the audit level and the severity of the record.
The whole thing could be rewritten with scope variables for connection and user, which would allow them to be first-class denizens of the database or be fed into an event processing system without having to do entity extraction on the record. In addition, immediately insertable variables avoid the overhead of setting scope variables when they are not needed:
    log << severity::warn()
        << "Packet was larger than the buffer that contains it!"
        << auditing::full()
        << set("User", user) << set("connection", connection)
        << " Packet source= " << connection
        << ", account = " << user << "\n"
        << dump_hex(packet);
I'm sorry, I'm afraid I don't understand what you're saying here.

On Thu, Jan 1, 2009 at 6:51 AM, Andrey Semashev <andrey.semashev@gmail.com> wrote:
2. You will have to write an "if" around the streaming statement. That "if" will check the filters at run time and decide whether the record will be processed or not. You can wrap it in a macro, but it will be there, and it will not be zero-cost in terms of performance and code size.
With risk of being annoying, I'll second my earlier post about "the if" in question. :)

I'd rather have "the if" out of any macros and fully visible in user code. That way it would be clear what enables or disables the logging statement itself, and what functions are called and when. In a simple case, you'd simply write:

    if( logging::severity >= 2 )
        log() << "whatever: " << whatever();

and you'll know that the function won't be called if logging::severity is below 2. In more complex cases, users can decide to separate the condition into a predicate:

    if( logging_connections() )
        log() << "server: connection from " << connection.ip();

where:

    bool logging_connections()
    {
        return (server_logging::connections || client_logging::connections)
            && logging::severity >= 1;
    }

Emil Dotchevski
Reverge Studios, Inc.
http://www.revergestudios.com/reblog/index.php?n=ReCode

Emil Dotchevski wrote:
On Thu, Jan 1, 2009 at 6:51 AM, Andrey Semashev <andrey.semashev@gmail.com> wrote:
2. You will have to write an "if" around the streaming statement. That "if" will check the filters at run time and decide whether the record will be processed or not. You can wrap it in a macro, but it will be there, and it will not be zero-cost in terms of performance and code size.
With risk of being annoying, I'll second my earlier post about "the if" in question. :)
I'd rather have "the if" out of any macros and fully visible in user code. That way it would be clear what enables or disables the logging statement itself, and what functions are called and when.
That is your choice, indeed. However, in general, this will not allow you to immediately see if the record will be processed or not. This is because filters are usually set up in some initialization code, which is rather remote from places where you actually write logs. So, you'll have a trivial "if" most of the time, something like this:

    if (logger.is_enabled(severity::normal))
        logger << ...;

This check does not say that filtering is based on severity (and if it is, in what way). Instead, it checks if a record with normal severity will pass the (opaque) filter. You can write it all the time, but, personally, I find it annoying.

On Jan 1, 2009, at 5:11 PM, Andrey Semashev wrote:
That is your choice, indeed. However, in general, this will not allow you to immediately see if the record will be processed or not. This is because filters are usually set up in some initialization code, which is rather remote from places where you actually write logs.
I believe what Jason is talking about is compile-time filtering where the compiler actually does know the answer and is able to eliminate code. The "trivial 'if'" is gone after optimization. This seems like a very nice feature to me. (earlier post:)
writing the result of a function call, such as dump_hex in your example. I didn't investigate this, but I suspect the call will still be present in the compiled code, which, AFAICT, is not what you intended.
This is what lazy function evaluation is for. It seems to me that lazy functions are going to be necessary for any non-macro solution that doesn't require ifs in user code, but they might also be helpful with the runtime filtering you are talking about?

I really like the syntax and compile-time filtering of this new proposal - please figure out how to combine these two libraries!

Gordon

Gordon Woodhull wrote:
On Jan 1, 2009, at 5:11 PM, Andrey Semashev wrote:
That is your choice, indeed. However, in general, this will not allow you to immediately see if the record will be processed or not. This is because filters are usually set up in some initialization code, which is rather remote from places where you actually write logs.
I believe what Jason is talking about is compile-time filtering where the compiler actually does know the answer and is able to eliminate code. The "trivial 'if'" is gone after optimization. This seems like a very nice feature to me.
I agree that this is a very useful feature. One of my points in the reply to Jason was that there are simpler and more efficient ways to do it than employing metaprogramming and hoping that the optimizer does the rest. The other point was that, the way it is presented now, the proposal does not combine well with run time filtering, which, I think, is no less useful.
(earlier post:)
writing the result of a function call, such as dump_hex in your example. I didn't investigate this, but I suspect the call will still be present in the compiled code, which, AFAICT, is not what you intended.
This is what lazy function evaluation is for. It seems to me that lazy functions are going to be necessary for any non-macro solution that doesn't require ifs in user code, but they might also be helpful with the runtime filtering you are talking about?
Although lazy functions solve the problem, they complicate logging a lot. If you just want to put the result of a function call to the log, you have to create a lambda expression or some functional wrapper for that call. I know this is quite simple with Phoenix or Lambda, but it is still quite a lot of scaffolding for such a trivial task.
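For instance, assuming the stream were taught to invoke lambda functors, putting one function result into the record would read something like:

    strm << boost::lambda::bind(&dump_hex, boost::lambda::var(packet));

rather than a plain dump_hex(packet).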

On Jan 1, 2009, at 7:22 PM, Andrey Semashev wrote:
Gordon Woodhull wrote:
I believe what Jason is talking about is compile-time filtering where the compiler actually does know the answer and is able to eliminate code. The "trivial 'if'" is gone after optimization. This seems like a very nice feature to me.
I agree that this is a very useful feature. One of my points in the reply to Jason was that there are simpler and more efficient ways to do it than employing metaprogramming and hoping that the optimizer does the rest.
I like the idea of expression templates for logging and find Jason's syntax somewhat compelling, allowing different output to different sinks from the same line. But I recognize that macros are going to be the best solution for many projects, and it seems hard to match the sureness that they'll completely compile away.
Although lazy functions solve the problem, they complicate logging a lot. If you just want to put the result of a function call to the log, you have to create a lambda expression or some functional wrapper for that call. I know this is quite simple with Phoenix or Lambda, but it is still quite a lot of scaffolding for such a trivial task.
Yeah, it might be annoying to have to lazify every item that might cost processor time!

On Thu, 01 Jan 2009 14:24:08 -0500, Emil Dotchevski wrote:
With risk of being annoying, I'll second my earlier post about "the if" in question. :)
I'd rather have "the if" out of any macros and fully visible in user code. That way it would be clear what enables or disables the logging statement itself, and what functions are called and when.
You can do this trivially with any logging library if you want it to be explicit. Just set the logger to always feed to the sink, then wrap it with your own ifs. The other methods are for those who want to hide those details.

--
Jason Wagner
jason@nialscorva.net

On Thu, 01 Jan 2009 09:51:39 -0500, Andrey Semashev wrote:
Jason Wagner wrote:
It started because I thought yours was no longer in development. I have no idea why I thought that, since looking at your cvs logs you have been consistently working on it. It was largely a mistake.
The library is very much alive, although for various reasons I didn't touch it for the last few weeks.
Yes, it was completely my mistake for thinking that it was not.
I'm writing an internet spaceship game. There's a client and a cluster of servers of different types. The protocol is simple message passing, with the serialization code shared between client and server. This leads to a very simple example like:

    if(message.size > packet.size)
    {
        log << severity::warn()
            << "Packet was larger than the buffer that contains it!"
            << build_target::server()
            << " Packet source= " << connection
            << ", account = " << user << "\n"
            << dump_hex(packet);
        // error handling
    }

On the client side, we note that the packet was too big and move on. On the server side, we actually log the packet, since evil hackers are always trying to steal my space bucks. Since the template generates empty code at compile time, both gcc and msvc are smart enough to remove the " Packet source=" and ", account = " strings from the binary, providing less of a clue in the client as to what the server is watching for. On the server side, I probably want this record to go into a database, so having the entire thing as a single record is very important. That precludes items of the form:

    log << severity::warn() << "...";
    if(server) // or #ifdef
        log << "Packet source = ";
    //...

I also really don't like having the conditional in the logging. Looking at the serialization code, one shouldn't have to wade through the logic of logging.
I see two issues with your approach:

1. Although compilers may be smart enough to optimize away writing constants and simple variables into a dummy stream (which, I assume, log is when logging is disabled), I'm not so sure about more complex expressions, like writing the result of a function call, such as dump_hex in your example. I didn't investigate this, but I suspect the call will still be present in the compiled code, which, AFAICT, is not what you intended.
MSVC 9 is actually smart enough to do it if there are no side effects. gcc says it is, but its definition of "side effect" seems to cover any writes to memory at all; I haven't found a situation where it actually does optimize the call away. The solution I'm going with is lazy functions using boost::function and boost::bind. There's a wrapper called lazy() that is templated similarly to boost::bind and returns a boost::function that is only evaluated if the statement will be logged.
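Roughly, the idea is this (a simplified sketch; packet_t and the logger's handling of boost::function arguments are assumed):

    #include <string>
    #include <boost/bind.hpp>
    #include <boost/function.hpp>

    struct packet_t;                          // stand-in for the real packet type
    std::string dump_hex(packet_t const& p);  // expensive formatting call

    // lazy() captures the call now; the logger invokes the resulting
    // boost::function only if the record is actually going to be written.
    template< typename F >
    boost::function< std::string () > lazy(F f)
    {
        return f;
    }

    // usage:
    // log << severity::warn() << lazy(boost::bind(&dump_hex, packet));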
2. I'm not sure how you would implement filtering. More on this below.
From where I'm standing, I would suggest using macros. It's not that bad in this particular area and would not require changing your code a lot:
    if(message.size > packet.size)
    {
        LOG() << ...;
    }
where LOG can be defined differently for client and server builds. In connection with my library, it could be:
    #ifdef SERVER
    #  define LOG(logger) BOOST_LOG(logger)
    #else // client
    #  define LOG(logger) while (false) BOOST_LOG(logger)
    #endif
That will drop unnecessary code for sure on practically any compiler.
I think I could add this to your library using features. That's not a lightweight thing to add; based upon what I see, features need to be instantiated. In my scheme, since the if() test is built into the log record instantiation, there's a single point to change to affect all appropriate loggers. With your scheme I'd have to change the logger type, the if()s that wrap them or the macros that contain the ifs, and possibly client code as well.

Switching to a different scheme, I could do:

    log << severity::warn()
        << "Packet was larger than the buffer that contains it!"
        << auditing::full()
        << " Packet source= " << connection
        << ", account = " << user << "\n"
        << dump_hex(packet);
The problem is that you ignore run time filtering here, which should happen before formatting the record (i.e. before the streaming statement you mention above). If you are fine with it, your approach is sufficient. Otherwise you'll eventually come to something more complex, maybe something like my lib. :)
No, run time filtering works just fine. The type severity::warn is a typedef for a type and an integer -- basically an enum. There's a trait for turning the type into a string. Each insertion of a tag adds to an mpl::map of types to integers (well, integer types). As soon as there's something that actually gets logged, all of the compile time maps get turned into run time attributes/variables. During record generation, any time there's a variable change via a tag or explicit setting, the filter chain is re-evaluated and a new list of sinks is made:

    Compile time filters:
        severity > debug

    Runtime filters:
        when (severity >= warning) log to "critical.txt"
        when (severity >= info)    log to "info.txt"

    Statement:
        log << severity::debug()                 // 1
            << "debug message"                   // 2
            << severity::error()                 // 3
            << "error message"                   // 4
            << severity::info()                  // 5
            << "info message"                    // 6
            << severity::debug() << "last text"; // 7

1 -- inserting a tag. This tag is disabled at compile time, so the <<() call is an empty function returning a message<false> temporary.

2 -- inserting content. The message<false> object has an empty <<() function and returns itself.

3 -- inserting a tag. This updates the mpl::map and returns a message<true> object. No actual work is done; it just changes the type of the temporary returned.

4 -- inserting content. The message<true>::operator<<() knows that it's the first effective call, so it takes the mpl::map it carries, sets all the tags to run-time variables, finds all the interested sinks (critical.txt and info.txt), then returns.

5 -- inserting a tag. This changes the mpl::map but is still compile-time enabled. It also resets severity in the attribute map, which causes the sinks to be reevaluated; only "info.txt" is listening now.

6 -- inserting content. The pending record for "info.txt" is appended; the record for "critical.txt" is not.

7 -- inserting a tag. The mpl::map gets a new tag, which fails the compile time condition. The op returns message<false> and eats the next piece of text in a no-op call.

Run time filters are evaluated only when they need to be, but can be invoked in the middle of a record.
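A stripped-down sketch of the compile time half (the mpl::map and sink bookkeeping elided; names are approximate, not my actual code):

    #include <iostream>

    // Invented values for the sketch; the compile time filter is
    // "severity > debug", i.e. a minimum level of info.
    enum { debug = 0, info = 1, warning = 2, error = 3 };
    static const int min_severity = info;

    template< int Level > struct severity_tag {};

    template< bool Enabled > struct message;

    // Disabled record: every insertion is an empty inline call that
    // the optimizer removes, string literals and all.
    template<>
    struct message< false >
    {
        template< typename T >
        message const& operator<< (T const&) const { return *this; }

        // Inserting a tag re-evaluates the compile time filter and can
        // switch the temporary back to the enabled type.
        template< int Level >
        message< (Level >= min_severity) > operator<< (severity_tag< Level >) const
        { return message< (Level >= min_severity) >(); }
    };

    // Enabled record: insertions reach the run time machinery
    // (std::clog here stands in for appending to pending records).
    template<>
    struct message< true >
    {
        template< typename T >
        message const& operator<< (T const& value) const
        { std::clog << value; return *this; }

        template< int Level >
        message< (Level >= min_severity) > operator<< (severity_tag< Level >) const
        { return message< (Level >= min_severity) >(); }
    };

    int main()
    {
        message< true > log;
        log << severity_tag< debug >() << "discarded at compile time"
            << severity_tag< error >() << "reaches the sinks\n";
    }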
If all you need is to globally enable or disable logging at compile time, you can define macros as I've shown above. There's no need to play with loggers or complicate your code with conditionals. If you need run time filtering, the following constraints appear:

1. You cannot mix the attributes (sorry, I don't know the equivalent term for your library) into the streaming expression. The reason is obvious: attributes participate in filtering, and filtering takes place before the streaming.

2. You will have to write an "if" around the streaming statement. That "if" will check the filters at run time and decide whether the record will be processed or not. You can wrap it in a macro, but it will be there, and it will not be zero-cost in terms of performance and code size.

3. All attributes you want to put to log have to be set up before that "if" statement. My lib provides scoped attributes (or tags) for that purpose, although they may not look as neat as your manipulators in the streaming statement. In your use case the scoped attributes can be set in the macros I've shown above, so that they will not be present in the client code.
They also potentially participate in the formatting, even when they don't participate in the filtering. In those cases, I would consider them part of the record. It also makes sense in my use cases, since the conditionals occur within the insertion operations themselves, not around the statement as a whole.
Now, on the client side I can turn off full auditing at compile time, and on the server side I can turn off packet logging at run time and have a condition that turns it on when I suspect a problem. There is a separation of concerns between the audit level and the severity of the record.

The whole thing could be rewritten with scope variables for connection and user, which would allow them to be first-class denizens of the database or be fed into an event processing system without having to do entity extraction on the record. In addition, immediately insertable variables avoid the overhead of setting scope variables when they are not needed:

    log << severity::warn()
        << "Packet was larger than the buffer that contains it!"
        << auditing::full()
        << set("User", user) << set("connection", connection)
        << " Packet source= " << connection
        << ", account = " << user << "\n"
        << dump_hex(packet);
I'm sorry, I'm afraid I don't understand what you're saying here.
Part of the problem is that I was typing this when I mistakenly sent the email. Let me rewrite it:

The sink is a database table with columns severity, user, ip_address, and message. It maps the variable "user" to the user column, "connection" to connection, "severity" to severity, and the record text to the message column. I do not want to split up the record into two rows, but I don't want to dump the packet information except when I'm doing a full audit. Note that the condition has nothing to do with the severity, only with the degree of auditing I'm doing.

    auditing := sparse < moderate < detailed;

    log << severity::warn()
        << "Packet was larger than the buffer that contains it!"
        << set("User", user) << set("connection", connection)
        << " Packet source= " << connection
        << ", account = " << user << "\n"
        << auditing::detailed()
        << dump_hex(packet);

By default, I set a filter that checks to make sure that auditing <= sparse. The record will be built without a hex dump of the packet in it; the auditing::detailed() removes the following information from it. Later, if there's reason to suspect problems, I can load the filter with the program running, change the condition to auditing <= detailed, and suddenly the record will contain a hex dump of the packets that overflowed. In addition, since "user" and "connection" are attributes/variables, I can put a filter on them so that I have detailed auditing on certain users or IP addresses, but not on other records.

The thing of key importance is that it's all in a single logging record, not split between multiple lines. It's one packet of information with conditional content. That makes it far easier to view logs through GUIs, databases, or any other type of structured viewing. For systems that generate tens to hundreds of megabytes of logging per day, that's fairly important for usability. Grep only goes so far.

--
Jason Wagner
jason@nialscorva.net
participants (4): Andrey Semashev, Emil Dotchevski, Gordon Woodhull, Jason Wagner