
On 01/10/2010 04:28 PM, Darryl Green wrote:
I haven't done more than read the docs (oh - I also checked the performance thread on sourceforge), but if I understand correctly, the current architecture *always* evaluates all attributes so that filtering can be applied (based on the attribute values).
No, not exactly. Attribute values are generated as needed during filtering, so if the filter fails, the unused attributes are not evaluated. If the record passes filtering, the rest of the attributes are evaluated. However, one should not count on this behavior when implementing attributes or filters. It is just an internal optimization.
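As a toy illustration of what I mean (not actual library code; the producer structs below are made up for the example):

#include <iostream>

// Made-up on-demand value producers; real attributes are more elaborate
struct severity_producer
{
    int operator()() const { return 2; }
};

struct uptime_producer
{
    int operator()() const
    {
        std::cout << "expensive uptime query\n"; // only runs if the value is requested
        return 12345;
    }
};

int main()
{
    severity_producer severity;
    uptime_producer uptime;

    // The filter only consults the severity; since the record is rejected,
    // the uptime value is never produced at all
    if (severity() >= 3)
        std::cout << "accepted, uptime = " << uptime() << "\n";
    else
        std::cout << "rejected; uptime was never evaluated\n";
}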
It appears from the example of adding an uptime attribute that the attribute value is heap-allocated and a shared_ptr returned. Presumably, to avoid issues with allocator performance in general and heap contention in particular, a custom allocator should be used for "lightweight" attributes?
The memory allocation issue is one of my concerns. However, my current point of view is that the memory allocator is orthogonal to the library. There may be different solutions, such as using a faster allocator, like tcmalloc from Google, or using memory pooling inside attributes as they generate values. Besides, some attributes don't allocate memory for their values at all.
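For instance, pooling inside an attribute could look roughly like this (just a sketch, not the library's actual code; uptime_value and make_uptime_value are made-up names):

#include <boost/make_shared.hpp>
#include <boost/shared_ptr.hpp>
#include <boost/pool/pool_alloc.hpp>

// Hypothetical value type held by an "uptime" attribute
struct uptime_value
{
    unsigned long seconds;
    explicit uptime_value(unsigned long s) : seconds(s) {}
};

// allocate_shared puts the value and the shared_ptr reference counter into a
// single block obtained from a pool, bypassing the global heap on the hot path
boost::shared_ptr< uptime_value > make_uptime_value(unsigned long s)
{
    return boost::allocate_shared< uptime_value >(
        boost::fast_pool_allocator< uptime_value >(), s);
}

Whether something like this actually wins depends on the workload; a general-purpose fast allocator such as tcmalloc may already be good enough.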
Even with the use of constant attributes, it seems filtering performance is inherently not as good as in a system which takes advantage of the constant nature of filtering attributes by performing early evaluation of filters and caching the result where possible. Specifically, in the case where the logger "identity", comprised of a set of constant attributes associated with the logger, is established at logger construction, it should be possible to determine at that time whether the filter (assuming the filter itself does not change) will always match or never match. Caching this result means that evaluating the filter reduces to reading a bool. Potentially, this could be implemented within the proposed logging design by making an "identity" attribute that caches internally in this way and using it as the only attribute for loggers and filters that require high performance when the logger is disabled.
Is this a reasonable approach to take? The main problem I see is that in general filters are not constant and this would need some sort of filter/attribute interaction to force reevaluation when the filter is "newer" than the cached bool result.
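A rough sketch of the caching idea (purely hypothetical types; attribute_set, filter_fn and cached_filter_logger are not part of the proposed library, and the staleness check is reduced to a simple generation counter):

#include <map>
#include <string>

typedef std::map< std::string, std::string > attribute_set; // the constant logger "identity"
typedef bool (*filter_fn)(attribute_set const&);            // the installed filter

// Global filter state; installing a new filter bumps the generation,
// which marks every cached result as stale
static filter_fn g_filter = 0;
static unsigned g_filter_generation = 0;

inline void set_filter(filter_fn f)
{
    g_filter = f;
    ++g_filter_generation;
}

class cached_filter_logger
{
public:
    explicit cached_filter_logger(attribute_set const& identity)
        : m_identity(identity)
        , m_cached_generation(g_filter_generation)
        , m_enabled(g_filter ? g_filter(identity) : true) // early evaluation at construction
    {
    }

    // The hot path: normally just an integer compare and a bool read
    bool enabled()
    {
        if (m_cached_generation != g_filter_generation)
        {
            // The filter is "newer" than the cached result - re-evaluate once
            m_enabled = g_filter ? g_filter(m_identity) : true;
            m_cached_generation = g_filter_generation;
        }
        return m_enabled;
    }

private:
    attribute_set m_identity;
    unsigned m_cached_generation;
    bool m_enabled;
};

A real implementation would of course have to make the generation counter and the cached flag thread-safe, but as long as the filter doesn't change, the per-record cost stays close to a single bool check.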
The other, and the most critical, obstacle to following this approach is that there is no distinction between attribute values that come from loggers and those that come from other attribute sets. Also, filters don't interact with attributes, but rather with their values, which are often different and independent objects. Besides, one of the most frequent filtering rejections is when the severity level is not high enough, and the severity level is not constant. I don't think that caching would help here. But in the case of channel logging, I agree, this could give some performance boost. However, right now I don't see how I could implement that without major architecture changes. It's worth thinking over, though.