Re: [boost] Review of mustache starts today.

18 Feb 2023

      Le dimanche 05 février 2023 à 11:55 +0800, Klemens Morgenstern via
Boost a écrit :
...
The formal review of the mustache starts today, 
Review
========
Hi. This is my review of the proposed boost.mustache library. To put
some context, I'm not particularly knowledgeable of templating
libraries, i have been using jinja for small tasks, and that's nearly
all. Writing this review was the occasion to dig more in this area, and
I wish to thank Peter for proposing the library.
...
Please explicitly state that you either *accept* or *reject* the
inclusion of this library into boost.
The accept can be conditional.
I think this library should be REJECTED for boost inclusion in its
current state. I think there are too many defects that needs to be
addressed before inclusion. However, I think a templating engine would
be a nice addition, so I would like this library to evolve and be
proposed again for inclusion in boost.
...
Also please indicate the time & effort spent on the evaluation and
give the reasons for your decision.
I have spent between 8 and 10 hours I think, across several days (which
makes it hard to track an exact count), reading the documentation, the
documentation on mustache, reading the source code and playing with toy
projects to evaluate the library.
...
* Do you have an application for this library?
I currently maintain a python script which generate some code from
description files in yaml format, using jinja2 templates. I would be
happy to use a C++ tool for that, which would avoid those python
dependencies on the build server. Unfortunately, i came to the
conclusion that the proposed library is not a good fit.
...
* Does this library bring real benefit to C++ developers for real 
  world use-case?
A templating engine is a useful tool. I have some concerns, however,
whether mustache is a good enough/adequate template language. The
concerns are as follows:

- html escaping by default. This is the kind of design decision that
tend to show the language was not designed for general use. I also fail
to see how that's supposed to work in real-life: html documents are
heterogeneous, and several of their sections have different encoding
rules (embedded css or scripts, url arguments, just to name a few). To
me, it looks like php magic_quotes: the kind of ideas that are
inherently bad, and will only cause more trouble than the problem
they're trying to solve.
- no specification of encoding. I'm ok with utf-8 only, but that ought
to be specified somewhere.
- the lack of filters (like jinja filters). Filters are a nice way to
transform the data at the point where it is used. This could be used
for value escaping (actually, this is how we use it to generate valid
c++ identifiers from values in our yaml description files).
- it is very hard to generate correctly a comma separated list of
values using mustache templates. This is not a so uncommon need, there
ought to be a canonical simple way to do that.
- this is a lesser concern, but the specification of mustache 
templating language is unreadable as hell. I really wished there were a
RFC instead of this list of yml files. And why isn't a template
provided to format them in a really human readable format, and
the result widely available on a web page? I just wonder...
...
* Does the API match with current best practices?
The API is rather simple. It uses string_view when possible, modern
stuff, etc.
But I was surprised to see the logic. If I were to implement a
templating engine (which is something I've never done, so I may be
completely wrong on this), I would first compile the template into an
internal memory representation, and then apply that compiled template
to different data, reusing the parsing work done (something similar to
SQL prepared statements). This approach is somehow also suggested by
mustache(5) man page, which tells that partials “are rendered at
runtime (as opposed to compile time)”. So, there could be a compile
phase. But that's not the case with boost.mustache: everything is
rendered at runtime. This is not an issue per-se: it works
fine, even when the template is read by blocks (ie, the template is not
fully available during the rendering - I guess supporting this use case
may have lead to these design decisions).

However, there are some oddities:
- when building a renderer object, you need to feed them the partials,
and the data. But obviously, the partials implementation is going to
depend on your template. Let's say i wish to export the same set of
data to different output formats, like markdown and html, i could use
the same renderer with different templates. But that's not the
case, as the partials are likely to differ as well.
- the API requires json objects, or converts them to json. Playing with
the HTML example, I changed:
    boost::mustache::render( html, std::cout, ref,
        { { "header", header }, { "footer", footer }, { "item", item},
          { "body", body } } );
  to 
    boost::mustache::renderer rd(ref, 
        { { "header", header }, { "footer", footer }, { "item", item },
          { "body", body } } );
    rd.render_some(html1, std::cout);
    rd.finish(std::cout);
(as suggested by the documentation). To my surprise, it failed to
compile. Fixing that was easy, but it made me wonder if the current
approach was a good one. I see no reason for partials to be a
json_object: they could be anything that maps a string with another (a
map<string,string> being a reasonable choice, but it would make sense
to also support map<string_view, string_view>). The problem is that
they are currently stored at render object creation, so i understand
why the hardcoding of the type. But i think the wrong one was chosen
here.
- the renderer takes ownership of the json data. I see plenty of cases
where i will not want this, as it will require an extra copy.
- the current API makes it very difficult to integrate lambdas, which
could be a way to work around several limitations of the templating
language.
- the output_ref concept has no way to indicate that it is full, to
specify a maximum size, or to report how many characters it has
actually written. This is an issue, because it makes buffered output
nearly impossible. There's no way, given the input, to correctly
predict the output size. The library correctly handles slicing
the input template, but won't be able to slice its output, and resume
it later. This is likely to cause integration issues. I would prefer
the opposite choice (ie, the whole input template must reside in
memory, but slicing of the output is supported).

And, the worse in my opinion is that there is no error reporting
mechanism. The mustache man page tells that a variable “miss” returns
an empty string, but that it is usually configurable. Unfortunately, it
is not with boost.mustache, which will always silently use an empty
string. Worse, it also do it for partials (for which I fail to see what
the requirements are supposed to be in the documentation). There should
be a rendering policy to select the behaviour in case these events
occurs.
...
* Is the documentation helpful and clear?
The documentation is OK if you want a quick way to generate a simple
document. However, it lacks several sections:

- design decisions. Why is there no “compilation” phase to reuse
templates? Note that the templating language itself lacks a description
on its design decisions (why having lambdas instead of filters, for
example?)
- performance. There is no section on how the library performs,
compared to other implementations. There is also no guidance on how to
efficiently use the library. If I'm doing html templating
server-side, this is obviously something that i will look into.
- there is no section on memory handling -> what are the allocation
strategies, how much parsing and rendering a template allocates, etc.
(ideally, rendering a template should not allocate memory — excluding
memory allocated by the output_ref — unless partials are in use).
- the render_some supports slicing the input. But that's not
documented. It is also not documented what i gain by using this api
instead of the render function (what can be reused?)
- given that the overall documentation on mustache templating language
is pretty bad, some effort could be made to take it to a better level.
- error handling. I see two types of errors:
  - missing variables/partials in template resolution (I already talked
about this)
  - template errors. I'm not sure these ones exist, my understanding is
that mustache is designed so that anything that would be a parse error
is just “non-mustache” and rendered as-is. But nowhere it is
documented.
...
* Did you try to use it? What problems or surprises did you 
  encounter?
I used it only in an evaluation context, for the review. No problems
outside those mentioned.
...
* What is your evaluation of the implementation?
I did not have more than a quick look at it. I've found a few
surprising things at glance (but since this is not a deep review, they
may be incomprehensions/mistakes on my side):

- the json::string whitespace_ . I fail to see why there is a specific
handling for leading spaces, and why the parser does not start into
state_passthrough. I was wondering also why just counting the spaces
was insufficient, but it seems to be related to handling both ' '
and '\t'.
- the data object is stored as non-const in the context stack. I have
already mentioned this in the API comments, these extra copies are
going to be issues.
- the context stack is a json::array. Unless i'm mistaken, this means
that entering a section will cause a copy of the subobject, on top of
the context stack. Since the object is not altered by the renderer,
storing a pointer to the subobject in the context stack should be
enough, and would avoid a copy.
- html escaping is not done correctly in all cases ('\r', for example,
should be escaped in the output).

I did not look further, as I think most of these are implementation
issues that can be fixed later, unlike the API/design choices that
motivates my refusal.

Coming to a conclusion, I think the following question should be
answered:
- what is the purpose of this library? Like in, “do we want a
templating mechanism available, or do we want a strict mustache
conforming implementation?”

I think the former would provide great value to boost. However, it
would mean evaluating different template languages available,
evaluating the use cases we want to support, and taking design
decisions accordingly. Mustache can be a starting point, i don't know
how badly specified are other template languages. On the other side, if
what we want is the latter, then the roadmap would be different: fix
the few design issues, add lambda support, write all missing
documentation, and integrate to boost. But to me, it would look like a
missed opportunity: contrary to what the mustache authors claims,
mustache is *not* a templating system for anything. It has just far too
many limitations and design failures that needs to be circumvented for
this. Someone in a review looks forward using this library to produce
xml or json only by changing the templates: I wish him good luck with
that. My opinion on this is that's it's not reasonably doable with
mustache...

Regards,

Julien

Re: [boost] Review of mustache starts today.

Julien Blanc