I have been using Boost Spirit as a parser for a project that I have been
working on lately. At this point, I have been trying to expand the
software and, in doing so, have had the nagging feeling that there is
something wrong with my overall design. In the best case, I am not
making the best use of my tools, and in the worst case, I am concerned
that the design/code is becoming overly brittle. As an aside, I don’t
have a huge amount of programming experience with this type of application.
Some background and context:
Essentially, the program performs batch ‘jobs’ which are specified in a
text file (not dissimilar in that sense from a scheduler like Condor).
Spirit parses these ‘Job Description’ files into ‘specification objects’
(my vocabulary) that are used with a builder pattern and factories to
create the appropriate objects. I have some concerns about this which I
will describe later.
The job description (henceforth: JD) file should have 3 parts:
1. Data
2. Tools
3. Operations
The Data section is just a list of data that is to be operated on. At
this stage, this is just a std::vector of
std::pair<std::string, std::string>, each containing an identifier and
the path of the file. The ‘Tools’ and ‘Operations’ sections are likely
to be composed of nested specifications (the resulting objects use
either a decorator pattern or a composite pattern, depending on the type
of tool), which was the main reason that I started using Spirit in the
first place.
My question is really a request for some guidance – to better utilize
the tools available:
I have just received the requirement for section #3 (Operations), to be
included in the JD file (previously it was assumed that this would be
provided in a different manner). So – at this time, I have a working
parser for the Data and Tools portions. I have concerns with the
‘Tools’ portion that I would like to correct and not duplicate in the
‘Operations’ section that I am to be working on next.
At present, I am parsing the JD file mostly into strings and vectors of
boost::variant<int, double>, the latter being a list of parameters in a
fixed but arbitrarily chosen order. As mentioned above, I have a few
problems with this approach:
- The parameters must be supplied in a fixed order that was chosen
arbitrarily.
- Different ‘Tool’s and ‘Operation’s have different parameters that are
required and/or optional.
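To make the second point concrete, here is a minimal sketch of what a per-tool parameter check could look like if the parameters were named rather than positional. It uses plain C++17 (std::variant instead of boost::variant) so that it stands alone; Param_Value, Param_Map, Param_Schema and validate are hypothetical names of my own, not part of the existing code.

```cpp
#include <map>
#include <set>
#include <string>
#include <variant>
#include <vector>

// A parameter value; int and double mirror what the Options rule accepts.
using Param_Value = std::variant<int, double>;
// Named parameters, so the order in the JD file no longer matters.
using Param_Map = std::map<std::string, Param_Value>;

// Hypothetical per-tool schema: which names must appear and which may appear.
struct Param_Schema {
    std::set<std::string> required;
    std::set<std::string> optional;
};

// Returns a list of human-readable problems; an empty list means the
// parameters satisfy the schema.
inline std::vector<std::string> validate(const Param_Schema& schema,
                                         const Param_Map& params) {
    std::vector<std::string> errors;
    // Every required name has to be present.
    for (const auto& name : schema.required)
        if (params.find(name) == params.end())
            errors.push_back("missing required parameter: " + name);
    // Every supplied name has to be known to the schema.
    for (const auto& entry : params) {
        const std::string& name = entry.first;
        if (!schema.required.count(name) && !schema.optional.count(name))
            errors.push_back("unknown parameter: " + name);
    }
    return errors;
}
```

With something like this, each ‘Tool’ or ‘Operation’ type would register its own Param_Schema, and the factory would call validate before building the object, so the grammar itself would not need to know which parameters belong to which tool.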
For the past few weeks (I only work on this project on a very part-time
basis), I have been going in circles trying to figure out a better way
to do this. I have a strong suspicion that Fusion can be used for this.
I am also concerned about over-complicating the design, but am wary
of leaving the design too simplistic. I.e., I know that I can treat each
parameter as a std::pair of a key and a variant value, and
then parse key-value pairs with a given delimiter. I am just not
convinced that this is the best way to do this.
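For reference, the key-value idea can be sketched roughly as follows. This is plain C++17 rather than Spirit, only to show the shape of the result; the '=' and '|' delimiters and the names parse_value and parse_params are assumptions made for illustration, not something the JD format currently defines.

```cpp
#include <cstdlib>
#include <map>
#include <optional>
#include <sstream>
#include <string>
#include <variant>

using Param_Value = std::variant<int, double>;
using Param_Map = std::map<std::string, Param_Value>;

// Converts one token to int when the whole token is an integer,
// otherwise to double; returns nullopt if it is neither.
inline std::optional<Param_Value> parse_value(const std::string& text) {
    if (text.empty()) return std::nullopt;
    char* end = nullptr;
    const long as_long = std::strtol(text.c_str(), &end, 10);
    if (*end == '\0') return Param_Value(static_cast<int>(as_long));
    const double as_double = std::strtod(text.c_str(), &end);
    if (*end == '\0') return Param_Value(as_double);
    return std::nullopt;
}

// Parses "key=value|key=value" into a map. Returns nullopt on a missing
// delimiter, an empty key, a non-numeric value, or a duplicated key.
// Keys are taken verbatim (no whitespace trimming).
inline std::optional<Param_Map> parse_params(const std::string& text,
                                             char pair_delim = '|',
                                             char kv_delim = '=') {
    Param_Map result;
    std::istringstream stream(text);
    std::string item;
    while (std::getline(stream, item, pair_delim)) {
        const auto pos = item.find(kv_delim);
        if (pos == std::string::npos || pos == 0) return std::nullopt;
        const auto value = parse_value(item.substr(pos + 1));
        if (!value) return std::nullopt;
        if (!result.emplace(item.substr(0, pos), *value).second)
            return std::nullopt;
    }
    return result;
}
```

Under these assumptions, parse_params("iters=10|rate=0.5") and parse_params("rate=0.5|iters=10") yield the same map, which is exactly the order independence that the positional vector lacks.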
I am able to use any Boost Library, and have no restrictions about
compilers (I’m using a recent version of Clang, primarily, right now).
The following is a selection of the data structures and Spirit grammar
that I am using here:
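    #include <string>
    #include <utility>
    #include <vector>
    #include <boost/optional.hpp>
    #include <boost/variant.hpp>

    struct Tool_Spec;

    // Numeric option values, matching the int_ | double_ alternatives in Options.
    typedef std::vector<boost::variant<int, double> > Tool_Options_t;
    typedef std::vector<Tool_Spec> Children_Tool_t;
    // (identifier, file path)
    typedef std::pair<std::string, std::string> Data_Spec;

    struct Tool_Spec {
        std::string type;
        std::string data_designation;
        Tool_Options_t options;
        Children_Tool_t children;
        boost::optional<std::string> designation;
    };

    struct Operation_Spec {
        /*
        Unknown at this time. Need help
        */
    };

    // Defined last so that Tool_Spec and Operation_Spec are complete types here.
    struct Job_Request {
        Data_Spec data_spec;
        Tool_Spec model_spec;
        Operation_Spec operation_spec;
        boost::optional<std::string> description;
    };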
    Datafile %= lit("@START")
        >> *Data_Description
        >> Tool_Description
        >> Operation_Description
        >> lit("@END")
        ;

    Data_Description %= lit('%')
        >> Datasource
        >> lit(';')
        ;

    Datasource =
        Designator
        >> lit(':')
        >> Designator
        ;

    Designator %= +(char_("0-9a-zA-Z/._") | char_('-'));

    Comment_Designator %= +(char_("0-9a-zA-Z/._, ()") | char_('-'));

    Tool_Description %=
        Designator
        >> ':'
        >> ('@' >> Designator)
        >> '['
        >> +Options
        >> ']'
        >> -('{' >> *Child_Tool >> '}')
        >> -qi::lexeme[Comment_Designator]
        >> ';'
        ;

    Child_Tool %=
        Designator
        >> ':'
        >> ('@' >> Designator)
        >> '['
        >> +Options
        >> ']'
        >> -('{' >> *Child_Tool >> '}')
        >> -qi::lexeme[Comment_Designator]
        >> ';'
        ;

    Options %= (int_ | double_) % '|';