Serialization. On to its binary archive format.
I'm sorry to bother you with this, but I'm having a lot of trouble. I was able to serialize and deserialize flawlessly in C++ with the binary archive. But now I need to construct binary payloads from PHP and then deserialize them in C++ in the same way.

I have been hex-viewing the binary output of boost::serialization and wasn't able to understand it completely. Is there any guide or general description of the format? I'm not sure I understand the library well enough to make my own archive format.

For example, when I serialize something like this:

    #include <fstream>
    #include <string>
    #include <vector>
    #include <boost/archive/binary_oarchive.hpp>
    #include <boost/serialization/level.hpp>
    #include <boost/serialization/string.hpp>
    #include <boost/serialization/vector.hpp>

    BOOST_CLASS_IMPLEMENTATION(std::vector<std::string>, boost::serialization::object_serializable)

    void save(const std::vector<std::string>& a)
    {
        std::ofstream ofs("caca.txt", std::ios::binary);
        boost::archive::binary_oarchive oa(ofs, boost::archive::no_header);
        oa << a;
        ofs.close();
    }

    int main()
    {
        std::vector<std::string> sarr;
        sarr.push_back("Uno");
        sarr.push_back("Dos");
        sarr.push_back("Tres");
        save(sarr);
        return 0;
    }

I get:

    00000000  03 00 00 00 00 00 00 00 03 00 00 00 55 6E 6F 03  ············Uno·
    00000010  00 00 00 44 6F 73 04 00 00 00 54 72 65 73        ···Dos····Tres

So 03 00 00 00 means the vector has 3 elements. Then comes 00 00 00 00, and I don't know what it is for. After that, three times over, there are 4 bytes for the length of a string followed by the string itself.

That weird 00 00 00 00 does not appear if I serialize (for example) a std::vector<int>.

More things of that kind happen when I try to use derived types. For example:

    #include <fstream>
    #include <iostream>
    #include <string>
    #include <vector>
    #include <boost/archive/binary_oarchive.hpp>
    #include <boost/serialization/base_object.hpp>
    #include <boost/serialization/export.hpp>
    #include <boost/serialization/level.hpp>
    #include <boost/serialization/string.hpp>
    #include <boost/serialization/tracking.hpp>
    #include <boost/serialization/vector.hpp>
    #include <boost/serialization/version.hpp>

    class Request
    {
    public:
        virtual void run() = 0;
        template<class Archive>
        void serialize(Archive & ar, const unsigned int) { }
    };

    BOOST_CLASS_IMPLEMENTATION(Request, boost::serialization::object_serializable)

    class Request1 : public Request
    {
    public:
        void run()
        {
            std::cout << " | " << ia << " | " << ib << " | " << sname << " | ";
            for (std::size_t i = 0; i < iarr.size(); ++i)
                std::cout << iarr[i] << " | ";
            std::cout << std::endl;
        }

        template<class Archive>
        void serialize(Archive & ar, const unsigned int)
        {
            ar & boost::serialization::base_object<Request>(*this);
            ar & ia & ib & sname & iarr;
        }

    public:
        int ia;
        int ib;
        std::string sname;
        std::vector<int> iarr;
    };

    BOOST_CLASS_EXPORT(Request1)
    BOOST_CLASS_TRACKING(Request1, boost::serialization::track_never)
    BOOST_CLASS_VERSION(Request1, 0xff)

    void arrrg(const Request* a)
    {
        std::ofstream ofs("caca.txt", std::ios::binary);
        boost::archive::binary_oarchive oa(ofs, boost::archive::no_header);
        oa << a;
        ofs.close();
    }

    int main()
    {
        Request1* temp(new Request1);
        temp->ia = 1;
        temp->ib = 2;
        temp->sname = "Omg!";
        temp->iarr.push_back(3);
        temp->iarr.push_back(4);
        temp->iarr.push_back(5);

        Request* a(temp);
        arrrg(a);
        return 0;
    }

I get:

    00000000  00 00 08 00 00 00 52 65 71 75 65 73 74 31 00 FF  ······Request1··
    00000010  01 00 00 00 02 00 00 00 04 00 00 00 4F 6D 67 21  ············Omg!
    00000020  03 00 00 00 03 00 00 00 04 00 00 00 05 00 00 00  ················

So what are the first 2 zero bytes? Then 08 00 00 00 is the length of the derived type's name, and 52 65 71 75 65 73 74 31 is the name itself. Then I get another 00, and I don't know what it is. Then FF is the version, and the rest is just as expected. So this is weird too.

It gets even weirder if (for example) I make the arrrg function do

    oa << a;
    oa << a;

instead of just

    oa << a;

Then I get:

    00000000  00 00 08 00 00 00 52 65 71 75 65 73 74 31 00 FF  ······Request1··
    00000010  01 00 00 00 02 00 00 00 04 00 00 00 4F 6D 67 21  ············Omg!
    00000020  03 00 00 00 03 00 00 00 04 00 00 00 05 00 00 00  ················
    00000030  00 00 01 00 00 00 02 00 00 00 04 00 00 00 4F 6D  ··············Om
    00000040  67 21 03 00 00 00 03 00 00 00 04 00 00 00 05 00  g!··············
    00000050  00 00                                            ··

After the first 05 00 00 00, I would expect the same object to be written again in full (as tracking is off!), but I get that 00 00 first and then just the data, with no "derived type signature". So how would the deserialization know the derived type of the data that follows?

I'm a little lost here. I'd appreciate any help. Thanks a lot in advance. I'll be trying to hack this thing... :S
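For anyone wanting to experiment with the layout described in the message above, here is a minimal sketch that rebuilds the posted bytes by hand, using only the standard library. It is not Boost's archive code and not a specification of the format: it only mirrors what the dumps show (little-endian 4-byte integers and lengths, no header), and the bytes the post could not explain are copied through as opaque values. The names `put_u32`, `put_string`, `encode_string_vector`, `Request1Fields` and `encode_request1` are made up for this illustration. Field widths and extra fields may well differ between Boost versions and platforms, so any real payload should be checked against a dump from the actual C++ build.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Append a 32-bit value in little-endian order, as it appears in the dumps.
void put_u32(std::vector<std::uint8_t>& out, std::uint32_t v) {
    for (int shift = 0; shift < 32; shift += 8)
        out.push_back(static_cast<std::uint8_t>((v >> shift) & 0xFF));
}

// Append a string as the dumps show it: 4-byte length, raw bytes, no terminator.
void put_string(std::vector<std::uint8_t>& out, const std::string& s) {
    put_u32(out, static_cast<std::uint32_t>(s.size()));
    out.insert(out.end(), s.begin(), s.end());
}

// Rebuilds the first dump (std::vector<std::string>).
std::vector<std::uint8_t> encode_string_vector(const std::vector<std::string>& items) {
    std::vector<std::uint8_t> out;
    put_u32(out, static_cast<std::uint32_t>(items.size())); // element count
    put_u32(out, 0); // the unexplained 00 00 00 00, copied as an opaque field
    for (const std::string& s : items)
        put_string(out, s);
    return out;
}

// Hypothetical plain struct mirroring the data members of Request1.
struct Request1Fields {
    std::int32_t ia;
    std::int32_t ib;
    std::string sname;
    std::vector<std::int32_t> iarr;
};

// Rebuilds the second and third dumps. In the posted output only the first
// write of the pointer carries the class name and version.
std::vector<std::uint8_t> encode_request1(const Request1Fields& r, bool first_occurrence) {
    std::vector<std::uint8_t> out;
    out.push_back(0x00); // leading two bytes: present on every write,
    out.push_back(0x00); // meaning not established in the post
    if (first_occurrence) {
        put_string(out, "Request1"); // exported class name
        out.push_back(0x00);         // unexplained byte, copied as is
        out.push_back(0xFF);         // value given to BOOST_CLASS_VERSION
    }
    put_u32(out, static_cast<std::uint32_t>(r.ia));
    put_u32(out, static_cast<std::uint32_t>(r.ib));
    put_string(out, r.sname);
    // std::vector<int>: count, then elements, with no extra 4-byte field.
    put_u32(out, static_cast<std::uint32_t>(r.iarr.size()));
    for (std::int32_t v : r.iarr)
        put_u32(out, static_cast<std::uint32_t>(v));
    return out;
}
```

Calling `encode_string_vector({"Uno", "Dos", "Tres"})` yields the 30 bytes of the first dump, and `encode_request1` with `first_occurrence` set to true and then false yields the 48 and 34 bytes of the two writes in the last dump. The same steps translate directly to PHP's binary packing functions, under the same caveats.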
To my way of thinking this is an almost impossible task. Pointless as well. If you really want to do something like this, you're more likely to be successful creating an XML archive.

A better approach would be to create a C++ component, invoked from your PHP script, which would take its command-line arguments and create an archive. Or use PHP to create a string of arguments in any convenient format. Your C++ code would then process these arguments and invoke serialization to create an archive, if you still need it.

Robert Ramey

Alejandro Martinez wrote:
I'm sorry to bother you with this, but I'm having a lot of trouble. I was able to serialize and deserialize flawlessly in C++ with the binary archive. But now I need to construct binary payloads from PHP and then deserialize them in C++ in the same way.
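The suggestion in the reply, a small C++ component that PHP invokes with command-line arguments, could start from something like the sketch below. It assumes a `key=value` argument format (`ia=1 ib=2 sname=Omg! iarr=3,4,5`) chosen purely for illustration; the reply only says "any convenient format". `RequestArgs` and `parse_request_args` are hypothetical names, and the serialization step itself is left out so that the sketch needs nothing beyond the standard library.

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical holder for the values that would later be copied into a Request1.
struct RequestArgs {
    int ia = 0;
    int ib = 0;
    std::string sname;
    std::vector<int> iarr;
};

// Parse arguments of the assumed form key=value; iarr takes a comma-separated list.
// Throws std::invalid_argument on a malformed argument, an unknown key,
// or a value that is not a number where one is required.
RequestArgs parse_request_args(const std::vector<std::string>& args) {
    RequestArgs r;
    for (const std::string& arg : args) {
        const std::string::size_type eq = arg.find('=');
        if (eq == std::string::npos)
            throw std::invalid_argument("expected key=value, got: " + arg);
        const std::string key = arg.substr(0, eq);
        const std::string value = arg.substr(eq + 1);
        if (key == "ia") {
            r.ia = std::stoi(value);
        } else if (key == "ib") {
            r.ib = std::stoi(value);
        } else if (key == "sname") {
            r.sname = value;
        } else if (key == "iarr") {
            std::istringstream items(value);
            std::string item;
            while (std::getline(items, item, ','))
                r.iarr.push_back(std::stoi(item));
        } else {
            throw std::invalid_argument("unknown key: " + key);
        }
    }
    return r;
}
```

In a real component, `main` would pass `std::vector<std::string>(argv + 1, argv + argc)` to `parse_request_args`, copy the result into a `Request1`, and write it with `boost::archive::binary_oarchive` exactly as the original C++ code does. That way the archive is always produced by the library itself and PHP never has to know the byte layout.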
_______________________________________________ Boost-users mailing list Boost-users@lists.boost.org http://lists.boost.org/mailman/listinfo.cgi/boost-users