Re: [Boost-users] [WAVE] bug: generic lexing error
Andreas Sæbjørnsen wrote:
I do not know why you were not able to reproduce the error; I reproduced it on two different machines using two different Boost installations. Using the attached source file you should be able to reproduce it. Both icc and g++ compile this file without error.
If the behaviour in section 2.2.2 of the standard leads to Wave being unable to preprocess some C++ source files that cpp and the EDG preprocessor accept, what do you think about adding an option to turn this behaviour off?
I've inserted a null byte into your sample and gcc complains about it as well. So my guess is that something else is going wrong. Since I still have had no luck reproducing your original problem here on my Windows machine, I'll have to try it on a Linux box, but I won't be able to do that until next week.
Regards Hartmut
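The diagnostic discussed here complains about a '\000' byte in the input stream, so one quick check is whether the file on disk really contains such a byte. The following is a minimal sketch only; the helper names `first_nul_offset` and `read_binary` are hypothetical and are not part of Wave.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: offset of the first NUL byte in `data`,
// or std::string::npos if the data contains none.
std::size_t first_nul_offset(const std::string& data) {
    return data.find('\0');
}

// Hypothetical helper: read a whole file in binary mode so that no byte
// is translated or dropped on the way in.
std::string read_binary(const char* path) {
    std::ifstream in(path, std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

Calling `first_nul_offset(read_binary("bugInducingCode.C"))` and comparing the result with `std::string::npos` would tell whether the file itself holds a zero byte, or whether the byte only appears later inside the lexer.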
Regards Andreas
On 4/21/06, Hartmut Kaiser <hartmut.kaiser@gmail.com> wrote:
Andreas Sæbjørnsen wrote:
Using the wave driver reference implementation on the code found at:
http://folk.uio.no/andsebjo/bugInducingCode.C
I get the following error:
bugInducingCode.C(10157): error: generic lexing error: '\000' in input stream
This code compiles with g++, and if any line is cut from the file it no longer fails with Wave. The code contains only C++ syntax, so it is basically a test of the cpplexer. Is this a known problem?
This problem occurs in both my two-month-old CVS version of Boost and the 1.33.1 build.
Hmmm. Sorry I'm not able to reproduce this problem here. But the error you get says your input stream contains a binary 0 (zero) byte. And yes it's a lexer diagnostic.
The standard says ( 2.2.2 [lex.charset]):
<quote> If the hexadecimal value for a universal character name is less than 0x20 or in the range 0x7F-0x9F (inclusive), or if the universal character name designates a character in the basic source character set, then the program is ill-formed. </quote>
So I'm pretty sure Wave is right to diagnose this.
Could you send me the file as an attachment, please, just to make sure I really get it as you have it on your disk.
Regards Hartmut
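The rule quoted above from 2.2.2 [lex.charset] applies to universal character names (the `\uXXXX` and `\UXXXXXXXX` escapes). As a rough sketch of what it says, and not Wave's actual implementation, the check could look like this. The function names are hypothetical, and the code assumes an ASCII-compatible execution character set so that character literals equal their code points.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: true if code point `c` is one of the 96 characters
// of the C++03 basic source character set. Assumes an ASCII-compatible
// execution character set (char literals equal their code points).
bool in_basic_source_charset(unsigned long c) {
    static const char basic[] =
        " \t\v\f\n"
        "abcdefghijklmnopqrstuvwxyz"
        "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        "0123456789"
        "_{}[]#()<>%:;.?*+-/^&|~!=,\\\"'";
    if (c == 0 || c > 0x7E) return false;  // never match the string terminator
    return std::strchr(basic, static_cast<int>(c)) != nullptr;
}

// Hypothetical helper: true if a universal character name with this
// hexadecimal value makes the program ill-formed under the quoted rule.
bool ucn_is_ill_formed(unsigned long value) {
    return value < 0x20
        || (value >= 0x7F && value <= 0x9F)
        || in_basic_source_charset(value);
}
```

Under this sketch `\u0000` and `\u0041` ('A') are rejected, while `\u00E6` ('æ') and `\u0024` ('$', which is not in the basic source character set) are accepted.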
_______________________________________________
Boost-users mailing list
Boost-users@lists.boost.org
http://lists.boost.org/mailman/listinfo.cgi/boost-users
Sounds good. The complete specification for the Linux boxes I have tested this on is:
RHEL 3, fully updated, on an Intel Xeon with g++ 3.4.3
Ubuntu Breezy Badger, fully updated, on an Intel Pentium M with g++ 4.0.3
On both machines I tested both the 1.33.1 release and the CVS version. I appreciate all your help in tracking down this issue.
Thanks Andreas
After debugging Wave I believe that the error I reported is caused by a memory error. There is a buffer within 'cpp.re' and 'cpp_re.cpp' whose size is defined as 196608 ('#define BSIZE 196608'), and the smallest failure-inducing input file to the wave driver has size 196608. If BSIZE is increased to an arbitrary size bigger than 196608, the Wave error I reported disappears. But increasing BSIZE is not a bug fix, as the memory error will now occur for a file larger than 393216. Are you able to reproduce this memory error using a memory debugger?
The file bugInducingCode.C I sent you earlier is of size 196617; just trim off a few characters, which is exactly what I did, and you get a file of the same size as I used here.
The memory debugging output from Valgrind for wave when BSIZE is 196608 is:
% valgrind --tool=memcheck ../../../bin/boost/tools/wave/build/wave/gcc/debug/wave ../bugInducingCode.C > output.txt
==2300== Source and destination overlap in memcpy(0x1BBF5E98, 0x1BBF5E99, 196607)
==2300==    at 0x1B903A61: memcpy (mac_replace_strmem.c:113)
==2300==    by 0x81026B3: boost::wave::cpplexer::re2clex::fill(boost::wave::cpplexer::re2clex::Scanner*, unsigned char*) (cpp.re:188)
==2300==    by 0x8102CC8: boost::wave::cpplexer::re2clex::scan(boost::wave::cpplexer::re2clex::Scanner*) (cpp_re.cpp:411)
==2300==    by 0x80FEED8: _ZN5boost4wave8cpplexer7re2clex5lexerIN9__gnu_cxx17__normal_iteratorIPcSsEENS0_4util13file_positionINS8_11flex_stringIcSt11char_traitsIcESaIcENS8_9CowStringINS8_22AllocatorStringStorageIcSD_EES6_EEEEEEE3getEv (cpp_re2c_lexer.hpp:142)
If BSIZE is increased to 393216, Valgrind does not report any memory errors for a file of size 196608.
Thanks Andreas
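The Valgrind report above names a destination of 0x1BBF5E98, a source of 0x1BBF5E99 and a length of 196607, so the two ranges are only one byte apart. The arithmetic behind such an overlap report can be sketched as follows; `ranges_overlap` is a hypothetical helper written for illustration and is not how Valgrind itself is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true if the byte ranges [dst, dst + n) and
// [src, src + n) share at least one byte. Two equally sized ranges
// overlap exactly when their start addresses are less than n apart.
bool ranges_overlap(std::uintptr_t dst, std::uintptr_t src, std::size_t n) {
    if (n == 0) return false;
    return dst < src ? (src - dst) < n : (dst - src) < n;
}
```

With the values from the report, `ranges_overlap(0x1BBF5E98, 0x1BBF5E99, 196607)` is true: the start addresses differ by 1, far less than the 196607 bytes copied, and memcpy has undefined behaviour when its source and destination overlap.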
Thanks for figuring that out! I already suspected it had something to do with the buffer/file size, but was not able to reproduce the problem. This is probably because of a different implementation of memcpy, which on Windows works correctly with memory blocks overlapping this way... I've tried to make these memory copy operations safer now. Could you please retry with the HEAD CVS version?
BTW: The Re2c scanner buffer management code is fairly old code which I have not changed very much from its former life in a 'C' oriented context...
Regards Hartmut
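The kind of buffer shift that triggers the overlap report can be sketched as below. This is only an illustration under assumed names, not the actual code in cpp.re: `compact_buffer` is hypothetical. It moves the unread tail of a scanner buffer down to the start of the same buffer, which is a copy between overlapping regions and therefore needs memmove rather than memcpy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: move the `unread` bytes starting at `cursor` down to
// the start of `buf` and return a pointer to the first free byte after them.
// Source and destination lie in the same buffer and may overlap, so this
// uses std::memmove; std::memcpy has undefined behaviour for overlapping
// regions even if a particular implementation happens to copy correctly.
unsigned char* compact_buffer(unsigned char* buf, unsigned char* cursor,
                              std::size_t unread) {
    assert(cursor >= buf);
    if (cursor != buf && unread != 0) {
        std::memmove(buf, cursor, unread);
    }
    return buf + unread;
}
```

Shifting a buffer down by a single byte, as in the reported call (destination one byte below the source), is the case where a forward-copying memcpy may appear to work on one platform and fail on another.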
I still experience the same problems as before with regards to this memory bug. I am not able to test this with the wave driver though as it compiles with a lot of errors in a lot of header files with gcc 3.4.3 on linux, but it shows up in the lexed_tokens example. Am I correct to believe that HEAD is the default CVS version for boost-cvs? I tested this on a default checkout from the boost-cvs.
Thanks Andreas
Andreas Sæbjørnsen wrote:
I still experience the same problems as before with regards to this memory bug. I am not able to test this with the wave driver though as it compiles with a lot of errors in a lot of header files with gcc 3.4.3 on linux, but it shows up in the lexed_tokens example.
That puzzles me. Why doesn't the Wave driver compile for you? Are you sure you really have the corrected version (I've replaced all memcpy's in cpp_re.cpp with memmove's, which should avoid the problem reported by valgrind)?
Am I correct to believe that HEAD is the default CVS version for boost-cvs? I tested this on a default checkout from the boost-cvs.
HEAD should be the default.
Regards Hartmut
participants (2)
- Andreas Sæbjørnsen
- Hartmut Kaiser