RE: [boost] build_monitor.exe: a cure worse than the disease

At Sunday 2004-08-01 16:37, you wrote:
Victor A. Wagner Jr. writes:
I have NO idea what criteria build_monitor uses to kill processes,
Two conditions:
1) They have to be spawned by 'bjam.exe' executable.
how do you know?
2) They have to be running for more than 5 minutes or show an error dialog window.
these other things run 100% of the time on my system. in the middle of the night they may even go for 5 minutes with NO I/O.
but when I awoke this morning, it had managed to kill: the dnet "run in the background" ORG process, Pirch (my IRC client), a proxy that lets me get dcc's through my hardware firewall, Trillian (the instant message program), SpyBot resident..... I think you get the picture.
How did you determine that it's build_monitor that caused all that? While there is always a possibility of a program having bugs, it's hard to believe that this particular one started to misbehave all of a sudden after a year of stable work and no changes to the source code.
this thing just walked through my system like a horde of Vandals (or was it Visigoths). whatever
I'm sorry to hear about the whole thing, but let's get to the roots of it. First of all, why do you think it was build_monitor? Second, what operating system are you running?
1) My system has run flawlessly for over a year also and two days ago was when I first installed monitoring..... the first time it ran a bunch of stuff wasn't working in the morning, but I'd been very tired when I went to bed and figured MAYBE I'd shut them down (unlikely as I _never_ _NEVER_ shut down the dnet client)...but I let it go... I made some changes to regression.py so it would work better here then manually ran the system that afternoon...no problems. I let it run automatically again overnight and "bam" all these things that are ONLY started at login time and NEVER touched by me are missing again.... the _logical_ place to look is at "what's new?"... what's new is -monitored on the test line in my script.
btw, this morning I also noticed that build_monitor had its own window open and was spewing out TONS of data to it... is that normal?
I took out the teststreams test (I think) but this afternoon's run had already started... but I chose ignore when the modal dialog (may the .....) popped up. btw, tho my log clearly shows them sent to the ftp site, they're not on the website yet... the date is Sunday the time 21:30:03 in the xml file
I'd be more than happy to help find this as any problems in running the regression tests will just mean less folks willing to run them.
if you get back to me quickly, I can turn on the "-monitored" again... the run starts in just over 2 hours (on the 1/2 hour) (still missing teststreams at this point).
2) WinXPpro sp1
-- Aleksey Gurtovoy MetaCommunications Engineering _______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Victor A. Wagner Jr. http://rudbek.com The five most dangerous words in the English language: "There oughta be a law"

On Mon, 02 Aug 2004 00:16:54 -0700, Victor A. Wagner Jr. <vawjr@rudbek.com> wrote:
2) WinXPpro sp1
If it's XP (or any other NT-based OS, though XP makes this easier) then, as a temporary measure, you could run the tests and the build monitor under a separate user account. The separate user would have no permissions to terminate any of your processes. Use the "Run as user" option that appears e.g. if you right-click on cmd.exe. /Mattias

I'll give that a go tomorrow. Testing starts in about 2 minutes, so I don't think I could set it up fast enough, and it's 0230 here, time for me to hit the sack.
At Monday 2004-08-02 01:35, you wrote:
On Mon, 02 Aug 2004 00:16:54 -0700, Victor A. Wagner Jr. <vawjr@rudbek.com> wrote:
2) WinXPpro sp1
If it's XP (or any other NT-based OS, though XP makes this easier) then, as a temporary measure, you could run the tests and the build monitor under a separate user account. The separate user would have no permissions to terminate any of your processes. Use the "Run as user" option that appears e.g. if you right-click on cmd.exe.
/Mattias

Victor A. Wagner Jr. writes:
At Sunday 2004-08-01 16:37, you wrote:
Victor A. Wagner Jr. writes:
I have NO idea what criteria build_monitor uses to kill processes,
Two conditions:
1) They have to be spawned by 'bjam.exe' executable.
how do you know?
By recursively obtaining their parent processes using 'NtQueryInformationProcess' (http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dllproc/bas...) and checking if any of them is named 'bjam.exe'.
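The parent-chain check described above can be sketched roughly as follows. This is not build_monitor's code: the process table is a hypothetical in-memory stand-in for what 'NtQueryInformationProcess' would report, and the name `has_ancestor_named` and all PIDs are invented for illustration.

```python
from typing import Dict, Optional, Tuple

# Hypothetical process table: pid -> (executable name, parent pid or None).
# A real monitor would fill this from the operating system.
ProcessTable = Dict[int, Tuple[str, Optional[int]]]


def has_ancestor_named(pid: int, table: ProcessTable, name: str = "bjam.exe") -> bool:
    """Return True if any ancestor of `pid` has the given executable name."""
    seen = {pid}  # guards against cycles in a stale or inconsistent table
    entry = table.get(pid)
    parent = entry[1] if entry else None
    while parent is not None and parent not in seen:
        seen.add(parent)
        entry = table.get(parent)
        if entry is None:
            return False  # the parent is no longer known; the chain ends here
        exe, grandparent = entry
        if exe.lower() == name.lower():
            return True
        parent = grandparent
    return False


# Invented example data: a test runner started via bjam, and an unrelated client.
table: ProcessTable = {
    4: ("System", None),
    100: ("explorer.exe", 4),
    200: ("bjam.exe", 100),
    210: ("cl.exe", 200),
    211: ("test_runner.exe", 210),
    300: ("trillian.exe", 100),
}

print(has_ancestor_named(211, table))  # descendant of bjam.exe
print(has_ancestor_named(300, table))  # started from explorer.exe only
```

In this sketch an unknown parent ends the walk with a negative answer, so a process whose ancestry cannot be established is never attributed to bjam.exe.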
2) They have to be running for more than 5 minutes or show an error dialog window.
these other things run 100% of the time on my system. in the middle of the night they may even go for 5 minutes with NO I/O.
In case I wasn't clear, *both* of the above conditions have to hold for the process to become a candidate for termination.
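To make the "both conditions" rule concrete, here is a minimal sketch of the predicate as stated in this message. The `ProcessInfo` type, its field names, and `is_termination_candidate` are hypothetical; only the rule itself (spawned by bjam.exe AND either running more than 5 minutes or showing an error dialog) comes from the text.

```python
from dataclasses import dataclass

TIMEOUT_SECONDS = 5 * 60  # "more than 5 minutes", as stated in the message


@dataclass
class ProcessInfo:
    # All fields are illustrative; a real monitor would query the OS for them.
    name: str
    spawned_by_bjam: bool      # condition 1
    running_seconds: float     # input to condition 2
    shows_error_dialog: bool   # input to condition 2


def is_termination_candidate(p: ProcessInfo) -> bool:
    """Both conditions must hold for a process to become a candidate."""
    overdue_or_stuck = p.running_seconds > TIMEOUT_SECONDS or p.shows_error_dialog
    return p.spawned_by_bjam and overdue_or_stuck


# A long-running background client that bjam did not start is not a candidate.
irc_client = ProcessInfo("pirch.exe", False, 86400.0, False)
# A test started via bjam that pops up an error dialog is a candidate.
hung_test = ProcessInfo("some_test.exe", True, 12.0, True)

print(is_termination_candidate(irc_client))
print(is_termination_candidate(hung_test))
```

Under this rule, a process that has been idle for hours is safe as long as condition 1 is evaluated correctly, which is why a wrong answer to "was it spawned by bjam.exe?" would be the more likely source of trouble.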
but when I awoke this morning, it had managed to kill: the dnet "run in the background" ORG process, Pirch (my IRC client), a proxy that lets me get dcc's through my hardware firewall, Trillian (the instant message program), SpyBot resident..... I think you get the picture.
How did you determine that it's build_monitor that caused all that? While there is always a possibility of a program having bugs, it's hard to believe that this particular one started to misbehave all of a sudden after a year of stable work and no changes to the source code.
this thing just walked through my system like a horde of Vandals (or was it Visigoths). whatever
I'm sorry to hear about the whole thing, but let's get to the roots of it. First of all, why do you think it was build_monitor? Second, what operating system are you running?
1) My system has run flawlessly for over a year also and two days ago was when I first installed monitoring..... the first time it ran a bunch of stuff wasn't working in the morning, but I'd been very tired when I went to bed and figured MAYBE I'd shut them down (unlikely as I _never_ _NEVER_ shut down the dnet client)...but I let it go... I made some changes to regression.py so it would work better here then manually ran the system that afternoon...no problems. I let it run automatically again overnight and "bam" all these things that are ONLY started at login time and NEVER touched by me are missing again.... the _logical_ place to look is at "what's new?"... what's new is -monitored on the test line in my script.
Understood. Well, there is a slight chance that the layout of the PROCESS_BASIC_INFORMATION structure on Windows XP is different from the one on Windows 2000 (the structure is "for internal use"), and that could cause the kind of reckless behavior you've seen.
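One defensive measure against a layout mismatch is to compare the length reported by the query with the size of the structure the caller assumes. The sketch below uses the field names commonly cited in unofficial references; they are an assumption here, not taken from the build_monitor source or confirmed in this thread, and `layout_looks_sane` is a hypothetical helper.

```python
import ctypes


class ProcessBasicInformation(ctypes.Structure):
    # Commonly cited (unofficial) field names; treat them as an assumption.
    _fields_ = [
        ("ExitStatus", ctypes.c_int32),
        ("PebBaseAddress", ctypes.c_void_p),
        ("AffinityMask", ctypes.c_size_t),
        ("BasePriority", ctypes.c_int32),
        ("UniqueProcessId", ctypes.c_size_t),
        ("InheritedFromUniqueProcessId", ctypes.c_size_t),  # the parent PID
    ]


POINTER_SIZE = ctypes.sizeof(ctypes.c_void_p)
EXPECTED_SIZE = 6 * POINTER_SIZE  # six pointer-sized slots after padding


def layout_looks_sane(returned_length: int) -> bool:
    """Check a reported length against the size of the assumed layout."""
    return returned_length == ctypes.sizeof(ProcessBasicInformation) == EXPECTED_SIZE


print(ctypes.sizeof(ProcessBasicInformation), EXPECTED_SIZE)
print(layout_looks_sane(EXPECTED_SIZE))
```

A matching size does not prove that the fields are in the assumed order, so a check like this can only rule out the grossest mismatches.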
btw, this morning I also noticed that build_monitor had its own window open and was spewing out TONS of data to it... is that normal?
Yep. It simply logs everything it does.
I took out the teststreams test (I think) but this afternoon's run had already started... but I chose ignore when the modal dialog (may the .....) popped up. btw, tho my log clearly shows them sent to the ftp site, they're not on the website yet... the date is Sunday the time 21:30:03 in the xml file
They are there now.
I'd be more than happy to help find this as any problems in running the regression tests will just mean less folks willing to run them.
Thank you.
if you get back to me quickly, I can turn on the "-monitored" again... the run starts in just over 2 hours (on the 1/2 hour) (still missing teststreams at this point).
Let us test on Windows XP locally first.
2) WinXPpro sp1
Thanks! -- Aleksey Gurtovoy MetaCommunications Engineering
participants (3):
- Aleksey Gurtovoy
- Mattias Flodin
- Victor A. Wagner Jr.