Message boards : Number crunching : Status 10th May, Version 451.07

Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 12 Jul 11
Posts: 857
Credit: 1,619,050
RAC: 0
Message 26411 - Posted: 10 May 2014, 9:25:33 UTC

We have gone back to an older version with respect to the BOINC_API calls.
Version 451.07 no longer calls zipitall and returns only the fort.10,
as in the past. This is looking much, much better, but we now have too many
"EXIT TIME LIMIT EXCEEDED" errors on Windows 7, even on very, very short
tests. There are some other errors, but some are probably a HOST issue,
finger trouble, or just noise. (I believe the zipitall issue arose because
I was not ensuring all files were closed before the call. We shall get back to
testing that once production is restarted.)
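
A minimal sketch of the close-before-zip ordering mentioned above, written in Python rather than SixTrack's Fortran. It is not the actual zipitall routine: the helper `zip_outputs` and the file contents are hypothetical, and only the names fort.10 and Sixout.zip come from this thread.

```python
import os
import tempfile
import zipfile


def zip_outputs(paths, archive_path):
    """Zip output files that have already been closed by their writers."""
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in paths:
            archive.write(path, arcname=os.path.basename(path))


workdir = tempfile.mkdtemp()
out_path = os.path.join(workdir, "fort.10")

# The with-block flushes and closes the file before anything tries to zip it.
with open(out_path, "w", newline="\n") as out:
    out.write("1.0 2.0 3.0\n")

archive_path = os.path.join(workdir, "Sixout.zip")
zip_outputs([out_path], archive_path)

# Read the archive back to confirm the complete file was stored.
with zipfile.ZipFile(archive_path) as archive:
    names = archive.namelist()
    content = archive.read("fort.10").decode()

print(names, repr(content))
```

Zipping a file that is still open for writing can capture a partially flushed file, and on Windows may fail outright, which would be consistent with the suspicion voiced above.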

The overall failure rate on boinctest is a few %, but I am not happy. Still, I
might give production a try before we lose hosts and before the CERN
users give up.

I'll keep testing over the weekend and analysing the errors and optimistically
try production Sunday evening or Monday morning (CST). Eric.
ID: 26411
Toby Broom
Volunteer moderator
Joined: 27 Sep 08
Posts: 798
Credit: 644,773,333
RAC: 231,805
Message 26413 - Posted: 10 May 2014, 13:19:24 UTC

Could you mix test with production and enable some verbose logging on the test to help resolve the issues that you're still concerned about?
ID: 26413
Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 12 Jul 11
Posts: 857
Credit: 1,619,050
RAC: 0
Message 26414 - Posted: 10 May 2014, 13:23:17 UTC - in response to Message 26413.  

Thanks Toby; I'll try to find out about verbose logging.
I will post more info in 5 minutes. Eric.
ID: 26414
Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 12 Jul 11
Posts: 857
Credit: 1,619,050
RAC: 0
Message 26415 - Posted: 10 May 2014, 13:36:58 UTC

Investigation so far shows the main problem is:

EXIT_TIME_LIMIT_EXCEEDED
and stderr
Maximum elapsed time exceeded
GetLastError 126

This is Windows only, but on different machines and the hosts
concerned run other similar WUs OK.
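
For context, a hedged sketch of how such a limit might be derived. It assumes, as an understanding of the BOINC client rather than anything confirmed in this thread, that a task is aborted with "Maximum elapsed time exceeded" once its elapsed time passes the workunit's `rsc_fpops_bound` divided by the client's estimated speed for the application. The function names and all numbers are hypothetical.

```python
def max_elapsed_seconds(rsc_fpops_bound, estimated_flops):
    """Assumed limit: operation bound divided by estimated speed in FLOPS."""
    return rsc_fpops_bound / estimated_flops


def exceeds_time_limit(elapsed_seconds, rsc_fpops_bound, estimated_flops):
    """True if a task would be aborted under the assumed rule."""
    return elapsed_seconds > max_elapsed_seconds(rsc_fpops_bound, estimated_flops)


# Hypothetical short test with a bound of 3e11 floating-point operations.
bound = 3e11

# With a plausible speed estimate (3 GFLOPS) the limit is generous.
realistic = max_elapsed_seconds(bound, 3e9)

# With a speed estimate 100 times too high the limit shrinks accordingly.
inflated = max_elapsed_seconds(bound, 3e11)

# The same 20-second task passes under one estimate and fails under the other.
ok_with_realistic = exceeds_time_limit(20.0, bound, 3e9)
ok_with_inflated = exceeds_time_limit(20.0, bound, 3e11)

print(realistic, inflated, ok_with_realistic, ok_with_inflated)
```

Under that assumption, an overestimated speed on some Windows hosts would be one possible reason for aborts even on very short tests. This is a guess to check against the task's stderr, not a diagnosis.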

I see a Windows 7 problem for user Radek; the (Polish) message seems
to be "Access Denied", which may be a host configuration problem.

Previous tests failed because we used Sixout.zip or a bad template file.
We have just the one input file now and only fort.10 returned.

There should be no result validation errors now that we have changed
the Intel ifort Windows compiler versions and added new tests for
power supply ripple.

I am running a few thousand more tests now, will try to get a good night's
sleep :-), and we shall see tomorrow. Eric
ID: 26415
Toby Broom
Volunteer moderator
Joined: 27 Sep 08
Posts: 798
Credit: 644,773,333
RAC: 231,805
Message 26416 - Posted: 10 May 2014, 13:48:26 UTC

Have a good night's rest!

Maybe try posting on the main BOINC forums to get some help from Rom Walton et al.?

ID: 26416
Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 12 Jul 11
Posts: 857
Credit: 1,619,050
RAC: 0
Message 26417 - Posted: 10 May 2014, 14:47:26 UTC

Seeing "Cannot Create Process" again with "Access is denied". Eric.
ID: 26417
Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 12 Jul 11
Posts: 857
Credit: 1,619,050
RAC: 0
Message 26418 - Posted: 10 May 2014, 14:48:58 UTC - in response to Message 26416.  

Thanks Toby; I'll try and figure out how to do that. Eric.
(I am afraid my CERN support is just really into VMs.)

ID: 26418
Conan
Joined: 6 Jul 06
Posts: 108
Credit: 661,871
RAC: 196
Message 26419 - Posted: 10 May 2014, 23:42:00 UTC

All work for my 32-bit Windows and 64-bit Linux hosts has completed without error and been validated. Pity the credits were so haphazard.

Conan
ID: 26419
Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 12 Jul 11
Posts: 857
Credit: 1,619,050
RAC: 0
Message 26420 - Posted: 11 May 2014, 8:12:51 UTC - in response to Message 26419.  

Thanks Conan; but why haphazard?
There are basically two types of test: a very short
one of one hundred turns, just to check the complete
chain, and a more realistic one of 100,000 turns,
like production. I think there should be only two
values for credit. Eric.
ID: 26420
Werinbert
Joined: 12 May 13
Posts: 8
Credit: 1,001,060
RAC: 0
Message 26422 - Posted: 11 May 2014, 9:18:02 UTC - in response to Message 26420.  

I think Conan is referring to the case where multiple WUs all have about the same processing time but the granted credits are very different.

In my case (times in seconds):
Run time   CPU time   Credit
2,075.13   1,827.55     7.22
2,034.96   1,892.15    50.31
2,141.98   1,980.95    26.91
2,114.57   1,974.30    34.65

Conan's results appear to be similar.
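
To put a number on the spread, here is a small sketch that converts the four results listed above into credits per CPU-hour. The normalisation to CPU-hours is only an illustrative choice, not how the project computes credit.

```python
# (run time in s, CPU time in s, granted credit), copied from the figures above
results = [
    (2075.13, 1827.55, 7.22),
    (2034.96, 1892.15, 50.31),
    (2141.98, 1980.95, 26.91),
    (2114.57, 1974.30, 34.65),
]

# Credits granted per hour of CPU time for each result.
rates = [credit / cpu * 3600 for _run, cpu, credit in results]

for (_run, cpu, credit), rate in zip(results, rates):
    print(f"CPU {cpu:8.2f} s  credit {credit:6.2f}  {rate:6.2f} credits per CPU-hour")

# Ratio between the best and worst rewarded result.
spread = max(rates) / min(rates)
print(f"highest / lowest rate: {spread:.1f}x")
```

The rates range from about 14 to about 96 credits per CPU-hour, close to a sevenfold difference for tasks with nearly identical CPU time.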
ID: 26422
Ray Murray
Volunteer moderator
Joined: 29 Sep 04
Posts: 281
Credit: 11,859,285
RAC: 1
Message 26423 - Posted: 11 May 2014, 17:45:45 UTC

My 2 most random credit awards:
36517150: run time 3,992.10, CPU time 2,346.44, 10.17 credits
36516858: run time 3,070.81, CPU time 1,828.62, 100.28 credits
ID: 26423
Conan
Joined: 6 Jul 06
Posts: 108
Credit: 661,871
RAC: 196
Message 26427 - Posted: 12 May 2014, 12:00:33 UTC

Yes Eric,

Just what Werinbert and Ray Murray have shown.

There may be two basic work unit types that should give two credit values based on their run/CPU times, but this is not happening very consistently.

I have examples of work units that each have run for around 1470 seconds but credit can vary from just over 1 credit to just over 60 credits.

The overall average is still low but tolerable.

The greatest differences seem to be with the longer running work units.

As these are test work units I am not too worried at the moment, but once production is fully working it will become a bit annoying, I would think.

Thanks for all your work Eric, you do a sterling job for us volunteers and it is appreciated.

Conan
ID: 26427
jay
Joined: 10 Aug 07
Posts: 54
Credit: 813,704
RAC: 116
Message 26428 - Posted: 12 May 2014, 14:11:30 UTC
Last modified: 12 May 2014, 14:12:07 UTC

Greetings!!

The server status [edit] says there is work, but I can't get any on Linux or Windows:
a request goes out and the response comes back "No tasks sent".

I turned on logging and got:

Mon 12 May 2014 10:02:32 AM EDT	LHC@home 1.0	update requested by user
Mon 12 May 2014 10:02:35 AM EDT	LHC@home 1.0	Sending scheduler request: Requested by user.
Mon 12 May 2014 10:02:35 AM EDT	LHC@home 1.0	Requesting new tasks
Mon 12 May 2014 10:02:35 AM EDT		[http_debug] HTTP_OP::init_post(): http://lhcathomeclassic.cern.ch/sixtrack_cgi/cgi
Mon 12 May 2014 10:02:35 AM EDT		[http_debug] [ID#1] Info:  About to connect() to lhcathomeclassic.cern.ch port 80 (#0)
Mon 12 May 2014 10:02:35 AM EDT		[http_debug] [ID#1] Info:    Trying 128.142.138.22... 
Mon 12 May 2014 10:02:36 AM EDT		[http_debug] [ID#1] Info:  Connected to lhcathomeclassic.cern.ch (128.142.138.22) port 80 (#0)
Mon 12 May 2014 10:02:36 AM EDT		[http_debug] [ID#1] Sent header to server: POST /sixtrack_cgi/cgi HTTP/1.1
Mon 12 May 2014 10:02:36 AM EDT		[http_debug] [ID#1] Sent header to server: User-Agent: BOINC client (i686-pc-linux-gnu 6.10.58)
Mon 12 May 2014 10:02:36 AM EDT		[http_debug] [ID#1] Sent header to server: Host: lhcathomeclassic.cern.ch
Mon 12 May 2014 10:02:36 AM EDT		[http_debug] [ID#1] Sent header to server: Accept: */*
Mon 12 May 2014 10:02:36 AM EDT		[http_debug] [ID#1] Sent header to server: Accept-Encoding: deflate, gzip
Mon 12 May 2014 10:02:36 AM EDT		[http_debug] [ID#1] Sent header to server: Content-Type: application/x-www-form-urlencoded
Mon 12 May 2014 10:02:36 AM EDT		[http_debug] [ID#1] Sent header to server: Content-Length: 11807
Mon 12 May 2014 10:02:36 AM EDT		[http_debug] [ID#1] Sent header to server: Expect: 100-continue
Mon 12 May 2014 10:02:36 AM EDT		[http_debug] [ID#1] Sent header to server: 
Mon 12 May 2014 10:02:36 AM EDT		[http_debug] [ID#1] Received header from server: HTTP/1.1 100 Continue
Mon 12 May 2014 10:02:36 AM EDT		[http_debug] [ID#1] Received header from server: HTTP/1.1 200 OK
Mon 12 May 2014 10:02:36 AM EDT		[http_debug] [ID#1] Received header from server: Date: Mon, 12 May 2014 14:02:36 GMT
Mon 12 May 2014 10:02:36 AM EDT		[http_debug] [ID#1] Received header from server: Server: Apache
Mon 12 May 2014 10:02:36 AM EDT		[http_debug] [ID#1] Received header from server: Connection: close
Mon 12 May 2014 10:02:36 AM EDT		[http_debug] [ID#1] Received header from server: Transfer-Encoding: chunked
Mon 12 May 2014 10:02:36 AM EDT		[http_debug] [ID#1] Received header from server: Content-Type: text/xml
Mon 12 May 2014 10:02:36 AM EDT		[http_debug] [ID#1] Received header from server: 
Mon 12 May 2014 10:02:36 AM EDT		[http_debug] [ID#1] Info:  Expire cleared
Mon 12 May 2014 10:02:36 AM EDT		[http_debug] [ID#1] Info:  Closing connection #0
Mon 12 May 2014 10:02:36 AM EDT	LHC@home 1.0	Scheduler request completed: got 0 new tasks
Mon 12 May 2014 10:02:36 AM EDT	LHC@home 1.0	Message from server: No tasks sent
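
A minimal sketch for reading a log like this: it keeps the project-level messages and skips the `[http_debug]` lines. It assumes the fields are tab-separated as timestamp, project, message, as they appear above; the helper `project_messages` is hypothetical and the sample is a four-line excerpt of this log.

```python
LOG = """\
Mon 12 May 2014 10:02:35 AM EDT\tLHC@home 1.0\tRequesting new tasks
Mon 12 May 2014 10:02:36 AM EDT\t\t[http_debug] [ID#1] Received header from server: HTTP/1.1 200 OK
Mon 12 May 2014 10:02:36 AM EDT\tLHC@home 1.0\tScheduler request completed: got 0 new tasks
Mon 12 May 2014 10:02:36 AM EDT\tLHC@home 1.0\tMessage from server: No tasks sent
"""


def project_messages(log_text):
    """Return (project, message) pairs, skipping [http_debug] lines."""
    messages = []
    for line in log_text.splitlines():
        fields = line.split("\t")
        if len(fields) != 3:
            continue  # not in the assumed timestamp/project/message layout
        _timestamp, project, message = fields
        if not project or message.startswith("[http_debug]"):
            continue  # debug lines carry an empty project field
        messages.append((project, message))
    return messages


messages = project_messages(LOG)
http_ok = "HTTP/1.1 200 OK" in LOG

for project, message in messages:
    print(f"{project}: {message}")
print("HTTP exchange succeeded:", http_ok)
```

Read this way, the log shows that the HTTP exchange itself completed with 200 OK, so "No tasks sent" is the scheduler's reply rather than a connection failure.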



Thanks,
Jay
ID: 26428
Robert Pick
Joined: 1 Dec 05
Posts: 62
Credit: 11,398,274
RAC: 261
Message 26429 - Posted: 12 May 2014, 15:52:13 UTC - in response to Message 26428.  

Same here! Pick
ID: 26429
Tom95134
Joined: 4 May 07
Posts: 250
Credit: 826,541
RAC: 0
Message 26430 - Posted: 12 May 2014, 16:32:46 UTC

Same "No tasks sent" here. Windows 7 x64, INTEL i7

I am also running Seti, T4T, and Enigma but those Projects are all Suspended in an effort to get LHC Tasks.

ID: 26430
yo2013
Joined: 16 Oct 13
Posts: 59
Credit: 342,408
RAC: 0
Message 26431 - Posted: 12 May 2014, 20:22:20 UTC - in response to Message 26430.  

I don't get tasks either, and I'm running Linux.
ID: 26431
alvin
Joined: 12 Mar 12
Posts: 128
Credit: 20,013,377
RAC: 0
Message 26434 - Posted: 13 May 2014, 3:04:33 UTC

Have plenty (126) of these for SixTrack v451.07 (pni)

12 May 2014, 7:11:46 UTC Completed, validation inconclusive

ID: 26434
Ano
Joined: 29 Nov 09
Posts: 42
Credit: 229,229
RAC: 0
Message 26436 - Posted: 13 May 2014, 12:06:13 UTC

"No tasks sent" here too.
Windows XP SP3 if that's useful.
ID: 26436
Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 12 Jul 11
Posts: 857
Credit: 1,619,050
RAC: 0
Message 26439 - Posted: 14 May 2014, 6:01:02 UTC

Right; work coming today.
When I calm down I will send a long story to the Cafe LHC.

I also understand there is a problem with credits....
Will look at that next if production is really OK.

Eric.

ID: 26439
yo2013
Joined: 16 Oct 13
Posts: 59
Credit: 342,408
RAC: 0
Message 26444 - Posted: 14 May 2014, 18:51:11 UTC - in response to Message 26439.  

Got new tasks :)
ID: 26444


©2024 CERN