Message boards : Theory Application : New version 263.95

Laurence
Project administrator
Project developer
Joined: 20 Jun 14
Posts: 337
Credit: 237,918
RAC: 0
Message 39245 - Posted: 2 Jul 2019, 21:57:07 UTC

This new version updates the CVMFS cache and will hopefully solve the x509 errors some have been seeing.
ID: 39245
Erich56
Joined: 18 Dec 15
Posts: 1307
Credit: 23,610,386
RAC: 8,734
Message 39252 - Posted: 3 Jul 2019, 18:53:45 UTC

Well, the new version seems to create new problems, though.

See here: https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5075
ID: 39252
Erich56
Joined: 18 Dec 15
Posts: 1307
Credit: 23,610,386
RAC: 8,734
Message 39255 - Posted: 4 Jul 2019, 5:31:33 UTC

Okay, I solved the problem on the one machine, which obviously had a VB version that was too old. I installed VB 5.2.30, and now the tasks start fine.

However, on the other machine which has a newer VB version and started picking up Theory 263.95 yesterday, the first task failed after 11 hrs. 52 minutes:
197 (0x000000C5) EXIT_TIME_LIMIT_EXCEEDED

https://lhcathome.cern.ch/lhcathome/result.php?resultid=236590976

What kind of problem is this now?

Before, we had the DISK_LIMIT_EXCEEDED problem, now we have this :-)
ID: 39255
MAGIC Quantum Mechanic
Joined: 24 Oct 04
Posts: 965
Credit: 40,925,070
RAC: 2,593
Message 39256 - Posted: 4 Jul 2019, 7:09:56 UTC

So far I have 5 tasks of this version finished and ALL are Valid.

This is with Windows 7 and 10 and with 1-, 2-, and 4-core tasks. I should have many more getting close to completion, so I hope to see them all Valid.

(Glad that helped with yours, Erich.)
ID: 39256
MAGIC Quantum Mechanic
Joined: 24 Oct 04
Posts: 965
Credit: 40,925,070
RAC: 2,593
Message 39257 - Posted: 4 Jul 2019, 7:16:19 UTC - in response to Message 39255.  

Okay, I solved the problem on the one machine, which obviously had a VB version that was too old. I installed VB 5.2.30, and now the tasks start fine.

However, on the other machine which has a newer VB version and started picking up Theory 263.95 yesterday, the first task failed after 11 hrs. 52 minutes:
197 (0x000000C5) EXIT_TIME_LIMIT_EXCEEDED

https://lhcathome.cern.ch/lhcathome/result.php?resultid=236590976

What kind of problem is this now?

Before, we had the DISK_LIMIT_EXCEEDED problem, now we have this :-)


Well, you do have VB version 5.2.8 on there, so maybe try a clean install of 5.2.30 on that machine too.

https://www.virtualbox.org/wiki/Download_Old_Builds_5_2
ID: 39257
Ray Murray
Volunteer moderator
Joined: 29 Sep 04
Posts: 252
Credit: 11,225,577
RAC: 2
Message 39258 - Posted: 4 Jul 2019, 7:21:33 UTC
Last modified: 4 Jul 2019, 7:24:51 UTC

Don't know why Erich's one exceeded time limit as it didn't even get as far as the 12hr check.
First 2 of mine completed and validated successfully here and here

Win10, Boinc 7.14.2, VBox 6.0.8
ID: 39258
Erich56
Joined: 18 Dec 15
Posts: 1307
Credit: 23,610,386
RAC: 8,734
Message 39259 - Posted: 4 Jul 2019, 7:27:03 UTC - in response to Message 39258.  

Don't know why Erich's one exceeded time limit as it didn't even get as far as the 12hr check.

Yes, this is really strange.
There are several others running right now, one close to 9 hours, the other close to 6 hours. So I'll wait and see what happens.
Maybe the VB version does indeed need to be updated.
ID: 39259
Erich56
Joined: 18 Dec 15
Posts: 1307
Credit: 23,610,386
RAC: 8,734
Message 39261 - Posted: 4 Jul 2019, 7:34:41 UTC - in response to Message 39257.  

Well, you do have VB version 5.2.8 on there, so maybe try a clean install of 5.2.30 on that machine too.

I just noticed that on another of my machines, a 263.95 task finished as valid with VB version 5.2.8 (the version that currently comes bundled with BOINC).
See here: https://lhcathome.cern.ch/lhcathome/result.php?resultid=236579157

So obviously, this version is NOT the reason for the EXIT_TIME_LIMIT_EXCEEDED error.
ID: 39261
Crystal Pellet
Volunteer moderator
Volunteer tester
Joined: 14 Jan 10
Posts: 979
Credit: 6,382,221
RAC: 437
Message 39263 - Posted: 4 Jul 2019, 8:40:24 UTC - in response to Message 39261.  

So obviously, this version is NOT the reason for the EXIT_TIME_LIMIT_EXCEEDED error.

The reason is in the result log (and in your BOINC Manager event log).

exceeded elapsed time limit 42731.11 (2000000.00G/128.20G)</message> ===> 42731 seconds is 11.87 hours.

The problem is that BOINC somehow had your machine recorded with a measured floating-point speed of 128.2G.

Your host currently reports 3.9 billion ops/sec. At that actual speed, the task would be allowed to run for more than 142 hours.
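
The arithmetic behind these figures can be checked with a short Python sketch. It assumes the elapsed time limit is simply the first figure in the parentheses of the log line (named `fpops_bound` here, a name chosen for illustration) divided by the floating-point speed. Dividing the two figures quoted in the log line gives roughly 15,600 seconds (about 4.33 hours) rather than 42,731 seconds, so the quoted line may combine values from different tasks; that is only an observation from the numbers, not something the thread confirms.

```python
GIGA = 1e9


def elapsed_time_limit(fpops_bound: float, flops: float) -> float:
    """Return the allowed elapsed time in seconds, assuming limit = fpops_bound / flops."""
    if flops <= 0:
        raise ValueError("flops must be positive")
    return fpops_bound / flops


# "2000000.00G" from the quoted log line (the variable name is an assumption).
FPOPS_BOUND = 2_000_000 * GIGA

# The limit stated in the log line, converted from seconds to hours.
reported_limit_hours = 42731.11 / 3600

# The limit at the 3.9 billion ops/sec the host reports now.
actual_limit_hours = elapsed_time_limit(FPOPS_BOUND, 3.9 * GIGA) / 3600

# The limit implied by the two figures inside the parentheses of the log line.
quoted_ratio_hours = elapsed_time_limit(FPOPS_BOUND, 128.2 * GIGA) / 3600

print(f"reported limit:       {reported_limit_hours:.2f} h")
print(f"limit at 3.9 Gops/s:  {actual_limit_hours:.2f} h")
print(f"limit at 128.2G:      {quoted_ratio_hours:.2f} h")
```

The point of the sketch is only that an overestimated speed shrinks the allowed runtime in inverse proportion, which is why a task can be cut off long before it would normally finish.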
ID: 39263
Erich56
Joined: 18 Dec 15
Posts: 1307
Credit: 23,610,386
RAC: 8,734
Message 39265 - Posted: 4 Jul 2019, 9:13:08 UTC - in response to Message 39263.  

thanks, Crystal Pellet, for the explanation.
So the question is: what can be done in order to get this problem resolved?
ID: 39265
Crystal Pellet
Volunteer moderator
Volunteer tester
Joined: 14 Jan 10
Posts: 979
Credit: 6,382,221
RAC: 437
Message 39266 - Posted: 4 Jul 2019, 9:34:08 UTC - in response to Message 39265.  

So the question is: what can be done in order to get this problem resolved?

It's your machine that reports being such a fast beast, so tame that beast ;)
ID: 39266
Erich56
Joined: 18 Dec 15
Posts: 1307
Credit: 23,610,386
RAC: 8,734
Message 39267 - Posted: 4 Jul 2019, 9:38:00 UTC

So, here are the next 2 that failed, this time after 4 hrs 20 mins:

https://lhcathome.cern.ch/lhcathome/result.php?resultid=236600271
https://lhcathome.cern.ch/lhcathome/result.php?resultid=236602102

which is rather annoying.

My opinion is still that the problem must be with the new version 263.95, since all earlier tasks under version 263.90 worked well.
So obviously version 263.95 needs to be redesigned.
ID: 39267
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Joined: 15 Jun 08
Posts: 1520
Credit: 85,958,362
RAC: 71,532
Message 39268 - Posted: 4 Jul 2019, 9:38:01 UTC - in response to Message 39265.  

Did you recently play around with the #core setting (either on the web page or via app_config.xml)?
If yes, this could result in a very high peak FLOPS value that is stored in the computer record.
If this value is not too high you simply get more credits per task.

Another reason could be a couple of tasks with very short runtimes that validated somehow.

BTW:
The stored peak FLOPS is different from what you see at the computer page.
It's a calculated value based on your runtimes/CPU-times.
ID: 39268
Erich56
Joined: 18 Dec 15
Posts: 1307
Credit: 23,610,386
RAC: 8,734
Message 39269 - Posted: 4 Jul 2019, 9:42:14 UTC - in response to Message 39268.  

Did you recently play around with the #core setting (either on the web page or via app_config.xml)?

No, I did not.
ID: 39269
Crystal Pellet
Volunteer moderator
Volunteer tester
Joined: 14 Jan 10
Posts: 979
Credit: 6,382,221
RAC: 437
Message 39270 - Posted: 4 Jul 2019, 9:55:15 UTC - in response to Message 39268.  
Last modified: 4 Jul 2019, 9:56:04 UTC

If yes, this could result in a very high peak FLOPS value that is stored in the computer record.

Good point!

In Erich's result, 1830MB of RAM is reserved (Setting Memory Size for VM. (1830MB)), which implies that 12 cores seem to be reserved.

Erich, what happens when you set the # of cores to 1 in your preferences?
ID: 39270
Erich56
Joined: 18 Dec 15
Posts: 1307
Credit: 23,610,386
RAC: 8,734
Message 39271 - Posted: 4 Jul 2019, 10:47:20 UTC - in response to Message 39270.  
Last modified: 4 Jul 2019, 10:57:21 UTC

In the web settings, I have number of tasks and number of cores set to "unlimited".

This was the advice we got here in the forum several months ago, after problems suddenly occurred with the number of tasks that could be downloaded at a time. From what I remember, with the host in question here I could download only 2 tasks at a time, although my CPU has 6+6HT cores (and I normally crunch between 5 and 7 tasks concurrently).
To limit the number of simultaneously running tasks and to limit the number of cores to 1 per task, I am using an app_config.xml (which was also published here at that time).

I have the same settings on another host, on which, as mentioned above, the tasks from version 263.95 finish successfully.
And, as also already stated, this setting worked well with version 263.90 until 2 days ago.

Nevertheless I will try CP's suggestion to set # cores to 1. So let's see what happens.
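
For readers unfamiliar with app_config.xml, the following Python sketch generates a file of the kind described above. It is not the file published in the forum, which is not shown in this thread. The app name "Theory", the limit of 6 concurrent tasks, and the value of 1.0 CPUs per task are illustrative assumptions, and the helper `build_app_config` is hypothetical. The element names (`app_config`, `app`, `name`, `max_concurrent`, `app_version`, `app_name`, `plan_class`, `avg_ncpus`) are the ones BOINC's client configuration uses.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def build_app_config(app_name: str, max_concurrent: int, ncpus_per_task: float,
                     plan_class: Optional[str] = None) -> str:
    """Build an app_config.xml document as a string (hypothetical helper)."""
    root = ET.Element("app_config")

    # <app>: caps how many tasks of this application run at the same time.
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    ET.SubElement(app, "max_concurrent").text = str(max_concurrent)

    # <app_version>: sets how many CPUs each task is assumed to use.
    version = ET.SubElement(root, "app_version")
    ET.SubElement(version, "app_name").text = app_name
    if plan_class is not None:
        # Optional; the correct plan class depends on the project's app version.
        ET.SubElement(version, "plan_class").text = plan_class
    ET.SubElement(version, "avg_ncpus").text = str(ncpus_per_task)

    if hasattr(ET, "indent"):  # pretty-printing is available from Python 3.9
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


# Assumed values for illustration only: app name, 6 tasks at once, 1 core each.
config_xml = build_app_config("Theory", max_concurrent=6, ncpus_per_task=1.0)
print(config_xml)
```

The output would be saved as app_config.xml in the project's directory under the BOINC data folder; the exact app name and plan class should be taken from the project's own application list rather than from this sketch.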
ID: 39271
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Joined: 15 Jun 08
Posts: 1520
Credit: 85,958,362
RAC: 71,532
Message 39272 - Posted: 4 Jul 2019, 11:15:36 UTC - in response to Message 39271.  

... So let's see what happens.

At least ATLAS will also limit the # of tasks you can download.
As mentioned a couple of times this is caused by the fact that ATLAS doesn't correctly respect the #cores parameter (as it was originally introduced).

This affects computers with lots of cores more than computers with fewer cores.
ID: 39272
Erich56
Joined: 18 Dec 15
Posts: 1307
Credit: 23,610,386
RAC: 8,734
Message 39273 - Posted: 4 Jul 2019, 11:19:30 UTC - in response to Message 39272.  

At least ATLAS will also limit the # of tasks you can download.

Yes, I know :-)
ID: 39273
Erich56
Joined: 18 Dec 15
Posts: 1307
Credit: 23,610,386
RAC: 8,734
Message 39274 - Posted: 4 Jul 2019, 13:15:03 UTC - in response to Message 39271.  

Nevertheless I will try CP's suggestion to set # cores to 1. So let's see what happens.

After I saw that all currently downloaded tasks failed after a certain time, I aborted the remaining ones, changed the # cores to "1" in the web preferences, and downloaded new tasks.
So I'm curious what I'll see.
ID: 39274
bronco
Joined: 13 Apr 18
Posts: 443
Credit: 8,438,885
RAC: 0
Message 39275 - Posted: 4 Jul 2019, 15:29:46 UTC - in response to Message 39272.  

... So let's see what happens.

At least ATLAS will also limit the # of tasks you can download.
As mentioned a couple of times this is caused by the fact that ATLAS doesn't correctly respect the #cores parameter (as it was originally introduced).


This is an ongoing problem that can be compensated for but it's confusing for a lot of volunteers. It really needs to be fixed.
Laurence, while we have your attention?
ID: 39275