197 (0x000000C5) EXIT_TIME_LIMIT_EXCEEDED
Author | Message |
---|---|
Joined: 14 Jan 10 Posts: 1280 Credit: 8,493,530 RAC: 2,126 |
A long-running Pythia8 job was killed. It had done maybe 35000 of its 81000 events :-( Job: ===> [runRivet] Mon May 20 16:23:17 CEST 2019 [boinc pp jets 13000 150,-,1860 - pythia8 8.235 cr1 81000 53] The estimated <rsc_fpops_bound> was way too low for this task, resulting in: LHC@home 21 May 18:21:21 Aborting task Theory_33081_1558334952.886734_0: exceeded elapsed time limit 69714.17 (2000000.00G/28.69G) CPU time used: 16 hours 58 min 48 sec. The Pythia8 job was the 4th job of that task. |
Joined: 24 Oct 04 Posts: 1127 Credit: 49,750,905 RAC: 9,376 |
https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=4548#33456 Looks like that happened with SixTrack before, as you know. But I did just run a benchmark on this one anyway, just in case (and looked around here too: https://boinc.berkeley.edu/trac/search?q=rsc_fpops_bound ). |
Joined: 13 Apr 18 Posts: 443 Credit: 8,438,885 RAC: 0 |
I got a number of these too. They occurred on tasks that I had pushed beyond the 18 hour limit. It's interesting that I did NOT get the error on numerous other tasks I pushed well beyond 18 hours before the pentathlon. This started happening around the time of the "adjustments" that occurred to accommodate the pentathlon. I'm guessing a config file got accidentally altered during those "adjustments" and now the <rsc_fpops_bound> is far too low. |
Joined: 18 Dec 15 Posts: 1688 Credit: 103,851,570 RAC: 121,394 |
With the change from v263.95 to v263.97, I am getting the 197 (0x000000C5) EXIT_TIME_LIMIT_EXCEEDED error again: https://lhcathome.cern.ch/lhcathome/result.php?resultid=237173131 Why is that? I have made no changes to my settings. |
Joined: 29 Sep 04 Posts: 281 Credit: 11,859,285 RAC: 0 |
One for me too: https://lhcathome.cern.ch/lhcathome/result.php?resultid=237142844 which I had extended to let a healthy but long Sherpa job run. As usual, it had started just before the 12 hr limit with no chance of finishing before 18 hrs, or even within my standard 24 hr extension. Actually, that was a 263.95. |
Joined: 20 Jun 14 Posts: 374 Credit: 238,712 RAC: 0 |
I have pushed out a new version (263.98) which doubles the lifetime of the VM. This should allow more time for the last job to run. |
Joined: 18 Dec 15 Posts: 1688 Credit: 103,851,570 RAC: 121,394 |
"I have pushed out a new version (263.98) which doubles the lifetime of the VM. This should allow more time for the last job to run." My host is still downloading v263.97 tasks. Only those, no v263.98. |
Joined: 14 Jan 10 Posts: 1280 Credit: 8,493,530 RAC: 2,126 |
"I have pushed out a new version (263.98) which doubles the lifetime of the VM. This should allow more time for the last job to run." You extended the lifetime (job_duration) to 129600 seconds = 36 hours. That's not the problem! The problem is <rsc_fpops_bound>2000000000000000.000000</rsc_fpops_bound>. Could you increase that value tenfold? |
Joined: 18 Dec 15 Posts: 1688 Credit: 103,851,570 RAC: 121,394 |
A minute ago, my host downloaded another v263.97 task. What happened to v263.98? Has it been withdrawn? |
Joined: 27 Sep 08 Posts: 807 Credit: 652,436,425 RAC: 280,287 |
I have had 100% failure with Theory on these tasks for about 1 week, even after going back to the .95. |
Joined: 20 Jun 14 Posts: 374 Credit: 238,712 RAC: 0 |
"I have pushed out a new version (263.98) which doubles the lifetime of the VM. This should allow more time for the last job to run." "My host is still downloading v263.97 tasks. Only those, no v263.98." I have restarted the server. Please try again. |
Joined: 20 Jun 14 Posts: 374 Credit: 238,712 RAC: 0 |
"I have pushed out a new version (263.98) which doubles the lifetime of the VM. This should allow more time for the last job to run." I have just done this. Thanks. |
Joined: 20 Jun 14 Posts: 374 Credit: 238,712 RAC: 0 |
"I have 100% failure with Theory on these tasks, for about 1 week. even going back to the .95" Do you have any examples? I see some aborted tasks, but they do not give any details. |
Joined: 27 Sep 08 Posts: 807 Credit: 652,436,425 RAC: 280,287 |
The ones I didn't abort were all 197 (0x000000C5) EXIT_TIME_LIMIT_EXCEEDED https://lhcathome.cern.ch/lhcathome/result.php?resultid=237228447 https://lhcathome.cern.ch/lhcathome/result.php?resultid=237245874 https://lhcathome.cern.ch/lhcathome/result.php?resultid=237243675 https://lhcathome.cern.ch/lhcathome/result.php?resultid=237243861 In general things are unstable in the last week or so. I have 1 good one this morning |
Joined: 18 Dec 15 Posts: 1688 Credit: 103,851,570 RAC: 121,394 |
I am still getting the EXIT_TIME_LIMIT_EXCEEDED error, after almost exactly 18 hrs: https://lhcathome.cern.ch/lhcathome/result.php?resultid=237251435 https://lhcathome.cern.ch/lhcathome/result.php?resultid=237243711 how come? |
Joined: 27 Sep 08 Posts: 807 Credit: 652,436,425 RAC: 280,287 |
Looks like things are back on track for me. |
Joined: 24 Oct 04 Posts: 1127 Credit: 49,750,905 RAC: 9,376 |
https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=119378610 20 hours later....... Grrrrrrrr #801 |
Joined: 18 Dec 15 Posts: 1688 Credit: 103,851,570 RAC: 121,394 |
https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=119378610 Although it says "194 (0x000000C2) EXIT_ABORTED_BY_CLIENT": error 194, NOT error 197. Whatever this now means??? |
Joined: 14 Jan 10 Posts: 1280 Credit: 8,493,530 RAC: 2,126 |
https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=119378610 2019-07-25 15:31:52 (9028): VM Heartbeat file specified, but missing heartbeat. That sometimes happens :( |
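
For anyone following the arithmetic in this thread: the abort message shows the elapsed time limit as a ratio, 2000000.00G/28.69G, i.e. <rsc_fpops_bound> divided by the speed figure the client uses for the host. The short Python sketch below redoes that division using only numbers quoted above. The function and variable names are made up for illustration, and the exact formula inside the BOINC client is an assumption inferred from the log line, not taken from its source. The result comes out a few seconds below the logged 69714.17, presumably because the log rounds the speed to 28.69G.

```python
# Figures quoted in this thread; names and structure are illustrative only.
RSC_FPOPS_BOUND = 2_000_000_000_000_000.0  # <rsc_fpops_bound> from the workunit
HOST_FLOPS = 28.69e9                       # "28.69G" from the abort message (rounded)
VM_LIFETIME_S = 129_600                    # job_duration after v263.98 (36 hours)


def elapsed_limit_seconds(fpops_bound: float, host_flops: float) -> float:
    """Elapsed-time limit implied by the ratio shown in the abort message.

    Assumption: limit = fpops_bound / host_flops, as the log line suggests.
    """
    return fpops_bound / host_flops


current_limit = elapsed_limit_seconds(RSC_FPOPS_BOUND, HOST_FLOPS)
tenfold_limit = elapsed_limit_seconds(10 * RSC_FPOPS_BOUND, HOST_FLOPS)

for label, limit in (("current bound", current_limit),
                     ("tenfold bound", tenfold_limit)):
    covers_vm = limit >= VM_LIFETIME_S
    print(f"{label}: {limit:.0f} s = {limit / 3600:.1f} h, "
          f"covers the 36 h VM lifetime: {covers_vm}")
```

With the current bound the limit on this host is roughly 19.4 hours, well short of the 36 hour VM lifetime, which is why doubling job_duration alone does not prevent error 197. A tenfold bound pushes the limit far past the VM lifetime on a host of this speed.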
©2024 CERN