Message boards : Sixtrack Application : EXIT_DISK_LIMIT_EXCEEDED

mmonnin

Joined: 22 Mar 17
Posts: 44
Credit: 3,801,950
RAC: 1
Message 35102 - Posted: 28 Apr 2018, 2:03:00 UTC

https://lhcathome.cern.ch/lhcathome/result.php?resultid=188317371

I'm getting some of these with the tasks released today. They error out at around 20 minutes for everyone. I have 15 completed (13 of those very short) and 9 errors, so a poor ratio, and even worse when considering the time lost.
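
To put a number on that ratio, here is a minimal sketch of the arithmetic using only the counts quoted above. The function name `failure_ratio` is just for illustration.

```python
def failure_ratio(completed, errors):
    """Fraction of finished tasks that ended in an error."""
    total = completed + errors
    return errors / total if total else 0.0

# Counts from the post above: 15 completed, 9 errors.
print(f"{failure_ratio(15, 9):.1%}")  # prints 37.5%
```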
ID: 35102
Erich56

Joined: 18 Dec 15
Posts: 1144
Credit: 21,928,482
RAC: 12,447
Message 35114 - Posted: 29 Apr 2018, 17:00:25 UTC

The same happened here with 2 tasks that I started this afternoon:

https://lhcathome.cern.ch/lhcathome/result.php?resultid=188577009 - task name: w-c4_job.B1topenergy...
and
https://lhcathome.cern.ch/lhcathome/result.php?resultid=188576839 - task name: w-c6_job.B1topenergy...

Both tasks failed after about 51 minutes. What's going on there?
ID: 35114
Ano

Joined: 29 Nov 09
Posts: 42
Credit: 229,229
RAC: 0
Message 35117 - Posted: 30 Apr 2018, 6:18:28 UTC

Got it too:
https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=91831921
https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=91967947

The irony is that I was coming here to check whether a task I was running (also a B1topenergy something) had errored for other users, because it was sitting at 100% for a long time and I was expecting it to fail, but by the time I found that task in my list, it had completed properly.
So I guess not all tasks of a specific series error out, and the good news for us users is that everybody errors out the same way when there is an error, so it's not coming from our side.
ID: 35117
maeax

Joined: 2 May 07
Posts: 776
Credit: 28,087,580
RAC: 29,975
Message 35118 - Posted: 30 Apr 2018, 8:13:41 UTC

There is an old thread with the same problem:
https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=3944
ID: 35118
glennpat

Joined: 16 Feb 07
Posts: 4
Credit: 6,549,932
RAC: 386
Message 35143 - Posted: 3 May 2018, 3:57:14 UTC - in response to Message 35118.  

Has anyone been able to stop getting these errors? I read the old thread and I don't have the vm_image.vdi file. I am running Linux. I checked all the slots and didn't see any really large files. I am getting these errors on several computers. If there is some file I need to delete, I need to know what it is.
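
For anyone wanting to check their slots the same way, here is a minimal sketch that lists the largest files under a BOINC `slots` folder. The path `/var/lib/boinc-client/slots` is an assumption (a common location on some Linux installs) and may differ on your system; the function name `largest_files` is just for illustration.

```python
import heapq
import os

def largest_files(root, top_n=10):
    """Return up to top_n (size_in_bytes, path) pairs under root, largest first."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                sizes.append((os.path.getsize(path), path))
            except OSError:
                # File vanished or is unreadable while scanning; skip it.
                continue
    return heapq.nlargest(top_n, sizes)

if __name__ == "__main__":
    # Assumed location; adjust to your own BOINC data directory.
    slots_dir = "/var/lib/boinc-client/slots"
    for size, path in largest_files(slots_dir):
        print(f"{size / 1e6:10.1f} MB  {path}")
```

Note that a file can grow and be cleaned up while a task runs, so an empty result after a task has already failed does not rule out a large file as the cause.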
ID: 35143
Lorenz Millinger

Joined: 15 Jul 09
Posts: 3
Credit: 6,352,798
RAC: 3
Message 35144 - Posted: 3 May 2018, 5:01:55 UTC

I have 345 errored WUs! A failure ratio of 40%!!! Is anyone working to prevent bad WUs from being sent? We are wasting valuable computing power for nothing.
ID: 35144
maeax

Joined: 2 May 07
Posts: 776
Credit: 28,087,580
RAC: 29,975
Message 35146 - Posted: 3 May 2018, 6:03:40 UTC

So we need some information from the SixTrack team.
ID: 35146
Toby Broom
Volunteer moderator

Joined: 27 Sep 08
Posts: 564
Credit: 352,086,107
RAC: 178,158
Message 35147 - Posted: 3 May 2018, 6:37:08 UTC

The team is working on it as we speak.
ID: 35147
Alessio Mereghetti
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Joined: 29 Feb 16
Posts: 148
Credit: 1,953,804
RAC: 4,802
Message 35148 - Posted: 3 May 2018, 7:35:34 UTC - in response to Message 35147.  

Dear all,
sorry for the late reply. Apparently the user requested that SixTrack dump detailed information about particle dynamics, which pushed disk usage beyond the requested limit.
We are deleting the WUs - most probably they will come again, this time with more appropriate parameters.
Happy crunching to everyone and sorry for the disturbance.
Cheers,
A.
ID: 35148
PDW

Joined: 7 Aug 14
Posts: 14
Credit: 7,354,615
RAC: 8,130
Message 40045 - Posted: 29 Sep 2019, 12:47:27 UTC - in response to Message 35148.  

ID: 40045
Alessio Mereghetti
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Joined: 29 Feb 16
Posts: 148
Credit: 1,953,804
RAC: 4,802
Message 40065 - Posted: 2 Oct 2019, 8:31:59 UTC - in response to Message 40045.  

Thanks, PDW, for spotting this problem again.

The failures are due to a (log) file growing beyond the DISK request.
The user was not aware that he should have increased the request when submitting extremely long jobs (1e7 turns, when we typically simulate a factor of 10 fewer).
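
To illustrate why a tenfold increase in turns can break a disk request that was fine before, here is a minimal sketch. It assumes the log grows linearly with the number of turns, and the bytes-per-turn and disk-request values are made-up numbers for illustration, not SixTrack's actual figures. In BOINC the per-workunit disk limit is, as far as I know, set through the `rsc_disk_bound` field; the sketch only mimics that check.

```python
def projected_log_bytes(bytes_per_turn, turns):
    """Projected log size, assuming linear growth with the number of turns."""
    return bytes_per_turn * turns

def exceeds_disk_bound(bytes_per_turn, turns, disk_bound_bytes):
    """True if the projected log alone would exceed the disk request."""
    return projected_log_bytes(bytes_per_turn, turns) > disk_bound_bytes

# Hypothetical numbers, for illustration only.
BYTES_PER_TURN = 50        # assumed log output per turn
DISK_BOUND = 200_000_000   # assumed disk request (200 MB)

for turns in (1_000_000, 10_000_000):
    size_mb = projected_log_bytes(BYTES_PER_TURN, turns) / 1e6
    verdict = "exceeds" if exceeds_disk_bound(BYTES_PER_TURN, turns, DISK_BOUND) else "fits within"
    print(f"{turns:>10} turns: ~{size_mb:.0f} MB log, {verdict} the disk request")
```

With these assumed values the 1e6-turn job stays under the limit while the 1e7-turn job goes over it, which matches the pattern described above.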

On the code side, the next release won't generate this (log) file unless explicitly requested by the user.
For the affected tasks, I am looking into the possibility of granting some credit for the CPU time anyway, even if the results are not going to be validated...

I apologize for the inconvenience, and thanks again for the support!
A.
ID: 40065
Alessio Mereghetti
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Joined: 29 Feb 16
Posts: 148
Credit: 1,953,804
RAC: 4,802
Message 40086 - Posted: 7 Oct 2019, 16:11:53 UTC - in response to Message 40065.  

An update on this issue - I managed to grant credit to the tasks of the affected study that failed with EXIT_DISK_LIMIT_EXCEEDED because of the inconsistent setting.
To avoid cheating, the credit is not the full amount that would have been awarded had the task run to the end and been validated. In the end, all the tasks failed before completing and there was no way to validate the partial results.

Please post here if something odd related to this study happens.

Happy crunching,
A.
ID: 40086



©2019 CERN