196 (0x000000C4) EXIT_DISK_LIMIT_EXCEEDED - how come?
Author | Message |
---|---|
Joined: 18 Dec 15 Posts: 1908 Credit: 144,948,171 RAC: 82,479 | This task https://lhcathome.cern.ch/lhcathome/result.php?resultid=208859885 errored out after about 5 hours with 196 (0x000000C4) EXIT_DISK_LIMIT_EXCEEDED. The report shows a peak disk usage of 18,312.37 MB, which is totally unusual; I checked other tasks, and they had about 1 GB. But still, the BOINC settings should have allowed an even higher disk usage: 90% of 195 GB is available to BOINC. Does anyone have an explanation for what happened? |
Joined: 27 Sep 08 Posts: 880 Credit: 746,962,684 RAC: 326,151 | The project team set the limit wrong when they submitted the WUs. You just have to let them die; they will fix themselves over time. |
Joined: 13 Apr 18 Posts: 443 Credit: 8,438,885 RAC: 0 | "But still the BOINC settings should have allowed an even higher disk usage: 90% of 195GB are for use with BOINC." There are two "disk limits", and you are confusing them. That "90% of 195 GB" is the limit for the total disk usage by all BOINC tasks. This limit is user configurable in BOINC Manager. The 196 error refers to the disk limit placed on a single task. This limit is NOT user configurable and is NOT the disk limit referred to above. It is set by the server. |
Joined: 14 Jan 10 Posts: 1461 Credit: 9,859,193 RAC: 2,531 | Regarding "This task", my explanation: The working slot (incl. subdirs) of a Theory task may not exceed 8000000000 bytes, which equals 7629.39453125 MB (at 1,048,576 bytes per MB). Somehow the total size of all files grew abnormally. The task ran 1 job and was not suspended, so no snapshots were created. I suppose Vbox.log or VBoxHardening.log is responsible for the enormous size, maybe because an error was being written in a loop. |
Joined: 18 Dec 15 Posts: 1908 Credit: 144,948,171 RAC: 82,479 | This morning, again a task errored out, after 4 1/2 hours, with the exit disk limit failure: https://lhcathome.cern.ch/lhcathome/result.php?resultid=211563104 What's wrong this time? |
Joined: 14 Jan 10 Posts: 1461 Credit: 9,859,193 RAC: 2,531 | "This morning, again a task errored out, after 4 1/2 hours, with the exit disk limit failure: https://lhcathome.cern.ch/lhcathome/result.php?resultid=211563104" 2018-12-08 11:51:03 (1464): Guest Log: [INFO] Job finished in slot1 with 1. 2018-12-08 11:51:08 (1464): Guest Log: [INFO] New Job Starting in slot1 2018-12-08 11:51:08 (1464): Guest Log: [INFO] Condor JobID: 482880.13 in slot1 2018-12-08 11:51:08 (1464): Guest Log: [INFO] Job finished in slot1 with 2. Strange that the job finished within a second, without even issuing an MCPlots ID. Finishing with 2 normally means something like 'file not found'. What does BOINC's event log show around that time? |
Joined: 18 Dec 15 Posts: 1908 Credit: 144,948,171 RAC: 82,479 | "What does BOINC's event log show around that time?" Nothing in particular. |
Joined: 28 Sep 04 Posts: 780 Credit: 59,998,073 RAC: 47,174 | I got one of these errors last night, here: https://lhcathome.cern.ch/lhcathome/result.php?resultid=211545734 It ran 18 hours and 5 minutes and was probably just about to end when the error hit. Peak disk usage was 16,501.43 MB. |
Joined: 18 Dec 15 Posts: 1908 Credit: 144,948,171 RAC: 82,479 | The same problem happened yesterday: https://lhcathome.cern.ch/lhcathome/result.php?resultid=212297479 Does anyone have any idea why? |
Joined: 18 Dec 15 Posts: 1908 Credit: 144,948,171 RAC: 82,479 | The next 196 (0x000000C4) EXIT_DISK_LIMIT_EXCEEDED error, after about 7 1/2 hours of runtime. Does anyone have an explanation for this? |
Joined: 15 Jun 08 Posts: 2683 Credit: 286,887,455 RAC: 54,539 | Most likely misbehaving sherpa jobs that create huge logs until the disk limit is reached. They are hard to debug as the logs are lost as soon as the VM is shut down. See the discussion at LHC-dev: https://lhcathomedev.cern.ch/lhcathome-dev/forum_thread.php?id=438 |
Joined: 14 Jan 10 Posts: 1461 Credit: 9,859,193 RAC: 2,531 | LHC@home 22 Dec 12:00:39 UTC Aborting task Theory_1543818_1545461975.206982_0: exceeded disk limit: 9596.10MB > 7629.39MB Task: https://lhcathome.cern.ch/lhcathome/result.php?resultid=212747277 |
Joined: 18 Dec 15 Posts: 1908 Credit: 144,948,171 RAC: 82,479 | I had two more yesterday: https://lhcathome.cern.ch/lhcathome/result.php?resultid=212811309 and https://lhcathome.cern.ch/lhcathome/result.php?resultid=212826747, and one the day before yesterday: https://lhcathome.cern.ch/lhcathome/result.php?resultid=212743711 They failed after between 13 and 18 hours of processing time, which is a shame :-( |
Joined: 18 Dec 15 Posts: 1908 Credit: 144,948,171 RAC: 82,479 | Can anyone from the LHC@home team please explain why so many tasks have lately been erroring out with "196 (0x000000C4) EXIT_DISK_LIMIT_EXCEEDED"? The latest case from one of my PCs: https://lhcathome.cern.ch/lhcathome/result.php?resultid=212965047 Again, 6 1/2 hours of CPU time for nothing :-( |
Joined: 2 May 07 Posts: 2277 Credit: 178,709,076 RAC: 100,489 | "Most likely misbehaving sherpa jobs that create huge logs until the disk limit is reached." |
Joined: 29 Sep 04 Posts: 281 Credit: 11,888,115 RAC: 831 | Got one here, 11 hrs+ in, with running.log at 4.4 GB. In the Console window, stuff is whizzing past too fast to read. I'm going to reset the VM so as not to waste any more time on it, as it's likely to grow too big and fail, but hopefully the identification details captured below will be helpful. From stdout.log: 06:31:00 +0000 2018-12-28 [INFO] Condor JobID: 484579.129 in slot1 06:31:06 +0000 2018-12-28 [INFO] MCPlots JobID: 47878534 in slot1 Top line of running.log: ===> [runRivet] Fri Dec 28 06:31:01 GMT 2018 [boinc pp jets 8000 180,-,3560 - sherpa 2.2.0 default 100000 8] |
Joined: 18 Dec 15 Posts: 1908 Credit: 144,948,171 RAC: 82,479 | This specific problem has only been occurring recently, as far as I can see. So something must have been changed in these tasks at LHC@home. |
Joined: 14 Jan 10 Posts: 1461 Credit: 9,859,193 RAC: 2,531 | Got another one too. Could not watch the running job, but most probably a sherpa. Stderr output: 2018-12-28 19:02:34 UTC (4528): Guest Log: [INFO] New Job Starting in slot1 BOINC event: 2018-12-29 05:02:23 UTC Aborting task Theory_1102016_1545999587.637140_0: exceeded disk limit: 8970.10MB > 7629.39MB https://lhcathome.cern.ch/lhcathome/result.php?resultid=213019449 |
Joined: 18 Dec 15 Posts: 1908 Credit: 144,948,171 RAC: 82,479 | "Got another one too." Really annoying if this happens after so many hours :-( Total waste of CPU time. |
Joined: 15 Jun 08 Posts: 2683 Credit: 286,887,455 RAC: 54,539 | This is still an issue on the volunteer's side. I wonder if it is under investigation on the project's side. Could anyone from the project team be so kind as to give a short summary? |
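
For anyone who wants to check the numbers quoted in this thread, here is a minimal Python sketch, not an official BOINC or LHC@home tool. It assumes the per-task limit of 8000000000 bytes mentioned above and that "MB" in the BOINC messages means 1,048,576 bytes (which is what makes 8000000000 bytes come out as 7629.39 MB). The function names and the regular expression are made up for illustration; the pattern only targets "Aborting task" lines in the form quoted in the posts above, and the slot directory path is something you have to supply yourself.

```python
import os
import re

# Per-task limit quoted in the thread for a Theory task's working slot.
TASK_DISK_LIMIT_BYTES = 8_000_000_000
# Assumption: "MB" in the BOINC messages is 2**20 bytes, since
# 8000000000 / 1048576 = 7629.39453125, matching the quoted limit.
BYTES_PER_MB = 1024 * 1024


def bytes_to_mb(n_bytes):
    """Convert a byte count to MB as used in the quoted BOINC messages."""
    return n_bytes / BYTES_PER_MB


def slot_usage_bytes(slot_dir):
    """Sum the sizes of all regular files under slot_dir, incl. subdirs."""
    total = 0
    for root, _dirs, files in os.walk(slot_dir):
        for name in files:
            path = os.path.join(root, name)
            try:
                if os.path.isfile(path) and not os.path.islink(path):
                    total += os.path.getsize(path)
            except OSError:
                # A file may disappear while the task is running; skip it.
                continue
    return total


# Hypothetical pattern, written against the event log lines quoted above.
ABORT_RE = re.compile(
    r"Aborting task (\S+): exceeded disk limit: ([\d.]+)MB > ([\d.]+)MB"
)


def parse_abort_line(line):
    """Return (task_name, used_mb, limit_mb), or None if the line differs."""
    match = ABORT_RE.search(line)
    if match is None:
        return None
    return match.group(1), float(match.group(2)), float(match.group(3))


if __name__ == "__main__":
    print(f"Limit: {bytes_to_mb(TASK_DISK_LIMIT_BYTES):.2f} MB")
    line = ("2018-12-29 05:02:23 UTC Aborting task "
            "Theory_1102016_1545999587.637140_0: "
            "exceeded disk limit: 8970.10MB > 7629.39MB")
    print(parse_abort_line(line))
```

Calling `slot_usage_bytes()` on the slot directory of a running task and comparing the result with `TASK_DISK_LIMIT_BYTES` would show how close that task is to being aborted. Whether BOINC counts the files in exactly the same way is not confirmed anywhere in this thread, so treat the result as an estimate.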
©2025 CERN