Message boards : Number crunching : error -177 resource limit exceeded
grumpy

Send message
Joined: 1 Sep 04
Posts: 57
Credit: 2,831,592
RAC: 53
Message 23125 - Posted: 19 Sep 2011, 0:59:37 UTC

Something is wrong.
I got the same message!

I've got 100 gig of disk allocated to boinc.
18 gigs of memory, win 7 64

Can't be running out!

LHC is taking 53 meg of disk.
ID: 23125 · Report as offensive     Reply Quote
Profile jujube

Send message
Joined: 25 Jan 11
Posts: 179
Credit: 83,858
RAC: 0
Message 23126 - Posted: 19 Sep 2011, 4:44:00 UTC - in response to Message 23125.  

You aren't running out, marmaduke.

Each task you receive has a limit on how much disk space it is allowed to use. If that limit is exceeded the task will crash. You cannot fix this problem by allocating more disk space to BOINC. Only the project can fix the problem, by increasing the disk space limit they put in their tasks. You might want to set Sixtrack to No New Tasks until the admins increase the disk space limit.
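Roughly, what the client does is this (a sketch of the idea only, not actual BOINC client code; the function names are made up for illustration): it adds up what the task has written in its slot directory and aborts the task with "Maximum disk usage exceeded" (the -177 / resource limit exceeded error) once that total passes the workunit's rsc_disk_bound.

import os

# Sketch only (not BOINC source): the per-task disk check described above.
def slot_disk_usage(slot_dir: str) -> int:
    """Total bytes used by the files in a task's slot directory."""
    total = 0
    for root, _dirs, files in os.walk(slot_dir):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total

def over_disk_limit(slot_dir: str, rsc_disk_bound: float) -> bool:
    # True means the client would abort the task, no matter how much disk
    # your overall BOINC preferences allow.
    return slot_disk_usage(slot_dir) > rsc_disk_bound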

Has bigmac been alerted?
ID: 23126 · Report as offensive     Reply Quote
Profile Igor Zacharov
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Send message
Joined: 16 May 11
Posts: 79
Credit: 111,419
RAC: 0
Message 23127 - Posted: 19 Sep 2011, 5:15:38 UTC - in response to Message 23124.  

The beam-beam interaction jobs which we run now are new studies.
They simulate the influence of beam particles on each other and need a large number
of simulated turns to reveal the effects. Of course, we tested on our
computers before submitting, but we may not have caught everything.

The problem with resource limits is serious and I will discuss it with Eric McIntosh when he comes to CERN in the morning.

We will review the settings ASAP.
skype id: igor-zacharov
ID: 23127 · Report as offensive     Reply Quote
Speedy

Send message
Joined: 28 Jul 05
Posts: 37
Credit: 451,635
RAC: 20
Message 23128 - Posted: 19 Sep 2011, 5:29:31 UTC - in response to Message 23124.  

I think I found the problem, in the init_data.xml file:

<rsc_disk_bound>30000000.000000</rsc_disk_bound>, which is about 28.6 MB.


Next: at 58% done the task was at 27.7 MB (29,048,832 bytes) after 05:35:30, with 3 hours to go. I bet this would have errored, except I stopped BOINC and manually bumped up the limit, and it got to 05:51:07 at 28.7 MB (30,146,560 bytes) and counting.

So this is the problem - the project needs to up the limit.

That's wonderful to hear you have manually fixed the problem. Can you post instructions on what to do to correct the limit?

Have A Crunching Good day
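For reference, the figures in the quoted post work out like this (a quick check only; the byte counts come from the post above, and 1 MB is counted as 1,048,576 bytes to match the quoted MB values):

MB = 1024 * 1024  # 1 MB taken as 1,048,576 bytes

bound = 30_000_000          # rsc_disk_bound from the workunit
at_58_percent = 29_048_832  # usage reported at 58% done
later_usage = 30_146_560    # usage after the limit was raised by hand

print(f"bound       = {bound / MB:.1f} MB")          # ~28.6 MB
print(f"at 58% done = {at_58_percent / MB:.1f} MB")  # ~27.7 MB
print(f"later usage = {later_usage / MB:.1f} MB")    # ~28.7 MB
print("past the original bound:", later_usage > bound)  # True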
ID: 23128 · Report as offensive     Reply Quote
Profile jujube

Send message
Joined: 25 Jan 11
Posts: 179
Credit: 83,858
RAC: 0
Message 23130 - Posted: 19 Sep 2011, 6:45:55 UTC - in response to Message 23128.  
Last modified: 19 Sep 2011, 6:49:38 UTC

You can fix the problem manually on your end but you have to do it for every Sixtrack task you receive. The only way to have ALL your tasks fixed is for the admin to fix the problem on the server. That's why I suggested just setting Sixtrack to No New Tasks until they fix it on the server.

If you want to fix the problem for the tasks you have now...

1. Exit the BOINC client.

2. Open client_state.xml in the BOINC data directory in a text editor. Do NOT use Word or WordPad; use Notepad.

3. Search the text for <rsc_disk, which will find <rsc_disk_bound> for EVERY task you have for EVERY project, one after the other.

4. You want to edit ONLY the instances of <rsc_disk_bound> for your Sixtrack tasks; those will be in a block of text like this, with sixtrack between the <app_name> and </app_name> tags...

<workunit>
 <name>w3_weak3_collision_err_bb__16__s__64.31_59.32__4_6__6__55.5_1_sixvf_boinc183751</name>
    <app_name>sixtrack</app_name>
    <version_num>53008</version_num>
    <rsc_fpops_est>30000000000000.000000</rsc_fpops_est>
    <rsc_fpops_bound>300000000000000.000000</rsc_fpops_bound>
    <rsc_memory_bound>100000000.000000</rsc_memory_bound>
    <rsc_disk_bound>30000000.000000</rsc_disk_bound>
    <file_ref>
        <file_name>w3_weak3_collision_err_bb__16__s__64.31_59.32__4_6__6__55.5_1_sixvf_boinc183751.zip</file_name>
        <open_name>fort.zip</open_name>
    </file_ref>
</workunit>

5. When you find a block like the one above, it will have 30000000.000000 between the <rsc_disk_bound> and </rsc_disk_bound> tags. That's the number you need to change. Multiply that number by 10 by adding another 0 to the left of the decimal point.

6. Now find the next block that has sixtrack for the app_name and 30000000.000000 for the rsc_disk_bound. Add another 0 there; repeat until you have found and edited all such blocks.

7. Save the file, exit Notepad, and restart the BOINC client.

You have to do that for every Sixtrack task you receive or there is a chance they'll crash. I've done that for the tasks I've already started, but I'm sure as heck not gonna keep doing it. I've set Sixtrack to NNT until they fix this server side. (If you'd rather automate the edit, see the sketch below.)
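If you would rather script that edit than do it in Notepad, here is a minimal sketch (not an official BOINC tool; the path below is the default Windows data directory and may need adjusting, and the times-10 factor simply mirrors the "add another 0" step above). Stop the BOINC client first and keep the backup:

import shutil
import xml.etree.ElementTree as ET

# Sketch only: bump <rsc_disk_bound> for every SixTrack workunit in
# client_state.xml, as described in the steps above.
STATE_FILE = r"C:\ProgramData\BOINC\client_state.xml"  # adjust to your BOINC data directory

shutil.copy(STATE_FILE, STATE_FILE + ".bak")  # backup first!

tree = ET.parse(STATE_FILE)
for wu in tree.getroot().iter("workunit"):
    bound = wu.find("rsc_disk_bound")
    if wu.findtext("app_name", "").lower() == "sixtrack" and bound is not None:
        bound.text = "%f" % (float(bound.text) * 10)  # same as adding another 0
        print("patched", wu.findtext("name"))

tree.write(STATE_FILE)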
ID: 23130 · Report as offensive     Reply Quote
Speedy

Send message
Joined: 28 Jul 05
Posts: 37
Credit: 451,635
RAC: 20
Message 23131 - Posted: 19 Sep 2011, 8:06:30 UTC

Thanks for the steps. I had a look & decided not to modify anything. I will let my 98 remaining tasks run (12 at a time) & hope they don't all error out. Thanks again.

Have A Crunching Good day
ID: 23131 · Report as offensive     Reply Quote
Profile Igor Zacharov
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Send message
Joined: 16 May 11
Posts: 79
Credit: 111,419
RAC: 0
Message 23132 - Posted: 19 Sep 2011, 9:00:15 UTC - in response to Message 23131.  

I have put the value of 90 MB into the database for all work units and
restarted the BOINC server. Also, I have changed the distribution rate:
5
3

(it was 10 before), so that not too many tasks are sitting on a single machine.
The daily quota is still at 40, so nobody should be short of work.

I suggest that you abort the jobs which are waiting or have consumed little
time; you will then get new jobs with the corrected disk size.

Please report any other problems you see. Thank you.
skype id: igor-zacharov
ID: 23132 · Report as offensive     Reply Quote
Profile Krunchin-Keith [USA]
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Avatar

Send message
Joined: 2 Sep 04
Posts: 209
Credit: 1,482,496
RAC: 0
Message 23133 - Posted: 19 Sep 2011, 9:14:42 UTC

Thanks Igor.

90 should do it.

I'm up to about 90% on one task, 8:44:28 hours in with 53 minutes to go, and it is only using 42.5 MB (44,625,920 bytes).

Thanks for the fix.

---

Volunteers should not modify client_state.xml unless they are very careful; one mistake and your client could trash all your work and settings. It is not a file you should mess with normally. Also, if you do, that setting is only good for the tasks you modify; every time the client gets new work, it will have the project-supplied settings. Make a backup first!

I'm an alpha tester and do things like this to find bugs, loss of work is not a concern to me.
ID: 23133 · Report as offensive     Reply Quote
superempie

Send message
Joined: 28 Jul 05
Posts: 24
Credit: 6,603,623
RAC: 0
Message 23135 - Posted: 19 Sep 2011, 10:48:50 UTC

After a few WUs errored out with the disk space issue, I decided to halt this project and abort all running WUs until the problem is solved.

Could you leave a message on the front page when it is fixed?
ID: 23135 · Report as offensive     Reply Quote
Profile Krunchin-Keith [USA]
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Avatar

Send message
Joined: 2 Sep 04
Posts: 209
Credit: 1,482,496
RAC: 0
Message 23136 - Posted: 19 Sep 2011, 14:02:41 UTC - in response to Message 23135.  
Last modified: 19 Sep 2011, 14:03:58 UTC

After a few WUs errored out with the disk space issue, I decided to halt this project and abort all running WUs until the problem is solved.

Could you leave a message on the front page when it is fixed?

It was fixed; see two posts before yours.

Unfortunately they cannot fix work you already have.

Re-enable the project and you should get good work that will not produce this error.
ID: 23136 · Report as offensive     Reply Quote
superempie

Send message
Joined: 28 Jul 05
Posts: 24
Credit: 6,603,623
RAC: 0
Message 23138 - Posted: 19 Sep 2011, 14:42:01 UTC - in response to Message 23136.  

Thanks, will do.

ID: 23138 · Report as offensive     Reply Quote
Michael Becker

Send message
Joined: 15 Jul 05
Posts: 8
Credit: 1,036,470
RAC: 0
Message 23139 - Posted: 19 Sep 2011, 15:53:00 UTC
Last modified: 19 Sep 2011, 16:05:09 UTC

Hi there,
I don't know what's up; I get a lot of errors:
    183921 83243 18 Sep 2011 13:57:14 UTC 18 Sep 2011 20:37:33 UTC Error while computing 22,612.19 22,501.96 101.58 --- SixTrack v530.08
    183913 83239 18 Sep 2011 13:54:25 UTC 18 Sep 2011 19:58:50 UTC Error while computing 18,423.98 18,332.83 82.76 --- SixTrack v530.08
    183908 83236 18 Sep 2011 13:57:14 UTC 19 Sep 2011 4:45:09 UTC Error while computing 20,420.03 20,300.75 91.64 --- SixTrack v530.08
    183904 83234 18 Sep 2011 13:57:14 UTC 19 Sep 2011 9:16:26 UTC Error while computing 20,120.55 20,022.49 90.39 --- SixTrack v530.08
    183900 83232 18 Sep 2011 13:54:25 UTC 18 Sep 2011 20:10:05 UTC Error while computing 21,728.14 21,576.06 97.40 --- SixTrack v530.08
    183852 83208 18 Sep 2011 13:57:14 UTC 19 Sep 2011 11:16:51 UTC Error while computing 20,050.06 19,964.62 90.12 --- SixTrack v530.08
    183816 83190 18 Sep 2011 13:54:25 UTC 18 Sep 2011 19:58:50 UTC Error while computing 18,417.01 18,294.77 82.59 --- SixTrack v530.08
    183707 83136 18 Sep 2011 13:57:14 UTC 19 Sep 2011 15:21:06 UTC Error while computing 19,455.11 19,388.66 87.52 --- SixTrack v530.08

http://lhcathomeclassic.cern.ch/sixtrack/results.php?userid=7259&offset=0&show_names=0&state=5
For a long time there was no work, then I detached and reattached and got some work, but a lot of CPU time went for nothing.

EDIT:
Found the information: 'Maximum disk usage exceeded'.
I thought 10 GB was enough; I've now changed it to 25 GB and hope it helps.

ID: 23139 · Report as offensive     Reply Quote
Michael Becker

Send message
Joined: 15 Jul 05
Posts: 8
Credit: 1,036,470
RAC: 0
Message 23140 - Posted: 19 Sep 2011, 17:22:55 UTC - in response to Message 23139.  

Need help, please.

Found the information: 'Maximum disk usage exceeded'.
I thought 10 GB was enough; I've now changed it to 25 GB and hope it helps.


I can't believe that 'Maximum disk usage exceeded' causes the problem.
The next WU finished with 'error while computing'.
Five minutes prior I saw 95 MB of hard disk usage by LHC, and 23 GB free for BOINC applications.
Now only three LHC tasks are running and I have 70 MB of disk usage by LHC.

Any idea?
ID: 23140 · Report as offensive     Reply Quote
Filipe

Send message
Joined: 9 Aug 05
Posts: 36
Credit: 7,693,055
RAC: 146
Message 23141 - Posted: 19 Sep 2011, 17:31:32 UTC
Last modified: 19 Sep 2011, 17:31:45 UTC

Please see the thread below:

error -177 resource limit exceeded

Your question has already been answered.

Filipe.
ID: 23141 · Report as offensive     Reply Quote
Profile jujube

Send message
Joined: 25 Jan 11
Posts: 179
Credit: 83,858
RAC: 0
Message 23142 - Posted: 19 Sep 2011, 17:33:06 UTC - in response to Message 23140.  
Last modified: 19 Sep 2011, 17:33:56 UTC

This was caused by a problem on the server, not a problem on your end. The problem has been fixed. Now you should detach and reattach to get fresh tasks.
ID: 23142 · Report as offensive     Reply Quote
Profile Tom95134

Send message
Joined: 4 May 07
Posts: 250
Credit: 826,541
RAC: 0
Message 23144 - Posted: 19 Sep 2011, 18:05:42 UTC - in response to Message 23141.  

Please see the thread below:

error -177 resource limit exceeded

Your question has already been answered.

Filipe.

Bad link.
ID: 23144 · Report as offensive     Reply Quote
Michael Becker

Send message
Joined: 15 Jul 05
Posts: 8
Credit: 1,036,470
RAC: 0
Message 23145 - Posted: 19 Sep 2011, 18:15:29 UTC - in response to Message 23144.  

Please see the thread below:

error -177 resource limit exceeded

Thank you so much.
I made the changes in the file, so the last tasks can finish normally.
ID: 23145 · Report as offensive     Reply Quote
Profile microchip
Avatar

Send message
Joined: 27 Jun 06
Posts: 6
Credit: 1,577,243
RAC: 4,260
Message 23410 - Posted: 8 Oct 2011, 10:44:20 UTC

I get a similar, albeit not the same, error:

<core_client_version>6.12.34</core_client_version>
<![CDATA[
<message>
Maximum elapsed time exceeded
</message>
<stderr_txt>

</stderr_txt>
]]>

ID: 23410 · Report as offensive     Reply Quote
Profile Ananas

Send message
Joined: 17 Jul 05
Posts: 102
Credit: 542,016
RAC: 0
Message 23506 - Posted: 14 Oct 2011, 16:33:56 UTC - in response to Message 23410.  
Last modified: 14 Oct 2011, 16:35:26 UTC

...
Maximum elapsed time exceeded
...

Might be a configuration error on the project side. Each result has a value "rsc_fpops_bound" (or similar), with which the project admins can configure when a result gets aborted. This is intended to catch endless loops / iterations that never reach their target value, but it should not knock out a result that is still working properly.

AFAIK the benchmark results influence the value that is compared against this rsc_fpops_bound, but your benchmark values do not look unusually high.

One possible cause on the client side would be a power-saving mode, where the host runs at a reduced clock speed.

Dynamic turbo mode on some CPU cores could have a similar effect, if the benchmark was carried out on a higher-clocked core. I have read somewhere that AMD has something like that on later Bulldozer chips, but I'm not sure what exactly this means for BOINC.
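As a back-of-the-envelope illustration (this is not the exact client logic, and the 3 GFLOPS benchmark figure is just an assumed value): the client effectively turns rsc_fpops_bound into a wall-clock budget by dividing it by the host's measured floating-point speed, so a host that benchmarks fast but then runs downclocked can exceed the budget even though the task is healthy.

# Rough sketch of the elapsed-time limit (not BOINC source code).
def max_elapsed_seconds(rsc_fpops_bound: float, benchmarked_flops: float) -> float:
    """Approximate wall-clock budget before the client aborts the task."""
    return rsc_fpops_bound / benchmarked_flops

# Using the bound from the workunit block earlier in the thread (3e14 fpops)
# and an assumed single-core benchmark of 3 GFLOPS:
limit = max_elapsed_seconds(3e14, 3e9)
print(f"abort after roughly {limit / 3600:.0f} hours")  # ~28 hours
# A core running at half that speed (power saving) needs twice the time and
# can blow through the limit even though the task is still working properly.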
ID: 23506 · Report as offensive     Reply Quote
Fred J. Verster
Avatar

Send message
Joined: 4 Aug 08
Posts: 14
Credit: 278,575
RAC: 0
Message 23746 - Posted: 27 Nov 2011, 14:17:53 UTC - in response to Message 23506.  

...
Maximum elapsed time exceeded
...

Might be a configuration error on the project side. Each result has a value "rsc_fpops_bound" (or similar), with which the project admins can configure when a result gets aborted. This is intended to catch endless loops / iterations that never reach their target value, but it should not knock out a result that is still working properly.

AFAIK the benchmark results influence the value that is compared against this rsc_fpops_bound, but your benchmark values do not look unusually high.

One possible cause on the client side would be a power-saving mode, where the host runs at a reduced clock speed.

Dynamic turbo mode on some CPU cores could have a similar effect, if the benchmark was carried out on a higher-clocked core. I have read somewhere that AMD has something like that on later Bulldozer chips, but I'm not sure what exactly this means for BOINC.



Isn't a -177 error a timing error? I remember this happening at SETI@home when using a GPU.
I also run Rosetta and was amazed by the >700 MByte WU RAM use.
This host now only does CPU jobs a.t.m. (it has 2 EAH5870 GPUs).
Sandy Bridge CPUs use a dynamic turbo too; if CPU load is low, the clock frequency goes down.


Knight Who Says Ni
N!
ID: 23746 · Report as offensive     Reply Quote