Message boards : News : DISK LIMIT EXCEEDED

Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Joined: 12 Jul 11
Posts: 857
Credit: 1,619,050
RAC: 0
Message 27455 - Posted: 16 May 2015, 19:03:40 UTC

Please note that this may occur if you are also subscribed
to the LHC experiment projects ATLAS or CMS using vLHCathome.
A workaround is to delete the remaining files yourself.
ID: 27455
Richard Haselgrove

Joined: 27 Oct 07
Posts: 186
Credit: 3,297,640
RAC: 0
Message 27456 - Posted: 16 May 2015, 19:36:47 UTC - in response to Message 27455.  

To amplify Eric's warning:

We believe that this only happens on the Windows platform - users of other operating systems should not experience this problem.

The problem is caused by any file larger than 4 GB in a 'slot' directory (the working folder that BOINC uses to hold working files for active tasks). These 'slot' directories can be found within the BOINC data directory structure. Large files in other places, such as project folders, don't cause this error message.
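For anyone who wants to check their own machine, here is a minimal Python sketch that lists files above 4 GB inside the slot directories. The data directory path is an assumption (C:\ProgramData\BOINC is a common Windows location, but yours may differ), the 4 GB threshold is taken as 2**32 bytes, and the function name find_oversized_slot_files is made up for illustration. It only reports files; it does not delete anything.

```python
import os
from pathlib import Path

FOUR_GB = 2**32  # assumed threshold: 4 GiB, the limit discussed in this thread


def find_oversized_slot_files(data_dir, limit=FOUR_GB):
    """Return (path, size) pairs for files larger than `limit` under data_dir/slots."""
    slots = Path(data_dir) / "slots"
    found = []
    if not slots.is_dir():
        return found  # no slots folder here, nothing to report
    for root, _dirs, files in os.walk(slots):
        for name in files:
            path = Path(root) / name
            try:
                size = path.stat().st_size
            except OSError:
                continue  # file vanished or is locked; skip it
            if size > limit:
                found.append((path, size))
    return found


if __name__ == "__main__":
    # Assumed location of the BOINC data directory; change it to match your install.
    for path, size in find_oversized_slot_files(r"C:\ProgramData\BOINC"):
        print(f"{size / 1024**3:.2f} GiB  {path}")
```

Stop BOINC before removing anything the script reports, so that no active task is still using the file.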

The files which led to the discovery of this problem are called 'vm_image.vdi', and files large enough to cause problems have only been seen 'in the wild' on machines running CERN's experimental CMS-dev project - though similar problems might also crop up with ATLAS. But once the file or files exist on your computer, tasks from any/every other BOINC project may fail with this error until the file is cleared.

The problem is caused by the BOINC client failing to delete these very large files as it should, and every BOINC client version to date is affected. A corrected version of BOINC is being tested, but so far only as a hotfix to the already experimental BOINC v7.5.0 development line. I'm awaiting news about whether the current BOINC v7.4.xx line will be updated with a fix for this problem (and one or two other problems which came to light while we were tracking it down). In the meantime, I can supply links to the hotfix version if anyone needs them.
ID: 27456
David Duvall

Joined: 2 Sep 04
Posts: 2
Credit: 9,357,092
RAC: 0
Message 27460 - Posted: 19 May 2015, 4:13:59 UTC - in response to Message 27456.  

I'd like to thank both you and Eric for the timely update. Again, thank you.
ID: 27460
Dirk Broer

Joined: 20 Sep 05
Posts: 31
Credit: 1,211,960
RAC: 9
Message 27462 - Posted: 19 May 2015, 21:12:22 UTC - in response to Message 27456.  

files large enough to cause problems have only been seen 'in the wild' on machines running CERN's experimental CMS-dev project


Do you guys at CERN ever test your own software running all your applications simultaneously, meaning e.g. testing CMS-Dev while running together with

LHC@Home Classic
vLHCathome
Atlas@Home
Beauty@LHC
LHC Dev@Home


You might encounter them already in testing instead of 'in the wild'...
ID: 27462
Richard Haselgrove

Joined: 27 Oct 07
Posts: 186
Credit: 3,297,640
RAC: 0
Message 27463 - Posted: 19 May 2015, 21:52:38 UTC - in response to Message 27462.  

Until just one of the files (CMS-dev) grew above 4 GB, it was not known that third-party software (the BOINC framework) had a millennium-style bug: too little space allocated to hold an intermediate value, in this case for file sizes. I don't think this one would have been caught by internal testing: it needed to be tested under the final infrastructure.
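To illustrate the kind of bug meant here (this is not the actual BOINC source, just a sketch of the general failure mode, assuming the value ends up in an unsigned 32-bit field): once a file size is squeezed into 32 bits, anything of 4 GiB or more wraps around and the stored number no longer matches the real size. The helper name as_uint32 is invented for the example.

```python
def as_uint32(value):
    """Keep only the low 32 bits, as an unsigned 32-bit field would."""
    return value & 0xFFFFFFFF


real_size = 5 * 1024**3        # a 5 GiB file, e.g. a large vm_image.vdi
stored = as_uint32(real_size)  # what a 32-bit field would actually hold

print(real_size)  # 5368709120
print(stored)     # 1073741824 (1 GiB: the upper bits are lost)
```

Any file smaller than 4 GiB passes through unchanged, which would explain why nothing showed up until one file finally crossed that size.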
ID: 27463
Dirk Broer

Joined: 20 Sep 05
Posts: 31
Credit: 1,211,960
RAC: 9
Message 27464 - Posted: 19 May 2015, 21:58:31 UTC - in response to Message 27463.  

If you want to use an infrastructure, test the infrastructure.
Run as many outside BOINC apps as you see fit and try to run your own apps too in test.
Then, when all bugs are ironed out, let it loose 'in the wild'...
ID: 27464
Richard Haselgrove

Joined: 27 Oct 07
Posts: 186
Credit: 3,297,640
RAC: 0
Message 27465 - Posted: 19 May 2015, 22:21:22 UTC - in response to Message 27464.  

Sadly, in this case that wouldn't have worked - CMS-dev was the first project to exceed 4 GB file sizes. You wouldn't have found one from any other BOINC project.

Sure, you could test BOINC by systematically throwing every possible eventuality in its direction: I suspect they would have run out of time and money before exhausting the list.

CMS-dev isn't actually 'in the wild' as yet - not fully, at least. It's a pre-Alpha project (they say), accessible by invitation only. By my estimation, there are between 30 and 50 active Windows users testing it at the moment (the uncertainty is because the 20 anonymous test hosts might, or might not, all be owned by the same person). They are all now being urged, with some strength, to update their BOINC client.
ID: 27465
Magic Quantum Mechanic

Joined: 24 Oct 04
Posts: 1112
Credit: 49,477,289
RAC: 6,389
Message 27466 - Posted: 19 May 2015, 22:45:59 UTC

I have been running all of the Cern tasks since they first started and run them all at the same time on Win7 and one Win8.1

I don't have this problem with Win7 and even run them all on my laptop with only 8GB ram and a SSD (vLHC X2,CMS-dev,and LHC X3,Atlas X2 along with a Einstein GPU) on this 8-core.

The only one that was having this problem for me is the Win8.1

It has plenty of disk space and 18GB ram so that isn't the problem.

So what I started doing is this: when I have a CMS-dev and vLHC X2 running and I want to add the Atlas task without this problem crashing the Atlas, I first suspend the vLHC X2 and let the CMS keep running. Then I start the Atlas task and watch it start running on the Boinc manager and the VB manager, then *resume that vLHC X2, and THEN they are all running together again without getting the Disk Limit Error.

Of course I am sure most crunchers would rather not have to do this and expect it to just work by itself.......but I have been doing this at home daily since these projects first started and just try to figure out how to make them run on my hosts.
Volunteer Mad Scientist For Life
ID: 27466
Phil

Joined: 26 Jul 05
Posts: 63
Credit: 4,083,755
RAC: 0
Message 27467 - Posted: 19 May 2015, 23:01:18 UTC - in response to Message 27466.  
Last modified: 19 May 2015, 23:03:49 UTC

I have been running all of the Cern tasks since they first started and run them all at the same time on Win7 and one Win8.1

I don't have this problem with Win7 and even run them all on my laptop with only 8GB ram and a SSD (vLHC X2,CMS-dev,and LHC X3,Atlas X2 along with a Einstein GPU) on this 8-core.

The only one that was having this problem for me is the Win8.1

It has plenty of disk space and 18GB ram so that isn't the problem.

Yes, and as Richard says, it's one of those unexpected bugs.

So what I started doing is when I have a CMS-dev and vLHC X2 running and I want to add the Atlas
.....then *resume that vLHC X2 and THEN they all are running together again without getting the Disk Limit Error.

Of course I am sure most crunchers would rather not have to do this and expect it to just work by itself......


And to be fair, CMS-dev is in its early stages of testing and needs a little debugging or at least nursemaiding.
Personally I think the BOINC team are doing very well to get all these new ideas working.
ID: 27467
Crystal Pellet
Volunteer moderator
Volunteer tester

Joined: 14 Jan 10
Posts: 1263
Credit: 8,420,582
RAC: 5,321
Message 27468 - Posted: 20 May 2015, 12:18:29 UTC - in response to Message 27466.  

MAGIC wrote:
I don't have this problem with Win7 .....

It may happen on every Windows machine and with all BOINC-projects, not only CERN-projects.
I've seen 'disk limit exceeded' errors caused by a leftover CMS vdi-file on World Community Grid, Asteroids and SRBase, possibly shredding
hundreds/thousands of tasks, because they all fail just after the start when they use a BOINC slot directory where that >4GB file is still present.
As Richard explained, this is due to an unexpected BOINC bug, never seen before because BOINC never had to handle >4GB files.
So it's not a problem caused by untested CERN software, as Dirk Broer thought.

MAGIC's message on ATLAS -> http://atlasathome.cern.ch/forum_thread.php?id=276&postid=2297
That error was probably caused by this CMS-task: http://boincai05.cern.ch/CMS-dev/result.php?resultid=57098 See peak disk usage.

Windows users running CMS-dev should use the fix mentioned in the thread http://boincai05.cern.ch/CMS-dev/forum_thread.php?id=34.
Before copying the boinc-7.5.1 files, one first should install BOINC 7.4.42 if not yet done.
ID: 27468
Magic Quantum Mechanic

Joined: 24 Oct 04
Posts: 1112
Credit: 49,477,289
RAC: 6,389
Message 27469 - Posted: 20 May 2015, 20:38:51 UTC - in response to Message 27468.  

And as I said it did not happen on my Win7 hosts (I have 4 of them) and I don't use my XP Pro for Atlas or CMS since it is a 3-core that always runs vLHC

The only one was the Win8.1 and it no longer gave me that Disk Limit error after I started Atlas with the vLHC suspended for a couple minutes.

I only run Cern and Einstein tasks so all the other ones do not apply to what I said here or any other thread.

And I started another Atlas again today as I explained and it is still working.

Btw the vLHC tasks are now 367MB when you look at the VB Manager *Actions
Volunteer Mad Scientist For Life
ID: 27469
Magic Quantum Mechanic

Joined: 24 Oct 04
Posts: 1112
Credit: 49,477,289
RAC: 6,389
Message 27470 - Posted: 20 May 2015, 21:17:58 UTC

Example of my Win7 running all of the tasks with no errors



And my one Win8.1 is now completing Atlas tasks while running CMS-dev and vLHC at the same time when I tried using the method I mentioned (and first told another Atlas member via pm when I started trying that)

So since doing that it completed 3 consecutive tasks and is now running its 4th Atlas with no error while at the same time running vLHC X2 and CMS-dev (along with GPU X2 at Einstein) on the Win8.1

So I won't be trying that *fix on the Win7's and my Win8.1 does have the newest Boinc version so I may try that *fix when I have the time but for now I will just do it the way I found that works here.
Volunteer Mad Scientist For Life
ID: 27470
