Message boards : Theory Application : 196 (0x000000C4) EXIT_DISK_LIMIT_EXCEEDED - how come?

maeax

Send message
Joined: 2 May 07
Posts: 2090
Credit: 158,979,669
RAC: 123,740
Message 38993 - Posted: 29 May 2019, 7:45:35 UTC - in response to Message 38992.  

That's the reason why I stopped crunching this project.

Too many "errors" of various kinds for it to be considered "serious".

VBox is simply "not working" for a normal user (can't stop and resume jobs without errors), and we have all seen all kinds of faulty tasks.

It's OK to recognize one's own limits in helping this project.
Most volunteers are not the kind who stay here for years and help to reduce the errors.
ID: 38993 · Report as offensive     Reply Quote
bronco

Send message
Joined: 13 Apr 18
Posts: 443
Credit: 8,438,885
RAC: 0
Message 38996 - Posted: 29 May 2019, 16:10:14 UTC - in response to Message 38992.  

VBox is simply "not working" for a normal user (can't stop and resume jobs without errors)

I used to think that too. I was wrong. They stop and resume just fine now after fixing my VBox installation. In fact if you take the time to look through other users' result reports you'll see (in the stderr text for successful tasks) that their tasks pause/resume several times.

and we have all seen all kinds of faulty tasks,

Not really. Just 1 kind... Sherpa... and only a small percentage of those fail.
ID: 38996 · Report as offensive     Reply Quote
Guiri-One[Andalucia]

Send message
Joined: 1 Feb 06
Posts: 66
Credit: 9,723
RAC: 0
Message 39002 - Posted: 30 May 2019, 7:13:22 UTC - in response to Message 38996.  

Not true. I have had problems while using Windows. VBox even shows the infamous message "can't handle job"...
More:
https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5015

See the forums, plenty of users complaining :)
ID: 39002 · Report as offensive     Reply Quote
bronco

Send message
Joined: 13 Apr 18
Posts: 443
Credit: 8,438,885
RAC: 0
Message 39013 - Posted: 31 May 2019, 19:18:02 UTC - in response to Message 39002.  

Not true. I have had problems while using Windows. VBox even shows the infamous message "can't handle job"...
More:
https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5015

See the forums, plenty of users complaining :)

Meh, just a few complainers who, like you, aren't "serious" about crunching LHC, just serious about complaining.
Again, take a look at the result reports from the many hundreds of users who are returning tasks that were paused/resumed, Linux as well as Windows. You'll see that their tasks validate. Your claim that pause/resume doesn't work is complete BS. If pause/resume doesn't work on your hosts it's because your hosts are misconfigured or you don't follow the necessary procedure.
If you ever decide to get "serious" about it you'll be able to crunch LHC too. Until then you should stick to the easy projects.
ID: 39013 · Report as offensive     Reply Quote
Profile Ray Murray
Volunteer moderator
Avatar

Send message
Joined: 29 Sep 04
Posts: 281
Credit: 11,859,285
RAC: 0
Message 39014 - Posted: 31 May 2019, 20:00:17 UTC - in response to Message 38992.  
Last modified: 31 May 2019, 20:02:26 UTC

You could try updating to the latest VBox version (it's up to 6.0.8 now), which might resolve the issue you were having. Or you could limit the number of cores used in your Preferences so as not to overstretch your machine, which hasn't had any tasks this year. I run only single-core tasks (when I'm not doing work for the -dev site) and have had no problem with suspending and resuming tasks.

Yes, there are some faulty tasks, and they're very annoying, but that's not a VBox issue and most run just fine.
ID: 39014 · Report as offensive     Reply Quote
Profile Magic Quantum Mechanic
Avatar

Send message
Joined: 24 Oct 04
Posts: 1118
Credit: 49,730,317
RAC: 13,013
Message 39063 - Posted: 6 Jun 2019, 2:48:31 UTC

I don't have any problem with a VB *pause/resume* and only get these once in a while out of the thousands I have done.
(but then I do make sure it is suspended before I try to reboot)

Still use versions 5.2.16 to 5.2.28 with no problems
ID: 39063 · Report as offensive     Reply Quote
Crystal Pellet
Volunteer moderator
Volunteer tester

Send message
Joined: 14 Jan 10
Posts: 1276
Credit: 8,481,858
RAC: 1,977
Message 41059 - Posted: 24 Dec 2019, 18:38:24 UTC

Please increase the rsc_disk_bound for the (vbox) Theory-tasks.

LHC@home 24 Dec 19:09:32 Aborting task Theory_2279-804599-198_0: exceeded disk limit: 1938.75MB > 1907.35MB

https://lhcathome.cern.ch/lhcathome/result.php?resultid=256757335
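
For anyone collecting these abort messages from their own event log, here is a minimal Python sketch that pulls the task name and the two sizes out of a line like the one quoted above. The pattern is built only from that single quoted line, so it is an assumption that other client versions word the message the same way; `parse_abort` is a made-up helper name.

```python
import re

# Pattern derived from the one log line quoted above (assumption: the
# client always words the message exactly like this).
ABORT_RE = re.compile(
    r"Aborting task (?P<task>\S+): exceeded disk limit: "
    r"(?P<used>[\d.]+)MB > (?P<limit>[\d.]+)MB"
)

def parse_abort(line):
    """Return (task, used_mb, limit_mb), or None if the line does not match."""
    m = ABORT_RE.search(line)
    if m is None:
        return None
    return m.group("task"), float(m.group("used")), float(m.group("limit"))

line = ("LHC@home 24 Dec 19:09:32 Aborting task Theory_2279-804599-198_0: "
        "exceeded disk limit: 1938.75MB > 1907.35MB")
task, used_mb, limit_mb = parse_abort(line)
# Show how far over the limit the task was, in MB and as a percentage.
print(f"{task}: {used_mb - limit_mb:.2f} MB over the limit "
      f"({used_mb / limit_mb - 1:.1%} above)")
```

In this case the task was aborted while only slightly above the limit, which fits the request for a larger bound.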
ID: 41059 · Report as offensive     Reply Quote
Erich56

Send message
Joined: 18 Dec 15
Posts: 1688
Credit: 103,130,648
RAC: 116,469
Message 41060 - Posted: 24 Dec 2019, 19:03:27 UTC

now we are back to this nonsense:
196 (0x000000C4) EXIT_DISK_LIMIT_EXCEEDED

after 1 day 11 hours 38 min. 57 sec. crunching time. What a waste :-(

https://lhcathome.cern.ch/lhcathome/result.php?resultid=256479088

how come?
ID: 41060 · Report as offensive     Reply Quote
PekkaH

Send message
Joined: 23 Dec 19
Posts: 15
Credit: 30,667,499
RAC: 56,878
Message 41116 - Posted: 30 Dec 2019, 15:03:32 UTC - in response to Message 41060.  

Same here: a large number of Theory app instances are failing.
https://lhcathome.cern.ch/lhcathome/result.php?resultid=256633820
Annoying ....
ID: 41116 · Report as offensive     Reply Quote
Erich56

Send message
Joined: 18 Dec 15
Posts: 1688
Credit: 103,130,648
RAC: 116,469
Message 41117 - Posted: 30 Dec 2019, 15:38:38 UTC

could anyone back there please increase the disk limit, so that failures like the ones described above do not re-occur - thanks a lot!
ID: 41117 · Report as offensive     Reply Quote
Profile Magic Quantum Mechanic
Avatar

Send message
Joined: 24 Oct 04
Posts: 1118
Credit: 49,730,317
RAC: 13,013
Message 41118 - Posted: 30 Dec 2019, 15:44:12 UTC

I have been testing Theory Simulation v5.18 (vbox64_theory) windows_x86_64 for a while now without this particular problem.

The only problem I have is the typical VB needing high-speed internet just to start up the tasks in the first 3 minutes and after that it doesn't matter.

So maybe it is time to move this version over here and see how you all do with these.
ID: 41118 · Report as offensive     Reply Quote
maeax

Send message
Joined: 2 May 07
Posts: 2090
Credit: 158,979,669
RAC: 123,740
Message 41119 - Posted: 30 Dec 2019, 16:25:36 UTC - in response to Message 41118.  
Last modified: 30 Dec 2019, 16:33:26 UTC

Is it possible that this happens only in VM Theory and not in -native?
At the moment I have 450 -native tasks without this error!

Windows Theory has been in -dev since October.
Windows Theory has been in production since November.

We will have to wait until next week.
ID: 41119 · Report as offensive     Reply Quote
Crystal Pellet
Volunteer moderator
Volunteer tester

Send message
Joined: 14 Jan 10
Posts: 1276
Credit: 8,481,858
RAC: 1,977
Message 41121 - Posted: 30 Dec 2019, 17:09:16 UTC - in response to Message 41119.  

Is it possible that this happens only in VM Theory and not in -native?
The rsc_disk_bound of 2,000,000,000 bytes (1907.35 MB) is the same for Theory VBox and native. Both versions are now treated as one application,
but for the VBox version 2,000,000,000 bytes of disk space is tight, especially when the VM state has to be saved to disk while a task is suspended.
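
As a quick check of where the 1907.35 MB in the abort messages comes from, here is a small Python sketch. It assumes the "MB" in the client log means 1024 * 1024 bytes; `bytes_to_mb` is just an illustrative helper name.

```python
RSC_DISK_BOUND = 2_000_000_000  # bytes, the value quoted in this post

def bytes_to_mb(n_bytes):
    """Convert bytes to binary megabytes (1024 * 1024 bytes each)."""
    return n_bytes / (1024 * 1024)

print(f"{bytes_to_mb(RSC_DISK_BOUND):.2f} MB")  # prints "1907.35 MB"
```

So the limit in the log is the same 2,000,000,000 bytes, only expressed in binary megabytes.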
ID: 41121 · Report as offensive     Reply Quote
Profile Magic Quantum Mechanic
Avatar

Send message
Joined: 24 Oct 04
Posts: 1118
Credit: 49,730,317
RAC: 13,013
Message 41122 - Posted: 30 Dec 2019, 21:46:18 UTC

I just got home, so I decided to look through the results for that other version, 5.18, that I have been running. I did see a few of these waste-of-time tasks: they looked like they would be Valid every time I checked them, but ended up like this.

https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=2850450

But there are many, many Valids with over 30 hours of running time and almost the same in CPU time.

So I guess this version does the same if tasks try running that long.

BUT then this one is Valid after more than 76 hours: https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=2849822 (the credits are a different story, as you can see)
ID: 41122 · Report as offensive     Reply Quote
Crystal Pellet
Volunteer moderator
Volunteer tester

Send message
Joined: 14 Jan 10
Posts: 1276
Credit: 8,481,858
RAC: 1,977
Message 41123 - Posted: 31 Dec 2019, 9:55:29 UTC - in response to Message 41122.  
Last modified: 31 Dec 2019, 11:57:08 UTC

I just got home, so I decided to look through the results for that other version, 5.18, that I have been running. I did see a few of these waste-of-time tasks: they looked like they would be Valid every time I checked them, but ended up like this.

https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=2850450
Although it was on the dev server, the tasks come from the same pool. Thanks for pointing to the result. Peak disk usage was 3.48 GB without any snapshot written to disk. Sherpa must have written a lot of logging to the virtual disk for it to grow that big.
At least we now know that the rsc_disk_bound should be at least doubled to 4000000000 bytes, but maybe even that's not enough. I personally lean towards 8000000000.
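
To put rough numbers on those candidates, here is a small Python sketch comparing the 3.48 GB peak with the current bound and the two suggested ones. It assumes the "GB" on the result page means 1024**3 bytes, which is the more pessimistic reading; `headroom` is a made-up helper name.

```python
GIB = 1024 ** 3  # assumption: "GB" on the result page means 1024**3 bytes

peak_bytes = 3.48 * GIB  # peak disk usage reported for the failed task

# Current bound, followed by the two values suggested in this post.
candidate_bounds = [2_000_000_000, 4_000_000_000, 8_000_000_000]

def headroom(bound, peak):
    """Fraction of the bound left unused at the peak (negative: exceeded)."""
    return (bound - peak) / bound

for bound in candidate_bounds:
    h = headroom(bound, peak_bytes)
    verdict = "fits" if h >= 0 else "exceeded"
    print(f"bound {bound:>13,d} bytes: {verdict}, headroom {h:+.1%}")
```

Under that assumption the doubled bound would cover this task with only a few percent to spare, and that is before any VM snapshot is written to disk.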

EDIT: Magic, I fetched the retry for that task: https://lhcathomedev.cern.ch/lhcathome-dev/workunit.php?wuid=1963768

===> [runRivet] Tue Dec 31 11:03:14 UTC 2019 [boinc ee zhad 43.6 - - sherpa 2.2.5 default 2000 197]

We'll see how it goes. I've already seen a lot of

Poincare::Poincare(): Inaccurate rotation {
a = (0.536749,-0.680524,-0.498785)
b = (0,0,1)
a' = (0.46357,0.714657,0.523803) -> rel. dev. (inf,inf,-0.476197)
m_ct = -0.498785
m_st = -0.866725
m_n = (0,-4.85017e-07,6.61739e-07)
}


during full optimization phase.
Meanwhile:
integration time: ( 28m 28s elapsed / 174d 11h 27m 56s left ) [11:54:39]
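
To see what that estimate means in terms of progress, here is a minimal Python sketch that converts the two durations in the line above to seconds. The parsing is based on this one line only, so the exact format is an assumption; `to_seconds` is an illustrative helper.

```python
import re

UNIT_SECONDS = {"d": 86400, "h": 3600, "m": 60, "s": 1}

def to_seconds(text):
    """Sum a duration such as '174d 11h 27m 56s' into seconds."""
    return sum(int(n) * UNIT_SECONDS[u]
               for n, u in re.findall(r"(\d+)([dhms])", text))

line = "integration time: ( 28m 28s elapsed / 174d 11h 27m 56s left ) [11:54:39]"
# Capture the text before "elapsed" and the text before "left".
m = re.search(r"\(\s*(.+?) elapsed / (.+?) left\s*\)", line)
elapsed = to_seconds(m.group(1))
left = to_seconds(m.group(2))
print(f"elapsed {elapsed} s, estimated remaining {left} s")
print(f"estimated progress: {elapsed / (elapsed + left):.4%}")
```

Going by Sherpa's own estimate at that point, the integration had barely started.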
ID: 41123 · Report as offensive     Reply Quote
Crystal Pellet
Volunteer moderator
Volunteer tester

Send message
Joined: 14 Jan 10
Posts: 1276
Credit: 8,481,858
RAC: 1,977
Message 41126 - Posted: 31 Dec 2019, 13:31:39 UTC

@Magic: Another one of yours: ===> [runRivet] Wed Dec 25 10:07:54 UTC 2019 [boinc ee zhad 200 - - sherpa 2.2.4 default 3000 198]

https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=2851401 -- Peak disk usage 2.98 GB

And another one: ===> [runRivet] Wed Dec 25 12:20:40 UTC 2019 [boinc ee zhad 29 - - sherpa 2.2.4 default 2000 198]

https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=2851452 -- Peak disk usage 2.92 GB
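
For tallying which jobs hit the limit, here is a minimal Python sketch that splits the bracketed job description of the runRivet lines quoted in this thread. Reading tokens 6 and 7 as generator name and version is an assumption based only on these three lines, and `parse_runrivet` is a made-up helper name.

```python
from collections import Counter

def parse_runrivet(line):
    """Split the last [...] group of a runRivet line into tokens."""
    inner = line[line.rindex("[") + 1 : line.rindex("]")]
    return inner.split()

lines = [
    "===> [runRivet] Tue Dec 31 11:03:14 UTC 2019 [boinc ee zhad 43.6 - - sherpa 2.2.5 default 2000 197]",
    "===> [runRivet] Wed Dec 25 10:07:54 UTC 2019 [boinc ee zhad 200 - - sherpa 2.2.4 default 3000 198]",
    "===> [runRivet] Wed Dec 25 12:20:40 UTC 2019 [boinc ee zhad 29 - - sherpa 2.2.4 default 2000 198]",
]

tally = Counter()
for line in lines:
    tokens = parse_runrivet(line)
    # Assumption: tokens 6 and 7 hold the generator name and its version.
    generator, version = tokens[6], tokens[7]
    tally[(generator, version)] += 1

for (generator, version), count in sorted(tally.items()):
    print(f"{generator} {version}: {count} task(s)")
```

All three jobs quoted so far are "ee zhad" runs with sherpa.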
ID: 41126 · Report as offensive     Reply Quote
Crystal Pellet
Volunteer moderator
Volunteer tester

Send message
Joined: 14 Jan 10
Posts: 1276
Credit: 8,481,858
RAC: 1,977
Message 41165 - Posted: 5 Jan 2020, 8:23:27 UTC

LHC@home 05 Jan 08:40:38 Aborting task Theory_2279-794231-202_0: exceeded disk limit: 1945.56MB > 1907.35MB

https://lhcathome.cern.ch/lhcathome/result.php?resultid=257666299
ID: 41165 · Report as offensive     Reply Quote
Crystal Pellet
Volunteer moderator
Volunteer tester

Send message
Joined: 14 Jan 10
Posts: 1276
Credit: 8,481,858
RAC: 1,977
Message 41235 - Posted: 10 Jan 2020, 15:28:55 UTC

LHC@home 10 Jan 16:24:35 Aborting task Theory_2363-897726-14_0: exceeded disk limit: 1964.13MB > 1907.35MB

https://lhcathome.cern.ch/lhcathome/result.php?resultid=258964496
ID: 41235 · Report as offensive     Reply Quote
Crystal Pellet
Volunteer moderator
Volunteer tester

Send message
Joined: 14 Jan 10
Posts: 1276
Credit: 8,481,858
RAC: 1,977
Message 41236 - Posted: 10 Jan 2020, 15:48:03 UTC

LHC@home 10 Jan 16:44:38 Aborting task Theory_2363-916251-14_0: exceeded disk limit: 2101.13MB > 1907.35MB

https://lhcathome.cern.ch/lhcathome/result.php?resultid=258964499
ID: 41236 · Report as offensive     Reply Quote
Erich56

Send message
Joined: 18 Dec 15
Posts: 1688
Credit: 103,130,648
RAC: 116,469
Message 41238 - Posted: 10 Jan 2020, 17:13:58 UTC

it looks like the Theory tasks have a bad run these days :-(
ID: 41238 · Report as offensive     Reply Quote