21) Message boards : Number crunching : Error while computing/too many errors (Message 34093)
Posted 26 Jan 2018 by Profile adrianxw
Post:
Picked up 3 recent "Error while computing" errors again.
22) Message boards : News : Short interruptions Tuesday (Message 33880)
Posted 16 Jan 2018 by Profile adrianxw
Post:
Best of luck.
23) Message boards : Number crunching : Error while computing/too many errors (Message 33123)
Posted 23 Nov 2017 by Profile adrianxw
Post:
>>> The filename or extension is too long.
>>> (0xce) - exit code 206 (0xce)

I'm seeing a few exits again now. Saw that in the error log.
24) Message boards : Number crunching : Error while computing/too many errors (Message 33055)
Posted 13 Nov 2017 by Profile adrianxw
Post:
I had a small group of "Error while computing" last weekend: an LHCb, a CMS and two Theory Sims.
25) Message boards : Number crunching : VM Hypervisor (Message 32471)
Posted 19 Sep 2017 by Profile adrianxw
Post:
The job completed and uploaded overnight, and has been flagged as valid and credited.
26) Message boards : Number crunching : VM Hypervisor (Message 32468)
Posted 18 Sep 2017 by Profile adrianxw
Post:
The job that was stuck has continued to run here after the fix and is now 30.444% complete. It APPEARS to have corrected some issue, but I wouldn't guarantee it. Try it again with an updated VirtualBox, but keep an eye on it. It is perhaps an idea to keep hold of the older version's installer so you can go back if necessary. I have not updated my other machine here yet, but then I have not seen the problem on it, although that machine is almost identical to this one in both hardware and software.
27) Message boards : Number crunching : VM Hypervisor (Message 32465)
Posted 18 Sep 2017 by Profile adrianxw
Post:
It is the first time I have seen it, but looking around, it has been seen by others at other projects. The updated VirtualBox APPEARS to have fixed it for me: the task that was stopped is running and is currently showing 25.402% done.
28) Message boards : Number crunching : VM Hypervisor (Message 32463)
Posted 18 Sep 2017 by Profile adrianxw
Post:
... but it has. I was fiddling with this to see what or where the problem was, which inevitably had me stop and start the system a few times. Now, I can see that the job IS running again, and has advanced to 13.207%, so perhaps the VirtualBox update, followed by a restart, is a fix.
29) Message boards : Number crunching : VM Hypervisor (Message 32462)
Posted 18 Sep 2017 by Profile adrianxw
Post:
I first noticed this yesterday morning, so it has already been at least 24 hours since whatever happened, happened. It was quiet, so I rebooted the system, which would, of course, stop and start BOINC. Upon restarting, the job started running again, but no new progress was shown (it is still at 12.666%), and after about 10 minutes it went back into that strange state, where it is currently sitting. The elapsed time dropped back a bit from the checkpoint load, but it has advanced to 2:33:06, almost identical to the time at which this happened before. "Fixes" have involved some chmod and chown commands, but I am not running Linux.

It is a quorum and replication of 1 so I cannot see if a wingman has got anywhere with it.

I've not seen that status before.

<edit>
A net search has shown this same status showing up at other projects, Cosmology and RNA for example. Suggested fixes include updating VirtualBox (to 5.1.28), which I and others have tried, but it has not helped yet.
30) Message boards : Number crunching : VM Hypervisor (Message 32456)
Posted 18 Sep 2017 by Profile adrianxw
Post:
I have a job on this system that is, and has been for at least the last 24 hours, in a state that is unusual to me. Its status is shown as...

Postponed: VM Hypervisor failed to enter an online state in a timely fashion.

... okay, so is this something that will, at some point, enter an online state, or is the job just hung?

The work unit is...

CMS_24868_1505498727.617335_0

... of application...

CMS Simulation 47.60 (vbox64)

... running here under Windows 8.1 x64. It has had 2:33:05 CPU and is showing 12.666% complete.
31) Message boards : Number crunching : Disk space. (Message 31723)
Posted 30 Jul 2017 by Profile adrianxw
Post:
Understood.
32) Message boards : Number crunching : Disk space. (Message 31718)
Posted 30 Jul 2017 by Profile adrianxw
Post:
I happened to look at the Disk tab in BOINC Manager. Most projects had a couple of hundred KB. I was somewhat surprised to see the usage of LHC. My system, right now, has just a single LHC@Home work unit on it, yet the LHC disk usage is 7.64 GB. Is this actually in use? Are these files base data for applications? Am I seeing a failure to clean up properly?
33) Message boards : Number crunching : ATLAS Simulation. (Message 31082)
Posted 26 Jun 2017 by Profile adrianxw
Post:
>>> Who/What will deny it?

Nobody is trying to deny you anything. My gripe is that it seemed to be assumed that everyone would want to be like that. Like probably 99% of BOINC users, I am attached to multiple projects. That makes sense: sometimes a server is down, or job generation is interrupted, etc. It always makes sense to run a group of different projects from different servers. The alternative can be periods of running nothing, just consuming electricity for no reason.

>>> Why not?

Because on systems with multiple projects attached, running at comparable quota levels, the scheduler doesn't do that. You tell me how or why it should.
34) Message boards : Number crunching : ATLAS Simulation. (Message 31079)
Posted 26 Jun 2017 by Profile adrianxw
Post:
As you can see, I have already done that by changing it to use at most 25% of the CPUs. It should only use 2 cores; I can live with that.
35) Message boards : Number crunching : ATLAS Simulation. (Message 31073)
Posted 26 Jun 2017 by Profile adrianxw
Post:
>>> Example:
>>> 8 1-core apps request 8x3.4 GB => 27.2 GB => needs a computer with 32 GB RAM
>>> 2 4-core apps request 2x5.8 GB => 11.6 GB => runs on a computer with 16 GB RAM

No. You are making a totally incorrect comparison here: you are assuming that the eight 1-core apps are all running at the same time. With a regular BOINC setup with a collection of projects running at the same priority, that would not happen.
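
To make the disagreement concrete, here is a minimal sketch of the arithmetic. The per-task figures (3.4 GB and 5.8 GB) come from the quoted example; the helper name `peak_ram_gb` and the case of only two single-core tasks running at once are assumptions for illustration, not anything the project or BOINC reports.

```python
# Hypothetical helper: peak RAM if `concurrent` tasks of one app run at once.
def peak_ram_gb(concurrent: int, gb_per_task: float) -> float:
    return concurrent * gb_per_task

# Figures from the quoted example: 3.4 GB per 1-core task, 5.8 GB per 4-core task.
all_eight_single = peak_ram_gb(8, 3.4)  # quoted case: all eight 1-core tasks at once
two_multicore = peak_ram_gb(2, 5.8)     # quoted case: two 4-core tasks at once

# Assumed for illustration: other projects occupy most cores, so only
# two 1-core tasks from this project run at the same time.
two_single = peak_ram_gb(2, 3.4)

print(f"{all_eight_single:.1f} GB, {two_multicore:.1f} GB, {two_single:.1f} GB")
```

The quoted comparison only holds if all eight single-core tasks really do run together; with fewer running concurrently, the single-core total shrinks in proportion.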

I have the checklist open in another tab and have been working my way through it.

I have limited the project to 2 cores, at least, I believe I have understood the control mechanism sufficiently to have done that. I'll see. The ATLAS job class is re-enabled.

>>> Use at most 25% of the CPUs
36) Message boards : Number crunching : ATLAS Simulation. (Message 31041)
Posted 25 Jun 2017 by Profile adrianxw
Post:
Sorry, it would not let me edit my earlier post.

Why does the project make a multithreaded app? The deadlines are fairly long, so it can't be a time issue. The way the app works, there are times when it cannot possibly use all the resources it has been allocated, therefore wasting them and denying them to other projects. There are folk out there who see the issue I've seen and, rather than resolving it, just drop the project.
37) Message boards : Number crunching : ATLAS Simulation. (Message 31038)
Posted 25 Jun 2017 by Profile adrianxw
Post:
I have changed the project to use at most 25% of the CPUs. I'll watch that and see if it is acceptable.
38) Message boards : Number crunching : ATLAS Simulation. (Message 31011)
Posted 24 Jun 2017 by Profile adrianxw
Post:
You might not notice any difference, but I might. The example I gave earlier in the thread, which you quote, illustrates a difference I want to find out more about: the task runs for an hour on 8 CPUs, stopping all other projects. Does the work scheduler class that as an hour or as eight? I can think of numerous situations which I would like to know more about.
39) Message boards : Number crunching : ATLAS Simulation. (Message 31006)
Posted 24 Jun 2017 by Profile adrianxw
Post:
That sounds good. I'll look for that; maybe I can continue the project.
40) Message boards : Number crunching : ATLAS Simulation. (Message 30962)
Posted 23 Jun 2017 by Profile adrianxw
Post:
I have only removed ATLAS from my sub-projects list, so far. I need to find out more about how BOINC handles multithreaded applications: for example, whether it treats a task running on 8 cores for an hour the same as a task running on one core for 8 hours.
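
As a point of reference for that question, here is a minimal sketch of the two cases measured in core-hours. The function `core_hours` is a hypothetical name used only for illustration; this is plain arithmetic and says nothing about how the BOINC scheduler actually accounts for multithreaded tasks.

```python
# Hypothetical helper: CPU time consumed, in core-hours.
def core_hours(cores: int, wall_hours: float) -> float:
    return cores * wall_hours

wide = core_hours(8, 1)    # one task on 8 cores for 1 hour
narrow = core_hours(1, 8)  # one task on 1 core for 8 hours

# Equal in core-hours, although the wall-clock time differs by a factor of 8.
print(wide, narrow, wide == narrow)
```

If the scheduler counted wall-clock time rather than core-hours, the 8-core task would be charged an eighth of what the single-core task is, which is exactly the difference in question.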