Message boards : Number crunching : Current work - run times

adrianxw
Joined: 29 Sep 04
Posts: 187
Credit: 705,487
RAC: 0
Message 46890 - Posted: 15 Jun 2022, 5:46:58 UTC
Last modified: 15 Jun 2022, 5:48:34 UTC

Some, but not all, of the work units that have arrived recently have very long run times. This machine (a 4 GHz i7), for example, has units running for 17 to 35 hours. The longest is showing 13.9% elapsed, and the remaining time is reducing slowly. They will probably complete before the deadline, just, but a slower machine has little chance. Are these jobs running normally, or are they wasting what amounts to weeks of work?

Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
ID: 46890
Crystal Pellet
Volunteer moderator
Volunteer tester
Joined: 14 Jan 10
Posts: 1422
Credit: 9,484,585
RAC: 852
Message 46894 - Posted: 15 Jun 2022, 8:10:04 UTC

I suppose you are referring to Theory tasks.
According to the server status page, the run times of the last 100 tasks are currently between 0.01 and 152.91 hours.
The % progress shown by BOINC is useless.
If you want to see the real progress, use the VM Console with the keystroke ALT-F2.
ID: 46894
Harri Liljeroos
Joined: 28 Sep 04
Posts: 732
Credit: 49,373,095
RAC: 13,741
Message 46895 - Posted: 15 Jun 2022, 9:33:55 UTC

Here you can see history of average run times (and other stuff): https://grafana.kiska.pw/d/boinc/boinc?orgId=1&var-project=lhc%40home&from=now-7d&to=now
ID: 46895
adrianxw
Joined: 29 Sep 04
Posts: 187
Credit: 705,487
RAC: 0
Message 46896 - Posted: 15 Jun 2022, 9:45:10 UTC
Last modified: 15 Jun 2022, 9:49:40 UTC

Fine. I've not done anything with them, they are happily running. Just asking.

My comment about the deadline and slower machines still applies of course.

<edit>
0.01 hours
That is quite a machine, what is it?
</edit>

Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
ID: 46896
adrianxw
Joined: 29 Sep 04
Posts: 187
Credit: 705,487
RAC: 0
Message 46905 - Posted: 17 Jun 2022, 16:01:29 UTC
Last modified: 17 Jun 2022, 16:05:50 UTC

>>> I've not done anything with them

I have now though...

Status: Error while computing · Run time (sec): 246,544.34 · CPU time (sec): 637.94 · Credit: --- · Application: Theory Simulation v300.06 (vbox64_theory) windows_x86_64

... almost a quarter of a million seconds of run time, 700 seconds of CPU, and the work unit crashes out. I really like the CERN centre and the projects of the LHC, but the jobs submitted for crunching through BOINC are absolute cr**.

No new tasks set, AGAIN, project will be deleted from my portfolio AGAIN when they are finished. Professional dealing with rank amateurs.

Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
ID: 46905
maeax
Joined: 2 May 07
Posts: 2244
Credit: 173,902,375
RAC: 307
Message 46906 - Posted: 17 Jun 2022, 17:17:05 UTC - in response to Message 46905.  

Feel free to handle this your own way, but
the BOINC project is only a small part of CERN IT.
Of course, there are some tasks with run time problems.
ID: 46906
adrianxw
Joined: 29 Sep 04
Posts: 187
Credit: 705,487
RAC: 0
Message 46907 - Posted: 17 Jun 2022, 18:00:34 UTC - in response to Message 46906.  

There is no excusing this. They seem happy to act as if the crunchers are THEIR resource. That time THEY have wasted could be HIGHLY significant to other project scientists, who ARE careful with their applications. When the last units dumb out, crash, submit junk or whatever, the project gets deleted again, enough. This project has a real BAD taste.

Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
ID: 46907
Crystal Pellet
Volunteer moderator
Volunteer tester
Joined: 14 Jan 10
Posts: 1422
Credit: 9,484,585
RAC: 852
Message 46908 - Posted: 17 Jun 2022, 19:52:37 UTC
Last modified: 17 Jun 2022, 19:53:27 UTC

I don't know why you let a task run for 2.5 or even 3.5 days without it using any CPU time.
It should be no surprise that something is wrong.
If you don't want to investigate what's wrong, or don't have the time, I would abort such a task ASAP.
In my first reply, I told you how you could follow the progress of a task.

Looking at those failed tasks, I noticed a network issue between your host and CERN: Probing /cvmfs/grid.cern.ch... Failed!

I agree with you that the program should stop by itself, or retry a few times and then stop the task if it still has no success.
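
For anyone who wants a quick way to spot this kind of task, here is a minimal sketch that expresses CPU time as a percentage of elapsed time, using the figures from the failed task reported above. The function names and the 5% threshold are assumptions made for illustration only; they are not part of BOINC or the project, and elapsed time is assumed to be greater than zero.

```shell
#!/usr/bin/env bash
# Sketch: compare CPU time with elapsed (wall-clock) time for one task.
# The 5% threshold is an arbitrary illustration, not a project rule.

cpu_efficiency() {
    # $1 = elapsed seconds (must be > 0), $2 = CPU seconds
    awk -v run="$1" -v cpu="$2" 'BEGIN { printf "%.2f\n", 100 * cpu / run }'
}

looks_stalled() {
    # Exit status 0 when CPU time is below 5% of elapsed time.
    awk -v run="$1" -v cpu="$2" 'BEGIN { exit !(run > 0 && cpu / run < 0.05) }'
}

# Figures from the failed task reported earlier in this thread.
eff=$(cpu_efficiency 246544.34 637.94)
if looks_stalled 246544.34 637.94; then
    echo "CPU efficiency ${eff}%: task looks stalled"
fi
```

With those figures the ratio comes out at roughly 0.26%, which is why a task like this is a candidate for aborting rather than waiting for the deadline.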
ID: 46908
adrianxw
Joined: 29 Sep 04
Posts: 187
Credit: 705,487
RAC: 0
Message 46909 - Posted: 18 Jun 2022, 6:41:51 UTC

I visited the other machine, aborted all LHC tasks and removed the project from its portfolio. BOINC is "install and forget"; it should not be a requirement to monitor the progress, or otherwise, of individual tasks.

Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
ID: 46909
maeax
Joined: 2 May 07
Posts: 2244
Credit: 173,902,375
RAC: 307
Message 46910 - Posted: 18 Jun 2022, 7:41:39 UTC - in response to Message 46909.  

Please be so kind as to remove your avatar.
It makes a bad impression on us in your messages.
ID: 46910
adrianxw
Joined: 29 Sep 04
Posts: 187
Credit: 705,487
RAC: 0
Message 46911 - Posted: 18 Jun 2022, 8:03:55 UTC - in response to Message 46910.  

My avatar? I have no idea what you are talking about, please explain.

Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
ID: 46911
maeax
Joined: 2 May 07
Posts: 2244
Credit: 173,902,375
RAC: 307
Message 46912 - Posted: 18 Jun 2022, 8:49:22 UTC - in response to Message 46911.  
Last modified: 18 Jun 2022, 9:48:34 UTC

Thank you
ID: 46912
adrianxw
Joined: 29 Sep 04
Posts: 187
Credit: 705,487
RAC: 0
Message 46913 - Posted: 18 Jun 2022, 9:18:05 UTC - in response to Message 46912.  
Last modified: 18 Jun 2022, 9:22:00 UTC

Ah, you mean the link in my sig. That USED to be a link to my team's website, which was a typical BOINC team site, with a few pages of stats and stuff. The host was a team member. One day he vanished, as did our site. I'd done a lot of work on the stats pages for the site, so it was very annoying. It now seems to have been re-purposed as a travel guide to Holland! I'll remove it.

<edit>
It seems to have gone now.
</edit>

Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
ID: 46913
adrianxw
Joined: 29 Sep 04
Posts: 187
Credit: 705,487
RAC: 0
Message 47518 - Posted: 12 Nov 2022, 15:50:09 UTC

I just tried the project again and crunched a couple of units, but then another came that ran for 9 days before erroring out. There is an ongoing fault here.

Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
ID: 47518
hadron
Joined: 4 Sep 22
Posts: 92
Credit: 16,008,656
RAC: 8,102
Message 47519 - Posted: 13 Nov 2022, 0:16:20 UTC - in response to Message 46894.  

>>> ....
>>> The % progress shown by BOINC is useless.
>>> If you want to see the real progress use the VM Console with keystroke ALT-F2

I have yet to figure out how to access the VM console with my configuration.
Boinc is running as a systemd process, as user "boinc" (since it is a system process, it has no login shell). The VMs run fine, but the boinc-manager gui is running under my personal account. I suspect this is the reason why I can't get into the VM Console, but there has to be a way around this. So far, no success in figuring that out.
ID: 47519
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2541
Credit: 254,608,838
RAC: 23,290
Message 47520 - Posted: 13 Nov 2022, 0:44:09 UTC - in response to Message 47519.  

1 out of a couple of methods.
This works without Vbox extensions being installed.


From your personal account run
su boinc (+ enter the pw)

then
cd ~; VirtualBox >/dev/null 2>&1

Click on the VM you want to examine, then on "show".
ID: 47520
hadron
Joined: 4 Sep 22
Posts: 92
Credit: 16,008,656
RAC: 8,102
Message 47522 - Posted: 13 Nov 2022, 2:49:08 UTC - in response to Message 47520.  

>>> 1 out of a couple of methods.
>>> This works without Vbox extensions being installed.
>>>
>>> From your personal account run
>>> su boinc (+ enter the pw)
>>>
>>> then
>>> cd ~; VirtualBox >/dev/null 2>&1
>>>
>>> Click on the VM you want to examine, then on "show".

As a system process, Boinc's login shell is /sbin/nologin and it doesn't have a password -- when the boinc-client is started, systemd changes the process UID to "boinc" and sets its home to /var/lib/boinc. Therefore, su doesn't offer any solution. I could, I suppose, change the shell to bash and set a password, but I really would like to avoid this. Is there some setting in the boinc-manager config that might achieve the same end?
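
One possible route that leaves the account without a shell or password is sketched below. It rests on assumptions that are not confirmed anywhere in this thread: that sudo is installed and your personal account may use it, that the service account is called "boinc" with /var/lib/boinc as its home, and that the VMs are registered under that account's VirtualBox configuration. The helper name as_boinc and the DRY_RUN switch are made up for illustration. If the systemd unit sandboxes the service (private /tmp and the like), VBoxManage started this way may not see the running VMs at all.

```shell
#!/usr/bin/env bash
# Sketch: run VBoxManage as the "boinc" service account without giving
# that account a login shell or a password.
# Assumptions: sudo is available to your own account, and the VMs are
# registered under the boinc account's VirtualBox configuration.

BOINC_USER="${BOINC_USER:-boinc}"   # account name as described in the thread
DRY_RUN="${DRY_RUN:-1}"             # 1 = only print the command, 0 = run it

as_boinc() {
    # sudo -u executes the command directly, so /sbin/nologin is never started;
    # -H sets HOME to the target account's home directory.
    if [ "$DRY_RUN" = "1" ]; then
        echo "sudo -u $BOINC_USER -H $*"
    else
        sudo -u "$BOINC_USER" -H "$@"
    fi
}

# List the VMs currently running under that account.
as_boinc VBoxManage list runningvms
```

Run it with DRY_RUN=0 to execute the command instead of printing it. If the VMs do show up, "VBoxManage controlvm <vm name> screenshotpng <file>" run the same way should capture the current console screen to a PNG, which is not interactive like ALT-F2 but at least shows what the VM is displaying.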
ID: 47522
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2541
Credit: 254,608,838
RAC: 23,290
Message 47523 - Posted: 13 Nov 2022, 6:30:11 UTC - in response to Message 47522.  

Looks like you know exactly why a login doesn't work on your system.
You also describe what would be necessary to solve it.
It's your system, just do it.

If you prefer another method that would require changes to the boinc service or elsewhere, just do it.
ID: 47523
hadron
Joined: 4 Sep 22
Posts: 92
Credit: 16,008,656
RAC: 8,102
Message 47531 - Posted: 13 Nov 2022, 18:38:15 UTC - in response to Message 47523.  

>>> Looks like you know exactly why a login doesn't work on your system.
>>> You also describe what would be necessary to solve it.
>>> It's your system, just do it.

There is a very good reason why a system-level service doesn't have a login shell: giving it one would open up a potential security hole in the system. That is why I don't want to go down this road.

>>> If you prefer another method that would require changes to the boinc service or elsewhere, just do it.

When you wrote "1 out of a couple of methods", I thought you meant you had two and were just giving me the one that (on the surface) seemed most promising.
ID: 47531
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2541
Credit: 254,608,838
RAC: 23,290
Message 47534 - Posted: 14 Nov 2022, 8:15:52 UTC - in response to Message 47531.  

Some links that may give you an impression of how complex the interaction of BOINC client, BOINC Manager and VirtualBox can be and how long it takes between the first comment and a solution.
https://github.com/BOINC/boinc/issues/3105
https://github.com/BOINC/boinc/issues/3355
ID: 47534


©2024 CERN