1) Message boards : CMS Application : CMS@Home difficulties in attempts to prepare for multi-core jobs (Message 50055)
Posted 12 hours ago by Erich56
Post:
maeax wrote:
...
Run time 6 hours 59 min 28 sec
CPU time 1 day 0 hours 58 min 31 sec
Validate state Valid
Credit points: 1,101.22

Excerpt from the task that colleague tazzduke, a few postings above, finished this morning:

Run time 14 hours 14 min 48 sec
CPU time 1 day 22 hours 41 min 55 sec
Validate state Valid
Credit points: 31.21
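
To put the two results above on a common scale, one can compute credit per CPU hour. This is only a minimal sketch using the figures quoted in this post; the helper names `to_seconds` and `credit_per_cpu_hour` are made up for illustration and are not part of BOINC.

```python
def to_seconds(days=0, hours=0, minutes=0, seconds=0):
    """Convert a days/hours/minutes/seconds duration to seconds."""
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def credit_per_cpu_hour(credit, cpu_seconds):
    """Granted credit divided by CPU time expressed in hours."""
    return credit / (cpu_seconds / 3600)


# Figures copied from the two task excerpts quoted above.
tasks = {
    "maeax": {
        "run": to_seconds(hours=6, minutes=59, seconds=28),
        "cpu": to_seconds(days=1, hours=0, minutes=58, seconds=31),
        "credit": 1101.22,
    },
    "tazzduke": {
        "run": to_seconds(hours=14, minutes=14, seconds=48),
        "cpu": to_seconds(days=1, hours=22, minutes=41, seconds=55),
        "credit": 31.21,
    },
}

for name, t in tasks.items():
    rate = credit_per_cpu_hour(t["credit"], t["cpu"])
    busy_cores = t["cpu"] / t["run"]  # average number of cores kept busy
    print(f"{name}: {rate:.2f} credit per CPU hour, "
          f"{busy_cores:.2f} cores busy on average")
```

On these numbers the first task earned roughly 44 credit points per CPU hour and the second less than 1, although both were marked valid.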
2) Message boards : CMS Application : CMS@Home difficulties in attempts to prepare for multi-core jobs (Message 50044)
Posted 1 day ago by Erich56
Post:
Grabbed a 4 core multi, just now, to test my setup.
You were lucky: there was obviously just a short period around 8 a.m. when jobs were available.
From what I can see, the task is still running.
3) Message boards : CMS Application : CMS@Home difficulties in attempts to prepare for multi-core jobs (Message 49976)
Posted 8 days ago by Erich56
Post:
No jobs for several hours, but the automatic stop of task distribution does not seem to work :-(
As a result, thousands of useless tasks are being uploaded after about half an hour of runtime, without any results for the science :-(
4) Message boards : CMS Application : CMS@Home difficulties in attempts to prepare for multi-core jobs (Message 49892)
Posted 23 days ago by Erich56
Post:
the queue ran dry :-(
5) Message boards : CMS Application : CMS@Home difficulties in attempts to prepare for multi-core jobs (Message 49841)
Posted 25 Mar 2024 by Erich56
Post:
thanks, computezrmle, for your thorough explanation :-)

When you say
"I expect the singlecore CMS on prod will be replaced by a multicore app"

I hope this will mean the same as for ATLAS, i.e. we volunteers can choose between 1 and n cores per task.
6) Message boards : CMS Application : CMS@Home difficulties in attempts to prepare for multi-core jobs (Message 49836)
Posted 25 Mar 2024 by Erich56
Post:
what's going wrong?
The multi core CMS is tested on the development system only, afaik ...
... and don't use app_config.xml for CMS.
no, I did not use app_config.xml; I made the setting for 2 cores on the web page. So the multicore tasks seem to be offered in the -dev system only, okay.

But even the usual 1-core tasks are (still) not working at this point; apparently no jobs are available. Why, then, does the project status page show the usual number of "unsent" tasks (close to 200)? Could it be that this includes the test multicore tasks from the -dev system? Or does the automatic stop of task distribution when no jobs are available not work?

To me, everything seems to be a little weird at the moment :-(
7) Message boards : CMS Application : CMS@Home difficulties in attempts to prepare for multi-core jobs (Message 49833)
Posted 25 Mar 2024 by Erich56
Post:
I changed my prefs to 1 task and max 2 CPUs.
The task created a dual core VM with 2792 MB memory.
After about 5 minutes a cmsRun appeared using up to 100% CPU and after another 2 minutes cmsRun started using up to 200% CPU.

First test task: https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=3310818
I tried the same prefs that CP describes above, but without success: no dual-core VM was created :-(
see here: https://lhcathome.cern.ch/lhcathome/result.php?resultid=408217329

what's going wrong?
8) Message boards : CMS Application : CMS@Home difficulties in attempts to prepare for multi-core jobs (Message 49829)
Posted 24 Mar 2024 by Erich56
Post:
... but I have single-core jobs in the queues.
Ivan, either they are used up already, or something else is going wrong.
My hosts downloaded new tasks, but they don't work :-(
Ivan, what's the current status on single-core jobs? Obviously, none available at this point :-(
9) Message boards : CMS Application : CMS@Home difficulties in attempts to prepare for multi-core jobs (Message 49825)
Posted 24 Mar 2024 by Erich56
Post:
... but I have single-core jobs in the queues.
Ivan, either they are used up already, or something else is going wrong.
My hosts downloaded new tasks, but they don't work :-(
10) Message boards : CMS Application : no new WUs available (Message 49778)
Posted 16 Mar 2024 by Erich56
Post:
new new tasks since last night :-(
sorry, should read "NO new tasks ..."
11) Message boards : CMS Application : no new WUs available (Message 49777)
Posted 16 Mar 2024 by Erich56
Post:
new new tasks since last night :-(
12) Message boards : CMS Application : since about 2 hours: all tasks failing after few minutes (SOLVED) (Message 49768)
Posted 13 Mar 2024 by Erich56
Post:
This morning CMS is running OK for me.
I re-started CMS about 1 hour ago, it's working fine now :-)
13) Message boards : CMS Application : since about 2 hours: all tasks failing after few minutes (SOLVED) (Message 49766)
Posted 12 Mar 2024 by Erich56
Post:
It should be back now.
but still not working:
"Could not get an x509 credential":

2024-03-12 19:58:11 (3552): Guest Log: [INFO] Reading volunteer information
2024-03-12 19:58:15 (3552): Guest Log: [INFO] Requesting an X509 credential from LHC@home
2024-03-12 19:58:16 (3552): Guest Log: [INFO] Requesting an idtoken from LHC@home
2024-03-12 19:58:17 (3552): Guest Log: [INFO] Requesting an idtoken from vLHC@home-dev
2024-03-12 19:58:47 (3552): Guest Log: [INFO] Requesting an idtoken from LHC@home
2024-03-12 19:58:48 (3552): Guest Log: [INFO] Requesting an idtoken from vLHC@home-dev
2024-03-12 19:59:18 (3552): Guest Log: [INFO] Requesting an idtoken from LHC@home
2024-03-12 19:59:19 (3552): Guest Log: [INFO] Requesting an idtoken from vLHC@home-dev
2024-03-12 19:59:49 (3552): Guest Log: [INFO] Requesting an idtoken from LHC@home
2024-03-12 19:59:50 (3552): Guest Log: [INFO] Requesting an idtoken from vLHC@home-dev
2024-03-12 20:00:20 (3552): Guest Log: [INFO] Requesting an idtoken from LHC@home
2024-03-12 20:00:21 (3552): Guest Log: [INFO] Requesting an idtoken from vLHC@home-dev
2024-03-12 20:00:53 (3552): Guest Log: [INFO] Requesting an idtoken from LHC@home
2024-03-12 20:00:55 (3552): Guest Log: [INFO] Requesting an idtoken from vLHC@home-dev
2024-03-12 20:01:30 (3552): Guest Log: [DEBUG] % Total % Received % Xferd Average Speed Time Time Time Current
2024-03-12 20:01:30 (3552): Guest Log: Dload Upload Total Spent Left Speed
2024-03-12 20:01:30 (3552): Guest Log: 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
2024-03-12 20:01:30 (3552): Guest Log: 0 0 0 0 0 0 0 0 --:--:-- 0:00:01 --:--:-- 0
2024-03-12 20:01:30 (3552): Guest Log: 0 0 0 0 0 0 0 0 --:--:-- 0:00:01 --:--:-- 0
2024-03-12 20:01:30 (3552): Guest Log: 100 196 100 196 0 0 92 0 0:00:02 0:00:02 --:--:-- 92
2024-03-12 20:01:30 (3552): Guest Log: [DEBUG] % Total % Received % Xferd Average Speed Time Time Time Current
2024-03-12 20:01:30 (3552): Guest Log: Dload Upload Total Spent Left Speed
2024-03-12 20:01:30 (3552): Guest Log: 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
2024-03-12 20:01:30 (3552): Guest Log: 0 0 0 0 0 0 0 0 --:--:-- 0:00:01 --:--:-- 0
2024-03-12 20:01:30 (3552): Guest Log: 0 0 0 0 0 0 0 0 --:--:-- 0:00:01 --:--:-- 0
2024-03-12 20:01:30 (3552): Guest Log: 100 196 100 196 0 0 92 0 0:00:02 0:00:02 --:--:-- 92
2024-03-12 20:01:30 (3552): Guest Log: [ERROR] Could not get an x509 credential
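
A log like the one above can be summarised with a few lines of Python. This is a rough sketch only: the regular expression is my assumption about the line layout, derived from the excerpt in this post, and `summarise` is a made-up helper name, not a BOINC tool.

```python
import re
from collections import Counter

# Assumed layout: "<date> <time> (<pid>): Guest Log: [LEVEL] message"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \((?P<pid>\d+)\): "
    r"Guest Log: (?:\[(?P<level>[A-Z]+)\] )?(?P<msg>.*)$"
)


def summarise(lines):
    """Count log levels and messages, and return the first ERROR entry."""
    levels = Counter()
    messages = Counter()
    first_error = None
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m is None or m.group("level") is None:
            continue  # skip raw output such as the curl progress meter
        levels[m.group("level")] += 1
        messages[m.group("msg")] += 1
        if m.group("level") == "ERROR" and first_error is None:
            first_error = (m.group("ts"), m.group("msg"))
    return levels, messages, first_error


# A few lines copied from the excerpt above.
sample = [
    "2024-03-12 19:58:11 (3552): Guest Log: [INFO] Reading volunteer information",
    "2024-03-12 19:58:16 (3552): Guest Log: [INFO] Requesting an idtoken from LHC@home",
    "2024-03-12 19:58:47 (3552): Guest Log: [INFO] Requesting an idtoken from LHC@home",
    "2024-03-12 20:01:30 (3552): Guest Log: Dload Upload Total Spent Left Speed",
    "2024-03-12 20:01:30 (3552): Guest Log: [ERROR] Could not get an x509 credential",
]

levels, messages, first_error = summarise(sample)
print(dict(levels))
print(messages.most_common(1))
print(first_error)
```

Run over the full excerpt, the message counts would show that the idtoken request to each of the two servers is made six times before the task gives up with the x509 error.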
14) Message boards : CMS Application : since about 2 hours: all tasks failing after few minutes (SOLVED) (Message 49763)
Posted 12 Mar 2024 by Erich56
Post:
It looks like there is an issue with the proxy generated. I will put the old server back until we can find the cause of the issue.
Laurence, tasks still failing:
https://lhcathome.cern.ch/lhcathome/result.php?resultid=407757073
15) Message boards : CMS Application : since about 2 hours: all tasks failing after few minutes (SOLVED) (Message 49751)
Posted 12 Mar 2024 by Erich56
Post:
I have now seen the same problem on other volunteers' hosts.
So obviously the problem is not a local one, but rather at CERN :-(
16) Message boards : CMS Application : since about 2 hours: all tasks failing after few minutes (SOLVED) (Message 49750)
Posted 12 Mar 2024 by Erich56
Post:
On all my hosts, CMS tasks are failing after about 2-3 minutes - see here:

https://lhcathome.cern.ch/lhcathome/result.php?resultid=407746801

excerpt from stderr:

2024-03-12 09:22:16 (6252): Guest Log: Ncat: Could not resolve hostname "vccs.cern.ch": Name or service not known. QUITTING.
2024-03-12 09:22:16 (6252): Guest Log: [ERROR] Could not connect to vccs.cern.ch on port 443

So this looks like a network problem.
However, a ping to "vccs.cern.ch" works well.

ATLAS and Theory are being processed without any problem.

Any idea what's going on?
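
A ping only shows that the name resolves and the host answers ICMP; the error above is about name resolution and a TCP connection on port 443. A minimal sketch of a check that tests exactly those two steps follows. `check_endpoint` is a made-up helper, and note that it runs on the host, whereas the failing lookup was reported from inside the guest VM (the "Guest Log" lines), so a success here would not rule out a problem inside the VM.

```python
import socket


def check_endpoint(host, port=443, timeout=5.0):
    """Try a DNS lookup and then a TCP connect; report what worked."""
    result = {"resolved": False, "addresses": [], "connected": False, "error": None}

    # Step 1: name resolution (the step Ncat reported as failing).
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        result["error"] = f"DNS lookup failed: {exc}"
        return result
    result["resolved"] = True
    result["addresses"] = sorted({info[4][0] for info in infos})

    # Step 2: TCP connection to the port (no TLS handshake is attempted).
    try:
        with socket.create_connection((host, port), timeout=timeout):
            result["connected"] = True
    except OSError as exc:
        result["error"] = f"TCP connect failed: {exc}"
    return result


# Example call (needs network access):
# print(check_endpoint("vccs.cern.ch", 443))
```

If `resolved` is True but `connected` is False, the name lookup is fine and the problem is further along the path (firewall, server down); if `resolved` is False, the resolver in use could not find the name at all.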
17) Message boards : CMS Application : no new WUs available (Message 49748)
Posted 11 Mar 2024 by Erich56
Post:
queue is empty :-(
18) Message boards : CMS Application : no new WUs available (Message 49722)
Posted 7 Mar 2024 by Erich56
Post:
queue is empty :-(
19) Message boards : CMS Application : CMS@Home difficulties in attempts to prepare for multi-core jobs (Message 49664)
Posted 27 Feb 2024 by Erich56
Post:
Ivan wrote on Feb. 14:
We've been having some problems lately as we prepare to allow multi-core jobs to be run in CMS@Home (you've probably noticed...). Unfortunately some of the configurations are beyond our control, and we have to request changes as we find problems and determine a potential fix for them.
We ask for your patience at this time while we work through the difficulties, and would fully understand if you chose to pause your participation in the project while we try to get on top of things.
Ivan, any progress yet?
20) Message boards : Theory Application : cranky: [ERROR] No output found - SOLVED (Message 49589)
Posted 17 Feb 2024 by Erich56
Post:
I was not aware that apparently ALL Theory tasks are faulty now.
I downloaded them on several hosts, and all tasks errored out after some time, some of them only after more than 1 hour :-(

So I am wondering why no one at LHC@home stops the distribution of Theory tasks and empties the queue. The result is a real waste of resources on the volunteers' side.

