1) Message boards : Theory Application : This gonna be long (Message 51111)
Posted 10 days ago by Toggleton
Post:
where do I find the "events" processed?

To monitor Theory tasks I use:
 tail -f /var/lib/boinc/slots/*/cernvm/shared/runRivet.log 

Herwig tasks needed a different command to filter out the spam.
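
A minimal sketch of how the event count could be pulled out of a noisy runRivet.log instead of reading the raw tail output. The line shape "N events processed" is an assumption based on the "27.200 events processed" text quoted in another post in this list; the function name is hypothetical, and Herwig logs may need a different pattern.

```python
import glob
import re

# Assumed log shape: "<number> events processed", where the number may
# contain thousands separators such as "27.200" or "27,200".
EVENTS_RE = re.compile(r"(\d[\d.,]*)\s+events processed")


def last_events_processed(log_text):
    """Return the most recent 'N events processed' count, or None if absent."""
    matches = EVENTS_RE.findall(log_text)
    if not matches:
        return None
    # Strip thousands separators before converting to an integer.
    return int(re.sub(r"[.,]", "", matches[-1]))


if __name__ == "__main__":
    # Same path pattern as the tail command above.
    for path in glob.glob("/var/lib/boinc/slots/*/cernvm/shared/runRivet.log"):
        with open(path, errors="replace") as fh:
            print(path, last_events_processed(fh.read()))
```

Tasks that never write an event count to the log simply report None.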
2) Message boards : Theory Application : Truly long long task: Theory_2743-2822627-370_0 (Message 51040)
Posted 22 days ago by Toggleton
Post:
I have had many herwig7 tasks that took longer than the 10-day deadline (running native on Linux; not sure how VirtualBox behaves). Sometimes you are lucky and the task has not been sent to another user yet. Late results are still accepted. Even if the task has already been sent to another user, the server takes your result, gives you credit for the valid result, and cancels the other user's task once you report yours (just hope that person has not started the task yet).


I even gamble on late acceptance for tasks that already failed on two other computers and, being on their third try, are no longer resent to new computers.
3) Message boards : Theory Application : Herwig7 7.2.1 nlo-dipole tasks run very slowly. (Message 50822)
Posted 16 Oct 2024 by Toggleton
Post:
The "x of 760" counter is the first part of the workunit.
You are already in the second part of the task with "27.200 events processed", so my guess is that it will take just 1 to 3 more days to finish.
If the 1.194:58 is for the second part of the workunit, that should be around 20 hours per 28,000 events, so a bit more than 40 hours, I guess.
That should fit easily inside the 10-day limit.
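
The estimate above is a simple rate calculation, sketched here under stated assumptions. The rate of 20 hours per 28,000 events is the rough figure from this post. The total event count is not stated in the post, so it is left as a parameter. The function name is hypothetical.

```python
def hours_remaining(events_done, events_total,
                    hours_per_batch=20.0, batch_events=28_000):
    """Estimate the remaining runtime in hours from a constant event rate.

    The default rate (20 hours per 28,000 events) is the rough figure
    quoted in the post; real tasks may run faster or slower.
    """
    remaining = max(events_total - events_done, 0)
    return remaining * hours_per_batch / batch_events


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    print(hours_remaining(events_done=28_000, events_total=56_000))
```

Comparing the result with the time left until the deadline shows whether the task can still finish in time.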
4) Message boards : Theory Application : Herwig7 7.2.1 nlo-dipole tasks run very slowly. (Message 50770)
Posted 10 Oct 2024 by Toggleton
Post:
These are long-running tasks. Once a task has reached 760 of 760 it goes into the next stage, which takes 1-2 more days on the tasks I am running right now. I have 3 finished; 8 more have been running for 4 days so far.
https://lhcathome.cern.ch/lhcathome/result.php?resultid=414779140
https://lhcathome.cern.ch/lhcathome/result.php?resultid=414777594
https://lhcathome.cern.ch/lhcathome/result.php?resultid=414775932
The herwig7 7.2.1 nlo 100000 tasks (the batch from Saturday/Sunday) take 3 to 5 days to finish, at least on my system.
5) Message boards : ATLAS application : All tasks failing (Message 50652)
Posted 26 Sep 2024 by Toggleton
Post:
Looking at your tasks, all of them fail with Exit status 196 (0x000000C4) EXIT_DISK_LIMIT_EXCEEDED and have 12GB peak disk usage. So it is not your fault; you just got a lot of the 6.09GB task files that fail for everyone. https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=6214
Not sure if those 6GB tasks are still being sent. I have not gotten a big one in the last few hours.
6) Message boards : ATLAS application : Download sometime between 20 and 50 kBps (Message 49358)
Posted 2 Feb 2024 by Toggleton
Post:
We both have Telekom as our ISP, so it could be a problem on their side with the ISP routing. I have no hybrid, only VDSL 100/40 Mbit/s.

Not sure how the real cause can be found; there are so many moving parts that could be responsible. I will look through my logs to see whether such slow downloads give any hint of what is causing them.
7) Message boards : ATLAS application : Download sometime between 20 and 50 kBps (Message 49343)
Posted 1 Feb 2024 by Toggleton
Post:
I have the same slow download for some tasks on my machine (Linux, so likely not OS related), and I am on VDSL too. Most downloads are fast, but some take 3 hours because of it. So far the connection has not dropped, though. The weird thing is that one task downloads at this low speed while at the same time the next task downloads at full speed. So it seems to be throttled per workunit.

But I always have a day's worth of workunits queued, so slow downloads are no problem for me. As we are both from Germany, maybe it is some aggressive throttling by the server that serves Germany.
8) Message boards : ATLAS application : Zombie Thread after the Workunit is finished. (Message 49265)
Posted 25 Jan 2024 by Toggleton
Post:
Seems like the zombie thread problem is gone on my device.
It was fixed somewhere between 22 Jan 2024, 16:58:50 UTC (when the task was sent) and 24 Jan 2024, 17:28:26 UTC (when the task ran).
9) Message boards : ATLAS application : Zombie Thread after the Workunit is finished. (Message 49229)
Posted 22 Jan 2024 by Toggleton
Post:
I have the problem that zombie threads keep running after the workunit has finished and the workunit's slot has already been deleted. It very likely started this weekend.

In htop I see them as /bin/bash ./runpilot2-wrapper.sh -q BOINC_MCORE -j managed --pilot-user ATLAS --harvester-submit-mode PUSH -w generic --job-type managed --resource-type SCORE_HIMEM --pilotversion 3.7.0.36 -z -t --piloturl local --mute --container

They take ~40% CPU time per workunit, so after a few hours they use up quite a lot of CPU time and I need to restart BOINC to get rid of them.
They get started around 6-8 minutes after the start of the workunit, before the python runargs.EVNTtoHITS.py starts to use the CPU.

Could this be something that is broken in the current batch (the last 2 days), or could it be caused by an Arch Linux update or by my currently unstable internet?

Does anyone else have the same problem?
10) Questions and Answers : Getting started : Broken Website FAQ's and GPU activation (Message 49215)
Posted 19 Jan 2024 by Toggleton
Post:
LHC@home has no GPU work, only CPU. An update to sixtrack is in the works that could later be used on the GPU https://youtu.be/3QLk6y2WNGs?t=731 but I have no idea how long it will take until it is released for BOINC. We have been waiting for months.
11) Message boards : Theory Application : New native version v300.08 (Message 49152)
Posted 7 Jan 2024 by Toggleton
Post:
I have such long-running tasks right now too. One is a Sherpa task (52 hours so far) like yours, mcplots runspec: boinc pp winclusive 7000 10 - sherpa 2.2.5 default 100000 126 https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=218564767
and one has run for 30 hours so far, mcplots runspec: boinc ppbar mb-inelastic 900 - - pythia8 8.306 dire-default 100000 136
https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=218596130

Both long-running tasks have nearly the same name as yours, so I guess they are just long-running experiments.

But I have had Theory tasks before that ran for days and were successful. The only unusual thing about these 2 tasks is that they don't write how many events are done to /var/lib/boinc/slots/*/cernvm/shared/runRivet.log like all the other tasks have so far.

Runtime of recent Theory tasks in hours: average, min, max 3.09 (0.01 - 238.65)
Theory tasks can run very long.
12) Message boards : ATLAS application : queue is empty (Message 49037)
Posted 13 Dec 2023 by Toggleton
Post:
There are no ATLAS tasks for BOINC at the moment. You can see that on the Computing tab of the website -> Server status
https://lhcathome.cern.ch/lhcathome/server_status.php ATLAS Simulation Unsent 2 In progress 46 (fewer than 1000 tasks in progress usually means the queue is running empty)

And on the "Jobs" tab -> ATLAS jobs https://lhcathome.cern.ch/lhcathome/atlas_job.php you can see that the queue is filled around once a week and is then empty for a few days. Last week we got no new work.

But we had no ATLAS tasks at all over the summer, so getting some work is already an upgrade.

EDIT: seems like we are getting ATLAS work right now.
13) Message boards : Theory Application : New native version v300.08 (Message 49000)
Posted 9 Dec 2023 by Toggleton
Post:
I get .7 from time to time too, but most are .8. https://lhcathome.cern.ch/lhcathome/results.php?hostid=10819840&offset=0&show_names=0&state=6&appid=

It will likely be no problem once the beta is pushed to production and no new .7 work is sent anymore. Is "Run test applications?" set too?
14) Questions and Answers : Windows : Not getting tasks for LHC@home but others works fine (Message 48882)
Posted 2 Nov 2023 by Toggleton
Post:
Maybe try to disable GPU work here https://lhcathome.cern.ch/lhcathome/prefs.php?subset=project There is no GPU work here yet, but there may be once xtrack is ready for it.
Not sure if that will change much, but as long as there is no GPU work from LHC it is worth a try.
(CPU: job cache full; Intel GPU: ) This could indicate that it is looking for work but your CPU job cache is already full; maybe it is counting the Asteroids@home work.

https://lhcathome.cern.ch/lhcathome/results.php?hostid=10827194 You did get work in the last few days: ATLAS Simulation (4) · CMS Simulation (8) · Theory Simulation (9). But they all failed instantly. The VirtualBox tasks need a bit more love before they run. Here is a guide to what needs to be done for ATLAS: https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=4161&postid=29359#29359

The easy workunits that run everywhere are sixtrack, but it is still in development to be replaced by xtrack, so there are no workunits yet.
15) Message boards : Number crunching : Haven't been able to download tasks (Message 48724)
Posted 2 Oct 2023 by Toggleton
Post:
Depends on what work you have done. ATLAS (Linux native or VirtualBox) tasks have been back for a week or two. https://lhcathome.cern.ch/lhcathome/atlas_job.php

The other projects like Theory and CMS look like they had steady work too https://lhcathome.cern.ch/lhcathome/server_status.php

Sixtrack, the easy-to-run work (no need to install VirtualBox or the stuff you need for native), has been in limbo since last year because they are developing a new version of it that is already used internally and will get work once it is ready for BOINC. https://www.youtube.com/watch?v=3QLk6y2WNGs

No idea how long it will take until they are ready, but at least they are working on it https://github.com/xsuite/xboinc/releases (quite a few changes 2 weeks ago).

The thread where updates on xtrack are posted: https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5902


If you are trying to get VirtualBox work, the page of your computer claims "Virtualbox (7.0.8) installed, CPU does not have hardware virtualization support" https://lhcathome.cern.ch/lhcathome/show_host_detail.php?hostid=10837262
Maybe it got disabled by a BIOS update or something similar?

And the checklist for VirtualBox tasks: https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=4161#29359
16) Message boards : ATLAS application : queue is empty (Message 48621)
Posted 22 Sep 2023 by Toggleton
Post:
The credit system that it uses, AFAIK, is a bit weird. https://boinc.berkeley.edu/trac/wiki/CreditNew
If I remember right, the credit per workunit moved around quite a bit whenever a new version was released. I guess the change is bigger right now, with more users coming back because of the new ATLAS work and with tasks of different lengths (2000 events, and it sounds like the new ones are shorter). But AFAIK it smoothed out after a few days of ATLAS running with a constant flow of work.
17) Message boards : ATLAS application : queue is empty (Message 48615)
Posted 22 Sep 2023 by Toggleton
Post:
Also: where can one see how many events these tasks have?

This is line 172 in my current /var/lib/boinc/slots/0/pilotlog.txt (not 100% sure, as I have not had a shorter one from this run yet)

payload execution command:

export ATHENA_CORE_NUMBER=12;export ATHENA_PROC_NUMBER=12;export PANDA_RESOURCE='BOINC_MCORE';export FRONTIER_ID=//...cutted out..// --maxEvents=2000 --..........

2023-09-22 11:37:52,557 | WARNING | container name not defined in CRIC
2023-09-22 11:37:48,914 | INFO | executing command: export ATHENA_CORE_NUMBER=12;export ATHENA_PROC_NUMBER=12;export PANDA_RESOURCE='BOINC_MCORE';export FRONTIER_ID= //did cut quite some stuff out// --inputEVNTFile=EVNT.123456789._000123.pool.root.1 --maxEvents=2000

And when you have one running, you can look at /var/lib/boinc/slots/0/PanDA_Pilot-123456789/eventLoopHeartBeat.txt, where you can see how many events of that workunit are already finished.
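
Instead of hunting for the right line by hand, the event count can be pulled out of pilotlog.txt with a small script. This is only a sketch: it assumes the log contains a `--maxEvents=<number>` argument as in the excerpt above, and the function name is hypothetical. It does not try to read eventLoopHeartBeat.txt, since that file's format is not shown here.

```python
import re

# Matches the --maxEvents=<number> argument seen in the payload command.
MAX_EVENTS_RE = re.compile(r"--maxEvents=(\d+)")


def max_events(pilotlog_text):
    """Return the first --maxEvents value found in the log text, or None."""
    match = MAX_EVENTS_RE.search(pilotlog_text)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    # Slot 0 is just the slot used in the post; other slots work the same way.
    with open("/var/lib/boinc/slots/0/pilotlog.txt", errors="replace") as fh:
        print(max_events(fh.read()))
```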
18) Message boards : Sixtrack Application : Will there be any more sixtrack jobs? (Message 48048)
Posted 28 Apr 2023 by Toggleton
Post:
They are developing xtrack as a replacement for Sixtrack. https://youtu.be/3QLk6y2WNGs?t=321 Many projects have already switched to xtrack internally (meaning not much work is prepared for Sixtrack). Once xtrack is released for BOINC there will be a steadier flow of work.
19) Message boards : ATLAS application : Most error that I have encountered (Message 47998)
Posted 12 Apr 2023 by Toggleton
Post:
https://lhcathomedev.cern.ch/lhcathome-dev/forum_thread.php?id=614&postid=7968 The certificate error was already present on the test server. It is something new with the Run 3 tasks since yesterday. https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5978&postid=47994

I have the same. https://lhcathome.cern.ch/lhcathome/result.php?resultid=391664886

We will see when they fix it. But so far it does not hurt much, only a bit more idle time.
I was told that will likely run out of Run 2 simulation tasks to run on the prod project very soon, so I have gone ahead and released version 3 there so we can start running Run 3 tasks. Unfortunately I don't think we'll be able to resolve some of the remaining issues like the console monitoring before going live on prod but I think it's better to have something not quite perfect than no tasks at all.
https://lhcathomedev.cern.ch/lhcathome-dev/forum_thread.php?id=614&postid=8048
20) Message boards : ATLAS application : ATLAS vbox and native 3.01 (Message 47956)
Posted 1 Apr 2023 by Toggleton
Post:
Seems like for a few hours now some tasks are 240MB again instead of over 1GB. Tasks with EVNT 321... are the smaller ones. There are still some 327... tasks with 1GB coming in.




©2024 CERN