1) Message boards : ATLAS application : Download sometime between 20 and 50 kBps (Message 49358)
Posted 2 Feb 2024 by Toggleton
Post:
As we both have Telekom as ISP, it could be a problem on that side with the ISP routing. I have no hybrid connection, only VDSL 100/40 Mbit.

Not sure how the real cause can be found; there are so many moving parts that could be responsible. I will look into my logs to see whether such slow downloads give any hint of what is causing them.
2) Message boards : ATLAS application : Download sometime between 20 and 50 kBps (Message 49343)
Posted 1 Feb 2024 by Toggleton
Post:
I have the same slow download on some tasks on my machine (Linux, so it is likely not OS related) and VDSL too. Most downloads are fast, but some take 3 hours because of it. So far there has been no drop of the connection. The weird thing is that one task downloads at this low speed while at the same time the next task is downloading at full speed, so the throttling seems to be per workunit.

But I always have a day's worth of workunits queued, so slow downloads are no problem for me. As we are both from Germany, maybe it is some aggressive throttling on the server that serves Germany.
3) Message boards : ATLAS application : Zombie Thread after the Workunit is finished. (Message 49265)
Posted 25 Jan 2024 by Toggleton
Post:
Seems like the zombie thread problem is gone on my device.
It was fixed somewhere between 22 Jan 2024, 16:58:50 UTC (when the task was sent) and 24 Jan 2024, 17:28:26 UTC (when the task ran).
4) Message boards : ATLAS application : Zombie Thread after the Workunit is finished. (Message 49229)
Posted 22 Jan 2024 by Toggleton
Post:
I have the problem that zombie threads are still running after the workunit has finished and the slot of the workunit has already been deleted. It very likely started this weekend.

In htop I see them as: /bin/bash ./runpilot2-wrapper.sh -q BOINC_MCORE -j managed --pilot-user ATLAS --harvester-submit-mode PUSH -w generic --job-type managed --resource-type SCORE_HIMEM --pilotversion 3.7.0.36 -z -t --piloturl local --mute --container

They take ~40% CPU time per workunit, so after a few hours they use up quite a lot of CPU time and I need to restart BOINC to get rid of them.
They get started around 6-8 minutes after the start of the workunit, before the Python runargs.EVNTtoHITS.py starts to use the CPU.
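If someone wants to check for the same leftover processes without restarting the whole client, something like this should list them (assuming the wrapper shows up under the same name as in my htop output above):

pgrep -af runpilot2-wrapper.sh

And since the slot directory is already gone and BOINC no longer tracks them, they could probably also be killed directly with pkill -f runpilot2-wrapper.sh instead of restarting BOINC, but so far I have only tried the restart.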

Could this be something that is broken with the current batch (the last 2 days), something broken by an Arch Linux update, or something caused by my currently unstable internet?

Does anyone else have the same problem?
5) Questions and Answers : Getting started : Broken Website FAQ's and GPU activation (Message 49215)
Posted 19 Jan 2024 by Toggleton
Post:
LHC@home has no GPU work, only CPU. There is an update to SixTrack in the works that could later be used on the GPU (https://youtu.be/3QLk6y2WNGs?t=731), but no idea how long it will take until it is released for BOINC. We have been waiting for months.
6) Message boards : Theory Application : New native version v300.08 (Message 49152)
Posted 7 Jan 2024 by Toggleton
Post:
I have such long-running tasks right now too: one Sherpa (52 hours so far), like yours, mcplots runspec: boinc pp winclusive 7000 10 - sherpa 2.2.5 default 100000 126 https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=218564767
and one at 30 hours so far, mcplots runspec: boinc ppbar mb-inelastic 900 - - pythia8 8.306 dire-default 100000 136
https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=218596130

Both long-running tasks have nearly the same name as yours, so I guess those are just long-running experiments.

But I have had Theory tasks before that ran for days and were successful. The only unusual thing with these 2 tasks is that they don't write how many events are done to /var/lib/boinc/slots/*/cernvm/shared/runRivet.log like all the other tasks have so far.
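For the tasks that do write it, a quick way to check the progress is simply to look at the end of that log, e.g.

tail -n 20 /var/lib/boinc/slots/*/cernvm/shared/runRivet.log

(just a rough check; the exact wording of the progress lines depends on the generator, and the slot number of course depends on where the task runs).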

Runtime of recent Theory tasks in hours: average, min, max: 3.09 (0.01 - 238.65). Theory tasks can run very long.
7) Message boards : ATLAS application : queue is empty (Message 49037)
Posted 13 Dec 2023 by Toggleton
Post:
There are no ATLAS tasks for BOINC at the moment. You can see that on the Computing tab of the website -> Server status:
https://lhcathome.cern.ch/lhcathome/server_status.php ATLAS Simulation: Unsent 2, In progress 46 (under 1000 tasks in progress is usually a sign that the queue is running empty).

And on the "Jobs" Tab -> ATLAS jobs https://lhcathome.cern.ch/lhcathome/atlas_job.php where you see that it is filled around once a week and is empty for a few days. Last week we got no new work.

But over the summer we had no ATLAS tasks at all, so getting some work is an upgrade already.

EDIT: seems like we are getting ATLAS work right now.
8) Message boards : Theory Application : New native version v300.08 (Message 49000)
Posted 9 Dec 2023 by Toggleton
Post:
I get .7 from time to time too, but most are .8: https://lhcathome.cern.ch/lhcathome/results.php?hostid=10819840&offset=0&show_names=0&state=6&appid=

It will likely be no problem once the beta is pushed to production and no new .7 work is sent anymore. Is "Run test applications?" set too?
9) Questions and Answers : Windows : Not getting tasks for LHC@home but others works fine (Message 48882)
Posted 2 Nov 2023 by Toggleton
Post:
Maybe try to disable GPU work here: https://lhcathome.cern.ch/lhcathome/prefs.php?subset=project There is no GPU work here yet, but there maybe will be once xtrack is ready for it.
Not sure if that will change much, but as long as there is no GPU work from LHC it is worth a try.
"(CPU: job cache full; Intel GPU: )" could indicate that the client is looking for work but your job cache for the CPU is already full; maybe it is counting the Asteroids@home work too.

https://lhcathome.cern.ch/lhcathome/results.php?hostid=10827194 shows you did get work in the last days: ATLAS Simulation (4) · CMS Simulation (8) · Theory Simulation (9), but they all failed instantly. The VirtualBox tasks need a bit more love before they run. Here is a guide on what needs to be done for ATLAS: https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=4161&postid=29359#29359

The easy workunits that run everywhere are SixTrack, but it is still in development to be replaced by xtrack, so no workunits yet.
10) Message boards : Number crunching : Haven't been able to download tasks (Message 48724)
Posted 2 Oct 2023 by Toggleton
Post:
Depends on what work you have run. ATLAS (Linux native or VirtualBox) tasks have been back for a week or two: https://lhcathome.cern.ch/lhcathome/atlas_job.php

The other projects like Theory and CMS look like they have had steady work too: https://lhcathome.cern.ch/lhcathome/server_status.php

SixTrack, the easy-to-run work (no need to install VirtualBox or the stuff you need for native), has been in limbo since last year because they are developing a new version of it that is already used internally and will get work once it is ready for BOINC: https://www.youtube.com/watch?v=3QLk6y2WNGs

No idea how long it will take until they are ready, but at least they are working on it: https://github.com/xsuite/xboinc/releases shows quite a few changes 2 weeks ago.

The thread where updates on xtrack are posted: https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5902


If you are trying to get VirtualBox work: the page of your computer claims "Virtualbox (7.0.8) installed, CPU does not have hardware virtualization support" https://lhcathome.cern.ch/lhcathome/show_host_detail.php?hostid=10837262
Maybe it got disabled by a BIOS update or so?

And here is the checklist for VirtualBox tasks: https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=4161#29359
11) Message boards : ATLAS application : queue is empty (Message 48621)
Posted 22 Sep 2023 by Toggleton
Post:
The credit system it uses (CreditNew, AFAIK) is a bit weird: https://boinc.berkeley.edu/trac/wiki/CreditNew
If I remember right, the credit per workunit moved around quite a bit when a new version was released. I guess right now the swings are bigger, with more users coming back because of the new ATLAS work and the tasks having different lengths (2000 events, and it sounds like the new ones are shorter). But AFAIK it smoothed out after a few days of ATLAS running with a constant flow of work.
12) Message boards : ATLAS application : queue is empty (Message 48615)
Posted 22 Sep 2023 by Toggleton
Post:
Also: where can one see how many events these tasks have?

This is line 172 in my current /var/lib/boinc/slots/0/pilotlog.txt (not 100% sure, as I did not have a shorter one from this run yet):

payload execution command:

export ATHENA_CORE_NUMBER=12;export ATHENA_PROC_NUMBER=12;export PANDA_RESOURCE='BOINC_MCORE';export FRONTIER_ID=//...cut out...// --maxEvents=2000 --..........

2023-09-22 11:37:52,557 | WARNING | container name not defined in CRIC
2023-09-22 11:37:48,914 | INFO | executing command: export ATHENA_CORE_NUMBER=12;export ATHENA_PROC_NUMBER=12;export PANDA_RESOURCE='BOINC_MCORE';export FRONTIER_ID= //quite a lot cut out// --inputEVNTFile=EVNT.123456789._000123.pool.root.1 --maxEvents=2000

And when you have one running, you can look at /var/lib/boinc/slots/0/PanDA_Pilot-123456789/eventLoopHeartBeat.txt; there you can see how many events of that workunit are already finished.
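If you only want those numbers without scrolling through the logs, something like this should work (just a sketch; the slot and PanDA_Pilot directory names differ per task, hence the wildcards):

grep -o 'maxEvents=[0-9]*' /var/lib/boinc/slots/*/pilotlog.txt
cat /var/lib/boinc/slots/*/PanDA_Pilot-*/eventLoopHeartBeat.txt

The first line pulls the total events per task out of the pilot log, the second shows how far the running tasks are.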
13) Message boards : Sixtrack Application : Will there be any more sixtrack jobs? (Message 48048)
Posted 28 Apr 2023 by Toggleton
Post:
They are developing xtrack as a replacement for SixTrack: https://youtu.be/3QLk6y2WNGs?t=321 Many projects have already switched to xtrack internally (which means not much work is prepared for SixTrack). Once xtrack is released for BOINC there will be a steadier flow of work.
14) Message boards : ATLAS application : Most error that I have encountered (Message 47998)
Posted 12 Apr 2023 by Toggleton
Post:
https://lhcathomedev.cern.ch/lhcathome-dev/forum_thread.php?id=614&postid=7968 The certificate error was already present on the test server. It is something new with the Run 3 tasks since yesterday: https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5978&postid=47994

I have the same: https://lhcathome.cern.ch/lhcathome/result.php?resultid=391664886

We will see when they fix it. But so far it does not hurt much, only a bit longer idle time.
I was told that will likely run out of Run 2 simulation tasks to run on the prod project very soon, so I have gone ahead and released version 3 there so we can start running Run 3 tasks. Unfortunately I don't think we'll be able to resolve some of the remaining issues like the console monitoring before going live on prod but I think it's better to have something not quite perfect than no tasks at all.
https://lhcathomedev.cern.ch/lhcathome-dev/forum_thread.php?id=614&postid=8048
15) Message boards : ATLAS application : ATLAS vbox and native 3.01 (Message 47956)
Posted 1 Apr 2023 by Toggleton
Post:
Seems like for a few hours now some tasks are 240 MB again instead of over 1 GB. Tasks with EVNT 321... are the smaller ones; some 327... tasks with 1 GB are still coming.
16) Message boards : ATLAS application : Latest ATLAS jobs getting much larger in download size? (Message 47899)
Posted 25 Mar 2023 by Toggleton
Post:
https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5976&postid=47891
Probably a scientist added a batch to the ATLAS-queue not meant for BOINC, cause the root-file to download for 1 task is 1110MB.
17) Message boards : Number crunching : Computer Optimization (Message 47897)
Posted 24 Mar 2023 by Toggleton
Post:
I can only give you the data I have for my Ryzen 3600 (6c/12t) with 16 GB RAM at 3200 MHz.

I only run native ATLAS on Linux, so I can't say much about the other sub-projects. For SixTrack it is hard to get steady work until the big update (https://youtu.be/3QLk6y2WNGs?t=448) is released, soonTM. Later GPU work will come too.

I run 2 ATLAS native tasks at the same time with 6 threads each. That currently takes around 8 GB of RAM while the system is otherwise idle. Here is an older thread about the memory usage of the projects and how it scales with threads: https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=4875


I read somewhere in this forum that native (Linux) tasks are quite a bit faster as they have less overhead than VirtualBox, but I can't find it right now. So if you want to build a PC that only runs BOINC for LHC@home, the native tasks are better. It is just some work to set everything up, like CVMFS, and it is recommended to have a Squid proxy so the load on the servers is smaller. The guides are pinned in https://lhcathome.cern.ch/lhcathome/forum_forum.php?id=3
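As a rough idea of what the proxy part looks like (just a sketch; the proxy address is a placeholder, the pinned guides are the reference): once a Squid is running on, say, 192.168.1.2:3128, CVMFS is pointed at it in /etc/cvmfs/default.local with

CVMFS_HTTP_PROXY="http://192.168.1.2:3128"

and the BOINC client can be set to use the same address in its HTTP proxy settings, so both the CVMFS traffic and the task downloads go through the cache.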

ATLAS tasks take quite a lot of disk space and download. Usually ~40 tasks, around 2 days of work for my system, need 20 GB on the disk. Right now the files are bigger, so it takes 45 GB for the same amount of work, but that is likely just a mistake, as written above: "cause the root-file to download for 1 task is 1110MB."

Upload and download traffic should not be that heavy while a task is running (the Squid proxy helps to avoid downloading the same files for every workunit). But downloading and uploading the workunits themselves (~500 MB download and ~500 MB upload) will benefit from good internet speed. With my 100/40 Mbit internet the upload and download speeds are usually limited by my connection, not the CERN server. The upload of the result file usually runs while the next task has already started but does not have CPU load yet.

RAM usage for ATLAS should be quite stable, so for your 12-core/24-thread CPU you could just run 4 tasks at the same time with 6 threads each; that will take around 16 GB. Or you could run 3 tasks with 8 threads each, but I have no RAM usage numbers for that setup. A rough app_config.xml sketch for the 4x6 setup is below.
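If you want to enforce such a split, this is roughly what an app_config.xml in the project folder (/var/lib/boinc/projects/lhcathome.cern.ch_lhcathome/) could look like for 4 tasks with 6 threads each. The app name and the native plan class are what I believe they are; better check your client_state.xml or the Applications page to be sure:

<app_config>
  <app>
    <name>ATLAS</name>
    <max_concurrent>4</max_concurrent>
  </app>
  <app_version>
    <app_name>ATLAS</app_name>
    <plan_class>native_mt</plan_class>
    <avg_ncpus>6</avg_ncpus>
  </app_version>
</app_config>

After saving it, "Options -> Read config files" in the BOINC Manager (or a client restart) should pick it up.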

My guess would be that fast RAM and good memory bandwidth are useful with so much data in memory. I tested with 2400 MHz and 3200 MHz and did not notice a change big enough to show up in the run time (sec). But more cores and faster cores than mine will likely hit the memory bandwidth harder. I don't think you will find good data for that, so you will likely need to test it yourself.

A fast connection to the disk should help too, as the workunits that need to be loaded are quite big for BOINC tasks.

CPU-wise you could optimize for the most efficient frequency (2.2 GHz to 3.6 GHz scales fine efficiency-wise on my system; only the boost gets more inefficient), but that is likely different for your CPU generation.

AVX-512 is AFAIK not used yet: "executing command: grep -o 'avx2[^ ]*\|AVX2[^ ]*' /proc/cpuinfo" is from the ATLAS native log of a user whose CPU is at least the same generation as yours (Ryzen 9 7950X 16-Core): https://lhcathome.cern.ch/lhcathome/results.php?hostid=10821683&offset=0&show_names=0&state=4&appid=
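If you want to see what your own CPU reports, the same kind of grep works, just with the AVX-512 flags, e.g.

grep -o 'avx512[^ ]*' /proc/cpuinfo | sort -u

But as long as the job itself only checks for AVX2 (as in the log line above), having those flags will not change anything.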
18) Message boards : Number crunching : No Tasks Available (Message 47855)
Posted 14 Mar 2023 by Toggleton
Post:
We may run out of ATLAS tasks by the end of today(10.March), but we should have some more submitted next week.
https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5447&postid=47831 And the influx of ATLAS users ate up the work from the other projects too.
CMS
Sorry, I thought the workflow was going to last longer, but we had a surge in usage -- perhaps another project ran out of tasks. (Ah, yes, ATLAS has run low lately.)
-dev uses the same job pool, so it runs out too.
https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=4209&postid=47852

Hope ATLAS will have work again soon.
19) Message boards : Number crunching : No task for Android devices? (Message 47739)
Posted 31 Jan 2023 by Toggleton
Post:
Look at Computing -> Server status: https://lhcathome.cern.ch/lhcathome/server_status.php . You will see SixTrack - Unsent 0. SixTrack is the sub-project that has Android WUs, but SixTrack has not had a stable and steady output of WUs in the last months, so the WUs are usually gone fast. It is even hard for x86 PCs to get enough SixTrack WUs. It is better to have 2 projects attached and crunch SixTrack whenever you can grab some WUs.
Not sure if they are producing Android WUs right now, when the other platforms do not have enough work to do either. And looking here, https://lhcathome.cern.ch/lhcathome/apps.php , 1 GigaFLOPS usually means no work or a really low number of users.

And you tested with an ARMv7 Android phone. Looking at the Applications page, I guess only 64-bit Android is supported.
20) Message boards : Number crunching : not sending out SixTrack (Message 29290)
Posted 15 Mar 2017 by Toggleton
Post:
T.J., do you use the old project URL?

The new:
LHC@home
https://lhcathome.cern.ch/lhcathome/


