1) Message boards : ATLAS application : 6,09 GByte Downloadfile (Message 50644)
Posted 26 Sep 2024 by Profile Yeti
Post:
These 6 GB WUs have crashed 4 of my hosts ==> ATLAS is paused on all my machines until this is solved
2) Message boards : ATLAS application : 3-core task crunches slower than 2-core task - why so? (Message 50373)
Posted 10 Jun 2024 by Profile Yeti
Post:
Do these observations remain valid currently? In other words, the fewer cores you use, the more efficient the process?
Thanks.

This is more of a theoretical view. Each WU needs a startup sequence that runs on only 1 core. At the end of the WU you will get idle cores until the last thread is finished.

In the past it has proven that mid-core configurations are best. I have preferred to run 4-core tasks; it may vary depending on your personal needs
3) Message boards : ATLAS application : No Tasks from LHC@home (Message 49740)
Posted 8 Mar 2024 by Profile Yeti
Post:
Sorry but uninstalling WSL did not work.
I even d/l the BOINC/V-Box bundle to see if that was the case. But no.
I can run Mint on Virtual Box.
Try this from my checklist:

Did you try to crunch projects using VMs in the past while VT-x / AMD-V / VIA VT was not enabled? BOINC may have kept this in mind!

To check and fix this, first exit BOINC and make sure all BOINC tasks have really finished.

In your BOINC data directory you will find a client_state.xml. Open it with a plain-text editor and search for:
<p_vm_extensions_disabled>1</p_vm_extensions_disabled>

If this line is absent or the number is 0 (zero), then all is fine. Otherwise change it to <p_vm_extensions_disabled>0</p_vm_extensions_disabled> and save the file. Be careful to save it as a real ASCII file.

Make sure you have closed your BOINC client successfully before you change anything in client_state.xml. Otherwise BOINC will overwrite your changes.
4) Message boards : ATLAS application : ATLAS vbox and native 3.01 (Message 48662)
Posted 25 Sep 2023 by Profile Yeti
Post:
Try this:
[sudo] watch -n10 "find /var/lib/boinc-client/slots -name \"log.EVNTtoHITS\" |sort |xargs -I {} -n1 sh -c \"echo "{}"; grep -Po 'INFO.*Run:Event.*\K\(.*' {} |tail -n4; echo\""
...

This rocks !

Thank you very much

Yeti
5) Message boards : ATLAS application : ATLAS vbox and native 3.01 (Message 48659)
Posted 25 Sep 2023 by Profile Yeti
Post:
Great, I love it ! Thank you for this helpful command

As a Linux newbie, what would be necessary to show the BOINC slot number in each line? (I run three WUs with 4 cores each simultaneously)

Thanks in Advance
Yeti

I modified the command line to monitor ATLAS native 3.01:
In this example, I used 2 CPUs per task, hence the tail -n2.
sudo watch -n10 "find /var/lib/boinc-client/slots/ \( -name \"log.EVNTtoHITS\" -o -name \"AthenaMP.log\" \) |sort |xargs -I {} -n1 sh -c \"egrep 'INFO.*Run:Event ' {} |tail -n2\"|sort -k 7,7"

An example of output:
17:32:48 ISF_Kernel_FullG4MT_QS.ISF_LongLivedGeant4Tool       390     0    INFO          Run:Event 450000:20848791       (200th event for this worker) took 82.67 s. New average 93.67 +- 3.91
17:32:44 ISF_Kernel_FullG4MT_QS.ISF_LongLivedGeant4Tool       391     1    INFO          Run:Event 450000:20848792       (192th event for this worker) took 45.15 s. New average 98.49 +- 3.622
17:32:03 ISF_Kernel_FullG4MT_QS.ISF_LongLivedGeant4Tool       362     0    INFO          Run:Event 450000:22570763       (186th event for this worker) took 40.7 s. New average 96.78 +- 3.699
17:32:53 ISF_Kernel_FullG4MT_QS.ISF_LongLivedGeant4Tool       363     1    INFO          Run:Event 450000:22570764       (178th event for this worker) took 128.6 s. New average 102.1 +- 4.028
17:33:07 ISF_Kernel_FullG4MT_QS.ISF_LongLivedGeant4Tool       312     1    INFO          Run:Event 450000:22644313       (159th event for this worker) took 209.2 s. New average 95.61 +- 3.997
17:33:01 ISF_Kernel_FullG4MT_QS.ISF_LongLivedGeant4Tool       313     0    INFO          Run:Event 450000:22644314       (155th event for this worker) took 152.8 s. New average 99 +- 4.297
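One way to answer the slot-number question above is to prefix each line with the name of the slot directory the log file came from. A sketch, using made-up paths and log lines for illustration; on a real host, replace "slots" with /var/lib/boinc-client/slots and wrap the loop in watch as in the commands above:

```shell
# Create hypothetical slot directories and log files for demonstration;
# the log line format mimics the AthenaMP output shown above.
mkdir -p slots/0 slots/1
echo 'INFO Run:Event 450000:1 (10th event for this worker) took 80 s.' > slots/0/log.EVNTtoHITS
echo 'INFO Run:Event 450000:2 (11th event for this worker) took 90 s.' > slots/1/log.EVNTtoHITS

# For each log, print "slot N:" followed by its last matching line; the slot
# number is simply the name of the log file's parent directory.
out=$(find slots -name log.EVNTtoHITS | sort | while read -r f; do
  slot=$(basename "$(dirname "$f")")
  printf 'slot %s: %s\n' "$slot" "$(grep 'Run:Event' "$f" | tail -n1)"
done)
printf '%s\n' "$out"
```

Using basename/dirname instead of parsing the path with cut keeps the sketch independent of how deep the slots directory is nested.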
6) Message boards : Theory Application : How long may Native-Theory-Tasks run (Message 48150)
Posted 30 May 2023 by Profile Yeti
Post:
I have opened my native ATLAS clients for native Theory and see widely varying runtimes.

From 00:20 hours to 02:45 hours seems to be fine, but sometimes I see runtimes of 20:00 hours or even more, sometimes with 99% CPU usage, sometimes with no CPU usage.

Can I tell whether the tasks are alive and doing fine, or should I abort them if they run longer than XX:00 hours?
7) Message boards : ATLAS application : ATLAS vbox and native 3.01 (Message 47924)
Posted 28 Mar 2023 by Profile Yeti
Post:
Important is the work we do with these tasks
Sure
and not the size of the input file.

If I can do the same science with 130 / 200 / 300 MB downloads as before, why should we now download 1.2 GB for the same amount of science?

Maybe it is okay for you, fine, but for me it is wasteful
8) Message boards : ATLAS application : ATLAS vbox and native 3.01 (Message 47923)
Posted 28 Mar 2023 by Profile Yeti
Post:
But I still can't find any words that claim they are useless for science.
These are not my words
a scientist added a batch to the ATLAS-queue not meant for BOINC

They are not meant for BOINC
9) Message boards : ATLAS application : ATLAS vbox and native 3.01 (Message 47920)
Posted 28 Mar 2023 by Profile Yeti
Post:
Where did you get that idea from?
https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5976&postid=47891#47891
10) Message boards : ATLAS application : ATLAS vbox and native 3.01 (Message 47918)
Posted 28 Mar 2023 by Profile Yeti
Post:
I have got a new 3.01 WU, but it still seems to contain this big 1.2 GB file.

What will happen with it? As far as I understood, it is useless for BOINC crunchers. It would be a waste of resources, electricity, time, bandwidth and SSD lifetime if it is still downloaded even though it isn't needed.
11) Message boards : ATLAS application : Latest ATLAS jobs getting much larger in download size? (Message 47910)
Posted 28 Mar 2023 by Profile Yeti
Post:
https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5976&postid=47891
Probably a scientist added a batch to the ATLAS queue not meant for BOINC, because the root file to download for 1 task is 1110 MB.

I'm very disappointed not to hear anything more:

It may have been a mistake, but someone could tell us how it happened and, much more importantly, when it will be fixed!

Or will it stay as the new normal ?

I have just checked, and my latest WUs again come with the 1.2 GB file

The answer from David is already 5 days old and there is no more info :-(

My clients will stay paused on Atlas
12) Message boards : ATLAS application : Exit code -2147467259 (0x80004005) - is this related to running 2 ATLAS at the same time? (Message 47849)
Posted 13 Mar 2023 by Profile Yeti
Post:
Just found this in your logfiles: Setting CPU throttle for VM. (97%)

I have already switched to ATLAS native, but I remember that in the past CPU throttling should be switched off with VirtualBox and ATLAS
13) Message boards : ATLAS application : Credit for ATLAS (Message 47829)
Posted 9 Mar 2023 by Profile Yeti
Post:
In the case of the 1,996.63 credit points, stderr under "max FLOPS of device" shows 37.11 GFLOPS;
in the case of the 211.93 credit points, it shows 4.65 GFLOPS.

How can this come about? As said, we are talking about the same machine and the same environment.
Further, the indicated GFLOPS are never ever in alignment with the runtimes / CPU times of the tasks.
So, something seems to be rather wrong with the various calculations in the ATLAS code :-(

Nope, it is not the ATLAS code, but something on your machine.

The BOINC client runs benchmarks from time to time, and the latest benchmark was done when the machine was not really idle, or the older one was wrong.

Ensure that your machine is really idle and then have BOINC run the benchmarks again.

Maybe this fixes your problem
14) Message boards : ATLAS application : queue is empty (Message 47728)
Posted 25 Jan 2023 by Profile Yeti
Post:
No more ATLAS tasks?
15) Message boards : ATLAS application : ATLAS native v2.91 (Message 47333)
Posted 30 Sep 2022 by Profile Yeti
Post:
Since today around 12:00 UTC I have seen a lot (though not all) of ATLAS native tasks fail after 600 seconds of runtime.

You can check here: https://lhcathome.cern.ch/lhcathome/results.php?userid=555&offset=0&show_names=0&state=6&appid=
16) Message boards : Number crunching : Atlas Nativ, CVMFS and Apptainer with Ubuntu 22.04.1 (Message 47208)
Posted 30 Aug 2022 by Profile Yeti
Post:
The line:
[2022-08-30 06:15:06] Singularity works
is a hardcoded string that the script prints to the log.

It could also print "Wuppdibragglkennsdmined works" and it would still run Apptainer.

But hopefully it prints this only to the log if it really works
17) Message boards : Number crunching : Atlas Nativ, CVMFS and Apptainer with Ubuntu 22.04.1 (Message 47206)
Posted 30 Aug 2022 by Profile Yeti
Post:
In production, singularity is active, and multiattach.
Apptainer is in use only in -dev, but there is no new work (holiday?).


From the logfile in production:

[2022-08-30 06:15:06] Using singularity image /cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-centos7
[2022-08-30 06:15:06] Checking for singularity binary...
[2022-08-30 06:15:06] Using singularity found in PATH at /usr/bin/singularity
[2022-08-30 06:15:06] Running /usr/bin/singularity --version
[2022-08-30 06:15:06] apptainer version 1.1.0-rc.2
[2022-08-30 06:15:06] Checking singularity works with /usr/bin/singularity exec -B /cvmfs /cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-centos7 hostname
[2022-08-30 06:15:06] mannivl22
[2022-08-30 06:15:06] Singularity works

From the Logfile in DEV:

[2022-08-29 23:17:50] Using apptainer image /cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-centos7
[2022-08-29 23:17:50] Checking for apptainer binary...
[2022-08-29 23:17:50] Using apptainer found in PATH at /usr/bin/apptainer
[2022-08-29 23:17:50] Running /usr/bin/apptainer --version
[2022-08-29 23:17:50] apptainer version 1.1.0-rc.2
[2022-08-29 23:17:50] Checking apptainer works with /usr/bin/apptainer exec -B /cvmfs /cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-centos7 hostname
[2022-08-29 23:17:50] mannivl22
[2022-08-29 23:17:50] apptainer works

My boxes sporadically get 1 WU in DEV
18) Message boards : Number crunching : Atlas Nativ, CVMFS and Apptainer with Ubuntu 22.04.1 (Message 47203)
Posted 30 Aug 2022 by Profile Yeti
Post:
I wanted to set up a new system for ATLAS native with Ubuntu 22.04.1

Somewhere I had seen several questions about whether Apptainer / CVMFS works with Ubuntu 22.04.1, but never found an answer.

So I have upgraded a working machine to Ubuntu 22.04.1 and for me, it looks as if all is working fine.

cvmfs reports OK
apptainer reports OK
A HITS file is produced

So, is there more that I should check?

Or, can someone check my results from this box: https://lhcathome.cern.ch/lhcathome/show_host_detail.php?hostid=10813571

Thanks in Advance

Yeti
19) Message boards : ATLAS application : Open Firewall Port 25085 (Message 47154)
Posted 16 Aug 2022 by Profile Yeti
Post:
Interesting. So, do I have to change something on my squid too?
20) Message boards : ATLAS application : Atlas Native Transient HTTP Errors Uploading Resultfile (Message 47013)
Posted 12 Jul 2022 by Profile Yeti
Post:
and check with "cvmfs_config stat" if "DIRECT" changes to the local proxy's IP.

Yes, it shows the local IP from my squid

Thanks a lot !
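The cvmfs_config stat check discussed above prints a header row of column names and a data row of values; the proxy in use is one of those columns ("DIRECT" means no proxy). A sketch of extracting that field, demonstrated on a hypothetical sample because the real command needs a CVMFS install, and its exact column set varies by version:

```shell
# Hypothetical sample of `cvmfs_config stat atlas.cern.ch` output, reduced to
# a few columns for illustration; real output has many more.
sample='VERSION PID PROXY ONLINE
2.11.0 1234 http://192.168.1.10:3128 1'

# Locate the PROXY column in the header row, then print that field from the
# data row. Looking the column up by name keeps this robust across versions.
proxy=$(echo "$sample" | awk 'NR==1 {for (i=1; i<=NF; i++) if ($i=="PROXY") c=i} NR==2 {print $c}')
echo "$proxy"
```

On a real host, pipe the output of cvmfs_config stat through the same awk expression instead of the sample.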


©2024 CERN