ATLAS vbox and native 3.01

Yeti Volunteer moderator Joined: 2 Sep 04 Posts: 455 Credit: 213,668,191 RAC: 3,834
Message 48659 - Posted: 25 Sep 2023, 14:37:16 UTC - in response to Message 48644.

Great, I love it! Thank you for this helpful command.

As a Linux newbie, what would be necessary to show the BOINC slot number in the line? (I run three WUs simultaneously, each with 4 cores.)

Thanks in advance,
Yeti

Quote:
    I modified the command line to monitor ATLAS native 3.01. In this example, I used 2 CPUs per task, hence the tail -n2.

    sudo watch -n10 "find /var/lib/boinc-client/slots/ \( -name \"log.EVNTtoHITS\" -o -name \"AthenaMP.log\" \) |sort |xargs -I {} -n1 sh -c \"egrep 'INFO.Run:Event ' {} |tail -n2\"|sort -k 7,7"

    An example of output:

    17:32:48 ISF_Kernel_FullG4MT_QS.ISF_LongLivedGeant4Tool 390 0 INFO Run:Event 450000:20848791 (200th event for this worker) took 82.67 s. New average 93.67 +- 3.91
    17:32:44 ISF_Kernel_FullG4MT_QS.ISF_LongLivedGeant4Tool 391 1 INFO Run:Event 450000:20848792 (192th event for this worker) took 45.15 s. New average 98.49 +- 3.622
    17:32:03 ISF_Kernel_FullG4MT_QS.ISF_LongLivedGeant4Tool 362 0 INFO Run:Event 450000:22570763 (186th event for this worker) took 40.7 s. New average 96.78 +- 3.699
    17:32:53 ISF_Kernel_FullG4MT_QS.ISF_LongLivedGeant4Tool 363 1 INFO Run:Event 450000:22570764 (178th event for this worker) took 128.6 s. New average 102.1 +- 4.028
    17:33:07 ISF_Kernel_FullG4MT_QS.ISF_LongLivedGeant4Tool 312 1 INFO Run:Event 450000:22644313 (159th event for this worker) took 209.2 s. New average 95.61 +- 3.997
    17:33:01 ISF_Kernel_FullG4MT_QS.ISF_LongLivedGeant4Tool 313 0 INFO Run:Event 450000:22644314 (155th event for this worker) took 152.8 s. New average 99 +- 4.297

Supporting BOINC, a great concept !*

computezrmle Volunteer moderator Volunteer developer Volunteer tester Help desk expert Joined: 15 Jun 08 Posts: 2636 Credit: 277,029,594 RAC: 145,095
Message 48660 - Posted: 25 Sep 2023, 17:33:48 UTC - in response to Message 48659.

Try this:

    [sudo] watch -n10 "find /var/lib/boinc-client/slots -name \"log.EVNTtoHITS\" |sort |xargs -I {} -n1 sh -c \"echo "{}"; grep -Po 'INFO.Run:Event.*\K\(.*' {} |tail -n4; echo\""

Example (2-core setup):

    /var/lib/boinc-client/slots/0/PanDA_Pilot-5970886672/log.EVNTtoHITS
    (22th event for this worker) took 440 s. New average 176.4 +- 22.97
    (22th event for this worker) took 56.65 s. New average 169.1 +- 19.16

    /var/lib/boinc-client/slots/1/PanDA_Pilot-5970704360/log.EVNTtoHITS
    (66th event for this worker) took 60.29 s. New average 181.3 +- 11.98
    (70th event for this worker) took 205.2 s. New average 174.8 +- 10.96

    /var/lib/boinc-client/slots/2/PanDA_Pilot-5970572499/log.EVNTtoHITS
    (119th event for this worker) took 51.69 s. New average 170.3 +- 7.973
    (117th event for this worker) took 60.95 s. New average 171.7 +- 7.405

Hints:
- Removed the search for "AthenaMP.log" since ATLAS now reports everything to "log.EVNTtoHITS".
- Use "tail -n4" for a 4-core setup, "tail -n3" for a 3-core setup, and so on.
- Like all the commands suggested before, the one-liner prints the last n lines matching the pattern rather than the last line per worker thread, so one worker can show up twice while another is missing. This should be good enough for a rough overview.
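The last hint can be addressed with a small script. The following is only a sketch, not a tested replacement for the one-liner: it assumes that the fourth whitespace-separated field of each "Run:Event" line is the worker index (as the sample output earlier in the thread suggests), and the names SLOTS_DIR, slot_of and last_event_per_worker are made up for illustration. It prints the BOINC slot number and, for each worker, only its most recent event line.

```shell
#!/bin/sh
# Sketch: per BOINC slot, print the most recent event line of each worker.
# Assumption: field 4 of a "Run:Event" log line is the worker index.

SLOTS_DIR="${SLOTS_DIR:-/var/lib/boinc-client/slots}"

# Extract the BOINC slot number from a path like
# .../slots/2/PanDA_Pilot-.../log.EVNTtoHITS
slot_of() {
    printf '%s\n' "$1" | sed -n 's#.*/slots/\([0-9][0-9]*\)/.*#\1#p'
}

# Read a log on stdin and keep only the last "Run:Event" line per worker.
last_event_per_worker() {
    awk '/INFO.*Run:Event/ {
             msg = $0
             sub(/^.*Run:Event[^(]*/, "", msg)   # keep the text from "(" onward
             last[$4] = msg
         }
         END { for (w in last) printf "worker %s: %s\n", w, last[w] }' |
        sort -k2,2n
}

# Walk all slots; prints nothing if the directory does not exist.
find "$SLOTS_DIR" -name 'log.EVNTtoHITS' 2>/dev/null | sort |
while IFS= read -r log; do
    echo "BOINC slot $(slot_of "$log")"
    last_event_per_worker < "$log"
    echo
done
```

Saved under a name of your choice (for example atlas_events.sh, a hypothetical file name) and made executable, it could be run periodically with "[sudo] watch -n10 ./atlas_events.sh". No tail -n value has to be adapted to the core count, because the grouping is done per worker.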

Yeti Volunteer moderator Joined: 2 Sep 04 Posts: 455 Credit: 213,668,191 RAC: 3,834
Message 48662 - Posted: 25 Sep 2023, 19:38:24 UTC - in response to Message 48660.

Quote:
    Try this:

    [sudo] watch -n10 "find /var/lib/boinc-client/slots -name \"log.EVNTtoHITS\" |sort |xargs -I {} -n1 sh -c \"echo "{}"; grep -Po 'INFO.Run:Event.*\K\(.*' {} |tail -n4; echo\""
    ...

This rocks! Thank you very much.

Yeti

Supporting BOINC, a great concept !*

tazzduke Joined: 24 Jun 10 Posts: 43 Credit: 6,215,669 RAC: 65
Message 48663 - Posted: 26 Sep 2023, 0:50:53 UTC - in response to Message 48662.

Quote:
    Try this:

    [sudo] watch -n10 "find /var/lib/boinc-client/slots -name \"log.EVNTtoHITS\" |sort |xargs -I {} -n1 sh -c \"echo "{}"; grep -Po 'INFO.Run:Event.*\K\(.*' {} |tail -n4; echo\""
    ...

    This rocks! Thank you very much.

    Yeti

+1

maeax Joined: 2 May 07 Posts: 2267 Credit: 175,671,719 RAC: 30
Message 48705 - Posted: 30 Sep 2023, 7:34:10 UTC
Last modified: 30 Sep 2023, 7:43:06 UTC

ATLAS Simulation 3.01 (vbox64_mt_mcore_atlas)
Name - 186NDmwdg63np2BDcpmwOghnABFKDmABFKDmtdFKDmDR9KDmriteGo

Is it possible to get some of this 1.45 GByte cached by the Squid proxy server? Or is it possible to reduce the size of this file in some other way?

I have reduced from 8 tasks to 2 tasks for each Threadripper in the prefs!

Toby Broom Volunteer moderator Joined: 27 Sep 08 Posts: 871 Credit: 727,984,541 RAC: 159,636
Message 48711 - Posted: 30 Sep 2023, 11:39:43 UTC

The maximum object size in the cache is 6 GB, so it should be there if needed.

computezrmle Volunteer moderator Volunteer developer Volunteer tester Help desk expert Joined: 15 Jun 08 Posts: 2636 Credit: 277,029,594 RAC: 145,095
Message 48712 - Posted: 30 Sep 2023, 12:16:01 UTC - in response to Message 48711.

If the squid.conf from the forum is used, ATLAS EVNT files are intentionally excluded from being cached. Squid will just download them and forward them to the BOINC client.

It is configured that way because:
- each task sends a unique URL for the file, hence from the HTTP point of view they are all different
- their content is different *)
- not writing them to disk avoids the cache quota being used up very quickly
- not writing them to disk spares the Squid box the disk writes altogether

A large "maximum object size" is mainly meant to leave enough headroom for vdi files. Unlike the EVNT files, those will be written to the disk cache.

*) In fact, David Cameron once mentioned that they have a limited number of different EVNT files. But the chance of getting tasks that use the same input file is extremely small, hence it is not worth caching them.
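For readers who have not seen how such an exclusion looks, here is a rough squid.conf fragment. It is an illustration only and not the configuration published in the forum: the ACL name atlas_evnt_nocache and the URL pattern are hypothetical, and the real file may match the EVNT downloads in a different way. The directives themselves (acl ... url_regex, cache deny, maximum_object_size) are standard Squid directives.

```
# Hypothetical ACL name and pattern, for illustration only
acl atlas_evnt_nocache url_regex -i EVNT

# Forward matching responses to the client without storing them in the cache
cache deny atlas_evnt_nocache

# Headroom for large cacheable objects such as vdi files (6 GB)
maximum_object_size 6144 MB
```

With a rule like this, matching files still pass through the proxy, but they never count against the cache quota.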

Harri Liljeroos Joined: 28 Sep 04 Posts: 765 Credit: 56,914,596 RAC: 27,185
Message 48781 - Posted: 14 Oct 2023, 21:40:46 UTC

I had an unusual ATLAS task that shows an abnormal CPU time. Otherwise I don't see anything different about it. Here's the result: https://lhcathome.cern.ch/lhcathome/result.php?resultid=400390410 and here's the same job in PanDA: https://bigpanda.cern.ch/job/5984624678/

The task was run on a Win 10 host inside a VM with 4 CPU cores. Normally these 400-event tasks run for about 3-4 hours of wall clock time and 12-16 hours of CPU time. The task in question ran for 3:37 hours but reported a CPU time of 36 hours. That would correspond to 10 CPU cores being used. But CPU usage was normal while it was running.

So I wonder what the story behind this bizarre CPU time is.
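The figure of about 10 cores follows from dividing the CPU time by the wall clock time. A minimal sketch of that arithmetic, using only the numbers from the post (the function name effective_cores is made up for illustration):

```shell
#!/bin/sh
# effective_cores WALL_SECONDS CPU_SECONDS
# Average number of busy cores = CPU time / wall clock time.
effective_cores() {
    awk -v wall="$1" -v cpu="$2" 'BEGIN { printf "%.2f\n", cpu / wall }'
}

# The unusual task: 3:37 h wall clock time, 36 h CPU time
effective_cores $((3 * 3600 + 37 * 60)) $((36 * 3600))   # prints 9.95

# A typical task: 4 h wall clock time, 16 h CPU time
effective_cores $((4 * 3600)) $((16 * 3600))             # prints 4.00
```

A 4-core VM cannot deliver more than 4 CPU seconds per wall clock second, so a ratio close to 10 points to a bookkeeping problem rather than to real CPU usage.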

Crystal Pellet Volunteer moderator Volunteer tester Joined: 14 Jan 10 Posts: 1450 Credit: 9,747,300 RAC: 593
Message 48782 - Posted: 15 Oct 2023, 10:00:30 UTC - in response to Message 48781.

Quote:
    So I wonder what is the story behind this bizarre CPU time?

Really strange. It seems to me that it's a BOINC issue. The difference of 1 day was already present during the run:

    2023-10-14 19:18:15 (10428): Status Report: Elapsed Time: '6000.000000'
    2023-10-14 19:18:15 (10428): Status Report: CPU Time: '107253.250000'
    2023-10-14 20:58:18 (10428): Status Report: Elapsed Time: '12000.000000'
    2023-10-14 20:58:18 (10428): Status Report: CPU Time: '129955.890625'
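The two status reports can be checked with a little arithmetic. The sketch below is only an illustration: it assumes the CPU time counter started at 0 when the elapsed time was 0, and the function name cpu_rate is hypothetical. It computes the average number of busy cores between two reports.

```shell
#!/bin/sh
# cpu_rate ELAPSED1 CPU1 ELAPSED2 CPU2
# Average busy cores between two status reports = delta CPU / delta elapsed.
cpu_rate() {
    awk -v t1="$1" -v c1="$2" -v t2="$3" -v c2="$4" \
        'BEGIN { printf "%.2f\n", (c2 - c1) / (t2 - t1) }'
}

# From the assumed start (0 s, 0 CPU s) to the first report
cpu_rate 0 0 6000 107253.25                # prints 17.88

# Between the first and the second report
cpu_rate 6000 107253.25 12000 129955.890625   # prints 3.78

# First interval again, with one day (86400 s) subtracted from the CPU time
cpu_rate 0 86400 6000 107253.25            # prints 3.48
```

Between the two reports the task used a plausible 3.78 cores of a 4-core VM. The first interval only looks normal once 86400 seconds are subtracted, which is consistent with the observation that an offset of about one day had already entered the CPU time within the first 6000 seconds.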

computezrmle Volunteer moderator Volunteer developer Volunteer tester Help desk expert Joined: 15 Jun 08 Posts: 2636 Credit: 277,029,594 RAC: 145,095
Message 48783 - Posted: 15 Oct 2023, 13:08:41 UTC - in response to Message 48782.

These lines are not from BOINC. Instead, they are from ATLAS. It looks like that task had an internal problem which is not exposed in any log here.

maeax Joined: 2 May 07 Posts: 2267 Credit: 175,671,719 RAC: 30
Message 49643 - Posted: 25 Feb 2024, 10:28:18 UTC

https://lhcathome.cern.ch/lhcathome/result.php?resultid=406919512

    [2024-02-25 11:16:18] 2024-02-25 10:16:04,707 | WARNING | format EVNTtoHITS has no such key: dbData
    [2024-02-25 11:16:18] 2024-02-25 10:16:04,707 | WARNING | format EVNTtoHITS has no such key: dbTime
    [2024-02-25 11:16:18] 2024-02-25 10:16:04,707 | WARNING | wrong length of table data, x=[1708855815.0, 1708855876.0], y=[1909.0, 253620.0] (must be same and length>=4)
    [2024-02-25 11:16:18] 2024-02-25 10:16:04,708 | INFO | ..............................

LHC@home