2000 Events Threadripper 3995WX
Author | Message |
---|---|
Send message Joined: 2 May 07 Posts: 2244 Credit: 173,902,375 RAC: 307 |
Finished today on 8 CPUs: sent 23 Sep 2023, 18:51:11 UTC; reported 24 Sep 2023, 12:50:48 UTC; status: completed and validated; run time 64,018.61 s; CPU time 476,227.40 s; credit 12,480.55; application ATLAS Simulation v3.01 (vbox64_mt_mcore_atlas) windows_x86_64 |
Send message Joined: 2 May 07 Posts: 2244 Credit: 173,902,375 RAC: 307 |
Second task on the same PC: upload file 1.43 GB; run time 17 hours 20 min 46 s; CPU time 5 days 8 hours 47 min 21 s; sent 23 Sep 2023, 22:06:24 UTC; reported 24 Sep 2023, 15:38:43 UTC; status: completed and validated; run time 62,446.21 s; CPU time 463,641.60 s; credit 10,991.18; application ATLAS Simulation v3.01 (vbox64_mt_mcore_atlas) windows_x86_64. A checkpoint file every 200 events is useful for such a long runtime. Hardware acceleration is therefore in use. |
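The run time and CPU time quoted above can be cross-checked against each other. This is a minimal sketch, assuming the two durations describe the same task; the helper name `to_seconds` is hypothetical and not part of any BOINC tool.

```python
def to_seconds(days=0, hours=0, minutes=0, seconds=0):
    """Convert a days/hours/minutes/seconds duration to whole seconds."""
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds

# Second task on the Threadripper 3995WX, durations as quoted in the post above.
run_time = to_seconds(hours=17, minutes=20, seconds=46)
cpu_time = to_seconds(days=5, hours=8, minutes=47, seconds=21)

# Average number of busy cores = CPU time divided by wall-clock run time.
busy_cores = cpu_time / run_time

print(run_time, cpu_time, round(busy_cores, 2))
```

The converted values agree with the seconds figures in the same post (62,446.21 and 463,641.60), and the ratio comes out at roughly 7.4 busy cores on average.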
Send message Joined: 2 May 07 Posts: 2244 Credit: 173,902,375 RAC: 307 |
AMD Ryzen 9 3950X 16-Core Processor [Family 23 Model 113 Stepping 0], Windows 11 Workstation, CPU count for the VM: 2, 2,000 events: sent 23 Sep 2023, 21:42:03 UTC; reported 25 Sep 2023, 19:53:21 UTC; status: completed and validated; run time 162,627.91 s; CPU time 314,670.00 s; credit 19,613.00; application ATLAS Simulation v3.01 (vbox64_mt_mcore_atlas) windows_x86_64 |
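To compare the machines on a per-event basis, the figures posted so far can be divided by the event count. This is only a rough sketch: it assumes the two large numbers in each row are run time and CPU time in seconds, and that the Threadripper tasks also held 2000 events, as the thread title suggests. The function name `per_event` is made up for illustration.

```python
# (label, events, run time in s, CPU time in s); figures copied from the posts above.
# Assumption: the Threadripper tasks held 2000 events each (per the thread title).
tasks = [
    ("Threadripper 3995WX, task 1", 2000, 64018.61, 476227.40),
    ("Threadripper 3995WX, task 2", 2000, 62446.21, 463641.60),
    ("Ryzen 9 3950X, 2-CPU VM", 2000, 162627.91, 314670.00),
]

def per_event(events, run_time_s, cpu_time_s):
    """Return (wall-clock seconds per event, CPU seconds per event)."""
    return run_time_s / events, cpu_time_s / events

for label, events, run_s, cpu_s in tasks:
    wall, cpu = per_event(events, run_s, cpu_s)
    print(f"{label}: {wall:.1f} s wall/event, {cpu:.1f} s CPU/event")
```

CPU seconds per event are only comparable when the hosts are under similar load, so the numbers say more about each setup than about the processors themselves.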
Send message Joined: 2 May 07 Posts: 2244 Credit: 173,902,375 RAC: 307 |
It would be nice if long-runners (2000 events) were offered only through a venue in the LHC preferences, for HPC (high-performance computing) machines such as the Threadripper 3995WX. |
Send message Joined: 2 May 07 Posts: 2244 Credit: 173,902,375 RAC: 307 |
Four more on the Threadripper 3995WX: run time 14 hours 24 min 20 s, CPU time 4 days 4 hours 16 min 59 s; run time 14 hours 13 min 35 s, CPU time 4 days 3 hours 5 min 7 s; run time 16 hours 25 min 33 s, CPU time 4 days 22 hours 11 min 55 s; run time 16 hours 19 min 1 s, CPU time 4 days 20 hours 40 min 23 s. |
Send message Joined: 2 May 07 Posts: 2244 Credit: 173,902,375 RAC: 307 |
At the moment, 1000 events for four tasks. |
Send message Joined: 2 May 07 Posts: 2244 Credit: 173,902,375 RAC: 307 |
Seven tasks with 1k events finished after 6 hours :-). |
Send message Joined: 17 Oct 06 Posts: 89 Credit: 57,163,734 RAC: 3,524 |
Any reason why the tasks suddenly jumped up to 6 hours? They used to be like 40 min to 2 hours in the past. |
Send message Joined: 2 May 07 Posts: 2244 Credit: 173,902,375 RAC: 307 |
No idea; only CERN IT can answer that. In spring 2023, David Cameron ran tests with 2k-event ATLAS tasks. If I remember correctly, the 2k tasks need no transfer for us, because the data comes directly from the ATLAS collider. @David Cameron: https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5978&postid=47914#47914 The ATLAS Event Progress Monitor in RDP is now correct, thank you. |
Send message Joined: 28 Sep 04 Posts: 732 Credit: 49,373,095 RAC: 13,741 |
"Atlas Event Progress Monitor in RDP is now correct, Thank you." That applies only to these long 1000-event tasks. For the short ones with 400 events, the monitoring does not work right. |
Send message Joined: 15 Jun 08 Posts: 2541 Credit: 254,608,838 RAC: 23,290 |
According to the logs I checked, there were tasks configured to process 100, 400, 500 or 1000 events. Varying the number of events does not seem to be a good decision by the people who submitted the tasks, since it ends up unbalancing BOINC's work fetch calculation, its runtime estimation and its credit calculation. A while ago some tests were done showing that 500 events per task is a good compromise between the project's needs and what most volunteers can handle without major issues. |
Send message Joined: 2 May 07 Posts: 2244 Credit: 173,902,375 RAC: 307 |
only Cern-IT can answer us. |
Send message Joined: 14 Jan 10 Posts: 1422 Credit: 9,484,585 RAC: 852 |
The 1000-event job produced a 756 MB HITS file to upload. |
Send message Joined: 14 Sep 08 Posts: 52 Credit: 64,094,999 RAC: 17,187 |
I personally prefer the bigger jobs. From what I see, each ATLAS WU always has a 20-30 min idle setup time. Having more work per WU is going to help with efficiency quite a bit. It also seems to reduce the network usage on download side (from client). |
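The efficiency argument can be put into numbers. This is a minimal sketch, assuming a fixed 25 min setup (the midpoint of the 20-30 min quoted above) and that the task lengths mentioned in this thread include that setup; `useful_fraction` is a hypothetical helper.

```python
def useful_fraction(setup_min, compute_min):
    """Fraction of a task's wall time spent computing rather than in setup."""
    return compute_min / (setup_min + compute_min)

SETUP_MIN = 25  # assumption: midpoint of the 20-30 min idle setup quoted above

for total_min in (40, 120, 360):  # task lengths mentioned in this thread
    compute_min = total_min - SETUP_MIN
    frac = useful_fraction(SETUP_MIN, compute_min)
    print(f"{total_min:>3} min task: {frac:.0%} of wall time is computation")
```

Under these assumptions a 40 min task spends most of its wall time in setup, while a 6 hour task spends more than 90% of it computing.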
Send message Joined: 17 Sep 04 Posts: 105 Credit: 32,824,862 RAC: 59 |
I agree. Regards, Bob P. |
Send message Joined: 3 Nov 12 Posts: 59 Credit: 142,193,076 RAC: 32,238 |
"I personally prefer the bigger jobs. From what I see, each ATLAS WU always has a 20-30 min idle setup time. Having more work per WU is going to help with efficiency quite a bit. It also seems to reduce the network usage on download side (from client)." +1 |
Send message Joined: 14 Jan 10 Posts: 1422 Credit: 9,484,585 RAC: 852 |
I don't prefer these long-runners at all. 400 events is really the maximum; 200 events would be better. From the BOINC point of view, volunteer computing is meant for PCs that are not in use, and not all volunteers have monster machines. But the most important disadvantage of running ATLAS and/or CMS jobs (up to 18 hrs) is that they need an uninterrupted network connection. This also means that the tasks cannot be suspended or shut down for more than 20 minutes, maybe an hour. A lot of crunchers want to shut down their machines during the evening or night to save electricity costs. This machine is running a 1000-event job on 4 cores; it has already been busy for 26 hours and has another 7 hours to go. |
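For a sense of scale, the figures in the post above can be turned into a per-event rate and projected onto smaller jobs. This is a rough sketch that assumes run time scales linearly with the number of events and ignores the fixed setup time; `projected_hours` is a hypothetical helper.

```python
elapsed_h, remaining_h, events = 26, 7, 1000  # figures from the post above

total_h = elapsed_h + remaining_h
seconds_per_event = total_h * 3600 / events

def projected_hours(n_events):
    """Projected wall time in hours, assuming run time is linear in events."""
    return n_events * seconds_per_event / 3600

for n in (200, 400, 1000):
    print(f"{n:>4} events: about {projected_hours(n):.1f} h")
```

On that machine a 400-event job would still take about 13 hours under this linear assumption, and a 200-event job between 6 and 7 hours.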
Send message Joined: 15 Jun 08 Posts: 2541 Credit: 254,608,838 RAC: 23,290 |
"I personally prefer the bigger jobs." This has been discussed back and forth, and yes, there are volunteers with fast computers and fast internet running their systems 24/7 (including mine). Those usually do not have problems even with 2000 eventers. On the other hand: (1) there are lots of computers that are not fast enough to finish large tasks within a reasonable time; (2) ATLAS (native) does not support suspend/resume, hence tasks start from scratch; (3) ATLAS generates huge upload files. Together with other points mentioned in the past, those 500 eventers were accepted as a compromise. As for long setup times, they are usually shorter when (1) the EVNT files are smaller, with fewer events; (2) a local HTTP proxy is used; (3) CVMFS is configured to use Cloudflare’s CDN. Especially on Linux, a few cgroups tweaks via systemd can be set to ensure CPU cycles are not lost during an ATLAS setup but instead given to other running tasks. This slightly slows down an individual task but increases the total throughput of the computer. |
Send message Joined: 2 May 07 Posts: 2244 Credit: 173,902,375 RAC: 307 |
"This machine is running a 1000 events job on 4 cores and is already 26 hours busy and another 7 hours to go." What about a venue for event counts in the preferences? Four or five venues (100, 200, 500, 1000, 2000). |
Send message Joined: 4 Sep 22 Posts: 92 Credit: 16,008,656 RAC: 8,102 |
"Especially on Linux a few cgroups tweaks via systemd can be set to ensure CPU cycles are not lost during an ATLAS setup but instead given to other running tasks." More detail, please. |
©2024 CERN