Message boards : ATLAS application : queue is empty

Mr P Hucker
Joined: 12 Aug 06
Posts: 418
Credit: 5,667,249
RAC: 3
Message 48549 - Posted: 15 Sep 2023, 4:08:10 UTC - in response to Message 48548.  
Last modified: 15 Sep 2023, 4:10:27 UTC

hadron wrote:
Are you overclocking your CPU? Or maybe running each task on multiple threads?
I have a Ryzen 9 5900X which has almost the same base frequency as the 3900XT, so I would expect your times and mine should be roughly equal.

Actually, your CPU is 18% faster according to benchmarks.

Of course I use multiple threads; ATLAS is designed for this. I've left it on the default of 8 threads per task. It tends to average 7, so that's what I told the scheduler via app_config.

    <app_version>
        <app_name>ATLAS</app_name>
        <plan_class>vbox64_mt_mcore_atlas</plan_class>
        <cmdline></cmdline>
        <avg_ncpus>7.000000</avg_ncpus>
        <ngpus>0.000000</ngpus>
    </app_version>

I haven't overclocked.
ID: 48549
hadron
Joined: 4 Sep 22
Posts: 57
Credit: 8,497,134
RAC: 17,579
Message 48550 - Posted: 15 Sep 2023, 6:14:19 UTC - in response to Message 48549.  

Mr P Hucker wrote:
Of course I use multiple threads; ATLAS is designed for this. I've left it on the default of 8 threads per task. It tends to average 7, so that's what I told the scheduler via app_config.

    <app_version>
        <app_name>ATLAS</app_name>
        <plan_class>vbox64_mt_mcore_atlas</plan_class>
        <cmdline></cmdline>
        <avg_ncpus>7.000000</avg_ncpus>
        <ngpus>0.000000</ngpus>
    </app_version>

I haven't overclocked.

When I set the max threads on LHC, all I got were Atlas tasks. I also want to run Theory tasks, so I set LHC to use only 1 thread, then added a <cmdline>... entry to app_config to control the number of threads directly. It did what I wanted, so that is what I will do again once these 4 tasks are finished (I've temporarily disabled Atlas tasks on LHC).
In addition to the LHC tasks, I'm also running Rosetta, Einstein and Cosmology tasks, so I am certainly not inclined to dedicate 7 threads to each Atlas task. Initially, I'll set my app_config file to run only 1 Atlas task at a time, on 4 threads, and see where that takes me:

    <app>
        <name>ATLAS</name>
        <max_concurrent>1</max_concurrent>
    </app>
    <app_version>
        <app_name>ATLAS</app_name>
        <avg_ncpus>4</avg_ncpus>
        <plan_class>vbox64_mt_mcore_atlas</plan_class>
        <cmdline>--nthreads 4</cmdline>
    </app_version>

Once I see how this works out, I can easily make changes to suit my preferences.
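For anyone who wants to experiment with different thread counts, here is a rough Python sketch that writes out the same settings as a complete file. It assumes the usual `<app_config>` root element around the `<app>` and `<app_version>` entries, and it takes the `--nthreads` option from the snippet above rather than from any documentation. The function name `build_app_config` is made up for this example.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name, plan_class, threads, max_concurrent):
    """Return app_config.xml text that caps one app at `threads` threads per task.

    Hypothetical helper: the element names mirror the snippet in the post,
    and the <app_config> root element is an assumption about the file layout.
    """
    root = ET.Element("app_config")

    # Limit how many tasks of this app run at the same time.
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    ET.SubElement(app, "max_concurrent").text = str(max_concurrent)

    # Tell the scheduler how many CPUs a task uses, and pass the same
    # number to the application via the command line (as in the post).
    version = ET.SubElement(root, "app_version")
    ET.SubElement(version, "app_name").text = app_name
    ET.SubElement(version, "plan_class").text = plan_class
    ET.SubElement(version, "avg_ncpus").text = str(threads)
    ET.SubElement(version, "cmdline").text = f"--nthreads {threads}"

    ET.indent(root, space="    ")  # requires Python 3.9 or newer
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_app_config("ATLAS", "vbox64_mt_mcore_atlas", threads=4, max_concurrent=1))
```

Changing `threads` keeps `avg_ncpus` and the command-line value in step, which avoids telling the scheduler one number and the task another.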

Thanks for the replies, they've been quite helpful. I'll post again once I see where this takes me.
ID: 48550
maeax
Joined: 2 May 07
Posts: 2104
Credit: 159,819,191
RAC: 123,837
Message 48551 - Posted: 15 Sep 2023, 6:25:48 UTC - in response to Message 48546.  

;-)) OK, it is so nice to have Atlas back.
Thank you, CERN IT.
ID: 48551
Mr P Hucker
Joined: 12 Aug 06
Posts: 418
Credit: 5,667,249
RAC: 3
Message 48552 - Posted: 15 Sep 2023, 6:28:16 UTC - in response to Message 48550.  

hadron wrote:
When I set the max threads on LHC, all I got were Atlas tasks. I also want to run Theory tasks,

Yes, you never get what you ask for here. If I opt to receive any type of task, I only get CMS. Every other project gives you an even mixture. Since I have more than one computer, I just set some to get CMS, some Theory and some Atlas, but leave the "if nothing available give me something else" option ticked. Then, if everything is available, I get some of each. I have to do this because CMS maxes out my uplink and Theory maxes out my downlink. With half of each, it seems to manage without throttling the CPUs. If I had only one computer, I've no idea how I'd get a mixture.
ID: 48552
maeax
Joined: 2 May 07
Posts: 2104
Credit: 159,819,191
RAC: 123,837
Message 48553 - Posted: 15 Sep 2023, 8:55:01 UTC - in response to Message 48552.  

When all LHC projects are enabled in your preferences,
BOINC Manager takes whichever project CERN IT gives us.

So you can use the default, home, school or work venue to assign different work to each of your PCs.
Running all the projects together on one PC doesn't work very well.
ID: 48553
Mr P Hucker
Joined: 12 Aug 06
Posts: 418
Credit: 5,667,249
RAC: 3
Message 48554 - Posted: 15 Sep 2023, 9:02:39 UTC - in response to Message 48553.  
Last modified: 15 Sep 2023, 9:02:50 UTC

maeax wrote:
When all LHC projects are enabled in your preferences,
BOINC Manager takes whichever project CERN IT gives us.

So you can use the default, home, school or work venue to assign different work to each of your PCs.

Yes, that's what I do. I have a venue for each type of task, and put a few machines in each.

maeax wrote:
Running all the projects together on one PC doesn't work very well.

I have no problem running them all on one PC, if I happen to get a selection.
ID: 48554
Harri Liljeroos
Joined: 28 Sep 04
Posts: 675
Credit: 43,678,790
RAC: 15,801
Message 48555 - Posted: 15 Sep 2023, 9:14:41 UTC

Four Atlas tasks running here on two hosts, 4 cores used for each task. Run times are currently 48 to 66 hours (CPU times 156 to 248 hours), and all 4 cores are still active, as seen in top (console Alt+F3).

If you view the console on Alt+F2, you can just make out the number of jobs finished so far (mine are at 1200 to 1800 finished), averaging about 2000 seconds per job. So they are taking forever to finish. :-(
ID: 48555
Mr P Hucker
Joined: 12 Aug 06
Posts: 418
Credit: 5,667,249
RAC: 3
Message 48556 - Posted: 15 Sep 2023, 10:28:47 UTC - in response to Message 48555.  

Harri Liljeroos wrote:
Four Atlas tasks running here on two hosts, 4 cores used for each task. Run times are currently 48 to 66 hours (CPU times 156 to 248 hours), and all 4 cores are still active, as seen in top (console Alt+F3).

If you view the console on Alt+F2, you can just make out the number of jobs finished so far (mine are at 1200 to 1800 finished), averaging about 2000 seconds per job. So they are taking forever to finish. :-(

I estimate your fastest machine has about 3/4 of the per-thread speed of my Ryzen 9 3900XT. So if you allowed it 8 threads, it would take 1 day compared to my 17.5 hours. With only 4 threads, I'd expect you to take 2 days, so they're fine. Any particular reason you only use 4 threads?
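The back-of-envelope estimate above can be written out as a small Python sketch. It assumes run time scales perfectly with both per-thread speed and thread count, which real ATLAS tasks will not quite do, and that the 17.5-hour reference run used 8 threads. The function name is invented for illustration.

```python
def estimated_hours(reference_hours, reference_threads, relative_speed, threads):
    """Scale a reference run time to another host.

    Assumes ideal linear scaling: total work is reference_hours * reference_threads
    thread-hours, and the other host delivers relative_speed * threads of the
    reference host's per-thread throughput.
    """
    thread_hours = reference_hours * reference_threads
    return thread_hours / (relative_speed * threads)


# Figures from the post: 17.5 h reference, 3/4 per-thread speed.
same_threads = estimated_hours(17.5, 8, 0.75, 8)
half_threads = estimated_hours(17.5, 8, 0.75, 4)

print(f"8 threads: {same_threads:.1f} h")
print(f"4 threads: {half_threads:.1f} h")
```

Under these assumptions the results come out at a little under one day and a little under two days, in line with the estimate in the post.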
ID: 48556
Harri Liljeroos
Joined: 28 Sep 04
Posts: 675
Credit: 43,678,790
RAC: 15,801
Message 48557 - Posted: 15 Sep 2023, 11:48:44 UTC - in response to Message 48556.  

Harri Liljeroos wrote:
Four Atlas tasks running here on two hosts, 4 cores used for each task. Run times are currently 48 to 66 hours (CPU times 156 to 248 hours), and all 4 cores are still active, as seen in top (console Alt+F3).

If you view the console on Alt+F2, you can just make out the number of jobs finished so far (mine are at 1200 to 1800 finished), averaging about 2000 seconds per job. So they are taking forever to finish. :-(

Mr P Hucker wrote:
I estimate your fastest machine has about 3/4 of the per-thread speed of my Ryzen 9 3900XT. So if you allowed it 8 threads, it would take 1 day compared to my 17.5 hours. With only 4 threads, I'd expect you to take 2 days, so they're fine. Any particular reason you only use 4 threads?

I selected 4 threads to keep the utilization high (that is, trying to keep the time during which only a single thread is used short). With these long tasks I will probably raise the thread count per task to shorten the run time. Let's see how long these tasks will actually take.

My slower host only has 8 threads, of which I leave 2 free and 2 are busy with Einstein GPU tasks, so that leaves only 4 threads for LHC.
ID: 48557
Mr P Hucker
Joined: 12 Aug 06
Posts: 418
Credit: 5,667,249
RAC: 3
Message 48558 - Posted: 15 Sep 2023, 11:58:54 UTC - in response to Message 48557.  

Harri Liljeroos wrote:
I selected 4 threads to keep the utilization high (that is, trying to keep the time during which only a single thread is used short). With these long tasks I will probably raise the thread count per task to shorten the run time. Let's see how long these tasks will actually take.

If you're not getting full utilisation, just run more of them. For example, I have a 24-thread machine and tell it to run four Atlas tasks. I let them use up to 8 threads, but tell Boinc they use an average of 6. If you use BoincTasks, you can see what percentage they used and change it accordingly. Not sure if Boinc Manager does this, as I never use that horrid interface. For example, when I had them set to use 8 threads, it said 75% usage. 75% of 8 is 6, so I set them to 6.
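The "75% of 8 is 6" rule of thumb can be sketched in a few lines of Python. This is only an illustration of the arithmetic in the post; both function names are invented here, and the utilisation figure is whatever your monitoring tool reports.

```python
def suggested_avg_ncpus(max_threads, utilisation_percent):
    """Average CPUs a task really uses, given its thread limit and observed usage."""
    return max_threads * utilisation_percent / 100


def tasks_that_fit(host_threads, avg_ncpus):
    """How many such tasks fill the host if the scheduler budgets avg_ncpus each."""
    return int(host_threads // avg_ncpus)


# Figures from the post: 8 threads allowed, 75% observed usage, 24-thread host.
avg = suggested_avg_ncpus(max_threads=8, utilisation_percent=75)
print("avg_ncpus:", avg)
print("concurrent tasks:", tasks_that_fit(host_threads=24, avg_ncpus=avg))
```

With the numbers from the post this gives an `avg_ncpus` of 6 and four concurrent tasks on the 24-thread machine.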

Harri Liljeroos wrote:
My slower host only has 8 threads, of which I leave 2 free and 2 are busy with Einstein GPU tasks, so that leaves only 4 threads for LHC.

GPU tasks don't always need a full core. I adjust and make sure the GPU usage stays high. You can also run more than one task per GPU, so the GPU has something to do while waiting for the CPU.
ID: 48558
LRZ-LMU
Joined: 4 May 17
Posts: 5
Credit: 118,785,284
RAC: 0
Message 48559 - Posted: 15 Sep 2023, 13:25:58 UTC - in response to Message 48558.  

I'm getting that people don't like the 2k-event jobs, so we'll continue by submitting 400-event jobs. The MC coordination agreed to have a stream of dedicated tasks for this purpose; we are just waiting for the samples to be requested.
ID: 48559
maeax
Joined: 2 May 07
Posts: 2104
Credit: 159,819,191
RAC: 123,837
Message 48560 - Posted: 15 Sep 2023, 13:29:32 UTC - in response to Message 48559.  

Yes, 400 is a good number of events, given how different the hardware is among us volunteers.
ID: 48560
Erich56
Joined: 18 Dec 15
Posts: 1690
Credit: 104,078,093
RAC: 121,864
Message 48561 - Posted: 15 Sep 2023, 14:00:29 UTC

Hi guys, I have a question:
Where do you download ATLAS tasks from?
As per the Server Status page, there are none available ...
ID: 48561
Mr P Hucker
Joined: 12 Aug 06
Posts: 418
Credit: 5,667,249
RAC: 3
Message 48562 - Posted: 15 Sep 2023, 14:07:52 UTC - in response to Message 48561.  
Last modified: 15 Sep 2023, 14:08:02 UTC

Erich56 wrote:
Hi guys, I have a question:
Where do you download ATLAS tasks from?
As per the Server Status page, there are none available ...

They appear now and again; they're just testing at the moment. There should be a steady stream shortly. Set your account to get Atlas only, with other types allowed if none are available.
ID: 48562
Crystal Pellet
Volunteer moderator
Volunteer tester
Joined: 14 Jan 10
Posts: 1280
Credit: 8,496,817
RAC: 2,374
Message 48563 - Posted: 16 Sep 2023, 16:13:08 UTC - in response to Message 48559.  

I got a 2k job and let it run into an error because the first event needed 787 seconds and the second 2800 seconds.
Let's say 1500 seconds on average: times 2000 events, that makes 3 million seconds, or 1 million seconds on my 3-threaded VM, which is about 11.5 days running 24/7.
Received ---------- 16 Sep 2023, 9:33:40 UTC
Report deadline -- 24 Sep 2023, 8:44:17 UTC
Make up your mind.
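The same comparison can be checked with a short Python sketch. It assumes, as the post does, an average of 1500 seconds per event and that the work divides evenly over the VM's threads; the function name is hypothetical.

```python
from datetime import datetime


def estimated_wall_seconds(events, seconds_per_event, threads):
    """Wall-clock estimate, assuming events are spread evenly over all threads."""
    return events * seconds_per_event / threads


# Timestamps and figures quoted in the post above.
received = datetime(2023, 9, 16, 9, 33, 40)
deadline = datetime(2023, 9, 24, 8, 44, 17)

needed = estimated_wall_seconds(events=2000, seconds_per_event=1500, threads=3)
available = (deadline - received).total_seconds()

print(f"needed:    {needed / 86400:.1f} days")
print(f"available: {available / 86400:.1f} days")
print("fits before deadline:", needed <= available)
```

On these assumptions a 3-thread VM needs roughly 11.6 days against just under 8 days between receipt and the report deadline, so the task cannot finish in time.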
ID: 48563
Mr P Hucker
Joined: 12 Aug 06
Posts: 418
Credit: 5,667,249
RAC: 3
Message 48564 - Posted: 16 Sep 2023, 16:56:22 UTC - in response to Message 48563.  

Crystal Pellet wrote:
I got a 2k job and let it run into an error because the first event needed 787 seconds and the second 2800 seconds.
Let's say 1500 seconds on average: times 2000 events, that makes 3 million seconds, or 1 million seconds on my 3-threaded VM, which is about 11.5 days running 24/7.
Received ---------- 16 Sep 2023, 9:33:40 UTC
Report deadline -- 24 Sep 2023, 8:44:17 UTC
Make up your mind.

You would have made it on 8 threads. Why throttle the poor thing?
ID: 48564
Harri Liljeroos
Joined: 28 Sep 04
Posts: 675
Credit: 43,678,790
RAC: 15,801
Message 48602 - Posted: 21 Sep 2023, 14:57:41 UTC
Last modified: 21 Sep 2023, 15:02:10 UTC

A few new tasks have been generated (actually over 300), and I got two of them. They are still 2000-job tasks. The server has cancelled all my tasks that were still running. Several hundred hours of crunching wasted.
ID: 48602
Mr P Hucker
Joined: 12 Aug 06
Posts: 418
Credit: 5,667,249
RAC: 3
Message 48603 - Posted: 21 Sep 2023, 15:27:03 UTC - in response to Message 48602.  
Last modified: 21 Sep 2023, 15:35:00 UTC

The last one I did was on 12th September (not sure if that was the 2000 size?) and it took 21 hours on my Ryzen 9 3900XT. I estimate your slowest PC would take double that, just under 2 days. Sounds doable, so what's the problem?

https://lhcathome.cern.ch/lhcathome/result.php?resultid=399275439

I shall put everything on Atlas here and see what happens, I have very slow computers and reasonably fast ones.

Great, I changed all the settings and cancelled work from other projects, then found there's none left to get. Well, that was a waste of time.
ID: 48603
Erich56
Joined: 18 Dec 15
Posts: 1690
Credit: 104,078,093
RAC: 121,864
Message 48604 - Posted: 21 Sep 2023, 15:48:50 UTC - in response to Message 48602.  

Harri Liljeroos wrote:
A few new tasks have been generated (actually over 300), and I got two of them. They are still 2000-job tasks. The server has cancelled all my tasks that were still running. Several hundred hours of crunching wasted.

This for sure is something that should not happen. Not really nice of the project people :-(
The same happened here, but with a few tasks only, so the waste is not more than some 18 hours. But still, that's annoying.

Hence, I will not try to download ATLAS for the time being. My favorite would be CMS, but tasks are still available on an irregular basis only :-(
So, no other choice than to stick with Theory.
ID: 48604
Harri Liljeroos
Joined: 28 Sep 04
Posts: 675
Credit: 43,678,790
RAC: 15,801
Message 48605 - Posted: 21 Sep 2023, 16:12:59 UTC

Well, I was counting the CPU hours. One task had 181 CPU hours on it and another had 216 hours. When the server cancels tasks, those values usually don't get reported and the task is shown with 0 hours (but sometimes they are reported; I don't know what makes the difference). These tasks were re-sends, but they still had time left before the deadline. The server cancelled them anyway.

These 2000 job tasks take somewhere between 250 and 400 hours of CPU time to finish on my hosts.
ID: 48605


©2024 CERN