Message boards : ATLAS application : ATLAS vbox and native 3.01
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2433
Credit: 227,832,888
RAC: 124,960
Message 47935 - Posted: 29 Mar 2023, 7:35:17 UTC - in response to Message 47932.  

... ALT-F2 monitoring ...

Right, it works here since the tasks are still processing data from LHC Run 2 using the matching scripts.
It didn't work at -dev recently because the tasks there were tests from LHC Run 3 (data/scripts).

ATLAS Monitoring is not yet prepared to deal with the slightly modified Run 3 logging.
Once this is implemented it will be tested at -dev first.
So far the suggestion is to stay patient.
ID: 47935
Jonathan

Joined: 25 Sep 17
Posts: 99
Credit: 3,298,927
RAC: 4,002
Message 47936 - Posted: 29 Mar 2023, 11:33:45 UTC - in response to Message 47914.  

Is the VM RAM size going to be adjusted eventually or will the current defaults be used because 3.01 can run 'Run 2' and 'Run 3'? Will the VM change the RAM assigned by 'Run 2' vs 'Run 3'?
ID: 47936
Crystal Pellet
Volunteer moderator
Volunteer tester

Joined: 14 Jan 10
Posts: 1289
Credit: 8,522,395
RAC: 2,336
Message 47937 - Posted: 29 Mar 2023, 12:48:23 UTC - in response to Message 47936.  

Is the VM RAM size going to be adjusted eventually or will the current defaults be used because 3.01 can run 'Run 2' and 'Run 3'? Will the VM change the RAM assigned by 'Run 2' vs 'Run 3'?

The RAM size is set to 4000 MB, independent of whether the tasks are from run 2 or run 3.
It also doesn't matter whether you run single core vbox tasks or multi-core e.g. 4 or 8 cores.
ID: 47937
Jonathan

Joined: 25 Sep 17
Posts: 99
Credit: 3,298,927
RAC: 4,002
Message 47938 - Posted: 29 Mar 2023, 16:41:42 UTC - in response to Message 47937.  

VM is assigning 8400 MB for my 6 core tasks. Max # CPUs set to 6 on LHC preferences. No app_config.xml used.
ID: 47938
rbpeake
Joined: 17 Sep 04
Posts: 99
Credit: 30,836,799
RAC: 8,091
Message 47939 - Posted: 29 Mar 2023, 17:46:58 UTC - in response to Message 47937.  

Is the VM RAM size going to be adjusted eventually or will the current defaults be used because 3.01 can run 'Run 2' and 'Run 3'? Will the VM change the RAM assigned by 'Run 2' vs 'Run 3'?

The RAM size is set to 4000 MB, independent of whether the tasks are from run 2 or run 3.
It also doesn't matter whether you run single core vbox tasks or multi-core e.g. 4 or 8 cores.

Was it determined in years past that runs using 4 or more cores resulted in a sharp reduction in processing efficiency?
Regards,
Bob P.
ID: 47939
Harri Liljeroos
Joined: 28 Sep 04
Posts: 677
Credit: 43,806,880
RAC: 14,366
Message 47940 - Posted: 29 Mar 2023, 18:04:37 UTC - in response to Message 47938.  

VM is assigning 8400 MB for my 6 core tasks. Max # CPUs set to 6 on LHC preferences. No app_config.xml used.

I am running ATLAS tasks in a VM on Windows 10 using 4 cores per task. I removed the command-line parameter from my app_config.xml that set the memory usage to 6600 KB. The tasks still use 6600 KB of memory with 4 CPUs when using the 3.01 application. This can also be seen in the Alt+F3 top terminal window.
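
For readers who have not used such an override, here is a minimal sketch in Python that generates an app_config.xml of the kind described above. `--memory_size_mb` is a vboxwrapper command-line option; the plan class name `vbox64_mt_mcore_atlas` and the values passed in are assumptions for illustration only, so check your own client files before using anything like this.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name, plan_class, ncpus, memory_mb):
    """Return app_config.xml content (as a string) with a VM memory override."""
    root = ET.Element("app_config")
    app_version = ET.SubElement(root, "app_version")
    ET.SubElement(app_version, "app_name").text = app_name
    ET.SubElement(app_version, "plan_class").text = plan_class
    ET.SubElement(app_version, "avg_ncpus").text = str(ncpus)
    # The line that was removed in the post above would live here.
    ET.SubElement(app_version, "cmdline").text = f"--memory_size_mb {memory_mb}"
    ET.indent(root)  # pretty-print; needs Python 3.9 or newer
    return ET.tostring(root, encoding="unicode")


# Hypothetical values: plan class is an assumption, not confirmed in this thread.
xml_text = build_app_config("ATLAS", "vbox64_mt_mcore_atlas", 4, 6600)
print(xml_text)
```

Removing the `cmdline` element leaves the memory size to the project defaults, which is the situation reported in this post.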
ID: 47940
Harri Liljeroos
Joined: 28 Sep 04
Posts: 677
Credit: 43,806,880
RAC: 14,366
Message 47941 - Posted: 29 Mar 2023, 19:06:49 UTC - in response to Message 47940.  

VM is assigning 8400 MB for my 6 core tasks. Max # CPUs set to 6 on LHC preferences. No app_config.xml used.

I am running ATLAS tasks in a VM on Windows 10 using 4 cores per task. I removed the command-line parameter from my app_config.xml that set the memory usage to 6600 KB. The tasks still use 6600 KB of memory with 4 CPUs when using the 3.01 application. This can also be seen in the Alt+F3 top terminal window.

Edit: I meant of course 6600 MB
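
The two figures reported so far (6600 MB with 4 cores, 8400 MB with 6 cores) are consistent with a memory size that grows linearly with the core count rather than a fixed 4000 MB. A quick sketch of that arithmetic follows; the linear model is only an inference from these two data points, not a confirmed project formula.

```python
# Data points reported in this thread: (cores, VM memory in MB)
reports = [(4, 6600), (6, 8400)]

(c1, m1), (c2, m2) = reports
per_core_mb = (m2 - m1) / (c2 - c1)   # slope: extra MB per additional core
base_mb = m1 - per_core_mb * c1       # intercept: MB extrapolated to zero cores


def predicted_memory_mb(cores):
    """Memory predicted by the straight line through the two reported points."""
    return base_mb + per_core_mb * cores


print(per_core_mb, base_mb)           # 900.0 3000.0
for cores in (1, 5, 8):
    print(cores, predicted_memory_mb(cores))
```

Under this assumed model a single-core task would get 3900 MB, which is close to the 4000 MB quoted earlier.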
ID: 47941
Harri Liljeroos
Joined: 28 Sep 04
Posts: 677
Credit: 43,806,880
RAC: 14,366
Message 47942 - Posted: 30 Mar 2023, 14:38:57 UTC

Is there a way to recognize the run 3 tasks? From the name? Memory consumption? Or how?
ID: 47942
Crystal Pellet
Volunteer moderator
Volunteer tester

Joined: 14 Jan 10
Posts: 1289
Credit: 8,522,395
RAC: 2,336
Message 47943 - Posted: 30 Mar 2023, 15:13:55 UTC - in response to Message 47942.  

Is there a way to recognize the run 3 tasks? From the name? Memory consumption? Or how?

At the moment the output of ALT-F2 monitoring is garbled for run 3 and OK for run 2.
ID: 47943
BellyNitpicker

Joined: 16 Jun 20
Posts: 8
Credit: 2,318,092
RAC: 0
Message 47944 - Posted: 30 Mar 2023, 17:22:34 UTC

I'm running ATLAS on a Mac Mini (MacOS 12.6.3) under BOINC (7.20.4) and VirtualBox (7.0.6) with five CPUs, a max memory allocation of 12GB, and disc of 40GB.

Each ATLAS task is set for 5 CPUs and an initial run time of 00:22:20. They reach 100% after the prescribed time, then run away. I've just chopped the last batch, as one had run from 00:22:20 to more than 19 hours while showing progress as 100%.

All was fine until two days ago, when I upgraded VB from 7.0.4.

However, my Ubuntu VMs run without problems.

Is there an LHC problem with VB 7.0.6?
ID: 47944
maeax

Joined: 2 May 07
Posts: 2124
Credit: 159,926,969
RAC: 54,657
Message 47945 - Posted: 30 Mar 2023, 17:27:24 UTC - in response to Message 47944.  

For Windows we need Python.
ID: 47945
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2433
Credit: 227,832,888
RAC: 124,960
Message 47946 - Posted: 30 Mar 2023, 18:06:47 UTC - in response to Message 47944.  

... with five CPUs, a max memory allocation of 12GB, and disc of 40GB

Where did you set this?
The recent logs show 7500 MB are allocated for the ATLAS VM:
https://lhcathome.cern.ch/lhcathome/result.php?resultid=391062360



Same here:
task is set for ... an initial run time of 00:22:20

Where did you set this?



... had run from 00:22:20 to more than 19 hours

Very unusual.


... while showing progress as 100%

This might be caused by the fact that ATLAS published a new version.
App updates always mess up BOINC's runtime estimation.
It will take a while until BOINC (server/client) negotiate the new effective GFLOPs value to be used in the runtime/credit calculation. Until then BOINC's progress bar and runtime estimation are pretty much useless.
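
To illustrate why the estimate can be so far off: BOINC derives the expected runtime roughly as the task's estimated floating-point operations divided by the speed it assumes for the app version. A minimal sketch with made-up numbers (every value below is hypothetical and chosen only to show the effect):

```python
def estimated_runtime_s(fpops_est, assumed_flops):
    """Runtime estimate in seconds: estimated work divided by assumed speed."""
    return fpops_est / assumed_flops


def fmt_hms(seconds):
    """Format a duration in seconds as HH:MM:SS."""
    s = int(round(seconds))
    return f"{s // 3600:02d}:{s % 3600 // 60:02d}:{s % 60:02d}"


fpops_est = 2.68e14    # hypothetical work estimate for one task
stale_flops = 2.0e11   # hypothetical speed assumed right after an app update
real_flops = 5.0e9     # hypothetical effective speed once results are reported

print(fmt_hms(estimated_runtime_s(fpops_est, stale_flops)))  # 00:22:20
print(fmt_hms(estimated_runtime_s(fpops_est, real_flops)))   # 14:53:20
```

With an overly optimistic speed value the progress bar reaches about 100% long before the task is done, and only corrects itself after the speed value has been updated from returned results.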

Besides that, ATLAS 3.x has a new (larger) vdi file.
Unfortunately your computer has not yet reported a 3.x task, hence it's not possible to check the log for errors.

LHC shouldn't have a problem with Vbox 7.x, especially since you successfully ran the previous version.
ID: 47946
Jonathan

Joined: 25 Sep 17
Posts: 99
Credit: 3,298,927
RAC: 4,002
Message 47947 - Posted: 30 Mar 2023, 20:44:51 UTC - in response to Message 47944.  

I don't see any aborted ATLAS tasks listed under your computer. Two CMS tasks show as aborted.

Did you set LHC to No New Tasks, and abort non-running tasks or allow the queue to clear, before your VirtualBox upgrade? That is usually the best procedure.
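
A sketch of that procedure as a script, assuming `boinccmd` is on the PATH. The project URL below is taken from the result link earlier in the thread and is an assumption; use the URL your client is actually attached with. The script only builds and prints the commands, it does not run them.

```python
import shlex

# Assumption: replace with the project URL shown by your own BOINC client.
PROJECT_URL = "https://lhcathome.cern.ch/lhcathome/"


def boinccmd(*args):
    """Build a boinccmd invocation as an argument list (not executed here)."""
    return ["boinccmd", *args]


before_upgrade = [
    boinccmd("--project", PROJECT_URL, "nomorework"),  # stop fetching new tasks
    boinccmd("--get_tasks"),                           # see what is still queued
]
after_upgrade = [
    boinccmd("--project", PROJECT_URL, "allowmorework"),  # resume work fetch
]

for cmd in before_upgrade + after_upgrade:
    print(shlex.join(cmd))
```

Upgrade VirtualBox only once the task list is empty, then run the last command.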
ID: 47947
BellyNitpicker

Joined: 16 Jun 20
Posts: 8
Credit: 2,318,092
RAC: 0
Message 47948 - Posted: 30 Mar 2023, 20:55:35 UTC - in response to Message 47946.  

Where did you set this?


I didn't set it. That's the estimated process time when it arrives.

My settings give BOINC a 40% share of 32GB while running native under MacOS.

Very unusual.


Maybe, but the next task is doing the same. True the percentage is not exactly 100, but it's 99.992, and I would expect that tomorrow morning, with an additional 12 hours on the clock, it might have increased a couple of thousandths of a percent.

https://drive.google.com/file/d/1qDWN1Kdv82dz4bCAwN4EZ1SES4UWmzWq/view?usp=share_link
ID: 47948
BellyNitpicker

Joined: 16 Jun 20
Posts: 8
Credit: 2,318,092
RAC: 0
Message 47949 - Posted: 30 Mar 2023, 21:00:07 UTC - in response to Message 47946.  

Where did you set this?


My BOINC settings give it 40% of 32GB.

Task time is the estimate provided when the task arrives.

Very unusual.


Maybe so, but the next one is the same. Granted, it's 99.992% at the moment, but it's run for 3hrs 25mins out of its allotted 20 mins and 20 seconds, and looks as though it might accrue another couple of thousandths of a percent over the next twelve hours.

https://drive.google.com/file/d/1qDWN1Kdv82dz4bCAwN4EZ1SES4UWmzWq/view?usp=share_link
ID: 47949
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2433
Credit: 227,832,888
RAC: 124,960
Message 47952 - Posted: 31 Mar 2023, 5:48:03 UTC - in response to Message 47949.  

OK, I was thinking in a different direction.

As already mentioned, don't trust the progress bar or the remaining time until your computer has returned a couple of results.
Just let ATLAS run, ideally without intermediate suspension.
ID: 47952
BellyNitpicker

Joined: 16 Jun 20
Posts: 8
Credit: 2,318,092
RAC: 0
Message 47954 - Posted: 31 Mar 2023, 14:13:28 UTC - in response to Message 47952.  

Here's the same task, 16 hours later. It has been showing remaining time as --- and percent complete as 100.000% for more than ten hours. The elapsed time is clocking up, but there is no CPU activity associated with the task.

https://drive.google.com/file/d/199A_stcrLa-UzNMEwYPzLSPFfkhg8NFY/view?usp=share_link

As far as I'm concerned, it's a runaway. I can't afford to keep 45% of my available CPUs clogged with a series of runaway tasks, so I've chopped it. I'll come back to LHC in a few weeks to see if the problems are sorted.
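
The symptom described here (elapsed time increasing while CPU time does not) can be checked mechanically. A minimal sketch, assuming you note down a task's elapsed time and CPU time yourself at intervals; the function name, sample values, and thresholds are all hypothetical.

```python
def looks_stalled(samples, min_window_s=3600, max_cpu_fraction=0.01):
    """Flag a task whose CPU time barely moves while wall time advances.

    samples: list of (elapsed_s, cpu_s) tuples in chronological order.
    Returns True if, over at least min_window_s of wall time, CPU time
    grew by less than max_cpu_fraction of that wall time.
    """
    if len(samples) < 2:
        return False
    (e0, c0), (e1, c1) = samples[0], samples[-1]
    wall = e1 - e0
    if wall < min_window_s:
        return False  # not enough data to judge
    return (c1 - c0) / wall < max_cpu_fraction


# Hypothetical samples: a multi-core task doing work vs. one sitting idle.
busy = [(0, 0), (3600, 16000), (7200, 33000)]
idle = [(36000, 5000), (39600, 5002), (43200, 5003)]

print(looks_stalled(busy))  # False
print(looks_stalled(idle))  # True
```

A task in its setup phase can also show little CPU use for a while, so the window should be long enough to rule that out.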
ID: 47954
Jonathan

Joined: 25 Sep 17
Posts: 99
Credit: 3,298,927
RAC: 4,002
Message 47955 - Posted: 31 Mar 2023, 15:03:15 UTC - in response to Message 47954.  

I don't think ATLAS tasks like to be started and stopped.
I would pick a single project application to run and stick to only that for now.
You can also work through Yeti's checklist. It's a sticky in the forums.

Your picture shows an ATLAS task waiting to run at about 8 minutes in. It takes about 15 minutes before the computation phase of ATLAS starts, as it is talking to CVMFS and getting the needed files and info.
ID: 47955
Toggleton

Joined: 4 Mar 17
Posts: 20
Credit: 8,299,338
RAC: 9,127
Message 47956 - Posted: 1 Apr 2023, 6:35:47 UTC - in response to Message 47924.  
Last modified: 1 Apr 2023, 7:05:38 UTC

It seems that for the past few hours some tasks are 240 MB again instead of over 1 GB. Tasks with EVNT 321... are the smaller ones. Some 327... tasks with 1 GB are still coming in.
ID: 47956
maeax

Joined: 2 May 07
Posts: 2124
Credit: 159,926,969
RAC: 54,657
Message 47957 - Posted: 1 Apr 2023, 7:22:48 UTC - in response to Message 47954.  

BellyNitpicker,
your OS is Darwin and you have successful tasks with the old ATLAS version 2.03.
With version 3.01 something went wrong. Maybe more RAM is needed.
ID: 47957


©2024 CERN