Message boards : ATLAS application : Console monitoring

gyllic

Joined: 9 Dec 14
Posts: 202
Credit: 2,533,875
RAC: 0
Message 29613 - Posted: 25 Mar 2017, 10:33:36 UTC - in response to Message 29612.  
Last modified: 25 Mar 2017, 10:35:00 UTC

Erich56 wrote:
The console does not always seem to work:

I currently have a task running (1-core), and I have opened the console a few times since it started some 36 hours ago (BOINC shows a progress of about 88%). Every time, the console showed what it was supposed to show.
Now, suddenly, the console is completely black. Not a single figure or letter.
What does this mean? Is the task broken? Is the console broken? Is the VM broken?

Try clicking into the console and pressing Enter once; maybe you will then see some output. I'm not sure whether this works, though.
ID: 29613
Yeti
Volunteer moderator
Joined: 2 Sep 04
Posts: 436
Credit: 117,893,361
RAC: 7,675
Message 29617 - Posted: 25 Mar 2017, 11:09:48 UTC - in response to Message 29612.  

Erich56 wrote:
What does this mean? Is the task broken? Is the console broken? Is the VM broken?

Go to my Checklist V3 and check Number 16 Scenario E


Supporting BOINC, a great concept !
ID: 29617
Erich56
Joined: 18 Dec 15
Posts: 1516
Credit: 46,055,466
RAC: 56,791
Message 29628 - Posted: 25 Mar 2017, 15:24:57 UTC - in response to Message 29617.  

Thanks, Yeti, for your advice.

However, when I came back home later on, I saw that the task had finished properly. So maybe something was wrong only with the console GUI.
ID: 29628
Yeti
Volunteer moderator
Joined: 2 Sep 04
Posts: 436
Credit: 117,893,361
RAC: 7,675
Message 29662 - Posted: 26 Mar 2017, 21:35:07 UTC

I'm running single-core, but each event number appears several times!?




Supporting BOINC, a great concept !
ID: 29662
David Cameron
Project administrator
Project developer
Project scientist
Joined: 13 May 14
Posts: 368
Credit: 13,468,233
RAC: 7,473
Message 29665 - Posted: 27 Mar 2017, 7:32:03 UTC - in response to Message 29662.  

Each update is appended to the previous output, which is why you see repetition (notice that the timestamps are the same). I will try to clear the screen each time before printing the output.
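Clearing the screen before each refresh, as described above, could look roughly like the following. This is a minimal Python sketch, not the project's actual script: it assumes the VM console honours standard ANSI escape sequences, `render_snapshot` is a hypothetical helper name, and the sample lines are made up.

```python
# Hypothetical helper: build one full-screen refresh instead of appending.
CLEAR_SCREEN = "\033[2J\033[H"  # ANSI: erase display, then move cursor to top-left


def render_snapshot(lines):
    """Return text that clears the screen and then shows only the given lines."""
    return CLEAR_SCREEN + "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Made-up sample lines; each refresh replaces what the previous one printed.
    print(render_snapshot(["Event nr. 1 took 120 s", "Event nr. 2 took 115 s"]), end="")
```

Because every refresh starts with the clear sequence, lines from an earlier refresh can no longer pile up on the console.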
ID: 29665
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Joined: 15 Jun 08
Posts: 2011
Credit: 147,507,652
RAC: 114,758
Message 29697 - Posted: 28 Mar 2017, 12:19:20 UTC

@David Cameron

The console output works and is much better than having nothing.

Can you add the total number of WU events?
Perhaps like:
... Event nr. 5/100 took ...
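The suggested format could be produced by something like the sketch below. It is only an illustration: `format_event` is a hypothetical function, and the exact wording of the real log line (including the unit after "took") is an assumption.

```python
def format_event(event_nr, total_events, seconds):
    """Format one progress line that includes the total number of events."""
    if not 1 <= event_nr <= total_events:
        raise ValueError("event_nr must be between 1 and total_events")
    # The trailing unit and one-decimal precision are assumptions for illustration.
    return f"Event nr. {event_nr}/{total_events} took {seconds:.1f} s"
```

For example, `format_event(5, 100, 123.4)` gives a line of the form requested above, with the total visible next to the current event number.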
ID: 29697
MPI für Physik
Joined: 20 Mar 15
Posts: 7
Credit: 555,444,194
RAC: 1,084,477
Message 29709 - Posted: 29 Mar 2017, 15:34:06 UTC

It seems that the new information output also produces a lot of mail.
Every time an event is processed, a grep is run on the event logs, but the location is wrong, so the postmaster sends a mail every time.


Subject: Cron <root@localhost> grep -h "Event nr" /home/atlas01/RunAtlas/Panda_Pilot_*/PandaJob_*/athenaMP-workers-EVNTtoHITS-sim/worker_*/AthenaMP.log|sort > /dev/tty2
grep: /home/atlas01/RunAtlas/Panda_Pilot_*/PandaJob_*/athenaMP-workers-EVNTtoHITS-sim/worker_*/AthenaMP.log: No such file or directory

It would be great if that could be fixed!
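The mail is triggered by grep's error message when the wildcard path matches no file. One way to avoid that is to expand the pattern first and stay silent when nothing matches. The sketch below is written in Python purely for illustration; it is not the fix that was actually deployed, `event_lines` is a hypothetical helper, and only the path pattern is taken from the cron mail above.

```python
import glob

# Pattern copied from the cron mail above.
LOG_PATTERN = (
    "/home/atlas01/RunAtlas/Panda_Pilot_*/PandaJob_*/"
    "athenaMP-workers-EVNTtoHITS-sim/worker_*/AthenaMP.log"
)


def event_lines(pattern=LOG_PATTERN):
    """Return sorted 'Event nr' lines, or an empty list if no log exists yet."""
    lines = []
    # An unmatched pattern yields no paths here instead of an error message.
    for path in glob.glob(pattern):
        with open(path, errors="replace") as log:
            lines.extend(line.rstrip("\n") for line in log if "Event nr" in line)
    return sorted(lines)


if __name__ == "__main__":
    for line in event_lines():
        print(line)
```

With no matching log, the script prints nothing, so a cron job built this way would have nothing to mail.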
ID: 29709
David Cameron
Project administrator
Project developer
Project scientist
Joined: 13 May 14
Posts: 368
Credit: 13,468,233
RAC: 7,473
Message 29721 - Posted: 30 Mar 2017, 8:59:56 UTC - in response to Message 29709.  

This is fixed now (the fix will be propagated to new tasks in a few hours).

The errors should only happen at the start of the task, before the log has been started. Did you see them during the whole task?
ID: 29721
MPI für Physik
Joined: 20 Mar 15
Posts: 7
Credit: 555,444,194
RAC: 1,084,477
Message 29723 - Posted: 30 Mar 2017, 12:29:14 UTC - in response to Message 29721.  

Thank you David!

Unfortunately, I saw it during the whole task, on each PC, so there was a huge number of emails.
I will check if it is fine now.
ID: 29723
Harri Liljeroos
Joined: 28 Sep 04
Posts: 588
Credit: 33,781,019
RAC: 19,686
Message 29724 - Posted: 30 Mar 2017, 14:28:29 UTC

The TOP (Alt+F3) does not work anymore. It stopped working when I changed to running with a single core.
ID: 29724
Yeti
Volunteer moderator
Joined: 2 Sep 04
Posts: 436
Credit: 117,893,361
RAC: 7,675
Message 29725 - Posted: 30 Mar 2017, 14:44:01 UTC - in response to Message 29724.  

The TOP (Alt+F3) does not work anymore. It stopped working when I changed to running with a single core.

Are you sure? TOP wasn't available on ATLAS until now.


Supporting BOINC, a great concept !
ID: 29725
Harri Liljeroos
Joined: 28 Sep 04
Posts: 588
Credit: 33,781,019
RAC: 19,686
Message 29729 - Posted: 30 Mar 2017, 17:38:45 UTC - in response to Message 29725.  

Yes, it was working about a week ago when it was announced, but it is not working at the moment. With Alt+F3 I now get the same screen as with Alt+F1.
ID: 29729
David Cameron
Project administrator
Project developer
Project scientist
Joined: 13 May 14
Posts: 368
Credit: 13,468,233
RAC: 7,473
Message 29731 - Posted: 30 Mar 2017, 19:49:14 UTC - in response to Message 29729.  

Well, it was kind of half-working a week ago, but my latest attempts to make it work fully stopped it from working completely.

The problem seems to be running a persistent command with sudo (root permission is needed to write to the console) inside a script that is run as a normal user. It works for other LHC projects because, as I understand it, they run their bootstrap scripts as root. I will keep trying to find a way to make it work.
ID: 29731
Timo425
Joined: 28 Sep 17
Posts: 4
Credit: 451,660
RAC: 0
Message 33052 - Posted: 12 Nov 2017, 13:30:08 UTC
Last modified: 12 Nov 2017, 13:33:18 UTC

So I'm trying to figure out why my LHC tasks on Linux seem to run so slowly.
ATLAS Simulation 1.01 (vbox64_mt_mcore_atlas) has been running for almost 13 hours now and is currently at 99.994%. However, it is progressing very slowly at this point.
Alt+F2 does nothing in the VM console; only Alt+F1 and Alt+F6 switch to some other information. I'm using Ubuntu, with 16 GB of RAM and an i5-6500 CPU. My CPU usage is currently set to 99%; 3 cores are used by LHC and 1 core is shared by 3 Einstein WUs at the same time. Also, the LHC priority is half of normal.
The task manager shows 5 GB of 15.6 GB used, and 1 core is running at 100% while the other three usually sit at around 10-25%.
ID: 33052
Timo425
Joined: 28 Sep 17
Posts: 4
Credit: 451,660
RAC: 0
Message 33057 - Posted: 13 Nov 2017, 10:44:16 UTC
Last modified: 13 Nov 2017, 10:49:34 UTC

It seems I can't edit my own previous post?
Anyway, I cancelled my previous task and started a new one. Now Alt+F2 and Alt+F3 work in the VM console, and it seems that the cores are only being used at around 0.3%. This new task has been running for 14 hours now and the CPU time is only 33 minutes. It looks like this happens to every ATLAS WU I get.
I will try to look around the forum for a solution, if there is one..

EDIT: I already added an app_config.xml for the task. Currently I tell it to use 3 cores, and the RAM limit is set to 8 GB. Any other suggestions for the XML?
app_config.xml:
<?xml version="1.0"?>
<app_config>
  <project_max_concurrent>3</project_max_concurrent>
  <app>
    <name>ATLAS</name>
    <max_concurrent>1</max_concurrent>
  </app>
  <app_version>
    <app_name>ATLAS</app_name>
    <avg_ncpus>3.000000</avg_ncpus>
    <plan_class>vbox64_mt_mcore_atlas</plan_class>
    <cmdline>--memory_size_mb 8000</cmdline>
  </app_version>
</app_config>
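Before restarting the client, it can help to confirm that the file is well-formed XML and that the values read back as intended. The following Python sketch does only that; it does not check the file against BOINC's own rules, `summarize` is a hypothetical helper, and the embedded text is a copy of the app_config.xml above.

```python
import xml.etree.ElementTree as ET

# Copy of the app_config.xml posted above.
APP_CONFIG = """<?xml version="1.0"?>
<app_config>
  <project_max_concurrent>3</project_max_concurrent>
  <app>
    <name>ATLAS</name>
    <max_concurrent>1</max_concurrent>
  </app>
  <app_version>
    <app_name>ATLAS</app_name>
    <avg_ncpus>3.000000</avg_ncpus>
    <plan_class>vbox64_mt_mcore_atlas</plan_class>
    <cmdline>--memory_size_mb 8000</cmdline>
  </app_version>
</app_config>
"""


def summarize(xml_text):
    """Parse the config and return the core count and memory size it requests."""
    root = ET.fromstring(xml_text)  # raises ParseError if the XML is malformed
    version = root.find("app_version")
    cmdline = version.findtext("cmdline", default="").split()
    memory_mb = None
    if "--memory_size_mb" in cmdline:
        memory_mb = int(cmdline[cmdline.index("--memory_size_mb") + 1])
    return {
        "plan_class": version.findtext("plan_class"),
        "avg_ncpus": float(version.findtext("avg_ncpus")),
        "memory_size_mb": memory_mb,
    }


if __name__ == "__main__":
    print(summarize(APP_CONFIG))
```

A typo such as an unclosed tag would surface here as a parse error rather than as a silently ignored setting.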
ID: 33057
Yeti
Volunteer moderator
Joined: 2 Sep 04
Posts: 436
Credit: 117,893,361
RAC: 7,675
Message 33058 - Posted: 13 Nov 2017, 10:45:47 UTC - in response to Message 33057.  
Last modified: 13 Nov 2017, 10:46:10 UTC

I will try to look around the forum for a solution, if there is one..

Take a walk through my checklist: https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=4161&postid=29359#29359


Supporting BOINC, a great concept !
ID: 33058
Timo425
Joined: 28 Sep 17
Posts: 4
Credit: 451,660
RAC: 0
Message 33059 - Posted: 13 Nov 2017, 14:31:00 UTC
Last modified: 13 Nov 2017, 14:34:10 UTC

Yeti, thanks! Resetting the LHC@home project seemed to do the trick. Alt+F3 showed not much activity in the first 10 minutes, but then 3 cores started crunching hard at 100%. :) Hopefully it will stay that way when I resume other projects as well; for now I will try to run a few ATLAS WUs in a row.
There is a lot going on in Alt+F2 too.
ID: 33059
rbpeake
Joined: 17 Sep 04
Posts: 83
Credit: 25,643,071
RAC: 5,294
Message 46979 - Posted: 6 Jul 2022, 20:00:12 UTC - in response to Message 29473.  
Last modified: 6 Jul 2022, 20:01:13 UTC

[Repeated]
Regards,
Bob P.
ID: 46979
rbpeake
Joined: 17 Sep 04
Posts: 83
Credit: 25,643,071
RAC: 5,294
Message 46980 - Posted: 6 Jul 2022, 20:00:24 UTC - in response to Message 29473.  

We have added some information on the processed events in ATLAS tasks on consoles inside the VM.

To show the consoles, go to the advanced view of BOINC manager, select a running ATLAS task and you should see the button "Show VM Console" on the left menu. If you do not see this button you may need to install the VirtualBox extension pack and/or install remote desktop software such as CoRD on Mac OS or xfreerdp on Linux. There should be remote desktop software included by default on Windows but maybe someone else can confirm this.

When you click "Show VM Console" you should see a terminal window with a login prompt. If you press Alt-F2 (Alt-Fn-F2 on Mac) you should see a screen like this:



NOTE you will only see this information after the task has been running for some time, i.e. has simulated at least 1 event. So please wait up to 30 minutes for information to appear.

This output shows the number of events processed by each core, as well as the time per event and the average time per event so far. Each core has its own independent counter, which is why you see the event numbers repeated. In the example there are 4 cores, and with 100 events per task each core will process 25 events. This information can therefore give you an estimate of how long the task will run.
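The estimate described above can be worked out as in the following sketch. It assumes the events are split evenly across the cores, as in the 100 events over 4 cores example; rounding up for uneven splits is an added assumption, and `estimate_remaining_seconds` is a hypothetical helper.

```python
import math


def estimate_remaining_seconds(total_events, cores, events_done_on_core, avg_seconds_per_event):
    """Estimate the time left on one core from its counter and average time per event."""
    # Even split assumed; ceil() covers totals that do not divide evenly.
    events_per_core = math.ceil(total_events / cores)
    remaining_events = max(events_per_core - events_done_on_core, 0)
    return remaining_events * avg_seconds_per_event


if __name__ == "__main__":
    # Made-up reading: a core has finished 5 of its 25 events at 300 s per event.
    print(estimate_remaining_seconds(100, 4, 5, 300.0))
```

Since the cores work in parallel, the core with the largest remaining estimate gives the rough time until the task finishes.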

We are working on putting the "top" output into console 3 (Alt-F3) but it doesn't quite work perfectly yet.


I can't get this to work. Does it remain a valid procedure?
Thanks.
Regards,
Bob P.
ID: 46980
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Joined: 15 Jun 08
Posts: 2011
Credit: 147,507,652
RAC: 114,758
Message 46981 - Posted: 6 Jul 2022, 20:17:26 UTC - in response to Message 46980.  

The appearance of console 2 (press ALT + F2) has completely changed.
It presents the data from the same log that you see in the screenshot but in a different manner.

On console 3 (press ALT + F3) you will see the output of the "top" command of the running VM.

Both commands run at low priority to leave the majority of cpu cycles for the scientific app.
Hence, it may take a couple of minutes until the VM's basic setup has finished and the required components are available.
ID: 46981


©2022 CERN