Missing Output at Console 2
Author | Message |
---|---|
Send message Joined: 15 Jun 08 Posts: 2401 Credit: 225,401,087 RAC: 123,599 |
8 on my main computer are fine; they have the scrolling list of events. Mine are still without console output (except for the first line). |
Send message Joined: 2 Sep 04 Posts: 453 Credit: 193,464,258 RAC: 5,837 |
As far as I understand, it depends on the kind of input files that our LHC WUs are generated from. Maybe David has not finished his fix for this yet. So it need not be critical if the output is missing; if you want to check whether your WU is healthy, use "Properties" and compare "Elapsed time" with "CPU time". Supporting BOINC, a great concept! |
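The health check suggested above (compare elapsed time with CPU time in the task's "Properties" dialog) can be written down as a small calculation. This is only a sketch: the function names and the 0.5 threshold are illustrative assumptions, not values used by BOINC or the ATLAS project.

```python
def cpu_efficiency(elapsed_seconds: float, cpu_seconds: float, n_cores: int = 1) -> float:
    """Return CPU time as a fraction of the wall-clock time available on n_cores."""
    if elapsed_seconds <= 0 or n_cores <= 0:
        raise ValueError("elapsed_seconds and n_cores must be positive")
    return cpu_seconds / (elapsed_seconds * n_cores)


def looks_healthy(elapsed_seconds: float, cpu_seconds: float,
                  n_cores: int = 1, threshold: float = 0.5) -> bool:
    """Flag a task as healthy when its CPU efficiency reaches the threshold.

    The threshold is an arbitrary illustrative cut-off, not a project value.
    """
    return cpu_efficiency(elapsed_seconds, cpu_seconds, n_cores) >= threshold
```

A task whose CPU time stays far below its elapsed time is probably idling or stuck, whereas a task that keeps accumulating CPU time at roughly the same pace as wall-clock time is most likely working even if console 2 shows nothing.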
Send message Joined: 15 Jun 08 Posts: 2401 Credit: 225,401,087 RAC: 123,599 |
So far as I have understood it depends on the kind of Input-Files that our LHC-WUs are generated from. Maybe that David had not finished his fix for this. My results seem to be valid, so the main objective of the WUs is fulfilled. The missing console output is nothing to worry about. My comment was only meant as a hint that some hosts still have problems with it while others, e.g. Toby Broom's, don't. |
Send message Joined: 18 Dec 15 Posts: 1687 Credit: 102,987,306 RAC: 125,849 |
A minute ago, I retried the VM console - and, surprise, this time it worked :-) Thanks to whoever got the problem solved! |
Send message Joined: 15 Jun 08 Posts: 2401 Credit: 225,401,087 RAC: 123,599 |
A minute ago, I retried the VM console - and, surprise, this time it worked :-) Is it really solved? On my hosts the output is printed only right before the end of the WU. |
Send message Joined: 15 Jun 08 Posts: 2401 Credit: 225,401,087 RAC: 123,599 |
A minute ago, I retried the VM console - and, surprise, this time it worked :-) Still not solved. And what about the plans to activate a top console? This would be very helpful to identify error conditions, e.g. not enough RAM. |
Send message Joined: 13 May 14 Posts: 387 Credit: 15,314,184 RAC: 0 |
I think there might be a problem with reporting the events in single-core tasks and from a brief look at your results I see only single core. In this case the logs we extract the information from are structured slightly differently. I'll try running some single core tasks to investigate. In the meantime I think I have finally got top working in console 3, can others confirm that it works for them? If so I will post a new thread to celebrate! |
Send message Joined: 15 Jun 08 Posts: 2401 Credit: 225,401,087 RAC: 123,599 |
David Cameron wrote: ... finally got top working in console 3, can others confirm that it works ... At least not for this WU from today: https://lhcathome.cern.ch/lhcathome/result.php?resultid=154071874 |
Send message Joined: 28 Sep 04 Posts: 675 Credit: 43,528,244 RAC: 15,546 |
David Cameron wrote:... finally got top working in console 3, can others confirm that it works ... I see this one with working top. https://lhcathome.cern.ch/lhcathome/result.php?resultid=154068006. But it is not acting like top on LHCb tasks (=updating screen on same position about every second) but scrolling a new screenful of text every second or so. Anyway it gives you the information about memory and CPU usage as it should. |
Send message Joined: 15 Jun 08 Posts: 2401 Credit: 225,401,087 RAC: 123,599 |
Harri Liljeroos wrote: I see this one with working top. https://lhcathome.cern.ch/lhcathome/result.php?resultid=154068006. But it is not acting like top on LHCb tasks (=updating screen on same position about every second) but scrolling a new screenful of text every second or so. Anyway it gives you the information about memory and CPU usage as it should. I can also confirm that the top output on console 3 works like Harri described it. |
Send message Joined: 13 May 14 Posts: 387 Credit: 15,314,184 RAC: 0 |
But it is not acting like top on LHCb tasks (=updating screen on same position about every second) but scrolling a new screenful of text every second or so. Anyway it gives you the information about memory and CPU usage as it should. Can you share a screenshot of your console? I could not manage to get a persistent top process running in a tty so I simply run it once every 5 seconds and send the first 24 lines of output to the tty. On my rdesktop (linux) it looks ok because the console is 24 lines high so I'd like to see if it looks bad on Windows. EDIT: this is how it looks for me: It shows nicely the single-core VM problem with 8 processes using 12.5% CPU each :) |
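The workaround described above (run top once every 5 seconds and send the first 24 lines to the tty) could look roughly like the following. It is a sketch under assumptions, not the script used inside the ATLAS VM: the device path `/dev/tty3`, the function names and the optional clear-screen prefix are all hypothetical. `top -b -n 1` is the batch-mode, single-iteration invocation of procps top.

```python
import subprocess
import time

# ANSI sequence: move the cursor home and clear the screen. Writing it before
# each snapshot is one possible way to redraw in place instead of scrolling.
CLEAR_SCREEN = "\033[H\033[2J"


def first_lines(text: str, max_lines: int = 24) -> str:
    """Keep only the first max_lines lines so one snapshot fits a 24-line console."""
    return "\n".join(text.splitlines()[:max_lines])


def snapshot(command=("top", "-b", "-n", "1"), max_lines: int = 24) -> str:
    """Run the command once and return its truncated output."""
    result = subprocess.run(list(command), capture_output=True, text=True, check=True)
    return first_lines(result.stdout, max_lines)


def run_forever(tty_path: str = "/dev/tty3", interval_seconds: float = 5.0,
                clear: bool = False) -> None:
    """Write a fresh snapshot to the tty every interval_seconds (never returns)."""
    while True:
        with open(tty_path, "w") as tty:
            prefix = CLEAR_SCREEN if clear else ""
            tty.write(prefix + snapshot() + "\n")
        time.sleep(interval_seconds)
```

Without the clear-screen prefix each snapshot is simply appended, which would match the scrolling behaviour reported on console 3; whether the prefix renders correctly in a given remote-desktop client is untested here.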
Send message Joined: 2 Sep 04 Posts: 453 Credit: 193,464,258 RAC: 5,837 |
|
Send message Joined: 28 Sep 04 Posts: 675 Credit: 43,528,244 RAC: 15,546 |
But it is not acting like top on LHCb tasks (=updating screen on same position about every second) but scrolling a new screenful of text every second or so. Anyway it gives you the information about memory and CPU usage as it should. Currently my system is struggling with over 50 LHCb tasks with deadline on the 9th. To push these through as fast as I can I have unselected all subprojects until I have cleared my cache of LHCb tasks. So until then I won't be running any Atlas tasks. |
Send message Joined: 15 Jun 08 Posts: 2401 Credit: 225,401,087 RAC: 123,599 |
Harri Liljeroos wrote: Currently my system is struggling with over 50 LHCb tasks with deadline on the 9th. Let me guess. In the past days you had a couple of LHCb WUs that finished very quickly due to the server problems. Those short runtimes raised the flops value in your host's DB record, which was echoed back via the latest scheduler replies. Now your hosts calculate a very short runtime estimate and download far too many WUs to fill the cache. As a side effect the credits for the next WUs will be close to zero. If this is the reason, it has to be reported to the developers. The outliers should be handled as they are at SixTrack. |
Send message Joined: 28 Sep 04 Posts: 675 Credit: 43,528,244 RAC: 15,546 |
Harri Liljeroos wrote: Currently my system is struggling with over 50 LHCb tasks with deadline on the 9th. That just about sums up what happened. Originally BOINC downloaded a bunch of tasks that lasted about 20 minutes each, and they all validated. This happened when I had switched from ATLAS to LHCb because ATLAS had run out of tasks and jobs. The average processing rate went above 400 GFLOPS (now it shows about 140). With the next ATLAS problems I switched to LHCb again, and BOINC downloaded over 90 LHCb tasks, assuming that they would also take only 20 minutes each. Instead they are taking anywhere between 12 minutes and 12 hours. The credit given matches the run times (from 15 to about 500). |
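The over-fetching described in the two posts above can be illustrated with a deliberately simplified model: the client estimates a task's runtime as the task's estimated floating-point operations divided by the host's processing rate, and then requests enough tasks to fill its cache. The function names, the cache size and the slot count below are assumptions for illustration; the real BOINC client and scheduler take more factors into account.

```python
def estimated_runtime_seconds(fpops_est: float, host_flops: float) -> float:
    """Simplified runtime estimate: estimated operations divided by processing rate."""
    return fpops_est / host_flops


def tasks_to_fill_cache(cache_seconds: float, n_slots: int,
                        fpops_est: float, host_flops: float) -> int:
    """Number of whole tasks needed to keep n_slots busy for cache_seconds."""
    estimate = estimated_runtime_seconds(fpops_est, host_flops)
    return int(cache_seconds * n_slots // estimate)


# Hypothetical task size, chosen so that a 400 GFLOPS rate yields a 20-minute estimate.
FPOPS_EST = 4.8e14

# Assumed one-day cache and four concurrent tasks.
inflated = tasks_to_fill_cache(86400, 4, FPOPS_EST, 400e9)  # rate after the short tasks
normal = tasks_to_fill_cache(86400, 4, FPOPS_EST, 140e9)    # rate reported later
```

With these illustrative numbers the inflated rate asks for 288 tasks where the later rate would ask for 100, so a processing rate raised by a few abnormally short tasks makes the client fetch almost three times as much work as it can finish.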
Send message Joined: 2 May 07 Posts: 2090 Credit: 158,777,751 RAC: 128,475 |
ATLAS tasks with two CPUs show only this on F2: "Event processing information will appear here." I have changed the memory size in app_config from 5 GByte to 7 GByte, because top on F3 shows memory use between 5.3 and 5.7 GByte and no swap. |
Send message Joined: 18 Dec 15 Posts: 1687 Credit: 102,987,306 RAC: 125,849 |
Atlas-Tasks with two CPU's show only with F2: The same thing happens with my tasks. No idea why this still (or again) does not work properly. Have app_config memorysize changed from 5 GByte to 7 GByte, Recently I noticed increased memory usage for the four 2-core ATLAS tasks that run concurrently on one of my systems. In my app_config, 7.3 GB is set, and most of it seems to be used. |
Send message Joined: 15 Jun 08 Posts: 2401 Credit: 225,401,087 RAC: 123,599 |
maeax wrote: Atlas-Tasks with two CPU's show only with F2: Same here on a 2-core setup. maeax wrote: Have app_config memorysize changed from 5 GByte to 7 GByte, You may compare the TOP values for the OS cache. If WUs fail before all athena.pys (corresponding to configured cores) are launched, add more RAM. If the WU starts without an error a higher RAM value for the VM will only add more OS cache. |
Send message Joined: 2 May 07 Posts: 2090 Credit: 158,777,751 RAC: 128,475 |
If the WU starts without an error a higher RAM value for the VM will only add more OS cache. Maybe a reduction to 6 GByte is possible. That would be more efficient for the whole system. The important thing is to find not the minimum memory usage, but the optimum! |
Send message Joined: 13 May 14 Posts: 387 Credit: 15,314,184 RAC: 0 |
maeax wrote: Atlas-Tasks with two CPU's show only with F2: Sorry, I think I broke the event information when I added the top output. It should be fixed now, though it may take a few hours to propagate to the WUs. |
©2024 CERN