Message boards : ATLAS application : ATLAS: Post never ending tasks here

Michael H.W. Weber

Joined: 18 Sep 04
Posts: 30
Credit: 5,100,929
RAC: 0
Message 30811 - Posted: 17 Jun 2017, 15:21:42 UTC
Last modified: 17 Jun 2017, 15:22:10 UTC

... to continue a thread which I initiated over at ATLAS@home a long time back to classify problems with WUs for better analysis.

This is actually the first of these types of tasks which I encountered after the merging of ATLAS@home and LHC@home:

https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=71030232

Michael.
ID: 30811
maeax

Joined: 2 May 07
Posts: 2071
Credit: 156,137,814
RAC: 105,274
Message 30812 - Posted: 17 Jun 2017, 15:39:04 UTC

You have 3 CPUs and 5,000 MB of RAM.

For me, 2 CPUs with 5,000 MB and 95% processor use have worked for a long time, but that is on Windows, not Linux.
Do you have an app_config?
ID: 30812
Michael H.W. Weber

Joined: 18 Sep 04
Posts: 30
Credit: 5,100,929
RAC: 0
Message 30852 - Posted: 19 Jun 2017, 10:14:24 UTC
Last modified: 19 Jun 2017, 10:21:42 UTC

This machine completed a number of ATLAS tasks successfully, each requiring around 5000 to 6000 CPU seconds. I now have another task that has been running for almost 24 hrs (!) with progress at 99.999% and remaining time 00:00:00, so it is the next one that will never end (and hence is faulty):

https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=71412840

Unlike the other one which I posted above, this one does NOT show an error message in the console view. Everything looks normal but that task appears dead.
With this task, however, I can't exclude a network issue, because I realized that the CERN server was down yesterday and some CERN WUs exchange data with the server during run time. So maybe the task stalled because of that and therefore does not show any error message or event progress in the VirtualBox console view.
I just aborted it.
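
The rule of thumb I used to call this task dead can be written down as a small check. This is only a sketch: the function name `looks_stuck` and the thresholds (99.9% progress, ten times the usual runtime) are assumptions chosen for illustration, not anything BOINC itself provides, and the numbers are typed in by hand from the task described above. Note that the 6000 s reference is CPU time, so a healthy 3-CPU task likely needs less wall-clock time than that, which only makes the check more conservative.

```python
def looks_stuck(progress_pct, remaining_s, elapsed_s, typical_s,
                min_progress=99.9, factor=10.0):
    """Heuristic: flag a task sitting near 100% far beyond its usual runtime.

    min_progress and factor are illustrative assumptions, not BOINC defaults.
    """
    near_done = progress_pct >= min_progress and remaining_s <= 0
    overdue = elapsed_s > factor * typical_s
    return near_done and overdue


# Values from the task above: 99.999% progress, 00:00:00 remaining,
# almost 24 h elapsed, compared with about 6000 s for a normal task.
print(looks_stuck(99.999, 0, 24 * 3600, 6000))
```

A task that is merely slow (progress still moving, remaining time above zero) is not flagged, so the check only catches the "99.999% forever" pattern.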

Michael.

P.S. Yes, I use an .xml file to set LHC to run 3-CPU ATLAS tasks with a 4.9 GB RAM allowance. Only two tasks are allowed to be downloaded and only one of them can run at a time.
ID: 30852
HerveUAE

Joined: 18 Dec 16
Posts: 123
Credit: 37,495,365
RAC: 0
Message 30968 - Posted: 23 Jun 2017, 8:40:55 UTC

I had one of those recently as well: https://lhcathome.cern.ch/lhcathome/result.php?resultid=147790645

The WU was configured to run with 2 cores and 7000 MB of RAM. It had not finished after 1 day and 9 hours, so I aborted it. The WU console was responding to the login prompt, and the task was using only 1 core, as can be seen from the CPU time of the WU.
I tried to stop the VM and start it again, because in some past cases this solved the issue. But in this case it did not help.
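
The "only 1 core" conclusion is just CPU time divided by run time. A minimal sketch of that arithmetic follows; the function name is made up, and the CPU time used in the example is a hypothetical placeholder, since the real figure is on the result page linked above.

```python
def average_cores_used(cpu_time_s, elapsed_s):
    """Average number of cores kept busy = CPU time / wall-clock time."""
    if elapsed_s <= 0:
        raise ValueError("elapsed time must be positive")
    return cpu_time_s / elapsed_s


elapsed = (24 + 9) * 3600   # 1 day and 9 hours, in seconds
cpu_time = 118_000          # hypothetical CPU time, for illustration only
print(round(average_cores_used(cpu_time, elapsed), 2))
```

A value close to 1 on a task configured for 2 cores suggests that one core was idle for most of the run.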
We are the product of random evolution.
ID: 30968


