ATLAS: Post never ending tasks here
Author | Message |
---|---|
Joined: 18 Sep 04 Posts: 30 Credit: 5,100,929 RAC: 0 | This continues a thread I started over at ATLAS@home a long time ago to classify problems with WUs for better analysis. This is the first task of this type that I have encountered since ATLAS@home was merged into LHC@home: https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=71030232 Michael. |
Joined: 2 May 07 Posts: 1829 Credit: 139,146,488 RAC: 70,628 | You have 3 CPUs and 5,000 MB of RAM. For me, 2 CPUs, 5,000 MB, and 95% processor use have worked for a long time, but on Windows, not Linux. Do you have an app_config? |
Joined: 18 Sep 04 Posts: 30 Credit: 5,100,929 RAC: 0 | This machine has completed a number of ATLAS tasks successfully, each requiring around 5,000 to 6,000 CPU seconds. I now have another task that has been running for almost 24 hours (!) with progress at 99.999% and remaining time at 00:00:00. It is therefore the next one that will never end (and hence is faulty): https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=71412840 Unlike the one I posted above, this one does NOT show an error message in the console view. Everything looks normal, but the task appears dead. With this task, however, I can't rule out a network issue: I noticed that the CERN server was down yesterday, and some CERN WUs exchange data with the server at run time. So the task may have stalled for that reason, which would explain why the VirtualBox console view shows neither an error message nor any event progress. I just aborted it. Michael. P.S. Yes, I use an .xml file to set LHC to run 3-CPU ATLAS tasks with a 4.9 GB RAM allowance. Only two tasks are allowed to be downloaded, and only one of them can run at a time. |
Joined: 18 Dec 16 Posts: 123 Credit: 37,495,365 RAC: 0 | I had one of those recently as well: https://lhcathome.cern.ch/lhcathome/result.php?resultid=147790645 The WU was configured to run with 2 cores and 7,000 MB of RAM. It had still not finished after 1 day and 9 hours, so I aborted it. The WU console still responded at the login prompt, and the task was running on only 1 core, as the CPU time of the WU shows. I tried stopping the VM and starting it again, because this has solved the issue in some past cases, but this time it did not help. We are the product of random evolution. |
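
The symptom reported in this thread (progress pinned near 100%, remaining time at zero, and a run time far beyond the usual 5,000 to 6,000 CPU seconds) can be written down as a rough check. The following is only a sketch under assumptions: `TaskSnapshot`, `looks_never_ending`, `effective_cores`, and all thresholds are hypothetical names and values chosen for illustration. They are not part of BOINC or LHC@home, and the numbers would have to be read manually from the BOINC Manager or the task page.

```python
from dataclasses import dataclass


@dataclass
class TaskSnapshot:
    """Hypothetical record of values read off the BOINC Manager by hand."""
    elapsed_s: float    # wall-clock seconds since the task started
    cpu_s: float        # CPU seconds accumulated so far
    progress: float     # fraction done, 0.0 to 1.0
    remaining_s: float  # remaining-time estimate in seconds


def looks_never_ending(task: TaskSnapshot,
                       typical_cpu_s: float = 6000.0,
                       overrun_factor: float = 5.0,
                       done_threshold: float = 0.999) -> bool:
    """Flag a task whose progress is pinned near 100% long after a normal run.

    All thresholds are assumptions for illustration, not project values.
    """
    pinned = task.progress >= done_threshold and task.remaining_s <= 0
    overrun = task.elapsed_s > overrun_factor * typical_cpu_s
    return pinned and overrun


def effective_cores(task: TaskSnapshot) -> float:
    """Average number of cores kept busy: CPU time divided by wall-clock time."""
    if task.elapsed_s <= 0:
        return 0.0
    return task.cpu_s / task.elapsed_s


# Illustrative values only; cpu_s is made up, not taken from the linked tasks.
stuck = TaskSnapshot(elapsed_s=24 * 3600, cpu_s=24 * 3600,
                     progress=0.99999, remaining_s=0)
print(looks_never_ending(stuck), effective_cores(stuck))
```

An effective core count near 1 on a task configured for 2 or 3 cores matches the last report above, where the CPU time suggested that only one core was in use. A check like this cannot tell a faulty WU from a stall caused by a server outage, so it only helps decide which tasks to look at.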
©2023 CERN