Message boards : Sixtrack Application : Very short Runtimes
Author | Message |
---|---|
Joined: 15 Jun 08 Posts: 2386 Credit: 222,922,687 RAC: 137,969 |
Today each of my hosts got a bunch of Sixtrack tasks. Most of them have runtimes of only a few seconds. Are they faulty or not? See: https://lhcathome.cern.ch/lhcathome/results.php?hostid=10486310 https://lhcathome.cern.ch/lhcathome/results.php?hostid=10486393 |
Joined: 29 Feb 16 Posts: 157 Credit: 2,659,975 RAC: 0 |
These are tasks where a region of highly unstable phase space is scanned. You can guess for yourself whether a task is likely to be short from its name, e.g. https://lhcathome.cern.ch/lhcathome/result.php?resultid=154383317 . The `__14_16__` means that a normalised amplitude between 14 and 16 sigma is scanned, which is very large with respect to typical figures, where we expect some chaotic motion to arise at about 6 sigma. |
Joined: 15 Jun 08 Posts: 2386 Credit: 222,922,687 RAC: 137,969 |
OK, thank you. I understand that there's nothing to worry about. If it does turn out to be a problem, you are at least aware of it now. |
Joined: 27 Sep 08 Posts: 798 Credit: 644,723,571 RAC: 234,385 |
I got a few errors, as before only on Linux. They are all: AMPLITUDES EXCEED THE MAXIMUM VALUES IN UMLAUF *** ERROR ***,PROBLEMS WRITING TO FILE 10 FROM ABEND ERROR CODE : 5001 process exited with code 101 (0x65, -155). They all have super-short runtimes, like before. I think there weren't any errors with the exact same hardware. I'm running some more on Linux for the next 24 hours, then I will switch back to Windows. |
Joined: 29 Feb 16 Posts: 157 Credit: 2,659,975 RAC: 0 |
Hello Toby, thanks for pointing this out. I suspect something weird at the level of the input, not at the level of the executable. Could you point me to some of these results? Thanks a lot in advance. |
Joined: 27 Sep 08 Posts: 798 Credit: 644,723,571 RAC: 234,385 |
These are the only two left: https://lhcathome.cern.ch/lhcathome/result.php?resultid=155240749 https://lhcathome.cern.ch/lhcathome/result.php?resultid=155210817 |
Joined: 29 Feb 16 Posts: 157 Credit: 2,659,975 RAC: 0 |
I am really puzzled. A valid result on the same WU has been produced by the same executable on another host: https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=74466109 |
Joined: 27 Sep 08 Posts: 798 Credit: 644,723,571 RAC: 234,385 |
In the past we thought it could be the hyperthreading bug, but this uses AVX, and I have tested this computer with the proposed testing method and it never showed the bug. Also, some of the top error rates came from CPUs that were not affected, e.g. i5/Xeon. This would leave Linux kernel 4.8 as a possible candidate. The failure is the same: these very short runtimes, and always on Linux. I can run on another Linux kernel to see if it pops up again. |
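
The naming hint given earlier in the thread (the `__14_16__` fragment marking a scan between 14 and 16 sigma) can be turned into a quick check. What follows is only a rough sketch, not a project tool: it assumes the amplitude range always appears as two integers in the form `__<low>_<high>__`, the example task names are made up, and the "likely short" rule (lower bound above the roughly 6 sigma onset of chaotic motion mentioned above) is a heuristic of this sketch.

```python
import re

# Assumption: the amplitude range is written as "__<low>_<high>__" with integer
# bounds, as in the "__14_16__" fragment quoted in the thread. The rest of the
# task-name format is not described here.
AMPLITUDE_RE = re.compile(r"__(\d+)_(\d+)__")

# Amplitude (in sigma) at which some chaotic motion is expected, per the thread.
CHAOS_ONSET_SIGMA = 6


def amplitude_range(task_name):
    """Return (low, high) in sigma, or None if no range is found in the name."""
    match = AMPLITUDE_RE.search(task_name)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def likely_short(task_name):
    """Heuristic: True if the scanned range lies above the chaos onset.

    Returns None when the name carries no recognisable amplitude range.
    """
    rng = amplitude_range(task_name)
    if rng is None:
        return None
    low, _high = rng
    return low > CHAOS_ONSET_SIGMA


if __name__ == "__main__":
    # Hypothetical task names, for illustration only.
    for name in ("hypothetical_task__14_16__x", "hypothetical_task__2_4__x"):
        print(name, amplitude_range(name), likely_short(name))
```

A task flagged this way is only expected to be short. It says nothing about whether a short task ended with a valid result or with an error such as the one reported above.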
©2024 CERN