-152 (0xFFFFFF68) ERR_NETOPEN
Author | Message |
---|---|
Joined: 18 Dec 15 Posts: 1821 Credit: 118,941,165 RAC: 22,029 |
Lately, some of my LHCb tasks have failed after about 7 minutes with -152 (0xFFFFFF68) ERR_NETOPEN. stderr says: [ERROR] Could not connect to Condor server on port 9618. In the past, I experienced the same problem many times with CMS tasks, which apparently also connect to the Condor server, and this connection seems to fail from time to time. Any idea why this happens? A flaw in the Condor server? |
Joined: 18 Dec 15 Posts: 1821 Credit: 118,941,165 RAC: 22,029 |
A few minutes ago, another task failed with the same error code as quoted above. Besides, two other tasks failed with error code 207 (0x000000CF) EXIT_NO_SUB_TASKS; stderr says: [ERROR] No jobs were available to run. So, obviously, there are enough tasks to download, but no jobs to be crunched by these tasks :-( An experience I have often had with CMS :-( |
Joined: 18 Dec 15 Posts: 1821 Credit: 118,941,165 RAC: 22,029 |
Meanwhile, I got several more task failures with either -152 (0xFFFFFF68) ERR_NETOPEN or 207 (0x000000CF) EXIT_NO_SUB_TASKS. What's going on at CERN? Is the system on the verge of a breakdown? |
Joined: 18 Dec 15 Posts: 1821 Credit: 118,941,165 RAC: 22,029 |
The failures cited above are becoming more and more frequent. No idea what problems persist at CERN. I will stop crunching, as I do not want to waste my processors and my electricity for nothing. Really annoying :-( |
Joined: 15 Nov 14 Posts: 602 Credit: 24,371,321 RAC: 0 |
"lately, some of my LHCb tasks failed after about 7 minutes with" I don't see it here on LHCb. https://lhcathome.cern.ch/lhcathome/results.php?hostid=10501634&offset=0&show_names=0&state=0&appid=12 And I looked at a few on CMS, and don't see it there either. https://lhcathome.cern.ch/lhcathome/results.php?hostid=10501634&offset=0&show_names=0&state=0&appid=11 Maybe it is a network problem that somehow affects that port? EDIT: There was one today on LHCb. https://lhcathome.cern.ch/lhcathome/result.php?resultid=172154298 |
Joined: 18 Dec 15 Posts: 1821 Credit: 118,941,165 RAC: 22,029 |
"EDIT: There was one today on LHCb." 2018-01-06 05:55:21 (4990): Guest Log: [ERROR] Could not connect to Condor server on port 9618 2018-01-06 05:55:21 (4990): Guest Log: [INFO] Shutting Down. 2018-01-06 05:55:21 (4990): VM Completion File Detected. 2018-01-06 05:55:21 (4990): VM Completion Message: Could not connect to Condor server on port 9618 Jim, this is exactly the type of error I have gotten so many times, also with CMS tasks. There seems to have been some kind of persistent problem with the Condor server for a long time now :-( |
Joined: 15 Nov 14 Posts: 602 Credit: 24,371,321 RAC: 0 |
"Jim, this is exactly the type of error I have gotten so many times, also with CMS tasks. There seems to be some kind of persisting problem with the Condor Server, for long time now :-(" Maybe it is related to path delay, if they use some sort of triggered port on their server (?). The timing must be critical for there to be a difference between us. |
Joined: 18 Dec 15 Posts: 1821 Credit: 118,941,165 RAC: 22,029 |
"Maybe it is related to path delay if they use some sort of triggered port on their server (?)." Hm, maybe so ... I already brought this problem to the attention of Ivan (CMS guy); he said he'll try to have someone look into it some time. It's frustrating when so many tasks fail due to some flaw in the Condor server. |
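
For anyone who wants to test the "network problem that somehow affects that port" idea from their own machine, here is a minimal sketch of a TCP reachability check for port 9618. It is only an illustration: the function name `can_connect` is made up for this example, the thread does not name the Condor server's hostname, so you would have to supply it yourself, and a successful connection from the host does not prove that the VM inside the task can reach the server too.

```python
import socket


def can_connect(host: str, port: int = 9618, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds.

    `host` is an assumption supplied by the caller; the thread only
    mentions the port (9618), not the server name.
    """
    try:
        # create_connection resolves the name and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures alike.
        return False
```

Calling `can_connect("condor.example.org")` (a placeholder hostname, not the real server) a few times over the day could show whether the failures line up with moments when the port is unreachable from your network.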
©2024 CERN