
All ATLAS and CMS tasks "aborted by project" - why so?



Message boards : ATLAS application : All ATLAS and CMS tasks "aborted by project" - why so?

Author Message
Erich56
Joined: 18 Dec 15
Posts: 304
Credit: 3,437,579
RAC: 8,426
Message 31568 - Posted: 23 Jul 2017, 18:51:32 UTC

A short time ago, on one of my PCs that was running 4 ATLAS and 4 CMS tasks, all of them were "aborted by project".
On the other two PCs, this was not the case.
None of the aborted tasks are shown in my task list on the website.

New CMS tasks were downloaded and started. ATLAS tasks show up in the BOINC Manager as downloading, but the download is extremely slow and stalls completely most of the time.
I checked my Internet connection; it works perfectly.

Is anyone else experiencing the same thing?

Crystal Pellet
Volunteer moderator
Volunteer tester
Joined: 14 Jan 10
Posts: 328
Credit: 2,772,160
RAC: 3,191
Message 31569 - Posted: 23 Jul 2017, 19:00:46 UTC - in response to Message 31568.

Is anyone else experiencing the same thing?

I had 2 ATLAS tasks running.
I checked my system because they should have finished by now.
Both were gone, and they are not in my result list on the server.

This is in the BOINC log:

23-Jul-2017 15:10:47 [LHC@home] Sending scheduler request: To fetch work.
23-Jul-2017 15:10:47 [LHC@home] Requesting new tasks for CPU
23-Jul-2017 15:10:49 [LHC@home] Scheduler request completed: got 2 new tasks
23-Jul-2017 15:10:51 [LHC@home] Started download of jf_3856ac8e00f6b80f32da8a91720b5c4a
23-Jul-2017 15:10:51 [LHC@home] Started download of hGKNDmiSTtqnSu7Ccp2YYBZmABFKDmABFKDmXNGKDmo4GKDmhpn9vn_input.tar.gz
23-Jul-2017 15:10:54 [LHC@home] Finished download of hGKNDmiSTtqnSu7Ccp2YYBZmABFKDmABFKDmXNGKDmo4GKDmhpn9vn_input.tar.gz
23-Jul-2017 15:10:54 [LHC@home] Started download of rte_hGKNDmiSTtqnSu7Ccp2YYBZmABFKDmABFKDmXNGKDmo4GKDmhpn9vn.tar.gz
23-Jul-2017 15:10:56 [LHC@home] Finished download of rte_hGKNDmiSTtqnSu7Ccp2YYBZmABFKDmABFKDmXNGKDmo4GKDmhpn9vn.tar.gz
23-Jul-2017 15:10:56 [LHC@home] Started download of boinc_job_script.F8HBvn
23-Jul-2017 15:10:57 [LHC@home] Finished download of boinc_job_script.F8HBvn
23-Jul-2017 15:10:57 [LHC@home] Started download of jf_084ee013f9063f6104c9ae4e7c86cc35
23-Jul-2017 15:11:03 [LHC@home] update requested by user
23-Jul-2017 15:11:06 [LHC@home] Sending scheduler request: Requested by user.
23-Jul-2017 15:11:06 [LHC@home] Not requesting tasks: don't need (CPU: job cache full; AMD/ATI GPU: )
23-Jul-2017 15:11:07 [LHC@home] Scheduler request completed
23-Jul-2017 15:11:15 [LHC@home] Finished download of jf_3856ac8e00f6b80f32da8a91720b5c4a
23-Jul-2017 15:11:15 [LHC@home] Started download of 0qiLDmElStqnSu7Ccp2YYBZmABFKDmABFKDmXNGKDmK3GKDmcCCmYm_input.tar.gz
23-Jul-2017 15:11:17 [LHC@home] Finished download of 0qiLDmElStqnSu7Ccp2YYBZmABFKDmABFKDmXNGKDmK3GKDmcCCmYm_input.tar.gz
23-Jul-2017 15:11:17 [LHC@home] Started download of rte_0qiLDmElStqnSu7Ccp2YYBZmABFKDmABFKDmXNGKDmK3GKDmcCCmYm.tar.gz
23-Jul-2017 15:11:17 [LHC@home] Starting task hGKNDmiSTtqnSu7Ccp2YYBZmABFKDmABFKDmXNGKDmo4GKDmhpn9vn_0
23-Jul-2017 15:11:18 [LHC@home] Finished download of rte_0qiLDmElStqnSu7Ccp2YYBZmABFKDmABFKDmXNGKDmK3GKDmcCCmYm.tar.gz
23-Jul-2017 15:11:18 [LHC@home] Started download of boinc_job_script.ElobDv
23-Jul-2017 15:11:20 [LHC@home] Finished download of boinc_job_script.ElobDv
23-Jul-2017 15:11:25 [LHC@home] Finished download of jf_084ee013f9063f6104c9ae4e7c86cc35
23-Jul-2017 15:15:08 [LHC@home] Starting task 0qiLDmElStqnSu7Ccp2YYBZmABFKDmABFKDmXNGKDmK3GKDmcCCmYm_0
23-Jul-2017 20:11:12 [LHC@home] Sending scheduler request: Requested by project.
23-Jul-2017 20:11:12 [LHC@home] Not requesting tasks: "no new tasks" requested via Manager
23-Jul-2017 20:11:15 [LHC@home] Scheduler request completed
23-Jul-2017 20:11:15 [LHC@home] Result hGKNDmiSTtqnSu7Ccp2YYBZmABFKDmABFKDmXNGKDmo4GKDmhpn9vn_0 is no longer usable
23-Jul-2017 20:11:15 [LHC@home] Result 0qiLDmElStqnSu7Ccp2YYBZmABFKDmABFKDmXNGKDmK3GKDmcCCmYm_0 is no longer usable
23-Jul-2017 20:11:25 [LHC@home] Sending scheduler request: To report completed tasks.
23-Jul-2017 20:11:25 [LHC@home] Reporting 2 completed tasks
23-Jul-2017 20:11:25 [LHC@home] Not requesting tasks: "no new tasks" requested via Manager
23-Jul-2017 20:11:27 [LHC@home] Scheduler request completed
23-Jul-2017 20:11:27 [LHC@home] garbage_collect(); still have active task for acked result hGKNDmiSTtqnSu7Ccp2YYBZmABFKDmABFKDmXNGKDmo4GKDmhpn9vn_0; state 5
23-Jul-2017 20:11:32 [LHC@home] garbage_collect(); still have active task for acked result 0qiLDmElStqnSu7Ccp2YYBZmABFKDmABFKDmXNGKDmK3GKDmcCCmYm_0; state 5
23-Jul-2017 20:11:38 [LHC@home] garbage_collect(); still have active task for acked result hGKNDmiSTtqnSu7Ccp2YYBZmABFKDmABFKDmXNGKDmo4GKDmhpn9vn_0; state 6
23-Jul-2017 20:11:43 [LHC@home] garbage_collect(); still have active task for acked result 0qiLDmElStqnSu7Ccp2YYBZmABFKDmABFKDmXNGKDmK3GKDmcCCmYm_0; state 6
23-Jul-2017 20:11:48 [LHC@home] Computation for task hGKNDmiSTtqnSu7Ccp2YYBZmABFKDmABFKDmXNGKDmo4GKDmhpn9vn_0 finished
23-Jul-2017 20:11:48 [LHC@home] Output file hGKNDmiSTtqnSu7Ccp2YYBZmABFKDmABFKDmXNGKDmo4GKDmhpn9vn_0_ATLAS_result for task hGKNDmiSTtqnSu7Ccp2YYBZmABFKDmABFKDmXNGKDmo4GKDmhpn9vn_0 absent
23-Jul-2017 20:11:48 [LHC@home] Computation for task 0qiLDmElStqnSu7Ccp2YYBZmABFKDmABFKDmXNGKDmK3GKDmcCCmYm_0 finished
23-Jul-2017 20:11:48 [LHC@home] Output file 0qiLDmElStqnSu7Ccp2YYBZmABFKDmABFKDmXNGKDmK3GKDmcCCmYm_0_ATLAS_result for task 0qiLDmElStqnSu7Ccp2YYBZmABFKDmABFKDmXNGKDmK3GKDmcCCmYm_0 absent
23-Jul-2017 20:11:48 [LHC@home] Sending scheduler request: To report completed tasks.
23-Jul-2017 20:11:48 [LHC@home] Reporting 2 completed tasks
23-Jul-2017 20:11:48 [LHC@home] Not requesting tasks: "no new tasks" requested via Manager
23-Jul-2017 20:11:50 [LHC@home] Scheduler request completed
23-Jul-2017 20:11:50 [LHC@home] Got ack for task hGKNDmiSTtqnSu7Ccp2YYBZmABFKDmABFKDmXNGKDmo4GKDmhpn9vn_0, but can't find it
23-Jul-2017 20:11:50 [LHC@home] Got ack for task 0qiLDmElStqnSu7Ccp2YYBZmABFKDmABFKDmXNGKDmK3GKDmcCCmYm_0, but can't find it
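
For anyone who wants to check their own log for the same pattern, here is a minimal sketch in Python. It only scans for the two message texts that appear in the logs in this thread ("is no longer usable" and "Got ack for task ..., but can't find it") and lists the task names they mention. The function name `find_aborted_tasks` and the sample lines are hypothetical, and treating these two messages as the sign of a task dropped by the server is an assumption drawn from this thread, not from BOINC documentation.

```python
import re

# Message texts copied from the logs in this thread. The timestamp and
# project prefix differ between client versions, so only the message
# part of each line is matched.
NO_LONGER_USABLE = re.compile(r"Result (\S+) is no longer usable")
ACK_NOT_FOUND = re.compile(r"Got ack for task (\S+), but can't find it")


def find_aborted_tasks(log_lines):
    """Return the sorted, de-duplicated task names flagged by either message.

    Assumption: both messages point at a task the server no longer wants.
    """
    tasks = set()
    for line in log_lines:
        for pattern in (NO_LONGER_USABLE, ACK_NOT_FOUND):
            match = pattern.search(line)
            if match:
                tasks.add(match.group(1))
    return sorted(tasks)


if __name__ == "__main__":
    # Hypothetical sample lines in the two log formats seen in this thread.
    sample = [
        "23-Jul-2017 20:11:15 [LHC@home] Result exampleTaskA_0 is no longer usable",
        "23/07/2017 20:16:51 | LHC@home | [error] Got ack for task exampleTaskB_0, but can't find it",
        "23-Jul-2017 20:11:15 [LHC@home] Scheduler request completed",
    ]
    for name in find_aborted_tasks(sample):
        print(name)
```

To use it on a real log, pass the lines of the saved event log (for example from `open(path)`) instead of the sample list.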

Erich56
Joined: 18 Dec 15
Posts: 304
Credit: 3,437,579
RAC: 8,426
Message 31576 - Posted: 23 Jul 2017, 19:53:58 UTC

This was shown in my BOINC log for the 8 aborted tasks:

23/07/2017 20:16:51 | LHC@home | [error] Got ack for task CMS_20797_1500746391.189425_0, but can't find it
23/07/2017 20:16:51 | LHC@home | [error] Got ack for task CMS_20276_1500746091.040548_0, but can't find it
23/07/2017 20:16:51 | LHC@home | [error] Got ack for task CMS_17026_1500712758.273755_0, but can't find it
23/07/2017 20:16:51 | LHC@home | [error] Got ack for task CMS_3419_1500753597.693533_0, but can't find it
23/07/2017 20:16:51 | LHC@home | [error] Got ack for task aMQNDm2FOtqnSu7Ccp2YYBZmABFKDmABFKDmXNGKDmwtGKDmsXBe3n_0, but can't find it
23/07/2017 20:16:51 | LHC@home | [error] Got ack for task NsiNDmtXStqnSu7Ccp2YYBZmABFKDmABFKDmXNGKDmU2GKDm0MrXKo_0, but can't find it
23/07/2017 20:16:51 | LHC@home | [error] Got ack for task 60lKDmm6TtqnSu7Ccp2YYBZmABFKDmABFKDmXNGKDms5GKDm0zNFSm_0, but can't find it
23/07/2017 20:16:51 | LHC@home | [error] Got ack for task GleMDm0hTtqnSu7Ccp2YYBZmABFKDmABFKDmXNGKDmN5GKDmNstWLn_0, but can't find it

Toby Broom
Volunteer moderator
Joined: 27 Sep 08
Posts: 358
Credit: 78,286,980
RAC: 112,343
Message 31584 - Posted: 23 Jul 2017, 22:02:02 UTC

In addition, I'm unable to get new ATLAS tasks.

Harri Liljeroos
Joined: 28 Sep 04
Posts: 189
Credit: 6,003,668
RAC: 4,414
Message 31587 - Posted: 23 Jul 2017, 22:29:45 UTC - in response to Message 31584.

In addition, I'm unable to get new ATLAS tasks.

Same here: one task is downloading, but it just gives a transient HTTP error and nothing comes down (0 bytes downloaded after a couple of retries). Here is the task: https://lhcathome.cern.ch/lhcathome/result.php?resultid=152060009

The file it is trying to download is jf_e88cd5647976b8bfce8af462a99d37c5.

Jim1348
Joined: 15 Nov 14
Posts: 71
Credit: 3,033,536
RAC: 10,549
Message 31589 - Posted: 24 Jul 2017, 0:19:13 UTC
Last modified: 24 Jul 2017, 0:20:12 UTC

I have had three ATLAS tasks stuck in download ("project backoff") for a little over an hour, but none have been aborted so far. Maybe they found the problem and are holding off on sending new ones?

Jim1348
Joined: 15 Nov 14
Posts: 71
Credit: 3,033,536
RAC: 10,549
Message 31590 - Posted: 24 Jul 2017, 7:27:07 UTC - in response to Message 31589.

The ATLAS tasks have now all downloaded, uploaded, and reported, and I see no further problems at the moment.

Harri Liljeroos
Joined: 28 Sep 04
Posts: 189
Credit: 6,003,668
RAC: 4,414
Message 31591 - Posted: 24 Jul 2017, 7:56:38 UTC
Last modified: 24 Jul 2017, 8:06:44 UTC

Mine also downloaded during the night.

I also had one ATLAS task stuck on my other host, which finished downloading after I retried it this morning. This host also had one ATLAS task aborted by the server (it shows status "aborted, 202" in BoincTasks). This resulted in a VBox crash. It seems to run normally now.
[edit:] The aborted task is not on the server pages.

Erich56
Joined: 18 Dec 15
Posts: 304
Credit: 3,437,579
RAC: 8,426
Message 31592 - Posted: 24 Jul 2017, 10:36:52 UTC - in response to Message 31591.

This resulted in a VBox crash...

Same here: for all the tasks that were "aborted by project", VBox crashed.

Since late last night, everything seems to be back to normal.

Does anyone have any idea what caused the disturbance?

ivan
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 29 Aug 05
Posts: 261
Credit: 2,099,759
RAC: 5,306
Message 31593 - Posted: 24 Jul 2017, 11:02:46 UTC - in response to Message 31592.

This resulted in a VBox crash...

Same here: for all the tasks that were "aborted by project", VBox crashed.

Since late last night, everything seems to be back to normal.

Does anyone have any idea what caused the disturbance?

No, but allow me to share this monitor page for your speculation. (Time axis is BST = UTC+1.)