All ATLAS and CMS tasks "aborted by project" - why so?


Message boards : ATLAS application : All ATLAS and CMS tasks "aborted by project" - why so?

Erich56
Joined: 18 Dec 15
Posts: 383
Credit: 3,873,774
RAC: 7,567
Message 31568 - Posted: 23 Jul 2017, 18:51:32 UTC

A short time ago, on one of my PCs that was running 4 ATLAS and 4 CMS tasks, all of them were "aborted by project".
On the other two PCs, this was not the case.
None of the aborted tasks are shown in my task list on the website.

New CMS tasks were downloaded and started. ATLAS tasks show up in the BOINC Manager as downloading, but the download is extremely slow and stalls completely most of the time.
I checked my Internet connection, and it works perfectly.

Is anyone else having the same experience?

Crystal Pellet
Volunteer moderator
Volunteer tester
Joined: 14 Jan 10
Posts: 384
Credit: 2,997,809
RAC: 2,011
Message 31569 - Posted: 23 Jul 2017, 19:00:46 UTC - in response to Message 31568.

Is anyone else having the same experience?

I had 2 ATLAS tasks running.
I looked at my system because they should have finished by now.
Those 2 were gone, but they are not in my result list on the server.

This is in the BOINC log:

23-Jul-2017 15:10:47 [LHC@home] Sending scheduler request: To fetch work.
23-Jul-2017 15:10:47 [LHC@home] Requesting new tasks for CPU
23-Jul-2017 15:10:49 [LHC@home] Scheduler request completed: got 2 new tasks
23-Jul-2017 15:10:51 [LHC@home] Started download of jf_3856ac8e00f6b80f32da8a91720b5c4a
23-Jul-2017 15:10:51 [LHC@home] Started download of hGKNDmiSTtqnSu7Ccp2YYBZmABFKDmABFKDmXNGKDmo4GKDmhpn9vn_input.tar.gz
23-Jul-2017 15:10:54 [LHC@home] Finished download of hGKNDmiSTtqnSu7Ccp2YYBZmABFKDmABFKDmXNGKDmo4GKDmhpn9vn_input.tar.gz
23-Jul-2017 15:10:54 [LHC@home] Started download of rte_hGKNDmiSTtqnSu7Ccp2YYBZmABFKDmABFKDmXNGKDmo4GKDmhpn9vn.tar.gz
23-Jul-2017 15:10:56 [LHC@home] Finished download of rte_hGKNDmiSTtqnSu7Ccp2YYBZmABFKDmABFKDmXNGKDmo4GKDmhpn9vn.tar.gz
23-Jul-2017 15:10:56 [LHC@home] Started download of boinc_job_script.F8HBvn
23-Jul-2017 15:10:57 [LHC@home] Finished download of boinc_job_script.F8HBvn
23-Jul-2017 15:10:57 [LHC@home] Started download of jf_084ee013f9063f6104c9ae4e7c86cc35
23-Jul-2017 15:11:03 [LHC@home] update requested by user
23-Jul-2017 15:11:06 [LHC@home] Sending scheduler request: Requested by user.
23-Jul-2017 15:11:06 [LHC@home] Not requesting tasks: don't need (CPU: job cache full; AMD/ATI GPU: )
23-Jul-2017 15:11:07 [LHC@home] Scheduler request completed
23-Jul-2017 15:11:15 [LHC@home] Finished download of jf_3856ac8e00f6b80f32da8a91720b5c4a
23-Jul-2017 15:11:15 [LHC@home] Started download of 0qiLDmElStqnSu7Ccp2YYBZmABFKDmABFKDmXNGKDmK3GKDmcCCmYm_input.tar.gz
23-Jul-2017 15:11:17 [LHC@home] Finished download of 0qiLDmElStqnSu7Ccp2YYBZmABFKDmABFKDmXNGKDmK3GKDmcCCmYm_input.tar.gz
23-Jul-2017 15:11:17 [LHC@home] Started download of rte_0qiLDmElStqnSu7Ccp2YYBZmABFKDmABFKDmXNGKDmK3GKDmcCCmYm.tar.gz
23-Jul-2017 15:11:17 [LHC@home] Starting task hGKNDmiSTtqnSu7Ccp2YYBZmABFKDmABFKDmXNGKDmo4GKDmhpn9vn_0
23-Jul-2017 15:11:18 [LHC@home] Finished download of rte_0qiLDmElStqnSu7Ccp2YYBZmABFKDmABFKDmXNGKDmK3GKDmcCCmYm.tar.gz
23-Jul-2017 15:11:18 [LHC@home] Started download of boinc_job_script.ElobDv
23-Jul-2017 15:11:20 [LHC@home] Finished download of boinc_job_script.ElobDv
23-Jul-2017 15:11:25 [LHC@home] Finished download of jf_084ee013f9063f6104c9ae4e7c86cc35
23-Jul-2017 15:15:08 [LHC@home] Starting task 0qiLDmElStqnSu7Ccp2YYBZmABFKDmABFKDmXNGKDmK3GKDmcCCmYm_0
23-Jul-2017 20:11:12 [LHC@home] Sending scheduler request: Requested by project.
23-Jul-2017 20:11:12 [LHC@home] Not requesting tasks: "no new tasks" requested via Manager
23-Jul-2017 20:11:15 [LHC@home] Scheduler request completed
23-Jul-2017 20:11:15 [LHC@home] Result hGKNDmiSTtqnSu7Ccp2YYBZmABFKDmABFKDmXNGKDmo4GKDmhpn9vn_0 is no longer usable
23-Jul-2017 20:11:15 [LHC@home] Result 0qiLDmElStqnSu7Ccp2YYBZmABFKDmABFKDmXNGKDmK3GKDmcCCmYm_0 is no longer usable
23-Jul-2017 20:11:25 [LHC@home] Sending scheduler request: To report completed tasks.
23-Jul-2017 20:11:25 [LHC@home] Reporting 2 completed tasks
23-Jul-2017 20:11:25 [LHC@home] Not requesting tasks: "no new tasks" requested via Manager
23-Jul-2017 20:11:27 [LHC@home] Scheduler request completed
23-Jul-2017 20:11:27 [LHC@home] garbage_collect(); still have active task for acked result hGKNDmiSTtqnSu7Ccp2YYBZmABFKDmABFKDmXNGKDmo4GKDmhpn9vn_0; state 5
23-Jul-2017 20:11:32 [LHC@home] garbage_collect(); still have active task for acked result 0qiLDmElStqnSu7Ccp2YYBZmABFKDmABFKDmXNGKDmK3GKDmcCCmYm_0; state 5
23-Jul-2017 20:11:38 [LHC@home] garbage_collect(); still have active task for acked result hGKNDmiSTtqnSu7Ccp2YYBZmABFKDmABFKDmXNGKDmo4GKDmhpn9vn_0; state 6
23-Jul-2017 20:11:43 [LHC@home] garbage_collect(); still have active task for acked result 0qiLDmElStqnSu7Ccp2YYBZmABFKDmABFKDmXNGKDmK3GKDmcCCmYm_0; state 6
23-Jul-2017 20:11:48 [LHC@home] Computation for task hGKNDmiSTtqnSu7Ccp2YYBZmABFKDmABFKDmXNGKDmo4GKDmhpn9vn_0 finished
23-Jul-2017 20:11:48 [LHC@home] Output file hGKNDmiSTtqnSu7Ccp2YYBZmABFKDmABFKDmXNGKDmo4GKDmhpn9vn_0_ATLAS_result for task hGKNDmiSTtqnSu7Ccp2YYBZmABFKDmABFKDmXNGKDmo4GKDmhpn9vn_0 absent
23-Jul-2017 20:11:48 [LHC@home] Computation for task 0qiLDmElStqnSu7Ccp2YYBZmABFKDmABFKDmXNGKDmK3GKDmcCCmYm_0 finished
23-Jul-2017 20:11:48 [LHC@home] Output file 0qiLDmElStqnSu7Ccp2YYBZmABFKDmABFKDmXNGKDmK3GKDmcCCmYm_0_ATLAS_result for task 0qiLDmElStqnSu7Ccp2YYBZmABFKDmABFKDmXNGKDmK3GKDmcCCmYm_0 absent
23-Jul-2017 20:11:48 [LHC@home] Sending scheduler request: To report completed tasks.
23-Jul-2017 20:11:48 [LHC@home] Reporting 2 completed tasks
23-Jul-2017 20:11:48 [LHC@home] Not requesting tasks: "no new tasks" requested via Manager
23-Jul-2017 20:11:50 [LHC@home] Scheduler request completed
23-Jul-2017 20:11:50 [LHC@home] Got ack for task hGKNDmiSTtqnSu7Ccp2YYBZmABFKDmABFKDmXNGKDmo4GKDmhpn9vn_0, but can't find it
23-Jul-2017 20:11:50 [LHC@home] Got ack for task 0qiLDmElStqnSu7Ccp2YYBZmABFKDmABFKDmXNGKDmK3GKDmcCCmYm_0, but can't find it

Erich56
Joined: 18 Dec 15
Posts: 383
Credit: 3,873,774
RAC: 7,567
Message 31576 - Posted: 23 Jul 2017, 19:53:58 UTC

This was shown in my BOINC log for the 8 aborted tasks:

23/07/2017 20:16:51 | LHC@home | [error] Got ack for task CMS_20797_1500746391.189425_0, but can't find it
23/07/2017 20:16:51 | LHC@home | [error] Got ack for task CMS_20276_1500746091.040548_0, but can't find it
23/07/2017 20:16:51 | LHC@home | [error] Got ack for task CMS_17026_1500712758.273755_0, but can't find it
23/07/2017 20:16:51 | LHC@home | [error] Got ack for task CMS_3419_1500753597.693533_0, but can't find it
23/07/2017 20:16:51 | LHC@home | [error] Got ack for task aMQNDm2FOtqnSu7Ccp2YYBZmABFKDmABFKDmXNGKDmwtGKDmsXBe3n_0, but can't find it
23/07/2017 20:16:51 | LHC@home | [error] Got ack for task NsiNDmtXStqnSu7Ccp2YYBZmABFKDmABFKDmXNGKDmU2GKDm0MrXKo_0, but can't find it
23/07/2017 20:16:51 | LHC@home | [error] Got ack for task 60lKDmm6TtqnSu7Ccp2YYBZmABFKDmABFKDmXNGKDms5GKDm0zNFSm_0, but can't find it
23/07/2017 20:16:51 | LHC@home | [error] Got ack for task GleMDm0hTtqnSu7Ccp2YYBZmABFKDmABFKDmXNGKDmN5GKDmNstWLn_0, but can't find it
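To check that lines like these account for every task that disappeared, the "Got ack ... but can't find it" messages can be counted per application. This is a sketch under two assumptions: task names beginning with `CMS_` are CMS tasks, and every other name in this excerpt is an ATLAS task (which matches the 4 + 4 tasks mentioned in the first post). The helper `count_lost_acks` is hypothetical.

```python
import re
from collections import Counter

# Message text as it appears in the log excerpt above.
ACK_RE = re.compile(r"Got ack for task (\S+), but can't find it")


def count_lost_acks(lines):
    """Count 'Got ack ... but can't find it' messages per application."""
    counts = Counter()
    for line in lines:
        match = ACK_RE.search(line)
        if not match:
            continue
        # Assumption: anything not prefixed with CMS_ is an ATLAS task.
        app = "CMS" if match.group(1).startswith("CMS_") else "ATLAS"
        counts[app] += 1
    return counts


PREFIX = "23/07/2017 20:16:51 | LHC@home | [error] Got ack for task "
SUFFIX = ", but can't find it"
task_names = [
    "CMS_20797_1500746391.189425_0",
    "CMS_20276_1500746091.040548_0",
    "CMS_17026_1500712758.273755_0",
    "CMS_3419_1500753597.693533_0",
    "aMQNDm2FOtqnSu7Ccp2YYBZmABFKDmABFKDmXNGKDmwtGKDmsXBe3n_0",
    "NsiNDmtXStqnSu7Ccp2YYBZmABFKDmABFKDmXNGKDmU2GKDm0MrXKo_0",
    "60lKDmm6TtqnSu7Ccp2YYBZmABFKDmABFKDmXNGKDms5GKDm0zNFSm_0",
    "GleMDm0hTtqnSu7Ccp2YYBZmABFKDmABFKDmXNGKDmN5GKDmNstWLn_0",
]
sample_lines = [PREFIX + name + SUFFIX for name in task_names]

lost = count_lost_acks(sample_lines)
print(dict(lost))
```

With the eight lines above, the count is four CMS and four ATLAS tasks, the same split as the tasks reported aborted on that PC.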

Toby Broom
Volunteer moderator
Joined: 27 Sep 08
Posts: 376
Credit: 88,663,408
RAC: 174,189
Message 31584 - Posted: 23 Jul 2017, 22:02:02 UTC

In addition, I'm unable to get new ATLAS tasks.

Harri Liljeroos
Joined: 28 Sep 04
Posts: 205
Credit: 6,174,207
RAC: 2,707
Message 31587 - Posted: 23 Jul 2017, 22:29:45 UTC - in response to Message 31584.

In addition, I'm unable to get new ATLAS tasks.

Same here: one task is being downloaded, but it just gives a transient HTTP error and nothing comes down (0 bytes downloaded after a couple of retries). Here is the task: https://lhcathome.cern.ch/lhcathome/result.php?resultid=152060009

The file it is trying to download is jf_e88cd5647976b8bfce8af462a99d37c5.

Jim1348
Joined: 15 Nov 14
Posts: 86
Credit: 3,721,688
RAC: 14,000
Message 31589 - Posted: 24 Jul 2017, 0:19:13 UTC
Last modified: 24 Jul 2017, 0:20:12 UTC

I have had three ATLAS tasks stuck in download ("project backoff") for a little over an hour, but none have been aborted so far. Maybe they found the problem and are holding off sending new ones?

Jim1348
Joined: 15 Nov 14
Posts: 86
Credit: 3,721,688
RAC: 14,000
Message 31590 - Posted: 24 Jul 2017, 7:27:07 UTC - in response to Message 31589.

The ATLAS tasks have now all downloaded/uploaded and reported, and I see no further problems at the moment.

Harri Liljeroos
Joined: 28 Sep 04
Posts: 205
Credit: 6,174,207
RAC: 2,707
Message 31591 - Posted: 24 Jul 2017, 7:56:38 UTC
Last modified: 24 Jul 2017, 8:06:44 UTC

Mine also downloaded during the night.

I also had one ATLAS task stuck on my other host, which finished downloading after I retried it this morning. This host also had one ATLAS task aborted by the server (it shows status "aborted, 202" in BoincTasks). This resulted in a VBox crash. It seems to run normally now.
[edit:] The aborted task is not on the server pages.

Erich56
Joined: 18 Dec 15
Posts: 383
Credit: 3,873,774
RAC: 7,567
Message 31592 - Posted: 24 Jul 2017, 10:36:52 UTC - in response to Message 31591.

This resulted in a VBox crash...

Same here: for all the tasks that were "aborted by project", VBox crashed.

Since late last night, everything seems to be back to normal.

Does anyone have any idea what the reason for the disturbance was?

ivan
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 29 Aug 05
Posts: 305
Credit: 2,415,598
RAC: 5,149
Message 31593 - Posted: 24 Jul 2017, 11:02:46 UTC - in response to Message 31592.

This resulted in a VBox crash...

Same here: for all the tasks that were "aborted by project", VBox crashed.

Since late last night, everything seems to be back to normal.

Does anyone have any idea what the reason for the disturbance was?

No, but allow me to share this monitor page for your speculation. (Time axis is BST = UTC+1.)

djoser
Joined: 30 Aug 14
Posts: 20
Credit: 1,872,063
RAC: 783
Message 32565 - Posted: 29 Sep 2017, 16:11:29 UTC

Hello,

I recently had several tasks which exited with
202 (0x000000CA) EXIT_ABORTED_BY_PROJECT

For example, task # 158294001

What could be the reason?

Regards, djoser.
____________
Why mine when you can research? - GRIDCOIN - Real cryptocurrency without wasting hashes! www.gridcoin.us
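The exit status quoted in this post shows the same value twice, once in decimal and once as zero-padded hexadecimal. A small sketch of that relationship follows; the helper `format_exit_status` is hypothetical and only reproduces the layout seen in the post.

```python
def format_exit_status(code, name):
    """Render an exit status as 'decimal (0xHEX) NAME', padded to 8 hex digits."""
    return f"{code} (0x{code:08X}) {name}"


status = format_exit_status(202, "EXIT_ABORTED_BY_PROJECT")
print(status)

# Going the other way: the hexadecimal form parses back to the decimal code.
decoded = int("0x000000CA", 16)
print(decoded)
```

This makes it easier to match the "aborted, 202" status shown by monitoring tools with the hexadecimal code on the task page.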

Crystal Pellet
Volunteer moderator
Volunteer tester
Joined: 14 Jan 10
Posts: 384
Credit: 2,997,809
RAC: 2,011
Message 32566 - Posted: 29 Sep 2017, 16:26:18 UTC - in response to Message 32565.

At 10:42:36 UTC today, a second replica of your task was sent to a Linux machine to run it as a test of the ATLAS native application.
The result from that machine was returned at 11:28:45 UTC. Since your task had not been started yet, it was cancelled by the server because the result was no longer needed.

https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=76005038
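The timing in this explanation can be worked through in a few lines. The sketch below computes the turnaround of the second replica from the two timestamps given above and models the described rule (an unstarted copy is cancelled once another result has come back). The function `should_cancel_unstarted` is a hypothetical illustration of that description, not BOINC server code.

```python
from datetime import datetime, timedelta

TIME_FORMAT = "%H:%M:%S"

# Timestamps quoted in the post (both UTC, same day).
sent = datetime.strptime("10:42:36", TIME_FORMAT)
returned = datetime.strptime("11:28:45", TIME_FORMAT)
turnaround = returned - sent


def should_cancel_unstarted(task_started, other_result_returned):
    """Illustrative rule: cancel a copy only if it has not started
    and a result for the same workunit has already been returned."""
    return other_result_returned and not task_started


print(f"Second replica turnaround: {turnaround}")
print(should_cancel_unstarted(task_started=False, other_result_returned=True))
```

The Linux host returned its result 46 minutes and 9 seconds after receiving the task, so any copy still waiting in a queue at that point was redundant.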

djoser
Joined: 30 Aug 14
Posts: 20
Credit: 1,872,063
RAC: 783
Message 32567 - Posted: 29 Sep 2017, 16:42:40 UTC - in response to Message 32566.
Last modified: 29 Sep 2017, 16:44:59 UTC

Many thanks for that perfectly good explanation!
As long as no resources are wasted, I'm absolutely okay with that.

MechaToaster
Joined: 17 Aug 17
Posts: 10
Credit: 87,993
RAC: 70
Message 32594 - Posted: 3 Oct 2017, 6:33:18 UTC

This has started to happen to all of my ATLAS tasks (not CMS, though) lately, for maybe just a few days. They abort in about a day, maybe less. One of them was also aborted while it was in progress. I see that they have been completed by other machines, as Crystal Pellet pointed out.
Is this normal behaviour then? Nothing wrong on my end?

Crystal Pellet
Volunteer moderator
Volunteer tester
Joined: 14 Jan 10
Posts: 384
Credit: 2,997,809
RAC: 2,011
Message 32596 - Posted: 3 Oct 2017, 7:16:55 UTC - in response to Message 32594.

...
Is this normal behaviour then? Nothing wrong on my end?

Cancelling a task already in progress is not normal BOINC behaviour, except when done by the project administrator for reasons like badly created batches/workunits.
This is not the case here. If tasks that are truly already running are being aborted by the project, this would mean bad project configuration.

Nils Høimyr
Volunteer moderator
Project administrator
Project developer
Project tester
Joined: 15 Jul 05
Posts: 114
Credit: 1,204,096
RAC: 7,040
Message 32597 - Posted: 3 Oct 2017, 7:26:14 UTC - in response to Message 32596.

AFAIK there should be no changes regarding ATLAS over the last few days. My PC has been happily crunching ATLAS tasks and I have a couple in progress.

(We're phasing out the legacy Sixtrack server, as announced under News, but this should not affect ATLAS tasks.)

Crystal Pellet
Volunteer moderator
Volunteer tester
Joined: 14 Jan 10
Posts: 384
Credit: 2,997,809
RAC: 2,011
Message 32599 - Posted: 3 Oct 2017, 7:30:17 UTC - in response to Message 32597.

Hi Nils,

What seems a bit strange to me is that native tasks (copies) are sent out for VBox tasks that are still in progress. Is this on purpose?

MechaToaster
Joined: 17 Aug 17
Posts: 10
Credit: 87,993
RAC: 70
Message 32605 - Posted: 3 Oct 2017, 8:39:34 UTC - in response to Message 32596.
Last modified: 3 Oct 2017, 8:40:17 UTC

...
Is this normal behaviour then? Nothing wrong on my end?

Cancelling a task already in progress is not normal BOINC behaviour, except when done by the project administrator for reasons like badly created batches/workunits.
This is not the case here. If tasks that are truly already running are being aborted by the project, this would mean bad project configuration.


So far only one ATLAS task in progress has been aborted by the project, but it was the only ATLAS task I've run since this began. I've always been in the middle of other tasks when receiving ATLAS jobs, and by the time I finish the other tasks, the ATLAS tasks get aborted by the project before I can start them.
What do you mean by "bad project configuration"? What should I be looking to fix/change, if I'm understanding you correctly?

Crystal Pellet
Volunteer moderator
Volunteer tester
Joined: 14 Jan 10
Posts: 384
Credit: 2,997,809
RAC: 2,011
Message 32610 - Posted: 3 Oct 2017, 10:09:05 UTC - in response to Message 32605.

... What do you mean by "bad project configuration"? What should I be looking to fix/change, if I'm understanding you correctly?

Project configuration is done by the admins, not by clients/users, so you can't do anything about it.

MechaToaster
Joined: 17 Aug 17
Posts: 10
Credit: 87,993
RAC: 70
Message 32616 - Posted: 3 Oct 2017, 20:08:18 UTC - in response to Message 32610.

... What do you mean by "bad project configuration"? What should I be looking to fix/change, if I'm understanding you correctly?

Project configuration is done by the admins, not by clients/users, so you can't do anything about it.

I guess I misunderstood your earlier posts. There is nothing wrong on my end then? You had said "This is not the case here" in response to my first post, so I was a bit confused.
