Message boards : ATLAS application : Download failures

Erich56
Joined: 18 Dec 15
Posts: 1757
Credit: 115,849,737
RAC: 84,836
Message 50228 - Posted: 23 May 2024, 13:14:12 UTC
Last modified: 23 May 2024, 13:18:04 UTC

For a few days now, when looking at the task lists of my various hosts, I have noticed the occasional task with status "download error".
Now one of my hosts cannot download any tasks at all.
Whenever I press the Update button, new tasks show up in the download manager, but a few seconds later their status changes to "download error".

The BOINC events log shows the following:

23.05.2024 14:50:05 | LHC@home | Started download of boinc_job_script.ZHGBc6
23.05.2024 14:50:05 | LHC@home | [error] File JX0KDmOONT5nsSi4apGgGQJmABFKDmABFKDm4ySLDmBVWKDmZoIKVo_input.tar.gz has wrong size: expected 479968, got 0
23.05.2024 14:50:05 | LHC@home | [error] Checksum or signature error for JX0KDmOONT5nsSi4apGgGQJmABFKDmABFKDm4ySLDmBVWKDmZoIKVo_input.tar.gz
23.05.2024 14:50:06 | LHC@home | Incomplete read of 520.000000 < 5KB for boinc_job_script.ZHGBc6 - truncating
23.05.2024 14:50:06 | LHC@home | Finished download of boinc_job_script.ZHGBc6
23.05.2024 14:50:06 | LHC@home | [error] File boinc_job_script.ZHGBc6 has wrong size: expected 17537, got 0
23.05.2024 14:50:06 | LHC@home | [error] Checksum or signature error for boinc_job_script.ZHGBc6

So, what is going on here?

Edit: I am seeing this behaviour on some of my other hosts as well :-(
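
For anyone who wants to check their own event log for the same symptom, the "wrong size" lines quoted in this post can be picked out with a short script. This is only a sketch: the regular expression is inferred from the two error lines shown here, and `find_size_errors` and `SAMPLE_LOG` are made-up names for illustration, not part of BOINC.

```python
import re

# Pattern inferred from the "[error] File ... has wrong size" lines in the post.
# Assumption: other BOINC client versions may word this message differently.
WRONG_SIZE = re.compile(
    r"\[error\] File (?P<name>\S+) has wrong size: "
    r"expected (?P<expected>\d+), got (?P<got>\d+)"
)


def find_size_errors(log_text):
    """Return (file name, expected bytes, received bytes) for each size error."""
    errors = []
    for line in log_text.splitlines():
        match = WRONG_SIZE.search(line)
        if match:
            errors.append(
                (match["name"], int(match["expected"]), int(match["got"]))
            )
    return errors


# Three lines copied from the event log quoted in the post.
SAMPLE_LOG = """\
23.05.2024 14:50:05 | LHC@home | Started download of boinc_job_script.ZHGBc6
23.05.2024 14:50:05 | LHC@home | [error] File JX0KDmOONT5nsSi4apGgGQJmABFKDmABFKDm4ySLDmBVWKDmZoIKVo_input.tar.gz has wrong size: expected 479968, got 0
23.05.2024 14:50:06 | LHC@home | [error] File boinc_job_script.ZHGBc6 has wrong size: expected 17537, got 0
"""

for name, expected, got in find_size_errors(SAMPLE_LOG):
    print(f"{name}: expected {expected} bytes, got {got}")
```

Both files in the sample arrive with 0 bytes, so the failure happens before any payload is received rather than partway through the transfer.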

maeax
Joined: 2 May 07
Posts: 2197
Credit: 173,398,042
RAC: 44,335
Message 50229 - Posted: 23 May 2024, 13:28:18 UTC - in response to Message 50228.  

We have to wait, because of yesterday's server upgrade.
I am seeing the same thing and have stopped downloading.

Erich56
Joined: 18 Dec 15
Posts: 1757
Credit: 115,849,737
RAC: 84,836
Message 50232 - Posted: 24 May 2024, 10:04:36 UTC

What happens now is this:
shortly after a new task shows up in the BOINC Manager, it is "cancelled by project" before the download starts.

Hopefully these problems will be straightened out soon :-)

maeax
Joined: 2 May 07
Posts: 2197
Credit: 173,398,042
RAC: 44,335
Message 50233 - Posted: 24 May 2024, 10:13:36 UTC - in response to Message 50232.  

Other projects cancel tasks as well; I saw this in one project today.
With this function, a task doesn't need to be run a third or fourth time.
For me, ATLAS is running well at the moment.

Harri Liljeroos
Joined: 28 Sep 04
Posts: 711
Credit: 47,659,105
RAC: 35,151
Message 50235 - Posted: 24 May 2024, 11:23:45 UTC

There is something wrong with the server status page data. The 'In progress' value jumped from about 14000 to 54000 on 23 May at about 8:30, but the Jobs graph doesn't show any significant change at that time.
See here: https://grafana.kiska.pw/d/boinc/boinc?orgId=1&var-project=lhc@home&from=now-2d&to=now&refresh=30m
So maybe the data on the server is messed up.

Erich56
Joined: 18 Dec 15
Posts: 1757
Credit: 115,849,737
RAC: 84,836
Message 50238 - Posted: 24 May 2024, 17:25:08 UTC

By now I am experiencing the "download failure" problem on all of my hosts (all Windows).

Is there anyone here who does NOT have this problem?

rob
Joined: 4 Mar 11
Posts: 27
Credit: 3,828,461
RAC: 807
Message 50239 - Posted: 24 May 2024, 18:23:32 UTC - in response to Message 50238.  

No problems downloading here, but then I'm only doing one or two downloads at a time.

Saturn911
Joined: 3 Nov 12
Posts: 54
Credit: 136,936,349
RAC: 101,393
Message 50240 - Posted: 24 May 2024, 19:48:36 UTC - in response to Message 50238.  

Two absurd WUs here.
Look at this host of mine:
https://lhcathome.cern.ch/lhcathome/results.php?hostid=10847676
These are old WUs, finished last week.
They are offered for download again and again, but nothing is actually downloaded.
Worst of all, they block the download of new ATLAS WUs.

My other hosts are downloading without any trouble.

maeax
Joined: 2 May 07
Posts: 2197
Credit: 173,398,042
RAC: 44,335
Message 50241 - Posted: 24 May 2024, 20:15:31 UTC - in response to Message 50240.  
Last modified: 24 May 2024, 20:29:38 UTC

My ISP says there are no problems, BUT
CERN is only giving 1 Mbit/s for ATLAS at the moment.
The same problem occurred a few weeks ago.

Erich56
Joined: 18 Dec 15
Posts: 1757
Credit: 115,849,737
RAC: 84,836
Message 50242 - Posted: 24 May 2024, 20:15:55 UTC

Now the download problem seems to be different from what it was before. See the BOINC event log:

24.05.2024 22:08:58 | LHC@home | Started download of Q6aLDmdJ5T5n7Olcko1bjSoqABFKDmABFKDmOtvXDmPHdKDmcz8DZn_EVNT.38776201._000085.pool.root.1
24.05.2024 22:08:58 | LHC@home | Started download of Q6aLDmdJ5T5n7Olcko1bjSoqABFKDmABFKDmOtvXDmPHdKDmcz8DZn_input.tar.gz
24.05.2024 22:08:58 | LHC@home | Started download of boinc_job_script.yNAYSM
24.05.2024 22:08:59 | LHC@home | Giving up on download of Q6aLDmdJ5T5n7Olcko1bjSoqABFKDmABFKDmOtvXDmPHdKDmcz8DZn_EVNT.38776201._000085.pool.root.1: permanent HTTP error
24.05.2024 22:08:59 | LHC@home | Finished download of Q6aLDmdJ5T5n7Olcko1bjSoqABFKDmABFKDmOtvXDmPHdKDmcz8DZn_input.tar.gz (0 bytes)
24.05.2024 22:08:59 | LHC@home | Finished download of boinc_job_script.yNAYSM (0 bytes)

Landjunge
Joined: 16 May 15
Posts: 1
Credit: 22,820,628
RAC: 134,133
Message 50243 - Posted: 24 May 2024, 20:49:15 UTC

Same here. I have started another BOINC instance to get around the problem. I hope these tasks get sorted out ASAP.

maeax
Joined: 2 May 07
Posts: 2197
Credit: 173,398,042
RAC: 44,335
Message 50244 - Posted: 24 May 2024, 21:31:35 UTC
Last modified: 24 May 2024, 21:36:30 UTC

One upload of 1.1 GByte at the moment?
Upload speed is no more than 500 kbit/s. (30 minutes)
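
As a rough check of what such a rate means for a file of this size, the transfer time can be estimated. This is a sketch under assumptions the post does not state: decimal units (1 GByte = 10^9 bytes), a per-second rate of 500,000 bit/s, and a constant speed for the whole upload. `transfer_seconds` is a made-up helper name.

```python
def transfer_seconds(size_bytes, rate_bits_per_second):
    """Time to move size_bytes at a constant rate, ignoring protocol overhead."""
    return size_bytes * 8 / rate_bits_per_second


size_bytes = 1_100_000_000  # 1.1 GByte, assuming decimal units
rate = 500_000              # 500 kbit/s, assuming a per-second rate

seconds = transfer_seconds(size_bytes, rate)
print(f"{seconds:.0f} s = {seconds / 3600:.1f} h")
```

Under these assumptions the upload would need several hours, far longer than the 30 minutes mentioned in the post.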

Erich56
Joined: 18 Dec 15
Posts: 1757
Credit: 115,849,737
RAC: 84,836
Message 50245 - Posted: 25 May 2024, 5:30:57 UTC

Also, something is wrong with the task list on the user's web page.

For this host, as can be seen, 5 ATLAS tasks are shown as being in progress:
https://lhcathome.cern.ch/lhcathome/results.php?hostid=10815905

In reality, only 1 is in progress; all the others finished yesterday or the day before. So why are they not shown as finished and validated (or not validated)?

Over the past few days there seem to have been quite a number of problems at LHC@home.

Saturn911
Joined: 3 Nov 12
Posts: 54
Credit: 136,936,349
RAC: 101,393
Message 50246 - Posted: 25 May 2024, 6:03:53 UTC - in response to Message 50245.  


for this host, as can be seen, 5 ATLAS tasks are shown as being in process:
https://lhcathome.cern.ch/lhcathome/results.php?hostid=10815905

These prevent new WUs for this host!
It's impossible to abort them.

maeax
Joined: 2 May 07
Posts: 2197
Credit: 173,398,042
RAC: 44,335
Message 50247 - Posted: 25 May 2024, 6:40:08 UTC - in response to Message 50245.  

Do you have any open transfers for these ATLAS tasks in BOINC Manager -> Filetransfer?
Last night I also had ATLAS tasks with problems, but they are now finished and validated.

Saturn911
Joined: 3 Nov 12
Posts: 54
Credit: 136,936,349
RAC: 101,393
Message 50248 - Posted: 25 May 2024, 7:26:53 UTC - in response to Message 50247.  
Last modified: 25 May 2024, 7:39:58 UTC

Do you have in Boincmanager -> Filetransfer for this Atlas-Tasks transfers which are open?

Nothing in Filetransfer.
As soon as I set "Allow new work", they appear as downloads for a few seconds, but nothing is transferred. Only the download time changes, nothing else.

Edit:
Another attempt brought new WUs, but the 2 blocking tasks are still present.

Saturn911
Joined: 3 Nov 12
Posts: 54
Credit: 136,936,349
RAC: 101,393
Message 50255 - Posted: 27 May 2024, 12:02:31 UTC

@David Cameron
Please delete these 2 WUs from the server.

QlULDmVieS5n7Olcko1bjSoqABFKDmABFKDmOtvXDmvMaKDmojnWSn
XLgKDmdkhR5nsSi4apGgGQJmABFKDmABFKDm4ySLDm6FPKDmFCgdxm

They are misconfigured (end date in 2027).

Host ID 10847676 tries to download them again and again.
I can't interrupt this. Aborting does not help.

For example:

Application: ATLAS Simulation 3.01 (native_mt)
Name: QlULDmVieS5n7Olcko1bjSoqABFKDmABFKDmOtvXDmvMaKDmojnWSn
State: Download failed
Received: Mon 27 May 2024 13:29:44
Report deadline: Sun 04 Jul 2027 09:54:10
Resources: 6 CPUs
Estimated computation size: 43,200 GFLOPs
Executable: wrapper_26015_x86_64-pc-linux-gnu
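
To put the deadline in perspective, the gap between the received time and the deadline listed in the task details can be computed directly. A minimal sketch: the two timestamps are copied from the details above, while `MAX_EXPECTED_DAYS`, `deadline_gap` and `looks_misconfigured` are hypothetical names and values chosen for illustration, not anything used by LHC@home.

```python
from datetime import datetime, timedelta

# Hypothetical threshold for illustration only; not a project setting.
MAX_EXPECTED_DAYS = 30


def deadline_gap(received, deadline):
    """Time between receiving a task and its deadline."""
    return deadline - received


def looks_misconfigured(received, deadline, max_days=MAX_EXPECTED_DAYS):
    """Flag a task whose deadline lies further out than max_days."""
    return deadline_gap(received, deadline) > timedelta(days=max_days)


# Timestamps taken from the task details in the post.
received = datetime(2024, 5, 27, 13, 29, 44)
deadline = datetime(2027, 7, 4, 9, 54, 10)

gap = deadline_gap(received, deadline)
print(f"Deadline is {gap.days} days after receipt")
print("Suspicious:", looks_misconfigured(received, deadline))
```

The gap comes out at more than three years, which supports the suspicion that the deadline is wrong.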

CloverField
Joined: 17 Oct 06
Posts: 80
Credit: 56,488,930
RAC: 15,061
Message 50256 - Posted: 27 May 2024, 13:52:24 UTC

Has anyone actually had a successful task download for ATLAS since this issue started? I've been running Theory since then, but I give ATLAS a go once a day and all of my downloads continue to fail.

computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2509
Credit: 249,192,738
RAC: 127,524
Message 50257 - Posted: 27 May 2024, 14:29:57 UTC - in response to Message 50256.  

Has anyone actually had a successful task download for atlas since this issue started?

Yes.
Had a short power outage 2 days ago that caused all tasks to fail.
Hence, I restarted from scratch.
ATLAS was last (today).
So far without any download issues and some valids.


BTW:
@Saturn911
David left CERN more than a year ago.
https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5989&postid=48019

rob
Joined: 4 Mar 11
Posts: 27
Credit: 3,828,461
RAC: 807
Message 50258 - Posted: 27 May 2024, 14:43:04 UTC - in response to Message 50256.  

Two today:
BipKDmO9fV5n9Rq4apOajLDm4fhM0noT9bVo2ijZDmF6FKDmBB9tyn received at 10:31
https://lhcathome.cern.ch/lhcathome/result.php?resultid=411404266

zaBMDmcviV5n9Rq4apOajLDm4fhM0noT9bVo2ijZDmABGKDm8fILCn received at 14:51
https://lhcathome.cern.ch/lhcathome/result.php?resultid=411405886

Both are sitting in the queue ready to start.


©2024 CERN