Server outage - uploads failing
Author | Message |
---|---|
Joined: 15 Jul 05 Posts: 249 Credit: 5,974,599 RAC: 0 |
Due to a network problem in the CERN computer centre early Thursday morning, our BOINC servers have lost access to a storage cluster. Hence uploads are failing, as is access to web pages. We hope this will be fixed soon. |
Joined: 15 Jun 08 Posts: 2541 Credit: 254,608,838 RAC: 34,609 |
Thanks Nils. Got fresh work for CMS and Theory but not yet for ATLAS. Might be due to a huge number of work requests. |
Joined: 2 May 07 Posts: 2244 Credit: 173,902,375 RAC: 456 |
Thank you, Nils and team. Yes, ATLAS native says no tasks are available. The server status page shows 0 tasks for ATLAS. |
Joined: 15 Jun 08 Posts: 2541 Credit: 254,608,838 RAC: 34,609 |
Now I got an ATLAS task but it failed with "-186 (0xFFFFFF46) ERR_RESULT_DOWNLOAD": https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=4387&postid=41334 |
Joined: 15 Jun 08 Posts: 2541 Credit: 254,608,838 RAC: 34,609 |
CMS tasks are starting fine, and all of them get a subtask immediately after the initial setup, but once the first subtask has finished they have problems getting a new one. |
Joined: 2 May 07 Posts: 2244 Credit: 173,902,375 RAC: 456 |
Since yesterday evening, 20:00 UTC, some interruptions have been shown for uploads and downloads. They need a manual reaction: retrying the transfer in BOINC. |
Joined: 9 Jan 15 Posts: 151 Credit: 431,596,822 RAC: 0 |
Atlas WU download error: couldn't get input files: <file_xfer_error> <file_name>VlLMDmUZ6EwnsSi4apGgGQJmABFKDmABFKDmvjyKDmABFKDmCNuF5m_EVNT.19652175._000455.pool.root.1</file_name> <error_code>-224 (permanent HTTP error)</error_code> <error_message>permanent HTTP error</error_message> </file_xfer_error> |
Joined: 15 Jun 08 Posts: 2541 Credit: 254,608,838 RAC: 34,609 |
A link to the affected task or WU would have been helpful. I guess it's this one: https://lhcathome.cern.ch/lhcathome/result.php?resultid=259798525 If you take a look at the WU overview: https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=131025881 you may notice that all of your wingmen get the same error. The reason is that an input file is missing on the download server. Since max#errors is set to 3, this WU will be sent out 3 times before the server automatically removes it from the queue. |
Joined: 15 Jul 05 Posts: 249 Credit: 5,974,599 RAC: 0 |
One of our 3 file servers still had issues; it should be fixed now. |
Joined: 9 Jan 15 Posts: 151 Credit: 431,596,822 RAC: 0 |
Yes, this is the WU that was affected by the error. I was on a break at work and short of time, so I didn't dig much into the history of other WUs or of this one. My conclusion was that the server was in bad shape for sending data, since it returned an HTTP error. It was just an example: if one WU failed, several more would too, so this WU would not be the only one. I opted out of ATLAS as soon as I saw it and turned it back on when I got home; my hosts download just fine now. |
Joined: 15 Nov 14 Posts: 602 Credit: 24,371,321 RAC: 0 |
At just that time, I picked up one of the usual errors "207 (0x000000CF) EXIT_NO_SUB_TASKS". https://lhcathome.cern.ch/lhcathome/result.php?resultid=259641851 However, the next eight gave "194 (0x000000C2) EXIT_ABORTED_BY_CLIENT". https://lhcathome.cern.ch/lhcathome/result.php?resultid=259690076 I had posted on this error message before, and I think this explains it. |
Joined: 2 May 07 Posts: 2244 Credit: 173,902,375 RAC: 456 |
For the moment, some download errors are shown for ATLAS and ATLAS native. Sorry, but it is the weekend. |
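
Several posts in this thread quote an error code as a signed decimal value followed by a hexadecimal value in parentheses, for example "-186 (0xFFFFFF46)". The hexadecimal form appears to be the same number written as a 32-bit two's-complement value. The following is a minimal sketch of that conversion, not BOINC's own code; the helper name `to_hex32` is made up for illustration, and the three codes are the ones quoted in the posts above.

```python
def to_hex32(code: int) -> str:
    """Format a signed exit code as an unsigned 32-bit hex string.

    Masking with 0xFFFFFFFF yields the two's-complement bit pattern
    for negative values and leaves non-negative values unchanged.
    """
    return f"0x{code & 0xFFFFFFFF:08X}"


# Codes quoted in this thread (names as they appear in the posts).
quoted_codes = {
    "ERR_RESULT_DOWNLOAD": -186,
    "EXIT_NO_SUB_TASKS": 207,
    "EXIT_ABORTED_BY_CLIENT": 194,
}

for name, value in quoted_codes.items():
    print(f"{value} ({to_hex32(value)}) {name}")
```

This makes it easier to match a hexadecimal value seen in one place against the decimal value seen in another.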
©2024 CERN