Message boards : ATLAS application : ATLAS affected by DSL Outage

computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2401
Credit: 225,175,311
RAC: 124,037
Message 31192 - Posted: 30 Jun 2017, 10:32:50 UTC

The following ATLAS job was affected by a one-hour DSL outage caused by my ISP; it finished later without manual intervention.
The project team in particular may want to examine the logs to check whether the VM behaved as expected.

https://lhcathome.cern.ch/lhcathome/result.php?resultid=150145411
ID: 31192
David Cameron
Project administrator
Project developer
Project scientist

Joined: 13 May 14
Posts: 387
Credit: 15,314,184
RAC: 0
Message 31193 - Posted: 30 Jun 2017, 11:23:21 UTC - in response to Message 31192.  

From the output:

Input/output error: '/cvmfs/atlas.cern.ch/repo/sw/software/21.0/AtlasOffline/21.0.15/InstallArea/x86_64-slc6-gcc49-opt/jobOptions/SimuJobTransforms/skeleton.HITSMerge.py'

Data in /cvmfs is read over the network, so your outage caused the task to fail. You still got the credits because you used significant CPU time, but our systems further down the chain will see the error and retry the task automatically.
ID: 31193
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2401
Credit: 225,175,311
RAC: 124,037
Message 31197 - Posted: 30 Jun 2017, 12:00:22 UTC - in response to Message 31193.  

Thank you David.

It may at least help to harden the WUs against outages or, if these situations are rare, simply to reschedule the job as you stated.
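
To illustrate what hardening against a short outage could look like, here is a minimal Python sketch that retries a file read with exponential backoff when the error looks like a lost network mount. This is not how the ATLAS application or CVMFS actually handles such errors. The function name read_with_retry, its parameters, and the choice of error codes to retry are all assumptions made for the example.

```python
import errno
import time

# Error codes treated as "probably a transient network problem".
# This selection is an assumption for the sketch, not a CVMFS rule.
TRANSIENT_ERRNOS = (errno.EIO, errno.ENOTCONN, errno.ETIMEDOUT)


def read_with_retry(path, attempts=5, base_delay=1.0,
                    opener=open, sleep=time.sleep):
    """Read a file, retrying transient I/O errors with exponential backoff.

    opener and sleep are injectable so the behaviour can be tested
    without a real network mount or real waiting.
    """
    for attempt in range(attempts):
        try:
            with opener(path, "rb") as handle:
                return handle.read()
        except OSError as exc:
            # Errors such as "file not found" will not fix themselves.
            if exc.errno not in TRANSIENT_ERRNOS:
                raise
            # Give up after the last attempt and report the original error.
            if attempt == attempts - 1:
                raise
            # Wait 1x, 2x, 4x, ... base_delay before trying again.
            sleep(base_delay * 2 ** attempt)
```

With the defaults, the waits add up to only 15 seconds, so surviving a one-hour outage would need far more attempts or a capped but much longer delay. If outages like this are rare, rescheduling the job is probably the simpler option.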
ID: 31197


©2024 CERN