Message boards : ATLAS application : Atlas Version 2 WU's crash (at startup?)

csbyseti

Joined: 6 Jul 17
Posts: 22
Credit: 29,430,354
RAC: 0
Message 40187 - Posted: 18 Oct 2019, 7:51:46 UTC

This morning I see, on all computers, ATLAS WUs with only 110-150% CPU load but ~10 h run time. The CPU load should be at 350-390% with 4 threads.

I think all WUs from yesterday have crashed, but without a working Alt-F2 console it is not easy to get anything displayed.

There must be a problem in the current set of WUs; older version 2 WUs worked normally.

Some VMs displayed text with a kernel panic.

Some team members have the same problem.
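The CPU-load symptom described above can be turned into a quick check. This is only a minimal sketch: the function name and the 80% threshold are assumptions chosen for illustration, not anything defined by BOINC or the ATLAS application.

```python
def is_underutilised(cpu_percent, threads, min_fraction=0.8):
    """Return True if the measured CPU load is well below threads * 100%.

    min_fraction is an assumed threshold (80% of the theoretical maximum),
    not a value taken from the project.
    """
    if threads <= 0:
        raise ValueError("threads must be positive")
    return cpu_percent < min_fraction * threads * 100.0


# Loads reported in the post, all for 4-thread tasks.
for load in (110, 150, 350, 390):
    status = "suspect" if is_underutilised(load, threads=4) else "ok"
    print(f"{load}% with 4 threads: {status}")
```

With 4 threads the cutoff is 320%, so the 110-150% range is flagged while 350-390% passes.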
csbyseti

Joined: 6 Jul 17
Posts: 22
Credit: 29,430,354
RAC: 0
Message 40188 - Posted: 18 Oct 2019, 8:13:10 UTC

On one VM I got the following error:

https://c.web.de/@309282286350637121/m6p10ZAuQlOkGGaU2RipmQ

I've also seen this error on a machine at WU restart.
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2386
Credit: 222,924,227
RAC: 137,713
Message 40190 - Posted: 18 Oct 2019, 8:43:17 UTC - in response to Message 40188.  

vboxsf --> vbox shared folder driver

best cases:
- upgrade to the most recent vbox version
- reset the project (to get a fresh vdi if yours got damaged somehow)

worst case:
The vbox extensions inside the vdi are no longer compatible with the running VM kernel.
This needs to be investigated/fixed by the CERN developers.
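Before upgrading, it can help to see which VirtualBox version is actually installed. The following is a rough sketch that assumes `VBoxManage` is on the PATH and prints a version string of the form `6.0.14r133895` on its last output line; the helper names and the sample string are illustrative only.

```python
import re
import shutil
import subprocess


def parse_vbox_version(text):
    """Extract (major, minor, patch) from a string such as '6.0.14r133895'."""
    match = re.match(r"\s*(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def installed_vbox_version():
    """Return the local VirtualBox version, or None if it cannot be determined."""
    exe = shutil.which("VBoxManage")
    if exe is None:
        return None  # VBoxManage not found on PATH
    result = subprocess.run([exe, "--version"], capture_output=True, text=True)
    lines = result.stdout.strip().splitlines()
    if not lines:
        return None
    # Warnings may precede the version, so only the last line is parsed.
    return parse_vbox_version(lines[-1])


print(installed_vbox_version())
```

The tuple form makes it easy to compare against whatever version is currently the most recent.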
Crystal Pellet
Volunteer moderator
Volunteer tester

Joined: 14 Jan 10
Posts: 1268
Credit: 8,421,616
RAC: 2,139
Message 40191 - Posted: 18 Oct 2019, 9:17:13 UTC

I aborted some tasks. Although set up with 4 cores (out of 8), the VM is using 'only' 20-22% of the CPU.
Consoles F2 and F3 are not showing information, so no idea what the VM is doing - hence abort.
Klaus

Joined: 27 Aug 15
Posts: 27
Credit: 9,956,094
RAC: 4,433
Message 40192 - Posted: 18 Oct 2019, 9:21:00 UTC

Same problem today. ATLAS jobs do not start.
The vbox.log contains no line Guest Log: *** Starting ATLAS job. (PandaID=...... at startup.
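The check described above (looking for the job-start line in vbox.log) could be scripted roughly as follows. This is a sketch under assumptions: the marker text is copied from this post, while the function names and the log path passed in are hypothetical and depend on the local BOINC slot directory.

```python
from pathlib import Path

# Marker text as quoted in the post; the PandaID part varies per task.
START_MARKER = "Guest Log: *** Starting ATLAS job."


def job_started(log_text, marker=START_MARKER):
    """Return True if any line of the log text contains the start marker."""
    return any(marker in line for line in log_text.splitlines())


def job_started_in_file(path):
    """Read a vbox.log file (path supplied by the caller) and check for the marker."""
    return job_started(Path(path).read_text(errors="replace"))
```

A task whose vbox.log never shows the marker after the VM has booted is a candidate for the startup failure discussed in this thread.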
maeax

Joined: 2 May 07
Posts: 2071
Credit: 156,093,506
RAC: 103,308
Message 40193 - Posted: 18 Oct 2019, 9:59:20 UTC

Same here.
It needs to be rolled back to the previous version from yesterday, without the Alt+F3 feature.
David Cameron
Project administrator
Project developer
Project scientist

Joined: 13 May 14
Posts: 387
Credit: 15,314,184
RAC: 0
Message 40196 - Posted: 18 Oct 2019, 11:58:42 UTC - in response to Message 40193.  

I have rolled back yesterday's changes. I'm travelling today so don't have time to investigate what the problem is but will look at it on Monday.
maeax

Joined: 2 May 07
Posts: 2071
Credit: 156,093,506
RAC: 103,308
Message 40197 - Posted: 18 Oct 2019, 12:51:38 UTC

Thank you David.
David Cameron
Project administrator
Project developer
Project scientist

Joined: 13 May 14
Posts: 387
Credit: 15,314,184
RAC: 0
Message 40241 - Posted: 22 Oct 2019, 11:08:10 UTC

The problem is now understood and fixed. In the next hour or two, new WUs will have the fixed version of the top console activated.
csbyseti

Joined: 6 Jul 17
Posts: 22
Credit: 29,430,354
RAC: 0
Message 40242 - Posted: 22 Oct 2019, 14:10:47 UTC - in response to Message 40241.  

This sounds good.

Thanks for the fast reaction on Friday.

On my i7-5820K, WUs were also failing the whole weekend. I have updated VirtualBox now and hope that will solve the problems on this machine.



©2024 CERN