Message boards : ATLAS application : Tasks failing with "G4 exception at line 3445"

HerveUAE
Joined: 18 Dec 16
Posts: 123
Credit: 37,495,365
RAC: 0
Message 31215 - Posted: 1 Jul 2017, 7:07:28 UTC

On only one of my 4 computers, several of the tasks fail with "Validate error" and the Stderr output has the following error:
Transform executor raised TransformValidationException: EVNTtoHITS got a SIGABRT signal (exit code 134); G4 exception at line 3445 (see jobReport for further details)

Here are examples of such tasks:
https://lhcathome.cern.ch/lhcathome/result.php?resultid=150102272
https://lhcathome.cern.ch/lhcathome/result.php?resultid=150089802
https://lhcathome.cern.ch/lhcathome/result.php?resultid=150085059
https://lhcathome.cern.ch/lhcathome/result.php?resultid=150081883

Does anybody have an idea of what could cause the error? Should I just reboot the computer to see if it goes away?
We are the product of random evolution.
ID: 31215
David Cameron
Project administrator
Project developer
Project scientist

Joined: 13 May 14
Posts: 387
Credit: 15,314,184
RAC: 0
Message 31245 - Posted: 3 Jul 2017, 7:17:42 UTC - in response to Message 31215.  

There is some strange stuff in the log:

2017-07-01 07:16:44 (19104): Guest Log: CCopyoipnygi nign piuntp ufti lfeisl eisn tion tRou nRAutnlAatsl.as
2017-07-01 07:16:44 (19104): Guest Log: CCopyoipnygi nign piuntp ufti lfeisl eisn tion tRou nRAutnlAatsl.as.

It looks like two tasks are running at the same time inside the same virtual machine. I've seen this before but never figured out what causes it. Can you try a reboot to see if it helps?
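
The garbled line can be split, character by character, into two complete copies of one message, which is what two writers printing at the same moment would produce. It does not show who the writers are. Here is a minimal Python sketch of that check. The decoded message `Copying input files into RunAtlas.` is inferred from the garbled text and is an assumption, and `is_interleaving` is a hypothetical helper name.

```python
from functools import lru_cache


def is_interleaving(merged: str, a: str, b: str) -> bool:
    """Return True if `merged` can be formed by interleaving `a` and `b`
    while keeping the character order inside each of them."""
    if len(merged) != len(a) + len(b):
        return False

    @lru_cache(maxsize=None)
    def fits(i: int, j: int) -> bool:
        # i characters of `a` and j characters of `b` are already consumed.
        k = i + j
        if k == len(merged):
            return True
        if i < len(a) and a[i] == merged[k] and fits(i + 1, j):
            return True
        return j < len(b) and b[j] == merged[k] and fits(i, j + 1)

    return fits(0, 0)


# Message text inferred from the garbled log line (an assumption).
MESSAGE = "Copying input files into RunAtlas."
# Second garbled line from the log above, without the timestamp prefix.
GARBLED = "CCopyoipnygi nign piuntp ufti lfeisl eisn tion tRou nRAutnlAatsl.as."

print(is_interleaving(GARBLED, MESSAGE, MESSAGE))
```

The first garbled line as pasted lacks the final period, so it is one character short and fails the length check.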
ID: 31245
Harri Liljeroos
Joined: 28 Sep 04
Posts: 674
Credit: 43,152,673
RAC: 15,405
Message 31247 - Posted: 3 Jul 2017, 9:06:09 UTC

I have seen several of those also. Here's one that failed with the messages showing three times: https://lhcathome.cern.ch/lhcathome/result.php?resultid=150209806 The task also gave error 65 and ran for only about 15 minutes.

The host in question had successfully finished one task before that, and then another just after it. I have increased the memory from the default settings.
ID: 31247
Harri Liljeroos
Joined: 28 Sep 04
Posts: 674
Credit: 43,152,673
RAC: 15,405
Message 31259 - Posted: 3 Jul 2017, 18:08:27 UTC

Here's one failed single-core task: https://lhcathome.cern.ch/lhcathome/result.php?resultid=150347204 It has some of the lines in stderr repeated 7 times. The memory was set to 4500 MB. This one didn't have any error messages though.
ID: 31259
Crystal Pellet
Volunteer moderator
Volunteer tester

Joined: 14 Jan 10
Posts: 1268
Credit: 8,421,616
RAC: 2,139
Message 31262 - Posted: 3 Jul 2017, 19:00:51 UTC - in response to Message 31245.  
Last modified: 3 Jul 2017, 19:05:46 UTC

There is some strange stuff in the log:

2017-07-01 07:16:44 (19104): Guest Log: CCopyoipnygi nign piuntp ufti lfeisl eisn tion tRou nRAutnlAatsl.as
2017-07-01 07:16:44 (19104): Guest Log: CCopyoipnygi nign piuntp ufti lfeisl eisn tion tRou nRAutnlAatsl.as.

It looks like two tasks are running at the same time inside the same virtual machine. I've seen this before but never figured out what causes it. Can you try a reboot to see if it helps?

I don't think this is coming from 2 tasks running. The output is coming from the same procID 19104,
so I think this is a minor wrapper problem filtering Guest.log data needed for the stderr.txt from VBox.log.
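
To check the process ID on a longer stderr.txt, the number in parentheses can be pulled out of each line. This is a rough Python sketch that assumes every line follows the layout of the two lines quoted above; `process_ids` is a hypothetical helper name.

```python
import re

# Layout taken from the two log lines quoted above:
# "<date> <time> (<process id>): <message>"
LOG_LINE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \((\d+)\): (.*)$")


def process_ids(lines):
    """Collect the process IDs found in stderr.txt-style lines."""
    ids = set()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match:
            ids.add(int(match.group(2)))
    return ids


quoted = [
    "2017-07-01 07:16:44 (19104): Guest Log: CCopyoipnygi nign piuntp ufti lfeisl eisn tion tRou nRAutnlAatsl.as",
    "2017-07-01 07:16:44 (19104): Guest Log: CCopyoipnygi nign piuntp ufti lfeisl eisn tion tRou nRAutnlAatsl.as.",
]
print(process_ids(quoted))  # {19104}
```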
ID: 31262
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2386
Credit: 222,969,741
RAC: 136,700
Message 31264 - Posted: 3 Jul 2017, 19:26:41 UTC - in response to Message 31215.  

On only one of my 4 computers, several of the tasks fail with "Validate error" and the Stderr output has the following error:
Transform executor raised TransformValidationException: EVNTtoHITS got a SIGABRT signal (exit code 134); G4 exception at line 3445 (see jobReport for further details)

Here are example of such tasks:
https://lhcathome.cern.ch/lhcathome/result.php?resultid=150102272
https://lhcathome.cern.ch/lhcathome/result.php?resultid=150089802
https://lhcathome.cern.ch/lhcathome/result.php?resultid=150085059
https://lhcathome.cern.ch/lhcathome/result.php?resultid=150081883

Anybody has an idea of what could cause the error? Should I just reboot the computer to see if it goes away?

You may try a project reset and delete remaining trash (like old vdi files) from the slots directory.
Then restart your host and request a fresh WU.
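
To see what is left behind before deleting anything, the leftover files can be listed first. A small Python sketch follows, assuming the slots directory sits under the BOINC data directory. The path below is the default on many Linux installs and is an assumption, and `leftover_vdi_files` is a hypothetical helper name. It only lists files and deletes nothing.

```python
from pathlib import Path


def leftover_vdi_files(slots_dir):
    """Return the .vdi files found anywhere under `slots_dir`, sorted by path."""
    root = Path(slots_dir)
    if not root.is_dir():
        return []
    return sorted(p for p in root.rglob("*.vdi") if p.is_file())


if __name__ == "__main__":
    # Assumed location; adjust to your own BOINC data directory.
    for path in leftover_vdi_files("/var/lib/boinc-client/slots"):
        print(path, path.stat().st_size, "bytes")
```

Run it only while no ATLAS task is active, since a running task keeps its own vdi file in its slot.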
ID: 31264
HerveUAE
Joined: 18 Dec 16
Posts: 123
Credit: 37,495,365
RAC: 0
Message 31271 - Posted: 4 Jul 2017, 2:45:14 UTC - in response to Message 31264.  

Can you try a reboot to see if it helps?


I have rebooted the PC and it solved the issue. It is not the first time this has occurred on that specific computer, and I don't remember seeing the G4 error on any of my other 2 computers. Strange...

You may try a project reset and delete remaining trash (like old vdi files) from the slots directory.
Then restart your host and request a fresh WU.

Next time it occurs, I will try that suggestion instead of rebooting the computer.

I don't think this is coming from 2 tasks running. The output is coming from the same procID 19104,
so I think this is a minor wrapper problem filtering Guest.log data needed for the stderr.txt from VBox.log.

I have seen stderr.txt logs with lines repeated multiple times and/or intertwined several times in the past. I also do not think it is linked to a task failure.
We are the product of random evolution.
ID: 31271
