Tasks failing with "G4 exception at line 3445"


Message boards : ATLAS application : Tasks failing with "G4 exception at line 3445"

Author Message
Profile HerveUAE
Joined: 18 Dec 16
Posts: 120
Credit: 6,749,027
RAC: 20,218
Message 31215 - Posted: 1 Jul 2017, 7:07:28 UTC

On only one of my 4 computers, several of the tasks fail with "Validate error" and the Stderr output has the following error:

Transform executor raised TransformValidationException: EVNTtoHITS got a SIGABRT signal (exit code 134); G4 exception at line 3445 (see jobReport for further details)

Here are examples of such tasks:
https://lhcathome.cern.ch/lhcathome/result.php?resultid=150102272
https://lhcathome.cern.ch/lhcathome/result.php?resultid=150089802
https://lhcathome.cern.ch/lhcathome/result.php?resultid=150085059
https://lhcathome.cern.ch/lhcathome/result.php?resultid=150081883

Does anybody have an idea of what could cause the error? Should I just reboot the computer to see if it goes away?
____________
We are the product of random evolution.
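
For readers puzzled by "exit code 134" in the error above: by the usual Unix shell convention, a process killed by signal N is reported with exit code 128 + N, and SIGABRT is signal 6 on Linux, which gives 134. The sketch below only illustrates that arithmetic. The helper name `decode_exit_code` and the small signal table are made up for this note and are not part of the ATLAS transform or BOINC.

```python
# Hypothetical helper for illustration only (not ATLAS or BOINC code).
# Signal numbers are the standard Linux values.
LINUX_SIGNALS = {6: "SIGABRT", 9: "SIGKILL", 11: "SIGSEGV", 15: "SIGTERM"}


def decode_exit_code(code):
    """Return the signal name if `code` follows the shell's 128 + N convention, else None."""
    if code > 128:
        number = code - 128
        return LINUX_SIGNALS.get(number, f"signal {number}")
    return None


print(decode_exit_code(134))
```

So the exit code only restates that the simulation step aborted; it does not say why.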

David Cameron
Project administrator
Project developer
Project scientist
Joined: 13 May 14
Posts: 139
Credit: 3,159,531
RAC: 6,484
Message 31245 - Posted: 3 Jul 2017, 7:17:42 UTC - in response to Message 31215.

There is some strange stuff in the log:

2017-07-01 07:16:44 (19104): Guest Log: CCopyoipnygi nign piuntp ufti lfeisl eisn tion tRou nRAutnlAatsl.as
2017-07-01 07:16:44 (19104): Guest Log: CCopyoipnygi nign piuntp ufti lfeisl eisn tion tRou nRAutnlAatsl.as.

It looks like two tasks are running at the same time inside the same virtual machine. I've seen this before but never figured out what causes it. Can you try a reboot to see if it helps?
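
The garbled log lines above read like one message written twice with the characters of the two copies mixed together. A minimal sketch to check that reading, assuming the underlying message is "Copying input files into RunAtlas" (a guess reconstructed from the garbled text, not confirmed anywhere in this thread); `is_interleaving` is a name invented for this note:

```python
def is_interleaving(s, a, b):
    """Return True if s can be formed by merging a and b while keeping each one's character order."""
    if len(s) != len(a) + len(b):
        return False
    # reachable[j] is True when s[:i + j] can be built from a[:i] and b[:j]
    reachable = [False] * (len(b) + 1)
    for i in range(len(a) + 1):
        for j in range(len(b) + 1):
            if i == 0 and j == 0:
                reachable[j] = True
                continue
            # reachable[j] still holds the value for row i - 1 at this point
            from_a = i > 0 and reachable[j] and a[i - 1] == s[i + j - 1]
            # reachable[j - 1] was already updated for row i
            from_b = j > 0 and reachable[j - 1] and b[j - 1] == s[i + j - 1]
            reachable[j] = from_a or from_b
    return reachable[len(b)]


# First garbled line from the log, without the timestamp and "Guest Log:" prefix
garbled = "CCopyoipnygi nign piuntp ufti lfeisl eisn tion tRou nRAutnlAatsl.as"
# Assumed original message (a guess)
message = "Copying input files into RunAtlas"

print(is_interleaving(garbled, message + ".", message))
```

This only tests whether the line is consistent with two mixed copies of the same message. It says nothing about whether the copies come from two tasks or from a single writer.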

Harri Liljeroos
Joined: 28 Sep 04
Posts: 205
Credit: 6,174,208
RAC: 2,706
Message 31247 - Posted: 3 Jul 2017, 9:06:09 UTC

I have seen several of those as well. Here's one that failed with messages showing three times: https://lhcathome.cern.ch/lhcathome/result.php?resultid=150209806 The task also gave error 65 and ran for only about 15 minutes.

The host in question had successfully finished one task before that one and another just after it. I have increased the memory from the default settings.
____________

Harri Liljeroos
Joined: 28 Sep 04
Posts: 205
Credit: 6,174,208
RAC: 2,706
Message 31259 - Posted: 3 Jul 2017, 18:08:27 UTC

Here's one failed single-core task: https://lhcathome.cern.ch/lhcathome/result.php?resultid=150347204 Some of the lines in its stderr are repeated 7 times. The memory was set to 4500 MB. This one didn't have any error messages though.
____________
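
Counting repeated stderr lines by eye is tedious, so here is a rough sketch that does it for a saved copy of the stderr text. It assumes every line starts with a "date time (pid): " prefix like the log lines quoted earlier in this thread; the function name and the sample text are invented for illustration and are not real task output.

```python
import re
from collections import Counter

# Prefix such as "2017-07-01 07:16:44 (19104): " (format taken from the quoted log lines)
PREFIX = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} \(\d+\): ")


def repeated_lines(text, min_count=2):
    """Map each message that appears at least min_count times to its count, ignoring the prefix."""
    messages = (PREFIX.sub("", line).strip() for line in text.splitlines())
    counts = Counter(m for m in messages if m)
    return {m: n for m, n in counts.items() if n >= min_count}


# Invented sample, not taken from a real task
sample = (
    "2017-07-03 18:00:01 (1234): Guest Log: step one\n"
    "2017-07-03 18:00:02 (1234): Guest Log: step one\n"
    "2017-07-03 18:00:03 (1234): Guest Log: step two\n"
)
print(repeated_lines(sample))
```

Lines whose characters are interleaved will not match each other exactly, so this only catches clean repeats.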

Crystal Pellet
Volunteer moderator
Volunteer tester
Joined: 14 Jan 10
Posts: 384
Credit: 2,997,809
RAC: 2,011
Message 31262 - Posted: 3 Jul 2017, 19:00:51 UTC - in response to Message 31245.
Last modified: 3 Jul 2017, 19:05:46 UTC

There is some strange stuff in the log:

2017-07-01 07:16:44 (19104): Guest Log: CCopyoipnygi nign piuntp ufti lfeisl eisn tion tRou nRAutnlAatsl.as
2017-07-01 07:16:44 (19104): Guest Log: CCopyoipnygi nign piuntp ufti lfeisl eisn tion tRou nRAutnlAatsl.as.

It looks like two tasks are running at the same time inside the same virtual machine. I've seen this before but never figured out what causes it. Can you try a reboot to see if it helps?

I don't think this is coming from 2 tasks running. The output is coming from the same procID 19104,
so I think this is a minor wrapper problem filtering Guest.log data needed for the stderr.txt from VBox.log.

computezrmle
Joined: 15 Jun 08
Posts: 347
Credit: 3,501,271
RAC: 1,830
Message 31264 - Posted: 3 Jul 2017, 19:26:41 UTC - in response to Message 31215.

On only one of my 4 computers, several of the tasks fail with "Validate error" and the Stderr output has the following error:
Transform executor raised TransformValidationException: EVNTtoHITS got a SIGABRT signal (exit code 134); G4 exception at line 3445 (see jobReport for further details)

Here are examples of such tasks:
https://lhcathome.cern.ch/lhcathome/result.php?resultid=150102272
https://lhcathome.cern.ch/lhcathome/result.php?resultid=150089802
https://lhcathome.cern.ch/lhcathome/result.php?resultid=150085059
https://lhcathome.cern.ch/lhcathome/result.php?resultid=150081883

Does anybody have an idea of what could cause the error? Should I just reboot the computer to see if it goes away?

You may try a project reset and delete remaining trash (like old vdi files) from the slots directory.
Then restart your host and request a fresh WU.
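
To see whether any old vdi files are actually left behind before deleting anything, a small sketch like the following could list them. It assumes the usual BOINC layout of numbered subdirectories under a slots directory; the function name is made up for this note, the location of the slots directory depends on your installation, and the script only lists files and never deletes them.

```python
import sys
from pathlib import Path


def find_leftover_vdi(slots_dir):
    """Return the .vdi files found one level below slots_dir, sorted by path."""
    return sorted(Path(slots_dir).glob("*/*.vdi"))


if __name__ == "__main__":
    # Pass the path of your BOINC slots directory as the first argument;
    # the current directory is used if none is given.
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    for path in find_leftover_vdi(target):
        print(path)
```

Anything it prints while no ATLAS task is running would be a candidate for the cleanup described above.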

Profile HerveUAE
Joined: 18 Dec 16
Posts: 120
Credit: 6,749,027
RAC: 20,218
Message 31271 - Posted: 4 Jul 2017, 2:45:14 UTC - in response to Message 31264.

Can you try a reboot to see if it helps?


I rebooted the PC and that solved the issue. It is not the first time this has occurred on that particular computer, and I don't remember seeing a G4 error on any of my other 2 computers. Strange...

You may try a project reset and delete remaining trash (like old vdi files) from the slots directory.
Then restart your host and request a fresh WU.

Next time it occurs, I will try that suggestion instead of rebooting the computer.

I don't think this is coming from 2 tasks running. The output is coming from the same procID 19104,
so I think this is a minor wrapper problem filtering Guest.log data needed for the stderr.txt from VBox.log.

Several times in the past I have seen stderr.txt logs with lines repeated multiple times and/or interleaved. I also do not think this is linked to a task failure.
____________
We are the product of random evolution.
