Message boards : LHCb Application : 194 EXIT_ABORTED_BY_CLIENT reason

Luigi R.
Joined: 7 Feb 14
Posts: 99
Credit: 5,027,000
RAC: 0
Message 30671 - Posted: 6 Jun 2017, 19:37:38 UTC
Last modified: 6 Jun 2017, 19:39:36 UTC

I got this error in an almost completed workunit. Can you explain why it occurred? Did LHC@Home lose all the work, or were inner jobs uploaded while the workunit was being processed?

Log:
2017-06-06 17:30:38 (3863): Starting VM. (boinc_8769000619169c94, slot#0)
2017-06-06 17:30:39 (3863): Successfully started VM. (PID = '3980')
2017-06-06 17:30:39 (3863): Reporting VM Process ID to BOINC.
2017-06-06 17:30:39 (3863): VM state change detected. (old = 'poweroff', new = 'running')
2017-06-06 17:30:39 (3863): Detected: Web Application Enabled (http://localhost:48223)
2017-06-06 17:30:39 (3863): Detected: Remote Desktop Enabled (localhost:60736)
2017-06-06 17:30:39 (3863): Status Report: Job Duration: '64800.000000'
2017-06-06 17:30:39 (3863): Status Report: Elapsed Time: '61742.436630'
2017-06-06 17:30:39 (3863): Status Report: CPU Time: '41911.440000'
2017-06-06 17:30:39 (3863): Preference change detected
2017-06-06 17:30:39 (3863): Setting CPU throttle for VM. (100%)
2017-06-06 17:30:39 (3863): Setting network throttle for VM. (30KB)
2017-06-06 17:30:39 (3863): Setting checkpoint interval to 600 seconds. (Higher value of (Preference: 300 seconds) or (Vbox_job.xml: 600 seconds))
2017-06-06 18:00:21 (3863): VM Heartbeat file specified, but missing heartbeat.
2017-06-06 18:00:21 (3863): Capturing screenshot.
2017-06-06 18:00:23 (3863): Screenshot completed.
2017-06-06 18:00:23 (3863): Powering off VM.
2017-06-06 18:00:23 (3863): Successfully stopped VM.
2017-06-06 18:00:23 (3863): Deregistering VM. (boinc_8769000619169c94, slot#0)
2017-06-06 18:00:23 (3863): Removing network bandwidth throttle group from VM.
2017-06-06 18:00:23 (3863): Removing storage controller(s) from VM.
2017-06-06 18:00:23 (3863): Removing VM from VirtualBox.
2017-06-06 18:00:23 (3863): Removing virtual disk drive from VirtualBox.
2017-06-06 18:00:28 (3863): Failed to open screenshot image file. (project_preferen)

Link: https://lhcathome.cern.ch/lhcathome/result.php?resultid=144238423
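For reference, the figures in the log above can be checked with a few lines of Python. This is only a sketch using values copied from the log; the variable names are made up for illustration. It shows that the task had reached roughly 95% of its job duration at the last restart, and that just under 30 minutes passed between the VM start and the missing-heartbeat message.

```python
from datetime import datetime

# Values copied from the "Status Report" lines in the log above.
job_duration_s = 64800.000000
elapsed_s = 61742.436630

# Timestamps of "Successfully started VM." and the missing-heartbeat line.
fmt = "%Y-%m-%d %H:%M:%S"
vm_started = datetime.strptime("2017-06-06 17:30:39", fmt)
heartbeat_missed = datetime.strptime("2017-06-06 18:00:21", fmt)

# Share of the job duration already used at the last restart.
fraction_done = elapsed_s / job_duration_s

# Seconds between the VM start and the missing-heartbeat report.
gap_s = (heartbeat_missed - vm_started).total_seconds()

print(f"Elapsed / job duration: {fraction_done:.1%}")
print(f"VM start to missing heartbeat: {gap_s:.0f} s")
```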
Toby Broom
Volunteer moderator

Joined: 27 Sep 08
Posts: 598
Credit: 376,678,338
RAC: 54,958
Message 30674 - Posted: 6 Jun 2017, 19:48:57 UTC

The heartbeat file was missing, so the job quit.

It looks like you did the work fine but just got no credit.
Erich56

Joined: 18 Dec 15
Posts: 1291
Credit: 23,324,631
RAC: 4,365
Message 33762 - Posted: 10 Jan 2018, 6:35:07 UTC

Last night, an LHCb task failed after 13 hours, which is annoying.

Exit status 194 (0x000000C2) EXIT_ABORTED_BY_CLIENT

Apparently there was no heartbeat, whatever that means:

2018-01-10 03:41:49 (864): VM Heartbeat file specified, but missing heartbeat.
2018-01-10 03:41:49 (864): Powering off VM.


The task is:
https://lhcathome.cern.ch/lhcathome/result.php?resultid=172240263

Can anyone explain why this happened?
Laurence
Project administrator
Project developer

Joined: 20 Jun 14
Posts: 337
Credit: 237,918
RAC: 0
Message 33765 - Posted: 10 Jan 2018, 9:05:42 UTC - in response to Message 33762.  

It means that something went wrong with the VM. The heartbeat mechanism is there to protect against situations where the VM may freeze. Usually this is during the boot process but occasionally it can happen later.
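To illustrate the general idea of a heartbeat check, here is a minimal Python sketch. It is not the wrapper's actual code: the function name, the file path, and the timeout are hypothetical, and it assumes the guest refreshes a file periodically while the host watches its modification time. If the file is missing or has not been touched within the allowed window, the VM is treated as frozen.

```python
import os
import time


def heartbeat_is_stale(path, max_age_s, now=None):
    """Return True if the heartbeat file is missing or older than max_age_s.

    Hypothetical helper for illustration only; `path` and `max_age_s`
    are assumptions, not settings taken from the project.
    """
    now = time.time() if now is None else now
    try:
        # Time of the last write to the heartbeat file.
        last_beat = os.stat(path).st_mtime
    except FileNotFoundError:
        # No heartbeat file at all counts as a missed heartbeat.
        return True
    return (now - last_beat) > max_age_s
```

A watcher would call something like `heartbeat_is_stale("heartbeat", 1200)` at regular intervals and power off the VM once it returns `True`, which matches the sequence seen in the logs above: a missing heartbeat followed by "Powering off VM."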



©2020 CERN