1) Message boards : Theory Application : Error: process exited with code 194 (0xc2, -62) (Message 38025)
Posted 17 Feb 2019 by Trotador
Post:
[quote]

.......

I would install the 5.2.26 version update and the Ext. Pack and reboot and try again and they will probably work.


I followed your advice and so far so good, no new 194 errors.

Thanks
[/quote]

Nah, the errors continue to happen.
2) Message boards : Theory Application : Error: process exited with code 194 (0xc2, -62) (Message 38013)
Posted 14 Feb 2019 by Trotador
Post:
[quote]

.......

I would install the 5.2.26 version update and the Ext. Pack and reboot and try again and they will probably work.
[/quote]

I followed your advice and so far so good, no new 194 errors.

Thanks
3) Message boards : Theory Application : Error: process exited with code 194 (0xc2, -62) (Message 37965)
Posted 9 Feb 2019 by Trotador
Post:
Hi,

Yes, I've aborted tasks, but the ones I'm referring to are the tasks that finished as "Error while Computing" with the error in the thread title.

I've included the link to the most affected computer.

thanks!
4) Message boards : Theory Application : Error: process exited with code 194 (0xc2, -62) (Message 37957)
Posted 9 Feb 2019 by Trotador
Post:
Hi, many units are failing with this error message on Ubuntu 18.01 with VirtualBox (5.2.18r124319). I've seen it on other Linux hosts as well, but I did not find a discussion of it in the forums; sorry if I've missed it. Any hints? The failures usually happen almost at the end of the process, so many processing hours are wasted. Often all running units crash at the same time with this error.

https://lhcathome.cern.ch/lhcathome/results.php?hostid=10564024&offset=0&show_names=0&state=6&appid=


thanks!
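For reference, the three numbers in the thread title appear to be the same exit status written three ways: 194 in decimal, 0xc2 in hexadecimal, and -62 when the same byte is read as a signed 8-bit value. A minimal sketch of that conversion follows; the function name is illustrative and not part of BOINC.

```python
def as_signed_byte(code: int) -> int:
    """Interpret the low 8 bits of an exit code as a signed value."""
    code &= 0xFF                      # keep only the lowest byte
    return code - 256 if code >= 128 else code


exit_code = 194
print(exit_code, hex(exit_code), as_signed_byte(exit_code))
```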
5) Message boards : ATLAS application : Computing error (Message 36805)
Posted 21 Sep 2018 by Trotador
Post:
Yes, that calculation is what I did, but errors occurred anyway, apparently at random but constantly.

Analyzing the VM data in VirtualBox, I find two different sizes for the VMs' base memory: 5000 MB and 10200 MB. The latter are the ones failing. I reduced to 5 WUs per host and have seen no errors so far.

I do not see any difference in the WU name construction that would allow identifying them, and I cannot rule out that it is something on my end.
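The kind of memory calculation discussed here can be sketched roughly as below. It is a naive budget only: it assumes every running VM reserves its full base memory, that the 96 GB host mentioned in this thread means 96 x 1024 MB, and it ignores whatever the host OS and VirtualBox themselves need. The function names are illustrative.

```python
def total_vm_memory_mb(num_vms: int, base_memory_mb: int) -> int:
    """Memory reserved if every VM holds its full base memory."""
    return num_vms * base_memory_mb


def fits_in_host(num_vms: int, base_memory_mb: int, host_ram_mb: int) -> bool:
    """True if the naive total does not exceed the host RAM."""
    return total_vm_memory_mb(num_vms, base_memory_mb) <= host_ram_mb


HOST_RAM_MB = 96 * 1024  # assumption: 96 GB counted as 96 x 1024 MB

# (number of WUs, VM base memory in MB) combinations mentioned in the thread
for num_vms, size_mb in [(6, 10200), (8, 10200), (5, 5000)]:
    total = total_vm_memory_mb(num_vms, size_mb)
    print(num_vms, size_mb, total, fits_in_host(num_vms, size_mb, HOST_RAM_MB))
```

By this simple count even 8 VMs at 10200 MB stay below 96 GB, which is consistent with the observation that the calculation alone does not explain the failures.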
6) Message boards : ATLAS application : Computing error (Message 36797)
Posted 21 Sep 2018 by Trotador
Post:
I've been getting this error on one of my hosts in many WUs lately, but not in all of them, maybe since the beginning of this week (after the outage?). Before that, everything seemed to go more or less OK. I'm crunching 6 WUs with 8 cores each, and the host has 96 GB of RAM. Is it a lack of RAM? It was running 8 WUs OK before. On other hosts I also get this error, but in very few WUs.

Any idea?

https://lhcathome.cern.ch/lhcathome/results.php?hostid=10564024
https://lhcathome.cern.ch/lhcathome/results.php?hostid=10558422

<core_client_version>7.6.31</core_client_version>
<![CDATA[
<stderr_txt>
2018-09-20 23:45:46 (54265): vboxwrapper (7.7.26196): starting
2018-09-20 23:45:47 (54265): Feature: Checkpoint interval offset (466 seconds)
2018-09-20 23:45:47 (54265): Detected: VirtualBox VboxManage Interface (Version: 5.2.18)
2018-09-20 23:45:47 (54265): Detected: Minimum checkpoint interval (900.000000 seconds)
2018-09-20 23:45:47 (54265): Successfully copied 'init_data.xml' to the shared directory.
2018-09-20 23:45:47 (54265): Create VM. (boinc_b70d56aa415278d2, slot#6)
2018-09-20 23:45:53 (54265): Setting Memory Size for VM. (10200MB)
2018-09-20 23:45:54 (54265): Setting CPU Count for VM. (8)
2018-09-20 23:45:54 (54265): Setting Chipset Options for VM.
2018-09-20 23:45:54 (54265): Setting Boot Options for VM.
2018-09-20 23:45:54 (54265): Setting Network Configuration for NAT.
2018-09-20 23:45:54 (54265): Enabling VM Network Access.
2018-09-20 23:45:54 (54265): Disabling USB Support for VM.
2018-09-20 23:45:54 (54265): Disabling COM Port Support for VM.
2018-09-20 23:45:55 (54265): Disabling LPT Port Support for VM.
2018-09-20 23:45:55 (54265): Disabling Audio Support for VM.
2018-09-20 23:45:55 (54265): Disabling Clipboard Support for VM.
2018-09-20 23:45:55 (54265): Disabling Drag and Drop Support for VM.
2018-09-20 23:45:55 (54265): Adding storage controller(s) to VM.
2018-09-20 23:45:55 (54265): Adding virtual disk drive to VM. (vm_image.vdi)
2018-09-20 23:45:55 (54265): Adding VirtualBox Guest Additions to VM.
2018-09-20 23:45:55 (54265): Adding network bandwidth throttle group to VM. (Defaulting to 1024GB)
2018-09-20 23:45:56 (54265): forwarding host port 57873 to guest port 80
2018-09-20 23:45:56 (54265): Enabling remote desktop for VM.
2018-09-20 23:45:56 (54265): Required extension pack not installed, remote desktop not enabled.
2018-09-20 23:45:56 (54265): Enabling shared directory for VM.
2018-09-20 23:45:56 (54265): Starting VM. (boinc_b70d56aa415278d2, slot#6)
2018-09-20 23:45:59 (54265): Successfully started VM. (PID = '56560')
2018-09-20 23:45:59 (54265): Reporting VM Process ID to BOINC.
2018-09-20 23:46:07 (54265): Guest Log: BIOS: VirtualBox 5.2.18
2018-09-20 23:46:07 (54265): Guest Log: CPUID EDX: 0x178bfbff
2018-09-20 23:46:07 (54265): Guest Log: BIOS: ata0-0: PCHS=16383/16/63 LCHS=1024/255/63
2018-09-20 23:46:07 (54265): Guest Log: BIOS: Boot : bseqnr=1, bootseq=0032
2018-09-20 23:46:07 (54265): Guest Log: BIOS: Booting from Hard Disk...
2018-09-20 23:46:07 (54265): Guest Log: BIOS: KBD: unsupported int 16h function 03
2018-09-20 23:46:07 (54265): Guest Log: BIOS: AX=0305 BX=0000 CX=0000 DX=0000
2018-09-20 23:46:07 (54265): VM state change detected. (old = 'poweroff', new = 'running')
2018-09-20 23:46:07 (54265): Detected: Web Application Enabled (http://localhost:57873)
2018-09-20 23:46:07 (54265): Preference change detected
2018-09-20 23:46:07 (54265): Setting CPU throttle for VM. (100%)
2018-09-20 23:46:12 (54265): Setting checkpoint interval to 900 seconds. (Higher value of (Preference: 600 seconds) or (Vbox_job.xml: 900 seconds))
2018-09-20 23:47:22 (54265): Guest Log: vboxguest: major 0, IRQ 20, I/O port d020, MMIO at 00000000f0400000 (size 0x400000)
2018-09-20 23:47:34 (54265): Guest Log: VBoxGuest: VBoxGuestCommonGuestCapsAcquire: pSession(0xffff88028dcbac10), OR(0x0), NOT(0xffffffff), flags(0x0)
2018-09-20 23:47:34 (54265): Guest Log: VBoxGuest: VBoxGuestCommonGuestCapsAcquire: pSession(0xffff880287253610), OR(0x0), NOT(0xffffffff), flags(0x0)
2018-09-20 23:47:34 (54265): Guest Log: VBoxGuest: VBoxGuestCommonGuestCapsAcquire: pSession(0xffff88028dcba810), OR(0x0), NOT(0xffffffff), flags(0x0)
2018-09-20 23:47:34 (54265): Guest Log: VBoxGuest: VBoxGuestCommonGuestCapsAcquire: pSession(0xffff880287276810), OR(0x0), NOT(0xffffffff), flags(0x0)
2018-09-20 23:48:15 (54265): Guest Log: Copying input files into RunAtlas.
2018-09-20 23:48:19 (54265): Guest Log: Copied input files into RunAtlas.
2018-09-20 23:48:28 (54265): Guest Log: copied the webapp to /var/www
2018-09-20 23:48:28 (54265): Guest Log: This vm does not need to setup http proxy
2018-09-20 23:48:28 (54265): Guest Log: ATHENA_PROC_NUMBER=8
2018-09-20 23:48:29 (54265): Guest Log: Starting ATLAS job. (PandaID=4064500070 taskID=15385155)
2018-09-20 23:59:05 (54265): Preference change detected
2018-09-20 23:59:05 (54265): Setting CPU throttle for VM. (100%)
2018-09-20 23:59:05 (54265): Setting checkpoint interval to 900 seconds. (Higher value of (Preference: 600 seconds) or (Vbox_job.xml: 900 seconds))
2018-09-21 00:05:13 (54265): Preference change detected
2018-09-21 00:05:13 (54265): Setting CPU throttle for VM. (100%)
2018-09-21 00:05:13 (54265): Setting checkpoint interval to 900 seconds. (Higher value of (Preference: 600 seconds) or (Vbox_job.xml: 900 seconds))
2018-09-21 00:27:04 (54265): Preference change detected
2018-09-21 00:27:04 (54265): Setting CPU throttle for VM. (100%)
2018-09-21 00:27:04 (54265): Setting checkpoint interval to 900 seconds. (Higher value of (Preference: 600 seconds) or (Vbox_job.xml: 900 seconds))
2018-09-21 01:25:18 (54265): Status Report: Elapsed Time: '6000.363421'
2018-09-21 01:25:18 (54265): Status Report: CPU Time: '41699.990000'
2018-09-21 02:36:24 (54265): VM is no longer is a running state. It is in 'poweroff'.
2018-09-21 02:36:24 (54265): VM state change detected. (old = 'running', new = 'poweroff')
2018-09-21 02:36:24 (54265): Powering off VM.
2018-09-21 02:36:24 (54265): Deregistering VM. (boinc_b70d56aa415278d2, slot#6)
2018-09-21 02:36:24 (54265): Removing network bandwidth throttle group from VM.
2018-09-21 02:36:24 (54265): Removing storage controller(s) from VM.
2018-09-21 02:36:24 (54265): Removing VM from VirtualBox.
2018-09-21 02:36:25 (54265): Removing virtual disk drive from VirtualBox.
2018-09-21 02:36:30 (54265): Virtual machine exited.
02:36:30 (54265): called boinc_finish(0)

</stderr_txt>
<message>
upload failure: <file_xfer_error>
<file_name>O47KDms37MtnlyackoJh5iwnABFKDmABFKDmTdiMDmABFKDmXHHw9m_2_r254113286_ATLAS_result</file_name>
<error_code>-161 (not found)</error_code>
</file_xfer_error>

</message>
]]>
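To compare VM sizes across many results, the memory value can be pulled out of the stderr text instead of being read by eye. The sketch below relies only on the "Setting Memory Size for VM. (10200MB)" line format visible in the log above; the function name is illustrative, and other vboxwrapper versions might word the line differently.

```python
import re

# Matches the vboxwrapper line shown in the log, e.g.
# "... Setting Memory Size for VM. (10200MB)"
MEMORY_LINE = re.compile(r"Setting Memory Size for VM\. \((\d+)MB\)")


def vm_memory_mb(stderr_text: str):
    """Return the VM memory size in MB found in the log, or None if absent."""
    match = MEMORY_LINE.search(stderr_text)
    return int(match.group(1)) if match else None


sample = "2018-09-20 23:45:53 (54265): Setting Memory Size for VM. (10200MB)"
print(vm_memory_mb(sample))
```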
7) Message boards : Sixtrack Application : Sixtrack tasks with name starting with Jone0_sep_jta_bbo are crashing on my system (Message 32454)
Posted 17 Sep 2017 by Trotador
Post:
+1
8) Message boards : Sixtrack Application : Max quantity of WUs = 20? (Message 32419)
Posted 14 Sep 2017 by Trotador
Post:
Hi

If I set Max # jobs to unlimited on my preferences page, I cannot download more than 20 WUs; if I set it to 24 (the maximum value available), I can download 24 units.

Why the 20-unit limit with the unlimited setting? Is it on purpose or an error? My hosts can simultaneously crunch many more than 24 units.
9) Message boards : News : No RESULTS accepted from Linux Kernel 4.8.* (Message 31393)
Posted 14 Jul 2017 by Trotador
Post:
I also have a host with 4.8 without a single errored WU.

So I think you have to look harder.
10) Message boards : Sixtrack Application : exceeded elapsed time limit 30940.80 (180000000.00G/5817.56G) (Message 30378)
Posted 17 May 2017 by Trotador
Post:
The trick was to shut down boincmanager before making changes to client_state.xml. Done that way, changing either the application flops or the WU rsc_fpops_bound works like a charm.

Thanks!
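The edit described here could be scripted along the following lines. This is only a sketch on an in-memory sample: the sample mimics just the <workunit> and <rsc_fpops_bound> elements named in this thread, the workunit name is made up, and the bound value assumes the "180000000.00G" in the error message means 1.8e17. A real client_state.xml contains far more, and, as noted above, changes only stuck once BOINC was shut down first.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed stand-in for client_state.xml.
SAMPLE = """<client_state>
<workunit>
    <name>example_wu</name>
    <rsc_fpops_bound>180000000000000000.000000</rsc_fpops_bound>
</workunit>
</client_state>"""


def scale_fpops_bound(xml_text: str, factor: float = 10.0) -> str:
    """Multiply every workunit's rsc_fpops_bound by `factor`."""
    root = ET.fromstring(xml_text)
    for workunit in root.iter("workunit"):
        bound = workunit.find("rsc_fpops_bound")
        if bound is not None and bound.text:
            bound.text = f"{float(bound.text) * factor:.6f}"
    return ET.tostring(root, encoding="unicode")


print(scale_fpops_bound(SAMPLE))
```

Working on a copy of the file and keeping a backup would be the cautious way to try something like this.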
11) Message boards : Sixtrack Application : exceeded elapsed time limit 30940.80 (180000000.00G/5817.56G) (Message 30354)
Posted 15 May 2017 by Trotador
Post:
[quote]
It is the sse2 application. I've changed the value of flops in the client_state.xml file, but it reverts to the previous figure of 5817 GFLOPS...

Any idea?

Probably all tasks downloaded before you changed to the lower fpops will still have the higher fpops in the workunit settings.

If you are already hacking the client_state.xml, you could increase the <rsc_fpops_bound> for those workunits by a factor of 10.
[/quote]

Thanks for the suggestion, but it also reverts after I change it.
12) Message boards : Sixtrack Application : exceeded elapsed time limit 30940.80 (180000000.00G/5817.56G) (Message 30350)
Posted 15 May 2017 by Trotador
Post:
It is the sse2 application. I've changed the value of flops in the client_state.xml file, but it reverts to the previous figure of 5817 GFLOPS...

Any idea?
13) Message boards : Sixtrack Application : exceeded elapsed time limit 30940.80 (180000000.00G/5817.56G) (Message 30349)
Posted 15 May 2017 by Trotador
Post:
And, is there a way to identify which tasks have this problem? I see in my log successful and errored tasks downloaded at the same time
14) Message boards : Sixtrack Application : exceeded elapsed time limit 30940.80 (180000000.00G/5817.56G) (Message 30348)
Posted 15 May 2017 by Trotador
Post:

[quote]
exceeded elapsed time limit 30940.80 (180000000.00G/5817.56G)

This floating point speed was reported much too high.
Therefore the client has calculated a much shorter time to finish.

Meanwhile that machine is reporting a measured floating point speed of 1957.67 million ops/second.
If that value was used when requesting tasks, you would have 91946.04 seconds to finish a task.
[/quote]

How could that happen?
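The arithmetic behind the two time limits in the quoted reply can be reproduced with a simple division of the operations bound by the speed. The sketch below takes the numbers exactly as printed in the message and the reply; treating 1957.67 on the same scale as 5817.56 is an assumption that follows how the quoted figures appear to have been computed, and the names are illustrative.

```python
def elapsed_limit_seconds(fpops_bound: float, speed: float) -> float:
    """Time limit as the operations bound divided by the assumed speed."""
    return fpops_bound / speed


FPOPS_BOUND = 180000000.00   # first figure inside the parentheses of the message
REPORTED_SPEED = 5817.56     # second figure inside the parentheses
MEASURED_SPEED = 1957.67     # measured speed from the reply, same scale assumed

print(f"{elapsed_limit_seconds(FPOPS_BOUND, REPORTED_SPEED):.2f}")
print(f"{elapsed_limit_seconds(FPOPS_BOUND, MEASURED_SPEED):.2f}")
```

The first result lands within a hundredth of a second of the 30940.80 in the error message (the printed speed is itself rounded), and the second matches the 91946.04 seconds given in the reply, so the limit roughly triples when the lower speed is used.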
15) Message boards : Sixtrack Application : exceeded elapsed time limit 30940.80 (180000000.00G/5817.56G) (Message 30344)
Posted 15 May 2017 by Trotador
Post:
Same issue; suspending LHC on that host.
16) Message boards : Sixtrack Application : exceeded elapsed time limit 30940.80 (180000000.00G/5817.56G) (Message 30343)
Posted 15 May 2017 by Trotador
Post:
It continues happening on that host; suspending similar tasks. Let's see what happens with the wzero_ ones.
17) Message boards : Sixtrack Application : exceeded elapsed time limit 30940.80 (180000000.00G/5817.56G) (Message 30336)
Posted 14 May 2017 by Trotador
Post:
Many units are failing with this message, and they do in fact correspond to units that exceed the indicated processing time. They stop and abort themselves at that moment.

Is it a programming error? There are other units of the same type finishing OK with processing times beyond 30940 seconds.

https://lhcathome.cern.ch/lhcathome/result.php?resultid=139382571
<core_client_version>7.6.31</core_client_version>
<![CDATA[
<message>
exceeded elapsed time limit 30940.80 (180000000.00G/5817.56G)
</message>
<stderr_txt>

</stderr_txt>
]]>


