Message boards : ATLAS application : ATLAS tasks not using CPU but still in process

greg_be

Joined: 28 Dec 08
Posts: 318
Credit: 4,148,677
RAC: 2,010
Message 40719 - Posted: 29 Nov 2019, 10:32:22 UTC
Last modified: 29 Nov 2019, 10:48:35 UTC

I don't understand what is going on.

Numerous ATLAS tasks use a few seconds of CPU time (according to BoincTasks) and then go off and run on their own, making progress of only 0.0050% every 2 seconds. I have had 2 tasks running for 24 hours now and they are still not finished.

How can a task run and not use CPU time?
This is not logical at all.

It appears to be a problem with the VM.
So why doesn't BOINC or the VM stop the tasks?
-------------------------
Note: Repaired the VM app and reinstalled the Extension Pack.

---------------------------------------

Application: ATLAS Simulation 2.00 (vbox64_mt_mcore_atlas)
Name: yDuNDmdbssvn9Rq4apoT9bVoABFKDmABFKDmF0QRDmABFKDmJR3Qum
State: Running
Received: 11/23/2019 11:27:32 PM
Report deadline: 11/30/2019 11:27:32 PM
Resources: 8 CPUs
Estimated computation size: 43,200 GFLOPs
CPU time: 00:00:12
CPU time since checkpoint: ---
Elapsed time: 21:18:23
Estimated time remaining: 00:27:41
Fraction done: 97.879%
Virtual memory size: 119.32 MB
Working set size: 9.96 GB
Directory: slots/16
Process ID: 22516
Progress rate: 4.680% per hour <-- 21 hours normally. It's been more than that.
Executable: vboxwrapper_26198ab7_windows_x86_64.exe



Application: ATLAS Simulation 2.00 (vbox64_mt_mcore_atlas)
Name: Z7pKDmsGssvn9Rq4apoT9bVoABFKDmABFKDmGtCRDmABFKDmYTBRUo
State: Running
Received: 11/23/2019 11:27:32 PM
Report deadline: 11/30/2019 11:27:32 PM
Resources: 8 CPUs
Estimated computation size: 43,200 GFLOPs
CPU time: 00:00:06
CPU time since checkpoint: ---
Elapsed time: 19:24:49
Estimated time remaining: 00:35:51
Fraction done: 97.013%
Virtual memory size: 119.61 MB
Working set size: 9.96 GB
Directory: slots/18
Process ID: 22804
Progress rate: 5.040% per hour <-- 19 hrs normally
Executable: vboxwrapper_26198ab7_windows_x86_64.exe


---------------

2019-11-23 23:34:10 (33384): Detected: vboxwrapper 26197
2019-11-23 23:34:10 (33384): Detected: BOINC client v7.7
2019-11-23 23:34:10 (33384): Detected: VirtualBox VboxManage Interface (Version: 6.0.14)
2019-11-23 23:34:11 (33384): Successfully copied 'init_data.xml' to the shared directory.
2019-11-23 23:34:11 (33384): Create VM. (boinc_40528c2fb1dac14d, slot#16)
2019-11-23 23:34:12 (33384): Setting Memory Size for VM. (10200MB)
2019-11-23 23:34:12 (33384): Setting CPU Count for VM. (8)
2019-11-23 23:34:12 (33384): Setting Chipset Options for VM.
2019-11-23 23:34:13 (33384): Setting Boot Options for VM.
2019-11-23 23:34:13 (33384): Setting Network Configuration for NAT.
2019-11-23 23:34:13 (33384): Enabling VM Network Access.
2019-11-23 23:34:14 (33384): Disabling USB Support for VM.
2019-11-23 23:34:14 (33384): Disabling COM Port Support for VM.
2019-11-23 23:34:14 (33384): Disabling LPT Port Support for VM.
2019-11-23 23:34:14 (33384): Disabling Audio Support for VM.
2019-11-23 23:34:15 (33384): Disabling Clipboard Support for VM.
2019-11-23 23:34:15 (33384): Disabling Drag and Drop Support for VM.
2019-11-23 23:34:15 (33384): Adding storage controller(s) to VM.
2019-11-23 23:34:16 (33384): Adding virtual disk drive to VM. (vm_image.vdi)
2019-11-23 23:34:48 (33384): Error in storage attach (fixed disk) for VM: -2135228409
Command:
VBoxManage -q storageattach "boinc_40528c2fb1dac14d" --storagectl "Hard Disk Controller" --port 0 --device 0 --type hdd --setuuid "" --medium "C:\boinc data\slots\16/vm_image.vdi"
Output:
VBoxManage.exe: error: Medium 'C:\boinc data\slots\16\vm_image.vdi' is not accessible. UUID {5e2342bb-76a4-44ff-81f6-2f3283cde68f} of the medium 'C:\boinc data\slots\16\vm_image.vdi' does not match the value {035ea719-2a3c-4c14-a422-e8e8bb07d41a} stored in the media registry ('C:\Users\Greg\.VirtualBox\VirtualBox.xml')
VBoxManage.exe: error: Details: code VBOX_E_INVALID_OBJECT_STATE (0x80bb0007), component MediumWrap, interface IMedium, callee IUnknown
VBoxManage.exe: error: Context: "SetIds(fSetNewUuid, bstrNewUuid.raw(), fSetNewParentUuid, bstrNewParentUuid.raw())" at line 694 of file VBoxManageStorageController.cpp
VBoxManage.exe: error: Failed to set the medium/parent medium UUID

Notes:

Another VirtualBox management application has locked the session for
this VM. BOINC cannot properly monitor this VM
and so this job will be aborted.


2019-11-23 23:34:48 (33384): Could not create VM
2019-11-23 23:34:48 (33384): ERROR: VM failed to start
2019-11-23 23:34:53 (33384):
NOTE: VM session lock error encountered.
BOINC will be notified that it needs to clean up the environment.
This might be a temporary problem and so this job will be rescheduled for another time.

2019-11-24 23:44:00 (19876): Detected: vboxwrapper 26197
2019-11-24 23:44:00 (19876): Detected: BOINC client v7.7
2019-11-24 23:44:01 (19876): Detected: VirtualBox VboxManage Interface (Version: 6.0.14)
2019-11-24 23:44:02 (19876): Starting VM using VBoxManage interface. (boinc_40528c2fb1dac14d, slot#16)
2019-11-24 23:44:06 (19876): Successfully started VM. (PID = '20424')
2019-11-24 23:44:06 (19876): Reporting VM Process ID to BOINC.
2019-11-24 23:44:06 (19876): Guest Log: BIOS: VirtualBox 6.0.14

2019-11-24 23:44:06 (19876): Guest Log: CPUID EDX: 0x178bfbff

2019-11-24 23:44:06 (19876): VM state change detected. (old = 'PoweredOff', new = 'Running')
2019-11-24 23:44:07 (19876): Preference change detected
2019-11-24 23:44:07 (19876): Setting CPU throttle for VM. (100%)
2019-11-24 23:44:07 (19876): Setting checkpoint interval to 900 seconds. (Higher value of (Preference: 180 seconds) or (Vbox_job.xml: 900 seconds))
2019-11-24 23:44:09 (19876): Guest Log: BIOS: Boot : bseqnr=1, bootseq=0032

2019-11-24 23:44:09 (19876): Guest Log: int13_harddisk: function 02, unmapped device for ELDL=80

2019-11-24 23:44:09 (19876): Guest Log: BIOS: Boot from Hard Disk 0 failed

2019-11-24 23:44:09 (19876): Guest Log: BIOS: Boot : bseqnr=2, bootseq=0003

2019-11-24 23:44:09 (19876): Guest Log: BIOS: CDROM boot failure code : 0002

2019-11-24 23:44:09 (19876): Guest Log: BIOS: Boot from CD-ROM failed

2019-11-24 23:44:09 (19876): Guest Log: Could not read from the boot medium! System halted.

2019-11-25 01:26:16 (19876): Status Report: Elapsed Time: '6000.958063'
2019-11-25 01:26:16 (19876): Status Report: CPU Time: '1.953125'
2019-11-25 01:41:35 (19876): Stopping VM.
2019-11-25 01:41:35 (19876): Error in stop VM for VM: -108
Command:
VBoxManage -q controlvm "boinc_40528c2fb1dac14d" savestate
Output:

2019-11-25 01:41:35 (19876): VM did not stop when requested.
2019-11-25 01:41:35 (19876): VM was successfully terminated.
2019-11-28 12:18:08 (27484): Detected: vboxwrapper 26197
2019-11-28 12:18:08 (27484): Detected: BOINC client v7.7
2019-11-28 12:18:09 (27484): Detected: VirtualBox VboxManage Interface (Version: 6.0.14)
2019-11-28 12:18:09 (27484): Starting VM using VBoxManage interface. (boinc_40528c2fb1dac14d, slot#16)
2019-11-28 12:18:14 (27484): Successfully started VM. (PID = '4048')
2019-11-28 12:18:14 (27484): Reporting VM Process ID to BOINC.
2019-11-28 12:18:14 (27484): Guest Log: BIOS: VirtualBox 6.0.14

2019-11-28 12:18:14 (27484): Guest Log: CPUID EDX: 0x178bfbff

2019-11-28 12:18:14 (27484): VM state change detected. (old = 'PoweredOff', new = 'Running')
2019-11-28 12:18:14 (27484): Status Report: Elapsed Time: '6901.958063'
2019-11-28 12:18:14 (27484): Status Report: CPU Time: '1.953125'
2019-11-28 12:18:14 (27484): Preference change detected
2019-11-28 12:18:14 (27484): Setting CPU throttle for VM. (100%)
2019-11-28 12:18:14 (27484): Setting checkpoint interval to 900 seconds. (Higher value of (Preference: 180 seconds) or (Vbox_job.xml: 900 seconds))
2019-11-28 12:18:16 (27484): Guest Log: BIOS: Boot : bseqnr=1, bootseq=0032

2019-11-28 12:18:16 (27484): Guest Log: int13_harddisk: function 02, unmapped device for ELDL=80

2019-11-28 12:18:16 (27484): Guest Log: BIOS: Boot from Hard Disk 0 failed

2019-11-28 12:18:16 (27484): Guest Log: BIOS: Boot : bseqnr=2, bootseq=0003

2019-11-28 12:18:16 (27484): Guest Log: BIOS: CDROM boot failure code : 0002

2019-11-28 12:18:16 (27484): Guest Log: BIOS: Boot from CD-ROM failed

2019-11-28 12:18:16 (27484): Guest Log: Could not read from the boot medium! System halted.

2019-11-28 13:58:34 (27484): Status Report: Elapsed Time: '12901.958063'
2019-11-28 13:58:34 (27484): Status Report: CPU Time: '3.859375'
2019-11-28 14:02:40 (27484): VM state change detected. (old = 'Running', new = 'Paused')
2019-11-28 14:02:44 (27484): Stopping VM.
2019-11-28 14:06:55 (36452): Detected: vboxwrapper 26197
2019-11-28 14:06:55 (36452): Detected: BOINC client v7.7
2019-11-28 14:06:56 (36452): Detected: VirtualBox VboxManage Interface (Version: 6.0.14)
2019-11-28 14:06:57 (36452): Starting VM using VBoxManage interface. (boinc_40528c2fb1dac14d, slot#16)
2019-11-28 14:07:01 (36452): Successfully started VM. (PID = '5348')
2019-11-28 14:07:01 (36452): Reporting VM Process ID to BOINC.
2019-11-28 14:07:01 (36452): VM state change detected. (old = 'PoweredOff', new = 'Running')
2019-11-28 14:07:01 (36452): Status Report: Elapsed Time: '13144.958063'
2019-11-28 14:07:01 (36452): Status Report: CPU Time: '3.859375'
2019-11-28 14:07:01 (36452): Preference change detected
2019-11-28 14:07:01 (36452): Setting CPU throttle for VM. (100%)
2019-11-28 14:07:02 (36452): Setting checkpoint interval to 900 seconds. (Higher value of (Preference: 180 seconds) or (Vbox_job.xml: 900 seconds))
2019-11-28 14:17:04 (36452): VM state change detected. (old = 'Running', new = 'Paused')
2019-11-28 14:17:08 (36452): Stopping VM.
2019-11-28 14:19:06 (16188): Detected: vboxwrapper 26197
2019-11-28 14:19:06 (16188): Detected: BOINC client v7.7
2019-11-28 14:19:07 (16188): Detected: VirtualBox VboxManage Interface (Version: 6.0.14)
2019-11-28 14:19:08 (16188): Starting VM using VBoxManage interface. (boinc_40528c2fb1dac14d, slot#16)
2019-11-28 14:19:13 (16188): Successfully started VM. (PID = '16252')
2019-11-28 14:19:13 (16188): Reporting VM Process ID to BOINC.
2019-11-28 14:19:13 (16188): VM state change detected. (old = 'PoweredOff', new = 'Running')
2019-11-28 14:19:13 (16188): Status Report: Elapsed Time: '13743.958063'
2019-11-28 14:19:13 (16188): Status Report: CPU Time: '5.781250'
2019-11-28 14:19:13 (16188): Preference change detected
2019-11-28 14:19:13 (16188): Setting CPU throttle for VM. (100%)
2019-11-28 14:19:13 (16188): Setting checkpoint interval to 900 seconds. (Higher value of (Preference: 180 seconds) or (Vbox_job.xml: 900 seconds))
2019-11-28 15:59:37 (16188): Status Report: Elapsed Time: '19744.958063'
2019-11-28 15:59:37 (16188): Status Report: CPU Time: '8.000000'
2019-11-28 17:39:59 (16188): Status Report: Elapsed Time: '25744.958063'
2019-11-28 17:39:59 (16188): Status Report: CPU Time: '8.218750'
2019-11-28 19:20:17 (16188): Status Report: Elapsed Time: '31744.958063'
2019-11-28 19:20:17 (16188): Status Report: CPU Time: '8.296875'
2019-11-28 20:24:40 (16188): VM state change detected. (old = 'Running', new = 'Paused')
2019-11-28 20:24:43 (16188): Stopping VM.
20:28:11 (14692): Can't acquire lockfile (32) - waiting 35s
20:28:46 (14692): Can't acquire lockfile (32) - exiting
20:28:46 (14692): Error: The process cannot access the file because it is being used by another process.

(0x20)
2019-11-28 20:29:44 (16188): VM did not stop when requested.
2019-11-28 20:29:44 (16188): VM was NOT successfully terminated.
2019-11-28 20:39:26 (4668): Detected: vboxwrapper 26197
2019-11-28 20:39:26 (4668): Detected: BOINC client v7.7
2019-11-28 20:39:27 (4668): Detected: VirtualBox VboxManage Interface (Version: 6.0.14)
2019-11-28 20:39:28 (4668): Starting VM using VBoxManage interface. (boinc_40528c2fb1dac14d, slot#16)
2019-11-28 20:39:32 (4668): Successfully started VM. (PID = '14828')
2019-11-28 20:39:32 (4668): Reporting VM Process ID to BOINC.
2019-11-28 20:39:32 (4668): VM state change detected. (old = 'PoweredOff', new = 'Running')
2019-11-28 20:39:32 (4668): Status Report: Elapsed Time: '35599.958063'
2019-11-28 20:39:32 (4668): Status Report: CPU Time: '8.343750'
2019-11-28 20:39:32 (4668): Preference change detected
2019-11-28 20:39:32 (4668): Setting CPU throttle for VM. (100%)
2019-11-28 20:39:32 (4668): Setting checkpoint interval to 900 seconds. (Higher value of (Preference: 180 seconds) or (Vbox_job.xml: 900 seconds))
2019-11-28 20:59:37 (4668): Preference change detected
2019-11-28 20:59:37 (4668): Setting CPU throttle for VM. (100%)
2019-11-28 20:59:37 (4668): Setting checkpoint interval to 900 seconds. (Higher value of (Preference: 180 seconds) or (Vbox_job.xml: 900 seconds))
2019-11-28 21:15:01 (4668): VM state change detected. (old = 'Running', new = 'Paused')
2019-11-28 21:15:02 (4668): VM state change detected. (old = 'Paused', new = 'Running')
2019-11-28 21:15:05 (4668): VM state change detected. (old = 'Running', new = 'Paused')
2019-11-28 21:15:06 (4668): VM state change detected. (old = 'Paused', new = 'Running')
2019-11-28 22:19:41 (4668): Status Report: Elapsed Time: '41600.892689'
2019-11-28 22:19:41 (4668): Status Report: CPU Time: '10.375000'
2019-11-28 23:59:47 (4668): Status Report: Elapsed Time: '47600.892689'
2019-11-28 23:59:47 (4668): Status Report: CPU Time: '10.406250'
2019-11-29 00:05:56 (4668): VM state change detected. (old = 'Running', new = 'Paused')
2019-11-29 00:06:02 (4668): Stopping VM.
00:06:18 (15400): Can't acquire lockfile (32) - waiting 35s
00:06:53 (15400): Can't acquire lockfile (32) - exiting
00:06:53 (15400): Error: The process cannot access the file because it is being used by another process.

(0x20)
2019-11-29 00:11:03 (4668): VM did not stop when requested.
2019-11-29 00:11:03 (4668): VM was NOT successfully terminated.
2019-11-29 00:21:24 (22516): Detected: vboxwrapper 26197
2019-11-29 00:21:24 (22516): Detected: BOINC client v7.7
2019-11-29 00:21:25 (22516): Detected: VirtualBox VboxManage Interface (Version: 6.0.14)
2019-11-29 00:21:25 (22516): Starting VM using VBoxManage interface. (boinc_40528c2fb1dac14d, slot#16)
2019-11-29 00:21:29 (22516): Successfully started VM. (PID = '17056')
2019-11-29 00:21:29 (22516): Reporting VM Process ID to BOINC.
2019-11-29 00:21:29 (22516): VM state change detected. (old = 'PoweredOff', new = 'Running')
2019-11-29 00:21:29 (22516): Status Report: Elapsed Time: '47967.892689'
2019-11-29 00:21:29 (22516): Status Report: CPU Time: '10.406250'
2019-11-29 00:21:29 (22516): Preference change detected
2019-11-29 00:21:29 (22516): Setting CPU throttle for VM. (100%)
2019-11-29 00:21:30 (22516): Setting checkpoint interval to 900 seconds. (Higher value of (Preference: 180 seconds) or (Vbox_job.xml: 900 seconds))
2019-11-29 02:01:35 (22516): Status Report: Elapsed Time: '53967.892689'
2019-11-29 02:01:35 (22516): Status Report: CPU Time: '12.109375'
2019-11-29 03:41:42 (22516): Status Report: Elapsed Time: '59967.892689'
2019-11-29 03:41:42 (22516): Status Report: CPU Time: '12.109375'
2019-11-29 05:21:47 (22516): Status Report: Elapsed Time: '65967.892689'
2019-11-29 05:21:47 (22516): Status Report: CPU Time: '12.109375'
2019-11-29 07:01:54 (22516): Status Report: Elapsed Time: '71967.892689'
2019-11-29 07:01:54 (22516): Status Report: CPU Time: '12.109375'
2019-11-29 08:42:00 (22516): Status Report: Elapsed Time: '77967.892689'
2019-11-29 08:42:00 (22516): Status Report: CPU Time: '12.109375'
2019-11-29 10:22:06 (22516): Status Report: Elapsed Time: '83967.892689'
2019-11-29 10:22:06 (22516): Status Report: CPU Time: '12.109375'
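
In the log above, the last "Status Report" lines show elapsed time growing by about 6000 seconds per report while CPU time stays at 12.109375 seconds. That pattern can be checked mechanically. What follows is only a minimal sketch, assuming the vboxwrapper output has been saved as text; the function names `status_pairs` and `looks_stalled` and the 1% threshold are made up for illustration and are not part of BOINC or vboxwrapper.

```python
import re

# Matches vboxwrapper lines such as:
# 2019-11-29 02:01:35 (22516): Status Report: Elapsed Time: '53967.892689'
STATUS_RE = re.compile(r"Status Report: (Elapsed Time|CPU Time): '([0-9.]+)'")


def status_pairs(log_text):
    """Return (elapsed_seconds, cpu_seconds) pairs in log order."""
    pairs = []
    elapsed = None
    for line in log_text.splitlines():
        match = STATUS_RE.search(line)
        if not match:
            continue
        kind, value = match.group(1), float(match.group(2))
        if kind == "Elapsed Time":
            elapsed = value
        elif elapsed is not None:
            # A CPU Time line closes the pair opened by the Elapsed Time line.
            pairs.append((elapsed, value))
            elapsed = None
    return pairs


def looks_stalled(pairs, min_cpu_fraction=0.01):
    """True if CPU time grew by less than min_cpu_fraction of wall time
    between the first and last report (the threshold is an assumption)."""
    if len(pairs) < 2:
        return False
    (e0, c0), (e1, c1) = pairs[0], pairs[-1]
    wall = e1 - e0
    return wall > 0 and (c1 - c0) / wall < min_cpu_fraction


# Four lines copied from the log above.
sample = """\
2019-11-29 02:01:35 (22516): Status Report: Elapsed Time: '53967.892689'
2019-11-29 02:01:35 (22516): Status Report: CPU Time: '12.109375'
2019-11-29 10:22:06 (22516): Status Report: Elapsed Time: '83967.892689'
2019-11-29 10:22:06 (22516): Status Report: CPU Time: '12.109375'
"""
pairs = status_pairs(sample)
print(pairs)
print(looks_stalled(pairs))
```

Comparing only the first and last report keeps the check simple; a task that stalls late in a long run would need a sliding window instead.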


**If there are problems, why does BOINC not stop the process or why does VM not stop the process?**

----------------
**Same thing here**


2019-11-23 23:35:26 (30808): Detected: vboxwrapper 26197
2019-11-23 23:35:26 (30808): Detected: BOINC client v7.7
2019-11-23 23:35:26 (30808): Detected: VirtualBox VboxManage Interface (Version: 6.0.14)
2019-11-23 23:35:27 (30808): Successfully copied 'init_data.xml' to the shared directory.
2019-11-23 23:35:28 (30808): Create VM. (boinc_6697212dcdfc8fcc, slot#18)
2019-11-23 23:35:28 (30808): Setting Memory Size for VM. (10200MB)
2019-11-23 23:35:29 (30808): Setting CPU Count for VM. (8)
2019-11-23 23:35:29 (30808): Setting Chipset Options for VM.
2019-11-23 23:35:29 (30808): Setting Boot Options for VM.
2019-11-23 23:35:29 (30808): Setting Network Configuration for NAT.
2019-11-23 23:35:30 (30808): Enabling VM Network Access.
2019-11-23 23:35:30 (30808): Disabling USB Support for VM.
2019-11-23 23:35:30 (30808): Disabling COM Port Support for VM.
2019-11-23 23:35:30 (30808): Disabling LPT Port Support for VM.
2019-11-23 23:35:31 (30808): Disabling Audio Support for VM.
2019-11-23 23:35:31 (30808): Disabling Clipboard Support for VM.
2019-11-23 23:35:31 (30808): Disabling Drag and Drop Support for VM.
2019-11-23 23:35:32 (30808): Adding storage controller(s) to VM.
2019-11-23 23:35:32 (30808): Adding virtual disk drive to VM. (vm_image.vdi)
2019-11-23 23:36:05 (30808): Error in storage attach (fixed disk) for VM: -2135228409
Command:
VBoxManage -q storageattach "boinc_6697212dcdfc8fcc" --storagectl "Hard Disk Controller" --port 0 --device 0 --type hdd --setuuid "" --medium "C:\boinc data\slots\18/vm_image.vdi"
Output:
VBoxManage.exe: error: Medium 'C:\boinc data\slots\18\vm_image.vdi' is not accessible. UUID {5e2342bb-76a4-44ff-81f6-2f3283cde68f} of the medium 'C:\boinc data\slots\18\vm_image.vdi' does not match the value {0ddfc1ab-1419-4a8d-b48b-eeae3cc12a9c} stored in the media registry ('C:\Users\Greg\.VirtualBox\VirtualBox.xml')
VBoxManage.exe: error: Details: code VBOX_E_INVALID_OBJECT_STATE (0x80bb0007), component MediumWrap, interface IMedium, callee IUnknown
VBoxManage.exe: error: Context: "SetIds(fSetNewUuid, bstrNewUuid.raw(), fSetNewParentUuid, bstrNewParentUuid.raw())" at line 694 of file VBoxManageStorageController.cpp
VBoxManage.exe: error: Failed to set the medium/parent medium UUID

Notes:

Another VirtualBox management application has locked the session for
this VM. BOINC cannot properly monitor this VM
and so this job will be aborted.


2019-11-23 23:36:05 (30808): Could not create VM
2019-11-23 23:36:05 (30808): ERROR: VM failed to start
2019-11-23 23:36:10 (30808):
NOTE: VM session lock error encountered.
BOINC will be notified that it needs to clean up the environment.
This might be a temporary problem and so this job will be rescheduled for another time.

2019-11-28 14:16:54 (37776): Detected: vboxwrapper 26197
2019-11-28 14:16:54 (37776): Detected: BOINC client v7.7
2019-11-28 14:16:55 (37776): Detected: VirtualBox VboxManage Interface (Version: 6.0.14)
2019-11-28 14:16:55 (37776): Starting VM using VBoxManage interface. (boinc_6697212dcdfc8fcc, slot#18)
2019-11-28 14:17:00 (37776): Successfully started VM. (PID = '37248')
2019-11-28 14:17:00 (37776): Reporting VM Process ID to BOINC.
2019-11-28 14:17:00 (37776): Guest Log: BIOS: VirtualBox 6.0.14

2019-11-28 14:17:00 (37776): Guest Log: CPUID EDX: 0x178bfbff

2019-11-28 14:17:00 (37776): VM state change detected. (old = 'PoweredOff', new = 'Running')
2019-11-28 14:17:00 (37776): Preference change detected
2019-11-28 14:17:00 (37776): Setting CPU throttle for VM. (100%)
2019-11-28 14:17:00 (37776): Setting checkpoint interval to 900 seconds. (Higher value of (Preference: 180 seconds) or (Vbox_job.xml: 900 seconds))
2019-11-28 14:17:02 (37776): Guest Log: BIOS: Boot : bseqnr=1, bootseq=0032

2019-11-28 14:17:02 (37776): Guest Log: int13_harddisk: function 02, unmapped device for ELDL=80

2019-11-28 14:17:02 (37776): Guest Log: BIOS: Boot from Hard Disk 0 failed

2019-11-28 14:17:02 (37776): Guest Log: BIOS: Boot : bseqnr=2, bootseq=0003

2019-11-28 14:17:02 (37776): Guest Log: BIOS: CDROM boot failure code : 0002

2019-11-28 14:17:02 (37776): Guest Log: BIOS: Boot from CD-ROM failed

2019-11-28 14:17:02 (37776): Guest Log: Could not read from the boot medium! System halted.

2019-11-28 14:17:05 (37776): VM state change detected. (old = 'Running', new = 'Paused')
2019-11-28 14:17:09 (37776): Stopping VM.
2019-11-28 14:19:06 (16240): Detected: vboxwrapper 26197
2019-11-28 14:19:06 (16240): Detected: BOINC client v7.7
2019-11-28 14:19:07 (16240): Detected: VirtualBox VboxManage Interface (Version: 6.0.14)
2019-11-28 14:19:08 (16240): Starting VM using VBoxManage interface. (boinc_6697212dcdfc8fcc, slot#18)
2019-11-28 14:19:13 (16240): Successfully started VM. (PID = '6304')
2019-11-28 14:19:13 (16240): Reporting VM Process ID to BOINC.
2019-11-28 14:19:13 (16240): VM state change detected. (old = 'PoweredOff', new = 'Running')
2019-11-28 14:19:13 (16240): Preference change detected
2019-11-28 14:19:13 (16240): Setting CPU throttle for VM. (100%)
2019-11-28 14:19:13 (16240): Setting checkpoint interval to 900 seconds. (Higher value of (Preference: 180 seconds) or (Vbox_job.xml: 900 seconds))
2019-11-28 15:59:33 (16240): Status Report: Elapsed Time: '6000.000000'
2019-11-28 15:59:33 (16240): Status Report: CPU Time: '2.031250'
2019-11-28 17:39:55 (16240): Status Report: Elapsed Time: '12000.000000'
2019-11-28 17:39:55 (16240): Status Report: CPU Time: '2.296875'
2019-11-28 19:20:12 (16240): Status Report: Elapsed Time: '18000.000000'
2019-11-28 19:20:12 (16240): Status Report: CPU Time: '2.390625'
2019-11-28 20:24:40 (16240): VM state change detected. (old = 'Running', new = 'Paused')
2019-11-28 20:24:43 (16240): Stopping VM.
20:28:11 (17084): Can't acquire lockfile (32) - waiting 35s
20:28:46 (17084): Can't acquire lockfile (32) - exiting
20:28:46 (17084): Error: The process cannot access the file because it is being used by another process.

(0x20)
2019-11-28 20:29:44 (16240): VM did not stop when requested.
2019-11-28 20:29:44 (16240): VM was NOT successfully terminated.
2019-11-28 20:39:26 (4252): Detected: vboxwrapper 26197
2019-11-28 20:39:26 (4252): Detected: BOINC client v7.7
2019-11-28 20:39:27 (4252): Detected: VirtualBox VboxManage Interface (Version: 6.0.14)
2019-11-28 20:39:27 (4252): Starting VM using VBoxManage interface. (boinc_6697212dcdfc8fcc, slot#18)
2019-11-28 20:39:32 (4252): Successfully started VM. (PID = '9400')
2019-11-28 20:39:32 (4252): Reporting VM Process ID to BOINC.
2019-11-28 20:39:32 (4252): VM state change detected. (old = 'PoweredOff', new = 'Running')
2019-11-28 20:39:32 (4252): Status Report: Elapsed Time: '21860.000000'
2019-11-28 20:39:32 (4252): Status Report: CPU Time: '2.421875'
2019-11-28 20:39:32 (4252): Preference change detected
2019-11-28 20:39:32 (4252): Setting CPU throttle for VM. (100%)
2019-11-28 20:39:33 (4252): Setting checkpoint interval to 900 seconds. (Higher value of (Preference: 180 seconds) or (Vbox_job.xml: 900 seconds))
2019-11-28 20:59:36 (4252): Preference change detected
2019-11-28 20:59:36 (4252): Setting CPU throttle for VM. (100%)
2019-11-28 20:59:37 (4252): Setting checkpoint interval to 900 seconds. (Higher value of (Preference: 180 seconds) or (Vbox_job.xml: 900 seconds))
2019-11-28 21:15:47 (4252): VM state change detected. (old = 'Running', new = 'Paused')
2019-11-28 21:15:48 (4252): VM state change detected. (old = 'Paused', new = 'Running')
2019-11-28 22:19:41 (4252): Status Report: Elapsed Time: '27860.679735'
2019-11-28 22:19:41 (4252): Status Report: CPU Time: '4.656250'
2019-11-28 23:59:47 (4252): Status Report: Elapsed Time: '33860.679735'
2019-11-28 23:59:47 (4252): Status Report: CPU Time: '4.671875'
2019-11-29 00:05:56 (4252): VM state change detected. (old = 'Running', new = 'Paused')
2019-11-29 00:06:03 (4252): Stopping VM.
00:06:18 (23112): Can't acquire lockfile (32) - waiting 35s
00:06:53 (23112): Can't acquire lockfile (32) - exiting
00:06:53 (23112): Error: The process cannot access the file because it is being used by another process.

(0x20)
2019-11-29 00:11:03 (4252): VM did not stop when requested.
2019-11-29 00:11:03 (4252): VM was NOT successfully terminated.
2019-11-29 00:21:24 (22804): Detected: vboxwrapper 26197
2019-11-29 00:21:24 (22804): Detected: BOINC client v7.7
2019-11-29 00:21:25 (22804): Detected: VirtualBox VboxManage Interface (Version: 6.0.14)
2019-11-29 00:21:25 (22804): Starting VM using VBoxManage interface. (boinc_6697212dcdfc8fcc, slot#18)
2019-11-29 00:21:29 (22804): Successfully started VM. (PID = '2328')
2019-11-29 00:21:29 (22804): Reporting VM Process ID to BOINC.
2019-11-29 00:21:29 (22804): VM state change detected. (old = 'PoweredOff', new = 'Running')
2019-11-29 00:21:29 (22804): Status Report: Elapsed Time: '34227.679735'
2019-11-29 00:21:29 (22804): Status Report: CPU Time: '4.671875'
2019-11-29 00:21:29 (22804): Preference change detected
2019-11-29 00:21:29 (22804): Setting CPU throttle for VM. (100%)
2019-11-29 00:21:30 (22804): Setting checkpoint interval to 900 seconds. (Higher value of (Preference: 180 seconds) or (Vbox_job.xml: 900 seconds))
2019-11-29 02:01:35 (22804): Status Report: Elapsed Time: '40227.679735'
2019-11-29 02:01:35 (22804): Status Report: CPU Time: '6.421875'
2019-11-29 03:41:41 (22804): Status Report: Elapsed Time: '46227.679735'
2019-11-29 03:41:41 (22804): Status Report: CPU Time: '6.421875'
2019-11-29 05:21:47 (22804): Status Report: Elapsed Time: '52227.679735'
2019-11-29 05:21:47 (22804): Status Report: CPU Time: '6.421875'
2019-11-29 07:01:54 (22804): Status Report: Elapsed Time: '58227.679735'
2019-11-29 07:01:54 (22804): Status Report: CPU Time: '6.421875'
2019-11-29 08:42:00 (22804): Status Report: Elapsed Time: '64227.679735'
2019-11-29 08:42:00 (22804): Status Report: CPU Time: '6.421875'
2019-11-29 10:22:06 (22804): Status Report: Elapsed Time: '70227.679735'
2019-11-29 10:22:06 (22804): Status Report: CPU Time: '6.421875'
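
A side note on the "Error in storage attach" lines above: the number -2135228409 and the hex code 0x80bb0007 (VBOX_E_INVALID_OBJECT_STATE) in the VBoxManage output are the same 32-bit value, once printed as a signed integer and once as unsigned hex. A small sketch of the conversion; the helper name `to_hresult` is made up for illustration.

```python
def to_hresult(signed_code):
    """Reinterpret a signed 32-bit integer as an unsigned 32-bit hex string."""
    return f"0x{signed_code & 0xFFFFFFFF:08x}"


print(to_hresult(-2135228409))  # 0x80bb0007
```

Searching for the hex form is usually more useful, since that is the form the VBoxManage error text itself uses.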
ID: 40719
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2386
Credit: 222,911,388
RAC: 138,129
Message 40724 - Posted: 29 Nov 2019, 13:47:01 UTC - in response to Message 40719.  

Looks like you should clean the complete environment.
The following steps are suggested:

1. Set all projects to "no new tasks" and all tasks (from all projects) that are not yet started on hold.
This runs your slots directory dry.

2. Cancel all LHC@home tasks and wait until this is reported to the server.

3. If all tasks that are not set on hold are finished, shut down your BOINC client and wait a few minutes.

4. Reboot your computer but ensure it doesn't automatically start BOINC.

5. Delete all directories below .../slots/

6. Open your VirtualBox manager and check for inaccessible or corrupt VMs.
If they are clearly related to BOINC, delete them.

7. Start your BOINC client

8. Reset LHC@home to ensure you get a fresh vdi copy.

9. Resume your tasks, but step by step.

10. Uncheck "no new tasks"
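
For step 6, the check can probably also be done from the command line with `VBoxManage list vms`. The sketch below only parses such a listing; it assumes each line has the form `"name" {uuid}`, that VMs created by BOINC are named `boinc_...` as in the logs of this thread, and that VirtualBox labels unreadable VMs `<inaccessible>`. The function name `classify_vms` and the UUIDs in the sample are made up, so please verify against your own output before deleting anything.

```python
import re

# Assumed line format of `VBoxManage list vms`:  "name" {uuid}
VM_LINE_RE = re.compile(r'^"(?P<name>.*)"\s+\{(?P<uuid>[0-9a-fA-F-]+)\}$')


def classify_vms(listing):
    """Split a VM listing into BOINC-created VMs and inaccessible VMs."""
    boinc, inaccessible = [], []
    for line in listing.splitlines():
        match = VM_LINE_RE.match(line.strip())
        if not match:
            continue
        name, uuid = match.group("name"), match.group("uuid")
        if name == "<inaccessible>":
            inaccessible.append(uuid)
        elif name.startswith("boinc_"):
            boinc.append((name, uuid))
    return boinc, inaccessible


# Sample listing; the VM name comes from the log above, the UUIDs are invented.
sample_listing = '''\
"boinc_40528c2fb1dac14d" {11111111-2222-3333-4444-555555555555}
"<inaccessible>" {aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee}
"My Linux test box" {99999999-8888-7777-6666-555555555555}
'''
boinc_vms, broken_vms = classify_vms(sample_listing)
print(boinc_vms)
print(broken_vms)
```

To feed it real data, the listing could be captured with `subprocess.run(["VBoxManage", "list", "vms"], capture_output=True, text=True).stdout`, provided VBoxManage is on the PATH.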
ID: 40724
greg_be

Joined: 28 Dec 08
Posts: 318
Credit: 4,148,677
RAC: 2,010
Message 40726 - Posted: 29 Nov 2019, 14:36:33 UTC - in response to Message 40724.  

Looks like you should clean the complete environment.
The following steps are suggested:

1. Set all projects to "no new tasks" and all tasks (from all projects) that are not yet started on hold.
This runs your slots directory dry.

2. Cancel all LHC@home tasks and wait until this is reported to the server.

3. If all tasks that are not set on hold are finished, shut down your BOINC client and wait a few minutes.

4. Reboot your computer but ensure it doesn't automatically start BOINC.

5. Delete all directories below .../slots/

6. Open your VirtualBox manager and check for inaccessible or corrupt VMs.
If they are clearly related to BOINC, delete them.

7. Start your BOINC client

8. Reset LHC@home to ensure you get a fresh vdi copy.

9. Resume your tasks, but step by step.

10. Uncheck "no new tasks"



Ok will do, thanks.
It will be a few days, I store about 2.5 days of work.
ID: 40726
Crystal Pellet
Volunteer moderator
Volunteer tester

Joined: 14 Jan 10
Posts: 1268
Credit: 8,421,616
RAC: 2,139
Message 40727 - Posted: 29 Nov 2019, 14:55:11 UTC - in response to Message 40726.  

It will be a few days, I store about 2.5 days of work.
If you don't mind, you may abort the tasks that have not started yet.
ID: 40727
greg_be

Joined: 28 Dec 08
Posts: 318
Credit: 4,148,677
RAC: 2,010
Message 40730 - Posted: 29 Nov 2019, 16:12:48 UTC - in response to Message 40727.  

It will be a few days, I store about 2.5 days of work.
If you don't mind, you may abort the tasks that have not started yet.


I don't have any more ATLAS tasks, I just have 2 Theory tasks and they run just fine.
I have a backlog from other projects I need to clear out as well.
This computer is spread out across a bunch of other projects in addition to LHC.
Though maybe I should select a few to give up, which will take some thinking.
ID: 40730
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2386
Credit: 222,911,388
RAC: 138,129
Message 40731 - Posted: 29 Nov 2019, 16:39:44 UTC - in response to Message 40730.  

I have a backlog from other projects I need to clear out as well.

Only the non-started tasks.
That's why I suggest setting the projects to "no new tasks" and suspending the non-started tasks.
The BOINC client will not discard them and you can start them after the cleanup.
ID: 40731
greg_be

Joined: 28 Dec 08
Posts: 318
Credit: 4,148,677
RAC: 2,010
Message 40734 - Posted: 29 Nov 2019, 18:59:03 UTC - in response to Message 40731.  
Last modified: 29 Nov 2019, 19:00:26 UTC

I have a backlog from other projects I need to clear out as well.

Only the non-started tasks.
That's why I suggest setting the projects to "no new tasks" and suspending the non-started tasks.
The BOINC client will not discard them and you can start them after the cleanup.


Ok got it...will do that.
Should be ready by morning.
Got a GPUGRID task that takes a long time.
ID: 40734
greg_be

Joined: 28 Dec 08
Posts: 318
Credit: 4,148,677
RAC: 2,010
Message 40738 - Posted: 30 Nov 2019, 9:08:49 UTC - in response to Message 40734.  

Have done the cleanup.
Have to work through a backlog of other projects.
This cleanup might help with a GPUGRID problem I am having.

Keep an eye on this thread; I should have some news either late Sunday or Monday.
ID: 40738
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2386
Credit: 222,911,388
RAC: 138,129
Message 40739 - Posted: 30 Nov 2019, 12:06:03 UTC - in response to Message 40738.  

You mentioned that you run GPUGRID tasks and your computer details show 2 GPUs.
Other information from your computer detail pages and your ATLAS logfiles shows that you run ATLAS as an 8-core setup.

Altogether this seems to be a configuration that might overload your computer in certain situations and (regarding ATLAS) an 8-core setup is not the most efficient solution.

Hence a suggestion would be to run ATLAS as a 4-core setup and limit the number of concurrently running tasks to 3.
Those specs can be set using the following app_config.xml in /base_folder_of_your_BOINC_client/projects/lhcathome.cern.ch_lhcathome:
<app_config>
  <app>
    <name>ATLAS</name>
    <max_concurrent>3</max_concurrent>
  </app>
  <app_version>
    <app_name>ATLAS</app_name>
    <plan_class>vbox64_mt_mcore_atlas</plan_class>
    <avg_ncpus>4.0</avg_ncpus>
    <cmdline>--nthreads 4</cmdline>
  </app_version>
</app_config>
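
A single mismatched closing tag is enough to make an app_config.xml unparseable, and the client may then not apply any of it. Before asking the BOINC client to re-read its config files, the file can be checked for well-formedness. This is a minimal sketch using Python's standard XML parser; the function name `check_app_config` is made up, and the check covers XML syntax and the root element only, not whether BOINC accepts the individual settings.

```python
import xml.etree.ElementTree as ET


def check_app_config(xml_text):
    """Return None if xml_text is well-formed XML with an <app_config> root,
    otherwise a short description of the problem."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        return f"not well-formed: {err}"
    if root.tag != "app_config":
        return f"unexpected root element <{root.tag}>"
    return None


# Shortened examples: matching tags versus a <name> closed by </app_name>.
good = ("<app_config><app><name>ATLAS</name>"
        "<max_concurrent>3</max_concurrent></app></app_config>")
bad = "<app_config><app><name>ATLAS</app_name></app></app_config>"

print(check_app_config(good))
print(check_app_config(bad))
```

For a file on disk, the text could be read first with `open(path, encoding="utf-8").read()` and passed to the same function.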
ID: 40739
greg_be

Joined: 28 Dec 08
Posts: 318
Credit: 4,148,677
RAC: 2,010
Message 41083 - Posted: 26 Dec 2019, 19:13:38 UTC - in response to Message 40739.  

You mentioned that you run GPUGRID tasks and your computer details show 2 GPUs.
Other information from your computer detail pages and your ATLAS logfiles shows that you run ATLAS as an 8-core setup.

Altogether this seems to be a configuration that might overload your computer in certain situations and (regarding ATLAS) an 8-core setup is not the most efficient solution.

Hence a suggestion would be to run ATLAS as a 4-core setup and limit the number of concurrently running tasks to 3.
Those specs can be set using the following app_config.xml in /base_folder_of_your_BOINC_client/projects/lhcathome.cern.ch_lhcathome:
<app_config>
  <app>
    <name>ATLAS</app_name>
    <max_concurrent>3</max_concurrent>
  </app>
  <app_version>
    <app_name>ATLAS</app_name>
    <plan_class>vbox64_mt_mcore_atlas</plan_class>
    <avg_ncpus>4.0</avg_ncpus>
    <cmdline>--nthreads 4</cmdline>
  </app_version>
</app_config>


The 4-core section seems to be ignored for some reason.
However, since I quit the FAH CPU client, the tasks run along nicely and in a very orderly way.
ID: 41083
greg_be
Joined: 28 Dec 08
Posts: 318
Credit: 4,148,677
RAC: 2,010
Message 41087 - Posted: 26 Dec 2019, 23:32:53 UTC - in response to Message 41083.  
Last modified: 26 Dec 2019, 23:33:22 UTC

There was an extra < in the first line. Probably a leftover from something else.
ID: 41087
saigon
Joined: 8 Jul 12
Posts: 2
Credit: 1,227,302
RAC: 0
Message 41099 - Posted: 28 Dec 2019, 11:53:29 UTC - in response to Message 40739.  


Those specs can be set using the following app_config.xml in /base_folder_of_your_BOINC_client/projects/lhcathome.cern.ch_lhcathome:
<app_config>
  <app>
    <name>ATLAS</app_name>  <---- the closing tag should be </name>
    <max_concurrent>3</max_concurrent>
  </app>
  <app_version>
    <app_name>ATLAS</app_name>
    <plan_class>vbox64_mt_mcore_atlas</plan_class>
    <avg_ncpus>4.0</avg_ncpus>
    <cmdline>--nthreads 4</cmdline>
  </app_version>
</app_config>
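
With the closing tag on the name line corrected as marked above, the complete file would read roughly as follows. This is only a sketch: it assumes the app name, plan class, and thread count quoted from Message 40739 are otherwise right for your host. The comments are additions for explanation and can be dropped.

```xml
<!-- app_config.xml, placed in the LHC@home project folder named in Message 40739 -->
<app_config>
  <app>
    <!-- the name element must be closed by /name, not /app_name -->
    <name>ATLAS</name>
    <!-- run at most 3 ATLAS tasks at the same time -->
    <max_concurrent>3</max_concurrent>
  </app>
  <app_version>
    <app_name>ATLAS</app_name>
    <plan_class>vbox64_mt_mcore_atlas</plan_class>
    <!-- budget 4 CPU cores per task and pass the same thread count to the app -->
    <avg_ncpus>4.0</avg_ncpus>
    <cmdline>--nthreads 4</cmdline>
  </app_version>
</app_config>
```

As far as I know, the client only picks the file up after it re-reads its config files (in BOINC Manager: Options, then "Read config files") or is restarted.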
ID: 41099
greg_be
Joined: 28 Dec 08
Posts: 318
Credit: 4,148,677
RAC: 2,010
Message 41109 - Posted: 29 Dec 2019, 15:16:11 UTC - in response to Message 41099.  


Thanks. I was wondering about that, but I left it this way the whole time.
I have just made the change.
That must also be what was causing the "no /app" problem.
ID: 41109
greg_be
Joined: 28 Dec 08
Posts: 318
Credit: 4,148,677
RAC: 2,010
Message 41114 - Posted: 30 Dec 2019, 0:57:57 UTC - in response to Message 41109.  


That was the one thing causing the hiccup. I have now got it down to 4 cores. Thanks for the correction.
ID: 41114
maeax
Joined: 2 May 07
Posts: 2071
Credit: 156,088,272
RAC: 104,020
Message 41181 - Posted: 6 Jan 2020, 20:52:13 UTC

Since 19:00 UTC, ATLAS tasks on Windows with VirtualBox 5.2.32
have been using no CPU. I have stopped running ATLAS on these machines.
ID: 41181
greg_be
Joined: 28 Dec 08
Posts: 318
Credit: 4,148,677
RAC: 2,010
Message 41182 - Posted: 7 Jan 2020, 0:36:08 UTC - in response to Message 41181.  
Last modified: 7 Jan 2020, 0:37:50 UTC

What are you doing with a V5 VBOX?!! It's up to 6.1 now.
I would suggest updating and trying to get Atlas tasks again.
You are seriously out of date.

And go through Yeti's checklist to double check everything.
ID: 41182
maeax
Joined: 2 May 07
Posts: 2071
Credit: 156,088,272
RAC: 104,020
Message 41183 - Posted: 7 Jan 2020, 2:54:18 UTC

I have one Windows PC running 6.0.14.
The old version 5.2.32 is in use with Linux for ATLAS testing and running.
The VirtualBox Guest Additions do not work well with a CentOS 7 VM.
I have to press the right Ctrl key to get back to Windows.
5.2.32 is supported until June 2020.
I also have Linux VMs running under this version.
ID: 41183
greg_be
Joined: 28 Dec 08
Posts: 318
Credit: 4,148,677
RAC: 2,010
Message 41189 - Posted: 7 Jan 2020, 9:03:47 UTC - in response to Message 41183.  

Well, since this is a different kind of problem from the one I am experiencing, I would suggest you start your own thread about it. I run Windows, and my problem is completely different from yours.
That's all I can suggest. You will get better help if you start your own thread.
ID: 41189
maeax
Joined: 2 May 07
Posts: 2071
Credit: 156,088,272
RAC: 104,020
Message 41196 - Posted: 7 Jan 2020, 18:51:05 UTC - in response to Message 41189.  

:-))
ID: 41196
csbyseti
Joined: 6 Jul 17
Posts: 22
Credit: 29,430,354
RAC: 0
Message 41213 - Posted: 9 Jan 2020, 9:34:33 UTC - in response to Message 41182.  



I had many problems with 6.0.14, so I reinstalled 5.2.xx.
ID: 41213

©2024 CERN