Message boards : ATLAS application : ATLAS 2.0 tasks stall at 99.xxxxxx%

greg_be

Joined: 28 Dec 08
Posts: 204
Credit: 1,324,701
RAC: 2,286
Message 40391 - Posted: 11 Nov 2019, 1:13:57 UTC

Guys,
I am totally baffled by ATLAS's behavior.
I just aborted another task because it was only advancing 0.00002% every 2 seconds (I set the percentage display to that many decimal places to see whether it was moving at all) and showed no increase in CPU time.

Most of my tasks show 17 seconds of CPU time in BOINC Tasks and then just stall when they reach 99%, showing 0% CPU usage.
The task I just aborted shows only 14 seconds of CPU time for a run time of just under 20 hours.

I just started a new ATLAS task. It shows 0% CPU usage and 2 seconds of CPU time (00:00:02), but the completion figure is moving along at around 0.00700% per 2 seconds.

And here is all the text below the CPU ID and startup sequence:

00:00:01.265626 APIC: fPostedIntrsEnabled=false fVirtApicRegsEnabled=false fSupportsTscDeadline=false
00:00:01.265652 TMR3UtcNow: nsNow=1 573 433 155 987 609 300 nsPrev=0 -> cNsDelta=1 573 433 155 987 609 300 (offLag=0 offVirtualSync=0 offVirtualSyncGivenUp=0, NowAgain=1 573 433 155 987 609 300)
00:00:01.265671 VMEmt: Halt method global1 (5)
00:00:01.265707 VMEmt: HaltedGlobal1 config: cNsSpinBlockThresholdCfg=50000
00:00:01.265740 Changing the VM state from 'CREATING' to 'CREATED'
00:00:01.266247 Changing the VM state from 'CREATED' to 'POWERING_ON'
00:00:01.266358 Changing the VM state from 'POWERING_ON' to 'RUNNING'
00:00:01.266370 Console: Machine state changed to 'Running'
00:00:01.265161 HM: fUsePauseFilter=false fUseLbrVirt=true fUseVGif=true fUseVirtVmsaveVmload=true
00:00:01.274053 VMMDev: Guest Log: BIOS: VirtualBox 6.0.14
00:00:01.274126 PCI: Setting up resources and interrupts
00:00:01.274424 PIT: mode=2 count=0x10000 (65536) - 18.20 Hz (ch=0)
00:00:01.295477 Display::i_handleDisplayResize: uScreenId=0 pvVRAM=0000000000000000 w=720 h=400 bpp=0 cbLine=0x0 flags=0x0 origin=0,0
00:00:01.329426 VMMDev: Guest Log: CPUID EDX: 0x178bfbff
00:00:01.332260 PIT: mode=2 count=0x48d3 (18643) - 64.00 Hz (ch=0)
00:00:01.357552 Display::i_handleDisplayResize: uScreenId=0 pvVRAM=000000000caf0000 w=640 h=480 bpp=32 cbLine=0xA00 flags=0x0 origin=0,0
00:00:03.809324 Display::i_handleDisplayResize: uScreenId=0 pvVRAM=0000000000000000 w=720 h=400 bpp=0 cbLine=0x0 flags=0x0 origin=0,0
00:00:03.812157 PIT: mode=2 count=0x10000 (65536) - 18.20 Hz (ch=0)
00:00:03.812381 VMMDev: Guest Log: BIOS: Boot : bseqnr=1, bootseq=0032
00:00:03.812704 VMMDev: Guest Log: int13_harddisk: function 02, unmapped device for ELDL=80
00:00:03.812899 VMMDev: Guest Log: BIOS: Boot from Hard Disk 0 failed
00:00:03.813105 VMMDev: Guest Log: BIOS: Boot : bseqnr=2, bootseq=0003
00:00:03.813315 VMMDev: Guest Log: BIOS: CDROM boot failure code : 0002
00:00:03.813482 VMMDev: Guest Log: BIOS: Boot from CD-ROM failed
00:00:03.814177 VMMDev: Guest Log: Could not read from the boot medium! System halted.


So my questions are:
How can this run with no CPU % showing in BOINC Tasks?
Is 0.00700% every 2 seconds a good rate?
Why does it stall at 99%, i.e. drop to 0.00002% per 2 seconds?
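
For scale, the two rates quoted in this post can be converted into wall-clock time. The sketch below is plain arithmetic on those numbers (percent gained per 2-second refresh); it assumes each rate stays constant, and the function name is made up for illustration.

```python
def hours_per_percent(step_percent: float, interval_s: float = 2.0) -> float:
    """Hours needed to gain one percentage point at a constant rate."""
    steps_needed = 1.0 / step_percent          # refreshes per percentage point
    return steps_needed * interval_s / 3600.0  # seconds -> hours


fast = hours_per_percent(0.00700)  # rate seen on the freshly started task
slow = hours_per_percent(0.00002)  # rate seen after the stall near 99%

print(f"0.00700% per 2 s: {fast * 100:.1f} h for a full 0-100% run")
print(f"0.00002% per 2 s: {slow:.1f} h for a single percentage point")
```

At the faster rate a full run would take roughly 8 hours; at the slower rate the last percentage point alone would take more than a day.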

I can run other VirtualBox tasks just fine.
BOINC is up to date, and VirtualBox is up to date.
Memory is fine.
BOINC allocates 8 cores to the tasks; the other 8 cores go to the GPU and 8 other tasks from other projects.

So what the heck is going on here?
I've been through Yeti's list, and my system is fine.
There is either a bug in the task or something wrong with BOINC; I don't know which.

Brand new Ryzen 7 2700 running at 4.075 GHz. Max OC is 4.1 GHz.
Temperature runs around 76 °C with an Arctic cooling radiator and fans.
BOINC can take all the memory it needs and create a huge chunk of virtual memory on the Samsung SSD if needed. So I don't understand what's going on.
ID: 40391
greg_be

Joined: 28 Dec 08
Posts: 204
Credit: 1,324,701
RAC: 2,286
Message 40423 - Posted: 13 Nov 2019, 10:20:05 UTC

Since the change I cannot get ATLAS to run properly on my system.
The main issue is that when a task reaches 98.xxx% it stalls.
Looking at BOINC Tasks, I see progress of 0.00010% or sometimes as low as 0.00007% every 2 seconds.
There is no CPU activity identified. CPU time is between 3 and 7 seconds, while run time for a 6-hour task is 2+ days.
I abort tasks that run this slowly, as there is no defined stop time. When a task has already hogged my system for 2 days and is only advancing 0.00007% to 0.00010% every 2 seconds, there is something seriously wrong.

All my aborts are for this reason.
I had some earlier memory errors because I share my system with other projects and did not have enough memory, but I have since added more.

ATLAS and LHC in general have access to all the virtual memory as well, and I have a good-sized SSD for the tasks to work from.

So I don't know what's going on. VirtualBox is version 6.0.14 r133895 (Qt5.6.2), which is the latest.
Extension packs are up to date as well.
ID: 40423
maeax

Joined: 2 May 07
Posts: 881
Credit: 32,594,782
RAC: 45,785
Message 40424 - Posted: 13 Nov 2019, 10:37:42 UTC - in response to Message 40423.  
Last modified: 13 Nov 2019, 10:54:46 UTC

Wrong thread, sorry. Please move messages 40423 and 40424 to
"ATLAS 2.0 tasks stall at 99.xxxxxx%". Thank you.

This is my app_config, set up to use 5 cores:
<app_config>
  <app>
    <name>ATLAS</name>
    <max_concurrent>2</max_concurrent>
  </app>
  <app_version>
    <app_name>ATLAS</app_name>
    <avg_ncpus>5</avg_ncpus>
    <plan_class>vbox64_mt_mcore_atlas</plan_class>
    <cmdline>--memory_size_mb 7500</cmdline>
  </app_version>
</app_config>

I also have a Ryzen 2700, running ATLAS with 6 cores, for example.
You need more memory if you use more CPUs.
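
The two memory figures that appear in this thread, 7500 MB for 5 cores in the app_config above and 10200 MB for 8 CPUs in the vboxwrapper log posted further down, both fit a simple linear rule. The sketch below assumes that rule (3000 MB plus 900 MB per core). It is inferred from those two data points only, not taken from project documentation, and the function name is made up for illustration.

```python
def atlas_vm_memory_mb(ncpus: int, base_mb: int = 3000, per_cpu_mb: int = 900) -> int:
    """VM memory in MB under the assumed linear rule (inferred, not official)."""
    return base_mb + per_cpu_mb * ncpus


for ncpus in (5, 6, 8):
    print(f"{ncpus} cores -> --memory_size_mb {atlas_vm_memory_mb(ncpus)}")
```

If the rule holds, the 6-core setup mentioned above would want about 8400 MB per task, which is worth checking against the RAM left over for other projects.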

Settings in the LHC@home preferences for all computers:
Max tasks: 4, max CPUs: 4.
Only ATLAS is enabled for every location (Home, School or Work).
ID: 40424
greg_be

Joined: 28 Dec 08
Posts: 204
Credit: 1,324,701
RAC: 2,286
Message 40431 - Posted: 13 Nov 2019, 12:13:25 UTC - in response to Message 40424.  
Last modified: 13 Nov 2019, 12:34:07 UTC

Memory is not an issue. I have 24 GB and have yet to see my monitoring software hit even 90%.

Processors... why limit them?

But what is causing the extremely low completion rate?
0.00007% to 0.00010% every 2 seconds is beyond stupid.
It only happens once the task reaches 98%.
Prior to 98% it runs fine.
ID: 40431
David Cameron
Project administrator
Project developer
Project scientist

Joined: 13 May 14
Posts: 311
Credit: 9,922,747
RAC: 8,275
Message 40433 - Posted: 13 Nov 2019, 13:36:30 UTC - in response to Message 40431.  

Please don't pay too much attention to the percentage complete shown by BOINC. It has nothing to do with what's going on in the task; it simply reflects how the current elapsed walltime compares to the estimated walltime. If the task goes over the estimate, the percentage complete gradually slows down so as not to go over 100%.

What is more important is whether the CPU is busy. If it is running close to 100%, the task is probably doing useful work and it's worth waiting. If it's close to 0% for more than a few minutes, something is wrong and the task should be aborted. I see your host has returned some good WUs, so there is nothing fundamentally wrong with the setup. If you get another stuck WU, could you save the stderr.txt from inside the slot directory before aborting it, and paste it here?
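
The behaviour described above, a percentage driven only by elapsed versus estimated walltime that slows down instead of passing 100%, can be illustrated with a toy model. The exponential curve below is an assumption chosen for illustration; it is not BOINC's actual formula, and the 6-hour estimate is a hypothetical value.

```python
import math

ESTIMATE_S = 6 * 3600  # hypothetical estimated walltime (6 hours)


def shown_fraction(elapsed_s: float, estimated_s: float) -> float:
    """Toy progress figure: uses wall-clock time only and saturates below 1.0."""
    return 1.0 - math.exp(-elapsed_s / estimated_s)


for hours in (1, 6, 24, 48):
    t = hours * 3600
    now = shown_fraction(t, ESTIMATE_S)
    gain = shown_fraction(t + 2, ESTIMATE_S) - now  # change over one 2 s refresh
    print(f"{hours:>2} h elapsed: {now * 100:8.4f}% shown, +{gain * 100:.5f}% in the next 2 s")
```

CPU time is not an input to the function at all, which is the point: a figure like this keeps creeping upward whether or not the VM is doing any work.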
ID: 40433
greg_be

Joined: 28 Dec 08
Posts: 204
Credit: 1,324,701
RAC: 2,286
Message 40434 - Posted: 13 Nov 2019, 14:21:01 UTC - in response to Message 40433.  
Last modified: 13 Nov 2019, 14:21:53 UTC

Thank you, David, for confirming that my system is OK.
It is the recent tasks that reach 98% and then stall with 0% CPU usage.

One task completed yesterday, and it showed all the signs of running correctly: CPU usage and run time increasing in step with clock time. So I am glad to get at least one task processed successfully to the end.

When I get the next ATLAS task that stalls, I will post the text here.
ID: 40434
greg_be

Joined: 28 Dec 08
Posts: 204
Credit: 1,324,701
RAC: 2,286
Message 40713 - Posted: 28 Nov 2019, 13:16:42 UTC
Last modified: 28 Nov 2019, 13:35:58 UTC

Here we go again.

So here is all the info

From properties:
Application: ATLAS Simulation 2.00 (vbox64_mt_mcore_atlas)
Name: Aj3KDmsfrsvnsSi4apGgGQJmABFKDmABFKDmychaDmABFKDmu1eJ8n
State: Running
Received: 11/23/2019 11:27:32 PM
Report deadline: 11/30/2019 11:27:32 PM
Resources: 8 CPUs
Estimated computation size: 43,200 GFLOPs
CPU time: 00:00:19
CPU time since checkpoint: ---
Elapsed time: 2d 09:28:19
Estimated time remaining: 00:00:06
Fraction done: 99.997%
Virtual memory size: 123.99 MB
Working set size: 9.96 GB
Directory: slots/17
Process ID: 13600
Progress rate: 1.800% per hour
Executable: vboxwrapper_26198ab7_windows_x86_64.exe
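
As a quick sanity check on the properties above, the CPU time and elapsed time can be turned into a utilisation figure. This is a minimal sketch: the helper name is made up, and it assumes the durations are always shown as `HH:MM:SS` with an optional leading `Nd ` day count, as they are here.

```python
import re


def boinc_duration_to_seconds(text: str) -> int:
    """Convert 'HH:MM:SS' or 'Nd HH:MM:SS' to seconds (format assumed from the post)."""
    match = re.fullmatch(r"(?:(\d+)d )?(\d+):(\d{2}):(\d{2})", text.strip())
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    days, hours, minutes, seconds = (int(g or 0) for g in match.groups())
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


elapsed = boinc_duration_to_seconds("2d 09:28:19")  # Elapsed time
cpu = boinc_duration_to_seconds("00:00:19")         # CPU time
ncpus = 8                                           # Resources: 8 CPUs

# Share of the available CPU-seconds (wall time x cores) that was actually used.
utilisation = cpu / (elapsed * ncpus)
print(f"elapsed = {elapsed} s, cpu = {cpu} s, utilisation = {utilisation:.6%}")
```

Nineteen CPU-seconds over more than two days on 8 cores is effectively zero, which is the "close to 0%" case where aborting is the right call regardless of the fraction done.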


------------------
From Stderr

2019-11-23 23:32:14 (23956): Detected: vboxwrapper 26197
2019-11-23 23:32:14 (23956): Detected: BOINC client v7.7
2019-11-23 23:32:15 (23956): Detected: VirtualBox VboxManage Interface (Version: 6.0.14)
2019-11-23 23:32:15 (23956): Successfully copied 'init_data.xml' to the shared directory.
2019-11-23 23:32:16 (23956): Create VM. (boinc_15fb9ac1048fc2dc, slot#17)
2019-11-23 23:32:17 (23956): Setting Memory Size for VM. (10200MB)
2019-11-23 23:32:17 (23956): Setting CPU Count for VM. (8)
2019-11-23 23:32:18 (23956): Setting Chipset Options for VM.
2019-11-23 23:32:18 (23956): Setting Boot Options for VM.
2019-11-23 23:32:18 (23956): Setting Network Configuration for NAT.
2019-11-23 23:32:18 (23956): Enabling VM Network Access.
2019-11-23 23:32:19 (23956): Disabling USB Support for VM.
2019-11-23 23:32:19 (23956): Disabling COM Port Support for VM.
2019-11-23 23:32:19 (23956): Disabling LPT Port Support for VM.
2019-11-23 23:32:20 (23956): Disabling Audio Support for VM.
2019-11-23 23:32:20 (23956): Disabling Clipboard Support for VM.
2019-11-23 23:32:20 (23956): Disabling Drag and Drop Support for VM.
2019-11-23 23:32:21 (23956): Adding storage controller(s) to VM.
2019-11-23 23:32:21 (23956): Adding virtual disk drive to VM. (vm_image.vdi)
2019-11-23 23:32:54 (23956): Error in storage attach (fixed disk) for VM: -2135228409
Command:
VBoxManage -q storageattach "boinc_15fb9ac1048fc2dc" --storagectl "Hard Disk Controller" --port 0 --device 0 --type hdd --setuuid "" --medium "C:\boinc data\slots\17/vm_image.vdi"
Output:
VBoxManage.exe: error: Medium 'C:\boinc data\slots\17\vm_image.vdi' is not accessible. UUID {5e2342bb-76a4-44ff-81f6-2f3283cde68f} of the medium 'C:\boinc data\slots\17\vm_image.vdi' does not match the value {1cdc7997-f381-4450-89ed-c03e209d66ff} stored in the media registry ('C:\Users\Greg\.VirtualBox\VirtualBox.xml')
VBoxManage.exe: error: Details: code VBOX_E_INVALID_OBJECT_STATE (0x80bb0007), component MediumWrap, interface IMedium, callee IUnknown
VBoxManage.exe: error: Context: "SetIds(fSetNewUuid, bstrNewUuid.raw(), fSetNewParentUuid, bstrNewParentUuid.raw())" at line 694 of file VBoxManageStorageController.cpp
VBoxManage.exe: error: Failed to set the medium/parent medium UUID

Notes:

Another VirtualBox management application has locked the session for
this VM. BOINC cannot properly monitor this VM
and so this job will be aborted.


2019-11-23 23:32:54 (23956): Could not create VM
2019-11-23 23:32:54 (23956): ERROR: VM failed to start
2019-11-23 23:32:59 (23956):
NOTE: VM session lock error encountered.
BOINC will be notified that it needs to clean up the environment.
This might be a temporary problem and so this job will be rescheduled for another time.

2019-11-24 00:04:52 (34824): Detected: vboxwrapper 26197
2019-11-24 00:04:52 (34824): Detected: BOINC client v7.7
2019-11-24 00:04:54 (34824): Detected: VirtualBox VboxManage Interface (Version: 6.0.14)
2019-11-24 00:04:59 (34824): Starting VM using VBoxManage interface. (boinc_15fb9ac1048fc2dc, slot#17)
2019-11-24 00:05:04 (34824): Successfully started VM. (PID = '35080')
2019-11-24 00:05:04 (34824): Reporting VM Process ID to BOINC.
2019-11-24 00:05:04 (34824): Guest Log: BIOS: VirtualBox 6.0.14

2019-11-24 00:05:04 (34824): Guest Log: CPUID EDX: 0x178bfbff

2019-11-24 00:05:04 (34824): VM state change detected. (old = 'PoweredOff', new = 'Running')
2019-11-24 00:05:05 (34824): Preference change detected
2019-11-24 00:05:05 (34824): Setting CPU throttle for VM. (100%)
2019-11-24 00:05:07 (34824): Setting checkpoint interval to 900 seconds. (Higher value of (Preference: 180 seconds) or (Vbox_job.xml: 900 seconds))
2019-11-24 00:05:07 (34824): Guest Log: BIOS: Boot : bseqnr=1, bootseq=0032

2019-11-24 00:05:07 (34824): Guest Log: int13_harddisk: function 02, unmapped device for ELDL=80

2019-11-24 00:05:07 (34824): Guest Log: BIOS: Boot from Hard Disk 0 failed

2019-11-24 00:05:07 (34824): Guest Log: BIOS: Boot : bseqnr=2, bootseq=0003

2019-11-24 00:05:07 (34824): Guest Log: BIOS: CDROM boot failure code : 0002

2019-11-24 00:05:07 (34824): Guest Log: BIOS: Boot from CD-ROM failed

2019-11-24 00:05:07 (34824): Guest Log: Could not read from the boot medium! System halted.

2019-11-24 02:07:30 (34824): Status Report: Elapsed Time: '6000.904350'
2019-11-24 02:07:30 (34824): Status Report: CPU Time: '3.453125'
2019-11-24 04:31:52 (34824): Status Report: Elapsed Time: '12001.578341'
2019-11-24 04:31:52 (34824): Status Report: CPU Time: '3.500000'
2019-11-24 06:25:40 (34824): Status Report: Elapsed Time: '18001.578341'
2019-11-24 06:25:40 (34824): Status Report: CPU Time: '3.562500'
2019-11-24 08:07:59 (34824): Status Report: Elapsed Time: '24001.578341'
2019-11-24 08:07:59 (34824): Status Report: CPU Time: '3.578125'
2019-11-24 09:48:35 (34824): Status Report: Elapsed Time: '30001.578341'
2019-11-24 09:48:35 (34824): Status Report: CPU Time: '3.625000'
2019-11-24 11:29:07 (34824): Status Report: Elapsed Time: '36001.578341'
2019-11-24 11:29:07 (34824): Status Report: CPU Time: '3.671875'
2019-11-24 12:41:09 (34824): Stopping VM.
12:41:27 (34824): BOINC client no longer exists - exiting
12:41:27 (34824): timer handler: client dead, exiting
2019-11-24 12:43:02 (17668): Detected: vboxwrapper 26197
2019-11-24 12:43:02 (17668): Detected: BOINC client v7.7
2019-11-24 12:43:03 (17668): Detected: VirtualBox VboxManage Interface (Version: 6.0.14)
2019-11-24 12:43:04 (17668): Starting VM using VBoxManage interface. (boinc_15fb9ac1048fc2dc, slot#17)
2019-11-24 12:43:10 (17668): Successfully started VM. (PID = '17948')
2019-11-24 12:43:10 (17668): Reporting VM Process ID to BOINC.
2019-11-24 12:43:10 (17668): VM state change detected. (old = 'PoweredOff', new = 'Running')
2019-11-24 12:43:10 (17668): Status Report: Elapsed Time: '40306.578341'
2019-11-24 12:43:10 (17668): Status Report: CPU Time: '3.687500'
2019-11-24 12:43:10 (17668): Preference change detected
2019-11-24 12:43:10 (17668): Setting CPU throttle for VM. (100%)
2019-11-24 12:43:10 (17668): Setting checkpoint interval to 900 seconds. (Higher value of (Preference: 180 seconds) or (Vbox_job.xml: 900 seconds))
2019-11-24 14:24:01 (17668): Status Report: Elapsed Time: '46306.578341'
2019-11-24 14:24:01 (17668): Status Report: CPU Time: '6.281250'
2019-11-24 16:04:56 (17668): Status Report: Elapsed Time: '52306.578341'
2019-11-24 16:04:56 (17668): Status Report: CPU Time: '6.281250'
2019-11-24 16:07:16 (17668): VM state change detected. (old = 'Running', new = 'Paused')
2019-11-24 16:07:21 (17668): Stopping VM.
16:07:34 (17668): BOINC client no longer exists - exiting
16:07:34 (17668): timer handler: client dead, exiting
2019-11-24 17:33:04 (12748): Detected: vboxwrapper 26197
2019-11-24 17:33:04 (12748): Detected: BOINC client v7.7
2019-11-24 17:33:05 (12748): Detected: VirtualBox VboxManage Interface (Version: 6.0.14)
2019-11-24 17:33:06 (12748): Starting VM using VBoxManage interface. (boinc_15fb9ac1048fc2dc, slot#17)
2019-11-24 17:33:12 (12748): Successfully started VM. (PID = '10576')
2019-11-24 17:33:12 (12748): Reporting VM Process ID to BOINC.
2019-11-24 17:33:12 (12748): VM state change detected. (old = 'PoweredOff', new = 'Running')
2019-11-24 17:33:12 (12748): Status Report: Elapsed Time: '52442.578341'
2019-11-24 17:33:12 (12748): Status Report: CPU Time: '6.281250'
2019-11-24 17:33:12 (12748): Preference change detected
2019-11-24 17:33:12 (12748): Setting CPU throttle for VM. (100%)
2019-11-24 17:33:12 (12748): Setting checkpoint interval to 900 seconds. (Higher value of (Preference: 180 seconds) or (Vbox_job.xml: 900 seconds))
2019-11-24 19:29:31 (12748): Status Report: Elapsed Time: '58443.312851'
2019-11-24 19:29:31 (12748): Status Report: CPU Time: '8.859375'
2019-11-24 19:54:20 (12748): VM state change detected. (old = 'Running', new = 'Paused')
2019-11-24 19:56:40 (12748): VM state change detected. (old = 'Paused', new = 'Running')
2019-11-24 22:07:28 (12748): Status Report: Elapsed Time: '64443.437390'
2019-11-24 22:07:28 (12748): Status Report: CPU Time: '8.875000'
2019-11-24 22:31:08 (12748): Preference change detected
2019-11-24 22:31:08 (12748): Setting CPU throttle for VM. (100%)
2019-11-24 22:31:10 (12748): Setting checkpoint interval to 900 seconds. (Higher value of (Preference: 180 seconds) or (Vbox_job.xml: 900 seconds))
2019-11-24 23:00:58 (12748): Preference change detected
2019-11-24 23:00:58 (12748): Setting CPU throttle for VM. (100%)
2019-11-24 23:01:03 (12748): Setting checkpoint interval to 900 seconds. (Higher value of (Preference: 180 seconds) or (Vbox_job.xml: 900 seconds))
2019-11-24 23:02:32 (12748): Preference change detected
2019-11-24 23:02:32 (12748): Setting CPU throttle for VM. (100%)
2019-11-24 23:02:33 (12748): Setting checkpoint interval to 900 seconds. (Higher value of (Preference: 180 seconds) or (Vbox_job.xml: 900 seconds))
2019-11-24 23:05:13 (12748): VM state change detected. (old = 'Running', new = 'Paused')
2019-11-24 23:05:20 (12748): Stopping VM.
2019-11-24 23:19:57 (18116): Detected: vboxwrapper 26197
2019-11-24 23:19:57 (18116): Detected: BOINC client v7.7
2019-11-24 23:19:58 (18116): Detected: VirtualBox VboxManage Interface (Version: 6.0.14)
2019-11-24 23:19:59 (18116): Starting VM using VBoxManage interface. (boinc_15fb9ac1048fc2dc, slot#17)
2019-11-24 23:20:03 (18116): Successfully started VM. (PID = '512')
2019-11-24 23:20:03 (18116): Reporting VM Process ID to BOINC.
2019-11-24 23:20:03 (18116): VM state change detected. (old = 'PoweredOff', new = 'Running')
2019-11-24 23:20:03 (18116): Status Report: Elapsed Time: '66767.393921'
2019-11-24 23:20:03 (18116): Status Report: CPU Time: '8.906250'
2019-11-24 23:20:03 (18116): Preference change detected
2019-11-24 23:20:03 (18116): Setting CPU throttle for VM. (100%)
2019-11-24 23:20:04 (18116): Setting checkpoint interval to 900 seconds. (Higher value of (Preference: 180 seconds) or (Vbox_job.xml: 900 seconds))
2019-11-24 23:43:04 (18116): Stopping VM.
23:43:24 (15940): Can't acquire lockfile (32) - waiting 35s
23:43:59 (15940): Can't acquire lockfile (32) - exiting
23:43:59 (15940): Error: The process cannot access the file because it is being used by another process.

(0x20)
2019-11-24 23:48:06 (18116): VM did not stop when requested.
2019-11-24 23:48:06 (18116): VM was NOT successfully terminated.
2019-11-25 18:13:09 (3644): Detected: vboxwrapper 26197
2019-11-25 18:13:09 (3644): Detected: BOINC client v7.7
2019-11-25 18:13:32 (3644): Detected: VirtualBox VboxManage Interface (Version: 6.0.14)
2019-11-25 18:13:33 (3644): Starting VM using VBoxManage interface. (boinc_15fb9ac1048fc2dc, slot#17)
2019-11-25 18:13:39 (3644): Successfully started VM. (PID = '21156')
2019-11-25 18:13:39 (3644): Reporting VM Process ID to BOINC.
2019-11-25 18:13:39 (3644): VM state change detected. (old = 'PoweredOff', new = 'Running')
2019-11-25 18:13:39 (3644): Status Report: Elapsed Time: '68128.393921'
2019-11-25 18:13:39 (3644): Status Report: CPU Time: '10.875000'
2019-11-25 18:13:39 (3644): Preference change detected
2019-11-25 18:13:39 (3644): Setting CPU throttle for VM. (100%)
2019-11-25 18:13:40 (3644): Setting checkpoint interval to 900 seconds. (Higher value of (Preference: 180 seconds) or (Vbox_job.xml: 900 seconds))
2019-11-25 19:27:15 (3644): VM state change detected. (old = 'Running', new = 'Paused')
2019-11-25 19:27:20 (3644): Stopping VM.
2019-11-25 19:32:21 (3644): VM did not stop when requested.
2019-11-25 19:32:21 (3644): VM was NOT successfully terminated.
2019-11-25 21:01:13 (21904): Detected: vboxwrapper 26197
2019-11-25 21:01:13 (21904): Detected: BOINC client v7.7
2019-11-25 21:01:14 (21904): Detected: VirtualBox VboxManage Interface (Version: 6.0.14)
2019-11-25 21:01:15 (21904): Starting VM using VBoxManage interface. (boinc_15fb9ac1048fc2dc, slot#17)
2019-11-25 21:01:20 (21904): Successfully started VM. (PID = '8048')
2019-11-25 21:01:20 (21904): Reporting VM Process ID to BOINC.
2019-11-25 21:01:20 (21904): VM state change detected. (old = 'PoweredOff', new = 'Running')
2019-11-25 21:01:20 (21904): Status Report: Elapsed Time: '72452.393921'
2019-11-25 21:01:20 (21904): Status Report: CPU Time: '12.921875'
2019-11-25 21:01:20 (21904): Preference change detected
2019-11-25 21:01:20 (21904): Setting CPU throttle for VM. (100%)
2019-11-25 21:01:21 (21904): Setting checkpoint interval to 900 seconds. (Higher value of (Preference: 180 seconds) or (Vbox_job.xml: 900 seconds))
2019-11-25 22:43:07 (21904): Status Report: Elapsed Time: '78452.393921'
2019-11-25 22:43:07 (21904): Status Report: CPU Time: '14.953125'
2019-11-26 00:24:59 (21904): Status Report: Elapsed Time: '84452.393921'
2019-11-26 00:24:59 (21904): Status Report: CPU Time: '14.984375'
2019-11-26 00:51:16 (21904): VM state change detected. (old = 'Running', new = 'Paused')
2019-11-26 00:51:20 (21904): Stopping VM.
2019-11-26 00:56:21 (21904): VM did not stop when requested.
2019-11-26 00:56:21 (21904): VM was NOT successfully terminated.
2019-11-26 03:57:10 (24000): Detected: vboxwrapper 26197
2019-11-26 03:57:10 (24000): Detected: BOINC client v7.7
2019-11-26 03:57:12 (24000): Detected: VirtualBox VboxManage Interface (Version: 6.0.14)
2019-11-26 03:57:14 (24000): Starting VM using VBoxManage interface. (boinc_15fb9ac1048fc2dc, slot#17)
2019-11-26 03:57:19 (24000): Successfully started VM. (PID = '22016')
2019-11-26 03:57:19 (24000): Reporting VM Process ID to BOINC.
2019-11-26 03:57:19 (24000): VM state change detected. (old = 'PoweredOff', new = 'Running')
2019-11-26 03:57:19 (24000): Status Report: Elapsed Time: '86002.393921'
2019-11-26 03:57:19 (24000): Status Report: CPU Time: '15.000000'
2019-11-26 03:57:19 (24000): Preference change detected
2019-11-26 03:57:19 (24000): Setting CPU throttle for VM. (100%)
2019-11-26 03:57:20 (24000): Setting checkpoint interval to 900 seconds. (Higher value of (Preference: 180 seconds) or (Vbox_job.xml: 900 seconds))
2019-11-26 05:39:03 (24000): Status Report: Elapsed Time: '92002.393921'
2019-11-26 05:39:03 (24000): Status Report: CPU Time: '16.890625'
2019-11-26 07:20:17 (24000): Status Report: Elapsed Time: '98002.393921'
2019-11-26 07:20:17 (24000): Status Report: CPU Time: '16.890625'
2019-11-26 09:01:30 (24000): Status Report: Elapsed Time: '104002.393921'
2019-11-26 09:01:30 (24000): Status Report: CPU Time: '16.906250'
2019-11-26 10:42:39 (24000): Status Report: Elapsed Time: '110002.393921'
2019-11-26 10:42:39 (24000): Status Report: CPU Time: '16.906250'
2019-11-26 12:09:15 (24000): VM state change detected. (old = 'Running', new = 'Paused')
2019-11-26 20:21:32 (24000): VM state change detected. (old = 'Paused', new = 'Running')
2019-11-26 20:36:37 (24000): Status Report: Elapsed Time: '116002.393921'
2019-11-26 20:36:37 (24000): Status Report: CPU Time: '16.921875'
2019-11-26 22:20:32 (24000): Status Report: Elapsed Time: '122002.393921'
2019-11-26 22:20:32 (24000): Status Report: CPU Time: '16.921875'
2019-11-27 00:03:31 (24000): Status Report: Elapsed Time: '128002.393921'
2019-11-27 00:03:31 (24000): Status Report: CPU Time: '16.937500'
2019-11-27 01:45:17 (24000): Status Report: Elapsed Time: '134002.393921'
2019-11-27 01:45:17 (24000): Status Report: CPU Time: '17.218750'
2019-11-27 03:26:38 (24000): Status Report: Elapsed Time: '140002.393921'
2019-11-27 03:26:38 (24000): Status Report: CPU Time: '17.218750'
2019-11-27 05:08:02 (24000): Status Report: Elapsed Time: '146002.393921'
2019-11-27 05:08:02 (24000): Status Report: CPU Time: '17.218750'
2019-11-27 06:49:23 (24000): Status Report: Elapsed Time: '152002.393921'
2019-11-27 06:49:23 (24000): Status Report: CPU Time: '17.218750'
2019-11-27 08:30:40 (24000): Status Report: Elapsed Time: '158002.393921'
2019-11-27 08:30:40 (24000): Status Report: CPU Time: '17.218750'
2019-11-27 10:11:49 (24000): Status Report: Elapsed Time: '164002.393921'
2019-11-27 10:11:49 (24000): Status Report: CPU Time: '17.218750'
2019-11-27 11:52:53 (24000): Status Report: Elapsed Time: '170002.393921'
2019-11-27 11:52:53 (24000): Status Report: CPU Time: '17.218750'
2019-11-27 13:33:58 (24000): Status Report: Elapsed Time: '176002.393921'
2019-11-27 13:33:58 (24000): Status Report: CPU Time: '17.218750'
2019-11-27 15:15:30 (24000): Status Report: Elapsed Time: '182002.393921'
2019-11-27 15:15:30 (24000): Status Report: CPU Time: '17.218750'
2019-11-27 16:56:59 (24000): Status Report: Elapsed Time: '188002.393921'
2019-11-27 16:56:59 (24000): Status Report: CPU Time: '17.218750'
2019-11-27 18:38:55 (24000): Status Report: Elapsed Time: '194002.393921'
2019-11-27 18:38:55 (24000): Status Report: CPU Time: '17.218750'
2019-11-27 19:45:01 (24000): VM state change detected. (old = 'Running', new = 'Paused')
2019-11-28 08:17:54 (24000): VM state change detected. (old = 'Paused', new = 'Running')
2019-11-28 08:57:04 (24000): Status Report: Elapsed Time: '200002.691839'
2019-11-28 08:57:04 (24000): Status Report: CPU Time: '17.390625'
2019-11-28 10:39:50 (24000): Status Report: Elapsed Time: '206002.691839'
2019-11-28 10:39:50 (24000): Status Report: CPU Time: '17.390625'
2019-11-28 12:21:35 (24000): Status Report: Elapsed Time: '212002.729229'
2019-11-28 12:21:35 (24000): Status Report: CPU Time: '17.406250'
2019-11-28 14:01:56 (24000): Status Report: Elapsed Time: '218002.729229'
2019-11-28 14:01:56 (24000): Status Report: CPU Time: '17.546875'
2019-11-28 14:02:40 (24000): VM state change detected. (old = 'Running', new = 'Paused')
2019-11-28 14:02:44 (24000): Stopping VM.
2019-11-28 14:06:55 (13600): Detected: vboxwrapper 26197
2019-11-28 14:06:55 (13600): Detected: BOINC client v7.7
2019-11-28 14:06:56 (13600): Detected: VirtualBox VboxManage Interface (Version: 6.0.14)
2019-11-28 14:06:57 (13600): Starting VM using VBoxManage interface. (boinc_15fb9ac1048fc2dc, slot#17)
2019-11-28 14:07:01 (13600): Successfully started VM. (PID = '20428')
2019-11-28 14:07:01 (13600): Reporting VM Process ID to BOINC.
2019-11-28 14:07:01 (13600): VM state change detected. (old = 'PoweredOff', new = 'Running')
2019-11-28 14:07:01 (13600): Status Report: Elapsed Time: '218044.729229'
2019-11-28 14:07:01 (13600): Status Report: CPU Time: '17.546875'
2019-11-28 14:07:01 (13600): Preference change detected
2019-11-28 14:07:01 (13600): Setting CPU throttle for VM. (100%)
2019-11-28 14:07:02 (13600): Setting checkpoint interval to 900 seconds. (Higher value of (Preference: 180 seconds) or (Vbox_job.xml: 900 seconds))
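
The "Status Report" pairs in the log above make the stall easy to quantify: between consecutive reports, divide the change in CPU time by the change in elapsed time. The sketch below does that for a few lines copied from this log. The regular expression is written against the line format seen here, and the 5% idle cut-off is an arbitrary assumption, not a project recommendation.

```python
import re

# Matches the two kinds of "Status Report" lines seen in the stderr output above.
STATUS = re.compile(r"Status Report: (Elapsed Time|CPU Time): '([0-9.]+)'")

# Hypothetical cut-off: under 0.05 CPU-seconds per wall-second counts as idle.
IDLE_RATIO = 0.05


def cpu_ratios(log_text):
    """Return [(elapsed_s, cpu_delta / elapsed_delta), ...] for consecutive reports."""
    ratios = []
    elapsed = None
    previous = None  # (elapsed, cpu) from the preceding report
    for kind, value in STATUS.findall(log_text):
        if kind == "Elapsed Time":
            elapsed = float(value)
            continue
        if elapsed is None:
            continue  # CPU line with no elapsed line before it
        cpu = float(value)
        if previous is not None and elapsed > previous[0]:
            ratios.append((elapsed, (cpu - previous[1]) / (elapsed - previous[0])))
        previous = (elapsed, cpu)
    return ratios


SAMPLE = """\
2019-11-24 02:07:30 (34824): Status Report: Elapsed Time: '6000.904350'
2019-11-24 02:07:30 (34824): Status Report: CPU Time: '3.453125'
2019-11-24 04:31:52 (34824): Status Report: Elapsed Time: '12001.578341'
2019-11-24 04:31:52 (34824): Status Report: CPU Time: '3.500000'
2019-11-24 06:25:40 (34824): Status Report: Elapsed Time: '18001.578341'
2019-11-24 06:25:40 (34824): Status Report: CPU Time: '3.562500'
"""

for elapsed_s, ratio in cpu_ratios(SAMPLE):
    flag = "IDLE" if ratio < IDLE_RATIO else "busy"
    print(f"{elapsed_s:>10.0f} s  cpu/wall = {ratio:.6f}  {flag}")
```

Every interval in the sample comes out idle, consistent with the earlier guest log line "Could not read from the boot medium! System halted.": the VM is running but never booted, so it never does any work.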


Stalled at 99.997%.
Shows 0% CPU usage.

Note: BOINC Manager seems to have opened itself two extra times, so three instances were running.
When I went to shut down the client I got another instance; when I went to shut down that one I got a multiple-instances-running error. I shut that down and finally got to the first.

Task name: Aj3KDmsfrsvnsSi4apGgGQJmABFKDmABFKDmychaDmABFKDmu1eJ8n (now aborted)

I have been getting errors on my other ATLAS tasks but did not notice them. I think they are related to the multiple BOINC clients being open.

System restarted. Tasks are running and showing progress, but no CPU time is being reported to BOINC Tasks.

CPU time has stopped at 8 seconds on one task and 2 seconds on another. The tasks are still running.

In the past, this behavior has meant they will stall out in the 90+% range, or crawl at something like 0.0002% per 2-second update in BOINC Tasks.
ID: 40713



©2020 CERN