Question: Vbox 6.1.22 vs 6.1.18 effect on tasks
Author | Message |
---|---|
Joined: 28 Dec 08 Posts: 341 Credit: 5,096,087 RAC: 2,428 |
Does anyone know if .22 is causing task failures compared to .18, which seems stable? I upgraded to .22 at the end of the month and did not have much success with tasks. I just downgraded to .18 and I think it will run fine now (fingers crossed). But why does .22 cause trouble while .18 runs OK? Is this limited to my system, or have any of the rest of you had this issue? |
Joined: 27 Sep 08 Posts: 861 Credit: 710,886,256 RAC: 205,631 |
I didn't see any change to my error rate for ATLAS |
Joined: 14 Jan 10 Posts: 1446 Credit: 9,709,772 RAC: 763 |
I've processed ATLAS- and Theory-tasks with VBox v6.1.22 without issues. No CMS-task done yet. |
Joined: 28 Dec 08 Posts: 341 Credit: 5,096,087 RAC: 2,428 |
"I didn't see any change to my error rate for ATLAS" Weird... I can't figure out what my system sees differently from everyone else. I go to .22 and get nothing but crashes. Win 10, fully updated. Ryzen 2700 (not x) with tons of memory. I run FAH in addition to a wide variety of other BOINC projects, but none of this should affect ATLAS. I'm using 1 core instead of 4 or 8. I was running 4, but it still bugged out in .22. Ran 8, same problem. Ran 1 in .22 and still had problems. Now in .18 with 1 core, it is 1:25 into the task (11 seconds computation) according to BoincTasks, using .22% of the core capacity and advancing at .014% per 2-second update interval. Estimated 3:20 left. That would make it a nearly 5-hour task, whereas I used to knock them out in just over 4 hours.
Stderr text:
VBoxManage -q storageattach "boinc_b0a685e70dc1fb64" --storagectl "Hard Disk Controller" --port 0 --device 0 --type hdd --setuuid "" --medium "C:\boinc data\slots\37/vm_image.vdi"
Output:
VBoxManage.exe: error: Medium 'C:\boinc data\slots\37\vm_image.vdi' is not accessible. UUID {5e2342bb-76a4-44ff-81f6-2f3283cde68f} of the medium 'C:\boinc data\slots\37\vm_image.vdi' does not match the value {d7120b97-71c4-46f7-aa18-788f76bbccdf} stored in the media registry ('C:\Users\Greg\.VirtualBox\VirtualBox.xml')
VBoxManage.exe: error: Details: code VBOX_E_INVALID_OBJECT_STATE (0x80bb0007), component MediumWrap, interface IMedium, callee IUnknown
VBoxManage.exe: error: Context: "SetIds(fSetNewUuid, bstrNewUuid.raw(), fSetNewParentUuid, bstrNewParentUuid.raw())" at line 694 of file VBoxManageStorageController.cpp
VBoxManage.exe: error: Failed to set the medium/parent medium UUID
Notes: Another VirtualBox management application has locked the session for this VM. BOINC cannot properly monitor this VM and so this job will be aborted. <<---- But yet it is still running on my system!
2021-05-04 17:14:29 (36364): Could not create VM
2021-05-04 17:14:29 (36364): ERROR: VM failed to start
2021-05-04 17:14:36 (36364): NOTE: VM session lock error encountered. BOINC will be notified that it needs to clean up the environment. This might be a temporary problem and so this job will be rescheduled for another time.
(Weird, because this is a fresh VBox and it shows only 1 task running and nothing else in the system.)
2021-05-04 20:06:32 (8856): VM state change detected. (old = 'PoweredOff', new = 'Running')
2021-05-04 20:06:32 (8856): Preference change detected
2021-05-04 20:06:32 (8856): Setting CPU throttle for VM. (100%)
2021-05-04 20:06:34 (8856): Setting checkpoint interval to 900 seconds. (Higher value of (Preference: 180 seconds) or (Vbox_job.xml: 900 seconds))
But no checkpoints. |
Joined: 27 Sep 08 Posts: 861 Credit: 710,886,256 RAC: 205,631 |
In the past I have seen that after a VBox upgrade all tasks bugged out until the backlog of tasks had flushed out. I think any checkpointed task didn't like restoring its checkpoint in a different version of VBox. You could try to clean out all the checkpoints and/or do a full shutdown of each VM. It could also be that there are some "zombie" VMs that make VBox unhappy; you normally get an error message when you open VBox. |
Joined: 14 Jan 10 Posts: 1446 Credit: 9,709,772 RAC: 763 |
"Weird...I can't figure out what my system sees different from everyone else." I see that you had some valid results with v22 and also errors with v18. It does not seem to be related to the VirtualBox version. |
Joined: 28 Dec 08 Posts: 341 Credit: 5,096,087 RAC: 2,428 |
"Weird...I can't figure out what my system sees different from everyone else." So then what do you guys think is going on? Does a version change with active or queued tasks cause problems? But look: 70+ errors out of how many tasks? A lot of these ran 10+ hours, or a day or two, depending on how busy I was and when I could check them. Bugs in the tasks? I only know bugs from Rosetta, not ATLAS. The current task is chugging away at .010% per 2 seconds and is now at 50% with 2:47 to go. I'm hoping it does not bog down in the last hour of processing. That was another reason I was trying the upgrade: to try to beat the bog-down in the last hour, where progress slows to .002% per 2 seconds and the CPU% goes down to something like .2%. I don't want to jinx things, but if I want to upgrade, then what? Set the project to no new tasks, let everything clear out, and then upgrade? One other thing: what is the difference between running a task on 4 cores and on a single core or 8 cores? That is something else I don't understand. |
Joined: 15 Jun 08 Posts: 2628 Credit: 267,291,999 RAC: 129,263 |
greg_be wrote: "Ryzen 2700 (not x) with tons of memory"
Your computer details page shows 24 GB RAM. That's not "tons of memory", as you configure each ATLAS VM to allocate 14 GB, even for the 1-core tasks:
2021-05-04 17:07:14 (40320): Setting Memory Size for VM. (14000MB)
2021-05-04 17:07:22 (40320): Setting CPU Count for VM. (1)
The BOINC server limits the RAM setting sent along with each task to 10200 MB, and your client uses that value to estimate whether an additional task can be started. You overwrite the RAM allocation for the VM (NOT for BOINC!) via app_config.xml. Depending on the total RAM usage from all processes (not only BOINC), your computer starts to swap and becomes slower and slower. At a certain point the whole system switches to an error handling mode, which can be seen in log entries like this:
2021-05-04 17:09:51 (40320): VM is no longer is a running state. It is in 'GuruMeditation'.
2021-05-04 17:09:51 (40320): VM state change detected. (old = 'Running', new = 'GuruMeditation')
A VBox version change does not crash tasks that are downloaded but not yet started. It may occasionally crash work in progress. Looking at the BOINC client's progress information is useless, as it never shows the real progress from inside the VMs. This has often been explained throughout the forum. |
Joined: 28 Dec 08 Posts: 341 Credit: 5,096,087 RAC: 2,428 |
greg_be wrote: "Ryzen 2700 (not x) with tons of memory" Well, I am running only one task. So 14 from ATLAS (one task only) plus the other projects and web usage makes only 16. That leaves 8 available if needed. So again, if it's not RAM and not the VM, then what is going on? |
Joined: 28 Dec 08 Posts: 341 Credit: 5,096,087 RAC: 2,428 |
greg_be wrote: "Ryzen 2700 (not x) with tons of memory" So my conclusion, then, is to just eliminate app_config and leave the project to do what it needs? It seems to me that I am not helping much by using app_config. But the cores... again, I am confused about the difference between 1, 4, and 8 cores per task. Does allocating more cores increase the computation speed or not? Do more cores need more memory? |
Joined: 27 Sep 08 Posts: 861 Credit: 710,886,256 RAC: 205,631 |
For 1 CPU, 10 GB is tons. I think 3 GB would be fine, but I use 10 GB so it matches BOINC's allocation and I can't go over. |
Joined: 28 Dec 08 Posts: 341 Credit: 5,096,087 RAC: 2,428 |
"For 1 CPU 10GB is tons, I think 3GB would be fine, but I use 10GB so it matches BOINCs allocation and I can't go over." But here is the same question I asked in my other post: what do 1, 4, or 8 CPUs do to the running of the task, and how much memory needs to be allocated for each number of cores? That is information I don't know. |
Joined: 15 Jun 08 Posts: 2628 Credit: 267,291,999 RAC: 129,263 |
The ATLAS RAM formula is: 3000 MB + 900 MB * n, with "n" = the number of cores the VM allocates. This formula hasn't changed for more than 2 years. The scientific app running inside the VM goes through a setup phase on 1 core, and once this is finished it starts n threads to do the event processing. |
Joined: 28 Dec 08 Posts: 341 Credit: 5,096,087 RAC: 2,428 |
"The ATLAS-RAM-formula is:" Thanks for that information. Are the extra cores automatic or based on what you set in your account? Does adding extra cores speed up the process at all? Based on your formula, for 4 cores it needs 15,600 MB of memory? |
Joined: 28 Sep 04 Posts: 759 Credit: 53,689,956 RAC: 42,267 |
"The ATLAS-RAM-formula is:" The number of cores used is what you specify in the preferences on your account, unless you have changed it with an app_config.xml file. The runtime is shorter, but the CPU time should be about the same as for a single-core task. The required memory for a 4-core task is 3000 + (4 * 900) = 6600 MB. |
Joined: 28 Dec 08 Posts: 341 Credit: 5,096,087 RAC: 2,428 |
"The ATLAS-RAM-formula is:" Well, OK... trying no app_config and 4 cores to start, running VBox .18. Fingers crossed! Thanks, everyone, for your input. |
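
For anyone who wants to check the numbers, the RAM formula quoted in the thread (3000 MB + 900 MB * n) can be sketched in a few lines of Python. This is only an illustration: the function name `atlas_vm_ram_mb` and the constant names are made up for this example and are not part of BOINC, VirtualBox, or the ATLAS app. The 10200 MB value is the server-side limit mentioned in the thread.

```python
# Sketch of the RAM formula quoted in the thread: 3000 MB + 900 MB * n.
# All names here are hypothetical; nothing below is a BOINC or VirtualBox API.

BASE_MB = 3000           # fixed part of the formula
PER_CORE_MB = 900        # additional RAM per core the VM allocates
SERVER_LIMIT_MB = 10200  # RAM limit the thread says the BOINC server sends with each task


def atlas_vm_ram_mb(cores: int) -> int:
    """Return the VM RAM in MB given by the quoted formula for `cores` cores."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return BASE_MB + PER_CORE_MB * cores


if __name__ == "__main__":
    # Compare the core counts discussed in the thread against the server limit.
    for n in (1, 4, 8):
        ram = atlas_vm_ram_mb(n)
        status = "within" if ram <= SERVER_LIMIT_MB else "above"
        print(f"{n} core(s): {ram} MB ({status} the {SERVER_LIMIT_MB} MB limit)")
```

By this formula a 1-core task comes to 3900 MB, a 4-core task to 6600 MB, and an 8-core task to 10200 MB, so the 14000 MB set via app_config.xml in the thread is well above what any of these configurations calls for.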
©2025 CERN