Computation Errors
Author | Message |
---|---|
Send message Joined: 15 Nov 09 Posts: 4 Credit: 1,660,744 RAC: 39 |
This has recently started happening (within the last few days). The tasks all end at about 25 seconds, and the log shows the following in each case: 2024-02-02 6:51:06 AM | LHC@home | Output file Theory_2687-2563509-84_0_r635125659_result for task Theory_2687-2563509-84_0 absent Any ideas what is happening? Thanks |
Send message Joined: 2 May 07 Posts: 2228 Credit: 173,796,552 RAC: 18,386 |
Please set a limit for Theory tasks in your preferences until the reason is found. You had successful tasks in the last month. Are there yellow triangles in VirtualBox Manager? |
Send message Joined: 15 Jun 08 Posts: 2520 Credit: 251,910,281 RAC: 128,443 |
A similar VirtualBox error has been described here, together with steps to solve it: https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=6079&postid=49046 Instead of the CMS vdi, you need to clean up the Theory vdi. |
Send message Joined: 15 Nov 09 Posts: 4 Credit: 1,660,744 RAC: 39 |
Thanks for the info, will give that a try. |
Send message Joined: 15 Nov 09 Posts: 4 Credit: 1,660,744 RAC: 39 |
I set my prefs for CMS only and the tasks download and run correctly so the issue seems to be with the Theory tasks. I monitored the startup in Vbox and noticed that the instance would begin to start then it would fail and a cleanup would remove the files. I did capture the vbox trace file to see what was happening and the following error pops up every time an instance tries to start: Command: VBoxManage -q showvminfo "boinc_23c1e4f58830b1c0" --machinereadable Exit Code: -2135228415 Output: VBoxManage.exe: error: Could not find a registered machine named 'boinc_23c1e4f58830b1c0' VBoxManage.exe: error: Details: code VBOX_E_OBJECT_NOT_FOUND (0x80bb0001), component VirtualBoxWrap, interface IVirtualBox, callee IUnknown VBoxManage.exe: error: Context: "FindMachine(Bstr(VMNameOrUuid).raw(), machine.asOutParam())" at line 3139 of file VBoxManageInfo.cpp 2024-02-03 21:54:36 (12296): Command: VBoxManage -q showhdinfo "D:\Program Data\BOINC\slots\0/vm_image.vdi" Exit Code: -2135228412 Output: VBoxManage.exe: error: Could not find file for the medium 'D:\Program Data\BOINC\slots\0\vm_image.vdi' (VERR_FILE_NOT_FOUND) VBoxManage.exe: error: Details: code VBOX_E_FILE_ERROR (0x80bb0004), component MediumWrap, interface IMedium, callee IUnknown VBoxManage.exe: error: Context: "OpenMedium(Bstr(pszFilenameOrUuid).raw(), enmDevType, enmAccessMode, fForceNewUuidOnOpen, pMedium.asOutParam())" at line 205 of file VBoxManageDisk.cpp 2024-02-03 21:54:36 (12296): Command: VBoxManage -q createvm --name "boinc_23c1e4f58830b1c0" --basefolder "D:\Program Data\BOINC\slots\0" --ostype "Linux26_64" --register Exit Code: 0 And later in the log: 2024-02-03 21:54:40 (12296): Command: VBoxManage -q storageattach "boinc_23c1e4f58830b1c0" --storagectl "Hard Disk Controller" --port 0 --device 0 --type hdd --mtype multiattach --medium "D:\Program Data\BOINC/projects/lhcathome.cern.ch_lhcathome/Theory_2023_12_13.vdi" Exit Code: -2135228409 Output: VBoxManage.exe: error: Cannot attach medium 'D:\Program Data\BOINC\projects\lhcathome.cern.ch_lhcathome\Theory_2023_12_13.vdi': the media type 'MultiAttach' can only be attached to machines that were created with VirtualBox 4.0 or later VBoxManage.exe: error: Details: code VBOX_E_INVALID_OBJECT_STATE (0x80bb0007), component SessionMachine, interface IMachine, callee IUnknown VBoxManage.exe: error: Context: "AttachDevice(Bstr(pszCtl).raw(), port, device, DeviceType_HardDisk, pMedium2Mount)" at line 785 of file VBoxManageStorageController.cpp 2024-02-03 21:54:41 (12296): Command: VBoxManage -q closemedium "D:\Program Data\BOINC/projects/lhcathome.cern.ch_lhcathome/Theory_2023_12_13.vdi" Exit Code: -2135228404 Output: VBoxManage.exe: error: Cannot close medium 'D:\Program Data\BOINC\projects\lhcathome.cern.ch_lhcathome\Theory_2023_12_13.vdi' because it has 1 child media VBoxManage.exe: error: Details: code VBOX_E_OBJECT_IN_USE (0x80bb000c), component MediumWrap, interface IMedium, callee IUnknown VBoxManage.exe: error: Context: "Close()" at line 1878 of file VBoxManageDisk.cpp Since I have been doing Theory tasks in the past something must have changed but I have not made any changes to the host. Ideas would be appreciated where to look. Thanks |
Send message Joined: 14 Jan 10 Posts: 1411 Credit: 9,433,926 RAC: 11,615 |
In your logs: "VBoxManage.exe: error: Cannot close medium 'D:\Program Data\BOINC\projects\lhcathome.cern.ch_lhcathome\Theory_2023_12_13.vdi' because it has 1 child media" Use VirtualBox Manager. Go to media and remove the Theory_2023_12_13.vdi description from the list, but keep the file itself. |
Send message Joined: 15 Nov 09 Posts: 4 Credit: 1,660,744 RAC: 39 |
Thank you for the direction. That worked perfectly and I can now process Theory Tasks. Something else to add to the Troubleshooting wallboard lol. |
Send message Joined: 23 Aug 21 Posts: 3 Credit: 2,999,096 RAC: 501 |
Every LHC task is failing at 00:00:08 with a computation error: Tue Apr 30 15:27:54 2024 | LHC@home | Starting task CMS_860485_1714503592.613793_0 Tue Apr 30 15:28:05 2024 | LHC@home | Computation for task CMS_860485_1714503592.613793_0 finished Any idea why this is happening? |
Send message Joined: 14 Jan 10 Posts: 1411 Credit: 9,433,926 RAC: 11,615 |
"Any idea why this is happening?" You have your machine(s) hidden, so we can't see the results. |
Send message Joined: 23 Aug 21 Posts: 3 Credit: 2,999,096 RAC: 501 |
Visible now? |
Send message Joined: 14 Jan 10 Posts: 1411 Credit: 9,433,926 RAC: 11,615 |
Yeah, visible now! Thanks. On one machine I see child media. Those are remnants of virtual machines that should have been cleaned up by BOINC's wrapper but weren't. On the other machine you probably started several CMS tasks at once, and there I see an error I haven't seen before, which may be specific to Darwin OS: 2024-04-30 21:47:48 (27867): Could not set race mitigation lock. 2024-04-30 21:47:48 (27867): Lockname: '/boinc_vboxwrapper_lock_c94b628801c684e7' 2024-04-30 21:47:48 (27867): Error: 63, File name too long 2024-04-30 21:47:48 (27867): Attempts: 1 The easiest way is probably to reset the LHC@home project on both machines to start clean. After the reset, don't request tasks immediately, but first remove the VM remnants using VirtualBox Manager. To the right of Tools you will see a pinned button. Select Media and remove all LHC-related media, and also delete the related files when asked. To start with a first task, select only Theory and only 1 job in your project preferences. For the second problem, it's best to start VM tasks at 1-minute intervals. That problem is solved for Windows and Linux, but it looks like Darwin has a problem with longer file names there. |
Send message Joined: 15 Jun 08 Posts: 2520 Credit: 251,910,281 RAC: 128,443 |
Looks like on Darwin the lock name must not exceed 31 characters while Linux and Windows allow >250. This can't be solved without a new vboxwrapper. |
Send message Joined: 20 Jun 14 Posts: 380 Credit: 238,712 RAC: 0 |
New versions are available for Theory and CMS with the new wrapper. |
Send message Joined: 2 May 07 Posts: 2228 Credit: 173,796,552 RAC: 18,386 |
Thank you, Laurence. For me (Win11 Pro) no problems so far. Good work. |
Send message Joined: 15 Jun 08 Posts: 2520 Credit: 251,910,281 RAC: 128,443 |
Last change was for Apple only. ;-) |
Send message Joined: 2 May 07 Posts: 2228 Credit: 173,796,552 RAC: 18,386 |
70.30 (vbox64_mt_mcore_cms) 29 Apr 2024, 12:37:49 UTC 300.30 (vbox64_theory) 29 Apr 2024, 19:43:57 UTC It's the sight of Virtualbox. |
Send message Joined: 15 Jun 08 Posts: 2520 Credit: 251,910,281 RAC: 128,443 |
Apps for Windows and Linux from earlier this week are running fine. The general announcement can be found here: https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=6149 This discussion is about Computation Errors. The recent ones affect Apple, and even the app published this morning didn't solve them. Taken from https://lhcathome.cern.ch/lhcathome/apps.php: Intel 64-bit Mac OS 10.5 or later 300.40 (vbox64_theory) 3 May 2024, 8:42:28 UTC Most likely by accident it still uses a vboxwrapper that does not include the required patch. |
Send message Joined: 20 Jun 14 Posts: 380 Credit: 238,712 RAC: 0 |
I have bumped the Apple version again using a more recent build of vboxwrapper. |
Send message Joined: 15 Jun 08 Posts: 2520 Credit: 251,910,281 RAC: 128,443 |
At least some of the Theory tasks succeeded now: Intel 64-bit Mac OS 10.5 or later 300.50 (vbox64_theory) 3 May 2024, 13:08:28 UTC 5 GigaFLOPS |
Send message Joined: 15 Jun 08 Posts: 2520 Credit: 251,910,281 RAC: 128,443 |
Meanwhile CMS also receives valid results from Apple hosts: Intel 64-bit Mac OS 10.5 or later 70.50 (vbox64_mt_mcore_cms) 3 May 2024, 13:12:17 UTC 12 GigaFLOPS |
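
For anyone hitting the "has 1 child media" error discussed earlier in the thread, here is a rough sketch of how the leftover child entries could be spotted before removing them in VirtualBox Manager. It assumes the text comes from `VBoxManage list hdds` and that each medium is printed as a block of `Key: value` lines separated by blank lines; the sample listing, its UUIDs and the child path are made up for illustration, and the helper names `parse_media` and `children_of` are hypothetical.

```python
# Made-up sample in the assumed "VBoxManage list hdds" layout (not real output).
SAMPLE = r"""
UUID:           11111111-1111-1111-1111-111111111111
Parent UUID:    base
State:          created
Type:           multiattach
Location:       D:\Program Data\BOINC\projects\lhcathome.cern.ch_lhcathome\Theory_2023_12_13.vdi

UUID:           22222222-2222-2222-2222-222222222222
Parent UUID:    11111111-1111-1111-1111-111111111111
State:          inaccessible
Type:           normal (differencing)
Location:       D:\Program Data\BOINC\slots\0\boinc_23c1e4f58830b1c0\Snapshots\{22222222-2222-2222-2222-222222222222}.vdi
"""


def parse_media(listing):
    """Split the listing into one dict per medium (blank line = new medium)."""
    media, entry = [], {}
    for line in listing.splitlines():
        if not line.strip():
            if entry:
                media.append(entry)
                entry = {}
            continue
        # Split on the first colon only, so "D:\..." paths stay intact.
        key, sep, value = line.partition(":")
        if sep:
            entry[key.strip()] = value.strip()
    if entry:
        media.append(entry)
    return media


def children_of(media, image_name):
    """Return media whose parent is the registered image with this file name."""
    parents = {
        m["UUID"]
        for m in media
        if m.get("Location", "").replace("\\", "/").endswith("/" + image_name)
    }
    return [m for m in media if m.get("Parent UUID") in parents]


if __name__ == "__main__":
    for child in children_of(parse_media(SAMPLE), "Theory_2023_12_13.vdi"):
        print(child["UUID"], child["State"], child["Location"])
```

This only reads and reports; the actual cleanup is still the one described in the thread (remove the entries under Media in VirtualBox Manager, keeping the Theory vdi file itself).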
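
To illustrate the Darwin lock-name problem mentioned in the thread, here is a small sketch that checks a lock name against the 31-character limit suggested above. The limit is taken from the thread's estimate and is not verified here; `lock_name_fits` and `shorten_lock_name` are hypothetical helpers, and the hashing scheme is only one possible workaround, not what the patched vboxwrapper actually does.

```python
import hashlib

# Limit as estimated in the thread ("must not exceed 31 characters"); an assumption.
DARWIN_LOCK_NAME_LIMIT = 31


def lock_name_fits(name, limit=DARWIN_LOCK_NAME_LIMIT):
    """True if the lock name is within the assumed length limit (in bytes)."""
    return len(name.encode("utf-8")) <= limit


def shorten_lock_name(name, limit=DARWIN_LOCK_NAME_LIMIT):
    """Hypothetical workaround: keep the leading '/' and replace the rest
    with a truncated SHA-1 hex digest so the result fits the limit."""
    if lock_name_fits(name, limit):
        return name
    digest = hashlib.sha1(name.encode("utf-8")).hexdigest()
    return "/" + digest[: limit - 1]


if __name__ == "__main__":
    # Lock name copied from the log quoted in the thread.
    original = "/boinc_vboxwrapper_lock_c94b628801c684e7"
    print(len(original), lock_name_fits(original))
    short = shorten_lock_name(original)
    print(len(short), lock_name_fits(short))
```

The name from the log is longer than the assumed limit, which is consistent with the "File name too long" error reported for the Mac host.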