1) Message boards : ATLAS application : Unspecified error (0x80004005) - exit code 2147500037 (0x80004005) (Message 49660)
Posted 26 Feb 2024 by greg_be
Post:
There is a script problem with stopping the task
when CVMFS or other files needed in the start phase cannot be mounted.

computezrmle wrote some messages about this problem and tested the bootstrap with the help of CERN IT.
So some tasks must be stopped from our side.
There is no 100% solution.


oh well....better luck next time
2) Message boards : ATLAS application : Unspecified error (0x80004005) - exit code 2147500037 (0x80004005) (Message 49657)
Posted 26 Feb 2024 by greg_be
Post:
RDP = Remote Desktop.
It means using this tool to connect to a PC.
When you google (0x80004005), some causes are listed for Windows.
Alternatively, search for this error on microsoft.com support.



Cool thanks

But another question, why did the task(s) keep running when they could not access the file (at least that seems to be the theory based on what I have found)? One had 4 hrs run time and the other had 10 hrs run time.

And then if you look at the workunit, 2 others after me crashed. https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=220274336 . The first guy has pretty similar error messages to mine in the hypervisor section. His bombed after 23 seconds vs. my hours, and he has Windows 10 as well and an older Vbox version. The third guy was a Linux user and he didn't even get it to start:
<core_client_version>7.7.0</core_client_version>
<![CDATA[
<message>
process got signal 7
</message>
]]>
3) Message boards : ATLAS application : Unspecified error (0x80004005) - exit code 2147500037 (0x80004005) (Message 49650)
Posted 26 Feb 2024 by greg_be
Post:
RDP? No idea what that is.
Home user hardlined into telecom router to the outside world.
It is possible they disrupted my signal for a short second or so, but that would be super rare.
And at two different times while I am home watching tv? Very unlikely.

Command:
VBoxManage -q startvm "boinc_91b24e19d2299ab1" --type headless
Output:
VBoxManage.exe: error: The VM session was aborted
VBoxManage.exe: error: Details: code E_FAIL (0x80004005), component SessionMachine, interface ISession
Waiting for VM "boinc_91b24e19d2299ab1" to power on...
Are you working with RDP, and was your network connection lost?
4) Message boards : ATLAS application : Unspecified error (0x80004005) - exit code 2147500037 (0x80004005) (Message 49648)
Posted 25 Feb 2024 by greg_be
Post:
Win10 Pro.
Not an internet issue; either something internal in Vbox, or the task is buggy for these two.
5) Message boards : ATLAS application : Unspecified error (0x80004005) - exit code 2147500037 (0x80004005) (Message 49644)
Posted 25 Feb 2024 by greg_be
Post:
What is causing this?
Look here https://lhcathome.cern.ch/lhcathome/result.php?resultid=406915906 and https://lhcathome.cern.ch/lhcathome/result.php?resultid=406607573
The first ran almost 10 hours with 0 CPU time; the second ran 4 hrs 7 mins, using 14 hrs 56 mins of CPU time on 4 cores.

I looked at another thread here: https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5905 but I am not sure I understand why the ATLAS vdi file has to be removed, or whether my error is the same one.

Any ideas what's going on?
I've had this error before but never dug into it.

I have completed 7/9 tasks ok.
6) Message boards : Theory Application : Latest errors on tasks (Message 49610)
Posted 21 Feb 2024 by greg_be
Post:
Today they are all clean.
There needs to be a debugger that checks the code before they release these tasks; that was a waste of computing power and time.
7) Message boards : Theory Application : Latest errors on tasks (Message 49606)
Posted 20 Feb 2024 by greg_be
Post:
https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=219902109
194 (0x000000C2) EXIT_ABORTED_BY_CLIENT

Before me: 1 (0x00000001) Unknown error code
Before them: -2135228404 (0x80BB000C) Unknown error code

And then this (I am the first, it is being resent): https://lhcathome.cern.ch/lhcathome/result.php?resultid=406297248 Same thing...aborted by client

Really? Come on.....
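
For anyone comparing the numbers in these threads: the decimal exit codes and the hex values in parentheses are the same 32-bit value, read either as unsigned or as signed. A minimal Python sketch of the conversion follows; the helper names are my own and not part of BOINC or VirtualBox. Note that 0x80BB000C is the value VBoxManage labels VBOX_E_OBJECT_IN_USE in the CMS log further down this page.

```python
def to_unsigned32(value: int) -> int:
    """Keep only the low 32 bits, read as an unsigned integer."""
    return value & 0xFFFFFFFF


def to_signed32(value: int) -> int:
    """Read the low 32 bits as a signed two's-complement integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


# Exit codes quoted in the posts on this page.
for code in (2147500037, -2147467259, -2135228404, 194):
    print(f"{code:>12}  hex=0x{to_unsigned32(code):08X}  signed={to_signed32(code)}")
```

So 2147500037 and -2147467259 are two spellings of 0x80004005 (E_FAIL in the VBoxManage output), and which one you see only depends on whether the reporting tool treats the code as signed.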
8) Message boards : Theory Application : cranky: [ERROR] No output found - SOLVED (Message 49605)
Posted 20 Feb 2024 by greg_be
Post:
Stuff from the 17th is bad as well.
I just errored out on a task from then and before me was another error and the first person aborted.
9) Message boards : Theory Application : Problem of the day (Message 49602)
Posted 19 Feb 2024 by greg_be
Post:
What is this?
2024-02-19 16:20:37 (21232): Setting checkpoint interval to 600 seconds. (Higher value of (Preference: 60 seconds) or (Vbox_job.xml: 600 seconds))
2024-02-19 16:20:45 (21232): Guest Log: 00:57:41.410265 timesync vgsvcTimeSyncWorker: Radical host time change: 2 046 321 000 000ns (HostNow=1 708 356 041 695 000 000 ns HostLast=1 708 353 995 374 000 000 ns)
2024-02-19 16:20:52 (21232): Guest Log: 00:57:51.417034 timesync vgsvcTimeSyncWorker: Radical guest time change: 2 046 347 637 000ns (GuestNow=1 708 356 051 728 222 000 ns GuestLast=1 708 354 005 380 585 000 ns fSetTimeLastLoop=true )
2024-02-19 17:13:25 (21232): Status Report: Job Duration: '864000.000000'
2024-02-19 17:13:25 (21232): Status Report: Elapsed Time: '6004.483085'
2024-02-19 17:13:25 (21232): Status Report: CPU Time: '6404.296875'
2024-02-19 17:36:43 (21232): Guest Log: job: run exitcode=0
2024-02-19 17:36:43 (21232): Guest Log: job: diskusage=4132
2024-02-19 17:36:43 (21232): Guest Log: job: logsize=72 k
2024-02-19 17:36:43 (21232): Guest Log: job: times=
2024-02-19 17:36:43 (21232): Guest Log: 0m0.008s 0m0.012s
2024-02-19 17:36:43 (21232): Guest Log: 128m8.477s 0m45.881s
2024-02-19 17:36:43 (21232): Guest Log: job: cpuusage=7734
2024-02-19 17:36:43 (21232): Guest Log: 17:36:43 CET +01:00 2024-02-19: cranky: [INFO] Container 'runc' finished with status code 0.
2024-02-19 17:36:43 (21232): Guest Log: 17:36:43 CET +01:00 2024-02-19: cranky: [INFO] Preparing output.
2024-02-19 17:36:43 (21232): Guest Log: 17:36:43 CET +01:00 2024-02-19: cranky: [ERROR] No output found.
2024-02-19 17:36:43 (21232): Guest Log: [ERROR] Job Failed
2024-02-19 17:36:43 (21232): Guest Log: [INFO] Shutting Down.
2024-02-19 17:36:43 (21232): VM Completion File Detected.
2024-02-19 17:36:43 (21232): VM Completion Message: Job Failed

Radical host time change???
What burped an hour in to make it fail?

https://lhcathome.cern.ch/lhcathome/result.php?resultid=406272110

You will see it stop, power off, and then restart. I have other projects that I run, so this task's time was up for the moment and another project started in its place. Then it restarts, runs an hour, and dies.
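
To put the "Radical host time change" in plain units, here is a small Python sketch that redoes the arithmetic from the log lines above. The function names are mine, and reading the second `times` line as user plus system CPU time is an assumption based on how the shell `times` builtin normally prints.

```python
def parse_grouped_ns(text: str) -> int:
    """Turn a value like '2 046 321 000 000ns' into an integer of nanoseconds."""
    return int("".join(ch for ch in text if ch.isdigit()))


def parse_min_sec(text: str) -> float:
    """Turn a value like '128m8.477s' into seconds."""
    minutes, _, seconds = text.rstrip("s").partition("m")
    return int(minutes) * 60 + float(seconds)


# Values copied from the timesync log line.
host_now = parse_grouped_ns("1 708 356 041 695 000 000 ns")
host_last = parse_grouped_ns("1 708 353 995 374 000 000 ns")
jump_ns = host_now - host_last

print(f"host time jump: {jump_ns / 1e9:.3f} s (about {jump_ns / 60e9:.1f} minutes)")

# Second line of the guest 'times' output, assumed to be user and system time.
cpu_seconds = parse_min_sec("128m8.477s") + parse_min_sec("0m45.881s")
print(f"guest CPU seconds: {cpu_seconds:.3f}")
```

The jump works out to roughly 34 minutes, which would fit the task sitting suspended while another project ran, although the log itself does not say that. The summed guest CPU time truncates to the cpuusage=7734 value reported in the same log.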
10) Message boards : Theory Application : cranky: [ERROR] No output found - SOLVED (Message 49595)
Posted 17 Feb 2024 by greg_be
Post:
I've suspended this specific project for a bit to let the buggy stuff get out of the queue
11) Message boards : Theory Application : cranky: [ERROR] No output found - SOLVED (Message 49592)
Posted 17 Feb 2024 by greg_be
Post:
Windows 10 failing as well

https://lhcathome.cern.ch/lhcathome/result.php?resultid=406207676
12) Message boards : CMS Application : CMS@Home difficulties in attempts to prepare for multi-core jobs (Message 49583)
Posted 16 Feb 2024 by greg_be
Post:
Question: What caused the problem with Vbox in the first place?
---

Queue is full still from other projects. Maybe tonight there will be something from here.
I'll keep an eye on it.
13) Message boards : CMS Application : CMS@Home difficulties in attempts to prepare for multi-core jobs (Message 49576)
Posted 16 Feb 2024 by greg_be
Post:
Does your VirtualBox Manager show a yellow triangle?
This child error normally comes from that.
This was a clear hint, but you did not look in the right place.
Use VirtualBox Manager. To the right of Tools you see a pin and three small lines.
Select Media and remove CMS_2022_09_07_prod.vdi from the list, but don't delete the disk file itself.



Thanks Crystal, that's something I did not know how to do.
Even Atlas was all lit up in there.
Not much of CMS was lit up, but I cleared it out.
I'm off to work, so I'll check when I get home.
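
For anyone who would rather check this from the command line than through the Media view: `VBoxManage list hdds` prints the registered disks, and a leftover child disk shows its parent's UUID. The Python sketch below parses that listing and reports children of a given .vdi. It assumes the usual "Key: value" layout with a blank line between disks; the sample text, its UUIDs, and the snapshot path are made up for illustration, and the helper names are my own.

```python
import re
import subprocess
from typing import Dict, List


def parse_hdds(listing: str) -> List[Dict[str, str]]:
    """Split 'VBoxManage list hdds' style text into one dict per disk."""
    disks = []
    for block in re.split(r"\n\s*\n", listing.strip()):
        entry = {}
        for line in block.splitlines():
            # Split on the first colon only, so 'D:\...' paths stay intact.
            key, sep, value = line.partition(":")
            if sep:
                entry[key.strip()] = value.strip()
        if entry:
            disks.append(entry)
    return disks


def children_of(disks: List[Dict[str, str]], file_name: str) -> List[Dict[str, str]]:
    """Return disks whose 'Parent UUID' points at a disk ending in file_name."""
    parents = {d["UUID"] for d in disks if d.get("Location", "").endswith(file_name)}
    return [d for d in disks if d.get("Parent UUID") in parents]


def list_hdds() -> str:
    """Run VBoxManage on a real host (not called in this sketch)."""
    return subprocess.run(["VBoxManage", "list", "hdds"],
                          capture_output=True, text=True, check=True).stdout


# Hypothetical listing: UUIDs and the child path are invented for the example.
SAMPLE = r"""
UUID:           11111111-1111-1111-1111-111111111111
Parent UUID:    base
State:          created
Location:       D:\data\projects\lhcathome.cern.ch_lhcathome\CMS_2022_09_07_prod.vdi

UUID:           22222222-2222-2222-2222-222222222222
Parent UUID:    11111111-1111-1111-1111-111111111111
State:          inaccessible
Location:       D:\data\slots\26\example-child.vdi
"""

leftovers = children_of(parse_hdds(SAMPLE), "CMS_2022_09_07_prod.vdi")
for disk in leftovers:
    print(disk["UUID"], disk["Location"])
```

On a real machine you would pass `list_hdds()` instead of `SAMPLE`. Any disk it prints is the kind of child that makes `closemedium` refuse with "because it has 1 child media".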
14) Message boards : CMS Application : CMS@Home difficulties in attempts to prepare for multi-core jobs (Message 49573)
Posted 15 Feb 2024 by greg_be
Post:
Does your VirtualBox Manager show a yellow triangle?
This child error normally comes from that.


No..I just reinstalled it this morning (EU time) as well as the extension pack.
The last successful task was 5 days ago and then everything went to hell.
But that was a Theory task.
Last CMS to complete ok was 9 February and nothing since then has completed.

At around 0830 CET I reinstalled Vbox. I let it delete the previous copy and install a fresh copy.
After Vbox was finished with the install I ran extension manager.
Nothing changed.

I will try a test overnight. I will use Revo Uninstaller to remove Vbox from the system and registry.
I will use Wise365 to clean my system.
I will reinstall Vbox and restart BOINC.

If in the morning there are still problems with CMS, then I don't know what's going on.
15) Message boards : CMS Application : CMS@Home difficulties in attempts to prepare for multi-core jobs (Message 49571)
Posted 15 Feb 2024 by greg_be
Post:
What is this all about?

VBoxManage -q closemedium "D:\data/projects/lhcathome.cern.ch_lhcathome/CMS_2022_09_07_prod.vdi"
Output:
VBoxManage.exe: error: Cannot close medium 'D:\data\projects\lhcathome.cern.ch_lhcathome\CMS_2022_09_07_prod.vdi' because it has 1 child media
VBoxManage.exe: error: Details: code VBOX_E_OBJECT_IN_USE (0x80bb000c), component MediumWrap, interface IMedium, callee IUnknown
VBoxManage.exe: error: Context: "Close()" at line 1875 of file VBoxManageDisk.cpp

2024-02-15 18:53:38 (25156): Could not create VM
2024-02-15 18:53:38 (25156): ERROR: VM failed to start
2024-02-15 18:53:38 (25156): Powering off VM.
2024-02-15 18:53:38 (25156): Deregistering VM. (boinc_418a52e6b5534c75, slot#26)
2024-02-15 18:53:38 (25156): Removing network bandwidth throttle group from VM.
2024-02-15 18:53:39 (25156): Removing VM from VirtualBox.

Every single CMS task I get bombs like this.
I think it's time to take a break from this project until you guys can figure out what's going on.
My RAC is tanked to almost nothing and I have over 60 errors so far between CMS and Theory.
16) Message boards : Theory Application : Buugy workunit (Message 49168)
Posted 10 Jan 2024 by greg_be
Post:
Next guy in line ran: 202,302.98 201,417.20 and aborted.
That's 2 aborts and an error while computing.
Unusual.

My longest run so far: 23,549.14 23,258.41
Most are 6-7,000.
17) Message boards : Theory Application : Buugy workunit (Message 49148)
Posted 6 Jan 2024 by greg_be
Post:
Someone want to have a look at this work unit?

https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=218364700


The first person errored out; I got it next, ran it for 6 days, and said screw it. All kinds of object-not-found errors and other stuff.

Based on this I think the third person will fail as well.
18) Message boards : ATLAS application : Exit code -2147467259 (0x80004005) - is this related to running 2 ATLAS at the same time? (Message 47656)
Posted 6 Jan 2023 by greg_be
Post:
The problem is probably that when you run two or more and shut down your machine, BOINC wants to write those tasks to disk within 30 seconds, but can't.
When those tasks are restarted the next day, one or more of them were not written to disk correctly and can't restart.


Ah ha! That makes more sense.
But also long ago, for some reason, as you might recall, I could not run 2 tasks at the same time continuously without one or both erroring out which is why I switched to 1 by 1.
That seems to have gone away, but then what you describe happens instead.
So for me 1 by 1 is a safe bet with no problems.

Thanks for the explanation.
19) Message boards : ATLAS application : Exit code -2147467259 (0x80004005) - is this related to running 2 ATLAS at the same time? (Message 47644)
Posted 5 Jan 2023 by greg_be
Post:
Well, again, due to the way the system processes these tasks, it is one task at a time. Finish one, download a new one, repeat.
20) Message boards : ATLAS application : Exit code -2147467259 (0x80004005) - is this related to running 2 ATLAS at the same time? (Message 47642)
Posted 4 Jan 2023 by greg_be
Post:
I've been doing ATLAS this way for years.
Only time I have had this problem is trying to run 2 at the same time.
I think that is the cause, just double checking.

