1) Message boards : Number crunching : Output File missing and tasks not stopping
Message 53284
Posted 16 days ago by rob
This delightful feature is a result of the duration estimate taking no notice of a major part of the calculation: the "final integration" stage, which can run to hundreds of steps, is not used (or at least appears not to be used) in the initial duration calculation. This may be because the actual number of integration steps is not known at the outset, but that's only an excuse. It would be perfectly possible to assume a sensible number (say 1000 of these steps) at the outset, then use the time taken by successive steps to recalibrate the estimated duration.
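A minimal sketch of that recalibration idea, in Python. This is not how BOINC or the Theory app actually computes its estimate; the class name, the 60-second placeholder step time and the method names are all made up for illustration. Only the "start from 1000 steps, then recalibrate from measured step times" logic comes from the suggestion above.

```python
from dataclasses import dataclass, field


@dataclass
class IntegrationEstimator:
    """Running estimate of the time left in the integration stage (hypothetical)."""

    assumed_steps: int = 1000  # opening guess, used until the real total is known
    step_seconds: list = field(default_factory=list)  # measured time of each finished step

    def record_step(self, seconds: float) -> None:
        """Store how long one completed integration step took."""
        self.step_seconds.append(seconds)

    def remaining_seconds(self, total_steps=None, default_step_seconds=60.0) -> float:
        """Steps still to do multiplied by the mean measured step time.

        default_step_seconds is an arbitrary placeholder used only before
        the first step has been timed.
        """
        total = total_steps if total_steps is not None else self.assumed_steps
        done = len(self.step_seconds)
        mean = sum(self.step_seconds) / done if done else default_step_seconds
        return max(total - done, 0) * mean


est = IntegrationEstimator()
first_guess = est.remaining_seconds()            # nothing measured yet
est.record_step(600.0)
est.record_step(900.0)
recalibrated = est.remaining_seconds(total_steps=760)  # real total now known
```

The estimate is crude at first but improves with every step that completes, which is all the suggestion asks for.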

Another real gotcha is that if for any reason the task is not stopped correctly, the whole calculation restarts from the very first part (this may only apply to tasks running in a VirtualBox VM under Windows). As a result, a task that is really nearing completion, having taken several days to get there, starts again, rattles through the first few stages (they normally take a few minutes), and then apparently stalls at 100% completion while it still has many hundred integration steps to complete, so taking another few days.....
2) Message boards : Theory Application : This gonna be long
Message 53282
Posted 17 days ago by rob
There are several sections to the file.
The first covers setting up, input parameters, temporary files and the like. (This section is about 50 lines long)
Next comes the unpacking of the histograms (after about 50 lines), then lots more configuration stuff (up to about line 100)
More setting up of temporary files and locations
Resolve the process parameters
and now we get to the first "real" section of calculations (matchbox & herwig)
the progress of this stage is recorded in a number of groups of lines that look like
determining subprocesses for p p l l j j
building matrix elements.

0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************

created 350 subprocesses.

Now there's a few lines of results (interim results)
Now comes what is often the biggest section of the file - the Herwig integration
As ever some parameters are listed
Now progress of the integration
First is a line that gives an indication of the number of steps to be completed
Integrate 1 of 760:


Next is a whole raft of groups of lines similar to these:
Integrate 1 of 760:
integrating ME[bbar,bbar->mu+,bbar,bbar,mu-].SubtractedReal : bbar bbar -> mu+ bbar bbar mu- , iteration 1
0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
integrated ( 0.000226617 +/- 4.00611e-05 ) nb
epsilon = 9.47764e-05
---------------------------------------------------
integrating ME[bbar,bbar->mu+,bbar,bbar,mu-].SubtractedReal : bbar bbar -> mu+ bbar bbar mu- , iteration 2
0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
integrated ( 0.000213381 +/- 1.31485e-05 ) nb
epsilon = 0.000474245 chi2 = 0.122347
---------------------------------------------------
integrating ME[bbar,bbar->mu+,bbar,bbar,mu-].SubtractedReal : bbar bbar -> mu+ bbar bbar mu- , iteration 3
0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
integrated ( 0.0002039 +/- 5.61121e-06 ) nb
epsilon = 0.00133473 chi2 = 0.379022
---------------------------------------------------
integrating ME[bbar,bbar->mu+,bbar,bbar,mu-].SubtractedReal : bbar bbar -> mu+ bbar bbar mu- , iteration 4
0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
integrated ( 0.000208295 +/- 4.15852e-06 ) nb
epsilon = 0.000642876 chi2 = 0.706263
---------------------------------------------------


As you can imagine, this last stage can take many hours or days to complete. It is also very difficult to interrupt: unless you stop the task properly, so that it is saved (it's no use just closing the BOINC GUI, you have to stop the virtual task as well), the whole job will start afresh from the very top.
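For anyone who wants to see where a task really is rather than trusting the BOINC progress bar, the "Integrate N of M:" lines described above can be picked out of the log text. A rough Python sketch follows; the function name is made up, and it assumes the lines always have exactly the form quoted in this post.

```python
import re

# Matches lines such as "Integrate 1 of 760:" as quoted above.
STEP_RE = re.compile(r"^\s*Integrate\s+(\d+)\s+of\s+(\d+):")


def latest_integration_step(log_text):
    """Return (current, total) from the last 'Integrate N of M:' line, or None."""
    latest = None
    for line in log_text.splitlines():
        match = STEP_RE.match(line)
        if match:
            latest = (int(match.group(1)), int(match.group(2)))
    return latest


sample = """Integrate 1 of 760:
integrated ( 0.000208295 +/- 4.15852e-06 ) nb
Integrate 2 of 760:
"""
step = latest_integration_step(sample)
if step is not None:
    current, total = step
    print(f"integration step {current} of {total}")
```

The function takes the log contents as a string, so how the log is read out of the running task is left to the reader.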
3) Message boards : Theory Application : Feedback on the Theory docker app
Message 53248
Posted 18 days ago by rob
There may be an issue when you request work from all the sub-projects. I had the same problem until I turned them all off except Theory. Whereupon I got a task that said it would run in a matter of an hour or so, but in reality it is showing 100% completion (BOINC progress) while actually being at about integrate 340 of 760 after just under two days, so the back of my envelope suggests another 2 days at least to completion.

This is the last work I'll be having from LHC until such time as they can clearly demonstrate that the initial estimates are about right.....

(Windows 10, vbox64: https://lhcathome.cern.ch/lhcathome/show_host_detail.php?hostid=10787143)
4) Message boards : Theory Application : Task at 100% and still running
Message 53232
Posted 20 days ago by rob
The progress estimate of these long duration tasks is very misleading. It would appear that a whole section of the calculation is not considered in the duration calculation. For example, my current task reached ~100% in about 4 hours and is now at 15 hours elapsed, but runRivet.log shows that it's at iteration 71 of 760. I assume this very slow phase starts at iteration 1 and continues to iteration 760, so it should be a fairly simple change to include this potentially quite substantial part of the calculation in the expected runtime calculation, and so give a better estimate of the time remaining.
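To put a number on it, here is a back-of-envelope extrapolation in Python using the figures from this post. It assumes (and this is only an assumption) that iteration 71 has just finished and that every iteration costs about the same as the ones already done; the function name is made up.

```python
def remaining_hours(elapsed_h, setup_h, steps_done, steps_total):
    """Linear extrapolation of the integration phase, assuming equal-cost steps."""
    per_step_h = (elapsed_h - setup_h) / steps_done  # hours per finished iteration
    return per_step_h * (steps_total - steps_done)   # hours for the iterations left


# ~4 h to reach "100%", 15 h elapsed in total, iteration 71 of 760
estimate = remaining_hours(elapsed_h=15, setup_h=4, steps_done=71, steps_total=760)
print(f"about {estimate:.0f} hours ({estimate / 24:.1f} days) still to run")
```

On those assumptions the task still has a little over four days to go, against a progress bar that already reads 100%.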
5) Message boards : Number crunching : Didn't resend lost task, work package is marked as canceled
Message 52811
Posted 4 Jan 2026 by rob
Once a task has been marked as "cancelled" it will not be resent to you, or anyone else. There is nothing that you can do about this, nor should you worry about it.

6) Questions and Answers : Getting started : New guy here, and...
Message 52783
Posted 22 Dec 2025 by rob
...and we can now see your computer.
7) Questions and Answers : Getting started : New guy here, and...
Message 52781
Posted 22 Dec 2025 by rob
We are always able to see our own computers, but if hidden others cannot. Yours are still hidden.
To un-hide them you need to go to your account page on the LHC website https://lhcathome.cern.ch/lhcathome/prefs.php?subset=project and make sure that the
Should LHC@home show your computers on its web site?
option is ticked.

With your computers unhidden, others can see their configuration, but they cannot change anything about your computers.
8) Questions and Answers : Windows : still can't get LHC@home to run
Message 52745
Posted 13 Dec 2025 by rob
While you have installed Virtual Box you do not appear to have enabled CPU virtualisation in the BIOS of either of your computers.
9) Message boards : ATLAS application : Hi, I'm receiving Atlas tasks that are supposed to run on 8 cores, but each task is actually using much less than 1 core.
Message 52552
Posted 21 Oct 2025 by rob
Thanks. Recently I've been suffering a similar problem: running tasks that are nearly complete drop from 4 cores to 1 core, and the remaining time extends dramatically from a few minutes to several hours, with the progress barely incrementing (say 0.001% per minute).

So time to update Virtual Box from its current 7.0.14 to whatever the latest (and best?) is. The link you give goes to the 7.0.26 change text page, and has a warning that versions 7.0.x are no longer supported, and there is no (obvious) link to download that version. Checking on the "download" page the current version is 7.2.2, again with no obvious way of downloading historic versions.
So, the questions are:
Is 7.2.2 a "good" version, will it run under Windows 10, and do I download the "Windows Hosts" package, run that and keep my fingers crossed?
10) Message boards : Xtrack/SixTrack : Xtrack (Xboinc)
Message 52324
Posted 25 Sep 2025 by rob
So, if I want to run Xtrack (beta) on a Windows computer, what options should I have set?
Currently I have the following set:
Use CPU and Run test applications, SixTrack, CMS Simulation, Theory Simulation, ATLAS Simulation, XTrack

Which leaves these two not selected:
If no work for selected applications is available, accept work from other applications?, and Run native if available? (Not recommended for Windows)

Out of habit I run a very small cache, and shut down every night - do either of these get in the way?
11) Questions and Answers : Getting started : Please Enable Docker
Message 52274
Posted 23 Sep 2025 by rob
Thanks.
Long run times don't worry me - I recall the early days of CPDN when tasks would run for weeks.....
12) Questions and Answers : Getting started : Please Enable Docker
Message 52272
Posted 22 Sep 2025 by rob
I assume that this line (from your linked page)
Microsoft Windows running on an AMD x86_64 or Intel EM64T CPU 0.04 16 Sep 2025, 12:36:12 UTC 0 GigaFLOPS

indicates that a modern processor running a modern (64-bit) Windows will run an XTrack (beta test) task in native mode. Of course this assumes that one is a member of the beta-test "chosen few", or that such tasks are available. Also, said tasks aren't out in the wild, as the average processing rate is zero.
13) Message boards : CMS Application : CMS Simulation error
Message 51997
Posted 3 Jul 2025 by rob
After sorting out the lack of virtualisation on my PC (not turning it on in the BIOS....) it got a few tasks that got through the "no virtualisation" few seconds, ran for about 20 minutes and then failed. All these "running" tasks appeared to have the same error, right at the top of the Stderr output:

<core_client_version>8.0.2</core_client_version>
<![CDATA[
<message>
The global filename characters, * or ?, are entered incorrectly or too many global filename characters are specified.
(0xd0) - exit code 208 (0xd0)</message>

<stderr_txt>
2025-07-03 08:38:07 (3204): vboxwrapper version 26210
2025-07-03 08:38:07 (3204): BOINC client version: 8.0.2
2025-07-03 08:38:07 (3204): Detected: VirtualBox VboxManage Interface (Version: 7.0.14)
2025-07-03 08:38:08 (3204): Detected: Heartbeat check (file: 'heartbeat' every 1200.000000 seconds)
2025-07-03 08:38:08 (3204): Successfully copied 'init_data.xml' to the shared directory.


The rest of the stderr output looks "fairly normal"....

Hmm, that's a strange one?

Help?!?!?!?!?

[edit to add]
I've set No New Tasks.
14) Questions and Answers : Getting started : Computation Error, Mac
Message 51568
Posted 18 Feb 2025 by rob
With your computers hidden it is difficult for people to try and help you.
15) Message boards : Number crunching : what does "timed out" status mean?
Message 51491
Posted 2 Feb 2025 by rob
The deadline is, for most projects, the time by which the task should be completed, returned and reported to the project. Some projects do give a bit of leeway, but not all.

[edit to add]
Both your recent tasks that failed due to exceeding the deadline were sent out to others within a few minutes of the deadline, one has already been completed, returned and validated.
16) Message boards : Number crunching : what does "timed out" status mean?
Message 51487
Posted 1 Feb 2025 by rob
Exactly as you surmise - your version of that task is dead, and so you might as well abort it.
This message appears when the task has been on your computer beyond its deadline. One way to reduce the possibility of this in the future is to have a small cache, but even this is not guaranteed to work if your PC is slow or only runs BOINC for very few hours a day.
17) Message boards : ATLAS application : All tasks failing
Message 51131
Posted 24 Nov 2024 by rob
I had four "good" tasks on the 22nd Nov, since then all (16) have failed with "validate error" as the headline. Lots of strange messages:
2024-11-24 13:37:11 (7136): Guest Log: *** Starting ATLAS job. (PandaID=6416690328 taskID=42161013) ***
2024-11-24 13:39:31 (7136): Guest Log: *** Job finished ***
2024-11-24 13:39:31 (7136): Guest Log: *** The last 20 lines of the pilot log: ***
2024-11-24 13:39:31 (7136): Guest Log: 2024-11-24 13:39:22,732 | INFO | generated guid for lfn=HITS.42161013._131760.pool.root.1: 45DB498D-73E2-4806-8741-CB186C50CDEB
2024-11-24 13:39:31 (7136): Guest Log: 2024-11-24 13:39:22,732 | WARNING | aborting payload error diagnosis since an error has already been set: [127, 1187]
2024-11-24 13:39:31 (7136): Guest Log: 2024-11-24 13:39:23,775 | INFO | [payload] execute_payloads thread has finished
2024-11-24 13:39:31 (7136): Guest Log: 2024-11-24 13:39:24,235 | INFO | only monitor.control thread still running - safe to abort: ['<_MainThread(MainThread, started 140077397043008)>', '<ExcThread(monitor, started 140077103560448)>']
2024-11-24 13:39:31 (7136): Guest Log: 2024-11-24 13:39:24,490 | WARNING | job_aborted has been set - aborting pilot monitoring
2024-11-24 13:39:31 (7136): Guest Log: 2024-11-24 13:39:24,490 | INFO | [monitor] control thread has ended
2024-11-24 13:39:31 (7136): Guest Log: 2024-11-24 13:39:29,240 | INFO | all workflow threads have been joined
2024-11-24 13:39:31 (7136): Guest Log: 2024-11-24 13:39:29,240 | INFO | end of generic workflow (traces error code: 0)
2024-11-24 13:39:31 (7136): Guest Log: 2024-11-24 13:39:29,241 | INFO | traces error code: 0
2024-11-24 13:39:31 (7136): Guest Log: 2024-11-24 13:39:29,241 | INFO | pilot has finished (exit code=0, shell exit code=0)
2024-11-24 13:39:31 (7136): Guest Log: 2024-11-24 13:39:29,299 [wrapper] ==== pilot stdout END ====
2024-11-24 13:39:31 (7136): Guest Log: 2024-11-24 13:39:29,303 [wrapper] ==== wrapper stdout RESUME ====
2024-11-24 13:39:31 (7136): Guest Log: 2024-11-24 13:39:29,306 [wrapper] pilotpid: 5928
2024-11-24 13:39:31 (7136): Guest Log: 2024-11-24 13:39:29,309 [wrapper] Pilot exit status: 0
2024-11-24 13:39:31 (7136): Guest Log: 2024-11-24 13:39:29,417 [wrapper] pandaids: 6416690328
2024-11-24 13:39:31 (7136): Guest Log: 2024-11-24 13:39:29,442 [wrapper] cleanup supervisor_pilot 5934 5929
2024-11-24 13:39:31 (7136): Guest Log: 2024-11-24 13:39:29,445 [wrapper] Test setup, not cleaning
2024-11-24 13:39:31 (7136): Guest Log: 2024-11-24 13:39:29,450 [wrapper] apfmon messages muted
2024-11-24 13:39:31 (7136): Guest Log: 2024-11-24 13:39:29,453 [wrapper] ==== wrapper stdout END ====
2024-11-24 13:39:31 (7136): Guest Log: 2024-11-24 13:39:29,456 [wrapper] ==== wrapper stderr END ====


Then after a few more lines I get:


2024-11-24 13:39:31 (7136): Guest Log: -rw-r--r--. 1 atlas atlas 10776 Nov 24 13:39 runtime_log.err
2024-11-24 13:39:31 (7136): Guest Log: -rw-------. 1 atlas atlas 636 Nov 24 13:39 4ILLDma7QZ6nsSi4ap6QjLDmwznN0nGgGQJmq4hLDmSMhKDm50VHnm.diag
2024-11-24 13:39:31 (7136): Guest Log: Looking for outputfile HITS.42161013._131760.pool.root.1
2024-11-24 13:39:31 (7136): Guest Log: No HITS file was produced
2024-11-24 13:39:31 (7136): Guest Log: Successfully finished the ATLAS job!
2024-11-24 13:39:31 (7136): Guest Log: Copying the results back to the shared directory!
2024-11-24 13:39:31 (7136): Guest Log: *** Contents of shared directory: ***
2024-11-24 13:39:32 (7136): Guest Log: total 269908
2024-11-24 13:39:32 (7136): Guest Log: -rwxrwxrwx. 1 root root 275766805 Nov 24 13:36 ATLAS.root_0
2024-11-24 13:39:32 (7136): Guest Log: -rwxrwxrwx. 1 root root 9433 Nov 24 13:36 init_data.xml
2024-11-24 13:39:32 (7136): Guest Log: -rwxrwxrwx. 1 root root 499895 Nov 24 12:42 input.tar.gz
2024-11-24 13:39:32 (7136): Guest Log: -rwxrwxrwx. 1 root root 81920 Nov 24 2024 result.tar.gz
2024-11-24 13:39:32 (7136): Guest Log: -rwxrwxrwx. 1 root root 17569 Nov 24 12:42 start_atlas.sh
2024-11-24 13:39:32 (7136): Guest Log: *** Success! Shutting down the machine. ***
2024-11-24 13:39:32 (7136): VM Completion File Detected.
2024-11-24 13:39:32 (7136): Powering off VM.
2024-11-24 13:39:32 (7136): Successfully stopped VM.


and the VM stops in an orderly manner.

(Meanwhile CMS tasks are happily running on the same computer)
18) Message boards : ATLAS application : Last days a lot of validate errors or No Hits file produced
Message 51042
Posted 10 Nov 2024 by rob
No need to disconnect from the project, simply select "no new tasks" and abort any tasks you have. Then sit back and wait until the project staff announce that the problem is solved.
19) Questions and Answers : Windows : I would really like to have some help here this is frusterating adn depressing and is driving me to depression
Message 50745
Posted 8 Oct 2024 by rob
We don't have access to your computer names which makes it difficult to work out which of your computers you are talking about.
That said, if you follow this link https://lhcathome.cern.ch/lhcathome/hosts_user.php you will see your list of computers and be able to work out which is which. After selecting the computer you want to look at you will see there are a few filters. Select the "error" one and you will only see tasks that ended with an error. Then it's simply a case of selecting a task in the list (I'd suggest the most recent one). Near the top of the information is the error number you are looking for, and scrolling down the page shows all the detail of the error (mostly groups of repeating lines which may mean something to others).
20) Questions and Answers : Windows : I would really like to have some help here this is frusterating adn depressing and is driving me to depression
Message 50741
Posted 7 Oct 2024 by rob
Make sure you don't have Windows' own "version" of virtualisation installed. Crudely, there are two "types" of virtualisation, and they don't always work well together.
It's a bit of a pain to make sure that the Windows one (Hyper-V) is not installed - it sometimes gets installed without you doing anything :-( The best thing I can suggest is for you to do a web search for "removing Hyper-V from Windows x" and follow the instructions step by step (it may take several re-boots).


©2026 CERN