Message boards : Theory Application : This gonna be long
| Author | Message |
|---|---|
|
Joined: 5 Apr 25 Posts: 82 Credit: 2,609,773 RAC: 9,636 |
In reply to computezrmle's message of 2 Apr 2026: "As for your powheg-box" Thanks a lot for the explanations! I'll let them both run and see what happens. I'll remember to be patient with powheg-box tasks in the future.
|
|
Joined: 5 Apr 25 Posts: 82 Credit: 2,609,773 RAC: 9,636 |
The powheg-box task resumed writing to the log and is now at around 60k events, advancing quite fast.
|
|
Joined: 13 Jan 24 Posts: 48 Credit: 9,492,042 RAC: 19,196 |
Here's a sherpa job with characteristics that I can't remember seeing before: https://lhcathome.cern.ch/lhcathome/result.php?resultid=434823962 After about 16 hours of running, runRivet.log is nearly 2 GB in size. Task Properties: Log excerpts: ===> [runRivet] Tue Apr 14 04:51:24 PM UTC 2026 [boinc pp z1j 8000 - - sherpa 2.2.9 default 2000 839] There are more message blocks of that sort than I would care to count. There doesn't seem to be much chance of a successful outcome, so I'll (figuratively) put it out of its misery. Since the number of events is only 2000, it seems that this isn't the first time this has failed. |
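For anyone who wants to pull the event count out of that banner line automatically, here is a minimal Python sketch. It assumes the runspec is the bracketed block starting with `boinc` and that the second-to-last field is the number of events, which matches the "2000" discussed in this post. The function name `parse_runspec` and the dictionary keys are hypothetical, and the meaning of the remaining fields is not confirmed by this thread, so they are returned unparsed.

```python
import re

# Matches the bracketed runspec, e.g. "[boinc pp z1j 8000 - - sherpa 2.2.9 default 2000 839]".
# The leading "[runRivet]" block is skipped because it does not start with "boinc ".
RUNSPEC_RE = re.compile(r"\[(boinc [^\]]+)\]")


def parse_runspec(banner: str) -> dict:
    """Extract generator, version and event count from a runRivet banner line.

    Assumption: fields are whitespace-separated and counted from the end,
    with the event count in the second-to-last position.
    """
    match = RUNSPEC_RE.search(banner)
    if match is None:
        raise ValueError("no runspec found in line")
    fields = match.group(1).split()
    return {
        "generator": fields[-5],    # e.g. "sherpa"
        "version": fields[-4],      # e.g. "2.2.9"
        "events": int(fields[-2]),  # e.g. 2000
        "fields": fields,           # everything, unparsed
    }


banner = ("===> [runRivet] Tue Apr 14 04:51:24 PM UTC 2026 "
          "[boinc pp z1j 8000 - - sherpa 2.2.9 default 2000 839]")
spec = parse_runspec(banner)
print(spec["generator"], spec["version"], spec["events"])
```

Counting fields from the end keeps the sketch independent of how many placeholder `-` entries appear in the middle of the runspec.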
|
Joined: 15 Jun 08 Posts: 2753 Credit: 303,928,827 RAC: 110,786 |
Good decision to cancel that task. According to mcplots, none of them has succeeded so far: run = pp z1j 8000 - - sherpa 2.2.9 default; events = 0; attempts = 56; success = 0; failure = 13; unknown = 43. |
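The mcplots row quoted above can be checked with a few lines of Python. This is only a sketch: it assumes the last five whitespace-separated tokens are the numeric columns (events, attempts, success, failure, unknown) and that everything before them is the run description. The class `McplotsRow` and the function `parse_row` are hypothetical names, not part of any mcplots tool.

```python
from dataclasses import dataclass


@dataclass
class McplotsRow:
    run: str
    events: int
    attempts: int
    success: int
    failure: int
    unknown: int

    @property
    def success_rate(self) -> float:
        """Fraction of attempts that succeeded (0.0 if there were no attempts)."""
        return self.success / self.attempts if self.attempts else 0.0


def parse_row(line: str) -> McplotsRow:
    # Assumption: the five trailing tokens are the numeric columns,
    # the remaining leading tokens form the run description.
    *run_parts, events, attempts, success, failure, unknown = line.split()
    return McplotsRow(" ".join(run_parts), int(events), int(attempts),
                      int(success), int(failure), int(unknown))


row = parse_row("pp z1j 8000 - - sherpa 2.2.9 default 0 56 0 13 43")
print(row.run, row.attempts, row.success_rate)
# For this row the three outcome columns add up to the attempts column.
print(row.success + row.failure + row.unknown == row.attempts)
```

Whether the outcome columns always sum to the attempts column is an assumption; it does hold for the row quoted here.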
|
Joined: 3 Aug 11 Posts: 2 Credit: 773,488 RAC: 9,757 |
Another long runner: Theory_2922-4904088-748_1, https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=240311562 ===> [runRivet] Sat Apr 18 15:21:12 UTC 2026 [boinc pp zinclusive 13000 - - sherpa 2.2.0 default 6000 748] "Timed out - no response" has already been reported, but runRivet.log still shows progress: integration time: ( 3d 22h 53m 5s elapsed / 2900d 1h 8m 52s left ) [14:32:52] 7004.33 pb +- ( 3784.28 pb = 54.0277 % ) 83000000 ( 144430443 -> 57.5 % ) integration time: ( 3d 22h 54m 31s elapsed / 2894d 22h 7m 30s left ) [14:34:19] Is there any scientific value in continuing to run this task, or should I abort it on my host? |
|
Joined: 15 Jun 08 Posts: 2753 Credit: 303,928,827 RAC: 110,786 |
The runspec tells you that #events had already been reduced from 100000 to 6000. Nonetheless, mcplots reports only lost tasks (9 of 9), which indicates that they run much too long. Your log snippet shows an estimated time left of around 2894 days (~7.9 years!) for the integration phase. Even if you let it run that long, mcplots will mark the task as unknown (= lost). |
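The conversion from the log's "time left" to years can be reproduced with a short Python sketch. It assumes the line format shown in the quoted runRivet.log snippet, `( <elapsed> elapsed / <left> left )`, with durations written as combinations of `d`, `h`, `m` and `s`. The helper names are hypothetical, and using 365.25 days per year is a choice made here for illustration.

```python
import re

# Matches "( 3d 22h 54m 31s elapsed / 2894d 22h 7m 30s left )".
LINE_RE = re.compile(
    r"\(\s*(?P<elapsed>[\ddhms\s]+?)\s+elapsed\s*/\s*(?P<left>[\ddhms\s]+?)\s+left\s*\)"
)
UNIT_SECONDS = {"d": 86400, "h": 3600, "m": 60, "s": 1}


def to_seconds(duration: str) -> int:
    """Convert a duration such as '3d 22h 54m 31s' to seconds."""
    return sum(int(n) * UNIT_SECONDS[u]
               for n, u in re.findall(r"(\d+)\s*([dhms])", duration))


def parse_integration_time(line: str) -> tuple[int, int]:
    """Return (elapsed_seconds, left_seconds) from an 'integration time' line."""
    match = LINE_RE.search(line)
    if match is None:
        raise ValueError("not an integration time line")
    return to_seconds(match["elapsed"]), to_seconds(match["left"])


line = "integration time: ( 3d 22h 54m 31s elapsed / 2894d 22h 7m 30s left ) [14:34:19]"
elapsed_s, left_s = parse_integration_time(line)
left_days = left_s / 86400
print(f"left: {left_days:.0f} days, about {left_days / 365.25:.1f} years")
```

Keep in mind that this is only the generator's own estimate for the integration phase; as the two log lines in the earlier post show, it can shift noticeably from one update to the next.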
|
Joined: 28 Dec 08 Posts: 353 Credit: 6,786,362 RAC: 1,468 |
Regarding Theory_2922-xxxxx tasks: the BOINC Manager shows an estimated run time of just over an hour. I just aborted a task at 21 hours of runtime that was at 99.999% and not moving. The clock counter showed an increase in run time each time it checkpointed, but the progress did not increase. This is the second time now. What's going on? One of my aborted tasks had also failed on the hosts that ran it before and after me. |
|
Joined: 28 Sep 04 Posts: 804 Credit: 65,980,234 RAC: 28,290 |
In reply to greg_be's message of 22 Apr 2026: "Regarding Theory_2922-xxxxx tasks: the BOINC Manager shows an estimated run time of just over an hour. I just aborted a task at 21 hours of runtime that was at 99.999% and not moving." The task progress shown in BOINC Manager has nothing to do with the actual task progress, because the progress is not reported from VBox back to BOINC Manager. What you see is BOINC Manager's estimate based on previously run tasks. This can be wildly off because Theory tasks vary so much in run time.
|
|
Joined: 3 Aug 11 Posts: 2 Credit: 773,488 RAC: 9,757 |
In reply to computezrmle's message of 22 Apr 2026: "The runspec tells you that #events had already been reduced from 100000 to 6000." Thanks for the explanation, I will abort the task. One more question: if the #events in a task has already been reduced and the runtime is approaching the BOINC limit of 10 days, should the task be aborted? |
©2026 CERN