Message boards : Theory Application : Task at 100% and still running
| Author | Message |
|---|---|
rilian Joined: 12 Jul 08 Posts: 23 Credit: 941,384 RAC: 15 |
I have 8 tasks that have been running at 100% for between 4 and 9 days:

| Project | % Done | Elapsed | Deadline | Status | Procs | WU name |
|---|---|---|---|---|---|---|
| LHC@home | 100.00% | 9 days 20:01:12 | 26-Apr-2026 02:39:34 | executing | 1 CPU | Theory_2922-4858714-845 |
| LHC@home | 100.00% | 9 days 02:51:15 | 26-Apr-2026 02:39:34 | executing | 1 CPU | Theory_2922-4893018-844 |
| LHC@home | 100.00% | 8 days 11:57:37 | 26-Apr-2026 02:39:34 | executing | 1 CPU | Theory_2922-4784078-845 |
| LHC@home | 100.00% | 7 days 16:37:05 | 26-Apr-2026 02:39:35 | executing | 1 CPU | Theory_2922-4855406-845 |
| LHC@home | 100.00% | 5 days 08:15:02 | 26-Apr-2026 02:39:35 | executing | 1 CPU | Theory_2922-4799558-845 |
| LHC@home | 100.00% | 5 days 09:24:38 | 26-Apr-2026 02:39:34 | executing | 1 CPU | Theory_2922-4832757-845 |
| LHC@home | 100.00% | 5 days 00:38:17 | 26-Apr-2026 02:39:35 | executing | 1 CPU | Theory_2922-4870278-845 |
| LHC@home | 100.00% | 4 days 17:21:01 | 26-Apr-2026 02:39:34 | executing | 1 CPU | Theory_2922-4871537-845 |

Today they were marked in my tasks list as "Timeout - no response", for example https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=240676463
I see that after the timeout the task was picked up by another computer and finished in 30 minutes.
Should I abort them or let them run?
I crunch for Ukraine |
|
Michael E. Joined: 9 Apr 23 Posts: 8 Credit: 108,016 RAC: 1,224 |
I have a similar question, @rilian. I have two LHC Theory tasks that have taken 2 to 9 days so far and are still running. The computer is an older one running Windows 11: https://lhcathome.cern.ch/lhcathome/show_host_detail.php?hostid=11058658
I am using BOINC version 8.2.11 as part of BOINC testing. I found the Theory type by following the Work unit link and then the Input link. Many other tasks have completed within hours, but these two long tasks are:
Sent 19 Apr 15:41, deadline 30 Apr 15:41, type pythia8, https://lhcathome.cern.ch/lhcathome/result.php?resultid=434988958
Sent 24 Apr 13:56, deadline 5 May 13:56, type sherpa, https://lhcathome.cern.ch/lhcathome/result.php?resultid=434988958
The tasks have run for 7 days 8 hours (pythia8) and 2 days 7 hours (sherpa) as of 28 April, 4 AM.
About 20-24 hours ago, I saw LHC disk use grow at an alarming rate, so I rebooted and it seemed better. LHC disk use had reached about 4 GB before the reboot, as shown in BOINC Manager's Disk tab. Last night I increased the BOINC disk quota because other BOINC projects had reported disk space problems in the Event Log. LHC disk usage is now at about 100 MB.
Is there a way to view progress on a Windows system? I saw a tail command mentioned in the message forums, but that seems to be a Linux command. |
|
maeax Joined: 2 May 07 Posts: 2305 Credit: 179,727,092 RAC: 7,030 |
What information is in runRivet.log? Are there any "xxxxx events processed" lines? A small number of Theory tasks are better cancelled; for me that has been up to five in the last few days. Some Sherpa tasks are a special case, because they need days before the first events line is shown. |
|
Crystal Pellet Joined: 14 Jan 10 Posts: 1559 Credit: 10,102,701 RAC: 699 |
In reply to Michael E.'s message of 28 Apr 2026:

> About 20-24 hours ago, I saw LHC disk use grow at an alarming rate so I rebooted ...

When you have Docker Theory tasks running, a reboot will kill those tasks and they will start from scratch after the restart.
The pythia8 task probably will not succeed. Of the attempts by others so far, none has succeeded.
For the sherpa job you provided the same link.

> Is there a way to view progress on a Windows system? I saw a tail command mentioned in message forums but that seems to be a Linux command.

With Windows PowerShell you can use Get-Content with the -Tail parameter in place of tail:
Get-Content BOINC'sDataDir\slots\slotnumber\shared\runRivet.log -tail 8 |
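For anyone who wants the same last-lines view without depending on `tail` or PowerShell, here is a minimal Python sketch. It is not part of BOINC or the Theory app; the function name `tail_lines` is made up for illustration, and the path you pass in is whatever your own data directory and slot number are.

```python
from collections import deque
from pathlib import Path


def tail_lines(path, count=8):
    """Return the last `count` lines of a text file, like `Get-Content -Tail`."""
    # errors="replace" keeps the read from failing on odd bytes in the log.
    with Path(path).open("r", encoding="utf-8", errors="replace") as handle:
        # A bounded deque keeps only the newest `count` lines in memory,
        # which helps because runRivet.log can grow large.
        return [line.rstrip("\n") for line in deque(handle, maxlen=count)]
```

Calling `tail_lines(path_to_runRivet_log, 8)` returns a list of strings that can be printed one per line. The file is read from the start each time, which is simple but slow for very large logs.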
rilian Joined: 12 Jul 08 Posts: 23 Credit: 941,384 RAC: 15 |
In reply to maeax's message of 28 Apr 2026:

> What for info is in runRivet.log?

I have no such file in any of the slots folders of the LHC tasks.
/var/lib/boinc/slots/8$ ls -la
total 77456
drwxrwx--x 2 boinc boinc 4096 Apr 28 00:23 .
drwxrwxr-x 11 boinc boinc 4096 Apr 27 19:27 ..
-rw-r--r-- 1 boinc boinc 0 Apr 16 02:54 boinc_lockfile
-rw-r--r-- 1 boinc boinc 8192 Apr 28 13:45 boinc_mmap_file
-rw-r--r-- 1 boinc boinc 0 Apr 16 02:54 boinc_setup_complete
-rw-r--r-- 1 boinc boinc 499 Apr 28 13:45 boinc_task_state.xml
-rw-r--r-- 1 boinc boinc 474 Apr 16 02:54 Dockerfile
-rwxr-xr-x 1 boinc boinc 1209104 Apr 16 02:54 docker_wrapper
-rwxr-xr-x 1 boinc boinc 28909 Apr 16 02:54 entrypoint.sh
-rw-r--r-- 1 boinc boinc 6077 Apr 28 00:23 init_data.xml
-rw-r--r-- 1 boinc boinc 359841 Apr 16 02:54 input
-rw-r--r-- 1 boinc boinc 148 Apr 16 02:54 job.toml
-rw-r--r-- 1 boinc boinc 77666330 Apr 28 13:45 stderr.txt

Contents of stderr.txt (the last lines keep repeating):
running docker command: ps --all -f "name=boinc__lhcathome.cern.ch_lhcathome__theory_2922-4858714-845_1"
program: podman
command output:
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
af37de79728c localhost/boinc__lhcathome.cern.ch_lhcathome__theory_2922-4858714-845:latest /bin/sh -c ./entr... 18 hours ago Created 0.0.0.0:56657->80/tcp boinc__lhcathome.cern.ch_lhcathome__theory_2922-4858714-845_1
running docker command: stats --no-stream --format "{{.CPUPerc}} {{.MemUsage}}" boinc__lhcathome.cern.ch_lhcathome__theory_2922-4858714-845_1
program: podman
command output:
0.00% 0B / 0B
invalid usage stats; using defaults

Does this indicate anything?
I crunch for Ukraine |
|
Michael E. Joined: 9 Apr 23 Posts: 8 Credit: 108,016 RAC: 1,224 |
Thank you both, @maeax and @Crystal Pellet.
@rilian, I just looked in each slot number directory for a shared directory. If you find a shared directory in .../slot/n/, the large runRivet.log file should be there. I assume the runRivet.log data exists for active tasks only?
Thank you for the PowerShell tip, @Crystal Pellet. Here is the data requested by @maeax.
The last lines of runRivet.log in C:/ProgramData/BOINC/slots/1/shared contain:
PS C:\Users\muser> Get-Content C:/ProgramData/BOINC/slots\1\shared\runRivet.log -tail 12
The sherpa task in slot 2 does not seem to have events yet:
PS C:\Users\muser> Get-Content C:/ProgramData/BOINC/slots\2\shared\runRivet.log -tail 8
I will let the sherpa task run until near the deadline. Should I also let the pythia8 task run for another day? I am suspending other BOINC tasks for now (I also run WCG) so that the two LHC tasks both get closer to 100% CPU time (instead of 2/3).
And yes, I really appreciate the help! |
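Checking every slot by hand gets tedious, so the lookup described here can be sketched in Python. This is only an illustration: `find_rivet_logs` is a hypothetical helper, and the data directory you pass in (for example `C:/ProgramData/BOINC` or `/var/lib/boinc`, the two locations mentioned in this thread) is an assumption about your own installation.

```python
from pathlib import Path


def find_rivet_logs(boinc_data_dir):
    """Map slot directory name -> path of its shared/runRivet.log, if present."""
    slots_dir = Path(boinc_data_dir) / "slots"
    logs = {}
    if not slots_dir.is_dir():
        return logs  # wrong data directory, or no slots yet
    # Only slots that contain shared/runRivet.log are reported; a slot
    # without a shared directory simply does not appear in the result.
    for log_path in sorted(slots_dir.glob("*/shared/runRivet.log")):
        logs[log_path.parent.parent.name] = log_path
    return logs
```

A slot that is missing from the returned dictionary has no runRivet.log, as in the slot 8 listing posted earlier in the thread.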
|
maeax Joined: 2 May 07 Posts: 2305 Credit: 179,727,092 RAC: 7,030 |
In MCPLOTS you can search for this task: http://mcplots-dev.cern.ch/production.php?view=status&plots=hourly#plots |
|
Joined: 13 May 20 Posts: 64 Credit: 3,172,142 RAC: 1,979 |
For your information, Theory tasks on VirtualBox usually last from about 30 minutes to 2 or 3 hours at most. A task that lasts 8 hours is very rare. Two times out of three they finish in under an hour, and they work. I don't know the current state of Theory on Docker, but every time I tried it, it crashed constantly, whether on Linux, Windows 10 or Windows 11. I have never had a Theory task that lasted a day. On average they take 45 minutes. https://lhcathome.cern.ch/lhcathome/results.php?userid=611803&offset=20&show_names=0&state=4&appid= |
Magic Quantum Mechanic Joined: 24 Oct 04 Posts: 1318 Credit: 98,477,163 RAC: 106,294 |
I have had many VirtualBox Theory tasks that ran for over 30 hours, over 50 hours and even over 6 days, and Docker versions that ran for over 50 hours and even over 4 days in total (and I save them, since they tend to be deleted here after a few days).
BTW, the way I watch all of my Theory tasks, with VirtualBox or Docker, is to leave my slots folder open so I can check any of them while they are running and make sure they are actually making progress:
This PC > Windows (C:) > ProgramData > BOINC > slots > 1 > shared
windows 11 |
|
maeax Joined: 2 May 07 Posts: 2305 Credit: 179,727,092 RAC: 7,030 |
runRivet.log currently shows, for one running Theory Docker task, 92000 events processed after 4 days and 11 hours.
===> [runRivet] Fri Apr 24 16:17:49 UTC 2026 [boinc pp jets 13000 180 - pythia8 8.244 CP1-CR1 100000 884]
This task should finish today or tomorrow.
===> [runRivet] Mon Apr 27 08:22:46 UTC 2026 [boinc pp z1j 13000 - - sherpa 2.1.1 default 17000 878]
===> [runRivet] Mon Apr 27 09:42:24 UTC 2026 [boinc pp zinclusive 13000 - - sherpa 2.2.0 default 4000 878]
Those two have now been running for 5 days! |
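The "today or tomorrow" estimate can be reproduced with a little arithmetic. The sketch below is a rough illustration under two labelled assumptions: that the second-to-last field inside the `[boinc ...]` header is the event goal (it is 100000 for the pythia8 task and 17000 and 4000 for the Sherpa ones), and that events keep arriving at a constant rate, which may not hold in practice. The helper names are made up.

```python
import re


def event_goal(header_line):
    """Pull the event goal out of a '[boinc ...]' runRivet header line.

    Assumption: the goal is the second-to-last field inside the brackets.
    """
    match = re.search(r"\[boinc ([^\]]+)\]", header_line)
    if match is None:
        raise ValueError("no [boinc ...] block found")
    return int(match.group(1).split()[-2])


def hours_remaining(events_done, goal, elapsed_hours):
    """Estimate the remaining hours, assuming a constant event rate."""
    rate = events_done / elapsed_hours  # events per hour so far
    return (goal - events_done) / rate


header = ("===> [runRivet] Fri Apr 24 16:17:49 UTC 2026 "
          "[boinc pp jets 13000 180 - pythia8 8.244 CP1-CR1 100000 884]")
goal = event_goal(header)
elapsed = 4 * 24 + 11  # 4 days and 11 hours, taken from the post
print(f"goal: {goal} events")
print(f"estimated hours left: {hours_remaining(92000, goal, elapsed):.1f}")
```

With 92000 of 100000 events done in 107 hours, the remaining 8000 events work out to roughly nine more hours at the same pace. This says nothing about a Sherpa task that has not printed its first events line yet.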
|
Michael E. Joined: 9 Apr 23 Posts: 8 Credit: 108,016 RAC: 1,224 |
> BTW the way I watch all of my Theory tasks with VB or Docker is leaving my Slots page open so I can check any as they are running to make sure they are actually running using

That is helpful!

> runRivet.log show atm for one running Theory-Docker-Task:

Very helpful. The LHC Theory Docker tasks that ran very long on my old PC were also pythia8 and sherpa.
I have a general question. I use 2 Win 11 PCs with Docker, one almost 10 (!) years old and the other about two years old. The long multi-day tasks seem to occur mostly on the older one so far. Is the LHC Theory app optimized for certain instruction sets? If so, I should run most Theory tasks on the newer PC and fewer on the old one. |
|
Crystal Pellet Joined: 14 Jan 10 Posts: 1559 Credit: 10,102,701 RAC: 699 |
In reply to Michael E.'s message of 1 May 2026:

> I have a general question. I use 2 Win 11 PCs with Docker, one almost 10 (!) years old and the other about two years old. The long multi-day tasks seem to occur mostly on the older one so far. Is the LHC Theory app optimized for certain instruction sets? If so, I should run most Theory tasks on the newer PC and fewer on the old one.

No, which Theory tasks you get is totally random. |
|
Joined: 28 Dec 08 Posts: 367 Credit: 6,825,140 RAC: 3,190 |
Here is a Gemini summary based on a bunch of data in runRivet.log, which is in the slot's shared directory. This applies if you are using Docker; I don't know whether the same file is generated with VirtualBox.

---

### **Theory Simulation Analysis: Understanding the "Heavyweight" 100k Tasks**

If you are running **Theory** (formerly Test4Theory) on BOINC and notice a task that seems stuck or is taking days to finish, you’ve likely caught a "Heavyweight" simulation. Here is an explanation of what’s happening under the hood, based on the `runRivet.log` found in your shared folder.

#### **1. The Configuration (The Input)**

This task is a 3.5-day run because of its specific physics parameters. Here is the raw setup for this 8 TeV jet simulation:

| Parameter | Value | Description |
| --- | --- | --- |
| **Generator** | Pythia 8.301 | The C++ "engine" creating the virtual collisions. |
| **Process** | $pp$ / Jets | Simulating proton-proton collisions that create "jets." |
| **Energy** | 8000 GeV | 8 TeV energy scale (standard LHC Run 1). |
| **Params ($p_{\perp\text{min}}$)** | **350** | The "floor" for how violent the collision must be. |
| **Tune** | **CP1-CR1** | A complex CMS model for how particles "reconnect." |
| **Event Goal** | 100,000 | The total number of collisions required. |

---

#### **2. Technical Breakdown: Why this task is a "Pig"**

Not all 100,000-event tasks are equal. Here are the three reasons why this specific configuration is so CPU-intensive:

* **The `params=350` Factor:** This sets a 350 GeV floor for the "transverse momentum." Because this floor is so high, every single event generated is a high-energy "violent" collision. These produce massive showers of secondary particles that the CPU must track and cluster. Unlike "Minimum Bias" tasks that fly through simple collisions, this task does heavy-duty math for every single event.
* **The CP1-CR1 Tune:** This is a CMS-specific configuration. **CR1** stands for **Color Reconnection**, a mathematically intensive model that calculates how the "color strings" between quarks snap and rearrange. At 350 GeV, the number of possible rearrangements is huge, putting a massive load on your processor.
* **The "Selected=12" Analysis:** Inside the log, you will see a line stating `Total histograms unpacked=18 / selected=12`. This means that for **every single collision**, the CPU must run the **Anti-$k_t$ jet algorithm**, a heavy geometric calculation, to group particles into jets. It then checks those jets against 12 different scientific "histograms" (plots) to see where the data fits. You are essentially running 12 analyses simultaneously on every event.

---

#### **3. The "100% Complete" BOINC Glitch**

You may notice the BOINC Manager reporting the task as **100% complete** while the simulation is still crunching. **Don't panic!** BOINC’s progress bar is often based on **estimated time**, not the actual event count. Because a "Pig" task takes much longer than the project average, BOINC's timer runs out and says "100%," but the generator is still working toward the physical goal.

The task will only finish and upload when the internal counter hits **100,000**. You can track the *real* progress at the bottom of your log, where it updates in blocks of 100:

> `Pythia::next(): 83400 events have been generated`
> `83500 events processed`
> `...`

The simulation is only "done" when that counter reaches 100k, regardless of what the BOINC percentage says.

So if you get a pp jets task with params at 350 GeV, you're in for a long crunch. I use an older Ryzen 3700X, so I will not process this as fast as you guys with the latest and greatest. On Michael's computer (the first one, with an i7) it would take 1.5 to 2 days. On the older i5 it would take 3 to 3.5 days of computing time. |
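To compare the log's counter with what BOINC Manager shows, the last "events processed" line can be turned into a fraction. This is a sketch only: it assumes the log lines look like the two quoted above, `real_progress` is a made-up name, and the default goal of 100,000 matches this particular task rather than Theory tasks in general.

```python
import re


def real_progress(log_text, goal=100_000):
    """Return the fraction of events done according to the log, or None."""
    counts = [int(n) for n in re.findall(r"(\d+) events processed", log_text)]
    if not counts:
        return None  # no events line written yet
    return counts[-1] / goal


sample = (
    "Pythia::next(): 83400 events have been generated\n"
    "83500 events processed\n"
)
progress = real_progress(sample)
print("no events yet" if progress is None else f"{progress:.1%} of events done")
```

A result of `None` means no events line has been written yet, which other posts in this thread report for some Sherpa tasks during their first days.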
|
maeax Joined: 2 May 07 Posts: 2305 Credit: 179,727,092 RAC: 7,030 |
Sherpa, oh Sherpa:
Theory_2922-4904088-878_0, work unit 240844448: run time 18 days 2 hours 28 min 35 sec, CPU time 17 days 4 hours 47 min 46 sec.
Theory_2922-4883450-878_0, work unit 240841852: run time 18 days 6 hours 58 min 45 sec, CPU time 17 days 3 hours 39 min 28 sec.
A never-ending story for many years!! |
|
Joined: 28 Dec 08 Posts: 367 Credit: 6,825,140 RAC: 3,190 |
ouch! |
©2026 CERN