Message boards :
Theory Application :
Problem of the day
Author | Message |
---|---|
Joined: 29 Sep 04 Posts: 281 Credit: 11,866,264 RAC: 0 |
https://lhcathome.cern.ch/lhcathome/result.php?resultid=354189019 ===> [runRivet] Wed May 11 07:14:58 UTC 2022 [boinc pp mb-inelastic 2760 - - sherpa 1.2.3 default 100000 254] It ran OK until 22400 events, then did nothing more for 12+ hours, at which point I killed it. It used a full core, but nothing was actually being done. I had another Sherpa 1.x.x task a week or so ago that behaved similarly. I thought the early Sherpa versions were deprecated ages ago? |
Joined: 2 May 07 Posts: 2260 Credit: 175,581,097 RAC: 11,545 |
I saw a download that took hours while running Theory for the first time on a Threadripper today under Win11 Pro! Running ATLAS and Theory together stopped Theory tasks from downloading because of a task limit of 8, even though the LHC preferences are set to no limit. I deleted the app_config I used for ATLAS, and now Theory tasks download OK. |
Joined: 2 May 07 Posts: 2260 Credit: 175,581,097 RAC: 11,545 |
An output.tgz of 17 bytes is not deleted from the Theory slot folder after a Theory task finishes. BOINC is 7.16.11. I am running ATLAS and Theory natively in a CentOS 8 VM. ATLAS has no problems; its folder is empty after the task closes. Once 400 slots are reached, no more tasks run, neither ATLAS nor Theory. |
Joined: 2 May 07 Posts: 2260 Credit: 175,581,097 RAC: 11,545 |
<app> <name>Theory</name> <max_concurrent>40</max_concurrent> </app> On Win11 Pro I get only 20 Theory tasks with this setting. |
Joined: 2 May 07 Posts: 2260 Credit: 175,581,097 RAC: 11,545 |
I'm seeing a .vdi of 2 GB for Theory? (It is 781 MB in most Theory tasks.) |
Joined: 2 May 07 Posts: 2260 Credit: 175,581,097 RAC: 11,545 |
Setting environment... grep: /etc/redhat-release: No such file or directory. I'm seeing this in all Theory tasks on Win11 Pro and Win10 Pro. |
Joined: 2 May 07 Posts: 2260 Credit: 175,581,097 RAC: 11,545 |
396590990 10797673 23 Jul 2023, 4:08:11 UTC 24 Jul 2023, 5:20:31 UTC Error while computing 31,383.27 56.78 --- Theory Simulation v300.07 (vbox64_theory) windows_x86_64; 396640378 10766984 24 Jul 2023, 7:53:29 UTC 4 Aug 2023, 16:01:35 UTC Error while computing 979,582.08 976,997.30 --- Theory Simulation v300.07 (vbox64_theory) windows_x86_64; 397097333 10815119 4 Aug 2023, 10:33:23 UTC 18 Aug 2023, 7:07:57 UTC Aborted 1,191,012.29 1,184,630.00 --- Theory Simulation v300.07 (native_theory) x86_64-pc-linux-gnu; 397672641 10797673 13 Aug 2023, 0:47:01 UTC 14 Aug 2023, 8:09:40 UTC Aborted 82,726.89 81,947.66 --- Theory Simulation v300.07 (vbox64_theory) windows_x86_64; 397724905 9995505 14 Aug 2023, 10:24:48 UTC 24 Aug 2023, 12:59:12 UTC Error while computing 864,760.23 853,816.60 --- Theory Simulation v300.07 (vbox64_theory) windows_x86_64; 398041850 10812982 24 Aug 2023, 17:29:15 UTC 4 Sep 2023, 17:29:15 UTC In progress --- --- --- Theory Simulation v300.07 (vbox64_theory) x86_64-pc-linux-gnu. These Theory tasks need some attention from CERN IT. |
Joined: 2 May 07 Posts: 2260 Credit: 175,581,097 RAC: 11,545 |
2023-09-14 08:42:50 (54888): Guest Log: 2.5.2.0 4077 3 28120 27003 3 1 244127 4096000 0 65024 0 0 n/a 0 0 http://s1cern-cvmfs.openhtc.io/cvmfs/sft.cern.ch http://10.116.178.201:3128 1 2023-09-14 08:42:50 (54888): Guest Log: Probing /cvmfs/grid.cern.ch... Failed! 2023-09-14 08:42:50 (54888): Guest Log: 08:42:48 CEST +02:00 2023-09-14: cranky: [ERROR] 'cvmfs_config probe grid.cern.ch' failed. 80 hours for nothing! 8 tasks at 10 hours each. |
Joined: 28 Sep 04 Posts: 747 Credit: 52,003,543 RAC: 30,070 |
Several tasks have failed lately with "Probing /cvmfs/... Failed!". Unfortunately, the task doesn't detect this and keeps running even though nothing happens in the VM anymore. |
Joined: 2 May 07 Posts: 2260 Credit: 175,581,097 RAC: 11,545 |
I also saw such a faulty task yesterday: a bootstrap problem. Computer ID 10795955. Run time 15 hours 4 min 20 sec; CPU time 2 min 5 sec. |
Joined: 24 Oct 04 Posts: 1193 Credit: 59,160,541 RAC: 67,800 |
I don't run as many at the same time as before, but I like these because we can check the log and see that they are actually running, so that many hours are fine if they are really working. https://lhcathome.cern.ch/lhcathome/result.php?resultid=401733280 Funny: Theory works fine on the same host that suddenly had problems with CMS last week. With that problem, a clean install of the newest VirtualBox and the newest BOINC made no difference, so maybe it was the latest Windows update. I may see if it needs a reinstall of the CMS vdi... later. |
Joined: 8 Jul 08 Posts: 20 Credit: 32,305,738 RAC: 446 |
I'm having a problem with WUs not being downloaded due to "11/19/2023 8:40:03 AM | LHC@home | Not requesting tasks: don't need (CPU: ; NVIDIA GPU: )", even though I ask for "<project_max_concurrent>18</project_max_concurrent>" in app_config.xml and have "use 100%" for both the number of CPUs and CPU time. This is computer ASUS570 ID=10837826, 32GB, AMD 5950X. I get only one WU at a time, when the previous WU ends. I only run Theory VBOX64 on this computer. My other three computers are working fine and work is flowing. What else do I need to check? |
Joined: 2 May 07 Posts: 2260 Credit: 175,581,097 RAC: 11,545 |
Do you have different prefs (home, school, ...)? |
Joined: 8 Jul 08 Posts: 20 Credit: 32,305,738 RAC: 446 |
No. All are set the same: home, school, work, default. There is something strange going on today, though. Only a few tasks are actually making progress. The others are ... stalled? Of the 17 active on my A3900X1, only 3 are showing more than a fraction of a percent of CPU time in my monitor. Those three are each using 100% of a CPU. Peculiar, but I have a meeting to run in a few minutes, so I can't pursue it just now. I haven't looked at the other computers yet. |
Joined: 2 May 07 Posts: 2260 Credit: 175,581,097 RAC: 11,545 |
Your tasks are finishing now. Was there an ISP problem? |
Joined: 8 Jul 08 Posts: 20 Credit: 32,305,738 RAC: 446 |
Perhaps you looked at one of my other computers. 401889030 217094866 19 Nov 2023, 6:11:52 UTC 30 Nov 2023, 6:11:52 UTC In progress --- --- --- Theory Simulation v300.07 (vbox64_theory) windows_x86_64 is the ONLY WU in progress on the computer where I can't get/keep more than 1 WU running. Please take another look. I have had NO ISP problems. Thanks. |
Joined: 14 Jan 10 Posts: 1440 Credit: 9,659,590 RAC: 1,098 |
"is the ONLY WU in process for the computer I can't get/keep more than 1 WU running. Please take another look, and I have had NO ISP problems. Thanks." Set Max # CPUs to "No limit" in your preferences. |
Joined: 8 Jul 08 Posts: 20 Credit: 32,305,738 RAC: 446 |
ALL of my preferences have NO LIMIT for number of jobs and number of CPUs. My other computers are downloading and running 15-20 WUs each, so I know the preferences work. I did a "diff" between this computer's app_config and another computer's. They were identical except that the other (running) computer did NOT have the entry below: <app> <name>Theory</name> </app> <app_version> <app_name>Theory</app_name> <plan_class>vbox64_theory</plan_class> <avg_ncpus>1.0</avg_ncpus> </app_version> After removing this and having BOINC reload the configs, I got 4 new tasks. Is this mis-coded in some way? I am curious! |
Joined: 14 Jan 10 Posts: 1440 Credit: 9,659,590 RAC: 1,098 |
"After removing this, and having BOINC reload configs, I got 4 new tasks. Is this mis-coded in some way? I am curious!" After I posted yesterday, I saw that the machine you mentioned sometimes had 2 tasks in progress for a very short period. Another thing I noticed is that the machine had never run the BOINC benchmarks. You could try running the benchmarks to improve the situation. For that machine the server currently has: Measured floating point speed 1 billion ops/sec, Measured integer speed 1 billion ops/sec. Another machine of yours shows: Measured floating point speed 4.6 billion ops/sec, Measured integer speed 18.89 billion ops/sec. |
Joined: 8 Jul 08 Posts: 20 Credit: 32,305,738 RAC: 446 |
I tried running the benchmarks, but it made no difference. Besides, another of my computers had not had the benchmarks run and it was processing just fine. I did try another thing, though. I drained my (one) WU, cancelled a replacement that had started, and reset the project. Then I removed app_config, adjusted the preferences to only 50% of the CPUs, and restarted. This has worked so far, but I have NO idea why. I plan to follow up on this approach to see what I can see. |
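
For anyone comparing app_config.xml files between machines as in the posts above, a small script can list the limits each file actually sets, which is easier to read than a raw diff. The following is only a minimal sketch in Python. The function name `summarize_app_config` and the helper `_text` are hypothetical. The sample file is assembled from the fragments quoted in this thread and wrapped in an `<app_config>` root element, so it is an assumption and not the poster's real file. The sketch only reports what the file contains; it does not explain why removing the entry changed the number of tasks.

```python
import xml.etree.ElementTree as ET

# Assumed sample: fragments quoted in this thread, wrapped in an
# <app_config> root element. This is NOT the poster's actual file.
SAMPLE = """<app_config>
  <app>
    <name>Theory</name>
  </app>
  <app_version>
    <app_name>Theory</app_name>
    <plan_class>vbox64_theory</plan_class>
    <avg_ncpus>1.0</avg_ncpus>
  </app_version>
  <project_max_concurrent>18</project_max_concurrent>
</app_config>"""


def _text(elem, tag):
    """Return the stripped text of a direct child tag, or None if absent."""
    value = elem.findtext(tag)
    return value.strip() if value is not None else None


def summarize_app_config(xml_text):
    """Collect the concurrency-related settings found in an app_config.xml."""
    root = ET.fromstring(xml_text)
    limit = _text(root, "project_max_concurrent")
    summary = {
        "project_max_concurrent": int(limit) if limit is not None else None,
        "apps": {},          # app name -> max_concurrent (None if not set)
        "app_versions": [],  # one dict per <app_version> entry
    }
    for app in root.findall("app"):
        max_conc = _text(app, "max_concurrent")
        summary["apps"][_text(app, "name")] = (
            int(max_conc) if max_conc is not None else None
        )
    for ver in root.findall("app_version"):
        ncpus = _text(ver, "avg_ncpus")
        summary["app_versions"].append({
            "app_name": _text(ver, "app_name"),
            "plan_class": _text(ver, "plan_class"),
            "avg_ncpus": float(ncpus) if ncpus is not None else None,
        })
    return summary


if __name__ == "__main__":
    print(summarize_app_config(SAMPLE))
```

Running the same function on the files from a working and a non-working host and comparing the two summaries shows which entries differ, for example an `<app>` block that names Theory without setting `<max_concurrent>`.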