New native version v300.08
Author | Message |
---|---|
Send message Joined: 3 Nov 12 Posts: 55 Credit: 138,569,303 RAC: 111,364 |
03:30:54 CET +01:00 2024-01-06: cranky-0.1.4: [INFO] Can't find '/etc/cvmfs/domain.d/cern.ch.local'. https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5594&postid=48539 |
Send message Joined: 12 Jun 18 Posts: 126 Credit: 53,906,164 RAC: 0 |
Will I still be able to run native ATLAS with this new 300.08 Theory configuration? |
Send message Joined: 15 Jun 08 Posts: 2519 Credit: 250,932,647 RAC: 128,073 |
Yes, if ATLAS worked before. Theory does not affect ATLAS. But you should revise your CVMFS setup following the advice here: https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=6075&postid=49145 |
Send message Joined: 2 May 07 Posts: 2220 Credit: 173,696,209 RAC: 24,770 |
"Will I still be able to run native ATLAS with this new 300.08 Theory configuration?" If you have an account on -dev, you can test it there. |
Send message Joined: 12 Jul 11 Posts: 95 Credit: 1,129,876 RAC: 0 |
I have 2 native Theory tasks that have been running for days (2 and 4). BOINC says 100% is reached, yet they are still running and using CPU. Is this possible? Are they still running usefully, and not stalled or dead? |
Send message Joined: 4 Mar 17 Posts: 23 Credit: 10,023,478 RAC: 9,154 |
I have such long-running tasks right now too. One is a Sherpa task (52 hours so far), like yours: mcplots runspec: boinc pp winclusive 7000 10 - sherpa 2.2.5 default 100000 126 https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=218564767 The other has run for 30 hours so far: mcplots runspec: boinc ppbar mb-inelastic 900 - - pythia8 8.306 dire-default 100000 136 https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=218596130 Both long-running tasks have nearly the same name as yours, so I guess they are just long-running experiments. I have had Theory tasks before that ran for days and finished successfully. The only unusual thing about these 2 tasks is that they don't write the number of events done to /var/lib/boinc/slots/*/cernvm/shared/runRivet.log, as all my other tasks have so far. Runtime of recent Theory tasks in hours: average 3.09 (min 0.01, max 238.65). Theory tasks can run very long. |
Send message Joined: 2 May 07 Posts: 2220 Credit: 173,696,209 RAC: 24,770 |
11:27:47 CET +01:00 2024-01-08: cranky-0.1.4: [INFO] Using /cvmfs/cernvm-prod.cern.ch/cvm4 mkdir: cannot create directory '/sys/fs/cgroup/unified': Read-only file system. I changed the properties of /sys/fs/cgroup from read-only to read/write. I get the same message as before in -native, but the task finished with status zero. I am not using a script so far (CentOS 9). https://lhcathome.cern.ch/lhcathome/show_host_detail.php?hostid=10806714 |
Send message Joined: 12 Jul 11 Posts: 95 Credit: 1,129,876 RAC: 0 |
Well, it's not very nice. I realized today that BOINC had stopped without notice yesterday at noon in the VM (I couldn't figure out why). After I restarted BOINC, the 2 tasks were reset: one has not even started and the second has very little computing time now :( |
Send message Joined: 12 Jul 11 Posts: 95 Credit: 1,129,876 RAC: 0 |
I still have them running, always stating that 100% is done, and restarting forever. Now it says only 1 day of calculation, so I guess each time I need to restart the VM they restart from 0, and again they don't end. I decided to abort them; enough CPU cycles wasted! |
Send message Joined: 15 Jun 08 Posts: 2519 Credit: 250,932,647 RAC: 128,073 |
"I guess each time I need to restart the VM they restart from 0" Yes, that's what they do. But it is well known that Theory tasks start from scratch when you restart the computer (or the VM, as in this case). So why don't you let the tasks finish before you restart the VM? Or why do you shut the VM down instead of just suspending and resuming it? Also well known: Theory tasks run between a few minutes (min) and 10 days (max), depending on the task's input data. Locate "runRivet.log" below the worker slot to check how many events a task is configured to process and how many are already done. Together with the time already used, you can estimate the remaining runtime. BOINC is not aware of those numbers and hence presents fake estimates based on averages. This fact has also been discussed many times in this forum. |
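The estimate described above can be sketched in a few lines of Python. This is only an illustration, not part of the Theory application: the function names are made up, the "N events processed" pattern is taken from the log excerpt quoted elsewhere in this thread, and the total number of events is assumed to be read by hand from the task's runspec (for example 100000). It also assumes events are processed at a roughly constant rate, which may not hold for every task.

```python
import re


def last_events_processed(log_text):
    """Return the most recent 'N events processed' count in the log text.

    Assumption: runRivet.log contains lines such as '2500 events processed'.
    Returns None if no such line is present.
    """
    matches = re.findall(r"(\d+) events processed", log_text)
    return int(matches[-1]) if matches else None


def estimate_remaining_seconds(events_done, events_total, elapsed_seconds):
    """Linear extrapolation of the remaining runtime.

    Returns None when no events have been processed yet, because no
    rate can be derived in that case.
    """
    if events_done <= 0:
        return None
    return elapsed_seconds * (events_total - events_done) / events_done


if __name__ == "__main__":
    sample_log = "1000 events processed\n25000 events processed\n"
    done = last_events_processed(sample_log)
    remaining = estimate_remaining_seconds(done, 100000, 3600)
    print(f"{done} events done, roughly {remaining / 3600:.1f} h remaining")
```

A task that has not logged any processed events yet gives no usable rate, so the sketch returns None instead of guessing. |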
Send message Joined: 12 Jul 11 Posts: 95 Credit: 1,129,876 RAC: 0 |
Thanks for the answer. I never stop this VM, but I was trying to run some yoyo tasks on it and they would kill BOINC through memory saturation (OOM killer). That cost me several BOINC restarts before I understood the issue and limited it to 1 concurrent yoyo task; now it seems OK. It is too late, since I already cancelled the 2 tasks, but I'll know this for next time. I had never experienced such long runners, but it had been a long time since I last worked with LHC tasks. |
Send message Joined: 22 Jan 21 Posts: 5 Credit: 270,750 RAC: 1,179 |
While setting up the native version of Theory I got as far as installing CVMFS, but when I get to the command "cvmfs_config setup", Ubuntu tells me that I need root privileges for it. Can anybody help me with that? |
Send message Joined: 15 Jun 08 Posts: 2519 Credit: 250,932,647 RAC: 128,073 |
If you run a command on Linux that requires root privileges, prefix it with "sudo " and enter your password when asked (on Ubuntu's default setup, sudo asks for your own password, not root's). Hence, here run "sudo cvmfs_config setup". Be aware that cvmfs_config allows some subcommands to be run as a normal user, while other subcommands require root privileges. |
Send message Joined: 22 Jan 21 Posts: 5 Credit: 270,750 RAC: 1,179 |
Tried that. It did nothing, just opened a new prompt line. "cvmfs_config probe" afterwards did the same. |
Send message Joined: 15 Jun 08 Posts: 2519 Credit: 250,932,647 RAC: 128,073 |
"It did nothing, ..." It may have done the setup without printing any output. "... 'cvmfs_config probe' afterwards did the same." Your CVMFS configuration may be incomplete. If you have trouble with it, follow the HowTo here and post questions here. Leave this thread for comments/questions related to Theory native v300.08. |
Send message Joined: 17 Aug 17 Posts: 81 Credit: 8,410,301 RAC: 4,238 |
I get the error "Found Sudo-Version 1.9.9. This sudo version is lower than 1.9.10. It does not support regular expressions. Hence, sudoers will not be modified. Error running /tmp/prepare_theory_native_environment" I'm a bit of a Linux n00b; does this mean I will need to wait for software updates before I can run Theory? (Running Linux Mint.) I did have Theory running briefly, but had to reinstall and now it's failing every time. |
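The message treats 1.9.9 as lower than 1.9.10 because version numbers are compared component by component, not as plain text. The following Python sketch illustrates that comparison; it is not the logic of the prepare_theory_native_environment script, and the helper name is hypothetical.

```python
import re


def parse_version(version):
    """Split a version string into a tuple of integers, e.g. '1.9.10' -> (1, 9, 10)."""
    return tuple(int(part) for part in re.findall(r"\d+", version))


found = "1.9.9"
required = "1.9.10"

# Compared as plain strings, '1.9.9' sorts after '1.9.10' because '9' > '1'.
print(found < required)

# Compared component by component, 9 < 10, so 1.9.9 is the older version.
print(parse_version(found) < parse_version(required))
```

The string comparison and the numeric comparison disagree here, which is why version checks normally split the string into numbers first. |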
Send message Joined: 2 May 07 Posts: 2220 Credit: 173,696,209 RAC: 24,770 |
I have one task in -native that runs without ever finishing. It has started from the beginning more than three times: ppbar mb-inelastic 1800 - - pythia8 8.303 dire-default 0 2 0 0 2 https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=219290397
-------- PYTHIA Event Listing (hard process) --------
no  id     name     status  mothers  daughters  colours   p_x     p_y     p_z      e         m
0   90     (system) -11     0 0      0 0        0 0       0.000   0.000   0.000    1800.000  1800.000
1   2212   (p+)     -12     0 0      3 0        0 0       0.000   0.000   900.000  900.000   0.938
2   -2212  (pbar-)  -12     0 0      4 0        0 0       0.000   0.000   -900.000 900.000   0.938
3   21     (g)      -21     1 0      5 6        101 102   0.000   0.000   2.291    2.291     0.000
4   21     (g)      -21     2 0      5 6        102 103   0.000   0.000   -10.836  10.836    0.000
5   21     g        23      3 4      0 0        101 104   -1.281  -2.649  1.025    3.116     0.000
6   21     g        23      3 4      0 0        104 103   1.281   2.649   -9.570   10.012    0.000
Charge sum: 0.000 Momentum sum: 0.000 0.000 -8.545 13.127 9.966
-------- End PYTHIA Event Listing --------
Rivet.AnalysisHandler: INFO Only using nominal weight. Variation weights will be ignored. 0 events processed |
Send message Joined: 4 Mar 11 Posts: 27 Credit: 3,842,802 RAC: 909 |
There is something strange going on with the initial estimated run time vs. the actual run time of some, if not all, of these tasks. On my PC the initial estimated run time is 61.5 minutes. During the first 60 minutes of running, the elapsed time increments at about 1 second per second of clock time and stays at that rate; it is only a couple of seconds out after an hour. However, the remaining time drops by only about 15 seconds, to 60.25 minutes. At 61.5 elapsed minutes the remaining time jumps to 9 days, 23 hours and 46 minutes. As a result, my computer downloads a number of tasks that are initially predicted to finish by the deadline, but after the "1 hour" adjustment the majority will not even be started, never mind finished, by the deadline (10 days). This is potentially a rather unproductive waste of bandwidth, not to mention a frustration for me. It will be interesting to see what the actual run time of these tasks is. |
Send message Joined: 2 May 07 Posts: 2220 Credit: 173,696,209 RAC: 24,770 |
Not all Theory tasks finish within a few hours, for example when you get a Sherpa task. We are seeing some that end in a crash after 10 days. So we have to manage the duration ourselves. What you can do is use an app_config.xml to control the number of input Theory tasks. mcplots shows how many tasks have been done. |
Send message Joined: 15 Jun 08 Posts: 2519 Credit: 250,932,647 RAC: 128,073 |
BOINC's runtime estimation (like the credit calculation) is not really good when real runtimes are highly variable. In the case of Theory tasks, the runtime can be anywhere between a few seconds and 10 days. At the moment it looks like the runtimes of many tasks are much longer than usual (a couple of days), while some weeks ago many were much shorter. BOINC usually needs a couple of days, sometimes even weeks, to catch up and adjust the average. The only thing that helps is to slightly modify BOINC's work buffer size. Although a myth claiming otherwise has come up every now and then for years, app_config.xml does not support a parameter that limits the number of tasks a project server sends. |
©2024 CERN