Pythia8 looooooong runner!
Author | Message |
---|---|
Send message Joined: 13 Jul 05 Posts: 169 Credit: 15,000,737 RAC: 15 |
So far task 265306266 reports ===> [runRivet] Fri Feb 28 02:13:12 UTC 2020 [boinc PbPb heavyion-mb 2760 - - pythia8 8.235 default 100000 42] so it's 40% done: 13436 boinc 39 19 157236 112724 3076 R 88.2 1.4 1944:57 pythia8.exe after 32 hours... |
Send message Joined: 13 Jul 05 Posts: 169 Credit: 15,000,737 RAC: 15 |
Now 89% done, after 70 hours... |
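The two progress reports above (40% after about 32 hours, 89% after 70 hours) invite a rough runtime estimate. A minimal sketch, assuming events are processed at a roughly constant rate (nothing in the thread guarantees that) and that the `top` TIME+ value is in minutes:seconds; the function names are made up for illustration:

```python
def top_time_to_hours(time_plus: str) -> float:
    """Convert a top TIME+ value such as '1944:57' (assumed minutes:seconds) to hours."""
    minutes, seconds = time_plus.split(":")
    return (int(minutes) + float(seconds) / 60.0) / 60.0


def estimate_total_hours(fraction_done: float, hours_elapsed: float) -> float:
    """Linear extrapolation: total runtime if the event rate stays constant."""
    if not 0.0 < fraction_done <= 1.0:
        raise ValueError("fraction_done must be in (0, 1]")
    return hours_elapsed / fraction_done


# CPU time shown by top in the first post, converted to hours
print(round(top_time_to_hours("1944:57"), 1))

# Projected total runtime from each of the two progress reports
print(round(estimate_total_hours(0.40, 32.0), 1))
print(round(estimate_total_hours(0.89, 70.0), 1))
```

Both reports project a total of roughly 80 hours, so this particular task seems to have progressed at a fairly steady rate. That need not hold for other Theory tasks.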
Send message Joined: 13 Jul 05 Posts: 169 Credit: 15,000,737 RAC: 15 |
So task 265306266 finished yesterday after about 80 hrs and I didn't spot any error messages in the log on the occasions that I checked it, up to events processed = about 99700. BOINC reports a success and assigns me 3k credits. On the other hand, the final log reports 02:13:14 GMT +00:00 2020-02-28: cranky-0.0.31: [INFO] ===> [runRivet] Fri Feb 28 02:13:12 UTC 2020 [boinc PbPb heavyion-mb 2760 - - pythia8 8.235 default 100000 42] not zero as usual, and the MC Plots page (updated "2020-03-03 14:02:35") has gone from showing run events attempts success failure lost to run events attempts success failure lost, implying that it thinks the job failed. (OK, so that might not be my specific task.) But, did those 80 CPU hours actually achieve anything useful, or not? |
Send message Joined: 24 Oct 04 Posts: 1172 Credit: 54,757,835 RAC: 14,217 |
YES, yours is Valid. Theory tasks are not a fixed length of time, so we have short ones and long ones. I have had a few of those 3-day tasks in the past (testing the versions). One thing about these Theory tasks is that you can get one that runs for 10 days and then ends as a Computer Error, so I just abort them if they get to 5 days and are still trying to run. BUT I always watch mine start running via the VM Console, since you can see if it has any *Fail* problems right at the start, and then you can watch for them on the last page to see if it is *Failed*; if that happens in either place it will run for days but end up a Computer Error. This is how you want them to be on the VM Console after running |
Send message Joined: 13 Jul 05 Posts: 169 Credit: 15,000,737 RAC: 15 |
"YES yours is Valid and Theory tasks are not a certain length of time so we have short ones and long ones." I understand that different tasks model different physics cases with different codes and so run for different times. My point is that what looks to me and to BOINC like a task that has successfully generated 100k lead-ion events isn't reflected on the MC Plots page even several days later: run events attempts success failure lost. I'm assuming that MC Plots represents the physics "customer" view, so either those 100k events aren't useful for physics, or else (N.B. attempt numbers 10 vs 42) the results take a long time to percolate through. In turn, wouldn't the latter mean that the status reported on MC Plots is meaningless for deciding whether or not to cull tasks locally, as implied in the Sherpa thread? |
Send message Joined: 14 Jan 10 Posts: 1417 Credit: 9,441,051 RAC: 798 |
A valid long runner: https://lhcathome.cern.ch/lhcathome/result.php?resultid=265712521 ===> [runRivet] Wed Mar 4 07:53:35 UTC 2020 [boinc pp jets 7000 40,-,460 - pythia8 8.240 cr1 100000 44] CPU time 2 days 2 hours 3 min 43 sec Peak disk usage 2.45 GB |
Send message Joined: 13 Jul 05 Posts: 169 Credit: 15,000,737 RAC: 15 |
Another valid long runner: 271586073 ===> [runRivet] Wed Apr 15 06:58:20 UTC 2020 [boinc pp jets 7000 40,-,610 - pythia8 8.301 dire-default 100000 2] Run time 5 days 12 hours 19 min 22 sec CPU time 5 days 10 hours 33 min 17 sec Peak disk usage 1.91 MB |
Send message Joined: 15 Nov 14 Posts: 602 Credit: 24,371,321 RAC: 0 |
It is satisfying that the last Theory I will run for a while was also the longest I have ever had, at 3 days 17 hours 27 min 8 sec. It was a pythia8, and I had looked for it on MCplots, but it was not there, so I just let it run. https://lhcathome.cern.ch/lhcathome/result.php?resultid=271822210 |
Send message Joined: 15 Jun 08 Posts: 2531 Credit: 253,722,201 RAC: 41,981 |
"I had looked for it on MCplots, but it was not there" It is a Theory_2378-1045515-2_2, hence can be found here: http://mcplots-dev.cern.ch/production.php?view=runs&rev=2378&display=all You may filter the complete list for: pp jets 7000 25,-,480 - pythia8 8.301 dire-default attempts: 4 success: 2 failure: 0 unknown: 2 |
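The filter string above has the same shape as the middle of the bracketed part of a `[runRivet]` log line, so it can be derived from the log rather than typed by hand. A minimal sketch; the field names are assumptions (the thread only makes clear that the second-to-last number is the event count), and `parse_runrivet` and `mcplots_filter` are hypothetical helpers, not part of any LHC@home tool:

```python
import re
from typing import NamedTuple


class RivetRun(NamedTuple):
    # Field names are assumed for illustration; only nevts is confirmed by the thread.
    mode: str
    beam: str
    process: str
    energy: str
    params: str
    specific: str
    generator: str
    version: str
    tune: str
    nevts: int
    seed: int


def parse_runrivet(line: str) -> RivetRun:
    """Extract the bracketed job description that follows the [runRivet] tag."""
    match = re.search(r"\[runRivet\].*?\[([^\]]+)\]", line)
    if match is None:
        raise ValueError("no [runRivet] job description found")
    fields = match.group(1).split()
    if len(fields) != 11:
        raise ValueError(f"expected 11 fields, got {len(fields)}")
    return RivetRun(*fields[:9], int(fields[9]), int(fields[10]))


def mcplots_filter(run: RivetRun) -> str:
    """Join the fields from beam through tune, the shape of the filter string above."""
    return " ".join(run[1:9])


# Log line quoted earlier in this thread for task 271586073
line = ("===> [runRivet] Wed Apr 15 06:58:20 UTC 2020 "
        "[boinc pp jets 7000 40,-,610 - pythia8 8.301 dire-default 100000 2]")
run = parse_runrivet(line)
print(mcplots_filter(run))
print(run.nevts)
```

The revision to look under (2378 in the post above) comes from the task name, not from the log line, so it still has to be read off separately.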
Send message Joined: 15 Nov 14 Posts: 602 Credit: 24,371,321 RAC: 0 |
"It is a Theory_2378-1045515-2_2" OK, I was searching in Run 2279, which must have been the last one I looked for. |
Send message Joined: 27 Sep 08 Posts: 847 Credit: 691,261,472 RAC: 104,653 |
I have an 8 day one! But it's not going to make the deadline, so probably wasted effort from a BOINC perspective. Theory_2390-1128549-16 pp jets 7000 80,-,1460 - pythia8 8.301 dire-default |
Send message Joined: 13 Jul 05 Posts: 169 Credit: 15,000,737 RAC: 15 |
I spotted 289594409 as it was taking so long: ===> [runRivet] Sat Nov 28 23:15:38 UTC 2020 [boinc PbPb heavyion-mb 2760 - - pythia8 8.230 default 90000 150] Run time 1 days 23 hours 53 min 55 sec CPU time 1 days 23 hours 50 min 55 sec Peak working set size 186.35 MB At least this lead-lead task might actually have succeeded: Container 'runc' finished with status code 0. Meanwhile, elsewhere: 2688018 boinc 39 19 53544 20764 7116 R 96.3 0.3 3561:32 pythia8.exe has reached "34300 events processed" after more than two days... luckily it's only going for 59k events. ===> [runRivet] Sun Nov 29 10:25:39 UTC 2020 [boinc pp jets 7000 150,-,1860 - pythia8 8.301 dire-default 59000 150] |
Send message Joined: 24 Oct 04 Posts: 1172 Credit: 54,757,835 RAC: 14,217 |
It doesn't happen often for me, but once in a while I do get the pythia8's to run between 24 and 35 hours Valid (running Theory Simulation v5.21). One quick example: https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=2933395 "I have an 8 day one! but it's not going to make the deadline so probably wasted effort from a BOINC perspective." pythia8 8.301 dire-default is one you should probably abort since they always run for 10 days and fail (all the ones I have checked) |
Send message Joined: 13 Jul 05 Posts: 169 Credit: 15,000,737 RAC: 15 |
"pythia8 8.301 dire-default is one you should probably abort since they always run for 10 days and fail (all the ones I have checked)" Mine's been updating the log file with believable, if slow, progress: 58100 events processed, so I've let it run. Guess we find out this evening... "dire" does indeed seem to be code for troublesome, though. |
Send message Joined: 13 Jul 05 Posts: 169 Credit: 15,000,737 RAC: 15 |
"pythia8 8.301 dire-default is one you should probably abort since they always run for 10 days and fail (all the ones I have checked)" "Mine's been updating the log file with believable, if slow, progress..." 10:25:41 GMT +00:00 2020-11-29: cranky-0.0.32: [INFO] ===> [runRivet] Sun Nov 29 10:25:39 UTC 2020 [boinc pp jets 7000 150,-,1860 - pythia8 8.301 dire-default 59000 150] 17:34:27 GMT +00:00 2020-12-03: cranky-0.0.32: [INFO] Container 'runc' finished with status code 0. Run time 4 days 7 hours 8 min 53 sec CPU time 4 days 6 hours 37 min 23 sec Credit 3,085.19 Peak working set size 293.17 MB Peak swap size 600.68 MB Peak disk usage 1.86 MB |
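The reported run time can be cross-checked against the two cranky timestamps quoted above. A minimal sketch, assuming every cranky log line starts with `HH:MM:SS GMT +HH:MM YYYY-MM-DD:` as in the two lines shown; `cranky_timestamp` is a hypothetical helper:

```python
import re
from datetime import datetime

# Assumed prefix layout, taken from the two log lines quoted above
_PREFIX = re.compile(r"^(\d{2}:\d{2}:\d{2}) GMT ([+-]\d{2}:\d{2}) (\d{4}-\d{2}-\d{2}):")


def cranky_timestamp(line: str) -> datetime:
    """Parse the leading timestamp of a cranky log line into an aware datetime."""
    match = _PREFIX.match(line)
    if match is None:
        raise ValueError("line does not start with a cranky timestamp")
    clock, offset, date = match.groups()
    # %z accepts offsets written with a colon (Python 3.7 and later)
    return datetime.strptime(f"{clock} {offset} {date}", "%H:%M:%S %z %Y-%m-%d")


start = cranky_timestamp(
    "10:25:41 GMT +00:00 2020-11-29: cranky-0.0.32: [INFO] ===> [runRivet] ..."
)
end = cranky_timestamp(
    "17:34:27 GMT +00:00 2020-12-03: cranky-0.0.32: [INFO] "
    "Container 'runc' finished with status code 0."
)
elapsed = end - start
print(elapsed)  # 4 days, 7:08:46
```

That is within a few seconds of the 4 days 7 hours 8 min 53 sec run time BOINC reports for this task.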
Send message Joined: 24 Oct 04 Posts: 1172 Credit: 54,757,835 RAC: 14,217 |
https://lhcathome.cern.ch/lhcathome/result.php?resultid=289569109 I would say that was pure luck, but after checking yours I see it was a Linux host running the Native version of this particular Theory pythia8 8.301 dire-default, so maybe they work that way, but I have never seen one Valid running on a Windows OS with the regular version of Theory. Maybe they should make sure these are only run on Linux Native. Some people never noticed this problem when they have 100+ cores running 24/7, but I happen to check others when I find mine having the problem, and another member pointed that out a couple of months ago when I was testing them. Example: https://lhcathome.cern.ch/lhcathome/result.php?resultid=288750991 |
Send message Joined: 24 Oct 04 Posts: 1172 Credit: 54,757,835 RAC: 14,217 |
either my isp is running like a snail or this server is trying to........just delete this with your magical powers |
©2024 CERN