1) Message boards : ATLAS application : Atlas runs very slowly after 94% (Message 36689)
Posted 12 Sep 2018 by Profile thomasroderick
Post:
If you search through the threads, there are previous discussions on this. It frequently occurs on my machine as well. The Task will be moving along just fine until it gets closer to completion, usually somewhere around 80% complete on my machine, and then the Task will extend. For example, a Task estimated to take 6 hours, when it gets to about 4 hours elapsed (and 80% complete), will continue running for the next several hours instead of the estimated 2 hours. I currently have one running that was estimated at 6 hours, and as I type it shows Progress: 80.64%, Elapsed: 11 h 34 m, Remaining: 2 h 46 m. Anytime I see this, it usually means there are at least another 5 to 6 hours before it will finish.
For me... this is common. I only run one task at a time, so the machine is committed to working on only this one task. Like I said, my opinion is that you are not experiencing anything unusual or special... it simply happens.
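
As a side note, the "Remaining" figure quoted above is consistent with a plain linear extrapolation from elapsed time and fraction done. The sketch below only illustrates that arithmetic. The function name is made up for this post, and whether BOINC computes its estimate exactly this way is an assumption on my part.

```python
def linear_remaining_minutes(elapsed_minutes: float, fraction_done: float) -> float:
    """Remaining time if progress continued at the average rate seen so far.

    Hypothetical helper for illustration only, not a BOINC function.
    """
    if not 0.0 < fraction_done <= 1.0:
        raise ValueError("fraction_done must be in (0, 1]")
    projected_total = elapsed_minutes / fraction_done
    return projected_total - elapsed_minutes


# Figures from the post: Progress 80.64%, Elapsed 11 h 34 m.
elapsed = 11 * 60 + 34
remaining = linear_remaining_minutes(elapsed, 0.8064)
hours, minutes = divmod(int(remaining), 60)
print(f"{hours} h {minutes} m")  # 2 h 46 m
```

If that is roughly how the estimate is formed, it assumes the rest of the task runs at the average rate so far, which would explain why it stays optimistic once a task slows down near the end.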

- Tom.
2) Message boards : ATLAS application : Estimates and long running tasks (Message 35359)
Posted 23 May 2018 by Profile thomasroderick
Post:
My apologies if this is already somewhere in the message boards; please point me there if possible. For the past few weeks I have been seeing inaccurate run time estimates, which have been causing Tasks to be aborted. For example, my computer will download 4 tasks, each with an estimated 5h30m remaining run time. The first task will begin, and stay on track until it hits about 90% complete with about 30m remaining on the time. That last 10% will take about 10h-12h to complete. Why does the task slow down so much once it hits 90%? A 5h estimate turns into a 15-18h task. The subsequent issue is that the other tasks that have downloaded (and perform in the same manner) usually have to be aborted because they have been distributed to someone else.

Here is another example, which is quite extreme:
Task: 191622299
WU: 93613125
Computer: 10481956

This one started as a 5h50m estimate, is now into 1d15h23m... and still going (I am going to get HUGE credit for this one, right! :) ). Right now it is 99.883% complete, and has been continuing to make progress. Like the others, it went through about 90% in 4h30m, but the last 9.883% has taken 1d11h... The task has been assigned to another user / computer, so it looks like even if this finishes, it will probably get aborted by the server. Any assistance would be greatly appreciated. At one point I changed my preferences to only bring in minimal numbers of tasks, but 4 are always downloaded. If there is a way to only bring 1 at a time, I would do that.
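
To put a number on the slowdown, here is a small sketch of the arithmetic using only the figures above (about 90% in 4h30m, then 9.883% in 1d11h). The helper name is made up for illustration, and the inputs are my rounded readings, so treat the result as approximate.

```python
def percent_per_hour(percent_gained: float, hours: float) -> float:
    """Average progress rate over an interval, in percent per hour."""
    return percent_gained / hours


# First phase: about 90% in 4h30m.
early_rate = percent_per_hour(90.0, 4.5)

# Second phase: the next 9.883% took 1d11h, i.e. 35 hours.
late_rate = percent_per_hour(9.883, 24 + 11)

slowdown = early_rate / late_rate
print(f"early: {early_rate:.1f} %/h, late: {late_rate:.3f} %/h, "
      f"slowdown: about {slowdown:.0f}x")
```

In other words, the task went from roughly 20% per hour to well under 1% per hour, a slowdown on the order of 70 times.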

Thank you all for the advice!
3) Message boards : ATLAS application : Unable to upload an Atlas task (Message 34560)
Posted 7 Mar 2018 by Profile thomasroderick
Post:
Yes, better now. I had two tasks stuck for over 24 hours, and this morning when I booted the machine, both of them immediately completed their upload and transferred. Thank you!
4) Message boards : ATLAS application : Unable to upload an Atlas task (Message 34360)
Posted 10 Feb 2018 by Profile thomasroderick
Post:
Same. Task completed two days ago, status is 'uploading'. Look on the transfer tab and it says 100% complete uploading, but it has been stuck on this for well over 24 hours now. These were the signs from a couple weeks ago when the server / db issues started. I fear this task and the work will be lost, like last time. Even though it says 100% transferred, it will never 'complete' and on February 15th the system will say the deadline was missed, and the server will abort the task.
5) Message boards : ATLAS application : 'drive limit' error (Message 32729)
Posted 9 Oct 2017 by Profile thomasroderick
Post:
I had a similar issue, with the error, "Disk Limit Exceeded." If you can, check Task 158531023 on Work Unit 76138693 (https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=76138693). Mine is computer 10481956. Looks like several people have attempted this WU, and none have succeeded so far.
6) Message boards : ATLAS application : Atlas task running over 45 hours, 100% complete (Message 32338)
Posted 7 Sep 2017 by Profile thomasroderick
Post:
It finally gave up the ghost a few minutes ago. The BOINC Manager came up with the status: Aborted, File disk full. The task output can be found at:

https://lhcathome.cern.ch/lhcathome/result.php?resultid=154132448

Run time and CPU time were drastically different, so something went wrong while my machine was working on this task. Maybe a power or network glitch or something. Chalk it up to the gremlins.
7) Message boards : ATLAS application : Atlas task running over 45 hours, 100% complete (Message 32336)
Posted 7 Sep 2017 by Profile thomasroderick
Post:
I let it run for a little while longer... elapsed time of 2d 1:33:33. Still sitting at 100% with ----- remaining. I checked my tasks online and that specific one is now saying, "Timed out - no response." So it appears this one will be lost, and I will abort from my system. Thanks for looking into the situation. It has been a valuable learning experience for me, with you guiding me through.
8) Message boards : ATLAS application : Atlas task running over 45 hours, 100% complete (Message 32331)
Posted 7 Sep 2017 by Profile thomasroderick
Post:
F1: Immediately takes me to the login.
F2: Empty black screen, save for the single line at the top, "Event Processing information will appear here." But no additional lines of information.
F3: Image below.

9) Message boards : ATLAS application : Atlas task running over 45 hours, 100% complete (Message 32298)
Posted 6 Sep 2017 by Profile thomasroderick
Post:
Here's what I got on the Properties:
CPU Last Checkpoint: 47:20
CPU - Time: 47:20
Elapsed Time: 1d 21:56:42

Every subsequent check was similar: the CPU Last Checkpoint and CPU Time values increased and stayed equal to each other, while the elapsed time kept going up.

Other Properties:
Received 9/2/2017 10:02:24am
Report Deadline: 9/5/2017 10:02:23pm
Est. Computation size: 16,020 GFLOPS
Est. Time Remaining -----
Fraction Done: 100.000%
Virtual mem size: 112.37MB
Working set size: 5.66 GB
Progress Rate: 2.160% per hour

Alas... it appears I may have run out of time. The deadline is only 45 minutes from now. I will let it run until then and see what happens. The other Atlas tasks will kick in after it clears.

Thank you for your comments and assistance. The checklist has been beneficial as well.
10) Message boards : ATLAS application : Atlas task running over 45 hours, 100% complete (Message 32286)
Posted 5 Sep 2017 by Profile thomasroderick
Post:
Thank you, Yeti, for the assistance. Greatly appreciated. I have run through the checklist previously, looked at 16e specifically today. In the VM, I can get to the login and password screen, that loads quickly. I tried the Alt/F2 to see what was processing. The screen reads, "Event Processing information will appear here" and the screen is black. Of course, the task says it is 100% progress, but the elapsed time is continuing to run. I have had two other Atlas tasks run and complete this morning, while this one was suspended.

Any other suggestions? Or is this one simply a lost cause...
11) Message boards : ATLAS application : No tasks are available for ATLAS Simulation (Message 32274)
Posted 5 Sep 2017 by Profile thomasroderick
Post:
Adding to the conversation... I have regularly been receiving Atlas tasks without interruption for the past few weeks. Behaviour on my computer: Atlas will download 8 tasks and run 1 (4 CPU). As soon as it completes and transmits, a new task will download. This keeps a queue of 8 tasks always on my computer (1 running, 7 waiting). This behaviour appeared immediately after the deadlines were cut from 2 weeks to 1 week. Late last week, the deadlines moved to 3 days; now they appear to be back out to 1 week. But I have not seen any times when my computer was without Atlas tasks.
12) Message boards : Number crunching : Stuck at 100% (Message 32272)
Posted 5 Sep 2017 by Profile thomasroderick
Post:
I have the exact same issue with a single Atlas task, just posted something in the Atlas page about this. Every other task is running / completing as normal. This one was fine up to about 97% complete, then got to 100% over the next 43 hours. It is sitting at 100% after 45 elapsed hours. Task: 154132448, WU: 73907132. Doing some troubleshooting on it now, the deadline is in 12 hours.
13) Message boards : ATLAS application : Atlas task running over 45 hours, 100% complete (Message 32254)
Posted 5 Sep 2017 by Profile thomasroderick
Post:
My computer has been humming away for a couple weeks, loading 8 tasks at a time, and running through them one by one, at about 2 hours per task. A couple days ago, 1 task started, and currently sits at 100.000% complete after 1d 21:46:25 elapsed time. It was going at normal rate until it hit 97% (after about 2 hours), and then has crawled to 100% over the next 43 hours. No other tasks started or ran, 4 CPUs devoted to this one task. I tried suspending, resuming, updating the project, restarting BOINC, rebooting the computer... nothing has kicked it over. I have suspended and resumed other tasks, and they are all running and completing appropriately.

This is task: 154132448, Work Unit: 73907132. It has a deadline about 13 hours from right now. I do not really care about the credit. I simply hate to see a completed research effort get destroyed.

Any thoughts on how to get this over the line? Or is this a case of aborting the task and moving on? Have not seen anything in the logs to indicate there was an issue, and other tasks around it did not have problems. Thoughts greatly appreciated, thank you in advance.

- Tom.
14) Message boards : ATLAS application : Download failures (Message 31747)
Posted 1 Aug 2017 by Profile thomasroderick
Post:
Coming full circle on the thread... I was able to successfully download Atlas files and tasks this evening. The downloads started off a little slow on the throughput, otherwise there were no issues. All is well again, thank you!
15) Message boards : ATLAS application : Download failures (Message 31711)
Posted 30 Jul 2017 by Profile thomasroderick
Post:
Ever since the day of the issue that followed the cleanup of old files, I have been experiencing the same problems. LHC was running Atlas fine for a long time, where it would download 4 tasks and run them through without issue. What I am seeing now (and since the day of the server cleanup) is that my machine will attempt to download files for tasks, and get stuck retrying on a few for several hours.

7/29/2017 7:22:43 PM | LHC@home | Started download of jf_f3ff3ac08153d0ee04ea606f0dea9a0e
7/29/2017 7:23:05 PM | | Project communication failed: attempting access to reference site
7/29/2017 7:23:05 PM | LHC@home | Temporarily failed download of jf_f3ff3ac08153d0ee04ea606f0dea9a0e: connect() failed
7/29/2017 7:23:05 PM | LHC@home | Backing off 01:05:41 on download of jf_f3ff3ac08153d0ee04ea606f0dea9a0e
7/29/2017 7:23:06 PM | | Internet access OK - project servers may be temporarily down.
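
For anyone who wants to work out when the next retry is due from log lines like these, here is a minimal sketch. It assumes the "date time | project | message" layout and the "Backing off HH:MM:SS on download of ..." wording seen in the excerpt above. The function name and regular expressions are my own for illustration and are not part of BOINC.

```python
import re
from datetime import datetime, timedelta

# Assumed layout, taken from the excerpt: "<date> <time> AM/PM | <project> | <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{1,2}/\d{1,2}/\d{4} \d{1,2}:\d{2}:\d{2} [AP]M) \| "
    r"(?P<project>[^|]*)\| (?P<msg>.*)$"
)
BACKOFF_RE = re.compile(r"Backing off (\d+):(\d{2}):(\d{2}) on download of (\S+)")


def next_retries(log_text: str) -> dict:
    """Map each backed-off file name to the time its next retry is due."""
    retries = {}
    for line in log_text.splitlines():
        line_match = LINE_RE.match(line.strip())
        if not line_match:
            continue
        backoff = BACKOFF_RE.search(line_match.group("msg"))
        if not backoff:
            continue
        hours, minutes, seconds, file_name = backoff.groups()
        logged_at = datetime.strptime(line_match.group("ts"), "%m/%d/%Y %I:%M:%S %p")
        delay = timedelta(hours=int(hours), minutes=int(minutes), seconds=int(seconds))
        retries[file_name] = logged_at + delay
    return retries


sample = """\
7/29/2017 7:23:05 PM | LHC@home | Temporarily failed download of jf_f3ff3ac08153d0ee04ea606f0dea9a0e: connect() failed
7/29/2017 7:23:05 PM | LHC@home | Backing off 01:05:41 on download of jf_f3ff3ac08153d0ee04ea606f0dea9a0e
"""
for name, due in next_retries(sample).items():
    print(name, "retries at", due)
```

For the back-off line above, that works out to a retry at about 8:28:46 PM the same evening.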

I have Updated, Restarted, Removed and re-added the LHC project several times, over several days. Suggestions? Tried 2 different networks (home and work), same issues. Connections to other projects are no issue.


