Task processing slowing down considerably beyond ~85% progress
Author | Message |
---|---|
Send message Joined: 18 Dec 15 Posts: 1815 Credit: 118,672,440 RAC: 39,813 |
On my i7-4930K (6 cores + 6 hyperthreads) at 3.6 GHz, I am running four 2-core ATLAS tasks. When they started yesterday, the runtime prediction was about 13 hours. This morning I noticed that more than 29 hours had passed and the progress was at about 86%, with the remaining time shown as about 3 1/2 hours. Now, some 5 hours later, the progress is at about 90%, and the remaining time is shown as about 3 hours! Watching the progress percentage, I see that processing has obviously become awfully slow. It takes the third digit after the decimal point (i.e. the 1/1000th percent digit) about 6-7 seconds to advance. At that rate, I guess the remaining time will not be 3 hours, but probably a multiple of that. Is this normal behaviour? I have crunched many ATLAS tasks before, but this is new to me. |
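As a rough cross-check of that guess, the observed tick rate can be turned into a remaining-time estimate. This is only a back-of-the-envelope sketch: it assumes the progress value keeps advancing at a constant rate, which is not guaranteed, and the function name `remaining_hours` is made up for illustration.

```python
def remaining_hours(progress_pct, seconds_per_tick, tick_pct=0.001):
    """Estimate hours left, assuming progress advances linearly.

    progress_pct     -- current progress shown by the client, in percent
    seconds_per_tick -- observed seconds for the last digit to advance
    tick_pct         -- size of one tick in percent (0.001 = third decimal)
    """
    ticks_left = (100.0 - progress_pct) / tick_pct
    return ticks_left * seconds_per_tick / 3600.0


# Values from the post: 90% progress, one 0.001% tick every 6-7 seconds
# (6.5 s is taken as the midpoint, an assumption).
print(f"{remaining_hours(90.0, 6.5):.1f} h")  # prints "18.1 h"
```

With these numbers the estimate is about 18 hours instead of the 3 hours shown, which fits the suspicion that the displayed remaining time is far too low.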
Send message Joined: 18 Dec 15 Posts: 1815 Credit: 118,672,440 RAC: 39,813 |
A close look at the tasks this afternoon shows that the progress is currently about 1% per hour. From all my past experience with ATLAS tasks, I can say that this is totally unusual. Something must be wrong with these tasks. I am not even sure whether I should abort them, as I suspect that, if they become slower and slower, they will not finish in time (Sept. 6th). Is anyone else having the same experience? |
Send message Joined: 15 Jun 08 Posts: 2534 Credit: 254,137,209 RAC: 54,451 |
You can check the output on console 2. It shows how many events have already been processed and how many seconds (on average) your computer needs to process a single event. The progress bar of your BOINC client is not reliable, especially as ATLAS sends out different types of jobs. |
Send message Joined: 28 Sep 04 Posts: 728 Credit: 49,144,463 RAC: 29,814 |
These long tasks are here, see https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5105 |
Send message Joined: 18 Dec 15 Posts: 1815 Credit: 118,672,440 RAC: 39,813 |
"You may check the output at console 2." The tasks show about 52 processed events. Processing time is about 2000 secs (which is awfully high compared to what it was in former days). As I remember from before, there were tasks with 100 events and tasks with 200 events. Any idea how many events the current tasks have? If they have 200, the deadline (Sept. 6th) would definitely be exceeded in my case. |
Send message Joined: 14 Jan 10 Posts: 1419 Credit: 9,474,701 RAC: 2,980 |
"The tasks show about 52 processed events. Processing time is about 2000 secs (which is awfully high compared to what it was in former days)." You said you are running 2-core tasks, so you have to add the last 2 event counts together. The total is 200 events at the moment, so you are a little past midway. |
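The event-based estimate described here can be sketched as follows. The sketch assumes that each core reports its own event count, that the roughly 2000 seconds is the average time per event on one core, and that the cores work in parallel at the same speed; the function name `estimate_remaining_hours` is hypothetical.

```python
def estimate_remaining_hours(events_per_core, sec_per_event, total_events=200):
    """Estimate hours left from the event counts shown on console 2.

    events_per_core -- events processed so far, one entry per core
    sec_per_event   -- average seconds one core needs for one event
    total_events    -- events in the task (200 according to the thread)
    """
    done = sum(events_per_core)
    remaining = max(total_events - done, 0)
    cores = len(events_per_core)
    # The remaining events are shared between the cores running in parallel.
    return remaining * sec_per_event / cores / 3600.0


# Numbers from the thread: about 52 events per core on a 2-core task,
# about 2000 s per event.
print(round(estimate_remaining_hours([52, 52], 2000), 1))  # prints 26.7
```

Under these assumptions, 104 of 200 events are done and roughly 27 hours of processing remain.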
Send message Joined: 18 Dec 15 Posts: 1815 Credit: 118,672,440 RAC: 39,813 |
Okay, the first one just finished after 47 hours and successfully produced a HITS file :-) The other three tasks are at around 75% progress (as seen on console 2). |
Send message Joined: 13 May 14 Posts: 387 Credit: 15,314,184 RAC: 0 |
The current tasks still process 200 events but they are rather heavy on CPU due to the more complex physics involved. The average CPU time per event is roughly 3 times higher than the tasks we had a few weeks ago. So don't give up on them if they are still crunching! |
Send message Joined: 18 Dec 15 Posts: 1815 Credit: 118,672,440 RAC: 39,813 |
"So don't give up on them if they are still crunching!" No, I didn't give up :-) All 4 have already finished okay; right now, besides 2 Theory tasks, another ATLAS task is running. |
Send message Joined: 9 Aug 05 Posts: 36 Credit: 7,698,293 RAC: 0 |
I have one WU sitting at 100% complete but still running... 32 hours now. |
Send message Joined: 2 May 07 Posts: 2243 Credit: 173,902,375 RAC: 1,652 |
Filipe, select your ATLAS task in BOINC Manager, click Show VM Console, open the RDP session, and press F2 to see how many collisions your task has processed so far. 200 is the maximum. At the moment there are tasks running for more than one day when using 4 CPUs. |
Send message Joined: 9 Aug 05 Posts: 36 Credit: 7,698,293 RAC: 0 |
I am running 2-core tasks. The VM Console doesn't open anymore; I get an error message when I try to open it. Maybe that is because the task shows 100% complete? It has been at 100% for more than 12 hours now. Total elapsed time is now 34 hours, and it is still running. |
Send message Joined: 2 May 07 Posts: 2243 Credit: 173,902,375 RAC: 1,652 |
If the vboxheadless.exe process of this task shows no CPU usage in Task Manager, stop BOINC Manager (saving the task state, if possible) and start it once more. To understand how ATLAS works, see Yeti's checklist in the ATLAS forum. |
Send message Joined: 9 Aug 05 Posts: 36 Credit: 7,698,293 RAC: 0 |
@maeax: It was thanks to Yeti's checklist that I managed to get ATLAS VMs running on my computer. I saw that your tasks run for +/- 40 CPU hours. How many CPU cores are you using for each task? Is it 4? I'm running 2-core tasks. |
Send message Joined: 2 May 07 Posts: 2243 Credit: 173,902,375 RAC: 1,652 |
I am using 4, 5 and 6 CPUs for ATLAS. |
Send message Joined: 2 May 07 Posts: 2243 Credit: 173,902,375 RAC: 1,652 |
The server status page shows a duration of 22.48 hours for the last 100 ATLAS tasks. |
Send message Joined: 9 Aug 05 Posts: 36 Credit: 7,698,293 RAC: 0 |
Mine is still running after 68 hours... Is there a wall-clock time limit to worry about? |
Send message Joined: 27 Sep 08 Posts: 847 Credit: 692,009,834 RAC: 113,666 |
I abort mine if they run past the deadline. |
Send message Joined: 9 Aug 05 Posts: 36 Credit: 7,698,293 RAC: 0 |
Finished and validated after running for 84 hours! |
Send message Joined: 2 May 07 Posts: 2243 Credit: 173,902,375 RAC: 1,652 |
Now you can run 4 or more cores instead of 2. :-) |
©2024 CERN