Long running tasks
Author | Message |
---|---|
Joined: 27 Sep 08 Posts: 807 Credit: 652,242,820 RAC: 285,643 |
It seems like ATLAS has changed to longer-running tasks? |
Joined: 16 May 08 Posts: 4 Credit: 1,320,525 RAC: 0 |
Yes, I also have one that has already been running for more than 24 h. Is that normal? Greetings, Marcus |
Joined: 15 Jun 08 Posts: 2411 Credit: 226,331,649 RAC: 132,112 |
"... Is that normal?" Yes, at least for some parameter sets. It looks like there are different types of tasks in the queue, some with shorter runtimes and others with longer ones. Everything is fine as long as the tasks finish successfully. Just let them run. |
Joined: 13 May 14 Posts: 387 Credit: 15,314,184 RAC: 0 |
Indeed, there are longer tasks in the system now. In the image in this post you can see the different bunches of tasks: the larger ones at the bottom are the older, shorter tasks and the smaller ones are the new, longer tasks. The reason is that the kind of physics being simulated is different: the new tasks involve more complicated particle interactions, which require more CPU time to process. In terms of the physics, the previous tasks simulated leptons (electrons and muons), which make nice clean tracks through the detector. The new tasks simulate hadrons (particles made up of quarks); when these interact with the detector they produce "jets" of particles, which are much more complex to simulate. |
Joined: 27 Sep 08 Posts: 807 Credit: 652,242,820 RAC: 285,643 |
Thanks for the details. I was worried when I saw one running for more than 24 hrs. Hadrons make a mess ;) |
Joined: 16 May 08 Posts: 4 Credit: 1,320,525 RAC: 0 |
:( https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=120012872 Validate error after 252,537.20 CPU time. Too many errors (may have bug) |
Joined: 2 May 07 Posts: 2099 Credit: 159,815,788 RAC: 143,603 |
Your ATLAS task was started and stopped several times. ATLAS begins again from the start after a task is stopped. You can see how quickly another user finished the task (2 hours with 12 CPUs). With ATLAS it is normally best to run a task from beginning to end without interruption. |
Joined: 15 Jun 08 Posts: 2411 Credit: 226,331,649 RAC: 132,112 |
That's bad luck, at least the fact that 3 (!) of your wingmen misconfigured their computers (missing CVMFS), which caused their tasks to fail. In addition, as maeax already pointed out, your own task was suspended a couple of times for several hours each. Unlike ATLAS native, ATLAS vbox should not restart from scratch if the VM has written a snapshot, but you might have hit a maximum runtime limit.
2019-08-04 21:14:16 (2824): vboxwrapper (7.7.26196): starting
2019-08-05 00:24:51 (2824): Successfully stopped VM.
2019-08-05 16:35:47 (4544): vboxwrapper (7.7.26196): starting
2019-08-06 00:43:38 (4544): Successfully stopped VM.
2019-08-06 07:08:23 (9120): vboxwrapper (7.7.26196): starting
2019-08-06 07:12:47 (9120): Guest Log: Starting ATLAS job. (PandaID=4437421466 taskID=18722495)
2019-08-06 18:51:47 (9876): vboxwrapper (7.7.26196): starting
2019-08-07 00:18:56 (9876): Successfully stopped VM.
2019-08-07 07:00:19 (9332): vboxwrapper (7.7.26196): starting
2019-08-07 07:40:16 (9332): Stopping VM.
2019-08-07 07:40:16 (9332): Error 0x80070005 in vbox51::VBOX_VM::stop (c:\src\boinc\boinc\samples\vboxwrapper\vbox_mscom_impl.cpp:1449)
2019-08-07 07:40:16 (9332): Error Source : SessionMachine
2019-08-07 07:40:16 (9332): Error Description: The object is not ready
2019-08-07 17:50:25 (6120): vboxwrapper (7.7.26196): starting
2019-08-08 00:20:30 (6120): Successfully stopped VM.
2019-08-08 06:28:17 (5324): vboxwrapper (7.7.26196): starting
2019-08-08 07:01:44 (5324): Successfully stopped VM.
2019-08-08 18:21:13 (10020): vboxwrapper (7.7.26196): starting
2019-08-09 00:05:55 (10020): Successfully stopped VM.
2019-08-09 07:04:36 (7756): vboxwrapper (7.7.26196): starting
2019-08-09 07:31:37 (7756): Successfully stopped VM.
2019-08-09 14:48:36 (5900): vboxwrapper (7.7.26196): starting
2019-08-10 01:46:32 (5900): VM Completion File Detected.
01:46:44 (5900): called boinc_finish(0) |
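For anyone who wants to check how often and for how long their own task was actually running, the start/stop pattern in a log like the one above can be tallied with a short script. This is only a rough sketch: the function name `session_durations` is made up for illustration, the number in parentheses is assumed to be a process ID that stays the same within one session, and the message strings are simply the ones that appear in this post. It measures wall-clock time between a logged start and a logged stop, not CPU time, and it skips sessions that have no clean stop line.

```python
import re
from datetime import datetime, timedelta

# Matches lines like "2019-08-04 21:14:16 (2824): some message".
LINE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \((\d+)\): (.*)$")

# Messages treated as the end of a session (taken from the log in this thread).
END_MARKERS = ("Successfully stopped VM.", "VM Completion File Detected.")


def session_durations(log_text):
    """Return (pid, timedelta) pairs for sessions with a logged start and end.

    Hypothetical helper: sessions without a matching end marker are ignored.
    """
    starts = {}
    sessions = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # e.g. lines without a date stamp
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        pid, message = match.group(2), match.group(3)
        if message.startswith("vboxwrapper") and message.endswith("starting"):
            starts[pid] = stamp
        elif message in END_MARKERS and pid in starts:
            sessions.append((pid, stamp - starts.pop(pid)))
    return sessions


# Excerpt of the log quoted in this thread (three of the sessions).
SAMPLE = """\
2019-08-04 21:14:16 (2824): vboxwrapper (7.7.26196): starting
2019-08-05 00:24:51 (2824): Successfully stopped VM.
2019-08-05 16:35:47 (4544): vboxwrapper (7.7.26196): starting
2019-08-06 00:43:38 (4544): Successfully stopped VM.
2019-08-09 14:48:36 (5900): vboxwrapper (7.7.26196): starting
2019-08-10 01:46:32 (5900): VM Completion File Detected.
"""

sessions = session_durations(SAMPLE)
total = sum((duration for _, duration in sessions), timedelta())

for pid, duration in sessions:
    print(f"session {pid}: ran for {duration}")
print(f"{len(sessions)} sessions, {total} in total")
```

Pasting the complete stderr output of a task into `SAMPLE` gives one line per session, which makes long suspensions between sessions easy to spot.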
Joined: 27 Sep 08 Posts: 807 Credit: 652,242,820 RAC: 285,643 |
Some of my tasks went past the deadline, so I aborted them. They were at 7-8 days. |
©2024 CERN