| Info | Message |
|---|---|
| 21) Message boards : ATLAS application : 2-core tasks with process "athena.py" running 4 times
Message 32501 Posted 22 Sep 2017 by HerveUAE
|
Hi,
A few times I have seen ATLAS tasks where athena.py runs 4 times inside the VM instead of 2. I have configured both "LHC Preferences" and "app_config.xml" to use only 2 cores, but from time to time I see tasks where:
- "Alt-F2" in the console shows "Event nr. 1" 4 times, and the same for all other events.
- "Alt-F3" in the console shows 4 "athena.py" processes, each running at 50% (expected, since the VM only has 2 cores allocated to it).
The result is a task that takes twice as long to finish. Is this a known issue? If yes, what can be done to avoid it? If not, how can I help investigate it further?
Regards, Herve |
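The check described above (the same "Event nr." appearing more often than the number of configured cores) can be sketched in Python. This is only an illustration: the line format matched by `EVENT_RE`, the helper names `workers_seen` and `has_extra_workers`, and the sample lines are all assumptions, not actual ATLAS console output.

```python
import re
from collections import Counter

# Assumed line format: the console text contains "Event nr. <number>".
EVENT_RE = re.compile(r"Event nr\. (\d+)")


def workers_seen(console_lines):
    """Return the largest number of times any single event number appears."""
    counts = Counter()
    for line in console_lines:
        match = EVENT_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return max(counts.values(), default=0)


def has_extra_workers(console_lines, configured_cores):
    """True when an event number shows up more often than the configured cores."""
    return workers_seen(console_lines) > configured_cores


# Hypothetical console capture: each event number shows up 4 times.
sample_lines = [f"Event nr. {n} took 100 s" for n in (1, 1, 1, 1, 2, 2, 2, 2)]
print(has_extra_workers(sample_lines, configured_cores=2))
```

With a 2-core setting, a count of 4 per event number would flag the task as one of the suspicious ones described in the post.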
| 22) Message boards : ATLAS application : Low CPU usage on ATLAS and SixTrack tasks
Message 32485 Posted 21 Sep 2017 by HerveUAE
|
Good to know that your problem has been fixed, and welcome to the BOINC and LHC@Home community. |
| 23) Message boards : ATLAS application : Low CPU usage on ATLAS and SixTrack tasks
Message 32452 Posted 17 Sep 2017 by HerveUAE
|
Hi,
ATLAS and SixTrack tasks do not use the GPU, so it is normal that your graphics card is not used. I see that you currently have both ATLAS and SixTrack tasks waiting to be executed. SixTrack behaves in a more straightforward way than ATLAS when it comes to CPU usage. So to test your CPU usage setting, I would recommend that you cancel all your ATLAS tasks, change your settings to download only SixTrack tasks, and verify that the SixTrack tasks follow your CPU usage setting properly.
Once you are satisfied with your CPU usage setting and BOINC's behaviour, start downloading ATLAS tasks: set your preferences to use only 4 cores and try with 1 task. Check whether that task goes through properly. Depending on the bandwidth of your Internet connection, the task will spend some 10 to 30 minutes using only 1 core, and will then go on to use your full 4 cores.
ATLAS tasks have starting and ending phases that use only 1 core. During those phases it is normal that your actual CPU usage is below your setting (75 or 85%). Adding more SixTrack tasks will not use the remaining cores, and that is normal behaviour.
Hoping it helps. Herve |
| 24) Message boards : ATLAS application : Only 6 concurrent tasks per computer?
Message 32389 Posted 11 Sep 2017 by HerveUAE
|
Thanks David, All my hosts are back to their usual crunching routine. I will configure the slow host to keep a smaller number of pending tasks so that all of them are processed within a day. I am not sure what the best setting would be to make sure the ATLAS jobs are processed quickly by the community. Reduce the deadline even further? Regards, Herve |
| 25) Message boards : ATLAS application : No tasks are available for ATLAS Simulation
Message 32374 Posted 9 Sep 2017 by HerveUAE
|
Hi Crystal, I have Max # jobs 24 on my 3 machines and can only get 6 pending ATLAS tasks. See this thread: https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=4432 |
| 26) Message boards : ATLAS application : Only 6 concurrent tasks per computer?
Message 32370 Posted 9 Sep 2017 by HerveUAE
|
Hi,
At the beginning of the week, there was a configuration where "reliable" hosts were given priority for dispatching ATLAS tasks. As a result only one of my 3 hosts was given ATLAS tasks. This issue was discussed in this thread: https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=4280&postid=32239#32239
Once the server was reconfigured, my 3 hosts went back to their routine: 24 ATLAS tasks pending, as per the LHC@home preferences. However:
- Within 1 day, the fastest host https://lhcathome.cern.ch/lhcathome/show_host_detail.php?hostid=10455317, usually crunching 9 2-core tasks at the same time, could only have 6 pending tasks.
- This morning, the medium-speed host https://lhcathome.cern.ch/lhcathome/show_host_detail.php?hostid=10416269, usually crunching 3 2-core tasks at the same time, could also only have 6 pending tasks.
- And the slowest host https://lhcathome.cern.ch/lhcathome/show_host_detail.php?hostid=10420599, usually crunching 1 4-core task at a time, still has 9 pending tasks, but the number of pending tasks is gradually decreasing.
A similar observation was posted here as well: https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=4280&postid=32369#32369
So it seems to be an infrastructure issue. Could someone from the CERN team have a look?
Thanks, Herve |
| 27) Message boards : Number crunching : squid configuration for LHC@Home?
Message 32354 Posted 8 Sep 2017 by HerveUAE
|
Hello, Could someone recommend a Squid configuration that works for all projects in LHC@Home? Thanks in advance, Herve |
| 28) Message boards : ATLAS application : No tasks are available for ATLAS Simulation
Message 32231 Posted 4 Sep 2017 by HerveUAE
|
"On every of three PC's one SL69 native."
Same for me: 3 Windows PCs with VirtualBox, same settings in LHC@Home preferences (ATLAS + SixTrack), but only one can get ATLAS tasks. I just tried removing and re-adding LHC@Home in BOINC on one of the machines but the result is the same: "No tasks are available for ATLAS Simulation". |
| 29) Message boards : ATLAS application : highly variable credit points
Message 31955 Posted 15 Aug 2017 by HerveUAE
|
I have observed the same: on all 3 of my computers the allocated credits are either "as before" or nearly half. It started some 3 days ago. I have not been able to identify any difference between those tasks. It seems random from what I see, but surely there is some explanation. |
| 30) Message boards : ATLAS application : Some Validate errors
Message 31895 Posted 8 Aug 2017 by HerveUAE
|
The initialisation and end phases of ATLAS tasks are single core, while the calculation phase is multicore. If the highest RAM requirements are from the initialisation phase, then it is logical to have similar RAM requirements for 1-core and 2-core tasks. |
| 31) Message boards : ATLAS application : Download failures
Message 31734 Posted 31 Jul 2017 by HerveUAE
|
From what I saw, the problem only occurs when downloading the biggest file (110 to 120 MB); the other files of the task download without problems. The problem also got worse progressively: 2 days ago the download was still possible, but extremely slow and only after multiple retries. Now the download fails systematically, with the message "server backoff". |
| 32) Message boards : Number crunching : downloading problems...
Message 31733 Posted 31 Jul 2017 by HerveUAE
|
Same here. This problem occurs when downloading the big part of ATLAS tasks, the file that is above 100 MB. |
| 33) Message boards : ATLAS application : all ATLAS tasks fail after about 10 minutes
Message 31705 Posted 29 Jul 2017 by HerveUAE
|
Hi Jim, Erich,
I have tried several memory settings over time. My own experience is that the RAM requirement is not fixed and varies from one ATLAS task to another. The higher the allocated RAM, the lower the probability that the task will fail. However, from time to time, one task out of many will fail. I personally think it does not depend on the number of allocated cores, but on the ATLAS algorithm itself. And it could very well be that a given set of tasks has a higher RAM requirement than other sets.
I personally have set the RAM to 7000MB and very seldom have issues related to a lack of memory. My laptop has only 8 GB, so I could allocate only 5800MB to ATLAS. In recent days, I have not had any memory-related problems on that machine.
There were some extensive tests and discussions in this thread: https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=4146#29171 where 5000MB was suggested as a good minimum. Try increasing progressively and see if it helps. |
| 34) Message boards : ATLAS application : all ATLAS tasks fail after about 10 minutes
Message 31701 Posted 29 Jul 2017 by HerveUAE
|
"would be nice if we had some detailed description somewhere as to what "error Code 65" is."
Hi Erich,
When looking at the stderr of a task with error code 65, look at the line that starts with:
WARNING Transform now exiting early with exit code 65
At the very end of this line (you need to scroll all the way to the right), there are some details that can help. I saw the same error as Jim:
FATAL makePool failed
This has the same root cause: not enough RAM allocated (the default 3400MB is not sufficient). Hoping it helps. |
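Scrolling to the end of that line can also be done with a short script. The following Python sketch is only an illustration: the sample stderr text is made up, the helper names `early_exits` and `suggests_low_memory` are hypothetical, and it assumes the details of interest follow the exit code on the same line, as described in the post.

```python
import re

# Matches the marker quoted in the post and keeps whatever follows the code.
EARLY_EXIT_RE = re.compile(r"Transform now exiting early with exit code (\d+)(.*)")


def early_exits(stderr_text):
    """Return a list of (exit_code, trailing_detail) for each matching line."""
    found = []
    for line in stderr_text.splitlines():
        match = EARLY_EXIT_RE.search(line)
        if match:
            found.append((int(match.group(1)), match.group(2).strip()))
    return found


def suggests_low_memory(detail):
    """The post links 'makePool failed' to too little RAM allocated to the VM."""
    return "makePool failed" in detail


# Hypothetical stderr excerpt, for illustration only.
sample_stderr = (
    "INFO some earlier line\n"
    "WARNING Transform now exiting early with exit code 65 "
    "(made-up detail text: FATAL makePool failed)\n"
)

for code, detail in early_exits(sample_stderr):
    print(code, suggests_low_memory(detail))
```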
| 35) Message boards : ATLAS application : all ATLAS tasks fail after about 10 minutes
Message 31700 Posted 29 Jul 2017 by HerveUAE
|
Hi Jim,
Looking at the log of the tasks on your machine, I can see 2 traces of interest:
Setting Memory Size for VM. (3400MB)
You need more than 3400 MB to run ATLAS tasks. You are using the default RAM setting for 1-CPU tasks, which is not enough. You should write your own app_config.xml file to override the default setting and set it to, say, 5000MB.
FATAL makePool failed
From my own experience, this error occurs when you do not have enough RAM allocated, confirming the above observation. |
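As a rough illustration of such an override, the Python sketch below builds an app_config.xml body with the standard library. Several things here are assumptions to verify against your own BOINC installation before use: the application name "ATLAS", the plan class "vbox64_mt_mcore_atlas", and the use of the vboxwrapper option `--memory_size_mb` in `cmdline` to set the VM memory. The helper `build_app_config` is hypothetical.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name, plan_class, ncpus, memory_mb):
    """Build an app_config.xml body with one <app_version> override."""
    root = ET.Element("app_config")
    version = ET.SubElement(root, "app_version")
    ET.SubElement(version, "app_name").text = app_name
    ET.SubElement(version, "plan_class").text = plan_class
    ET.SubElement(version, "avg_ncpus").text = str(ncpus)
    # Assumption: the VM memory is passed to the wrapper on the command line.
    ET.SubElement(version, "cmdline").text = f"--memory_size_mb {memory_mb}"
    return ET.tostring(root, encoding="unicode")


# Assumed app name and plan class; check them in your client's state files.
xml_text = build_app_config("ATLAS", "vbox64_mt_mcore_atlas", 1, 5000)
print(xml_text)
```

The resulting text would go into an app_config.xml file in the LHC@home project directory, after which the BOINC client has to re-read its config files.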
| 36) Message boards : ATLAS application : Tasks failing with "G4 exception at line 3445"
Message 31271 Posted 4 Jul 2017 by HerveUAE
|
"Can you try a reboot to see if it helps?"
I have rebooted the PC and it solved the issue. It is not the first time it has occurred on that specific computer, and I don't remember seeing the G4 error on either of my other 2 computers. Strange...
"You may try a project reset and delete remaining trash (like old vdi files) from the slots directory."
Next time it occurs, I will try that suggestion instead of rebooting the computer.
"I don't think this is coming from 2 tasks running. The output is coming from the same procID 19104,"
I have seen stderr.txt logs with lines repeated multiple times and / or intertwined several times in the past. I also do not think it is linked to a task failure. |
| 37) Message boards : ATLAS application : taskID=11283914, Error "No events to process (skipEvents) >= (inputEvents)"
Message 31239 Posted 2 Jul 2017 by HerveUAE
|
I found some other tasks with the same taskID and the same error:
https://lhcathome.cern.ch/lhcathome/result.php?resultid=150238793
https://lhcathome.cern.ch/lhcathome/result.php?resultid=149949813
https://lhcathome.cern.ch/lhcathome/result.php?resultid=150207189
https://lhcathome.cern.ch/lhcathome/result.php?resultid=150267184 (Workunit 72577115 failed 4 times)
And also with taskID=11364822:
https://lhcathome.cern.ch/lhcathome/result.php?resultid=150197071
https://lhcathome.cern.ch/lhcathome/result.php?resultid=149813710
https://lhcathome.cern.ch/lhcathome/result.php?resultid=150246249 |
| 38) Message boards : ATLAS application : taskID=11283914, Error "No events to process (skipEvents) >= (inputEvents)"
Message 31222 Posted 2 Jul 2017 by HerveUAE
|
I had a task with the following error:
exit code 15 (No events to process: 4300 (skipEvents) >= 2000 (inputEvents of EVNT)
I remember this occurred before and was due to some error in the configuration of the task. I suspect other crunchers have had the same error as well.
Tasks:
https://lhcathome.cern.ch/lhcathome/result.php?resultid=150190838
https://lhcathome.cern.ch/lhcathome/result.php?resultid=150213326
taskID=11283914 |
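The error text itself carries the two numbers that matter: the job is asked to skip 4300 events while the input file only holds 2000. A small Python sketch for pulling them out of the message is shown here; the regular expression is built from the message quoted in the post, and the helper name `parse_no_events` is hypothetical.

```python
import re

# Pattern derived from the error message quoted in the post.
NO_EVENTS_RE = re.compile(
    r"No events to process: (\d+) \(skipEvents\) >= (\d+) \(inputEvents"
)


def parse_no_events(message):
    """Return (skip_events, input_events), or None if the message differs."""
    match = NO_EVENTS_RE.search(message)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


message = (
    "exit code 15 (No events to process: 4300 (skipEvents) "
    ">= 2000 (inputEvents of EVNT)"
)
skip_events, input_events = parse_no_events(message)
# A job that must skip at least as many events as exist has nothing left to do.
print(skip_events >= input_events)
```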
| 39) Message boards : ATLAS application : Tasks failing with "G4 exception at line 3445"
Message 31215 Posted 1 Jul 2017 by HerveUAE
|
On only one of my 4 computers, several of the tasks fail with "Validate error" and the Stderr output has the following error:
Transform executor raised TransformValidationException: EVNTtoHITS got a SIGABRT signal (exit code 134); G4 exception at line 3445 (see jobReport for further details)
Here are examples of such tasks:
https://lhcathome.cern.ch/lhcathome/result.php?resultid=150102272
https://lhcathome.cern.ch/lhcathome/result.php?resultid=150089802
https://lhcathome.cern.ch/lhcathome/result.php?resultid=150085059
https://lhcathome.cern.ch/lhcathome/result.php?resultid=150081883
Does anybody have an idea of what could cause the error? Should I just reboot the computer to see if it goes away? |
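The pairing of SIGABRT with exit code 134 in that message follows the usual shell convention on Linux, where a process killed by signal N is reported with exit code 128 + N. A minimal Python sketch of that arithmetic is given below; the helper name `signal_from_exit_code` is hypothetical and the small name table uses Linux signal numbers only.

```python
# Linux signal numbers for a few common signals (other platforms may differ).
LINUX_SIGNAL_NAMES = {6: "SIGABRT", 9: "SIGKILL", 11: "SIGSEGV", 15: "SIGTERM"}


def signal_from_exit_code(exit_code):
    """Shell convention: exit codes above 128 mean 'killed by signal code - 128'."""
    if exit_code > 128:
        return exit_code - 128
    return None


signal_number = signal_from_exit_code(134)
print(signal_number, LINUX_SIGNAL_NAMES.get(signal_number, "unknown"))
```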
| 40) Message boards : ATLAS application : ATLAS Queue is empty
Message 31061 Posted 26 Jun 2017 by HerveUAE
|
"Is this a different type of task now ("longrunners" as we had them some time ago)?"
Same observation for me: the new tasks seem to be longrunners. |