21)
Message boards :
ATLAS application :
LHC shuts down, but simulation continues!
(Message 37512)
Posted 3 Dec 2018 by gyllic Post:
> It would be great if at least one of the subprojects other than ATLAS would have tasks...
Sooner or later tasks will be reliably available again. Just for your information: it seems you have a PC connected to BOINC that is running Windows Vista. Since Windows Vista no longer receives security updates from Microsoft, it is probably a bad idea to connect this PC to the internet. So you might consider upgrading to a newer Windows version or switching to a Linux based OS. |
22)
Message boards :
ATLAS application :
LHC shuts down, but simulation continues!
(Message 37502)
Posted 3 Dec 2018 by gyllic Post:
> Which means that many crunchers with machines with low RAM (and no possibility to upgrade) will NOT be able to crunch LHC projects (for example: what concerns myself - only 2 out of my 5 PCs have more than 4GB RAM, with the other three I cannot crunch ATLAS because of only 4GB and 3 GB RAM).
Machines with 4GB RAM can crunch ATLAS native tasks (but it has to be at least 4GB RAM, otherwise you won't get tasks). |
23)
Message boards :
ATLAS application :
Guide for building everything from sources to run native ATLAS on Debian 9 (Stretch) Version 2
(Message 37496)
Posted 3 Dec 2018 by gyllic Post: Looks like you are missing the python setuptools package. To fix your problem, type:
sudo apt install python-setuptools
and execute the "cmake ../" command again. Then follow the procedure shown. |
24)
Message boards :
Number crunching :
Memory requirements for LHC applications
(Message 37412)
Posted 23 Nov 2018 by gyllic Post: Thanks David!
> Are you talking about Boinc tasks or VB jobs or what?
About VB jobs (that run inside the vbox/boinc tasks) and native ATLAS tasks (these are the same as the vbox jobs that run inside the vbox/boinc tasks). So the entire vbox/boinc task will need much more RAM than shown in the plots from David (because of the OS and all the other stuff that needs to be virtualized/emulated). |
25)
Message boards :
Number crunching :
Memory requirements for LHC applications
(Message 37400)
Posted 22 Nov 2018 by gyllic Post: Thanks for the info, David! Just out of interest, do you get the values you used for the plot from memory_monitor_out (or similarly named) files? If not, how do you get them? How big are the differences in used/needed RAM between task IDs (probably small, because the vbox app uses a fixed value for all task IDs)? |
26)
Message boards :
Number crunching :
Memory requirements for LHC applications
(Message 37372)
Posted 18 Nov 2018 by gyllic Post: I get different numbers than bronco, but that's probably because bronco has not considered the data that has been swapped out in the 4x1-core case, see below. The machine used is a dedicated, headless machine that is only used for native ATLAS tasks. The test procedure was very similar to bronco's: reboot the machine, check the "used memory" value from the "top" command, start a new native ATLAS task, check the "used memory" output after ~2 hours of runtime (wall clock time), wait until the task finishes, reboot the machine again, and so on. So the only value taken into account here is the "used memory" output from the "top" command. This was done for one 1-core, two 2-core, one 3-core and one 4-core task. All tasks were from the same task ID, but the memory requirements probably won't change hugely with different task IDs. Here are the numbers (I have rounded some of the values to get nicer numbers, which is why the total memory does not exactly equal free+used+buff/cache):

After restart (~constant for all reboots):
KiB Mem :  6106000 total, 5870900 free,  110828 used,  124012 buff/cache
KiB Swap:  6280000 total, 6280000 free,       0 used. 5799740 avail Mem

With one 1-core task:
KiB Mem :  6106000 total,  714116 free, 2346892 used, 3044732 buff/cache
KiB Swap:  6280000 total, 6280000 free,       0 used. 3480912 avail Mem
==> (2346892 - 110828)/1024 ~ 2200MB RAM used by one 1-core native ATLAS task.
Bronco calculated 1250MB RAM for a 1-core task based on the 6100MB RAM that was used. But if we consider the swapped data, the "RAM used" comes to ~ (6100MB + 3000MB)/4 ~ 2300MB, which is much closer to my values.

With two concurrent 2-core tasks:
KiB Mem :  6106000 total,  140884 free, 5037536 used,  927320 buff/cache
KiB Swap:  6280000 total, 6270268 free,    9732 used.  824744 avail Mem
==> ((5037536 - 110828)/2)/1024 ~ 2400MB RAM used by one 2-core native ATLAS task. Relatively good agreement with bronco's data.

With one 3-core task:
KiB Mem :  6106000 total,  329500 free, 2814716 used, 2961520 buff/cache
KiB Swap:  6280000 total, 6280000 free,       0 used. 3014728 avail Mem
==> (2814716 - 110828)/1024 ~ 2600MB RAM used by one 3-core native ATLAS task.

With one 4-core task:
KiB Mem :  6106000 total,  337988 free, 3010464 used, 2757288 buff/cache
KiB Swap:  6280000 total, 6280000 free,       0 used. 2820824 avail Mem
==> (3010464 - 110828)/1024 ~ 2800MB RAM used by one 4-core native ATLAS task.

Formula suggestion for native ATLAS tasks, with some safety margin and considering that the longer the tasks run the more memory they need (at least the "used memory" value rises with run time): 2100MB + 300MB*nCPUs |
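The suggested rule of thumb can be written as a small helper; note the 2100 MB base and 300 MB per core are the estimates from the measurements above, not official project requirements:

```shell
# Rule-of-thumb RAM estimate for one native ATLAS task, based on the
# measurements in this thread (2100 MB base + 300 MB per core).
atlas_ram_mb() {
    # $1 = number of cores assigned to the task
    echo $(( 2100 + 300 * $1 ))
}

atlas_ram_mb 1   # prints 2400, close to the measured ~2200 MB
atlas_ram_mb 4   # prints 3300, which fits easily into a 6 GB machine
```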
27)
Message boards :
Theory Application :
[ERROR] No jobs were available to run.
(Message 37367)
Posted 17 Nov 2018 by gyllic Post: +1
> Something is running very wrong over there, and obviously they don't have the experts to get that fixed.
Critical and objective feedback is always good and welcome, but everyone here who is part of LHC@home should keep in mind, before raging and complaining, that this is part of a huge research project; there will always be some things that won't work perfectly, break, or fail in other ways. The entire infrastructure is not trivial, and the admins here most probably have other work to do as well. |
28)
Message boards :
Number crunching :
Memory requirements for LHC applications
(Message 37351)
Posted 15 Nov 2018 by gyllic Post: @bronco: thanks for your work! Your mentioned formula probably goes in the right direction. Maybe I will find some time in the next couple of days to do some tests as well (from 1-core to 4-core tasks) to get more data. It will take a couple of days though... @BITLab Argo: Using singularity probably won't have a big effect on the used/required memory (just a guess). Maybe the file "memory_monitor_output.txt" within the PandaJob directory inside boinc's slot directory gives helpful information to more advanced Linux users. |
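For anyone who wants to peek at that monitor file while a task is running, something along these lines should work; the /var/lib/boinc-client path is an assumption for a standard Linux package install, so adjust it to your BOINC data directory:

```shell
# Show the latest samples of the ATLAS memory monitor in each BOINC slot.
# Assumption: BOINC data lives in /var/lib/boinc-client (distro default);
# adjust the path if your client uses a different data directory.
find /var/lib/boinc-client/slots -name 'memory_monitor_output*' 2>/dev/null |
while read -r f; do
    echo "== $f =="
    tail -n 3 "$f"    # last few lines written by the monitor
done
```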
29)
Message boards :
Number crunching :
Memory requirements for LHC applications
(Message 37327)
Posted 12 Nov 2018 by gyllic Post:
> @gyllic
I have not tested this formula, so I can't give you an answer. Unfortunately I don't have the time to look into this more deeply or test it at the moment. But if you want and if you have the time for it, you could test the mentioned formula, since you are running native ATLAS tasks ;-). Try to vary the #cores/task and compare the value from the formula with the actual amount of RAM needed. |
30)
Message boards :
Number crunching :
Memory requirements for LHC applications
(Message 37319)
Posted 12 Nov 2018 by gyllic Post: ATLAS works differently compared to the other vbox apps like Theory, LHCb or CMS. ATLAS tasks don't use HTCondor or anything like that, so the job distribution is done by the BOINC server. Here you can see which task IDs are currently being crunched by LHC@home ATLAS tasks: https://lhcathome.cern.ch/lhcathome/img/progresschart.png. To see more details you can go to https://bigpanda.cern.ch/ (you may get an invalid/insecure SSL certificate warning, which can be solved easily). No difference is made between low-spec and high-spec machines regarding which tasks are sent to them. |
31)
Message boards :
Number crunching :
Memory requirements for LHC applications
(Message 37306)
Posted 11 Nov 2018 by gyllic Post:
> I think we're done. Can someone with the authority please take this last copy and post it as a pinned message?
I still think that the mentioned formula for native ATLAS tasks is wrong!
> It's not necessarily ATLAS that will be swapped out.
True. Since I still have no access to the 6GB RAM machine, the following values are from a 4-core 8GB RAM machine which, according to the mentioned formula, should also not be able to run 4-core native ATLAS tasks. The top command shows for a 4-core native ATLAS task:

top - 15:51:50 up 1:02, 3 users, load average: 4,87, 5,74, 5,63
Tasks: 238 total, 5 running, 233 sleeping, 0 stopped, 0 zombie
%Cpu(s): 1,4 us, 0,5 sy, 98,0 ni, 0,0 id, 0,0 wa, 0,0 hi, 0,0 si, 0,0 st
KiB Mem : 8055880 total, 1722144 free, 4131688 used, 2202048 buff/cache
KiB Swap: 8263676 total, 8257064 free, 6612 used. 3396372 avail Mem

  PID USER     PR  NI    VIRT    RES    SHR S  %CPU %MEM    TIME+ COMMAND
12793 boinc    39  19 2661932 1,846g  96724 R  99,2 24,0 38:24.48 athena.py
12795 boinc    39  19 2662484 1,841g  97640 R  98,3 24,0 38:27.32 athena.py
12794 boinc    39  19 2662208 1,842g  96476 R  98,1 24,0 38:31.26 athena.py
12792 boinc    39  19 2661656 1,832g  91272 R  95,7 23,8 38:24.15 athena.py
 1695 user     20   0 3187056  92364  59832 S   2,1  1,1  3:25.62 kwin_x11
 1709 user     20   0 4591228 205860  95756 S   1,8  2,6  1:56.66 plasmashell
  632 root     20   0  386968  96460  63920 S   1,5  1,2  4:44.54 Xorg
 1697 root     20   0  441096   8108   6252 S   1,0  0,1  0:00.91 udisksd
 2200 user     20   0 2312260  81116  67628 S   0,4  1,0  0:22.24 boincmgr
10829 root     20   0 1004540 313696  11596 S   0,4  3,9  2:04.87 savscand
 2203 user     20   0  581716  62712  52692 S   0,3  0,8  0:11.70 konsole
18145 user     20   0   45080   3864   3068 R   0,2  0,0  0:00.06 top
  753 boinc    30  10  285464  15788  12264 S   0,1  0,2  0:13.10 boinc
  861 sophosav 20   0 1266080  15360  12892 S   0,1  0,2  0:02.71 mrouter
 1182 sophosav 20   0  810228  19516  16032 S   0,1  0,2  0:04.30 magent
 1711 user     20   0  503692  31044  27552 S   0,1  0,4  0:03.27 xembedsniproxy
 3278 boinc    30  10   13468   2960   2496 S   0,1  0,0  0:03.99 wrapper_26015_x

So basically no swap is used, free memory is about 1.7GB, available memory is over 3GB and cached memory is over 2GB. The high load average probably comes from the fact that I used the PC for other stuff as well while the simulation was running. |
32)
Message boards :
Number crunching :
Memory requirements for LHC applications
(Message 37277)
Posted 8 Nov 2018 by gyllic Post:
> What could be helpful is that you also monitor the amount of free RAM, cache size and used swap.
Currently I have no access to this particular machine, so I can't tell you or monitor these values. But since the CPU usage (shown in the stderr_txt file) is > 385% for almost every 4-core task (e.g. https://lhcathome.cern.ch/lhcathome/result.php?resultid=208242279), probably no swapping takes place. |
33)
Message boards :
Number crunching :
Memory requirements for LHC applications
(Message 37266)
Posted 7 Nov 2018 by gyllic Post:
> It works because it uses virtual memory.
For a 4-core native ATLAS task, the "top" command shows 4 athena.py processes which need ~30% RAM each (on this particular machine), so a total of 120% RAM, which obviously can't be correct. One of the advantages of multicore is that RAM is shared between the athena.py processes. This might explain the discrepancy between the actual amount of RAM needed and the one shown by "top". But still, if someone wants to know whether his PC can crunch native ATLAS tasks, your mentioned formula is deceptive. If someone new here wants to check whether he can run, e.g., a 4-core native ATLAS task on an 8GB RAM machine, he would conclude from your formula that it is not possible, although 8GB is more than enough. |
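One way to see how much the athena.py processes really occupy together is to sum the proportional set size (PSS) from /proc instead of top's RES column: PSS divides each shared page evenly among the processes mapping it, so the total does not double-count shared RAM. A rough sketch, assuming a Linux /proc layout and read access to the processes (the pgrep pattern is an assumption):

```shell
# Sum the proportional set size (PSS) of a process, in KiB.
# Unlike summing top's RES column, totals built from PSS do not
# double-count memory shared between processes.
pss_kib() {
    awk '/^Pss:/ {sum += $2} END {print sum + 0}' "/proc/$1/smaps"
}

# For the four athena.py workers one would sum over their pids, e.g.:
#   total=0
#   for pid in $(pgrep -f athena.py); do total=$((total + $(pss_kib "$pid"))); done
#   echo "$((total / 1024)) MiB actually used"
pss_kib $$   # demo: PSS of this shell itself, in KiB
```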
34)
Message boards :
Number crunching :
Memory requirements for LHC applications
(Message 37262)
Posted 7 Nov 2018 by gyllic Post:
> The formula is wrong. It should be 100 + 2000 * nCPU.
The formula 100 + 2000*nCPU is also wrong. I don't know the correct one, but since my PC with 6GB RAM can crunch a 4-core native ATLAS task without any problem, the mentioned formula can't be correct (according to it, the native task would need 100 + 2000*4 = 8100MB on a 6000MB RAM machine, which obviously would not work). It is also possible to crunch two 2-core native ATLAS tasks concurrently with 6GB RAM, so the correct formula must be a different one. |
35)
Message boards :
ATLAS application :
ATLAS issues
(Message 37136)
Posted 30 Oct 2018 by gyllic Post:
> Thanks for responding, but all your help and Google's and over 5 days of my own input have failed to get this bloody project running.
I'm sorry that getting this native ATLAS app up and running is such a pain in the ass for you. Maybe it should be mentioned that the native app is still in beta, so things may be harder to set up compared with other applications. It is a little bit weird, because I have tested it and basically copy-pasted all the commands from the previous post and from the guide, and it all worked well.
> I found that in /var/cache/dnf there is a file called "expired_repos.json", and it seems the main entry that keeps getting passed to this file is ["cvmfs"].
If you added a dnf/yum repository earlier that does not work or that you don't need anymore, you can try to disable it with "sudo dnf config-manager --set-disabled repository", where "repository" is the name of the corresponding repository.
> Testing Singularity --version, produces the version number so I thought all was now OK.
This sounds good, although (as we learned just a couple of days/weeks ago) getting just that output is not sufficient to determine whether the ATLAS tasks will work. But as you already said, for now cvmfs is the reason why your tasks are not successful.
> Doing 'cvmfs_config setup' and 'chksetup' seemed to work (at least at first), but 'cvmfs_config probe' fails every time.
As long as the probing fails, all native ATLAS tasks will also fail. What is the output of the chksetup command?
> Found that the label for /scratch/cvmfs/ is set wrong, but no matter what I do and try I can get the label to change (from 'default_t' to 'cvmfs_cache_t'), I keep getting the error that I have used an "Invalid argument".
Not sure what you mean by that. If you followed the guide, then "/scratch/cvmfs/" will be the directory where cvmfs places and searches for its local cache. You have to create that folder manually first, and you can choose a different location if you want. To simplify things for the moment, I recommend you remove the lines "CVMFS_CACHE_BASE=/scratch/cvmfs" and "CVMFS_QUOTA_LIMIT=4096" from your "/etc/cvmfs/default.local" file. You will probably have to execute "sudo cvmfs_config setup" or "sudo cvmfs_config reload" again.
> CVMFS does not seem to start and this is borne out by my failed work units, which say
I'm not sure why it says that, maybe because the probing fails. If you type "cvmfs_config --help" in the terminal, do you get an output?
> When I do 'cvmfs_config probe' the result is
Obviously this should not be the case. Have you tried to run the command "sudo service autofs restart" (i.e. the Fedora equivalent) and then tried to probe again?
> When I look in the folder /cvmfs/ there is nothing in it.
Which directory do you mean? "/etc/cvmfs"?
> So it looks like this volunteer just won't be helping ATLAS do anything. I can at least run Sixtrack I suppose.
You can still try to run the virtualbox based ATLAS app or the other virtualbox applications (Theory and LHCb at the moment). For that, yeti has written a very nice checklist which you will find on the message boards. If you want, you can send me a PM and I will look at your computer through TeamViewer to see if we can get it up and running (but I am no Fedora expert). |
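For reference, a stripped-down /etc/cvmfs/default.local along those lines could look like the following; the repository list and proxy setting are assumptions based on the native ATLAS guide, and with no CVMFS_CACHE_BASE line the cache falls back to its default location:

```
# /etc/cvmfs/default.local - minimal debugging configuration (sketch).
# Repositories/proxy are assumptions from the native ATLAS guide;
# CVMFS_CACHE_BASE and CVMFS_QUOTA_LIMIT are deliberately omitted.
CVMFS_REPOSITORIES=atlas.cern.ch,atlas-condb.cern.ch,grid.cern.ch
CVMFS_HTTP_PROXY=DIRECT
```

After editing the file, run "sudo cvmfs_config reload" and try "cvmfs_config probe" again.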
36)
Message boards :
ATLAS application :
ATLAS issues
(Message 37100)
Posted 26 Oct 2018 by gyllic Post: Hi conan, with the things you described, probably the easiest way to get cvmfs running for you is to build it from source. To do so on Fedora 25 Workstation, follow these steps:
1. Log in with your default user (that has sudo rights) and clean dnf:
sudo dnf clean all
2. Install all packages that are required to build cvmfs on Fedora 25, as well as other packages like nano (I'm not sure if all of the following packages are needed, but to simplify things just install all of them):
sudo dnf install -y cmake make automake gcc gcc-c++ kernel-devel autofs fuse fuse-devel python-devel libcap-devel git attr valgrind-devel sqlite-devel libuuid-devel uuid-devel tar patch bzip2 zlib-devel openssl-devel unzip nano
3. Continue with step 3 in the "build and install cvmfs" section of this guide https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=4840&postid=36880#36880, i.e.:
cd
mkdir cvmfs_source
git clone https://github.com/cvmfs/cvmfs.git cvmfs_source
...
Hopefully this solves your problem. On my minimal installation of Fedora 25 Workstation, cvmfs compiled and worked after the additional setup steps described in the guide. |
37)
Message boards :
ATLAS application :
ATLAS issues
(Message 37096)
Posted 25 Oct 2018 by gyllic Post:
> I found the CVMFS Package and this mostly installed what was needed but still did not compile.
Which cvmfs package do you mean? If you already have cvmfs installed from a package, there is no need to compile it any more (you can type the command "cvmfs2 --version" into the terminal and see if you get any output). You can download rpm packages from here: https://cernvm.cern.ch/portal/filesystem/downloads. Since you are on Fedora 25, I don't know if the packages provided for Fedora 27 and 28 will work for you. You can also add the yum repositories and try to install from them. This may help you: https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html
> "CMake Error at CMakeLists.txt:10 (project)
Have you installed all the packages needed to build cvmfs (i.e. cmake, gcc, etc.)? Since the guide is for Debian systems, the packages you have to install to compile cvmfs may be named differently on Fedora, and maybe you have to install more than are listed in the guide. |
38)
Message boards :
ATLAS application :
An issue with singularity
(Message 37051)
Posted 16 Oct 2018 by gyllic Post: I have set up a virtual machine with Linux Mint 17.3, and singularity also did not work with the default kernel. I then updated the kernel from the default 3.19 to 4.4.0-generic, and now singularity with the "singularity --debug exec ..." command works. So it is probably something about the default kernel that singularity does not like. My uname -a output is: Linux testing-VirtualBox 4.4.0-98-generic #121~14.04.1-Ubuntu SMP Wed Oct 11 11:54:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
39)
Message boards :
ATLAS application :
An issue with singularity
(Message 37044)
Posted 16 Oct 2018 by gyllic Post: Thanks for your information. I am no expert on these kinds of things, but my first guess is that your kernel is missing the overlay filesystem module, or is failing to load it automatically (since this is only a guess, your problem might be located somewhere completely different). It should be in the kernel since (I think) version 3.18, but maybe it is not in yours. To check, please run the command "singularity --debug exec -B /cvmfs /cvmfs/atlas.cern.ch/repo/images/singularity/x86_64-slc6.img hostname" and then run the command "lsmod". Look whether a module called "overlay" is shown in the output of the lsmod command (and whether it is used). If you don't see a module called overlay in the lsmod output, look into the folder /lib/modules/*your kernel*/kernel/fs/ and search for a folder called overlayfs (at least this is the path on Debian). I have not tested the following, so this is just to give you an idea of how you may be able to fix your problem: if the overlayfs directory is present, you can try to load the module manually with the "modprobe" command and the corresponding module name, so "modprobe overlay" or something like that. You can also add the module name to the /etc/modules file and restart your PC; this way the module should be loaded automatically at boot. If no such folder exists, your kernel probably does not have the overlay module that is needed (according to the singularity debug output). You could maybe try a backport kernel or upgrade your Mint 17.x installation to a newer one (which you should do anyway, because Mint 17.x is only supported until April 2019). |
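The checks above can be condensed into a small script; this is only a sketch under the standard Linux layout described in the post (root rights are needed for the modprobe step):

```shell
# Check whether the overlay filesystem is available to the kernel.
# /proc/filesystems lists every filesystem type the kernel can use,
# whether built in or provided by an already loaded module.
if grep -qw overlay /proc/filesystems; then
    echo "overlay filesystem is available"
elif modprobe overlay 2>/dev/null && grep -qw overlay /proc/filesystems; then
    echo "overlay module loaded successfully"
else
    echo "overlay not available; check /lib/modules/$(uname -r)/kernel/fs/"
fi
```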
40)
Message boards :
ATLAS application :
An issue with singularity
(Message 37041)
Posted 15 Oct 2018 by gyllic Post: Have you run the command "sudo make install" after compiling singularity? Can you run singularity with your default user (i.e. with a different user than boinc)? Which version of singularity do you use? Type "singularity --version" into your terminal and please post the output. Which OS are you running? Your output shows Linux 3.19; current Debian stable uses kernel 4.9.0 and oldstable uses 3.16 (the guide has only been tested on Debian Stretch, but singularity obviously should work on other OSes as well). So please post the output of "uname -a" here as well. |
©2024 CERN