Message boards : Theory Application : New Native Theory Version 1.1
Joined: 6 Sep 08 Posts: 118 Credit: 12,657,568 RAC: 4,021
Not sure if I'm making progress or not... I found and installed libseccomp 2.4.1 (the regular way, from an lxc PPA), which produced this:

```
18:17:01 BST +01:00 2019-08-03: cranky-0.0.29: [INFO] Updating config.json.
18:17:02 BST +01:00 2019-08-03: cranky-0.0.29: [INFO] Running Container 'runc'.
container_linux.go:336: starting container process caused "process_linux.go:293: applying cgroup configuration for process caused \"mountpoint for devices not found\""
18:17:02 BST +01:00 2019-08-03: cranky-0.0.29: [ERROR] Container 'runc' terminated with status code 1.
18:17:02 (3846): cranky exited; CPU time 0.005645
18:17:02 (3846): app exit status: 0xce
18:17:02 (3846): called boinc_finish(195)
18:17:03 BST +01:00 2019-08-03: cranky-0.0.29: [INFO]
```

I found this https://stackoverflow.com/questions/22555264/docker-hello-world-not-working/22555932#22555932 and this https://github.com/opencontainers/runc/issues/798 so I installed cgroup-lite 1.11, with this result:

```
18:34:16 BST +01:00 2019-08-03: cranky-0.0.29: [INFO] Checking runc.
18:34:16 BST +01:00 2019-08-03: cranky-0.0.29: [INFO] Creating the filesystem.
18:34:16 BST +01:00 2019-08-03: cranky-0.0.29: [INFO] Using /cvmfs/cernvm-prod.cern.ch/cvm3
18:34:16 BST +01:00 2019-08-03: cranky-0.0.29: [INFO] Updating config.json.
18:34:16 BST +01:00 2019-08-03: cranky-0.0.29: [INFO] Running Container 'runc'.
18:34:48 BST +01:00 2019-08-03: cranky-0.0.29: [INFO] ===> [runRivet] Sat Aug 3 17:34:48 UTC 2019 [boinc pp mb-inelastic 7000 - - phojet 1.12a default 100000 84]
18:34:53 BST +01:00 2019-08-03: cranky-0.0.29: [ERROR] Container 'runc' terminated with status code 1.
18:34:54 (4451): cranky exited; CPU time 0.044983
18:34:54 (4451): app exit status: 0xce
18:34:54 (4451): called boinc_finish(195)
```

All subsequent tasks end like this:

```
18:45:42 BST +01:00 2019-08-03: cranky-0.0.29: [INFO] Checking runc.
18:45:42 BST +01:00 2019-08-03: cranky-0.0.29: [INFO] Creating the filesystem.
18:45:42 BST +01:00 2019-08-03: cranky-0.0.29: [INFO] Using /cvmfs/cernvm-prod.cern.ch/cvm3
18:45:42 BST +01:00 2019-08-03: cranky-0.0.29: [INFO] Updating config.json.
18:45:42 BST +01:00 2019-08-03: cranky-0.0.29: [INFO] Running Container 'runc'.
standard_init_linux.go:203: exec user process caused "too many levels of symbolic links"
18:45:44 BST +01:00 2019-08-03: cranky-0.0.29: [ERROR] Container 'runc' terminated with status code 1.
18:45:44 (4733): cranky exited; CPU time 0.014341
18:45:44 (4733): app exit status: 0xce
18:45:44 (4733): called boinc_finish(195)
18:45:44 BST +01:00 2019-08-03: cranky-0.0.29: [INFO]
```

I haven't explicitly added any symlinks. Now I'm stuck.
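For anyone hitting the same "mountpoint for devices not found" error, a few generic checks show whether the devices cgroup controller is actually mounted. This is only a sketch, assuming a cgroup-v1 system; mount points can differ between distributions:

```
# List the controllers the kernel knows about; 'devices' should appear here
grep devices /proc/cgroups

# Check whether a cgroup filesystem is mounted for the devices controller
mount | grep -E 'cgroup.*devices'

# If it is missing, mounting it by hand is one possible workaround (cgroup v1 only)
sudo mkdir -p /sys/fs/cgroup/devices
sudo mount -t cgroup -o devices cgroup /sys/fs/cgroup/devices
```

Installing a package such as cgroup-lite (as above) is supposed to do the equivalent automatically at boot.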
Joined: 2 May 07 Posts: 2258 Credit: 174,552,610 RAC: 33,626
The third run will need a lot of time to finish this task:

```
[boinc pp jets 13000 250,-,4160 - sherpa 2.2.0 default 2000 72]
```

The first computer canceled after 4 days, the second after 26 days: https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=117408514
Joined: 2 May 07 Posts: 2258 Credit: 174,552,610 RAC: 33,626
https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=120079734

```
05:14:23 CEST +02:00 2019-08-09: cranky-0.0.29: [INFO] ===> [runRivet] Fri Aug 9 03:14:22 UTC 2019 [boinc pp jets 8000 25 - pythia6 6.423 pro-q2o 100000 86]
06:07:08 CEST +02:00 2019-08-09: cranky-0.0.29: [ERROR] Container 'runc' terminated with status code 1.
```

Both ran into status code 1.
Joined: 2 May 07 Posts: 2258 Credit: 174,552,610 RAC: 33,626
https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=122039899

```
[boinc pp jets 8000 25 - pythia8 8.180 default 100000 90]
16:07:04 UTC +00:00 2019-08-19: cranky-0.0.29: [ERROR] Container 'runc' terminated with status code 1.
```

https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=121978893
Joined: 30 Aug 14 Posts: 145 Credit: 10,847,070 RAC: 0
Hi everyone! Theory Native is single core only, right? Thanks!!

Why mine when you can research? - GRIDCOIN - Real cryptocurrency without wasting hashes! https://gridcoin.us
Joined: 14 Jan 10 Posts: 1439 Credit: 9,629,765 RAC: 2,577
> Theory Native is single core only, right?

Yes it is, but when you have idle CPU time left it will use some extra CPU time to speed up the job. Example: https://lhcathome.cern.ch/lhcathome/result.php?resultid=244323423
Joined: 15 Nov 14 Posts: 602 Credit: 24,371,321 RAC: 0
> Theory Native is single core only, right?

It used to use two cores, but seems to be one now.
Joined: 15 Jun 08 Posts: 2571 Credit: 258,922,359 RAC: 119,059
> Theory Native is single core only, right?

From the perspective of your BOINC client: YES

But! A typical Theory native pstree looks like this:

```
cranky-0.0.29───runc─┬─job───runRivet.sh─┬─rivetvm.exe
                     │                   ├─runRivet.sh───sleep
                     │                   ├─rungen.sh───pythia8.exe
                     │                   └─sleep
                     └─8*[{runc}]
```

In this example the 2 main processes are rivetvm.exe and pythia8.exe. Their total CPU share is usually greater than 1.
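To reproduce such a view on your own host, something along these lines should work (a sketch; it assumes the standard pstree and pgrep tools are installed and that one Theory native task is running):

```
# Show the process tree of the oldest process whose command line matches 'cranky'
pstree -p "$(pgrep -of cranky)"
```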
Joined: 15 Nov 14 Posts: 602 Credit: 24,371,321 RAC: 0
> In this example the 2 main processes are rivetvm.exe and pythia8.exe.

I have noticed that on native ATLAS even more so. You can set whatever you want in an app_config.xml, but it will use all your cores somehow. That can make it difficult to run alongside other projects, by the way. I think it is best to devote a machine to it.
Joined: 15 Jun 08 Posts: 2571 Credit: 258,922,359 RAC: 119,059
IIRC ATLAS native requires the nthreads option to be set, e.g.:

```
<avg_ncpus>2.0</avg_ncpus>
<cmdline>--nthreads 2</cmdline>
```

VBox apps do not need "nthreads" if <avg_ncpus> is already set.
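For reference, a minimal complete app_config.xml around those two lines might look like the sketch below. The app name and plan class are assumptions here; check the entries in your client_state.xml for the exact values:

```
<app_config>
    <app_version>
        <app_name>ATLAS</app_name>          <!-- assumed; verify against client_state.xml -->
        <plan_class>native_mt</plan_class>  <!-- assumed plan class of the native app -->
        <avg_ncpus>2.0</avg_ncpus>          <!-- cores BOINC budgets for each task -->
        <cmdline>--nthreads 2</cmdline>     <!-- threads the ATLAS job itself starts -->
    </app_version>
</app_config>
```

The file belongs in the project's directory under the BOINC data directory and is picked up via "Options → Read config files" or a client restart.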
Joined: 15 Nov 14 Posts: 602 Credit: 24,371,321 RAC: 0
> IIRC ATLAS native requires the nthreads option to be set, e.g.

The last time I did native ATLAS before putting in an app_config, it just used all the cores available. That was eight cores (as shown by BOINC), which maybe is all that ATLAS will use anyway?
Joined: 13 Jul 05 Posts: 169 Credit: 15,000,737 RAC: 0
> The last time I did native ATLAS before putting in an app_config, it just used all the cores available.

But did you actually apply any limit? I've never seen ATLAS native disobey the "Max # CPUs" set through the website (which I believe translates into said --nthreads, for those of us who don't mess with XML), and yes, I've tried at least values of 1, 2, 4 and "No limit"... IIRC the ATLAS limit is 12 cores, but that's from a while back. Even at 8 cores there's noticeable inefficiency from the single-threaded start-up/shutdown phases.
Joined: 15 Nov 14 Posts: 602 Credit: 24,371,321 RAC: 0
> I've never seen ATLAS native disobey the "Max # CPUs" set through the website (which I believe translates into said --nthreads, for those of us who don't mess with XML), and yes, I've tried at least values of 1, 2, 4 and "No limit"...

BOINC will show however many cores you set in the app_config. But run a "top" command and you will see that all the cores are being used regardless of what you have set (at least up to 8, the maximum I tried). (I normally just set "Max # CPUs" to either 8 or unlimited, so I don't know how a lesser value would affect it; I use the app_config to limit it further.)
Joined: 13 Jul 05 Posts: 169 Credit: 15,000,737 RAC: 0
But run a "top" command, and you will see that all the cores are being used regardless of what you have set (at least up to 8, the maximum I tried). I think I misunderstood what you were trying to say (and then expressed myself badly). Long ago, in Sixtrack-only days, "Max # CPUs" meant roughly "Max. total cores to be used for BOINCing, leaving the rest free for the user's day job." VBox and Atlas-native tasks instead interpret it to mean "Cores/task", so you're right that that setting can't be used to control overall load on the machine any more. I presently still have "Max # CPUs" set to 4, to get the 8-core machine to refill with Atlas efficiently after running Sixtracks; top gives 7587 boinc 39 19 2618m 1.8g 11m R 99.8 11.4 263:33.42 athena.py 7589 boinc 39 19 2602m 1.8g 11m R 99.5 11.5 263:42.48 athena.py 32249 boinc 39 19 2597m 1.8g 44m R 99.1 11.3 1:15.13 athena.py 7586 boinc 39 19 2601m 1.8g 12m R 98.8 11.4 263:40.89 athena.py 7588 boinc 39 19 2602m 1.8g 11m R 97.8 11.4 263:19.11 athena.py 32250 boinc 39 19 2597m 1.8g 44m R 96.5 11.3 1:15.65 athena.py 32248 boinc 39 19 2597m 1.8g 44m R 95.2 11.3 1:16.12 athena.py 32251 boinc 39 19 2597m 1.8g 44m R 95.2 11.3 1:14.79 athena.py i.e. there are two 4-core tasks, one just started and one nearly 5 hours in. (A couple of weeks back there'd have been one 4-core task and some Sixtracks). So, indeed I've never seen Atlas-native (or indeed VBox) disobey the "Max # CPUs" set through the website, with the proviso that "Max # CPUs" there actually means "Cores/task", rather than what it used to/should do. The next step would be to then limit the number of tasks running at any one time, which is probably best done through an app_config; I can't remember if I've ever tried setting "Max # jobs" to just one, and anyway Atlas-native has, er, idiosyncratic ideas about how many tasks it queues at the client for strange beancounting purposes. (I read your The last time I did native ATLAS before putting in an app_config, it just used all the cores available.as implying that an Atlas-native task would grab any extra unused cores it sees at runtime irrespective of --nthreads, however that's been set, which I don't believe is true.) |
Joined: 28 Sep 04 Posts: 739 Credit: 50,843,250 RAC: 40,550
> So, indeed I've never seen ATLAS native (or indeed VBox) disobey the "Max # CPUs" set through the website, with the proviso that "Max # CPUs" there actually means "cores per task", rather than what it used to/should do.

This "Max # CPUs" is in the project settings. The one you are probably confusing it with is still in the BOINC settings and is called "Use at most XX % of the CPUs".
Joined: 15 Jun 08 Posts: 2571 Credit: 258,922,359 RAC: 119,059
Could the ATLAS discussion please be continued in the ATLAS thread?

> Long ago, in Sixtrack-only days, "Max # CPUs" meant roughly "Max. total cores to be used for BOINCing, leaving the rest free for the user's day job."

The option "Max # CPUs" was never meant to limit the total cores for BOINC, since BOINC has an option for that which can be used in app_config.xml. Instead, "Max # CPUs" was introduced to set exactly what it is used for today.

Unfortunately ATLAS uses "Max # CPUs" to also limit the number of tasks that can be downloaded - a request from (unknown) accountants, as David Cameron explained a long while ago. This results in client buffers that are only partly filled, especially on CPUs with lots of cores.
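The app_config.xml option alluded to is presumably project_max_concurrent, which caps how many tasks of one project run at the same time (a sketch; the value 4 is just an example):

```
<app_config>
    <!-- run at most 4 tasks from this project concurrently, regardless of app -->
    <project_max_concurrent>4</project_max_concurrent>
</app_config>
```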
Joined: 15 Nov 14 Posts: 602 Credit: 24,371,321 RAC: 0
> (I read your "The last time I did native ATLAS before putting in an app_config, it just used all the cores available." as implying that an ATLAS native task would grab any extra unused cores it sees at runtime irrespective of --nthreads, however that's been set, which I don't believe is true.)

I have not used --nthreads for a while and don't recall how it behaves, so you are probably correct. I just use <avg_ncpus>, which uses all the (virtual) cores, i.e. threads, on native ATLAS. I don't think it does that on native Theory, but I am not running it at the moment.
Joined: 29 Sep 04 Posts: 281 Credit: 11,866,264 RAC: 0
A couple of months ago I got fed up with a series of blue-screen-of-death loops on an old 2-core Athlon that wasn't happy with a Windows update, so I completely reformatted it and put Linux Mint on instead. After a few failed attempts, I have got it successfully running Theory Native.

I'm not entirely convinced that it's doing exactly as it should, as during setup it didn't seem to like cvmfs_config and autofs, although "probe" returned OKs, and some finished tasks look too quick to have done much (unless this Native is waaay faster than on VBox/Windows). However, it is returning McPlots, so it must be working OK.

It's running an ordinary VBox Theory, where "Show VM Console" and "Show Graphics" let me see various outputs. My Native task doesn't have those buttons, although I can see it is currently running a Herwig through "top" in a terminal. Can I get to those remote-desktop live outputs (events processed, etc.) without being too Linuxy? (I'm not a convert yet. I still prefer point-and-click to Terminal.)
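For anyone wanting to repeat the CVMFS sanity checks mentioned above, both commands ship with the cvmfs client package (the repository name is the one Theory native mounts, per the logs earlier in this thread):

```
# Should report OK for every configured repository
cvmfs_config probe

# Show cache and network details for the Theory repository
cvmfs_config stat cernvm-prod.cern.ch
```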
Joined: 15 Jun 08 Posts: 2571 Credit: 258,922,359 RAC: 119,059
Theory native doesn't have a point-and-click interface to your BOINC client. To monitor the progress of your running task, like in console 2 of a VBox task, you may do the following:

Open a console window (either directly at the Linux host or remotely from another computer). In this console window run the command:

```
tail -Fn100 /path_to_your_boinc_client/slots/x/cernvm/shared/runRivet.log
```

x must be replaced by the slot number of your running task.
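If you don't know the slot number, letting the shell glob over all slots also works. The path below assumes the default Debian/Ubuntu BOINC data directory; adjust it to your installation:

```
# Follow the Theory log in whichever slot(s) it appears
tail -Fn100 /var/lib/boinc-client/slots/*/cernvm/shared/runRivet.log
```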
Joined: 29 Sep 04 Posts: 281 Credit: 11,866,264 RAC: 0
OK, thanks, I thought it might be fiddly 😵

I don't think I'll be setting up the suspend/resume stuff either. Well, not until I have the time to get it wrong a few times.