Message boards :
ATLAS application :
-1073740791 (0xC0000409) STATUS_STACK_BUFFER_OVERRUN
Joined: 2 Jan 11 Posts: 23 Credit: 5,986,899 RAC: 0
I try to run more than one 4-core Theory task and get this error. The memory_size in app_config is set to 5400.
Joined: 14 Jan 10 Posts: 1280 Credit: 8,496,817 RAC: 2,374
Strange. In stderr.txt 8400 MB is reported:
2017-03-08 18:09:23 (8408): Setting Memory Size for VM. (8400MB)
Set the memory size for Theory to 1280 MB for a 4-core VM. That's enough.
Joined: 2 Jan 11 Posts: 23 Credit: 5,986,899 RAC: 0
I am sure that is not the reason; I started my tests with 1280 MB. But I have two problems: this error, and the fact that I cannot run more than 2 ATLAS tasks at the same time. So I set the memory to 8400 MB in app_config for both projects. First I ran ATLAS -> problem: only 2 tasks. Then I followed your advice: got this error on all tasks. Now I have uninstalled VirtualBox (14), installed the latest (16), set up LHC anew, and am running new tests.
Joined: 2 Sep 04 Posts: 453 Credit: 193,569,815 RAC: 9,173
Joined: 2 Jan 11 Posts: 23 Credit: 5,986,899 RAC: 0
There is plenty of space . . .

<app_config>
  <app_version>
    <app_name>Theory</app_name>
    <plan_class>vbox64</plan_class>
    <avg_ncpus>4.000000</avg_ncpus>
    <cmdline>--memory_size_mb 1500</cmdline>
  </app_version>
  <app_version>
    <app_name>ATLAS</app_name>
    <avg_ncpus>4.000000</avg_ncpus>
    <plan_class>vbox64_mt_mcore_atlas</plan_class>
    <cmdline>--memory_size_mb 8400</cmdline>
  </app_version>
</app_config>

For the current test I reduced the Theory memory size; three Theory tasks are now running - but only two 4-core ATLAS tasks, with one waiting.
Joined: 15 Jun 08 Posts: 2413 Credit: 226,529,990 RAC: 131,649
What RAM limits (in %) did you set in "Options -> Computing preferences..."? Do all of your currently running and paused jobs fit into the configured RAM?
Joined: 2 Jan 11 Posts: 23 Credit: 5,986,899 RAC: 0
I have a 12-core/24-thread CPU with 64 GB RAM. In the options, RAM usage was set to 90% max; the project can run 5 tasks max. The current test uses 83% of CPU and 26% of RAM with 2 ATLAS and 3 Theory tasks; 3 tasks are waiting (1 ATLAS and 2 Theory).
Joined: 15 Jun 08 Posts: 2413 Credit: 226,529,990 RAC: 131,649
I guess you have no other job from any other project running beside ATLAS or Theory, do you?
Try to locate (carefully!) all occurrences of the tag
<rsc_memory_bound>123456789.123456</rsc_memory_bound>
in the file client_state.xml and sum up the values for jobs that are running/paused/waiting for memory. Does the result exceed your configured RAM limit (64 GB x 0.9 = 57.6 GB)?
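The summing described above can be scripted. This is a hedged sketch only: the regex matches the tag exactly as quoted in this thread, and the sample values are the ones that come up later in the discussion - you would feed in the text of your own client_state.xml instead.

```python
import re

def total_memory_bound(xml_text):
    """Sum all <rsc_memory_bound> values (in bytes) found in client_state.xml text."""
    values = re.findall(r"<rsc_memory_bound>([\d.]+)</rsc_memory_bound>", xml_text)
    return sum(float(v) for v in values)

# Sample input: two 4-core ATLAS tasks plus one Theory task
# (values taken from this thread, not measured here).
sample = (
    "<rsc_memory_bound>26633830400.000000</rsc_memory_bound>"
    "<rsc_memory_bound>26633830400.000000</rsc_memory_bound>"
    "<rsc_memory_bound>500000000.000000</rsc_memory_bound>"
)
print(total_memory_bound(sample) / 1e9)  # 53.7676608 decimal GB, just under 57.6
```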
Joined: 2 Jan 11 Posts: 23 Credit: 5,986,899 RAC: 0
LHC is the only project. rsc_memory_bound for ATLAS is 26633830400.00 for each task; for Theory it is 500000000.00. This is in bytes?! Then 2 ATLAS tasks have over 53 GB reserved?! Am I right?
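For readers checking the arithmetic: yes, the tag is in bytes, and the two figures quoted in this thread ("over 53 GB" and "24.8 GB per task") differ only because one uses decimal GB and the other binary GiB. A quick check:

```python
per_task = 26633830400  # bytes, the ATLAS rsc_memory_bound quoted above

print(per_task / 1e9)      # 26.6338304 decimal GB per task
print(per_task / 2**30)    # ~24.8 binary GiB per task (the "24.8 GB" in the reply)
print(2 * per_task / 1e9)  # ~53.3, two ATLAS tasks: "over 53 GB" reserved
```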
Joined: 2 Jan 11 Posts: 23 Credit: 5,986,899 RAC: 0
Question: can I change these values in client_state.xml? (carefully!)
Joined: 14 Jan 10 Posts: 1280 Credit: 8,496,817 RAC: 2,374
LHC is the only project.
Yes, that's right. Each task has 24.8 GB reserved. That's why the other tasks are waiting for memory.
Question: can I change these values in client_state.xml? (carefully!)
Yes, that's possible, but first you have to shut down the BOINC client. Edit client_state.xml with a plain text editor and change
<rsc_memory_bound>26633830400.000000</rsc_memory_bound>
into
<rsc_memory_bound>5662310400.000000</rsc_memory_bound>
for 5400 MB of memory for a 4-core ATLAS task. Do this for all (including not yet started) workunits, save the file, and restart the BOINC client.
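The replacement value above is simply megabytes converted to bytes, assuming the usual 1 MB = 1024 x 1024 bytes convention:

```python
def mb_to_bytes(mb):
    # 1 MB = 1024 * 1024 bytes
    return mb * 1024 * 1024

print(mb_to_bytes(5400))  # 5662310400, the value suggested above
```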
Joined: 2 Jan 11 Posts: 23 Credit: 5,986,899 RAC: 0
What is the difference between "Setting Memory Size for VM" (8400MB)

2017-03-09 09:03:40 (5764): vboxwrapper (7.7.26196): starting
2017-03-09 09:03:40 (5764): Feature: Checkpoint interval offset (300 seconds)
2017-03-09 09:03:40 (5764): Detected: VirtualBox COM Interface (Version: 5.1.16)
2017-03-09 09:03:40 (5764): Detected: Minimum checkpoint interval (900.000000 seconds)
2017-03-09 09:03:40 (5764): Successfully copied 'init_data.xml' to the shared directory.
2017-03-09 09:03:40 (5764): Create VM. (boinc_0521eef5705f3a8d, slot#0)
2017-03-09 09:03:40 (5764): Setting Memory Size for VM. (8400MB)
2017-03-09 09:03:40 (5764): Setting CPU Count for VM. (4)
2017-03-09 09:03:40 (5764): Setting Chipset Options for VM.

and "rsc_memory_bound" in client_state.xml?
Is the number of CPUs set on the LHC page used to calculate these values? And is this number the physical or the logical CPU count? Perhaps I should set the physical number . . . And: I will try changing it . . .
Joined: 15 Jun 08 Posts: 2413 Credit: 226,529,990 RAC: 131,649
... rsc_memory_bound for ATLAS is 26633830400.00 for each task ...
Right. If this is not a typo: 24.8 GB per task.
... Question: can I change these values in client_state.xml? (carefully!) ...
Not recommended, but possible, as CP wrote (be VERY CAREFUL!!!).
The <rsc_memory_bound> tag is included in the server reply when you receive a new workunit. It is the responsibility of the project developers to fill it with a reasonable value. My proposal: the value for <rsc_memory_bound> should be derived from the <avg_ncpus> tag. Thus the user can set the number of CPUs to be used inside an app_config.xml, and the project can calculate the necessary amount of RAM for n CPUs.
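The proposal could look something like this on the server side. This is a hedged sketch only: the base and per-core figures are invented for illustration and are not the project's actual sizing formula.

```python
def rsc_memory_bound(avg_ncpus, base_mb=3000, per_core_mb=1000):
    """Illustrative RAM bound (in bytes) that scales with the requested core count.

    base_mb and per_core_mb are made-up assumptions, not project values.
    """
    return (base_mb + per_core_mb * avg_ncpus) * 1024 * 1024

print(rsc_memory_bound(4))  # 7340032000 bytes for a 4-core task
print(rsc_memory_bound(1))  # a 1-core task would get a proportionally smaller bound
```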
Joined: 14 Jan 10 Posts: 1280 Credit: 8,496,817 RAC: 2,374
The <rsc_memory_bound> tag is included in the server reply when you receive a new workunit.
I suppose the project has tied rsc_memory_bound to the given plan_class vbox64_mt_mcore_atlas and does not obey the setting from app_config.xml.
Joined: 15 Jun 08 Posts: 2413 Credit: 226,529,990 RAC: 131,649
... 2017-03-09 09:03:40 (5764): Setting Memory Size for VM. (8400MB) ...
Your virtual machine is configured with 8400 MB RAM. Your workunit consists of more than the VM, e.g. the vboxwrapper. With <rsc_memory_bound> the BOINC client gets information about the RAM usage of the complete WU and can calculate whether other projects can run in parallel.
... Is the number of CPUs set on the LHC page used to calculate these values? ...
It should be, but IMHO this is not implemented as it should be.
... is this number of CPUs the physical or the logical CPUs? perhaps I should set the physical number . . . ...
Logical. You shouldn't worry about this too much for the moment.
Joined: 15 Jun 08 Posts: 2413 Credit: 226,529,990 RAC: 131,649
The <rsc_memory_bound> tag is included in the server reply when you receive a new workunit.
And that's exactly the pitfall, as it only works if you run one (multicore) job from one project. Here we need more flexibility.
Joined: 2 Jan 11 Posts: 23 Credit: 5,986,899 RAC: 0
Thanks for all the information - now I can continue with specific testing!
Joined: 2 Sep 04 Posts: 453 Credit: 193,569,815 RAC: 9,173
Joined: 2 Jan 11 Posts: 23 Credit: 5,986,899 RAC: 0
The above error is "solved" (?): I detached from LHC and reattached, and the new tasks finished OK (I do not know why). My second problem (no more than 2 ATLAS tasks) will be "solved" too: I set the number of CPUs on the server to a lower value, so the calculation of rsc_memory_bound gives a lower value (this test is running today). At the moment I am running my CPU at its limit with 4-core tasks for testing, but later I will run at 60-75% load (4 cores is a good number, I heard).
©2024 CERN