Checkpointing
Author | Message |
---|---|
Joined: 12 Jun 18 Posts: 126 Credit: 53,906,164 RAC: 0 |
Does Theory still not have checkpointing for Linux? From https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=4971 (last modified 25 Mar 2019): "To suspend the application to disk so that it will survive the client exiting requires the container checkpointing feature. However, this is not currently available for Linux containers." |
Joined: 12 Jun 18 Posts: 126 Credit: 53,906,164 RAC: 0 |
With the so-called "native" Theory project, is it still necessary to go through this rigmarole to even be able to suspend while still running the BOINC client? From https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=4971 (last modified 25 Mar 2019): "Suspend/Resume: The Suspend/Resume does not work out of the box. It needs a cgroup to be created for each slot and this requires a cgroup with permissions for the user boinc. This can be provided by adding a PreStart script for boinc-client systemd. Download two files with wget: (1) sudo wget http://lhcathome.cern.ch/lhcathome/download/create-boinc-cgroup -O /sbin/create-boinc-cgroup (2) sudo wget http://lhcathome.cern.ch/lhcathome/download/boinc-client.service -O /etc/systemd/system/boinc-client.service Then run the following commands to pick up the changes: (1) sudo systemctl daemon-reload (2) sudo systemctl restart boinc-client This will only suspend the application in memory." |
Joined: 15 Jun 08 Posts: 2519 Credit: 251,153,316 RAC: 118,611 |
Running Theory (or ATLAS) native still requires lots of expert knowledge, additional settings and more babysitting. In return, you get more efficient tasks and lower total RAM requirements, especially if many tasks run concurrently on a computer with many cores. Volunteers who don't want to put in that additional work should run the vbox apps. |
Joined: 12 Jun 18 Posts: 126 Credit: 53,906,164 RAC: 0 |
"Running Theory (or ATLAS) native still requires lots of expert knowledge, additional settings and more babysitting." Not an answer to either of my questions. |
Joined: 14 Jan 10 Posts: 1411 Credit: 9,334,202 RAC: 8,031 |
"Does Theory still not have checkpointing for Linux?" "With the so-called 'native' Theory project, is it still necessary to go through this rigmarole to even be able to suspend while still running the BOINC client?" Yes to both. I'm not a Linux expert, but the only way I have found to save a running native task partway through is to create your own Linux VM on your machine (Linux or Windows), install BOINC in it, and take all the steps needed to run native tasks. With BOINC and the tasks running, you can take snapshots of that VM and restore from the last snapshot when needed. Another option, when you have to shut down the host: suspend the tasks, keeping them in memory, keep BOINC running, and close the VM, saving its state to disk. |
Joined: 15 Nov 14 Posts: 602 Credit: 24,371,321 RAC: 0 |
"With the so-called 'native' Theory project, is it still necessary to go through this rigmarole to even be able to suspend while still running the BOINC client?" I just tried it on Ubuntu 18.04.4 with BOINC 7.16.6. After letting a native Theory task run for 24 minutes, I suspended it for one minute and then resumed it. It started up again with no problem. Longer suspensions might still cause problems, though; I have not checked that. |
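
For anyone who wants to repeat the suspend/resume check from the last post on the command line, here is a minimal sketch. It assumes the `boinccmd` tool that ships with the BOINC client is installed and uses its `--task <project URL> <task name> suspend|resume` operation. The function name `suspend_resume_task`, its argument order, and the `BOINCCMD` override variable are made up for this example and are not part of BOINC or of the LHC@home instructions.

```shell
#!/bin/sh
# Sketch only: suspend one BOINC task, wait, then resume it.
# BOINCCMD is a hypothetical override (useful for a dry run);
# it defaults to the real boinccmd tool.
BOINCCMD="${BOINCCMD:-boinccmd}"

# suspend_resume_task PROJECT_URL TASK_NAME [PAUSE_SECONDS]
# Hypothetical helper; name and argument order are chosen for this example.
suspend_resume_task() {
    if [ "$#" -lt 2 ]; then
        echo "usage: suspend_resume_task PROJECT_URL TASK_NAME [PAUSE_SECONDS]" >&2
        return 2
    fi
    project_url=$1
    task_name=$2
    pause_seconds=${3:-60}   # default: one minute, as in the test reported above

    # Ask the client to suspend the task; stop here if that fails.
    "$BOINCCMD" --task "$project_url" "$task_name" suspend || return 1

    # Leave the task suspended for the requested time.
    sleep "$pause_seconds"

    # Ask the client to resume the same task.
    "$BOINCCMD" --task "$project_url" "$task_name" resume
}
```

A call would look like `suspend_resume_task https://lhcathome.cern.ch/lhcathome/ <task name> 60`, with the task name taken from the output of `boinccmd --get_tasks`. As the quoted instructions say, this only suspends the application in memory; it is not a checkpoint and will not survive the client exiting.
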
©2024 CERN