Message boards : Theory Application : Theory native tasks fail when restarted
kotenok2000
Joined: 21 Feb 11
Posts: 83
Credit: 577,613
RAC: 0
Message 50376 - Posted: 10 Jun 2024, 15:15:27 UTC
Last modified: 10 Jun 2024, 15:16:21 UTC

They fail with msg="runc run failed: container with given ID already exists"
https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=3338978
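
For anyone hitting the same error, one way to confirm it is to check whether runc still lists a container under the ID the task tries to reuse. The snippet below is only a sketch: `container_listed` is a hypothetical helper, the container ID is a placeholder, and the state directory the Theory wrapper hands to runc (if it uses a non-default `--root`) is not stated in this thread, so the `runc` call may need adjusting.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the given container ID appears in a list of
# IDs read from stdin (one ID per line, as printed by `runc list -q`).
container_listed() {
    local id="$1"
    # -F: fixed string, -x: match the whole line, -q: no output, exit status only
    grep -Fxq -- "$id"
}

# Example (needs root; "some-container-id" is a placeholder, not from the log):
#   sudo runc list -q | container_listed "some-container-id" \
#       && echo "container is still registered with runc"
```

If the ID shows up there after a restart, that would match the "already exists" message. It would not bring the interrupted task back.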
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2632
Credit: 271,713,117
RAC: 93,186
Message 50377 - Posted: 10 Jun 2024, 16:05:02 UTC - in response to Message 50376.  

Please describe the steps you took to stop and restart the task.
Is your BOINC client configured to "leave non-GPU applications in memory"?
kotenok2000
Joined: 21 Feb 11
Posts: 83
Credit: 577,613
RAC: 0
Message 50378 - Posted: 10 Jun 2024, 16:07:09 UTC - in response to Message 50377.  
Last modified: 10 Jun 2024, 16:09:50 UTC

sudo systemctl stop boinc-client

sudo systemctl start boinc-client

or
sudo systemctl reboot
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2632
Credit: 271,713,117
RAC: 93,186
Message 50379 - Posted: 10 Jun 2024, 16:16:05 UTC - in response to Message 50378.  

Theory native tasks do not survive a BOINC restart or a reboot.
kotenok2000
Joined: 21 Feb 11
Posts: 83
Credit: 577,613
RAC: 0
Message 50380 - Posted: 10 Jun 2024, 16:18:34 UTC

Before the cgroups v2 version, they restarted from the beginning.
Now they just crash.
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2632
Credit: 271,713,117
RAC: 93,186
Message 50381 - Posted: 10 Jun 2024, 16:34:20 UTC - in response to Message 50380.  

Native apps (ATLAS/Theory) were never intended to fully replace the vbox apps.
They were developed to run efficiently on datacenter hosts that are up 24/7 and have only rare, planned shutdowns.

The Theory cgroups v2 version adds the ability to pause and resume tasks without having to fiddle with cgroups manually.
This makes it a bit more cooperative on hosts that also run non-Theory tasks, but it never claimed to solve the basic limitations described above.
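
To illustrate what handling cgroups by hand involves: under cgroups v2 the kernel exposes a `cgroup.freeze` file in every cgroup directory, and writing `1` or `0` to it freezes or thaws all processes in that group. The sketch below shows only that generic kernel interface. Whether the Theory app uses it internally is an assumption, and the function names and the example path are made up.

```shell
#!/usr/bin/env bash
# Hypothetical helpers around the cgroups v2 freezer interface.
# Writing 1 to cgroup.freeze freezes the group, writing 0 thaws it.
set_cgroup_frozen() {
    local cgroup_dir="$1" state="$2"
    [ -d "$cgroup_dir" ] || { echo "no such cgroup: $cgroup_dir" >&2; return 1; }
    printf '%s\n' "$state" > "$cgroup_dir/cgroup.freeze"
}

freeze_cgroup() { set_cgroup_frozen "$1" 1; }
thaw_cgroup()   { set_cgroup_frozen "$1" 0; }

# Example (placeholder path, needs write permission on the cgroup):
#   freeze_cgroup /sys/fs/cgroup/path/to/task-cgroup
#   thaw_cgroup   /sys/fs/cgroup/path/to/task-cgroup
```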

As a user you have to decide what is more important for your system:
- run it 24/7 and plan shutdowns (which means you stop starting new native tasks early enough beforehand)
- regularly stop/restart running tasks (then run the vbox apps)
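
The first option could be scripted roughly as follows. This is a sketch under assumptions, not a tested procedure: it relies on `boinccmd` being able to reach the local client, on the `nomorework` project operation, and on `boinccmd --get_tasks` printing an `active_task_state: EXECUTING` line for each running task. The helper names and the polling interval are made up, and the project URL is a parameter you have to supply.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count running tasks in `boinccmd --get_tasks` output
# read from stdin. Assumes each running task has a line containing
# "active_task_state: EXECUTING". Counts tasks of ALL projects, not only Theory.
count_executing() {
    # grep -c prints 0 but exits non-zero when nothing matches; ignore that.
    grep -c 'active_task_state: EXECUTING' || true
}

# Hypothetical helper: stop fetching new work, wait until nothing is
# executing any more, then stop the client.
drain_and_stop() {
    local project_url="$1"
    boinccmd --project "$project_url" nomorework || return 1
    # Already-downloaded tasks still get started, so this loop ends only
    # once the local queue has been worked off.
    while [ "$(boinccmd --get_tasks | count_executing)" -gt 0 ]; do
        sleep 300   # polling interval is arbitrary
    done
    sudo systemctl stop boinc-client
}

# Example:
#   drain_and_stop "<your project URL>"
```

After the shutdown, `boinccmd --project <URL> allowmorework` lets the client fetch work again.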
