Message boards : Theory Application : New native version v300.08

maeax

Joined: 2 May 07
Posts: 2108
Credit: 159,819,192
RAC: 107,232
Message 49519 - Posted: 11 Feb 2024, 12:41:26 UTC

Theory is back with Tasks.
Thank you. Sunday!
ID: 49519 · Report as offensive     Reply Quote
Profile tazzduke

Joined: 24 Jun 10
Posts: 42
Credit: 5,348,959
RAC: 18,397
Message 49811 - Posted: 22 Mar 2024, 12:31:28 UTC - in response to Message 49136.  

Tried on both Linux Mint 20.3 (Ubuntu 20.04) and 21.2 (Ubuntu 22.04). I guess I can't run Theory any more until sudo 1.9.10 gets added to the Ubuntu repository.
Found Sudo-Version 1.9.9.
This sudo version is lower than 1.9.10.
It does not support regular expressions.
Hence, sudoers will not be modified.
Error running /tmp/prepare_theory_native_environment


I know I am most likely late to the party on this one, but I only just started again with Theory Native, and ran into the same problem as Aurum.

I visited the sudo website at https://www.sudo.ws/ and grabbed the latest package for my distribution.

Installed the latest version and then re-ran the command from Laurence's OP.
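
For anyone who wants to check up front whether their sudo is new enough, here is a minimal sketch. It assumes GNU `sort -V` is available and that `sudo -V` prints "Sudo version X.Y.Z" on its first line. The function names are made up for illustration and are not part of the Theory setup script.

```shell
#!/bin/sh
# Hypothetical helper, not part of prepare_theory_native_environment.

# version_ge A B: succeeds when version A >= version B (relies on GNU sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# parse_sudo_version: reads `sudo -V` output on stdin, prints the numeric version.
parse_sudo_version() {
    sed -n '1s/^Sudo version \([0-9][0-9.]*\).*/\1/p'
}

installed=$(sudo -V 2>/dev/null | parse_sudo_version)
if [ -z "$installed" ]; then
    echo "Could not determine the sudo version."
elif version_ge "$installed" 1.9.10; then
    echo "sudo $installed is at least 1.9.10."
else
    echo "sudo $installed is lower than 1.9.10."
fi
```

A patch suffix such as "p5" is dropped before the comparison, so only the numeric part is compared.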

Here is one of my validated workunits:

https://lhcathome.cern.ch/lhcathome/result.php?resultid=408095114

Cheers
ID: 49811 · Report as offensive     Reply Quote
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert

Joined: 15 Jun 08
Posts: 2418
Credit: 226,702,146
RAC: 130,768
Message 49812 - Posted: 22 Mar 2024, 13:24:19 UTC - in response to Message 49811.  

Looks good.


Since a major goal of that version is to make suspend/resume work via systemd, you may want to test this.

Select a currently running task in BOINC manager (or your preferred BOINC tool) and pause the task.
Test this with a task that has already started the container (see stderr.txt).
Then this should happen:

1. You should find a corresponding "Pausing systemd unit ..." line in the task's stderr.txt.

2. Run the "systemctl status ..." command shown in stderr.txt (press 'q' to exit the pager).
The output should mention the scope as "frozen".


A while later, resume the task via the BOINC management tool.
Then check stderr.txt and the scope status again.


Hint:
Although it would be possible to manually freeze/thaw the scope via systemctl, this should not be done because BOINC will not be notified.
Hence, always use BOINC for this.
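
If you prefer a one-line answer over reading the pager output, a read-only check could look roughly like the sketch below. It assumes your systemd exposes the `FreezerState` unit property through `systemctl show`; the helper name `describe_state` is made up for illustration, and the unit name is only an example to be replaced with the one from your own stderr.txt. It only reads the state and never freezes or thaws anything.

```shell
#!/bin/sh
# Read-only sketch: reports the state, never changes it.

# describe_state LOADSTATE FREEZERSTATE: turn the two property values into a message.
describe_state() {
    if [ "$1" != "loaded" ]; then
        echo "unit is not loaded (state: ${1:-unknown})"
        return
    fi
    case "$2" in
        frozen)  echo "scope is frozen (task suspended)" ;;
        running) echo "scope is running" ;;
        *)       echo "scope freezer state: ${2:-unknown}" ;;
    esac
}

# Example unit name only; pass the one from your task's stderr.txt as first argument.
unit=${1:-Theory_2743-2785673-11_0.scope}
load=$(systemctl show --property=LoadState --value "$unit" 2>/dev/null)
freezer=$(systemctl show --property=FreezerState --value "$unit" 2>/dev/null)
describe_state "$load" "$freezer"
```

The LoadState check is there because `systemctl show` also prints default values for units that do not exist.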
ID: 49812 · Report as offensive     Reply Quote
Profile tazzduke

Joined: 24 Jun 10
Posts: 42
Credit: 5,348,959
RAC: 18,397
Message 49813 - Posted: 22 Mar 2024, 13:57:56 UTC - in response to Message 49812.  

Ok, will try and find some time on the weekend and give it a go.

Cheers
ID: 49813 · Report as offensive     Reply Quote
Profile tazzduke

Joined: 24 Jun 10
Posts: 42
Credit: 5,348,959
RAC: 18,397
Message 49819 - Posted: 23 Mar 2024, 1:05:11 UTC - in response to Message 49813.  

Good morning,

As suggested, I went and selected a running Theory task and paused it in BOINC.

Looked in the matching stderr.txt for the task to get the command.

Ran the command - systemctl status Theory_2743-2733248-9_1.scope

This was the output - Unit Theory_2743-2733248-9_1.scope could not be found.

Resumed task, task reset itself back to 0.00 percent and then finished as a computation error.

Here is the task - https://lhcathome.cern.ch/lhcathome/result.php?resultid=408107778

Hopefully someone else has had success. That would mean my setup is partially correct, but suspend/resume is not set up correctly.

Cheers
ID: 49819 · Report as offensive     Reply Quote
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert

Joined: 15 Jun 08
Posts: 2418
Credit: 226,702,146
RAC: 130,768
Message 49820 - Posted: 23 Mar 2024, 9:07:53 UTC - in response to Message 49819.  

The log entries should look a bit like these:
06:14:37 CET +01:00 2024-03-23: cranky: [INFO] Starting runc container.
06:14:38 CET +01:00 2024-03-23: cranky: [INFO] To get some details on systemd level run
06:14:38 CET +01:00 2024-03-23: cranky: [INFO] systemctl status Theory_2743-2785673-11_0.scope
06:14:38 CET +01:00 2024-03-23: cranky: [INFO] mcplots runspec: boinc pp jets 7000 80,-,1060 - herwig++ 2.7.1 UE-EE-5 100000 11
06:14:38 CET +01:00 2024-03-23: cranky: [INFO] ----,^^^^,<<<~_____---,^^^,<<~____--,^^,<~__;_
06:21:28 CET +01:00 2024-03-23: cranky: [INFO] Pausing systemd unit Theory_2743-2785673-11_0.scope
06:22:27 CET +01:00 2024-03-23: cranky: [INFO] Resuming systemd unit Theory_2743-2785673-11_0.scope
06:32:49 CET +01:00 2024-03-23: cranky: [INFO] Pausing systemd unit Theory_2743-2785673-11_0.scope
06:32:58 CET +01:00 2024-03-23: cranky: [INFO] Resuming systemd unit Theory_2743-2785673-11_0.scope
06:33:24 CET +01:00 2024-03-23: cranky: [INFO] Pausing systemd unit Theory_2743-2785673-11_0.scope
06:34:03 CET +01:00 2024-03-23: cranky: [INFO] Resuming systemd unit Theory_2743-2785673-11_0.scope
07:42:57 CET +01:00 2024-03-23: cranky: [INFO] Container Theory_2743-2785673-11_0 finished with status code 0.
07:42:57 CET +01:00 2024-03-23: cranky: [INFO] Preparing output.
07:42:58 (102851): cranky exited; CPU time 5042.031816
07:42:58 (102851): called boinc_finish(0)



Yours look weird:
06:16:59 AWST +08:00 2024-03-23: cranky-0.1.4: [INFO] mcplots runspec: boinc pp jets 13000 260 - pythia6 6.428 ambt1 100000 9
06:16:59 AWST +08:00 2024-03-23: cranky-0.1.4: [INFO] ----,^^^^,<<<~_____---,^^^,<<~____--,^^,<~__;_
07:39:34 (590135): wrapper (7.15.26016): starting
07:39:34 (590135): wrapper (7.15.26016): starting
.
.
.
time="2024-03-23T07:39:38+08:00" level=error msg="container with id exists: Theory_2743-2733248-9_1"

It looks like the task started from scratch (for an unknown reason).
It finally failed because runc didn't remove the container ID from the first attempt.
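
To spot this pattern quickly in a long stderr.txt, a small summary like the following sketch may help. It only counts the kinds of lines shown in the two logs above; the function name `summarize_log` is made up, and it assumes stderr.txt sits in the current directory.

```shell
#!/bin/sh
# summarize_log: read a stderr.txt on stdin and count the relevant entries.
summarize_log() {
    awk '
        /Pausing systemd unit/      { pauses++ }
        /Resuming systemd unit/     { resumes++ }
        /wrapper \(.*\): starting/  { starts++ }
        /container with id exists/  { clashes++ }
        END {
            printf "pauses=%d resumes=%d wrapper_starts=%d id_clashes=%d\n",
                   pauses, resumes, starts, clashes
        }'
}

# Usage, assuming stderr.txt is in the current directory:
if [ -r stderr.txt ]; then
    summarize_log < stderr.txt
fi
```

A non-zero id_clashes count together with wrapper start lines that appear again later in the log would match the restart described above.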


Which systemctl version do you use (must be at least v246)?
Please post the output of "systemctl --version" plus the status output of a currently running Theory task.
You get the latter via a command like this:
systemctl --no-pager status Theory_2743-2733248-9_1.scope
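
If you want to check the version requirement in a script, here is a rough sketch. It assumes the first line of "systemctl --version" has the form "systemd NNN (...)"; the function names are made up for illustration.

```shell
#!/bin/sh
# parse_systemd_version: read `systemctl --version` output on stdin,
# print the leading number of the version field from the first line.
parse_systemd_version() {
    awk 'NR == 1 && $1 == "systemd" { v = $2; sub(/[^0-9].*$/, "", v); print v }'
}

# new_enough VERSION: succeeds when VERSION is a plain number >= 246.
new_enough() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$1" -ge 246 ]
}

ver=$(systemctl --version 2>/dev/null | parse_systemd_version)
if new_enough "$ver"; then
    echo "systemd $ver: OK (at least 246)"
else
    echo "systemd version '$ver' is missing or older than 246"
fi
```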
ID: 49820 · Report as offensive     Reply Quote
Profile tazzduke

Joined: 24 Jun 10
Posts: 42
Credit: 5,348,959
RAC: 18,397
Message 49821 - Posted: 23 Mar 2024, 10:53:46 UTC - in response to Message 49820.  

Evening,

Thank you for the extra steps to look at and also an output of what it is supposed to look like.

I have to step out for the evening, but I can post this bit of info on my system, using the following version of systemd.

systemd 249 (249.11-0ubuntu3.12)

Regards
ID: 49821 · Report as offensive     Reply Quote
Profile tazzduke

Joined: 24 Jun 10
Posts: 42
Credit: 5,348,959
RAC: 18,397
Message 49822 - Posted: 23 Mar 2024, 23:52:48 UTC - in response to Message 49821.  

It's all working now. I'm not too sure why, but maybe a reboot did something, lol.

Here is a work unit from LHC-Dev (I had it running a few of the Theory tasks over there):

https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=3310417

I am getting the pause and resume now, with no errors.

Also, this is the stderr.txt output from one of my Theory tasks from here (not dev), where I tried it too. You can see it is working as well.

05:49:29 AWST +08:00 2024-03-24: cranky-0.1.4: [INFO] Starting runc container.
05:49:29 AWST +08:00 2024-03-24: cranky-0.1.4: [INFO] To get some details on systemd level run
05:49:29 AWST +08:00 2024-03-24: cranky-0.1.4: [INFO] systemctl status Theory_2743-2722097-13_1.scope
05:49:29 AWST +08:00 2024-03-24: cranky-0.1.4: [INFO] mcplots runspec: boinc pp jets 13000 160 - pythia8 8.308 tune-A2 100000 13
05:49:29 AWST +08:00 2024-03-24: cranky-0.1.4: [INFO] ----,^^^^,<<<~_____---,^^^,<<~____--,^^,<~__;_
07:40:32 AWST +08:00 2024-03-24: cranky-0.1.4: [INFO] Pausing systemd unit Theory_2743-2722097-13_1.scope
07:43:01 AWST +08:00 2024-03-24: cranky-0.1.4: [INFO] Resuming systemd unit Theory_2743-2722097-13_1.scope

Regards
ID: 49822 · Report as offensive     Reply Quote


©2024 CERN