Made a small script to keep an eye on Theory jobs with correct % done
Author | Message |
---|---|
Joined: 29 Nov 18 Posts: 41 Credit: 2,644,024 RAC: 25 |
I'm using this to see how far Theory jobs have come in their calculations. Maybe someone else will find it useful.

    #!/bin/sh
    BASEDIR=/var/lib/boinc-client
    STDERR=stderr.txt # not used yet, this file is in $BASEDIR/slots/$SLOT/stderr.txt for each job
    RUNRIVET=cernvm/shared/runRivet.log
    INPUT=input
    JOBNAME=""
    TOTALEVENT=0
    PROCESSEDEVENT=0
    EVENTTOGO=0
    PERCENT=0

    # Find all slots
    SLOTLIST=$(ls $BASEDIR/slots)

    echo " "
    echo "---LHC Theory ------------------------------------------------------------------------------"
    printf "%*s%*s%*s%*s%*s\n" 25 "Job id" 14 "Total events" 18 "Processed events" 18 "Remaining events" 13 "Completed %"
    echo "--------------------------------------------------------------------------------------------"

    for SLOT in $SLOTLIST
    do
      # Check if the slot is a Theory job - must check other LHC projects so cernvm is not used in
      # more projects - Not used in CMS, have to check ATLAS and sixtrack as soon as I get work
      if [ -d $BASEDIR/slots/$SLOT/cernvm ]; then
        # Get job name (note: this is one line if you copy/paste)
        JOBNAME="Theory_"$(grep -a "revision" $BASEDIR/slots/$SLOT/$INPUT | tr --delete '"' | awk -F'=' '{print $2}')"-"$(grep -a "runid" $BASEDIR/slots/$SLOT/$INPUT | tr --delete '"' | awk -F'=' '{print $2}')"-"$(grep -a "seed" $BASEDIR/slots/$SLOT/$INPUT | tr --delete '"' | awk -F'=' '{print $2}')
        # Must be possible to do this line in a smarter way :(
        TOTALEVENT=$(grep "\[runRivet\]" $BASEDIR/slots/$SLOT/$RUNRIVET | awk '{print$18}')
        # need a real error handler - if the script reads the file at the same time as it's being updated, TOTALEVENT gets screwed up
        # (the braces make the negation apply to the whole "non-empty and numeric" test)
        if ! { [ -n "$TOTALEVENT" ] && [ "$TOTALEVENT" -eq "$TOTALEVENT" ] 2>/dev/null; }; then
          sleep 1 # and try one more time
          TOTALEVENT=$(grep "\[runRivet\]" $BASEDIR/slots/$SLOT/$RUNRIVET | awk '{print$18}')
        fi
        PROCESSEDEVENT=$(tac $BASEDIR/slots/$SLOT/$RUNRIVET | awk '/events processed/{print $1;exit}')
        if [ -z "${PROCESSEDEVENT}" ]; then
          PROCESSEDEVENT=0 # if it gets here there is no job progress
        fi
        EVENTTOGO=$(expr $TOTALEVENT - $PROCESSEDEVENT)
        PERCENT=$(echo "scale = 2; $PROCESSEDEVENT/$TOTALEVENT *100" | bc)
        printf "%*s%*s%*s%*s%*s\n" 25 "$JOBNAME" 14 "$TOTALEVENT" 18 "$PROCESSEDEVENT" 18 "$EVENTTOGO" 13 "$PERCENT"
      fi
    done

The output looks like this (it displays correctly in a terminal but comes out a little wrong here):

    ---LHC Theory ------------------------------------------------------------------------------
                       Job id  Total events  Processed events  Remaining events  Completed %
    --------------------------------------------------------------------------------------------
        Theory_2843-4105746-2        100000             10900             89100        10.00
      Theory_2843-4105745-260         14000               700             13300         5.00
      Theory_2843-4105744-242         31000             25300              5700        81.00
      Theory_2843-4105740-250         47000             45900              1100        97.00
      Theory_2843-4105750-290         45000             20900             24100        46.00
      Theory_2843-4105752-290         22000             20400              1600        92.00
      Theory_2843-4105744-250         22000             21800               200        99.00
|
Joined: 29 Nov 18 Posts: 41 Credit: 2,644,024 RAC: 25 |
Replace:

    PERCENT=$(echo "scale = 2; $PROCESSEDEVENT/$TOTALEVENT *100" | bc)

With:

    PERCENT=$(awk -v a="$PROCESSEDEVENT" -v b="$TOTALEVENT" 'BEGIN {printf("%.1f\n",100*a/b)}')

This gives a more accurate percentage:

    ---LHC Theory ------------------------------------------------------------------------------
                       Job id  Total events  Processed events  Remaining events  Completed %
    --------------------------------------------------------------------------------------------
        Theory_2843-4105746-2        100000             12000             88000         12.0
      Theory_2843-4105745-260         14000              2200             11800         15.7
      Theory_2843-4105742-304         29000              1300             27700          4.5
      Theory_2843-4105743-304         19000              1000             18000          5.3
      Theory_2843-4105749-304         33000              3600             29400         10.9
      Theory_2843-4105740-304         45000              2200             42800          4.9
      Theory_2843-4105748-304         36000              3600             32400         10.0
      Theory_2843-4105750-304         39000              3600             35400          9.2
      Theory_2843-4105747-304         41000              3700             37300          9.0
      Theory_2843-4105744-242         31000             26700              4300         86.1
      Theory_2843-4105751-304         42000              3500             38500          8.3
      Theory_2843-4105750-290         45000             25900             19100         57.6
       Theory_2814-4013432-44        100000             22600             77400         22.6
       Theory_2814-3953152-43        100000             61700             38300         61.7
       Theory_2814-3941809-44        100000             43500             56500         43.5
|
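The bc version loses precision because `scale = 2` truncates the division to two decimals before the multiplication by 100, so 10900/100000 becomes .10 and then 10.00. A minimal sketch of the awk variant wrapped in a function follows; the name `percent_done` and the guard against a zero or empty total are assumptions added for illustration and are not part of the posted script.

```shell
#!/bin/sh
# percent_done PROCESSED TOTAL -> prints the percentage with one decimal.
# Hypothetical helper; the zero/empty-total guard is an addition.
percent_done() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        if (b + 0 <= 0) { print "0.0"; exit }   # avoid division by zero
        printf("%.1f\n", 100 * a / b)
    }'
}

percent_done 10900 100000   # first row of the first table
percent_done 2200 14000     # second row of the second table
percent_done 0 0            # no total yet
```

Multiplying before dividing keeps the full precision until the final rounding, which is why the awk output matches the event counts.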
Joined: 15 Jun 08 Posts: 2683 Credit: 286,886,316 RAC: 55,049 |
@seanr22a As for your stderr.txt logfiles, this one looks good: https://lhcathome.cern.ch/lhcathome/result.php?resultid=419992235

Some hints regarding your script (you don't need to follow them):

Avoid variable names like STDERR. It is too close to stderr, which has a special meaning.

Consider grepping init_data.xml for result_name and refining the output like this:

    JOBNAME="$(grep -Pom1 '<result_name>Theory_\K[^<]+' init_data.xml)"
    [[ $JOBNAME != "" ]] && JOBNAME="Theory_$JOBNAME"

Avoid 'expr' as in:

    EVENTTOGO=$(expr $TOTALEVENT - $PROCESSEDEVENT)

Instead (in bash) use '(( ))' for integer(!) calculations:

    EVENTTOGO=$(( $TOTALEVENT - $PROCESSEDEVENT ))

Consider using 'watch' to run your script in a separate console every n seconds:

    watch -n 60 my_script
|
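The job-name and arithmetic hints can be tried together on a throwaway file. This is only a sketch: it swaps `grep -P` for a plain `sed` expression (an alternative for systems whose grep lacks PCRE support, not what the post suggests), the helper name `job_name` is hypothetical, and the sample XML is invented for illustration; real init_data.xml files contain many more fields.

```shell
#!/bin/sh
# job_name FILE -> prints the first Theory result name found in FILE, or nothing.
# Hypothetical helper using sed instead of grep -P.
job_name() {
    sed -n 's/.*<result_name>\(Theory_[^<]*\)<.*/\1/p' "$1" | head -n 1
}

# Illustrative input only.
sample=$(mktemp)
printf '<app_init_data>\n<result_name>Theory_2843-4105746-2_0</result_name>\n</app_init_data>\n' > "$sample"

JOBNAME=$(job_name "$sample")
TOTALEVENT=100000
PROCESSEDEVENT=10900
EVENTTOGO=$(( TOTALEVENT - PROCESSEDEVENT ))   # shell arithmetic instead of expr
echo "$JOBNAME $EVENTTOGO"
rm -f "$sample"
```

Because the match includes the `Theory_` prefix, an empty result stays empty and no separate check is needed before prepending the prefix.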
Joined: 29 Nov 18 Posts: 41 Credit: 2,644,024 RAC: 25 |
@computezrmle I followed your suggestions (almost) ;), made some small improvements and changed from sh to bash. The end result is the same but the script is cleaner. Thanks!

I have been using

    while :; do clear; theory.sh; sleep 60; done

but watch does the same thing and is shorter to write :)

The updated version:

    #!/bin/bash
    DATE=$(date +"%Y-%m-%d %H:%m:%S")
    BASEDIR=/var/lib/boinc-client
    ERRLOG=stderr.txt # not used yet in this script, this file is in $BASEDIR/slots/$SLOT/stderr.txt for each job - check if you have problems
    RUNRIVET=cernvm/shared/runRivet.log
    JOBINPUT=init_data.xml
    JOBNAME=""
    TOTALEVENT=0
    PROCESSEDEVENT=0
    EVENTTOGO=0
    PERCENT=0
    CERNVMCOUNTER=0

    # Find all slots
    SLOTLIST=$(ls $BASEDIR/slots)

    echo " "
    echo "--- LHC Theory ----- $DATE ---------------------------------------------------"
    printf "%*s%*s%*s%*s%*s\n" 27 "Job id" 14 "Total events" 18 "Processed events" 18 "Remaining events" 13 "Completed %"
    echo "--------------------------------------------------------------------------------------------"

    for SLOT in $SLOTLIST
    do
      # Check if the slot is a Theory job - must check other LHC projects so cernvm is not used in
      # more projects - Not used in CMS, have to check ATLAS and sixtrack as soon as I get work
      if [ -d $BASEDIR/slots/$SLOT/cernvm ]; then
        ((CERNVMCOUNTER++))
        JOBNAME="Theory_"$(grep -Pom1 '<result_name>Theory_\K[^<]+' $BASEDIR/slots/$SLOT/$JOBINPUT)
        TOTALEVENT=$(grep "\[runRivet\]" $BASEDIR/slots/$SLOT/$RUNRIVET | awk '{print$18}')
        # if the script reads the file at the same time as it's being updated, TOTALEVENT gets screwed up, so make a simple error handler
        # (the braces make the negation apply to the whole "non-empty and numeric" test)
        if ! { [ -n "$TOTALEVENT" ] && [ "$TOTALEVENT" -eq "$TOTALEVENT" ] 2>/dev/null; }; then
          sleep 1 # and try one more time
          TOTALEVENT=$(grep "\[runRivet\]" $BASEDIR/slots/$SLOT/$RUNRIVET | awk '{print$18}')
        fi
        PROCESSEDEVENT=$(tac $BASEDIR/slots/$SLOT/$RUNRIVET | awk '/events processed/{print $1;exit}')
        if [ -z "${PROCESSEDEVENT}" ]; then
          PROCESSEDEVENT=0
        fi
        EVENTTOGO=$(( $TOTALEVENT - $PROCESSEDEVENT ))
        PERCENT=$(awk -v a="$PROCESSEDEVENT" -v b="$TOTALEVENT" 'BEGIN {printf("%.1f\n",100*a/b)}')
        printf "%*s%*s%*s%*s%*s\n" 27 "$JOBNAME" 14 "$TOTALEVENT" 18 "$PROCESSEDEVENT" 18 "$EVENTTOGO" 13 "$PERCENT"
      fi
    done

    if (( $CERNVMCOUNTER == 0 )); then
      echo "No Theory job running"
    fi
    echo "--------------------------------------------------------------------------------------------"

Looks like this: (screenshot not preserved) |
Joined: 4 Mar 20 Posts: 14 Credit: 6,507,038 RAC: 5,596 |
@seanr22a Thank you very much for the script. I just want to point out a small typo: the minutes part of the date format should be a capital M (minutes) instead of a lowercase m (month).

    DATE=$(date +"%Y-%m-%d %H:%M:%S")
|
Joined: 29 Nov 18 Posts: 41 Credit: 2,644,024 RAC: 25 |
@Anne Havinga I didn't see that one. Thanks! I can't go back and edit my previous post, so I hope those who are interested see your post; otherwise their minutes will only update once a month :lol: |
Joined: 29 Nov 18 Posts: 41 Credit: 2,644,024 RAC: 25 |
There is a small bug in the Theory logging. I don't expect anyone to fix it, but I can wish :) This is from runRivet.log for Theory_2843-4274082-511_0. When the Theory app logs anomalies, it doesn't end the log line with a newline (\n). When the app then logs how many events have been processed, the output gets mixed up and it's not possible to extract the correct number of events. An example from the log: "-15 [3163800 events processed" should have been "63800 events processed" on its own line. The logging works as intended when there are no anomalies. The 'events processed' text is mixed up with the other log info in many different ways. These are just a few lines out of thousands:

The decay tau- -> nu_tau e- nu_ebar 156.664 500 is too inefficient for the particle 30 tau- 63400 events processed The decay tau- -> nu_tau mu- nu_mub63500 events processed 63600 events processed The decay tau- -> nu_tau e- nu_ebar 156.664 500 is too inefficient for the particle 38 63700 events processed The decay tau+ -> nu_taubar pi+ pi0 544.147 500 is too inefficient for the particle 32 tau+ -15 [3163800 events processed The decay tau+ ->63900 events processed

So my wish is that a newline is added to the end of each log message in the app running Theory jobs. |
Joined: 15 Jun 08 Posts: 2683 Credit: 286,886,316 RAC: 55,049 |
"So my wish is that a new line is added ..."

These logs are not intended to be used by BOINC users. If you do use them, accept them 'as is'. OTOH you could easily enable your script to skip weird lines and only filter for well-formatted ones. Hint: use regex patterns. |
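One possible reading of the regex hint is to accept only lines that consist of nothing but a number followed by "events processed", and to take the last such line. The sketch below assumes that well-formed lines have exactly that shape with no trailing whitespace; the helper name `last_clean_count` is hypothetical, and the sample lines are adapted from the log excerpt quoted in this thread with line breaks chosen for illustration.

```shell
#!/bin/sh
# last_clean_count FILE -> prints the count from the last well-formed
# "<number> events processed" line; lines with other text mixed in are skipped.
last_clean_count() {
    awk '/^[0-9]+ events processed$/ { n = $1 } END { if (n != "") print n }' "$1"
}

log=$(mktemp)
cat > "$log" <<'EOF'
63600 events processed
The decay tau- -> nu_tau mu- nu_mub63500 events processed
63700 events processed
-15 [3163800 events processed
The decay tau+ ->63900 events processed
EOF

last_clean_count "$log"
rm -f "$log"
```

The trade-off is that the reported count can lag slightly behind the real one when the most recent lines are garbled, but it never reports a corrupted number.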
Joined: 29 Nov 18 Posts: 41 Credit: 2,644,024 RAC: 25 |
A lot of new Theory jobs have been released that behave a little differently from the ones I wrote the script for, so I made some updates:

1. Added a Slot column to make it easier to find the stderr.txt file for the job.
2. Added an Elapsed time column. This does not show the CPU time; it shows the total time from when BOINC downloaded the job and created the slot until now.
3. Added an Err column. This column gets a marker if the script can't extract the processed-event info from runRivet.log. It does NOT mean that there is something wrong with the job.
4. Improved error handling.

-------------

    #!/bin/bash
    DATE=$(date +"%Y-%m-%d %H:%M:%S")
    HOST=$(hostname)
    BASEDIR=/var/lib/boinc-client
    ERRLOG=stderr.txt # not used yet in this script, this file is in $BASEDIR/slots/$SLOT/stderr.txt for each job - check if you have problems
    RUNRIVET=cernvm/shared/runRivet.log
    TMPRUNRIVET=/tmp/runRivet.log
    JOBINPUT=init_data.xml
    JOBNAME=""
    JOBTIME=""
    JOBSTART=""
    JOBCURRENT=""
    TOTALEVENT=0
    PROCESSEDEVENT=0
    EVENTTOGO=0
    PERCENT=0
    CERNVMCOUNTER=0
    ERR=""

    # Find all Boinc slots
    SLOTLIST=$(ls $BASEDIR/slots)

    echo -e "\n"
    echo "--- LHC Theory - $HOST ---- $DATE -------------------------------------------------------------------------"
    printf "%*s%*s%*s%*s%*s%*s%*s%*s\n" 6 "Slot" 27 "Job id" 14 "Total events" 18 "Processed events" 18 "Remaining events" 18 "Elapsed time" 13 "Completed %" 5 "Err"
    echo "-------------------------------------------------------------------------------------------------------------------------"

    for SLOT in $SLOTLIST
    do
      # Check if the slot is a Theory job. Only Theory jobs have the cernvm folder; also check for the runRivet.log file.
      if [ -d $BASEDIR/slots/"$SLOT"/cernvm ] && [ -f $BASEDIR/slots/"$SLOT"/$RUNRIVET ]; then
        # Work with a copy of the runRivet.log file to avoid errors if the file is modified during script exec.
        cp $BASEDIR/slots/"$SLOT"/$RUNRIVET $TMPRUNRIVET
        # Keep track of how many theory jobs
        ((CERNVMCOUNTER++))
        # Calculate runtime. This is not cpu time, it's the total time from when the slot was created in Boinc until now.
        JOBSTART=$(stat --format %w $BASEDIR/slots/"$SLOT"/boinc_lockfile | awk -F'.' '{print $1}')
        JOBCURRENT=$(date +"%Y-%m-%d %H:%m:%S")
        diff=$(($(date -d "$JOBCURRENT" +'%s') - $(date -d "$JOBSTART" +'%s')))
        days=$(($(date -d @$diff +'%-j')-1))
        JOBTIME=$(date -d @"$diff" +"$days"' day(s) %H:%M')
        JOBNAME="Theory_"$(grep -Pom1 '<result_name>Theory_\K[^<]+' $BASEDIR/slots/"$SLOT"/$JOBINPUT)
        TOTALEVENT=$(grep "\[runRivet\]" $TMPRUNRIVET | awk '{print$18}')
        if [ -z "${TOTALEVENT}" ]; then
          TOTALEVENT=0
        fi
        PROCESSEDEVENT=$(grep "events processed" $TMPRUNRIVET | tac $TMPRUNRIVET | awk '/events processed/{print $1;exit}')
        # Check that PROCESSEDEVENT is a number. There is a logging bug in the theory app: if job anomalies are logged we can't extract how many events were processed
        if [ -n "$PROCESSEDEVENT" ] && [ "$PROCESSEDEVENT" -eq "$PROCESSEDEVENT" ] 2>/dev/null; then
          if [ "$TOTALEVENT" -ge "$PROCESSEDEVENT" ]; then
            EVENTTOGO=$(( TOTALEVENT - PROCESSEDEVENT ))
            PERCENT=$(awk -v a="$PROCESSEDEVENT" -v b="$TOTALEVENT" 'BEGIN {printf("%.1f\n",100*a/b)}')
            ERR=""
          else
            EVENTTOGO=0
            PERCENT=0
            ERR="*"
          fi
        else
          PROCESSEDEVENT=0
          EVENTTOGO=0
          PERCENT=0
          ERR="*"
        fi
        printf "%*s%*s%*s%*s%*s%*s%*s%*s\n" 6 "$SLOT" 27 "$JOBNAME" 14 "$TOTALEVENT" 18 "$PROCESSEDEVENT" 18 "$EVENTTOGO" 18 "$JOBTIME" 13 "$PERCENT" 5 "$ERR"
        rm $TMPRUNRIVET
      fi
    done

    if (( CERNVMCOUNTER == 0 )); then
      echo "No Theory job running"
    fi
    echo -e "\n--- Number of Theory jobs for host $HOST: $CERNVMCOUNTER ----------------------------------------------------------------------------"

-------------

This is how it looks now: (screenshot not preserved) |
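The script copies runRivet.log to a fixed path before parsing it. A possible variation, sketched below, uses `mktemp` so that two runs at the same time (for example under `watch` plus a manual run) cannot overwrite each other's copy. The helper name `snapshot` and the sample file are assumptions for illustration, not part of the posted script.

```shell
#!/bin/sh
# snapshot FILE -> copies FILE to a unique temporary file and prints its path.
# Hypothetical helper; returns non-zero if the copy cannot be made.
snapshot() {
    tmp=$(mktemp) || return 1
    cp "$1" "$tmp" || { rm -f "$tmp"; return 1; }
    printf '%s\n' "$tmp"
}

# Stand-in for a runRivet.log file.
src=$(mktemp)
echo "100 events processed" > "$src"

copy=$(snapshot "$src")
trap 'rm -f "$src" "$copy"' EXIT   # clean up even if the script is interrupted
cat "$copy"
```

Parsing then works on `$copy` exactly as the script works on `$TMPRUNRIVET`.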
Joined: 4 Mar 20 Posts: 14 Credit: 6,507,038 RAC: 5,596 |
Thanks again for the update. |
Joined: 29 Nov 18 Posts: 41 Credit: 2,644,024 RAC: 25 |
I received some jobs that don't use the Event method to report how the job is proceeding. Instead they use 'Integrate', like this in the runRivet.log file:

    Integrate 318 of 760:

Replace this:

--------------

    TOTALEVENT=$(grep "\[runRivet\]" $TMPRUNRIVET | awk '{print$18}')
    if [ -z "${TOTALEVENT}" ]; then
      TOTALEVENT=0
    fi
    PROCESSEDEVENT=$(grep "events processed" $TMPRUNRIVET | tac $TMPRUNRIVET | awk '/events processed/{print $1;exit}')

--------------

With this:

--------------

    # Check if it is a job that doesn't use Event but Integrate
    INTEGRATE=$(awk -v p="Integrate" '$1 == p' $TMPRUNRIVET | awk '{last=$0} END{print last}' | sed 's/.$//')
    if [ ${INTEGRATE:+1} ]; then
      # Uses Integrate
      TOTALEVENT=$(echo $INTEGRATE | awk '{print $4}')
      PROCESSEDEVENT=$(echo $INTEGRATE | awk '{print $2}')
    else
      # Uses Event
      TOTALEVENT=$(grep "\[runRivet\]" $TMPRUNRIVET | awk '{print$18}')
      if [ -z "${TOTALEVENT}" ]; then
        TOTALEVENT=0
      fi
      PROCESSEDEVENT=$(grep "events processed" $TMPRUNRIVET | tac $TMPRUNRIVET | awk '/events processed/{print $1;exit}')
    fi

--------------

It just checks for lines in runRivet.log whose first word is exactly "Integrate". This can cause trouble if any app logs a line starting with the word 'Integrate' for some other reason, but so far it's working well. This check can always be improved :)

This is one of those jobs: (screenshot not preserved) |
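The check could be tightened by matching the whole shape of the line, "Integrate N of M:", rather than only its first word. The following is a sketch under the assumption that progress lines always look like the example in the post; the helper name `last_integrate` is hypothetical, and the middle sample line is invented to show a line that would fool a first-word-only match.

```shell
#!/bin/sh
# last_integrate FILE -> prints "CURRENT TOTAL" from the last line shaped like
# "Integrate 318 of 760:"; prints nothing if no such line exists.
last_integrate() {
    awk '$1 == "Integrate" && $2 ~ /^[0-9]+$/ && $3 == "of" && $4 ~ /^[0-9]+:?$/ {
             cur = $2
             tot = $4
             sub(/:$/, "", tot)      # drop the trailing colon
         }
         END { if (cur != "") print cur, tot }' "$1"
}

log=$(mktemp)
printf 'Integrate 317 of 760:\nIntegrate the cross section\nIntegrate 318 of 760:\n' > "$log"

set -- $(last_integrate "$log")
echo "processed=$1 total=$2"
rm -f "$log"
```

Returning both numbers from one pass also removes the need for the separate `sed` and `awk` calls on the saved line.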
Joined: 14 Jan 10 Posts: 1461 Credit: 9,859,193 RAC: 2,531 |
"I received some jobs that doesn't use the Event method to tell how the job is proceeding. Instead it uses 'Integrate' like this in the runRivet.log file: Integrate 318 of 760:"

This is one of the Theory tasks with the SHERPA event generator. These jobs have several (I think 4) steps before the real event generation starts. These steps consist of integration and initialisation. Most of the time, the last part (event generation) takes less time than the other steps together, so the total run time is difficult to predict. |
Joined: 29 Nov 18 Posts: 41 Credit: 2,644,024 RAC: 25 |
"These jobs have several (I think 4) steps before the real event generation starts. These steps consist of integration and initialisation."

Good to know. I will keep an eye on it and make the necessary script adjustments so it can handle the transition from the preparation phase to the event phase. Thanks! |
Joined: 29 Nov 18 Posts: 41 Credit: 2,644,024 RAC: 25 |
@crystal pellet The script handles the Integrate/prepare phase now. Jobs that have an Integrate/prepare period will show that progress, with "Pre" in the Err column to show that the job is preparing. As soon as a job has gone through its prepare sequence and started the actual work, the script shows the Event progress as usual. I have tested back and forth with a saved runRivet.log file and it seems to be working. I have to wait for more jobs of that type to be sure.

@Anne Havinga I don't know if I noticed it first this time :) I apparently have a hard time writing the M for minute. If you look at the Elapsed time calculation in the updated script in my previous post: I did it again :lol: but it's fixed now.

Another updated version. I don't know how many different types of jobs there are that will need updates. Time will tell.

    #!/bin/bash
    DATE=$(date +"%Y-%m-%d %H:%M:%S")
    HOST=$(hostname)
    BASEDIR=/var/lib/boinc-client
    ERRLOG=stderr.txt # not used yet in this script, this file is in $BASEDIR/slots/$SLOT/stderr.txt for each job - check if you have problems
    RUNRIVET=cernvm/shared/runRivet.log
    TMPRUNRIVET=/tmp/runRivet.log
    JOBINPUT=init_data.xml
    JOBNAME=""
    JOBTIME=""
    JOBSTART=""
    JOBCURRENT=""
    TOTALEVENT=0
    PROCESSEDEVENT=0
    EVENTTOGO=0
    PERCENT=0
    CERNVMCOUNTER=0
    INTEGRATE=""
    ERR=""

    # Find all Boinc slots
    SLOTLIST=$(ls $BASEDIR/slots)

    echo -e "\n"
    echo "--- LHC Theory - $HOST ---- $DATE -------------------------------------------------------------------------"
    printf "%*s%*s%*s%*s%*s%*s%*s%*s\n" 6 "Slot" 27 "Job id" 14 "Total events" 18 "Processed events" 18 "Remaining events" 18 "Elapsed time" 13 "Completed %" 5 "Err"
    echo "-------------------------------------------------------------------------------------------------------------------------"

    for SLOT in $SLOTLIST
    do
      # Check if the slot is a Theory job. Only Theory jobs have the cernvm folder; also check for the runRivet.log file.
      if [ -d $BASEDIR/slots/"$SLOT"/cernvm ] && [ -f $BASEDIR/slots/"$SLOT"/$RUNRIVET ]; then
        # Work with a copy of the runRivet.log file to avoid errors if the file is modified during script exec.
        cp $BASEDIR/slots/"$SLOT"/$RUNRIVET $TMPRUNRIVET
        ERR=""
        # Keep track of how many theory jobs
        ((CERNVMCOUNTER++))
        # Calculate runtime. This is not cpu time, it's the total time from when the slot was created in Boinc until now.
        JOBSTART=$(stat --format %w $BASEDIR/slots/"$SLOT"/boinc_lockfile | awk -F'.' '{print $1}')
        JOBCURRENT=$(date +"%Y-%m-%d %H:%M:%S")
        diff=$(($(date -d "$JOBCURRENT" +'%s') - $(date -d "$JOBSTART" +'%s')))
        days=$(($(date -d @$diff +'%-j')-1))
        JOBTIME=$(date -d @"$diff" +"$days"' day(s) %H:%M')
        JOBNAME="Theory_"$(grep -Pom1 '<result_name>Theory_\K[^<]+' $BASEDIR/slots/"$SLOT"/$JOBINPUT)
        TOTALEVENT=$(grep "\[runRivet\]" $TMPRUNRIVET | awk '{print$18}')
        if [ -z "${TOTALEVENT}" ]; then
          TOTALEVENT=0
          ERR="*"
        fi
        # If PROCESSEDEVENT is a number, disable INTEGRATE. If it is empty, the job is still in INTEGRATE mode, so enable INTEGRATE
        PROCESSEDEVENT=$(grep "events processed" $TMPRUNRIVET | tail -1 | awk '/events processed/{print $1;exit}')
        if [ -n "$PROCESSEDEVENT" ] && [ "$PROCESSEDEVENT" -eq "$PROCESSEDEVENT" ] 2>/dev/null; then
          NOINTEGRATE=1
        else
          NOINTEGRATE=0
        fi
        if [ -z "${PROCESSEDEVENT}" ]; then
          PROCESSEDEVENT=0
          ERR="*"
        fi
        # Override with Integrate to handle the job transition from Integrate (integration and initialisation phase) to Event phase. Info from @Crystal Pellet at LHC.
        # Check if job uses integration/initialisation phase
        if [ "$NOINTEGRATE" -eq "0" ]; then
          INTEGRATE=$(awk -v p="Integrate" '$1 == p' $TMPRUNRIVET | awk '{last=$0} END{print last}' | sed 's/.$//')
          if [ ${INTEGRATE:+1} ]; then
            # Uses Integrate
            TOTALEVENT=$(echo "$INTEGRATE" | awk '{print $4}')
            PROCESSEDEVENT=$(echo "$INTEGRATE" | awk '{print $2}')
            ERR="Pre"
          fi
        fi
        # Check that PROCESSEDEVENT is a number
        if [ -n "$PROCESSEDEVENT" ] && [ "$PROCESSEDEVENT" -eq "$PROCESSEDEVENT" ] 2>/dev/null; then
          if [ "$TOTALEVENT" -ge "$PROCESSEDEVENT" ]; then
            EVENTTOGO=$(( TOTALEVENT - PROCESSEDEVENT ))
            PERCENT=$(awk -v a="$PROCESSEDEVENT" -v b="$TOTALEVENT" 'BEGIN {printf("%.1f\n",100*a/b)}')
          else
            EVENTTOGO=0
            PERCENT=0
            ERR="*"
          fi
        else
          PROCESSEDEVENT=0
          EVENTTOGO=0
          PERCENT=0
          ERR="*"
        fi
        printf "%*s%*s%*s%*s%*s%*s%*s%*s\n" 6 "$SLOT" 27 "$JOBNAME" 14 "$TOTALEVENT" 18 "$PROCESSEDEVENT" 18 "$EVENTTOGO" 18 "$JOBTIME" 13 "$PERCENT" 5 "$ERR"
        rm $TMPRUNRIVET
      fi
    done

    if (( CERNVMCOUNTER == 0 )); then
      echo "No Theory job running"
    fi
    echo -e "\n--- Number of Theory jobs for host $HOST: $CERNVMCOUNTER ----------------------------------------------------------------------------"
|
Joined: 4 Mar 20 Posts: 14 Credit: 6,507,038 RAC: 5,596 |
@seanr22a Yes, I did notice it and corrected it in my own copy :-). Thanks again for this update; however, I found a minor fault. The calculated elapsed time is one hour too much: when a job has just started, the elapsed time already shows 0 days and 1:00 hours. As I am not a scripting and/or regex expert, it's hard for me to correct. It would be nice to have it fixed if possible. Thanks in advance. |
Joined: 21 Feb 11 Posts: 86 Credit: 578,973 RAC: 0 |
There is another problem: it assumes the BOINC data is located in /var/lib/boinc-client, while the latest versions from https://boinc.berkeley.edu/linux_install.php install it in /var/lib/boinc |
Joined: 15 Jun 08 Posts: 2683 Credit: 286,886,316 RAC: 55,049 |
@ seanr22a I suggest not to use this forum to publish your script. The much better place would be https://github.com/. Looks like you are already registered there. Just create your own repository and make it public, e.g. https://github.com/seanr22a/lhcathome___or_any_name_you_choose From here set a link to that repository. |
Joined: 29 Nov 18 Posts: 41 Credit: 2,644,024 RAC: 25 |
"There is another problem"

That is why you have the variable BASEDIR=/var/lib/boinc-client, which you are supposed to change to match your install :) |
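Editing BASEDIR by hand works; as an alternative, the script could probe the two locations mentioned in this thread and use the first one that has a slots directory. This is a sketch only: the helper name `find_basedir` is hypothetical, and other installs may use paths that are not listed here.

```shell
#!/bin/sh
# find_basedir DIR... -> prints the first DIR that contains a "slots" directory.
# Hypothetical helper; returns non-zero if none of the candidates match.
find_basedir() {
    for dir in "$@"; do
        if [ -d "$dir/slots" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
}

BASEDIR=$(find_basedir /var/lib/boinc-client /var/lib/boinc) || {
    echo "No BOINC data directory found; set BASEDIR by hand" >&2
    BASEDIR=""
}
echo "BASEDIR=${BASEDIR}"
```

Keeping the candidates as arguments makes it easy to add another path without touching the rest of the script.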
Joined: 29 Nov 18 Posts: 41 Credit: 2,644,024 RAC: 25 |
@ seanr22a

I will take a look at it. I'm waiting for another batch of Theory jobs so I have something to work with. I fixed some other small errors and added support for one more type of Theory job. I'm testing that now but need more Theory jobs. I will post the script here one more time when I've done some testing. After that I will take a look at GitHub, as admin @computezrmle wanted me to do.

[EDIT] I did a quick test script with exactly the same time calculation as in the theory script. What I found was quite funny :) This is the time info as the theory script computes it right now:

    Jobstart = 2025-03-19 21:43:37
    Jobcurrent = 2025-03-19 21:43:57
    diff = 20
    days = 0
    Jobtime = 0 day(s) 07:00

So the time difference should have been 00:00, but at my location the date command adds 7 hours from the timezone. I'm in Asia/Bangkok, which explains the issue. At your location you have a timezone that is +1 hour. To fix this, replace:

    JOBTIME=$(date -d @"$diff" +"$days"' day(s) %H:%M')

With:

    JOBTIME=$(TZ=GMT date -d @"$diff" +"$days"' day(s) %H:%M')

This sets the timezone to GMT (offset 0) for that one date command only, without changing it for the rest of the script. I hope daylight saving time doesn't mess with this as well :) |
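Since `diff` is already a number of seconds, the elapsed time can also be formatted with plain integer arithmetic, which avoids `date`, and with it time zones and daylight saving time, altogether. A minimal sketch follows; the function name `format_elapsed` is hypothetical and not part of the posted script.

```shell
#!/bin/sh
# format_elapsed SECONDS -> prints "D day(s) HH:MM" using integer arithmetic only.
format_elapsed() {
    secs=$1
    days=$(( secs / 86400 ))
    hours=$(( (secs % 86400) / 3600 ))
    mins=$(( (secs % 3600) / 60 ))
    printf '%d day(s) %02d:%02d\n' "$days" "$hours" "$mins"
}

format_elapsed 20        # the 20-second difference from the test above
format_elapsed 93784     # 1 day, 2 hours, 3 minutes and 4 seconds
```

With the values above this prints `0 day(s) 00:00` and `1 day(s) 02:03`. The day count is also no longer derived from a day-of-year field, so it keeps counting for jobs that run longer than a year.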
Joined: 15 Jun 08 Posts: 2683 Credit: 286,886,316 RAC: 55,049 |
"...as admin @computezrmle wanted me to do."

I'm neither an admin nor am I forcing you to do so. It's just that this forum is not made to handle things like code management. GitHub is made exactly for this. |
©2025 CERN