1) Message boards : CMS Application : CMS Tasks Failing (Message 42427)
Posted 12 May 2020 by Guy PF Masevaux
Post:
All of my computers were producing errors. Excuse me for saying so, but I think it is a programming problem, or a mistake in the definition of a mathematical domain in the program.
I think error handling inside the program could prevent such mistakes.
2) Message boards : CMS Application : CMS Tasks Failing (Message 42426)
Posted 12 May 2020 by Guy PF Masevaux
Post:
New errors have occurred, this time not after 18 minutes but after 22 minutes of running.
The summary of the failed task is different.
It is killing a lot of jobs.
I am checking whether the errors come from one computer or from several.
3) Message boards : CMS Application : CMS Tasks Failing (Message 42425)
Posted 12 May 2020 by Guy PF Masevaux
Post:
Now everything is running without problems.
Congratulations to the programmer who fixed the bug.
4) Message boards : CMS Application : CMS Tasks Failing (Message 42403)
Posted 10 May 2020 by Guy PF Masevaux
Post:
Here is the output the programmers should look at to understand what is failing:



<core_client_version>7.16.5</core_client_version>
<![CDATA[
<message>
La pile de l - exit code 207 (0xcf)</message>
<stderr_txt>
2020-05-10 23:53:19 (157820): Detected: vboxwrapper 26197
2020-05-10 23:53:19 (157820): Detected: BOINC client v7.7
2020-05-10 23:53:20 (157820): Detected: VirtualBox VboxManage Interface (Version: 6.0.14)
2020-05-10 23:53:20 (157820): Detected: Heartbeat check (file: 'heartbeat' every 1200.000000 seconds)
2020-05-10 23:53:20 (157820): Successfully copied 'init_data.xml' to the shared directory.
2020-05-10 23:53:22 (157820): Create VM. (boinc_312d6963e4768bbf, slot#6)
2020-05-10 23:53:22 (157820): Setting Memory Size for VM. (2048MB)
2020-05-10 23:53:22 (157820): Setting CPU Count for VM. (1)
2020-05-10 23:53:23 (157820): Setting Chipset Options for VM.
2020-05-10 23:53:23 (157820): Setting Boot Options for VM.
2020-05-10 23:53:23 (157820): Setting Network Configuration for NAT.
2020-05-10 23:53:23 (157820): Enabling VM Network Access.
2020-05-10 23:53:24 (157820): Disabling USB Support for VM.
2020-05-10 23:53:25 (157820): Disabling COM Port Support for VM.
2020-05-10 23:53:25 (157820): Disabling LPT Port Support for VM.
2020-05-10 23:53:25 (157820): Disabling Audio Support for VM.
2020-05-10 23:53:25 (157820): Disabling Clipboard Support for VM.
2020-05-10 23:53:26 (157820): Disabling Drag and Drop Support for VM.
2020-05-10 23:53:26 (157820): Adding storage controller(s) to VM.
2020-05-10 23:53:26 (157820): Adding virtual disk drive to VM. (vm_image.vdi)
2020-05-10 23:53:27 (157820): Adding VirtualBox Guest Additions to VM.
2020-05-10 23:53:27 (157820): Adding network bandwidth throttle group to VM. (Defaulting to 1024GB)
2020-05-10 23:53:27 (157820): forwarding host port 65246 to guest port 80
2020-05-10 23:53:27 (157820): Enabling remote desktop for VM.
2020-05-10 23:53:28 (157820): Enabling shared directory for VM.
2020-05-10 23:53:28 (157820): Starting VM using VBoxManage interface. (boinc_312d6963e4768bbf, slot#6)
2020-05-10 23:53:33 (157820): Successfully started VM. (PID = '142712')
2020-05-10 23:53:33 (157820): Reporting VM Process ID to BOINC.
2020-05-10 23:53:33 (157820): Guest Log: BIOS: VirtualBox 6.0.14

2020-05-10 23:53:33 (157820): Guest Log: CPUID EDX: 0x178bfbff

2020-05-10 23:53:33 (157820): Guest Log: BIOS: ata0-0: PCHS=16383/16/63 LCHS=1024/255/63

2020-05-10 23:53:33 (157820): VM state change detected. (old = 'PoweredOff', new = 'Running')
2020-05-10 23:53:33 (157820): Detected: Web Application Enabled (http://localhost:65246)
2020-05-10 23:53:33 (157820): Detected: Remote Desktop Enabled (localhost:65247)
2020-05-10 23:53:33 (157820): Preference change detected
2020-05-10 23:53:33 (157820): Setting CPU throttle for VM. (80%)
2020-05-10 23:53:34 (157820): Setting checkpoint interval to 600 seconds. (Higher value of (Preference: 60 seconds) or (Vbox_job.xml: 600 seconds))
2020-05-10 23:53:35 (157820): Guest Log: BIOS: Boot : bseqnr=1, bootseq=0032

2020-05-10 23:53:35 (157820): Guest Log: BIOS: Booting from Hard Disk...

2020-05-10 23:53:37 (157820): Guest Log: BIOS: KBD: unsupported int 16h function 03

2020-05-10 23:53:37 (157820): Guest Log: BIOS: AX=0305 BX=0000 CX=0000 DX=0000

2020-05-10 23:53:52 (157820): Guest Log: vgdrvHeartbeatInit: Setting up heartbeat to trigger every 2000 milliseconds

2020-05-10 23:53:52 (157820): Guest Log: vboxguest: misc device minor 56, IRQ 20, I/O port d020, MMIO at 00000000f0400000 (size 0x400000)

2020-05-10 23:54:14 (157820): Guest Log: VBoxService 5.2.6 r120293 (verbosity: 0) linux.amd64 (Jan 15 2018 14:51:00) release log

2020-05-10 23:54:14 (157820): Guest Log: 00:00:00.000125 main Log opened 2020-05-10T21:54:14.490535000Z

2020-05-10 23:54:14 (157820): Guest Log: 00:00:00.000275 main OS Product: Linux

2020-05-10 23:54:14 (157820): Guest Log: 00:00:00.000326 main OS Release: 4.14.157-17.cernvm.x86_64

2020-05-10 23:54:14 (157820): Guest Log: 00:00:00.000348 main OS Version: #1 SMP Wed Dec 4 17:26:45 CET 2019

2020-05-10 23:54:14 (157820): Guest Log: 00:00:00.000367 main Executable: /usr/share/vboxguest52/usr/sbin/VBoxService

2020-05-10 23:54:14 (157820): Guest Log: 00:00:00.000368 main Process ID: 2948

2020-05-10 23:54:14 (157820): Guest Log: 00:00:00.000368 main Package type: LINUX_64BITS_GENERIC

2020-05-10 23:54:14 (157820): Guest Log: 00:00:00.001838 main 5.2.6 r120293 started. Verbose level = 0

2020-05-10 23:54:25 (157820): Guest Log: [INFO] Mounting the shared directory

2020-05-10 23:54:25 (157820): Guest Log: [INFO] Shared directory mounted, enabling vboxmonitor

2020-05-10 23:54:25 (157820): Guest Log: [DEBUG] Testing network connection to cern.ch on port 80

2020-05-10 23:54:25 (157820): Guest Log: [DEBUG] Connection to cern.ch 80 port [tcp/http] succeeded!

2020-05-10 23:54:25 (157820): Guest Log: [DEBUG] 0

2020-05-10 23:54:25 (157820): Guest Log: [DEBUG] Testing VCCS connection to vccs.cern.ch on port 443

2020-05-10 23:54:25 (157820): Guest Log: [DEBUG] Connection to vccs.cern.ch 443 port [tcp/https] succeeded!

2020-05-10 23:54:25 (157820): Guest Log: [DEBUG] 0

2020-05-10 23:54:25 (157820): Guest Log: [DEBUG] Testing connection to Condor server on port 9618

2020-05-10 23:54:25 (157820): Guest Log: [DEBUG] Connection to vocms0840.cern.ch 9618 port [tcp/condor] succeeded!

2020-05-10 23:54:26 (157820): Guest Log: [DEBUG] 0

2020-05-10 23:55:28 (157820): Guest Log: [DEBUG] Probing CVMFS ...

2020-05-10 23:55:29 (157820): Guest Log: Probing /cvmfs/grid.cern.ch... OK

2020-05-10 23:55:29 (157820): Guest Log: VERSION PID UPTIME(M) MEM(K) REVISION EXPIRES(M) NOCATALOGS CACHEUSE(K) CACHEMAX(K) NOFDUSE NOFDMAX NOIOERR NOOPEN HITRATE(%) RX(K) SPEED(K/S) HOST PROXY ONLINE

2020-05-10 23:55:29 (157820): Guest Log: 2.4.4.0 3713 1 25848 12197 4 1 1242455 4096000 2 65024 0 3 100 0 0 http://s1cern-cvmfs.openhtc.io/cvmfs/grid.cern.ch DIRECT 1

2020-05-10 23:55:33 (157820): Guest Log: [INFO] Reading volunteer information

2020-05-10 23:55:33 (157820): Guest Log: [INFO] Volunteer: Guy PF Masevaux (589052)

2020-05-10 23:55:33 (157820): Guest Log: [INFO] VMID: 49b2fac1-df25-48d2-a4ee-4612ca6a31f8

2020-05-10 23:55:34 (157820): Guest Log: [INFO] Requesting an X509 credential from LHC@home

2020-05-10 23:55:34 (157820): Guest Log: [INFO] Running the fast benchmark.

2020-05-10 23:55:59 (157820): Guest Log: [INFO] Machine performance 20.11 HEPSPEC06

2020-05-10 23:55:59 (157820): Guest Log: [INFO] CMS application starting. Check log files.

2020-05-10 23:56:00 (157820): Guest Log: [DEBUG] HTCondor ping

2020-05-10 23:56:01 (157820): Guest Log: [DEBUG] 0

2020-05-11 00:06:26 (157820): Guest Log: Did the tarball get created?

2020-05-11 00:06:26 (157820): Guest Log: /tmp/CMS_175225_1589144489.909597_0.tgz

2020-05-11 00:06:26 (157820): Guest Log: Here is the upload output

2020-05-11 00:06:27 (157820): Guest Log: Here is the upload error

2020-05-11 00:06:27 (157820): Guest Log: Here is the condor directory

2020-05-11 00:06:27 (157820): Guest Log: MasterLog

2020-05-11 00:06:27 (157820): Guest Log: ProcLog

2020-05-11 00:06:27 (157820): Guest Log: StarterLog

2020-05-11 00:06:27 (157820): Guest Log: StartLog

2020-05-11 00:06:27 (157820): Guest Log: XferStatsLog

2020-05-11 00:06:27 (157820): Guest Log: Here is the MasterLog

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:01 ******************************************************

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:01 ** condor_master (CONDOR_MASTER) STARTING UP

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:01 ** /usr/sbin/condor_master

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:01 ** SubsystemInfo: name=MASTER type=MASTER(2) class=DAEMON(1)

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:01 ** Configuration: subsystem:MASTER local:<NONE> class:DAEMON

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:01 ** $CondorVersion: 8.6.10 Mar 12 2018 BuildID: 435200 $

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:01 ** $CondorPlatform: x86_64_RedHat6 $

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:01 ** PID = 4695

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:01 ** Log last touched time unavailable (No such file or directory)

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:01 ******************************************************

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:01 Using config source: /etc/condor/condor_config

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:01 Using local config sources:

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:01 /etc/condor/config.d/10_security.config

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:01 /etc/condor/config.d/14_network.config

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:01 /etc/condor/config.d/20_workernode.config

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:01 /etc/condor/config.d/30_lease.config

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:01 /etc/condor/config.d/35_cms.config

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:01 /etc/condor/config.d/40_ccb.config

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:01 /etc/condor/config.d/62-benchmark.conf

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:01 /etc/condor/condor_config.local

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:01 config Macros = 170, Sorted = 170, StringBytes = 6830, TablesBytes = 6224

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:01 CLASSAD_CACHING is OFF

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:01 Daemon Log is logging: D_ALWAYS D_ERROR

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:01 Daemoncore: Listening at <10.0.2.15:43927> on TCP (ReliSock).

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:01 DaemonCore: command socket at <10.0.2.15:43927?addrs=10.0.2.15-43927&noUDP>

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:01 DaemonCore: private command socket at <10.0.2.15:43927?addrs=10.0.2.15-43927>

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:16 CCBListener: registered with CCB server vocms0840.cern.ch as ccbid 137.138.156.85:9618?addrs=137.138.156.85-9618#2081158

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:16 Master restart (GRACEFUL) is watching /usr/sbin/condor_master (mtime:1520893905)

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:17 Started DaemonCore process "/usr/sbin/condor_startd", pid and pgroup = 10244

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:27 Setting ready state 'Ready' for STARTD

2020-05-11 00:06:27 (157820): Guest Log: 05/11/20 00:06:24 Got SIGTERM. Performing graceful shutdown.

2020-05-11 00:06:27 (157820): Guest Log: 05/11/20 00:06:24 Sent SIGTERM to STARTD (pid 10244)

2020-05-11 00:06:27 (157820): Guest Log: 05/11/20 00:06:24 AllReaper unexpectedly called on pid 10244, status 0.

2020-05-11 00:06:27 (157820): Guest Log: 05/11/20 00:06:24 The STARTD (pid 10244) exited with status 0

2020-05-11 00:06:27 (157820): Guest Log: 05/11/20 00:06:24 All daemons are gone. Exiting.

2020-05-11 00:06:27 (157820): Guest Log: 05/11/20 00:06:24 **** condor_master (condor_MASTER) pid 4695 EXITING WITH STATUS 0

2020-05-11 00:06:27 (157820): Guest Log: Here is the KernelTuning.log

2020-05-11 00:06:27 (157820): Guest Log: Here is the StartLog

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:17 ******************************************************

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:17 ** condor_startd (CONDOR_STARTD) STARTING UP

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:17 ** /usr/sbin/condor_startd

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:17 ** SubsystemInfo: name=STARTD type=STARTD(7) class=DAEMON(1)

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:17 ** Configuration: subsystem:STARTD local:<NONE> class:DAEMON

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:17 ** $CondorVersion: 8.6.10 Mar 12 2018 BuildID: 435200 $

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:17 ** $CondorPlatform: x86_64_RedHat6 $

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:17 ** PID = 10244

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:17 ** Log last touched time unavailable (No such file or directory)

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:17 ******************************************************

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:17 Using config source: /etc/condor/condor_config

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:17 Using local config sources:

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:17 /etc/condor/config.d/10_security.config

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:17 /etc/condor/config.d/14_network.config

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:17 /etc/condor/config.d/20_workernode.config

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:17 /etc/condor/config.d/30_lease.config

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:17 /etc/condor/config.d/35_cms.config

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:17 /etc/condor/config.d/40_ccb.config

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:17 /etc/condor/config.d/62-benchmark.conf

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:17 /etc/condor/condor_config.local

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:17 config Macros = 171, Sorted = 171, StringBytes = 6856, TablesBytes = 6260

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:17 CLASSAD_CACHING is ENABLED

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:17 Daemon Log is logging: D_ALWAYS D_ERROR

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:17 Daemoncore: Listening at <10.0.2.15:41863> on TCP (ReliSock).

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:17 DaemonCore: command socket at <10.0.2.15:41863?addrs=10.0.2.15-41863&noUDP>

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:17 DaemonCore: private command socket at <10.0.2.15:41863?addrs=10.0.2.15-41863>

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:22 CCBListener: registered with CCB server vocms0840.cern.ch as ccbid 137.138.156.85:9618?addrs=137.138.156.85-9618#2081160

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:24 VM-gahp server reported an internal error

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:24 VM universe will be tested to check if it is available

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:24 History file rotation is enabled.

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:24 Maximum history file size is: 20971520 bytes

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:24 Number of rotated history files is: 2

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:24 Allocating auto shares for slot type 0: Cpus: auto, Memory: auto, Swap: auto, Disk: auto

2020-05-11 00:06:27 (157820): Guest Log: slot type 0: Cpus: 1.000000, Memory: 3000, Swap: 100.00%, Disk: 100.00%

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:24 New machine resource allocated

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:24 Setting up slot pairings

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:24 CronJobList: Adding job 'multicore'

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:24 CronJob: Initializing job 'multicore' (/usr/local/bin/multicore-shutdown)

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:24 CronJobList: Adding job 'mips'

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:24 CronJobList: Adding job 'kflops'

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:24 CronJob: Initializing job 'mips' (/usr/libexec/condor/condor_mips)

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:24 CronJob: Initializing job 'kflops' (/usr/libexec/condor/condor_kflops)

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:24 State change: IS_OWNER is false

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:24 Changing state: Owner -> Unclaimed

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:24 State change: RunBenchmarks is TRUE

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:24 Changing activity: Idle -> Benchmarking

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:24 BenchMgr:StartBenchmarks()

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:27 Initial update sent to collector(s)

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:27 Sending DC_SET_READY message to master <10.0.2.15:43927?addrs=10.0.2.15-43927>

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:45 State change: benchmarks completed

2020-05-11 00:06:27 (157820): Guest Log: 05/10/20 23:56:45 Changing activity: Benchmarking -> Idle

2020-05-11 00:06:27 (157820): Guest Log: 05/11/20 00:06:24 No resources have been claimed for 600 seconds

2020-05-11 00:06:27 (157820): Guest Log: 05/11/20 00:06:24 Shutting down Condor on this machine.

2020-05-11 00:06:27 (157820): Guest Log: 05/11/20 00:06:24 Got SIGTERM. Performing graceful shutdown.

2020-05-11 00:06:27 (157820): Guest Log: 05/11/20 00:06:24 shutdown graceful

2020-05-11 00:06:27 (157820): Guest Log: 05/11/20 00:06:24 Cron: Killing all jobs

2020-05-11 00:06:27 (157820): Guest Log: 05/11/20 00:06:24 Killing job multicore

2020-05-11 00:06:27 (157820): Guest Log: 05/11/20 00:06:24 Cron: Killing all jobs

2020-05-11 00:06:27 (157820): Guest Log: 05/11/20 00:06:24 Killing job mips

2020-05-11 00:06:27 (157820): Guest Log: 05/11/20 00:06:24 Killing job kflops

2020-05-11 00:06:27 (157820): Guest Log: 05/11/20 00:06:24 Deleting cron job manager

2020-05-11 00:06:27 (157820): Guest Log: 05/11/20 00:06:24 Cron: Killing all jobs

2020-05-11 00:06:27 (157820): Guest Log: 05/11/20 00:06:24 Killing job multicore

2020-05-11 00:06:27 (157820): Guest Log: 05/11/20 00:06:24 CronJob: 'multicore': Trying to kill illegal PID 0

2020-05-11 00:06:27 (157820): Guest Log: 05/11/20 00:06:24 Cron: Killing all jobs

2020-05-11 00:06:27 (157820): Guest Log: 05/11/20 00:06:24 Killing job multicore

2020-05-11 00:06:27 (157820): Guest Log: 05/11/20 00:06:24 CronJob: 'multicore': Trying to kill illegal PID 0

2020-05-11 00:06:27 (157820): Guest Log: 05/11/20 00:06:24 CronJobList: Deleting all jobs

2020-05-11 00:06:27 (157820): Guest Log: 05/11/20 00:06:24 CronJobList: Deleting job 'multicore'

2020-05-11 00:06:27 (157820): Guest Log: 05/11/20 00:06:24 CronJob: Deleting job 'multicore' (/usr/local/bin/multicore-shutdown), timer 9

2020-05-11 00:06:27 (157820): Guest Log: 05/11/20 00:06:24 CronJob: 'multicore': Trying to kill illegal PID 0

2020-05-11 00:06:27 (157820): Guest Log: 05/11/20 00:06:24 Cron: Killing all jobs

2020-05-11 00:06:27 (157820): Guest Log: 05/11/20 00:06:24 CronJobList: Deleting all jobs

2020-05-11 00:06:27 (157820): Guest Log: 05/11/20 00:06:24 Deleting benchmark job mgr

2020-05-11 00:06:27 (157820): Guest Log: 05/11/20 00:06:24 Cron: Killing all jobs

2020-05-11 00:06:27 (157820): Guest Log: 05/11/20 00:06:24 Killing job mips

2020-05-11 00:06:27 (157820): Guest Log: 05/11/20 00:06:24 Killing job kflops

2020-05-11 00:06:27 (157820): Guest Log: 05/11/20 00:06:24 Cron: Killing all jobs

2020-05-11 00:06:27 (157820): Guest Log: 05/11/20 00:06:24 Killing job mips

2020-05-11 00:06:27 (157820): Guest Log: 05/11/20 00:06:24 Killing job kflops

2020-05-11 00:06:27 (157820): Guest Log: 05/11/20 00:06:24 CronJobList: Deleting all jobs

2020-05-11 00:06:27 (157820): Guest Log: 05/11/20 00:06:24 CronJobList: Deleting job 'mips'

2020-05-11 00:06:27 (157820): Guest Log: 05/11/20 00:06:24 CronJob: Deleting job 'mips' (/usr/libexec/condor/condor_mips), timer -1

2020-05-11 00:06:27 (157820): Guest Log: 05/11/20 00:06:24 CronJobList: Deleting job 'kflops'

2020-05-11 00:06:27 (157820): Guest Log: 05/11/20 00:06:24 CronJob: Deleting job 'kflops' (/usr/libexec/condor/condor_kflops), timer -1

2020-05-11 00:06:27 (157820): Guest Log: 05/11/20 00:06:24 Cron: Killing all jobs

2020-05-11 00:06:27 (157820): Guest Log: 05/11/20 00:06:24 CronJobList: Deleting all jobs

2020-05-11 00:06:27 (157820): Guest Log: 05/11/20 00:06:24 All resources are free, exiting.

2020-05-11 00:06:27 (157820): Guest Log: 05/11/20 00:06:24 **** condor_startd (condor_STARTD) pid 10244 EXITING WITH STATUS 0

2020-05-11 00:06:27 (157820): Guest Log: [ERROR] No jobs were available to run.

2020-05-11 00:06:27 (157820): Guest Log: [INFO] Shutting Down.

2020-05-11 00:06:27 (157820): VM Completion File Detected.
2020-05-11 00:06:27 (157820): VM Completion Message: No jobs were available to run.
.
2020-05-11 00:06:27 (157820): Powering off VM.
2020-05-11 00:11:28 (157820): VM did not power off when requested.
2020-05-11 00:11:28 (157820): VM was successfully terminated.
2020-05-11 00:11:28 (157820): Deregistering VM. (boinc_312d6963e4768bbf, slot#6)
2020-05-11 00:11:28 (157820): Removing network bandwidth throttle group from VM.
2020-05-11 00:11:28 (157820): Removing VM from VirtualBox.
00:11:33 (157820): called boinc_finish(207)

</stderr_txt>
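Reading a log like the one above by eye is slow, so here is a minimal Python sketch that pulls out the few lines that seem to matter. The function name `summarize` and the pattern table are my own illustration, built only from the text of this log; they are not part of vboxwrapper, BOINC, or HTCondor.

```python
import re

# Hypothetical pattern table: the strings are copied from the log in this post,
# not taken from any official vboxwrapper or HTCondor documentation.
KEY_PATTERNS = {
    "idle_shutdown": re.compile(r"No resources have been claimed for (\d+) seconds"),
    "no_jobs": re.compile(r"\[ERROR\] No jobs were available to run"),
    "exit_code": re.compile(r"called boinc_finish\((\d+)\)"),
}


def summarize(log_text):
    """Return the first match of each key pattern found in the log text."""
    summary = {}
    for line in log_text.splitlines():
        for name, pattern in KEY_PATTERNS.items():
            match = pattern.search(line)
            if match and name not in summary:
                # Patterns with a capture group store the captured number,
                # patterns without one only record that the line was seen.
                summary[name] = match.group(1) if match.groups() else True
    return summary


# Three lines copied from the log above, used as a small sample input.
sample = """\
2020-05-11 00:06:27 (157820): Guest Log: 05/11/20 00:06:24 No resources have been claimed for 600 seconds
2020-05-11 00:06:27 (157820): Guest Log: [ERROR] No jobs were available to run.
00:11:33 (157820): called boinc_finish(207)
"""

print(summarize(sample))
```

On this log the sketch picks up the 600-second idle shutdown, the "No jobs were available to run" error and exit code 207, which suggests the VM itself started correctly and simply never received a job.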
5) Message boards : CMS Application : CMS Tasks Failing (Message 42400)
Posted 10 May 2020 by Guy PF Masevaux
Post:
Today the CMS tasks are all failing.
I looked at the task summary.
The tasks seem to have been wrongly created as multicore, but they run as single core and fail.
6) Message boards : Number crunching : VM Applications Errors (Message 41205)
Posted 8 Jan 2020 by Guy PF Masevaux
Post:
I used an HP Omen with an RTX card. I did my best to make it work.
Twice the tasks went wrong.
The error occurred in the last seconds of computing.
Now this has happened 3 times.
Something must be wrong.
I don't know on which computer you can solve it, because it failed on 2 big i7 machines and on a good AMD Ryzen 5.
Sorry.
Below you will find the list of errors that occurred:
Work unit: Theory_2279-784485-196
Application: Theory Simulation
Created: 19 Dec 2019, 17:43:43 UTC
Errors: Too many total results

Task 255962069 (computer 10536006)
Sent: 19 Dec 2019, 19:15:11 UTC
Reported: 24 Dec 2019, 9:30:26 UTC
Status: Error while computing
Run time (sec): 44,307.94
CPU time (sec): 19,950.05
Credit: ---
Application: Theory Simulation v300.02 (vbox64_theory), windows_x86_64

Task 256748413 (computer 10623629)
Sent: 24 Dec 2019, 9:38:14 UTC
Reported: 5 Jan 2020, 1:05:58 UTC
Status: Error while computing
Run time (sec): 357,010.95
CPU time (sec): 352,895.60
Credit: ---
Application: Theory Simulation v300.02 (vbox64_theory), x86_64-pc-linux-gnu

Task 257549284 (computer 10621995)
Sent: 4 Jan 2020, 9:38:20 UTC
Reported: 8 Jan 2020, 18:19:04 UTC
Status: Error while computing
Run time (sec): 376,418.99
CPU time (sec): 376,092.60
Credit: ---
Application: Theory Simulation v300.02 (vbox64_theory), windows_x86_64



7) Message boards : Number crunching : Checklist Version 3 for Atlas@Home (and other VM-based Projects) on your PC (Message 41195)
Posted 7 Jan 2020 by Guy PF Masevaux
Post:
Memory inside the computer:
It is necessary to look in the BOINC Manager options, especially the computing preferences, and select the largest amount of memory available in the computer so that the projects run easily.
For example:
With an i7-7700K I need 15 GB of RAM.
With an AMD Ryzen 7 2700X the maximum installed memory is 32 GB, and I use 99% of it for CERN projects when I compute at full power.
Sometimes the AMD Ryzen needs more than 32 GB at full power; then some tasks are put on standby and the computer does not work with all 16 processors at the same time.
You can see this with some CMS tasks that may be delayed.
ATLAS tasks usually need n processors.
Disk space is the second most important thing to check on your computer, depending on your choice of projects.
It is important to have a big C: disk with free space, or at least 15 GB of free disk space still available after installing the projects in the BOINC Manager.
Each project takes up space. If you want to work for more projects, you need more disk space and you have less free disk space available.
If your hard disk is too short of free space, the best choice is to install a hard disk with over 230 GB as C:.
With 300 GB you should run easily.
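The disk space rule of thumb above can be checked with a few lines of Python. This is only a sketch under my own assumptions: the helper names are invented for illustration, the 15 figure is the one quoted in this post (treated here as GiB), and the path should be changed to the drive that holds the BOINC data directory.

```python
import shutil

GIB = 1024 ** 3  # bytes per GiB


def free_disk_gib(path):
    """Free space, in GiB, on the volume that contains the given path."""
    return shutil.disk_usage(path).free / GIB


def has_enough_disk(path, required_gib=15.0):
    """True if the volume has at least required_gib GiB free.

    The default of 15 is the figure from the post, not an official
    LHC@home requirement.
    """
    return free_disk_gib(path) >= required_gib


if __name__ == "__main__":
    path = "."  # assumption: replace with the drive holding the BOINC data directory
    print(f"Free space on {path!r}: {free_disk_gib(path):.1f} GiB")
    print("Enough for the 15 GiB rule of thumb:", has_enough_disk(path))
```

Memory is left out on purpose, because the Python standard library has no portable call for the amount of installed RAM.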
I have only been back for a short time, because I was ill and needed time to get better.
It took a long time, and I can say that I have spent 50% of my time on my health since the Open Days, with no connection to CERN.
It is my advancing age catching up with me.
CERN was great, and I stayed at the IBIS Petit Lancy.

I restarted with Theory and some CMS computing.
I have 5 long-running CMS tasks that could finish within one or two days.
The run seems to be OK today. I hope it will stay that way until the end.

Guy PFLIEGER
8) Message boards : Number crunching : Checklist Version 3 for Atlas@Home (and other VM-based Projects) on your PC (Message 40368)
Posted 7 Nov 2019 by Guy PF Masevaux
Post:
This is a great answer
Hyper-V is used only for servers and not for virtualisation computing, and this is the reason why Hyper-V must be disabled.

To activate virtualisation you must go into the setup.

Now I have changed my system and I am up to 6 computers.
The latest is an i7 at 5GHz with an RTX coprocessor:
CPU type: GenuineIntel, Intel(R) Core(TM) i7-9700F CPU @ 3.00GHz [Family 6 Model 158 Stepping 13]
Number of processors: 8
Coprocessors: NVIDIA GeForce RTX 2060 (4095MB) driver: 430.39 OpenCL: 1.2
Virtualisation: VirtualBox (5.2.8) installed; the processor supports hardware virtualisation and it is enabled

I had to make some changes in my computer room, and now I have restarted computing for the LHC.

Best regards to everybody

Guy PFLIEGER
9) Message boards : Theory Application : Poor CPU utilization on High Core Count CPUs during Theory Application (Message 40338)
Posted 30 Oct 2019 by Guy PF Masevaux
Post:
Virtualisation on the AMD Ryzen is now activated.
I can send the proof to a CERN mailbox.

Guy PFLIEGER
10) Message boards : Theory Application : Poor CPU utilization on High Core Count CPUs during Theory Application (Message 40327)
Posted 30 Oct 2019 by Guy PF Masevaux
Post:
Yes!
This was the answer!
I searched the internet for how to fix this problem and I found a detailed answer.
I made the change as explained on the website.
Then I modified the boot configuration.
Now virtualization is fully recognized, and Hyper-V too.
I installed the latest VirtualBox and the latest Extension Pack.
I cleaned the computer and optimized the registry.
Then I restarted.
Now the computer needs a lot less memory and is able to work at full power.

Greetings to you!

Guy PFLIEGER
11) Message boards : Theory Application : Poor CPU utilization on High Core Count CPUs during Theory Application (Message 40312)
Posted 29 Oct 2019 by Guy PF Masevaux
Post:
I just received an answer: they are replacing the windows of the office at the hospital, so they cannot receive us until a later date.

My wife confirmed it.

Guy PFLIEGER
12) Message boards : Theory Application : Poor CPU utilization on High Core Count CPUs during Theory Application (Message 40311)
Posted 29 Oct 2019 by Guy PF Masevaux
Post:
This is a sample of the errors with the AMD Ryzen.
For errors there are usually answers.


Name: Theory_1729189_1572308367.834984_0
Work unit (WU): 125694646
Created: 29 Oct 2019, 0:19:29 UTC
Sent: 29 Oct 2019, 5:52:44 UTC
Report deadline: 29 Nov 2019, 5:52:44 UTC
Received: 29 Oct 2019, 5:54:21 UTC


2019-10-29 06:52:50 (14168):
Command: VBoxManage -q showvminfo "boinc_1ad8e5bfb0f350ca" --machinereadable
Exit Code: -2135228415
Output:
VBoxManage.exe: error: Could not find a registered machine named 'boinc_1ad8e5bfb0f350ca'
VBoxManage.exe: error: Details: code VBOX_E_OBJECT_NOT_FOUND (0x80bb0001), component VirtualBoxWrap, interface IVirtualBox, callee IUnknown
VBoxManage.exe: error: Context: "FindMachine(Bstr(VMNameOrUuid).raw(), machine.asOutParam())" at line 2621 of file VBoxManageInfo.cpp

VBoxManage.exe: error: Not in a hypervisor partition (HVP=0) (VERR_NEM_NOT_AVAILABLE).
VBoxManage.exe: error: AMD-V is disabled in the BIOS (or by the host OS) (VERR_SVM_DISABLED)
VBoxManage.exe: error: Details: code E_FAIL (0x80004005), component ConsoleWrap, interface IConsole

This machine does not have any snapshots
2019-10-29 06:53:03 (14168):
Command: VBoxManage -q bandwidthctl "boinc_1ad8e5bfb0f350ca" remove "boinc_1ad8e5bfb0f350ca_net"
Exit Code: 0
Output:
VBoxManage.exe: error: Bandwidth groups cannot be deleted while the VM is running


I am crossing my fingers.
Sorry, but I must go to a medical appointment between 13:20 and 15:30 French time.

Good afternoon everybody
Guy PFLIEGER
13) Message boards : Theory Application : Poor CPU utilization on High Core Count CPUs during Theory Application (Message 40310)
Posted 29 Oct 2019 by Guy PF Masevaux
Post:
If you have an AMD Ryzen Threadripper, I would like to know what you did to run ATLAS without failures.


I followed Yeti's checklist, and it is impossible to run ATLAS on my AMD Ryzen 7 2700X.
I can run SixTrack and nothing else.
Something must be wrong, but I do not know what.
I have no problem on Intel Skylake processors with Yeti's checklist.

My AMD Ryzen has 32 GB of RAM and all my other computers have 16 GB of RAM.
I also tested the AMD with and without Hyper-V, and every time I get failures, also with Theory running.
I think there is a bug.
But AMD is listed as a supported platform for each application:


CMS Simulation
Platform | Version | Created | Average computing
Microsoft Windows running on an AMD x86_64 or Intel EM64T CPU | 49.00 (vbox64) | 26 Mar 2019, 8:56:04 UTC | 267 GigaFLOPS
Linux running on an AMD x86_64 or Intel EM64T CPU | 49.00 (vbox64) | 26 Mar 2019, 9:00:17 UTC | 244 GigaFLOPS
Intel 64-bit Mac OS 10.5 or later | 49.00 (vbox64) | 26 Mar 2019, 8:55:02 UTC | 17 GigaFLOPS

Theory Simulation
Platform | Version | Created | Average computing
Microsoft Windows (98 or later) running on an Intel x86-compatible CPU | 263.50 (vbox32) | 8 Oct 2017, 10:47:42 UTC | 310 GigaFLOPS
Microsoft Windows running on an AMD x86_64 or Intel EM64T CPU | 263.98 (vbox64_mt_mcore) | 16 Jul 2019, 14:18:50 UTC | 9,667 GigaFLOPS
Linux running on an AMD x86_64 or Intel EM64T CPU | 263.98 (vbox64_mt_mcore) | 16 Jul 2019, 14:19:04 UTC | 4,782 GigaFLOPS
Intel 64-bit Mac OS 10.5 or later | 263.98 (vbox64_mt_mcore) | 16 Jul 2019, 14:19:16 UTC | 358 GigaFLOPS

ATLAS Simulation
Platform | Version | Created | Average computing
Microsoft Windows running on an AMD x86_64 or Intel EM64T CPU | 2.00 (vbox64_mt_mcore_atlas) | 9 Oct 2019, 8:30:17 UTC | 3,128 GigaFLOPS
Linux running on an AMD x86_64 or Intel EM64T CPU | 2.00 (vbox64_mt_mcore_atlas) | 9 Oct 2019, 8:30:50 UTC | 806 GigaFLOPS
Linux running on an AMD x86_64 or Intel EM64T CPU | 2.73 (native_mt) (beta test) | 17 Oct 2019, 12:21:07 UTC | 27,575 GigaFLOPS
Intel 64-bit Mac OS 10.5 or later | 2.00 (vbox64_mt_mcore_atlas) | 9 Oct 2019, 8:31:15 UTC | 138 GigaFLOPS

I hope for an answer.

Guy PFLIEGER
14) Message boards : Number crunching : GPU advertised for LHC, but they don't do it? (Message 40239)
Posted 22 Oct 2019 by Guy PF Masevaux
Post:
Download pages for the languages:
https://www.khronos.org/opencl/
https://developer.nvidia.com/cuda-zone


happy crunching!

Guy PFLIEGER
France
15) Message boards : Number crunching : GPU advertised for LHC, but they don't do it? (Message 40238)
Posted 22 Oct 2019 by Guy PF Masevaux
Post:
Documents to read, with links for the developers (sorry, they are in French):

https://fr.wikipedia.org/wiki/OpenCL

https://fr.wikipedia.org/wiki/Compute_Unified_Device_Architecture


Guy PFLIEGER

France
16) Message boards : Number crunching : GPU advertised for LHC, but they don't do it? (Message 40237)
Posted 22 Oct 2019 by Guy PF Masevaux
Post:
Then the answer is easy:
To satisfy most users, the choice should be OpenCL.
If CERN then develops both OpenCL tasks and CPU tasks, most users will be able to use their GPU.

Guy PFLIEGER
France
17) Message boards : Number crunching : GPU advertised for LHC, but they don't do it? (Message 40234)
Posted 21 Oct 2019 by Guy PF Masevaux
Post:
I ate Lepista nuda with onion and Brussels sprouts.
Yesterday and today I also found Laccaria amethystina.
I prepared one part as an omelette; the rest was lightly cooked in melted butter, and as soon as it was cold I put it in the freezer for the winter.
So I will be able to have special mushrooms for the feasts.

They were picked outside of the mining zone, in a forest with clay soil.

This was excellent,
but I have a problem with my heart, and the specialist told me to be careful about eating mushrooms.

Thank you for your messages.

Now I think I will listen to "Hold the Line" by TOTO.

Nice evening to everybody!

Guy PFLIEGER
France
18) Message boards : Number crunching : GPU advertised for LHC, but they don't do it? (Message 40230)
Posted 21 Oct 2019 by Guy PF Masevaux
Post:
I just want to apologize to you for having forgotten something.
In my last message I forgot to say:

Happy crunching and best regards

The reason was that on Sunday I combined several activities and wrote the message after 22 hours of combined work, so I was really tired.
I went to bed early in the morning, and after sleeping I had to do my cooking.
Yesterday I was also out looking for mushrooms, and I found enough to do my cooking today; some were prepared to be frozen for the winter.
Regarding mushrooms and possible radiation from the Chernobyl accident: the dried mushrooms were checked with a dosimeter, and only a very faint signal comes from them. The difference is so small that, compared with the data published by CRIIRAD directly after the destruction of Chernobyl, we can now say that the radioactivity of mushrooms in Alsace has decreased very considerably (except on ground known to contain natural uranium, in connection with possible mining).
The north of France is usually a rainy part of France, and uranium has the property of dissolving slowly in water. Over the years, the rain and the melting snow have cleaned the ground wherever the rock under the ground does not contain uranium.

Now I will finish my mealtime and eat my mushrooms.

Happy Crunching
Best Regards

Guy PFLIEGER
France
19) Message boards : Number crunching : GPU advertised for LHC, but they don't do it? (Message 40228)
Posted 20 Oct 2019 by Guy PF Masevaux
Post:
Hello crunchers and forum managers!
First I want to speak about the GPU market.
The best-selling GPUs are the NVIDIA GTX... models, with ever greater computing power and regular driver upgrades for the whole range of GPU models.
If I compare the computing speed for 2 similar tasks, taking SETI as a sample, the same task takes only 11 to 20 minutes on a GPU, depending on the GPU model, while it needs 1 h 30 min to 2 h 30 min when computed only on the CPU.

If this were applied inside CERN, which is currently undergoing a great upgrade, then I wish for the future the use of CUDA 42 or CUDA 50 or 55 for NVIDIA graphics cards or NVIDIA GPUs.
This would greatly decrease the computing time for many tasks.
For comparison, I think that 7 hours of computing could be brought down to less than one hour.
This would increase the productivity of each of our computers and bring CERN many more results in less time than usual, with the same number of computers in the world.
As a reward for their faithful crunching, the crunchers would receive from CERN a higher number of cobblestones in the same time.
Everybody wins in this game.
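
To make the comparison concrete, here is a small sketch of the arithmetic, using only the run times quoted in this message. It assumes, as this message does, that the ratio seen on the SETI sample would carry over to a CERN task, which is not verified; the function and variable names are only illustrative.

```python
# Speedup implied by the run times quoted in this message (SETI sample task).
gpu_minutes = (11, 20)    # GPU run time range, depending on the GPU model
cpu_minutes = (90, 150)   # CPU run time range: 1 h 30 min to 2 h 30 min


def speedup_range(cpu, gpu):
    """Return the (lowest, highest) CPU-to-GPU run time ratio."""
    lowest = min(cpu) / max(gpu)    # fastest CPU run against slowest GPU run
    highest = max(cpu) / min(gpu)   # slowest CPU run against fastest GPU run
    return lowest, highest


low, high = speedup_range(cpu_minutes, gpu_minutes)
print(f"speedup between {low:.1f}x and {high:.1f}x")

# Assumption: a 7-hour CPU task scales by the same ratios.
task_minutes = 7 * 60
print(f"{task_minutes / high:.0f} to {task_minutes / low:.0f} minutes on GPU")
```

With these numbers the ratio lies between 4.5 and about 13.6, so a 7-hour task would take roughly 31 to 93 minutes. Getting under one hour needs a ratio above 7, which is inside that range but not guaranteed for every GPU model.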

Guy PFLIEGER
France
20) Message boards : ATLAS application : Guide for building everything from sources to run native ATLAS on Debian 9 (Stretch) Version 2 (Message 40017)
Posted 24 Sep 2019 by Guy PF Masevaux
Post:
The real answer for AMD Ryzen has now come from a computer maintenance center: 32 GB of RAM.




©2020 CERN