Message boards : ATLAS application : ATLAS vbox version 2.00
Toby Broom
Volunteer moderator
Joined: 27 Sep 08
Posts: 798
Credit: 644,733,934
RAC: 233,644
Message 40367 - Posted: 6 Nov 2019, 16:42:50 UTC

I got interesting logs in my task today:

kworker/0:3:95 blocked for more than 120 seconds.

You can't switch to the other logins.

Luigi R.
Joined: 7 Feb 14
Posts: 99
Credit: 5,180,005
RAC: 0
Message 40370 - Posted: 7 Nov 2019, 9:39:47 UTC - in response to Message 40357.  

Typical config entries

/etc/cvmfs/default.local
CVMFS_REPOSITORIES="atlas.cern.ch,atlas-condb.cern.ch,grid.cern.ch,cernvm-prod.cern.ch"
[...]

It should be checked (by the project team) if CVMFS_SERVER_URL lists at least 4 servers. Then it's very unlikely that all of them fail at the same moment.
cernvm-prod.cern.ch doesn't appear in my stderr files.


Client side issues could be:
- wrong firewall settings, e.g. closed ports or filtered destinations
- slow DNS resolution
- high load on the router (not the same as high bandwidth usage!) that causes timeouts
Maybe this. My router's Wi-Fi is not as good as my old router's, and there are also Theory VMs, smartphones, pay TV, etc. on the network.
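As a rough illustration of the check suggested above (at least 4 servers in CVMFS_SERVER_URL), here is a minimal Python sketch. It assumes the value is a semicolon-separated list of URLs; the function name and the example.org hosts are made up for the example and are not the project's real servers.

```python
import re

def count_cvmfs_servers(config_text):
    """Count the URLs in the last uncommented CVMFS_SERVER_URL assignment (0 if absent)."""
    count = 0
    for raw in config_text.splitlines():
        line = raw.strip()
        if line.startswith("#"):
            continue  # skip commented-out settings
        match = re.match(r"CVMFS_SERVER_URL=(.*)$", line)
        if match:
            # Drop surrounding quotes, then split the list on ';'
            value = match.group(1).strip().strip("\"'")
            count = len([url for url in value.split(";") if url.strip()])
    return count

# Placeholder hosts for illustration only, NOT real servers.
SAMPLE = """
CVMFS_REPOSITORIES="atlas.cern.ch,atlas-condb.cern.ch,grid.cern.ch"
CVMFS_SERVER_URL="http://s1.example.org/cvmfs/@fqrn@;http://s2.example.org/cvmfs/@fqrn@;http://s3.example.org/cvmfs/@fqrn@"
"""

servers = count_cvmfs_servers(SAMPLE)
print(f"{servers} server(s) listed; at least 4 suggested: {servers >= 4}")
```

The function only looks at text it is given, so it can be pointed at whichever config file actually sets CVMFS_SERVER_URL on a given system.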

Luigi R.
Joined: 7 Feb 14
Posts: 99
Credit: 5,180,005
RAC: 0
Message 40645 - Posted: 25 Nov 2019, 7:05:08 UTC - in response to Message 40370.  
Last modified: 25 Nov 2019, 7:11:51 UTC

Is this task ok?

2019-11-23 16:31:15 (29320): vboxwrapper (7.7.26197): starting
2019-11-23 16:31:16 (29320): Feature: Checkpoint interval offset (402 seconds)
2019-11-23 16:31:16 (29320): Detected: VirtualBox VboxManage Interface (Version: 5.2.10)
2019-11-23 16:31:16 (29320): Detected: Minimum checkpoint interval (900.000000 seconds)
2019-11-23 16:31:16 (29320): Successfully copied 'init_data.xml' to the shared directory.
2019-11-23 16:31:16 (29320): Create VM. (boinc_13c65ac4e77b5ca5, slot#3)
2019-11-23 16:31:19 (29320): Setting Memory Size for VM. (6600MB)
2019-11-23 16:31:19 (29320): Setting CPU Count for VM. (1)
2019-11-23 16:31:19 (29320): Setting Chipset Options for VM.
2019-11-23 16:31:19 (29320): Setting Boot Options for VM.
2019-11-23 16:31:19 (29320): Setting Network Configuration for NAT.
2019-11-23 16:31:19 (29320): Enabling VM Network Access.
2019-11-23 16:31:19 (29320): Disabling USB Support for VM.
2019-11-23 16:31:19 (29320): Disabling COM Port Support for VM.
2019-11-23 16:31:19 (29320): Disabling LPT Port Support for VM.
2019-11-23 16:31:20 (29320): Disabling Audio Support for VM.
2019-11-23 16:31:20 (29320): Disabling Clipboard Support for VM.
2019-11-23 16:31:20 (29320): Disabling Drag and Drop Support for VM.
2019-11-23 16:31:20 (29320): Adding storage controller(s) to VM.
2019-11-23 16:31:20 (29320): Adding virtual disk drive to VM. (vm_image.vdi)
2019-11-23 16:31:20 (29320): Adding VirtualBox Guest Additions to VM.
2019-11-23 16:31:20 (29320): Adding network bandwidth throttle group to VM. (Defaulting to 1024GB)
2019-11-23 16:31:20 (29320): forwarding host port 59209 to guest port 80
2019-11-23 16:31:20 (29320): Enabling remote desktop for VM.
2019-11-23 16:31:20 (29320): Enabling shared directory for VM.
2019-11-23 16:31:21 (29320): Starting VM. (boinc_13c65ac4e77b5ca5, slot#3)
2019-11-23 16:31:22 (29320): Successfully started VM. (PID = '29969')
2019-11-23 16:31:22 (29320): Reporting VM Process ID to BOINC.
2019-11-23 16:31:22 (29320): Guest Log: BIOS: VirtualBox 5.2.10
2019-11-23 16:31:22 (29320): Guest Log: CPUID EDX: 0x078bfbff
2019-11-23 16:31:22 (29320): Guest Log: BIOS: ata0-0: PCHS=16383/16/63 LCHS=1024/255/63
2019-11-23 16:31:22 (29320): VM state change detected. (old = 'poweroff', new = 'running')
2019-11-23 16:31:22 (29320): Detected: Web Application Enabled (http://localhost:59209)
2019-11-23 16:31:22 (29320): Detected: Remote Desktop Enabled (localhost:41312)
2019-11-23 16:31:23 (29320): Preference change detected
2019-11-23 16:31:23 (29320): Setting CPU throttle for VM. (100%)
2019-11-23 16:31:23 (29320): Setting checkpoint interval to 900 seconds. (Higher value of (Preference: 300 seconds) or (Vbox_job.xml: 900 seconds))
2019-11-23 16:31:25 (29320): Guest Log: BIOS: Boot : bseqnr=1, bootseq=0032
2019-11-23 16:31:25 (29320): Guest Log: BIOS: Booting from Hard Disk...
2019-11-23 16:31:27 (29320): Guest Log: BIOS: KBD: unsupported int 16h function 03
2019-11-23 16:31:27 (29320): Guest Log: BIOS: AX=0305 BX=0000 CX=0000 DX=0000 
2019-11-23 16:31:27 (29320): Guest Log: int13_harddisk_ext: function 41, unmapped device for ELDL=81
2019-11-23 16:31:27 (29320): Guest Log: int13_harddisk: function 02, unmapped device for ELDL=81
2019-11-23 16:31:27 (29320): Guest Log: int13_harddisk_ext: function 41, unmapped device for ELDL=82
2019-11-23 16:31:27 (29320): Guest Log: int13_harddisk: function 02, unmapped device for ELDL=82
2019-11-23 16:31:27 (29320): Guest Log: int13_harddisk_ext: function 41, unmapped device for ELDL=83
2019-11-23 16:31:27 (29320): Guest Log: int13_harddisk: function 02, unmapped device for ELDL=83
2019-11-23 16:31:27 (29320): Guest Log: int13_harddisk_ext: function 41, unmapped device for ELDL=84
2019-11-23 16:31:27 (29320): Guest Log: int13_harddisk: function 02, unmapped device for ELDL=84
2019-11-23 16:31:27 (29320): Guest Log: int13_harddisk_ext: function 41, unmapped device for ELDL=85
2019-11-23 16:31:27 (29320): Guest Log: int13_harddisk: function 02, unmapped device for ELDL=85
2019-11-23 16:31:27 (29320): Guest Log: int13_harddisk_ext: function 41, unmapped device for ELDL=86
2019-11-23 16:31:27 (29320): Guest Log: int13_harddisk: function 02, unmapped device for ELDL=86
2019-11-23 16:31:27 (29320): Guest Log: int13_harddisk_ext: function 41, unmapped device for ELDL=87
2019-11-23 16:31:27 (29320): Guest Log: int13_harddisk: function 02, unmapped device for ELDL=87
2019-11-23 16:31:27 (29320): Guest Log: int13_harddisk_ext: function 41, unmapped device for ELDL=88
2019-11-23 16:31:27 (29320): Guest Log: int13_harddisk: function 02, unmapped device for ELDL=88
2019-11-23 16:31:27 (29320): Guest Log: int13_harddisk_ext: function 41, unmapped device for ELDL=89
2019-11-23 16:31:27 (29320): Guest Log: int13_harddisk: function 02, unmapped device for ELDL=89
2019-11-23 16:31:27 (29320): Guest Log: int13_harddisk_ext: function 41, unmapped device for ELDL=8a
2019-11-23 16:31:27 (29320): Guest Log: int13_harddisk: function 02, unmapped device for ELDL=8a
2019-11-23 16:31:27 (29320): Guest Log: int13_harddisk_ext: function 41, unmapped device for ELDL=8b
2019-11-23 16:31:27 (29320): Guest Log: int13_harddisk: function 02, unmapped device for ELDL=8b
2019-11-23 16:31:27 (29320): Guest Log: int13_harddisk_ext: function 41, unmapped device for ELDL=8c
2019-11-23 16:31:27 (29320): Guest Log: int13_harddisk: function 02, unmapped device for ELDL=8c
2019-11-23 16:31:27 (29320): Guest Log: int13_harddisk_ext: function 41, unmapped device for ELDL=8d
2019-11-23 16:31:27 (29320): Guest Log: int13_harddisk: function 02, unmapped device for ELDL=8d
2019-11-23 16:31:27 (29320): Guest Log: int13_harddisk_ext: function 41, unmapped device for ELDL=8e
2019-11-23 16:31:27 (29320): Guest Log: int13_harddisk: function 02, unmapped device for ELDL=8e
2019-11-23 16:31:27 (29320): Guest Log: int13_harddisk_ext: function 41, unmapped device for ELDL=8f
2019-11-23 16:31:27 (29320): Guest Log: int13_harddisk: function 02, unmapped device for ELDL=8f
2019-11-23 16:31:31 (29320): Guest Log: vgdrvHeartbeatInit: Setting up heartbeat to trigger every 2000 milliseconds
2019-11-23 16:31:31 (29320): Guest Log: vboxguest: misc device minor 58, IRQ 20, I/O port d020, MMIO at 00000000f0400000 (size 0x400000)
2019-11-23 16:31:34 (29320): Guest Log: Checking CVMFS...
2019-11-23 16:31:34 (29320): Guest Log: Failed to check CVMFS, check output from cvmfs_config probe:
2019-11-23 18:11:14 (29320): Status Report: Elapsed Time: '6000.585989'
2019-11-23 18:11:14 (29320): Status Report: CPU Time: '5567.880000'
2019-11-23 19:51:11 (29320): Status Report: Elapsed Time: '12000.743581'
2019-11-23 19:51:11 (29320): Status Report: CPU Time: '11286.870000'
2019-11-23 21:31:10 (29320): Status Report: Elapsed Time: '18002.214636'
2019-11-23 21:31:10 (29320): Status Report: CPU Time: '17047.590000'
2019-11-23 23:11:09 (29320): Status Report: Elapsed Time: '24004.466774'
2019-11-23 23:11:09 (29320): Status Report: CPU Time: '22828.480000'
2019-11-24 00:51:03 (29320): Status Report: Elapsed Time: '30004.874428'
2019-11-24 00:51:03 (29320): Status Report: CPU Time: '28598.400000'
2019-11-24 02:31:02 (29320): Status Report: Elapsed Time: '36006.562629'
2019-11-24 02:31:02 (29320): Status Report: CPU Time: '34382.740000'
2019-11-24 04:10:51 (29320): Status Report: Elapsed Time: '42007.551548'
2019-11-24 04:10:51 (29320): Status Report: CPU Time: '40147.510000'
2019-11-24 05:50:49 (29320): Status Report: Elapsed Time: '48007.580669'
2019-11-24 05:50:49 (29320): Status Report: CPU Time: '45889.600000'
2019-11-24 07:30:44 (29320): Status Report: Elapsed Time: '54007.754148'
2019-11-24 07:30:44 (29320): Status Report: CPU Time: '51617.320000'
2019-11-24 09:10:42 (29320): Status Report: Elapsed Time: '60007.954579'
2019-11-24 09:10:42 (29320): Status Report: CPU Time: '57349.920000'
2019-11-24 10:50:37 (29320): Status Report: Elapsed Time: '66008.825746'
2019-11-24 10:50:37 (29320): Status Report: CPU Time: '63149.470000'
2019-11-24 12:30:29 (29320): Status Report: Elapsed Time: '72009.494593'
2019-11-24 12:30:29 (29320): Status Report: CPU Time: '68899.570000'
2019-11-24 14:10:27 (29320): Status Report: Elapsed Time: '78009.708986'
2019-11-24 14:10:27 (29320): Status Report: CPU Time: '74605.640000'
2019-11-24 15:50:26 (29320): Status Report: Elapsed Time: '84011.110692'
2019-11-24 15:50:26 (29320): Status Report: CPU Time: '80253.640000'
2019-11-24 17:30:20 (29320): Status Report: Elapsed Time: '90012.575876'
2019-11-24 17:30:20 (29320): Status Report: CPU Time: '86041.990000'
2019-11-24 19:10:14 (29320): Status Report: Elapsed Time: '96013.218405'
2019-11-24 19:10:14 (29320): Status Report: CPU Time: '91444.780000'
2019-11-24 20:50:04 (29320): Status Report: Elapsed Time: '102013.498886'
2019-11-24 20:50:04 (29320): Status Report: CPU Time: '96998.080000'
2019-11-24 22:30:00 (29320): Status Report: Elapsed Time: '108013.533275'
2019-11-24 22:30:01 (29320): Status Report: CPU Time: '102776.200000'
2019-11-25 00:09:58 (29320): Status Report: Elapsed Time: '114013.561833'
2019-11-25 00:09:58 (29320): Status Report: CPU Time: '108540.050000'
2019-11-25 01:49:51 (29320): Status Report: Elapsed Time: '120014.473556'
2019-11-25 01:49:51 (29320): Status Report: CPU Time: '114327.220000'
2019-11-25 03:29:42 (29320): Status Report: Elapsed Time: '126014.698520'
2019-11-25 03:29:42 (29320): Status Report: CPU Time: '120073.000000'
2019-11-25 05:09:40 (29320): Status Report: Elapsed Time: '132014.853378'
2019-11-25 05:09:40 (29320): Status Report: CPU Time: '125800.820000'
2019-11-25 06:49:40 (29320): Status Report: Elapsed Time: '138016.302110'
2019-11-25 06:49:40 (29320): Status Report: CPU Time: '131517.270000'


It has been crunching for 39 hours now. CPU usage is 100%.
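For reference, the "Status Report" lines in the log above can be turned into an elapsed time and a CPU/elapsed ratio with a few lines of Python. This is only a sketch: the function name is made up, and a high ratio shows that the VM is busy, not that ATLAS events are actually being processed.

```python
import re

# Matches e.g.: Status Report: Elapsed Time: '138016.302110'
STATUS_RE = re.compile(r"Status Report: (Elapsed Time|CPU Time): '([0-9.]+)'")

def last_status(log_text):
    """Return (elapsed_seconds, cpu_seconds) from the most recent status report."""
    values = {}
    for kind, number in STATUS_RE.findall(log_text):
        values[kind] = float(number)  # later reports overwrite earlier ones
    return values.get("Elapsed Time"), values.get("CPU Time")

# The last two status lines from the log above.
LOG = """\
2019-11-25 06:49:40 (29320): Status Report: Elapsed Time: '138016.302110'
2019-11-25 06:49:40 (29320): Status Report: CPU Time: '131517.270000'
"""

elapsed, cpu = last_status(LOG)
print(f"elapsed: {elapsed / 3600:.1f} h, CPU/elapsed: {cpu / elapsed:.1%}")
```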

computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2386
Credit: 222,930,147
RAC: 137,650
Message 40646 - Posted: 25 Nov 2019, 7:21:56 UTC - in response to Message 40645.  

The logfile looks fine.

Since you have VirtualBox Guest Additions installed you can click on "Show VM Console".
Then use ALT-F2 to switch to ATLAS Event Progress Monitoring.

Crystal Pellet
Volunteer moderator
Volunteer tester
Joined: 14 Jan 10
Posts: 1268
Credit: 8,421,616
RAC: 2,139
Message 40648 - Posted: 25 Nov 2019, 10:36:02 UTC - in response to Message 40645.  
Last modified: 25 Nov 2019, 10:36:41 UTC

Is this task ok?

2019-11-23 16:31:34 (29320): Guest Log: Checking CVMFS...
2019-11-23 16:31:34 (29320): Guest Log: Failed to check CVMFS, check output from cvmfs_config probe:
I don't think so. Normally, when CVMFS is OK, these lines should follow:
Guest Log: Mounting shared directory
Guest Log: Copying input files
Guest Log: Copied input files into RunAtlas.

Luigi R.
Joined: 7 Feb 14
Posts: 99
Credit: 5,180,005
RAC: 0
Message 40662 - Posted: 25 Nov 2019, 19:30:13 UTC - in response to Message 40648.  
Last modified: 25 Nov 2019, 19:33:06 UTC

I clicked "Show VM Console". There were some lines ending with something like "disabled."; I can't remember exactly.
Then Alt+F2 didn't show anything apart from an underscore.
I tried to restart BOINC. Something crashed while resuming that VM.

https://lhcathome.cern.ch/lhcathome/result.php?resultid=252684654

Erich56
Joined: 18 Dec 15
Posts: 1686
Credit: 100,369,585
RAC: 101,945
Message 40664 - Posted: 25 Nov 2019, 19:55:20 UTC - in response to Message 40662.  

https://lhcathome.cern.ch/lhcathome/result.php?resultid=252684654
There was something wrong with this task or with the VM processing,
and that's why the VM Console didn't work.
The excerpt from the stderr says it clearly:

Hypervisor System Log:
68:52:25.239712 nspr-4 ERROR [COM]: aRC=VBOX_E_OBJECT_NOT_FOUND (0x80bb0001) aIID={85cd948e-a71f-4289-281e-0ca7ad48cd89} aComponent={SessionMachine} aText={No storage device attached to device slot 1 on port 0 of controller 'Hard Disk Controller'}, preserve=false aResultDetail=0

Luigi R.
Joined: 7 Feb 14
Posts: 99
Credit: 5,180,005
RAC: 0
Message 40667 - Posted: 25 Nov 2019, 20:31:08 UTC - in response to Message 40356.  

Yeah, thank you. Otherwise I can write a bash script that parses stderr.txt and automatically aborts the affected task when three "Probing /cvmfs/*... Failed!" lines appear (and those three lines must be consecutive).
Ok, it should work.

[...]
Obviously, this script does not work if you restart the BOINC client, because it adds up all CVMFS fails regardless of whether some of them are from a previous start.
E.g. if you started the BOINC client 2 times and there were 2 fails the first time and 1 fail the second time, the fail counter will be equal to 3 and the script suspends that task. Wrong!

Now it checks whether the 3 fails are on consecutive lines, or at least I think it does.

Code: https://pastebin.com/r82vuzGM

Output example:
2 consecutive probing fails found in /home/luis/Applicazioni/boinc/slots/1/stderr.txt after line No. 77
2 consecutive probing fails found in /home/luis/Applicazioni/boinc/slots/1/stderr.txt after line No. 151
Total: 4 fails.
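
The idea described above (only count fails that sit on consecutive lines, so that fails from an earlier client start are not added together) could look roughly like this in Python. This is not the pastebin script; the function names, the threshold parameter and the sample log lines are made up for illustration, and only the "Probing /cvmfs/*... Failed!" pattern comes from the post. Reporting the line just before each run is also an assumption about what "after line No." means.

```python
import re

FAIL_RE = re.compile(r"Probing /cvmfs/\S+\.\.\. Failed!")

def consecutive_fail_runs(lines):
    """Return (line_number_before_run, run_length) for each run of adjacent failing lines."""
    runs = []
    run_len = 0
    run_start = 0
    for number, line in enumerate(lines, start=1):
        if FAIL_RE.search(line):
            if run_len == 0:
                run_start = number
            run_len += 1
        else:
            if run_len:
                runs.append((run_start - 1, run_len))
            run_len = 0  # any other line breaks the run
    if run_len:
        runs.append((run_start - 1, run_len))
    return runs

def should_abort(lines, threshold=3):
    """True only if at least `threshold` fails appear on consecutive lines."""
    return any(length >= threshold for _, length in consecutive_fail_runs(lines))

# Made-up stderr excerpt: 2 fails, a restart, then 1 more fail.
SAMPLE = [
    "Guest Log: Checking CVMFS...",
    "Guest Log: Probing /cvmfs/atlas.cern.ch... Failed!",
    "Guest Log: Probing /cvmfs/atlas-condb.cern.ch... Failed!",
    "vboxwrapper: starting",
    "Guest Log: Probing /cvmfs/grid.cern.ch... Failed!",
]

for before, length in consecutive_fail_runs(SAMPLE):
    print(f"{length} consecutive probing fails found after line No. {before}")
print("abort:", should_abort(SAMPLE))
```

With this sample the total is 3 fails, but no run reaches 3, so the task would be left alone.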

Luigi R.
Joined: 7 Feb 14
Posts: 99
Credit: 5,180,005
RAC: 0
Message 40668 - Posted: 25 Nov 2019, 20:38:23 UTC - in response to Message 40664.  
Last modified: 25 Nov 2019, 20:38:57 UTC

https://lhcathome.cern.ch/lhcathome/result.php?resultid=252684654
There was something wrong with this task or with the VM processing,
and that's why the VM Console didn't work.
The excerpt from the stderr says it clearly:

Hypervisor System Log:
68:52:25.239712 nspr-4 ERROR [COM]: aRC=VBOX_E_OBJECT_NOT_FOUND (0x80bb0001) aIID={85cd948e-a71f-4289-281e-0ca7ad48cd89} aComponent={SessionMachine} aText={No storage device attached to device slot 1 on port 0 of controller 'Hard Disk Controller'}, preserve=false aResultDetail=0
I really don't understand: whiskey-tango-foxtrot, it crunched for 51 hours and would have happily continued. :(

Marjan
Joined: 25 May 14
Posts: 6
Credit: 3,633,724
RAC: 0
Message 40785 - Posted: 4 Dec 2019, 19:32:55 UTC
Last modified: 4 Dec 2019, 19:43:06 UTC

Hello,

I have a problem with ATLAS tasks running on VM (VirtualBox) version 6.0.14 r133895 (Qt5.6.2) on my PC with 64-bit Windows 10.

Every second ATLAS task stops after 'Checking CVMFS...':

Normal task:

00:00:10.649235 VMMDev: Guest Log: vboxguest: misc device minor 58, IRQ 20, I/O port d020, MMIO at 00000000f0400000 (size 0x400000)
00:00:10.835497 Display::i_handleDisplayResize: uScreenId=0 pvVRAM=000000000d470000 w=800 h=600 bpp=32 cbLine=0xC80 flags=0x1 origin=0,0
00:00:11.821696 NAT: IPv6 not supported
00:00:12.101758 NAT: DHCP offered IP address 10.0.2.15
00:00:12.102189 NAT: DHCP offered IP address 10.0.2.15
00:00:12.577408 VMMDev: Guest Log: Checking CVMFS...
00:00:16.535978 VMMDev: Guest Log: VBoxService 5.2.32 r132073 (verbosity: 0) linux.amd64 (Jul 12 2019 10:32:28) release log
00:00:16.536006 VMMDev: Guest Log: 00:00:00.000125 main Log opened 2019-12-04T20:06:44.657029000Z
00:00:16.536053 VMMDev: Guest Log: 00:00:00.000216 main OS Product: Linux
00:00:16.536081 VMMDev: Guest Log: 00:00:00.000246 main OS Release: 3.10.0-957.27.2.el7.x86_64
00:00:16.536103 VMMDev: Guest Log: 00:00:00.000269 main OS Version: #1 SMP Mon Jul 29 17:46:05 UTC 2019
00:00:16.536126 VMMDev: Guest Log: 00:00:00.000291 main Executable: /opt/VBoxGuestAdditions-5.2.32/sbin/VBoxService
00:00:16.536133 VMMDev: Guest Log: 00:00:00.000291 main Process ID: 1657
00:00:16.536138 VMMDev: Guest Log: 00:00:00.000292 main Package type: LINUX_64BITS_GENERIC
00:00:16.536769 VMMDev: Guest Log: 00:00:00.000933 main 5.2.32 r132073 started. Verbose level = 0
00:00:16.537354 Guest Control: GUEST_MSG_REPORT_FEATURES: 0x1, 0x8000000000000000
00:00:26.538370 VMMDev: Guest Log: 00:00:10.002511 timesync vgsvcTimeSyncWorker: Radical guest time change: -3 589 017 775 000ns (GuestNow=1 575 486 415 641 299 000 ns GuestLast=1 575 490 004 659 074 000 ns fSetTimeLastLoop=true )
00:00:40.329289 VMMDev: Guest Log: CVMFS is ok
00:00:40.484761 VMMDev: Guest Log: Mounting shared directory

'Bad' tasks stop (they are running, but not doing anything) before the line: 00:00:40.329289 VMMDev: Guest Log: CVMFS is ok

After I abort the 'bad' task, the next task runs normally and I get 'VMMDev: Guest Log: CVMFS is ok'.

Then the next task stops again and I have to abort it, and so on and so on ...

Can anyone help me?
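
To tell a 'normal' start from a 'bad' one without reading the whole log, the two marker lines can be compared by their timestamps. The sketch below is only an illustration in Python: the function name is made up, and it assumes log lines that start with an HH:MM:SS.ffffff uptime stamp like the excerpt above.

```python
import re

# Uptime stamp followed by the message, e.g. "00:00:12.577408 VMMDev: Guest Log: ..."
LINE_RE = re.compile(r"^(\d+):(\d{2}):(\d{2}\.\d+) (.*)$")

def cvmfs_check_delay(vbox_log_text):
    """Seconds from 'Checking CVMFS...' to 'CVMFS is ok', or None if either is missing."""
    started = finished = None
    for raw in vbox_log_text.splitlines():
        match = LINE_RE.match(raw.strip())
        if not match:
            continue
        hours, minutes, seconds, message = match.groups()
        stamp = int(hours) * 3600 + int(minutes) * 60 + float(seconds)
        if "Guest Log: Checking CVMFS..." in message and started is None:
            started = stamp
        elif "Guest Log: CVMFS is ok" in message and started is not None:
            finished = stamp
            break
    if started is None or finished is None:
        return None  # the check never started or never reported OK
    return finished - started

# The two marker lines from the normal task above.
GOOD = """\
00:00:12.577408 VMMDev: Guest Log: Checking CVMFS...
00:00:40.329289 VMMDev: Guest Log: CVMFS is ok
"""
# A 'bad' task as described: the check starts but the OK line never arrives.
BAD = "00:00:12.577408 VMMDev: Guest Log: Checking CVMFS...\n"

print("good task delay:", cvmfs_check_delay(GOOD))
print("bad task delay:", cvmfs_check_delay(BAD))
```

A result of None long after the VM has started would match the 'bad' case described in this post.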

maeax
Joined: 2 May 07
Posts: 2071
Credit: 156,100,795
RAC: 103,685
Message 40787 - Posted: 5 Dec 2019, 6:54:20 UTC

Hi Marjan, welcome,
are you using Wi-Fi for this connection, or is there a problem with your ISP?
ATLAS needs a reliable network connection to work well.
Yeti's Checklist in the ATLAS folder is one place to find help.
You can also limit the host to a single ATLAS task to see whether that works well.

Marjan
Joined: 25 May 14
Posts: 6
Credit: 3,633,724
RAC: 0
Message 40794 - Posted: 5 Dec 2019, 10:06:22 UTC - in response to Message 40787.  

Hi maeax,

Thank you for your reply. I will check my network settings later at home, because I'm at work now. I'm not sure when I will be able to run some tests, because I don't think I have a lot of ATLAS tasks available.

But I found a temporary 'emergency' solution yesterday:

When I get 'CVMFS is OK', I suspend the current task and start another one. When I get 'CVMFS is OK' from the new task, I suspend it as well, and so on. After 5 - 10 suspended tasks I resume all of them (at the same time) and all of them finish properly without any problem.

Regards.

Marjan
Joined: 25 May 14
Posts: 6
Credit: 3,633,724
RAC: 0
Message 40822 - Posted: 6 Dec 2019, 21:39:39 UTC - in response to Message 40794.  

Hi maeax,

I checked my settings against Yeti's checklist. I found that boinc.exe was not allowed incoming connections (it had outgoing connections only).
I modified the settings in my firewall and now I'm waiting to see the effect ...

Thanks again for your advice.

Regards.

Marjan
Joined: 25 May 14
Posts: 6
Credit: 3,633,724
RAC: 0
Message 40852 - Posted: 8 Dec 2019, 17:48:53 UTC - in response to Message 40822.  

No improvement. I think that if the previous ATLAS task is still uploading, the VM for the next ATLAS task doesn't start properly: it hangs (there is no 'VMMDev: Guest Log: CVMFS is ok') and the ATLAS job does not start at all.

Regards.

maeax
Joined: 2 May 07
Posts: 2071
Credit: 156,100,795
RAC: 103,685
Message 40854 - Posted: 8 Dec 2019, 18:34:58 UTC - in response to Message 40852.  

Hi Marjan,
you have many tasks waiting to run.
BOINC works out by itself when it is time for the next download.
I don't know how you managed to have so many tasks waiting.
You can check your prefs for each project (Atlas, CMS, Theory...) and reduce the number of tasks.
Your Atlas-tasks finished well today.

Marjan
Joined: 25 May 14
Posts: 6
Credit: 3,633,724
RAC: 0
Message 40877 - Posted: 9 Dec 2019, 16:57:18 UTC - in response to Message 40854.  

Hi maeax,

I set the computing preferences to store at least 5 days of work.

How can I set prefs for each LHC project individually?

All ATLAS jobs finished well because I set all of them to 'SUSPEND', and now I 'RESUME' them one by one once the last finished ATLAS job has completed its UPLOAD.

I know it isn't an ideal solution, but it is the best I have as far as I know. ;-)

There is also the other solution I described before, which allows me to be away from my computer for a long time. But the preparation takes one or more hours ...

Regards.

computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2386
Credit: 222,930,147
RAC: 137,650
Message 40880 - Posted: 9 Dec 2019, 19:49:09 UTC - in response to Message 40877.  

Can you post some details regarding your network?

Is your host connected via wi-fi or cable (what speed)?

What bandwidth do you get from your ISP?
Upload?
Download?
Ping timing, e.g. to lhcathome.cern.ch?



How many vbox tasks do you start/run concurrently?

zepingouin
Joined: 7 Jan 07
Posts: 41
Credit: 15,959,427
RAC: 271
Message 40881 - Posted: 9 Dec 2019, 20:12:56 UTC - in response to Message 40123.  
Last modified: 9 Dec 2019, 20:17:25 UTC

computezrmle wrote:
BOINC uses 2 main factors to calculate estimated runtimes (as well as credits):
- estimated GFLOPS for a task
- peak GFLOPS of a computer

Although it is a BOINC recommendation to estimate the task's GFLOPS as accurately as possible before the server sends it to a client, ATLAS always uses a fixed value.


I have 2 hosts for which that's true, but the 3rd has variable GFLOPS:
Host #1 has 23.08 GFLOPS for every task
Host #2 has 55.01 GFLOPS for every task
Host #3 has 35.77, 12.06, 9.04, 6.03 and 3.01 GFLOPS

What could be the reason for the third host to have different figures?
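
To show how the two factors in the quote interact, here is a deliberately simplified Python sketch: estimated runtime as the task's estimated work divided by the host's speed. The real BOINC client and server apply further corrections, the function name is made up, and the task size below is a hypothetical placeholder because the fixed value ATLAS uses is not given in this thread.

```python
def estimated_runtime_seconds(task_gflop_estimate, host_gflops):
    """Simplified estimate: work in GFLOP divided by host speed in GFLOPS."""
    if host_gflops <= 0:
        raise ValueError("host_gflops must be positive")
    return task_gflop_estimate / host_gflops

# Hypothetical fixed task size, for illustration only.
TASK_GFLOP = 1_000_000.0

# GFLOPS figures reported in this post.
for gflops in (23.08, 55.01, 35.77, 3.01):
    hours = estimated_runtime_seconds(TASK_GFLOP, gflops) / 3600
    print(f"{gflops:6.2f} GFLOPS -> about {hours:.1f} h estimated")
```

With a fixed task size, a host whose reported GFLOPS figure changes between tasks will see its estimated runtimes change by the same factor.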

Crystal Pellet
Volunteer moderator
Volunteer tester
Joined: 14 Jan 10
Posts: 1268
Credit: 8,421,616
RAC: 2,139
Message 40882 - Posted: 9 Dec 2019, 21:27:07 UTC - in response to Message 40881.  

Host #3 has 35.77, 12.06, 9.04, 6.03 and 3.01 GFLOPS

What could be the reason for the third host to have different figures?
You have probably changed the preference set (home, school, work) for that host a few times.
The GFLOPS values you gave are for 1, 2, 3, 4 and 16 threads, and they come from your '# of CPUs' preference.
Maybe you can correct that locally with an app_config.xml.

zepingouin
Joined: 7 Jan 07
Posts: 41
Credit: 15,959,427
RAC: 271
Message 40888 - Posted: 10 Dec 2019, 12:15:28 UTC - in response to Message 40882.  

Preferences are the same and I have not changed settings for a while.
However, I noticed that the BOINC client version is 7.6.33 on the host with multiple GFLOPS values, while the two other hosts are using version 7.9.3.
BOINC 7.6.33 is the release for Debian Stretch 9.11.
BOINC 7.9.3 is the release for Ubuntu Bionic Beaver 18.04.3 LTS.
I will try the Stretch backport, which is at version 7.10.2, and see whether the behaviour is still the same.
©2024 CERN