Message boards : ATLAS application : Bad WUs?

maeax

Joined: 2 May 07
Posts: 2090
Credit: 158,903,888
RAC: 125,478
Message 47253 - Posted: 13 Sep 2022, 16:52:52 UTC - in response to Message 47178.  
Last modified: 13 Sep 2022, 16:54:19 UTC

Guest Log: Checking CVMFS... and no response
maeax had the same problem some time ago and his problem was proxy settings, if I remember correctly.
Hm, I am not using a proxy.
Furthermore, I have just realized that on my other computers running Theory tasks, the tasks downloaded within the past few hours don't run either. There, too:
"Guest Log: Probing /cvmfs/sft.cern.ch... Failed!"
https://lhcathome.cern.ch/lhcathome/result.php?resultid=364413464

I have not made any changes on any of my computers, so I guess the problem must be on CERN's side?

I also had this problem last night with 50 ATLAS tasks.
This morning, running only one task for testing, there were no more problems.
I don't know why.
https://lhcathome.cern.ch/lhcathome/results.php?userid=75468&offset=0&show_names=0&state=6&appid=
ID: 47253
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert

Joined: 15 Jun 08
Posts: 2401
Credit: 225,536,525
RAC: 122,366
Message 47254 - Posted: 13 Sep 2022, 17:46:58 UTC

https://lhcathome.cern.ch/lhcathome/result.php?resultid=365207008
00:37:46.333756 Changing the VM state from 'RUNNING' to 'GURU_MEDITATION'
00:37:46.334111 VM: Raising runtime error 'HostMemoryLow' (fFlags=0x2)
00:37:46.334328 Console: Machine state changed to 'GuruMeditation'

This is what VirtualBox wrote to the logfile.
Believe it or not, errors like this can't be solved by the project.
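For anyone who wants to check their own VirtualBox log for the same pattern, here is a minimal sketch in Python. It assumes the log lines look exactly like the three quoted above; the function name `find_vm_errors` and the `SAMPLE` string are illustrative only and not part of VirtualBox or BOINC.

```python
import re

# Matches lines such as:
# 00:37:46.334111 VM: Raising runtime error 'HostMemoryLow' (fFlags=0x2)
RUNTIME_ERROR = re.compile(r"^(\S+)\s+VM: Raising runtime error '([^']+)'")

# Matches lines such as:
# 00:37:46.333756 Changing the VM state from 'RUNNING' to 'GURU_MEDITATION'
STATE_CHANGE = re.compile(
    r"^(\S+)\s+Changing the VM state from '([^']+)' to '([^']+)'"
)


def find_vm_errors(log_text):
    """Return (timestamp, kind, detail) tuples for runtime errors and
    transitions into GURU_MEDITATION found in the given log text."""
    events = []
    for raw_line in log_text.splitlines():
        line = raw_line.strip()
        match = RUNTIME_ERROR.match(line)
        if match:
            events.append((match.group(1), "runtime_error", match.group(2)))
            continue
        match = STATE_CHANGE.match(line)
        if match and match.group(3) == "GURU_MEDITATION":
            events.append((match.group(1), "state_change", match.group(3)))
    return events


# The three lines quoted in this post, used as sample input.
SAMPLE = """
00:37:46.333756 Changing the VM state from 'RUNNING' to 'GURU_MEDITATION'
00:37:46.334111 VM: Raising runtime error 'HostMemoryLow' (fFlags=0x2)
00:37:46.334328 Console: Machine state changed to 'GuruMeditation'
"""

if __name__ == "__main__":
    for timestamp, kind, detail in find_vm_errors(SAMPLE):
        print(timestamp, kind, detail)
```

A 'HostMemoryLow' hit next to a GURU_MEDITATION transition points at the host running short of memory rather than at the task itself.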
ID: 47254
maeax

Joined: 2 May 07
Posts: 2090
Credit: 158,903,888
RAC: 125,478
Message 47257 - Posted: 14 Sep 2022, 5:32:10 UTC - in response to Message 47254.  
Last modified: 14 Sep 2022, 5:36:41 UTC

I saw that, but why? Now it is WCG and only some ATLAS tasks.
Erich also wrote that nothing had changed.
By the way, Guru Meditation is not a message from Windows 11 Pro.
Both Threadripper 3995?
ID: 47257
maeax

Joined: 2 May 07
Posts: 2090
Credit: 158,903,888
RAC: 125,478
Message 47272 - Posted: 21 Sep 2022, 9:02:42 UTC - in response to Message 47257.  

When CVMFS is not connected properly, ATLAS keeps running, running, running...
I also saw this yesterday with Squid on the fast Threadripper!
https://lhcathome.cern.ch/lhcathome/results.php?hostid=10795955&offset=0&show_names=0&state=6&appid=

2022-09-21 08:44:10 (9488): Guest Log: int13_harddisk: function 02, unmapped device for ELDL=8f
2022-09-21 08:44:18 (9488): Guest Log: vgdrvHeartbeatInit: Setting up heartbeat to trigger every 2000 milliseconds
2022-09-21 08:44:18 (9488): Guest Log: vboxguest: misc device minor 58, IRQ 20, I/O port d020, MMIO at 00000000f0400000 (size 0x400000)
2022-09-21 08:44:26 (9488): Guest Log: Checking CVMFS...
2022-09-21 08:44:32 (9488): Guest Log: VBoxService 5.2.32 r132073 (verbosity: 0) linux.amd64 (Jul 12 2019 10:32:28) release log
2022-09-21 08:44:32 (9488): Guest Log: 00:00:00.000194 main Log opened 2022-09-21T08:44:28.948634000Z
2022-09-21 08:44:32 (9488): Guest Log: 00:00:00.000294 main OS Product: Linux
2022-09-21 08:44:32 (9488): Guest Log: 00:00:00.000316 main OS Release: 3.10.0-957.27.2.el7.x86_64
2022-09-21 08:44:32 (9488): Guest Log: 00:00:00.000333 main OS Version: #1 SMP Mon Jul 29 17:46:05 UTC 2019
2022-09-21 08:44:32 (9488): Guest Log: 00:00:00.000351 main Executable: /opt/VBoxGuestAdditions-5.2.32/sbin/VBoxService
2022-09-21 08:44:32 (9488): Guest Log: 00:00:00.000351 main Process ID: 1678
2022-09-21 08:44:32 (9488): Guest Log: 00:00:00.000351 main Package type: LINUX_64BITS_GENERIC
2022-09-21 08:44:32 (9488): Guest Log: 00:00:00.001225 main 5.2.32 r132073 started. Verbose level = 0
2022-09-21 08:44:42 (9488): Guest Log: 00:00:10.006615 timesync vgsvcTimeSyncWorker: Radical guest time change: -7 187 762 778 000ns (GuestNow=1 663 742 681 189 605 000 ns GuestLast=1 663 749 868 952 383 000 ns fSetTimeLastLoop=true )
2022-09-21 10:24:12 (9488): Status Report: Elapsed Time: '6000.000000'
2022-09-21 10:24:12 (9488): Status Report: CPU Time: '49.203125'
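The last two lines are the telling part: 6000 seconds elapsed but only about 49 seconds of CPU time. A minimal Python sketch for spotting this in a task's stderr output follows. It assumes the "Status Report" lines have exactly the format shown above; the function names and the 5% threshold are my own illustrative choices, not anything defined by BOINC or the project.

```python
import re

# Matches e.g.:  Status Report: Elapsed Time: '6000.000000'
STATUS = re.compile(r"Status Report: (Elapsed Time|CPU Time): '([0-9.]+)'")


def cpu_efficiency(stderr_text):
    """Return CPU time divided by elapsed time, using the most recent
    status report values found, or None if either value is missing."""
    values = {}
    for line in stderr_text.splitlines():
        match = STATUS.search(line)
        if match:
            # Later reports overwrite earlier ones.
            values[match.group(1)] = float(match.group(2))
    elapsed = values.get("Elapsed Time")
    cpu = values.get("CPU Time")
    if not elapsed or cpu is None:
        return None
    return cpu / elapsed


def looks_idle(stderr_text, threshold=0.05):
    """True if the task used less than `threshold` of its elapsed time
    as CPU time. The default threshold is an arbitrary assumption."""
    efficiency = cpu_efficiency(stderr_text)
    return efficiency is not None and efficiency < threshold


# The two status lines from the log above, used as sample input.
SAMPLE = """
2022-09-21 10:24:12 (9488): Status Report: Elapsed Time: '6000.000000'
2022-09-21 10:24:12 (9488): Status Report: CPU Time: '49.203125'
"""

if __name__ == "__main__":
    print(cpu_efficiency(SAMPLE), looks_idle(SAMPLE))
```

For multi-core tasks the CPU time can exceed the elapsed time, so a ratio well below 1 is only a hint that the VM is waiting on something (here, probably CVMFS), not proof.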
ID: 47272
Erich56

Joined: 18 Dec 15
Posts: 1688
Credit: 103,123,089
RAC: 124,326
Message 47294 - Posted: 26 Sep 2022, 5:35:02 UTC

When I got up this morning, I saw that on one of my computers, where 1-core ATLAS tasks normally run for between 5 and 6 hours, they had been running for about 11 hours. The console was no longer accessible, and under the task "Properties" (in the BOINC Manager) I saw that the CPU time was, as it should be, between 5 and 6 hours. So the computation had obviously finished the way it was supposed to, but the tasks themselves could not finish properly.
So I aborted them manually; the results can be seen here:

https://lhcathome.cern.ch/lhcathome/result.php?resultid=366355764
https://lhcathome.cern.ch/lhcathome/result.php?resultid=366355961

Does anyone have any idea what the problem was?

It is just a pity that with a total CPU time of >12 hours, the result was nil :-(
ID: 47294
maeax

Joined: 2 May 07
Posts: 2090
Credit: 158,903,888
RAC: 125,478
Message 47301 - Posted: 26 Sep 2022, 10:02:16 UTC - in response to Message 47294.  

Today some ATLAS tasks on Windows were also canceled.
Yesterday evening there were also a lot of connection errors.
So far today, no new errors have been seen.
ID: 47301
maeax

Joined: 2 May 07
Posts: 2090
Credit: 158,903,888
RAC: 125,478
Message 47540 - Posted: 17 Nov 2022, 11:42:08 UTC - in response to Message 47301.  

These are a few tasks from last night with:
2022-11-17 03:25:13 (39172): Guest Log: ATHENA_PROC_NUMBER=10
2022-11-17 03:25:13 (39172): Guest Log: *** Starting ATLAS job. (PandaID=5664632268 taskID=31189529) ***
2022-11-17 04:07:41 (39172): VM state change detected. (old = 'running', new = 'gurumeditation')
2022-11-17 04:07:41 (39172): Powering off VM.
2022-11-17 04:07:41 (39172): Deregistering VM. (boinc_d49ab12313b46efb, slot#7)
https://lhcathome.cern.ch/lhcathome/result.php?resultid=369267712
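To see how long the job ran before the VM fell over, the timestamps can be subtracted. A minimal Python sketch, assuming every relevant line starts with a `YYYY-MM-DD HH:MM:SS` timestamp as in the excerpt above; the function names are illustrative and not part of BOINC or vboxwrapper.

```python
from datetime import datetime


def parse_ts(line):
    """Parse the leading 'YYYY-MM-DD HH:MM:SS' timestamp of a log line."""
    return datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S")


def seconds_until_guru_meditation(stderr_text):
    """Return the seconds between the last 'Starting ATLAS job' line and
    the following change to 'gurumeditation', or None if not found."""
    start = None
    for raw_line in stderr_text.splitlines():
        line = raw_line.strip()
        if "Starting ATLAS job" in line:
            start = parse_ts(line)
        elif "new = 'gurumeditation'" in line and start is not None:
            return (parse_ts(line) - start).total_seconds()
    return None


# Lines copied from the excerpt above, used as sample input.
SAMPLE = """
2022-11-17 03:25:13 (39172): Guest Log: ATHENA_PROC_NUMBER=10
2022-11-17 03:25:13 (39172): Guest Log: *** Starting ATLAS job. (PandaID=5664632268 taskID=31189529) ***
2022-11-17 04:07:41 (39172): VM state change detected. (old = 'running', new = 'gurumeditation')
2022-11-17 04:07:41 (39172): Powering off VM.
"""

if __name__ == "__main__":
    print(seconds_until_guru_meditation(SAMPLE))
```

For the excerpt above, that is 42 minutes 28 seconds between the job start and the Guru Meditation.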
ID: 47540
maeax

Joined: 2 May 07
Posts: 2090
Credit: 158,903,888
RAC: 125,478
Message 47545 - Posted: 18 Nov 2022, 13:51:46 UTC - in response to Message 47540.  

Yesterday: three tasks, each with 10 CPUs, had been doing nothing since 17:00 UTC, so I canceled them.
Two more with 10 CPUs from the other PC, and one with 2 CPUs (8-hour runtime).
This information was deleted by a moderator.
ID: 47545
maeax

Joined: 2 May 07
Posts: 2090
Credit: 158,903,888
RAC: 125,478
Message 47557 - Posted: 23 Nov 2022, 22:04:47 UTC - in response to Message 47545.  

Computer ID 10797673
Run time 5 hours 4 min 39 sec
CPU time 8 sec
ID: 47557

©2024 CERN