21) Message boards : ATLAS application : Processor Time Locks Up Elapsed Time Continues to Climb (Message 43227)
Posted 17 Aug 2020 by keputnam
Post:
Double checked Yeti's check list - all good

Called my ISP and asked if there was anything they could do to reduce latency

They couldn't access the modem control panel from their end, and I couldn't access it from within my home network

They also said it was over 7 years old and offered to send me a new one, which should be here WED afternoon sometime

Let me get it hooked up and set "accept new tasks" here, and let's see what happens



Thanks for all the responses so far

Really appreciated
22) Message boards : ATLAS application : Processor Time Locks Up Elapsed Time Continues to Climb (Message 43225)
Posted 17 Aug 2020 by keputnam
Post:
Thanks, Guys


Guest Edition?

Not familiar with that term

Already at VBox 6.0.14 with Extension Pack 6.0.14


I'll review Yeti's checklist one more time
23) Message boards : ATLAS application : Processor Time Locks Up Elapsed Time Continues to Climb (Message 43199)
Posted 8 Aug 2020 by keputnam
Post:
Sorry for the delayed response. I was out of town most of the week



I'm not seeing what you describe, but here are a few lines from the VBox.log


00:00:10.586190 IEM: rdmsr(0x4e) -> #GP(0)
00:00:10.720435 APIC0: Switched mode to x2APIC
00:00:10.828486 PIT: mode=0 count=0x10000 (65536) - 18.20 Hz (ch=0)
00:00:10.863375 APIC1: Switched mode to x2APIC
00:00:10.863391 GIM: KVM: VCPU 1: Enabled system-time struct. at 0x000000014bf84040 - u32TscScale=0x9b26c76a i8TscShift=-1 uVersion=2 fFlags=0x1 uTsc=0x4d5156a93 uVirtNanoTS=0x176e06bcd
00:00:10.864397 IEM: rdmsr(0x4e) -> #GP(0)
00:00:12.276698 IEM: wrmsr(0xc90,0x0`000fffff) -> #GP(0)
00:00:13.344427 PIIX3 ATA: Ctl#0: RESET, DevSel=0 AIOIf=0 CmdIf0=0xc4 (-1 usec ago) CmdIf1=0x00 (-1 usec ago)
00:00:13.344508 PIIX3 ATA: Ctl#0: finished processing RESET
00:00:13.344924 PIIX3 ATA: Ctl#1: RESET, DevSel=0 AIOIf=0 CmdIf0=0xa1 (-1 usec ago) CmdIf1=0x00 (-1 usec ago)
00:00:13.344994 PIIX3 ATA: Ctl#1: finished processing RESET
00:00:15.339531 NAT: Link up
00:00:17.175622 VMMDev: Guest Additions information report: Version 5.2.32 r132073 '5.2.32'
00:00:17.175672 VMMDev: Guest Additions information report: Interface = 0x00010004 osType = 0x00053100 (Linux >= 2.6, 64-bit)
00:00:17.175813 VMMDev: Guest Additions capability report: (0x0 -> 0x0) seamless: no, hostWindowMapping: no, graphics: no
00:00:17.176096 VMMDev: Guest reported fixed hypervisor window at 00001800000 LB 0x2400000 (rc=VINF_SUCCESS)
00:00:17.176121 VMMDev: vmmDevReqHandler_HeartbeatConfigure: No change (fHeartbeatActive=false)
00:00:17.176141 VMMDev: Heartbeat flatline timer set to trigger after 4 000 000 000 ns
00:00:17.176199 VMMDev: Guest Log: vgdrvHeartbeatInit: Setting up heartbeat to trigger every 2000 milliseconds
00:00:17.176806 VMMDev: Guest Log: vboxguest: misc device minor 58, IRQ 20, I/O port d020, MMIO at 00000000f0400000 (size 0x400000)
00:00:17.576133 Display::i_handleDisplayResize: uScreenId=0 pvVRAM=000000000ac60000 w=800 h=600 bpp=32 cbLine=0xC80 flags=0x1 origin=0,0
00:00:21.852605 NAT: IPv6 not supported
00:00:22.141405 NAT: DHCP offered IP address 10.0.2.15
00:00:22.141752 NAT: DHCP offered IP address 10.0.2.15
00:00:24.308484 VMMDev: Guest Log: Checking CVMFS...
00:00:35.698186 VMMDev: Guest Log: VBoxService 5.2.32 r132073 (verbosity: 0) linux.amd64 (Jul 12 2019 10:32:28) release log
00:00:35.698208 VMMDev: Guest Log: 00:00:00.000211 main Log opened 2020-08-08T12:40:50.916026000Z
00:00:35.698272 VMMDev: Guest Log: 00:00:00.000313 main OS Product: Linux
00:00:35.698310 VMMDev: Guest Log: 00:00:00.000354 main OS Release: 3.10.0-957.27.2.el7.x86_64
00:00:35.698344 VMMDev: Guest Log: 00:00:00.000389 main OS Version: #1 SMP Mon Jul 29 17:46:05 UTC 2019
00:00:35.698383 VMMDev: Guest Log: 00:00:00.000424 main Executable: /opt/VBoxGuestAdditions-5.2.32/sbin/VBoxService
00:00:35.698391 VMMDev: Guest Log: 00:00:00.000426 main Process ID: 1490
00:00:35.698396 VMMDev: Guest Log: 00:00:00.000427 main Package type: LINUX_64BITS_GENERIC
00:00:35.703679 VMMDev: Guest Log: 00:00:00.005729 main 5.2.32 r132073 started. Verbose level = 0
00:00:35.704600 Guest Control: GUEST_MSG_REPORT_FEATURES: 0x1, 0x8000000000000000
00:00:45.706543 VMMDev: Guest Log: 00:00:10.008554 timesync vgsvcTimeSyncWorker: Radical guest time change: 25 211 221 087 000ns (GuestNow=1 596 915 662 144 209 000 ns GuestLast=1 596 890 450 923 122 000 ns fSetTimeLastLoop=true )
00:10:11.861076 Display::i_handleDisplayResize: uScreenId=0 pvVRAM=000000000ac60000 w=800 h=600 bpp=0 cbLine=0xC80 flags=0x5 origin=0,0
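For anyone reading the "Radical guest time change" line in the excerpt above, here is a minimal sketch that pulls the three nanosecond values out of that line and converts the jump to hours. The function names are made up for illustration, and the regular expression only assumes the layout shown in this one log line.

```python
import re

# Timesync message copied from the VBox.log excerpt above.
LOG_LINE = (
    "timesync vgsvcTimeSyncWorker: Radical guest time change: "
    "25 211 221 087 000ns (GuestNow=1 596 915 662 144 209 000 ns "
    "GuestLast=1 596 890 450 923 122 000 ns fSetTimeLastLoop=true )"
)


def grouped_ns(text):
    """Turn a space-grouped number such as '25 211 221 087 000' into an int."""
    return int(text.replace(" ", ""))


def parse_time_change(line):
    """Return (reported_jump_ns, guest_now_ns, guest_last_ns) from a timesync line.

    Hypothetical helper: it assumes the exact field order seen in the log above.
    """
    match = re.search(
        r"Radical guest time change:\s*([\d ]+?)\s*ns\s*"
        r"\(GuestNow=([\d ]+?)\s*ns\s+GuestLast=([\d ]+?)\s*ns",
        line,
    )
    if match is None:
        raise ValueError("not a 'Radical guest time change' line")
    return tuple(grouped_ns(group) for group in match.groups())


jump_ns, guest_now_ns, guest_last_ns = parse_time_change(LOG_LINE)
jump_hours = jump_ns / 1e9 / 3600  # nanoseconds -> seconds -> hours

print(jump_ns == guest_now_ns - guest_last_ns)
print(round(jump_hours, 2))
```

The reported jump equals GuestNow minus GuestLast and works out to roughly seven hours. A jump of that size may simply be the guest clock being corrected shortly after boot; that reading is an assumption, not something the log states.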
24) Message boards : ATLAS application : Processor Time Locks Up Elapsed Time Continues to Climb (Message 43179)
Posted 3 Aug 2020 by keputnam
Post:
??

Is that file structure on LINUX, because it doesn't exist on my Win10 machine


I have a Boinc Data Folder

Under that I have Projects and Slots directories among others

Current Theory job is running at CPU 00:00:52 Elapsed 00:34:07 in slot 8

Neither the “cernvm\shared” folder nor the runRivet.log file exists anywhere in the DATA directory
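To put a number on "CPU time stalls while elapsed time climbs", here is a minimal sketch that turns the two clock readings into a CPU-to-elapsed ratio, using the Theory job figures quoted above. The helper names and the 10% / 10-minute thresholds are assumptions chosen for illustration, not anything BOINC or the project defines.

```python
def hms_to_seconds(text):
    """Convert 'HH:MM:SS' (or 'MM:SS') to a number of seconds."""
    seconds = 0
    for part in text.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def cpu_ratio(cpu, elapsed):
    """CPU time divided by elapsed time; 0.0 if no time has elapsed yet."""
    elapsed_s = hms_to_seconds(elapsed)
    if elapsed_s == 0:
        return 0.0
    return hms_to_seconds(cpu) / elapsed_s


def looks_stalled(cpu, elapsed, min_ratio=0.10, min_elapsed_s=600):
    """Hypothetical rule of thumb: flag a task that has run for a while
    but has accumulated very little CPU time relative to wall-clock time."""
    long_enough = hms_to_seconds(elapsed) >= min_elapsed_s
    return long_enough and cpu_ratio(cpu, elapsed) < min_ratio


# Values from the post: CPU 00:00:52, Elapsed 00:34:07
ratio = cpu_ratio("00:00:52", "00:34:07")
print(f"{ratio:.3f}")
print(looks_stalled("00:00:52", "00:34:07"))
```

A healthy single-threaded task should sit somewhere near 1.0 once it is past startup, and a multi-threaded one can go above that, so a ratio of a few percent after half an hour stands out.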
25) Message boards : ATLAS application : Processor Time Locks Up Elapsed Time Continues to Climb (Message 43176)
Posted 3 Aug 2020 by keputnam
Post:
Gunde,


That's the point

The CPU time stops increasing and never starts up again
26) Message boards : ATLAS application : Processor Time Locks Up Elapsed Time Continues to Climb (Message 43173)
Posted 3 Aug 2020 by keputnam
Post:
Aaaaaaand the next Theory WU is at 3 hours CPU time and counting


sigh
27) Message boards : ATLAS application : Processor Time Locks Up Elapsed Time Continues to Climb (Message 43172)
Posted 3 Aug 2020 by keputnam
Post:
OK Tried Theory

Same thing

Well, the first two ran OK,

then with the rest processor time gets to a random point, then stops while elapsed time keeps going

I tried in both uni-thread and MT modes
28) Message boards : ATLAS application : Processor Time Locks Up Elapsed Time Continues to Climb (Message 43168)
Posted 2 Aug 2020 by keputnam
Post:
OK

I've turned off Atlas and selected Theory

It will take a day or two to clear the two Atlas WUs I've already got
29) Message boards : ATLAS application : Processor Time Locks Up Elapsed Time Continues to Climb (Message 43137)
Posted 30 Jul 2020 by keputnam
Post:
I've run ATLAS as a standalone project and under the LHC umbrella since SEPT of 2004

Had a hiccup several years ago when my logs were showing no connection to the server. Did some network setting tweaks and bumped my bandwidth with my ISP, and things settled down

Up until about 8 weeks ago I was chugging through WUs with no problems whatsoever

I have one concurrent task, two CPUs max, and the correct RAM settings in app_config. I have made no changes for over a year, until this started happening

I don't stop and start BOINC randomly


And as I said above:
The last 6 WUs that I have aborted had one other aborted WU for the same task

Last night I had one with 35 seconds of CPU and 3 and a half hours elapsed. Just for grins I left it running. Somewhere around 02:30 local (based on current CPU time), it took off and is still running
30) Message boards : ATLAS application : Processor Time Locks Up Elapsed Time Continues to Climb (Message 43130)
Posted 30 Jul 2020 by keputnam
Post:
Just a note for consideration


The last 6 WUs that I have aborted had one other aborted WU for the same task

Apparently I'm not the only one, eh?
31) Message boards : ATLAS application : Processor Time Locks Up Elapsed Time Continues to Climb (Message 43125)
Posted 29 Jul 2020 by keputnam
Post:
If it is a network issue, it is ONLY on ATLAS. I can upload/download all other projects I crunch for, and I can access all other web sites

I did have problems downloading three xxx.pool.1 files. Ended up aborting them

But I really don't think my CPU time problem has anything to do with network
32) Message boards : Number crunching : Download Server Offline? (Message 43120)
Posted 28 Jul 2020 by keputnam
Post:
Still get a failure to connect error
33) Message boards : Number crunching : Download Server Offline? (Message 43116)
Posted 28 Jul 2020 by keputnam
Post:
BOINC is attempting to download a new WU

7/28/2020 12:32:12 PM | LHC@home | Started download of
7/28/2020 12:37:17 PM | | Project communication failed: attempting access to reference site
7/28/2020 12:37:19 PM | | Internet access OK - project servers may be temporarily down.


Server Status page says the download server is GREEN
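For reference, a minimal sketch of how the gap between "Started download" and "Project communication failed" in the event log lines above can be measured. It assumes the `timestamp | project | message` layout and the US-style 12-hour timestamp exactly as pasted; `parse_event` is a made-up helper name.

```python
from datetime import datetime

# Event log lines copied from the post above (the file name is cut off in the original).
EVENT_LOG = """\
7/28/2020 12:32:12 PM | LHC@home | Started download of
7/28/2020 12:37:17 PM | | Project communication failed: attempting access to reference site
7/28/2020 12:37:19 PM | | Internet access OK - project servers may be temporarily down.
"""


def parse_event(line):
    """Split 'timestamp | project | message' into (datetime, project, message)."""
    stamp, project, message = (field.strip() for field in line.split("|", 2))
    when = datetime.strptime(stamp, "%m/%d/%Y %I:%M:%S %p")
    return when, project, message


events = [parse_event(line) for line in EVENT_LOG.splitlines() if line.strip()]
started = events[0][0]   # "Started download of ..."
failed = events[1][0]    # "Project communication failed ..."
wait_seconds = (failed - started).total_seconds()

print(wait_seconds)
```

The client waited about five minutes before giving up, which looks more like a timeout than an immediate refusal, though the log itself does not say which.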
34) Message boards : ATLAS application : Processor Time Locks Up Elapsed Time Continues to Climb (Message 43115)
Posted 28 Jul 2020 by keputnam
Post:
And the saga continues

3 valid WUs and 18 that I had to abort in the last 5 days

Anybody else have any suggestions?
35) Message boards : ATLAS application : Processor Time Locks Up Elapsed Time Continues to Climb (Message 43102)
Posted 24 Jul 2020 by keputnam
Post:
Again thanks for the response


RAM set back to 4800. The rest of my projects thank you

I have ATLAS max_concurrent set to 1 in the app_config file

I'll make the project prefs change. I've occasionally gotten a "waiting for memory" in BOINC Manager, but only when I've been doing other stuff on the computer

Latest WU clicking along nicely: 19:42 CPU time with 10:00 elapsed time
36) Message boards : ATLAS application : Processor Time Locks Up Elapsed Time Continues to Climb (Message 43099)
Posted 23 Jul 2020 by keputnam
Post:
Thanks for the response. Made the recommended change and even re-booted to make sure nothing hung around

First WU to run: 1 minute 54 seconds of CPU time over 3 hours of elapsed time
37) Message boards : ATLAS application : Processor Time Locks Up Elapsed Time Continues to Climb (Message 43093)
Posted 23 Jul 2020 by keputnam
Post:
In the last several weeks I've had a ton of WUs where the processor time freezes anywhere between one minute and 5 minutes while elapsed time continues to climb. This is confirmed by Resource Monitor. VBox monitor shows the job as "running"

They will stay like this til I abort them

When I look at my Tasks List, after aborting them, many show zero CPU or Elapsed time. I assure you they all had at least a half hour of elapsed time and, as I said, 1 to 5 minutes of CPU time

I was using the current version of VBox (6.1.12), so I down-levelled to the one on the project download page (6.0.14). Still happens

Doesn't happen every time, but often enough I thought I'd mention it

Anyone else seeing something similar?
38) Message boards : ATLAS application : Change in Credit? (Message 42001)
Posted 27 Mar 2020 by keputnam
Post:
Aaaaaaaaaand

Credit is back where it was

My last 5 WUs have all been in the 450-600 credit range


<shrug>
39) Message boards : ATLAS application : Change in Credit? (Message 41997)
Posted 24 Mar 2020 by keputnam
Post:
I don't remember changing anything there, certainly not maxcpus

I DO use an APP_CONFIG to control that, which is why I mentioned it before, and I've restarted BOINC several times recently (for service application). So the APP_CONFIG would override the options there, wouldn't it?


In any case, how can that possibly affect the credit awarded for an equivalent CPU time? Doesn't CPU time account for different numbers of "processors", as opposed to wall-clock time?
40) Message boards : ATLAS application : Change in Credit? (Message 41992)
Posted 24 Mar 2020 by keputnam
Post:
Really?

APP_CONFIG was last changed 2020/02/20

And <avg_ncpus>2.000000</avg_ncpus> for Atlas has been 2 since I started running it years ago

This vastly reduced credit started 2020/03/21



