Daylight saving time ended and all running Theory tasks were aborted by the client
Author | Message |
---|---|
Joined: 28 Sep 04 Posts: 780 Credit: 60,001,110 RAC: 47,111 |
During the night, daylight saving time ended and clocks were set back one hour at 4:00. All Theory tasks that were active at the time were aborted by the client. The section of stderr below shows what happened. 2018-10-28 03:06:07 (6268): Guest Log: [INFO] New Job Starting in slot1 2018-10-28 03:06:07 (6268): Guest Log: [INFO] Condor JobID: 477556.291 in slot1 2018-10-28 03:06:12 (6268): Guest Log: [INFO] MCPlots JobID: 46980559 in slot1 2018-10-28 03:55:25 (6268): Guest Log: [INFO] Job finished in slot1 with 0. 2018-10-28 03:56:02 (6268): Guest Log: [INFO] New Job Starting in slot1 2018-10-28 03:56:02 (6268): Guest Log: [INFO] Condor JobID: 477846.199 in slot1 2018-10-28 03:56:08 (6268): Guest Log: [INFO] MCPlots JobID: 47001352 in slot1 2018-10-28 03:07:59 (6268): VM Heartbeat file specified, but missing heartbeat. 2018-10-28 03:07:59 (6268): Capturing screenshot. 2018-10-28 03:08:00 (6268): Screenshot completed. 2018-10-28 03:08:00 (6268): Powering off VM. 2018-10-28 03:13:01 (6268): VM did not power off when requested. 2018-10-28 03:13:01 (6268): VM was successfully terminated. 2018-10-28 03:13:01 (6268): Deregistering VM. (boinc_fbe6af95e6090fad, slot#3) 2018-10-28 03:13:01 (6268): Removing network bandwidth throttle group from VM. 2018-10-28 03:13:01 (6268): Removing VM from VirtualBox. [edit] ATLAS tasks were not affected. |
Joined: 18 Dec 15 Posts: 1908 Credit: 144,950,792 RAC: 82,053 |
Not only Theory tasks were terminated. I also had quite a number of LHCb tasks running, and they were terminated too :-( (It's high time to end this nonsense of changing the time twice a year.) |
Joined: 2 May 07 Posts: 2277 Credit: 178,709,076 RAC: 100,489 |
I voted against this summer-time arrangement. More than 4 million EU citizens voted. Hopefully it will be history by next year. Yes, ATLAS had no problems. |
Joined: 13 Apr 18 Posts: 443 Credit: 8,438,885 RAC: 0 |
I didn't lose any tasks last night, neither Theory nor ATLAS. As for LHCb... well... I no longer waste resources on LHCb. Why was I immune? I bet it has something to do with the system clocks on my hosts being set to network time rather than local time. When local time changed last night, systems using the local clock shifted back 1 hour, so the heartbeat appeared to be 1 hour late, so BOINC terminated the tasks. Network time does NOT change twice a year, so last night the heartbeat on my systems did NOT appear to be 1 hour late and BOINC did NOT terminate tasks. I seem to recall Windoze systems are by default configured with the system clock set to local time, with an option to set it to network time. Linux systems by default configure the system clock to follow network time. All my hosts are Linux. Nothing wrong with daylight savings when ya know how to work a clock :) Prognostication: when local clocks leap ahead in the spring, any host with its system clock set to follow the local clock will again have tasks cancelled, because BOINC will freak when it sees a heartbeat from a task that appears to have travelled 1 hour back in time. Those who don't learn from the past are doomed to repeat their mistakes in the future. Why did Harri's ATLAS tasks not get terminated? Because he runs them on a host that has its system clock configured to follow network time? Did his Theory tasks fail because they run on a host with the system clock set to follow the local clock? |
Joined: 28 Sep 04 Posts: 780 Credit: 60,001,110 RAC: 47,111 |
Both my machines at home did change their clocks (Win7 & Win10), so no network time is in use here. I don't see a network time setting available, at least not under that name. |
Joined: 24 Oct 04 Posts: 1234 Credit: 79,798,352 RAC: 75,861 |
|
Joined: 13 Apr 18 Posts: 443 Credit: 8,438,885 RAC: 0 |
"Both my machines at home did change their clocks (win7 & win10), so no network time in use here. I don't see network time setting available, at least not with that name." Both clocks changed but ATLAS tasks didn't fail... interesting. Perhaps there is another cause than the one I suggested. I am probably using the wrong name. I wish I could help you find the setting, but I don't have any Win machines here to explore, just Linux. It's been several years since I've done much with Windoze. As I recall, the setting is buried ~20 clicks deep to discourage users from playing with it. |
Joined: 13 Apr 18 Posts: 443 Credit: 8,438,885 RAC: 0 |
Well then what's going on? I just got off the phone with Toddler Tyrant, he claims he has proof it's because Killary is messing with the Euro clocks. <edit> He just called me back. Says he didn't say that. Says he knows a guy who knows a guy who says his dog has the proof and he insists it's not the same guy who photoshopped the pics of his inauguration crowd. |
Joined: 26 Oct 18 Posts: 110 Credit: 5,268,443 RAC: 31,648 |
On Windows 10 (17134.376) the Internet time setting can be found here: 1. Right-click the clock on the taskbar 2. Click 'Adjust date/time' 3. Click 'Additional date, time & regional settings' 4. Click 'Set the time and date' 5. Choose the 'Internet Time' tab. My computer says: "This computer is set to automatically synchronize with 'time.windows.com'" and "This computer is set to automatically synchronize on a scheduled basis." If I click 'Change settings', the only options are to check the box 'Synchronize with an Internet time server' and to set the time server to time.windows.com or time.nist.gov. There's also an 'Update now' button. I don't know how often this synchronization is done in the background on Windows 10. On Windows 7 the setting can be found the same way. The Windows 7 info says: "Your clock is typically updated once a week and needs to be connected to the Internet for the synchronization to occur." I had one LHCb task running and it errored out during the daylight saving time change. |
Joined: 13 Apr 18 Posts: 443 Credit: 8,438,885 RAC: 0 |
@ Richie_unstable Thanks for that. @ Magic Thanks for the wiki article. It clears up some confusion. I assumed clocks in N. America changed this weekend along with European clocks. Now I see ours don't change until November 4. OK. Here's the wager. I bet 1 barrel of Kokanee (the finest beer on the planet) that when our clocks change on November 4, my hosts won't fail any tasks with the heartbeat-related error or any error other than those attributable to project infrastructure failure. Any takers? Here's your opportunity, computezrme. |
Joined: 18 Dec 15 Posts: 1908 Credit: 144,950,792 RAC: 82,053 |
This heartbeat thing is a big headache anyway. Some time ago, someone here gave a thorough technical explanation of how it works. And so it became clear to me why every so often my VM tasks fail on the two notebooks that are connected via WLAN, once the WLAN connection gets interrupted for a second or two. As a consequence, again and again VM tasks that have run for many hours and are almost finished suddenly stop due to "heartbeat missing". Damned thing. |
Joined: 2 May 07 Posts: 2277 Credit: 178,709,076 RAC: 100,489 |
"I voted against this summer-time arrangement. More than 4 million EU citizens voted." This WU started during the window of the DST change, used 3 days of CPU time, and got a confirmation error: https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=102386219 RDP showed successful work of 50 collisions on every CPU. The task had 4 CPUs working. So there must be something else going wrong. |
Joined: 18 Dec 15 Posts: 1908 Credit: 144,950,792 RAC: 82,053 |
"So there must be something else going wrong." I would say that as long as the LHC people don't tell us what the problem really was, we can only guess. Further, I don't remember having had this problem in previous years on the dates of the time change. Maybe the technical configuration of the tasks was different then, who knows. |
Joined: 28 Sep 04 Posts: 780 Credit: 60,001,110 RAC: 47,111 |
"On Windows 10 (17134.376) Internet time setting can be found here:" That's the way I have both my Win7 and Win10 computers set, to update the time from the internet. But this setting doesn't mean that the time would not change when DST starts or ends. If you choose to change your time zone there, though, you can find a setting for whether the computer should follow DST automatically or not. Anyway, I prefer the computer to show the actual time. |
Joined: 13 Apr 18 Posts: 443 Credit: 8,438,885 RAC: 0 |
"But this setting doesn't mean that the time would not change when DST starts or ends." True. And now I see that the explanation I proposed earlier is incorrect. That explanation was based on a discussion I read years ago on Stack Overflow about why the system clock gets messed up on dual-boot (Windows <-> Linux) systems when switching between OS's. I recalled the facts incorrectly and so came up with a partially incorrect explanation for why your tasks failed. The part that is wrong is where I claimed Linux is immune to the problem. I won't be surprised if my Linux hosts lose tasks when the time changes here in N. America on Nov. 4 and I am forced to give up a barrel of beer :-( |
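
The timestamps in the stderr excerpt from the first post can be used to illustrate the clock arithmetic debated in this thread. The following is a minimal Python sketch, not BOINC or vboxwrapper code. It assumes the host's offset went from UTC+3 to UTC+2 (consistent with clocks going back at 4:00 local time), and the helper names `naive_elapsed` and `aware_elapsed` are made up for illustration. It compares the last guest log line that was apparently written before the change with the first wrapper line written after it.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the host moved from UTC+3 (summer time) to UTC+2 (winter time).
SUMMER = timezone(timedelta(hours=3))
WINTER = timezone(timedelta(hours=2))

FMT = "%Y-%m-%d %H:%M:%S"


def naive_elapsed(earlier: str, later: str) -> float:
    """Seconds between two local timestamps, ignoring any UTC offset."""
    a = datetime.strptime(earlier, FMT)
    b = datetime.strptime(later, FMT)
    return (b - a).total_seconds()


def aware_elapsed(earlier: str, earlier_tz: timezone,
                  later: str, later_tz: timezone) -> float:
    """Seconds between two local timestamps, each tagged with its UTC offset."""
    a = datetime.strptime(earlier, FMT).replace(tzinfo=earlier_tz)
    b = datetime.strptime(later, FMT).replace(tzinfo=later_tz)
    return (b - a).total_seconds()


# Timestamps copied from the stderr excerpt in the first post.
last_before_change = "2018-10-28 03:56:08"  # "MCPlots JobID: 47001352 in slot1"
first_after_change = "2018-10-28 03:07:59"  # "missing heartbeat"

naive = naive_elapsed(last_before_change, first_after_change)
aware = aware_elapsed(last_before_change, SUMMER, first_after_change, WINTER)

print(f"naive local difference: {naive:.0f} s")
print(f"offset-aware difference: {aware:.0f} s")
```

Subtracting the raw local timestamps gives a negative interval, while attaching the assumed offsets gives roughly 12 minutes of real elapsed time. Whether the wrapper's heartbeat check actually trips over this is not confirmed anywhere in the thread; the sketch only shows why comparing local wall-clock times across the change is unreliable.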