1) Message boards : CMS Application : Pausing and resuming CMS tasks (Message 51454)
Posted 11 days ago by Profile Guy
Post:
ATLAS and CMS resumed - with confidence! (for some time..!)
Thank you.
2) Message boards : CMS Application : Pausing and resuming CMS tasks (Message 51384)
Posted 27 days ago by Profile Guy
Post:
Right... :-)
Thank you.
Thanks for the link. Just seeing the swathes of analysis at mcplots reinforces how much regard I have for everyone involved with the fantastic subatomic science in progress at CERN and elsewhere, and for all their collaborators. As a volunteer, no matter what, I feel part of that. It is, ultimately, excellent work. Love it. Thanks everyone.
3) Message boards : CMS Application : Pausing and resuming CMS tasks (Message 51381)
Posted 28 days ago by Profile Guy
Post:
Hello,
Thanks all for your descriptions of what's happening. It's all very enlightening.
Currently I've stopped downloading ATLAS and CMS because -
CMS and ATLAS do need an internet connection. When a task is paused/suspended the VM doesn't have one.
Only short network interruptions are allowed - shorter than one hour.
- and as I said - I often pause BOINC to do other things... (Six Track and Theory are still go though.)
Now.
Because the >1hr pause causes a failure to complete the ATLAS/CMS task, I "suspect" that I may be wasting the LHC@home project's time in some way, for example by making it send the task out again -
...CMS tends to grant credits even if errors happen.
This is caused by the fact that most errors are not clear enough to blame the user...
- and the project is only being polite by granting credit.

But am I wasting time? To elaborate: is the time I spend crunching simply contributing to a communal work unit result that's serviced by many clients? Has an incomplete task still assisted a bit? The granted credit suggests as much... In which case, of course, I'd resume downloading the mt ATLAS and CMS tasks.
Thank you sincerely for your continuing attention to my enquiries.
4) Message boards : CMS Application : Pausing and resuming CMS tasks (Message 51374)
Posted 8 Jan 2025 by Profile Guy
Post:
Thank you. You are very knowledgeable.
It sounds like a simple thing to fix though...

Ah. And it's only January... LOL
5) Message boards : CMS Application : Pausing and resuming CMS tasks (Message 51372)
Posted 8 Jan 2025 by Profile Guy
Post:
Thanks, that's informative.
I hope I'm not asking what's been clearly described elsewhere.
But does that mean that being out of contact with a >1hr pause causes synchronisation issues?
Which raises the question: is there a Symmetric Multi-Processor system somewhere managing the task simultaneously across multiple remote LHC@home users? Or not?
Perhaps a script to sync up my squid exists somewhere, but making one is beyond my skills.
What's going on, please?
Thanks.
6) Message boards : CMS Application : Pausing and resuming CMS tasks (Message 51369)
Posted 8 Jan 2025 by Profile Guy
Post:
As mentioned in this post (about how BOINC tries to estimate the time a task will take), pausing and resuming CMS vm tasks causes problems.
The BOINC benchmarks are not relevant since each VM runs internal benchmarks that are not reported back to BOINC.
...
Each pause/resume initiated by BOINC disturbs this calculation and may result in losing the last scientific job (18 h are a hard limit).
Since this is also not reported back to BOINC it is recommended to avoid pause/resume cycles.
Ah well that explains this happening when I paused and resumed a CMS task between 22:26 last night and 01:59 this morning -
<core_client_version>8.0.4</core_client_version>
<![CDATA[
<stderr_txt>
2025-01-07 12:33:26 (5920): vboxwrapper version 26207
2025-01-07 12:33:26 (5920): BOINC client version: 8.0.4
2025-01-07 12:33:27 (5920): Detected: VirtualBox VboxManage Interface (Version: 7.1.4)
...
...
2025-01-07 22:26:52 (5920): Stopping VM.
2025-01-07 22:26:56 (5920): Successfully stopped VM.
2025-01-08 01:59:49 (4231): vboxwrapper version 26207
2025-01-08 01:59:49 (4231): BOINC client version: 8.0.4
2025-01-08 01:59:52 (4231): Detected: VirtualBox VboxManage Interface (Version: 7.1.4)
2025-01-08 01:59:52 (4231): Detected: Heartbeat check (file: 'heartbeat' every 1200.000000 seconds)
2025-01-08 01:59:52 (4231): Guest Log: BIOS: VirtualBox 7.1.4
...
2025-01-08 01:59:52 (4231): Starting VM using VBoxManage interface. (boinc_257993d36581a549, slot#3)
2025-01-08 01:59:58 (4231): Successfully started VM. (PID = '4297')
2025-01-08 01:59:58 (4231): Reporting VM Process ID to BOINC.
2025-01-08 01:59:58 (4231): VM state change detected. (old = 'poweredoff', new = 'running')
2025-01-08 01:59:58 (4231): Detected: Web Application Enabled (http://localhost:47895)
2025-01-08 01:59:58 (4231): Status Report: Job Duration: '64800.000000'
2025-01-08 01:59:58 (4231): Status Report: Elapsed Time: '35938.000000'
2025-01-08 01:59:58 (4231): Status Report: CPU Time: '135895.160000'
2025-01-08 01:59:58 (4231): Preference change detected
2025-01-08 01:59:58 (4231): Setting CPU throttle for VM. (100%)
2025-01-08 01:59:58 (4231): Setting checkpoint interval to 600 seconds. (Higher value of (Preference: 60 seconds) or (Vbox_job.xml: 600 seconds))
2025-01-08 02:00:02 (4231): Guest Log: 09:52:47.542720 timesync vgsvcTimeSyncWorker: Radical host time change: 12 795 334 000 000ns (HostNow=1 736 301 601 618 000 000 ns HostLast=1 736 288 806 284 000 000 ns)
2025-01-08 02:00:12 (4231): Guest Log: 09:52:57.585671 timesync vgsvcTimeSyncWorker: Radical guest time change: 12 795 384 121 000ns (GuestNow=1 736 301 611 748 110 000 ns GuestLast=1 736 288 816 363 989 000 ns fSetTimeLastLoop=true )
2025-01-08 02:04:37 (4231): Guest Log: [INFO] glidein exited with return value 0.
2025-01-08 02:04:37 (4231): Guest Log: [INFO] Shutting Down.
2025-01-08 02:04:37 (4231): VM Completion File Detected.
2025-01-08 02:04:37 (4231): VM Completion Message: glidein exited with return value 0.
.
2025-01-08 02:04:37 (4231): Powering off VM.
2025-01-08 02:04:38 (4231): Successfully stopped VM.
2025-01-08 02:04:38 (4231): Deregistering VM. (boinc_257993d36581a549, slot#3)
2025-01-08 02:04:38 (4231): Removing network bandwidth throttle group from VM.
2025-01-08 02:04:38 (4231): Removing VM from VirtualBox.
2025-01-08 02:04:43 (4231): called boinc_finish(0)

</stderr_txt>
But I still got credit granted -
Task          Work unit     Computer      Sent            Time reported   Status                     Run        CPU           Credit    Application
                                                          or deadline                                time       time
---------------------------------------------------------------------------------------------------------------------------------------------------
418699676     229706420     10860321      7 Jan 2025      8 Jan 2025      Completed and validated    35,891.24  135,935.70    702.38    CMS Simulation v70.30 (vbox64_mt_mcore_cms)
                                                                                                                                        x86_64-pc-linux-gnu

Another task was downloaded (CMS_2779181_1736299900.108510_0).
But my PC power settings caused it to enter sleep mode instead of just blanking the screen. And, when I woke it up this morning -
...
2025-01-08 02:29:31 (5323): Guest Log: [INFO] CMS application starting. Check log files.
2025-01-08 02:32:51 (5323): VM state change detected. (old = 'running', new = 'paused')
2025-01-08 07:29:33 (5323): Error in resume VM for VM: -182
Command:
VBoxManage -q controlvm "boinc_52dab00a2f68e27b" resume
Output:
VBoxManage: error: Could not resume the machine execution (VERR_VM_INVALID_VM_STATE)
VBoxManage: error: Details: code VBOX_E_VM_ERROR (0x80bb0003), component ConsoleWrap, interface IConsole, callee nsISupports
VBoxManage: error: Context: "Resume()" at line 393 of file VBoxManageControlVM.cpp

2025-01-08 07:29:34 (5323): Guest Log: 00:04:00.037174 timesync vgsvcTimeSyncWorker: Radical host time change: 17 811 435 000 000ns (HostNow=1 736 321 373 456 000 000 ns HostLast=1 736 303 562 021 000 000 ns)
2025-01-08 07:29:34 (5323): VM state change detected. (old = 'paused', new = 'running')
2025-01-08 07:29:43 (5323): Guest Log: 00:04:10.037675 timesync vgsvcTimeSyncWorker: Radical guest time change: 17 811 437 786 000ns (GuestNow=1 736 321 383 456 583 000 ns GuestLast=1 736 303 572 018 797 000 ns fSetTimeLastLoop=true )
2025-01-08 07:54:41 (5323): Guest Log: [INFO] glidein exited with return value 0.
2025-01-08 07:54:41 (5323): Guest Log: [INFO] Shutting Down.
2025-01-08 07:54:41 (5323): VM Completion File Detected.
2025-01-08 07:54:41 (5323): VM Completion Message: glidein exited with return value 0.
.
2025-01-08 07:54:41 (5323): Powering off VM.
2025-01-08 07:54:41 (5323): Successfully stopped VM.
2025-01-08 07:54:41 (5323): Deregistering VM. (boinc_52dab00a2f68e27b, slot#1)
2025-01-08 07:54:41 (5323): Removing network bandwidth throttle group from VM.
2025-01-08 07:54:41 (5323): Removing VM from VirtualBox.
2025-01-08 07:54:46 (5323): called boinc_finish(0)

</stderr_txt>
I even got credit for this! Completed and validated - 31.99
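
For anyone wanting to check how long a VM was actually frozen: the "Radical host time change" lines in both logs above give the gap in nanoseconds, with spaces as digit separators. Here is a minimal Python sketch that pulls that number out, assuming the log line format stays as pasted above. The function names and the ONE_HOUR_S constant are made up for illustration; the one-hour figure is only the "shorter than one hour" remark from this thread, not a value read from the project's code.

```python
import re

# Matches e.g. "Radical host time change: 12 795 334 000 000ns"
JUMP_RE = re.compile(r"Radical host time change:\s*(\d[\d ]*)ns")

# Assumption: taken from the "shorter than one hour" remark in this thread.
ONE_HOUR_S = 3600


def host_time_jump_seconds(log_line):
    """Return the host clock jump in seconds, or None if the line has none."""
    match = JUMP_RE.search(log_line)
    if match is None:
        return None
    nanoseconds = int(match.group(1).replace(" ", ""))
    return nanoseconds / 1_000_000_000


def format_hms(seconds):
    """Format a duration in seconds as H:MM:SS (fractions are dropped)."""
    minutes, secs = divmod(int(seconds), 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"
```

Fed the two lines above, this gives roughly 3:33:15 for the first task and 4:56:51 for the second, which agrees with the wall-clock gaps in the logs and is well past one hour in both cases.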

Do all VMs have this pause/resume problem?
It's a nuisance. Being an "LHC@home" user I often want to pause BOINC as, I'm sure, do many others... Probably.
Thanks.
7) Message boards : CMS Application : Why all CMS virtualbox task were all failed... host was damaged? (Message 51336)
Posted 28 Dec 2024 by Profile Guy
Post:
Yes. All CMS tasks are "empty" units at the moment. Despite this, they are still being distributed, are assigned 4 (or more) CPUs on your BOINC host PC, and run for anything between a couple of minutes and about half an hour. It's been like that for some weeks! The LHC "system" is being reconfigured - an extensive operation. It is speculated that these "void-of-data" workunits are providing some sort of feedback to the system engineers and are useful in that way. There's just no actual number crunching as such.
See this thread for the latest information about the progress being made.
8) Message boards : News : Seasons greetings (Message 51316)
Posted 20 Dec 2024 by Profile Guy
Post:
Merry Christmas and best wishes for the new year.
9) Message boards : CMS Application : How do I set up a local HTTP proxy? (Message 51260)
Posted 9 Dec 2024 by Profile Guy
Post:
Hello,
Squid is working.
Aside -
Is this cache refresh -
192.eg.eg.eg 3128 - - [09/Dec/2024:07:51:03 +0000] "GET http://s1ral-cvmfs.openhtc.io/cvmfs/cernvm-prod.cern.ch/.cvmfspublished HTTP/1.1" 200 1995 "-" "cvmfs Fuse 2.5.2 11ac1fe9-e05a-44b5-b29e-ca3d2370db11" TCP_REFRESH_MODIFIED:HIER_DIRECT
192.eg.eg.eg 3128 - - [09/Dec/2024:07:51:06 +0000] "GET http://s1ral-cvmfs.openhtc.io/cvmfs/sft.cern.ch/.cvmfspublished HTTP/1.1" 200 2000 "-" "cvmfs Fuse 2.5.2 11ac1fe9-e05a-44b5-b29e-ca3d2370db11" TCP_REFRESH_MODIFIED:HIER_DIRECT
supposed to happen - every 4 minutes?
My concern is that this frequent communication with the data servers seems to negate the point of caching in my machine.
10) Message boards : CMS Application : How do I set up a local HTTP proxy? (Message 51257)
Posted 6 Dec 2024 by Profile Guy
Post:
Well thanks both. Food for thought.
11) Message boards : CMS Application : How do I set up a local HTTP proxy? (Message 51254)
Posted 6 Dec 2024 by Profile Guy
Post:
Did you add "client_request_buffer_max_size" to squid.conf as suggested
No. But I have now, for sure.

client_request_buffer_max_size 100 MB

It does work.
The connection is now handling file_upload_handler_large
where previously it was not.

I have no work from LHC@home to test with.
This was an Einstein@home GPU task (O3 1.15 (GW-cuda)) -

BOINC Manager - Event log
...
Fri 06 Dec 2024 13:00:16 GMT |               | Using proxy info from GUI
Fri 06 Dec 2024 13:00:16 GMT |               | Using HTTP proxy 192.168.##.##:3128
...
Fri 06 Dec 2024 13:27:18 GMT | Einstein@Home | Computation for task h1_0161.80_O3aLC01Cl1In0__O3ASBu_162.00Hz_32148_1 finished
Fri 06 Dec 2024 13:27:18 GMT | Einstein@Home | Starting task h1_0161.80_O3aLC01Cl1In0__O3ASBu_162.00Hz_32013_1
Fri 06 Dec 2024 13:27:20 GMT | Einstein@Home | Started upload of h1_0161.80_O3aLC01Cl1In0__O3ASBu_162.00Hz_32148_1_0
Fri 06 Dec 2024 13:27:20 GMT | Einstein@Home | Started upload of h1_0161.80_O3aLC01Cl1In0__O3ASBu_162.00Hz_32148_1_1
Fri 06 Dec 2024 13:27:21 GMT | Einstein@Home | Finished upload of h1_0161.80_O3aLC01Cl1In0__O3ASBu_162.00Hz_32148_1_1 (2557 bytes)
Fri 06 Dec 2024 13:27:25 GMT | Einstein@Home | Finished upload of h1_0161.80_O3aLC01Cl1In0__O3ASBu_162.00Hz_32148_1_0 (4889766 bytes)
Fri 06 Dec 2024 13:27:25 GMT | Einstein@Home | Sending scheduler request: To report completed tasks.
Fri 06 Dec 2024 13:27:25 GMT | Einstein@Home | Reporting 1 completed tasks
Fri 06 Dec 2024 13:27:25 GMT | Einstein@Home | Not requesting tasks: "no new tasks" requested via Manager
Fri 06 Dec 2024 13:27:26 GMT | Einstein@Home | Scheduler request completed
(Einstein@home downloads are working too)

/var/log/squid/access.log
...
192.168.##.## 3128 - - [06/Dec/2024:13:27:21 +0000] "POST http://einstein3.aei.uni-hannover.de/EinsteinAtHome/cgi-bin/file_upload_handler_large HTTP/1.1" 200 1004 "-" "BOINC client (x86_64-suse-linux-gnu 8.0.4)" TCP_MISS:HIER_DIRECT
192.168.##.## 3128 - - [06/Dec/2024:13:27:21 +0000] "POST http://einstein3.aei.uni-hannover.de/EinsteinAtHome/cgi-bin/file_upload_handler_large HTTP/1.1" 200 3697 "-" "BOINC client (x86_64-suse-linux-gnu 8.0.4)" TCP_MISS:HIER_DIRECT
192.168.##.## 3128 - - [06/Dec/2024:13:27:24 +0000] "POST http://einstein3.aei.uni-hannover.de/EinsteinAtHome/cgi-bin/file_upload_handler_large HTTP/1.1" 200 4890936 "-" "BOINC client (x86_64-suse-linux-gnu 8.0.4)" TCP_MISS:HIER_DIRECT
192.168.##.## 3128 - - [06/Dec/2024:13:27:31 +0000] "CONNECT scheduler.einsteinathome.org:443 HTTP/1.1" 200 46729 "-" "BOINC client (x86_64-suse-linux-gnu 8.0.4)" TCP_TUNNEL:HIER_DIRECT

Thanks very much for that.

Question(s): (please, forgive my absence of and/or diabolical abuse of network syntax. I'm not testing you - I'm just a TCP novice asking irritating questions!)
How does client_request_buffer_max_size 100 MB work? Is it merely adjusting a local squid pass-through filter? Or manipulating an established TCP connection?
Or is this talking to the remote server target directly? Because adjusting this has enabled a connection that the default setting, whatever it was, wasn't allowing.

With all due respect, I wish someone would please post a visual TCP Tree this Christmas! Thank you.
It seems the best practice with TCP is to keep it simple. I just don't know what (are the most important things) to keep simple!

And finally -
192.168.##.## 3128 - - [05/Dec/2024:20:32:47 +0000] "CONNECT scheduler.einsteinathome.org:443 HTTP/1.1" 200 33682 "-" "BOINC client (x86_64-suse-linux-gnu 8.0.4)" TCP_TUNNEL:HIER_DIRECT
This indicates an encrypted connection (via HTTPS port 443).
Those connections are forwarded by Squid without inspecting the packets - like any router between client and server does.
Is this "tunnelling"/forwarding a default squid response to any connection request that isn't described in squid.conf? I mean - is this squid access.log line describing itself or what something has told it?
Thank you.
12) Message boards : CMS Application : How do I set up a local HTTP proxy? (Message 51251)
Posted 5 Dec 2024 by Profile Guy
Post:
Here it is in the process of failing at 21:47
with Einstein@home jobs only -

BOINC Manager Event log:
Thu 05 Dec 2024 21:45:06 GMT | Einstein@Home | Computation for task LATeah2111F_1304.0_973852_0.0_0 finished
Thu 05 Dec 2024 21:45:06 GMT | Einstein@Home | Starting task LATeah2111F_1336.0_453046_0.0_0
Thu 05 Dec 2024 21:45:08 GMT | Einstein@Home | Started upload of LATeah2111F_1304.0_973852_0.0_0_0
Thu 05 Dec 2024 21:45:08 GMT | Einstein@Home | Started upload of LATeah2111F_1304.0_973852_0.0_0_1
Thu 05 Dec 2024 21:45:10 GMT | Einstein@Home | Finished upload of LATeah2111F_1304.0_973852_0.0_0_0 (356 bytes)
Thu 05 Dec 2024 21:45:10 GMT | Einstein@Home | Finished upload of LATeah2111F_1304.0_973852_0.0_0_1 (339 bytes)
Thu 05 Dec 2024 21:47:23 GMT | Einstein@Home | Started upload of h1_0161.80_O3aLC01Cl1In0__O3ASBu_162.00Hz_40584_1_0
Thu 05 Dec 2024 21:47:27 GMT |               | Project communication failed: attempting access to reference site
Thu 05 Dec 2024 21:47:27 GMT | Einstein@Home | Temporarily failed upload of h1_0161.80_O3aLC01Cl1In0__O3ASBu_162.00Hz_40584_1_0: transient HTTP error
Thu 05 Dec 2024 21:47:27 GMT | Einstein@Home | Backing off 00:44:47 on upload of h1_0161.80_O3aLC01Cl1In0__O3ASBu_162.00Hz_40584_1_0
Thu 05 Dec 2024 21:47:29 GMT |               | Internet access OK - project servers may be temporarily down.
...
Thu 05 Dec 2024 21:55:21 GMT | Einstein@Home | Started upload of h1_0161.80_O3aLC01Cl1In0__O3ASBu_162.00Hz_40546_0_0
Thu 05 Dec 2024 21:55:25 GMT |               | Project communication failed: attempting access to reference site
Thu 05 Dec 2024 21:55:25 GMT | Einstein@Home | Temporarily failed upload of h1_0161.80_O3aLC01Cl1In0__O3ASBu_162.00Hz_40546_0_0: transient HTTP error
Thu 05 Dec 2024 21:55:25 GMT | Einstein@Home | Backing off 00:05:46 on upload of h1_0161.80_O3aLC01Cl1In0__O3ASBu_162.00Hz_40546_0_0
Thu 05 Dec 2024 21:55:27 GMT |               | Internet access OK - project servers may be temporarily down.

/var/log/squid/access.log:
192.168.etc.etc 3128 - - [05/Dec/2024:21:45:10 +0000] "POST http://einstein4.aei.uni-hannover.de/EinsteinAtHome/cgi-bin/file_upload_handler_medium HTTP/1.1" 200 1477 "-" "BOINC client (x86_64-suse-linux-gnu 8.0.4)" TCP_MISS:HIER_DIRECT
192.168.etc.etc 3128 - - [05/Dec/2024:21:45:10 +0000] "POST http://einstein4.aei.uni-hannover.de/EinsteinAtHome/cgi-bin/file_upload_handler_medium HTTP/1.1" 200 1460 "-" "BOINC client (x86_64-suse-linux-gnu 8.0.4)" TCP_MISS:HIER_DIRECT
192.168.etc.etc 3128 - - [05/Dec/2024:21:47:24 +0000] "POST http://einstein3.aei.uni-hannover.de/EinsteinAtHome/cgi-bin/file_upload_handler_large HTTP/1.1" 200 1004 "-" "BOINC client (x86_64-suse-linux-gnu 8.0.4)" TCP_MISS:HIER_DIRECT
192.168.etc.etc 3128 - - [05/Dec/2024:21:47:26 +0000] "POST http://einstein3.aei.uni-hannover.de/EinsteinAtHome/cgi-bin/file_upload_handler_large HTTP/1.1" 100 266609 "-" "BOINC client (x86_64-suse-linux-gnu 8.0.4)" TCP_MISS_ABORTED:HIER_DIRECT
...
192.168.etc.etc 3128 - - [05/Dec/2024:21:55:22 +0000] "CONNECT www.google.com:443 HTTP/1.1" 200 15708 "-" "BOINC client (x86_64-suse-linux-gnu 8.0.4)" TCP_TUNNEL:HIER_DIRECT
192.168.etc.etc 3128 - - [05/Dec/2024:21:55:22 +0000] "POST http://einstein3.aei.uni-hannover.de/EinsteinAtHome/cgi-bin/file_upload_handler_large HTTP/1.1" 200 1004 "-" "BOINC client (x86_64-suse-linux-gnu 8.0.4)" TCP_MISS:HIER_DIRECT
192.168.etc.etc 3128 - - [05/Dec/2024:21:55:24 +0000] "POST http://einstein3.aei.uni-hannover.de/EinsteinAtHome/cgi-bin/file_upload_handler_large HTTP/1.1" 100 189248 "-" "BOINC client (x86_64-suse-linux-gnu 8.0.4)" TCP_MISS_ABORTED:HIER_DIRECT
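
The failing uploads are the entries ending in TCP_MISS_ABORTED. A rough Python sketch for picking them out of access.log follows, assuming the log format shown above (client, port, two dashes, timestamp, request, status, a byte count, referer, user agent, result). The regular expression and the function name are my own and not part of Squid, and what the byte count actually measures depends on the logformat line in squid.conf.

```python
import re

# One access.log entry in the layout pasted above (assumed custom logformat).
ENTRY_RE = re.compile(
    r'\[(?P<time>[^\]]+)\]\s+'                    # [05/Dec/2024:21:47:26 +0000]
    r'"(?P<method>\S+)\s+(?P<url>\S+)[^"]*"\s+'   # "POST http://... HTTP/1.1"
    r'(?P<status>\d+)\s+(?P<size>\d+)\s+'         # 100 266609
    r'"[^"]*"\s+"[^"]*"\s+'                       # referer and user agent
    r'(?P<result>\S+)$'                           # TCP_MISS_ABORTED:HIER_DIRECT
)


def aborted_requests(lines):
    """Return one dict per log entry whose result code contains ABORTED."""
    found = []
    for line in lines:
        match = ENTRY_RE.search(line.strip())
        if match is None:
            continue  # skips "..." and anything else that is not an entry
        if "ABORTED" in match.group("result"):
            found.append({
                "time": match.group("time"),
                "method": match.group("method"),
                "url": match.group("url"),
                "status": int(match.group("status")),
                "size": int(match.group("size")),
            })
    return found
```

Calling it as aborted_requests(open("/var/log/squid/access.log")) would list only the interrupted transfers, which makes it easier to line them up with the "transient HTTP error" entries in the BOINC event log.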
13) Message boards : CMS Application : How do I set up a local HTTP proxy? (Message 51250)
Posted 5 Dec 2024 by Profile Guy
Post:
I initially installed Leap years ago. Now occasionally it thinks it's Tumbleweed for unknown reasons... It depends on how it's updating or what it's connected to - sometimes it identifies as one thing and other times the other. I've got my repos crossed somewhere. As long as it works I don't pay too much attention to it. Sometimes I'm suspicious but it only pays to be as suspicious as you are apt. I'm really only a hobbyist user.

And I'm uncertain which LHC@home jobs use the proxy now.
All except SixTrack.
Ah well my /etc/squid/squid.conf is definitely at fault for that. I made the mistake of thinking that ATLAS doesn't use squid any more and commented out "extra section 2". I must have missed something in that 3-page thread.
The trouble is that I first used the squid.conf as posted with just my ip inserted where needed. It didn't work. So I dug around the new messages thread trying to fix it! Basically - Help! I'm lost.

Here are some current logs -
OK
I have Einstein@home and LHC@home.
There is no LHC work on my PC at the moment.

I reinstated the squid at 20:21 GMT to get some logging.
This is my BOINC client Event log during communication:
...
Thu 05 Dec 2024 20:21:54 GMT |               | Using proxy info from GUI
Thu 05 Dec 2024 20:21:54 GMT |               | Using HTTP proxy 192.168.etc.etc:3128
Thu 05 Dec 2024 20:32:41 GMT | Einstein@Home | Computation for task LATeah2111F_1304.0_94864_0.0_0 finished
Thu 05 Dec 2024 20:32:41 GMT | Einstein@Home | Starting task LATeah2111F_1304.0_1007072_0.0_0
Thu 05 Dec 2024 20:32:41 GMT | Einstein@Home | Sending scheduler request: To fetch work.
Thu 05 Dec 2024 20:32:41 GMT | Einstein@Home | Requesting new tasks for CPU
Thu 05 Dec 2024 20:32:43 GMT | Einstein@Home | Started upload of LATeah2111F_1304.0_94864_0.0_0_0
Thu 05 Dec 2024 20:32:43 GMT | Einstein@Home | Started upload of LATeah2111F_1304.0_94864_0.0_0_1
Thu 05 Dec 2024 20:32:43 GMT | Einstein@Home | Scheduler request completed: got 1 new tasks
Thu 05 Dec 2024 20:32:43 GMT | Einstein@Home | Project requested delay of 60 seconds
Thu 05 Dec 2024 20:32:45 GMT | Einstein@Home | Finished upload of LATeah2111F_1304.0_94864_0.0_0_0 (342 bytes)
Thu 05 Dec 2024 20:32:45 GMT | Einstein@Home | Finished upload of LATeah2111F_1304.0_94864_0.0_0_1 (338 bytes)
...
It's behaving itself - It uploaded!

/var/log/squid/* shows:
192.168.etc.etc 3128 - - [05/Dec/2024:20:27:06 +0000] "CONNECT einsteinathome.org:443 HTTP/1.1" 200 8590 "-" "BOINC client (x86_64-suse-linux-gnu 8.0.4)" TCP_TUNNEL:HIER_DIRECT
192.168.etc.etc 3128 - - [05/Dec/2024:20:27:09 +0000] "CONNECT scheduler.einsteinathome.org:443 HTTP/1.1" 200 48936 "-" "BOINC client (x86_64-suse-linux-gnu 8.0.4)" TCP_TUNNEL:HIER_DIRECT
192.168.etc.etc 3128 - - [05/Dec/2024:20:32:44 +0000] "POST http://einstein4.aei.uni-hannover.de/EinsteinAtHome/cgi-bin/file_upload_handler_medium HTTP/1.1" 200 1458 "-" "BOINC client (x86_64-suse-linux-gnu 8.0.4)" TCP_MISS:HIER_DIRECT
192.168.etc.etc 3128 - - [05/Dec/2024:20:32:44 +0000] "POST http://einstein4.aei.uni-hannover.de/EinsteinAtHome/cgi-bin/file_upload_handler_medium HTTP/1.1" 200 1462 "-" "BOINC client (x86_64-suse-linux-gnu 8.0.4)" TCP_MISS:HIER_DIRECT
192.168.etc.etc 3128 - - [05/Dec/2024:20:32:47 +0000] "CONNECT scheduler.einsteinathome.org:443 HTTP/1.1" 200 33682 "-" "BOINC client (x86_64-suse-linux-gnu 8.0.4)" TCP_TUNNEL:HIER_DIRECT

I'm aware that TCP_HIT/MISS indicate whether or not the requested data was found in the squid cache, and that HIER_DIRECT means the data was pulled from the target URL on the remote server. And 200 means OK (100 would be Continue).
I guess that TCP_TUNNEL:HIER_DIRECT means that it goes straight through or... Is it a firewall message? My knowledge of TCP/IP is very limited. And therefore firewalls - nope. I've had a look and don't know where to start. I'm too scared to open ports or play with network security in any way - without an official trustworthy okey dokey.
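
For what it's worth, the last field of each access.log entry has two parts separated by a colon: how Squid handled the request, and how it reached the other end. TCP_TUNNEL is Squid's own label for a CONNECT request that it relayed as an opaque byte stream, so it is a proxy log code rather than a firewall message. Below is a small Python sketch of splitting that field. The descriptions are paraphrases and cover only the codes that appear in these posts, and the function names are illustrative. The status phrases come from Python's standard http module.

```python
from http import HTTPStatus

# Paraphrased meanings for the result codes seen in these logs only.
RESULT_MEANINGS = {
    "TCP_HIT": "served from the Squid cache",
    "TCP_MISS": "not in the cache, fetched from upstream",
    "TCP_REFRESH_MODIFIED": "cached copy was stale and the server sent a new one",
    "TCP_TUNNEL": "CONNECT request relayed as an opaque tunnel (typically HTTPS)",
}


def describe_result(field):
    """Split e.g. 'TCP_MISS_ABORTED:HIER_DIRECT' into (code, hierarchy, text)."""
    result, _, hierarchy = field.partition(":")
    aborted = result.endswith("_ABORTED")
    base = result[: -len("_ABORTED")] if aborted else result
    text = RESULT_MEANINGS.get(base, "unknown result code")
    if aborted:
        text += "; connection closed before the transfer completed"
    return base, hierarchy, text


def status_phrase(code):
    """Return the standard HTTP reason phrase, or 'unknown' for e.g. 0."""
    try:
        return HTTPStatus(code).phrase
    except ValueError:
        return "unknown"
```

So status_phrase(200) is "OK" and status_phrase(100) is "Continue", while the 0 that appears on some aborted uploads is not an HTTP status at all.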

The squid ran for about 24 hours before I switched it off. The access log is 4.7 MB plus.
The only LHC@home units at the moment are empty CMS jobs. I don't want to download those because they'll use four cores pointlessly for 20 minutes while my other project is putting them to use.

But I made a note of some of the errant activity from last week, here -
The BOINC Manager's Event Log:
...
Thu 28 Nov 2024 22:00:12 GMT | Einstein@Home | Computation for task LATeah2111F_504.0_63492_0.0_0 finished
Thu 28 Nov 2024 22:00:14 GMT | Einstein@Home | Started upload of LATeah2111F_504.0_63492_0.0_0_0
Thu 28 Nov 2024 22:00:14 GMT | Einstein@Home | Started upload of LATeah2111F_504.0_63492_0.0_0_1
Thu 28 Nov 2024 22:00:15 GMT | Einstein@Home | Finished upload of LATeah2111F_504.0_63492_0.0_0_0 (355 bytes)
Thu 28 Nov 2024 22:00:15 GMT | Einstein@Home | Finished upload of LATeah2111F_504.0_63492_0.0_0_1 (346 bytes)
...
Thu 28 Nov 2024 22:01:50 GMT | Einstein@Home | Computation for task h1_1679.60_O3aC01Cl1In0__O3ASHF1d_1680.00Hz_51816_0 finished
Thu 28 Nov 2024 22:01:51 GMT | Einstein@Home | Starting task h1_1679.60_O3aC01Cl1In0__O3ASHF1d_1680.00Hz_51814_0
Thu 28 Nov 2024 22:01:53 GMT | Einstein@Home | Started upload of h1_1679.60_O3aC01Cl1In0__O3ASHF1d_1680.00Hz_51816_0_0
Thu 28 Nov 2024 22:01:53 GMT | Einstein@Home | Started upload of h1_1679.60_O3aC01Cl1In0__O3ASHF1d_1680.00Hz_51816_0_1
Thu 28 Nov 2024 22:01:55 GMT |               | Project communication failed: attempting access to reference site
Thu 28 Nov 2024 22:01:55 GMT | Einstein@Home | Temporarily failed upload of h1_1679.60_O3aC01Cl1In0__O3ASHF1d_1680.00Hz_51816_0_0: transient HTTP error
Thu 28 Nov 2024 22:01:55 GMT | Einstein@Home | Backing off 00:02:21 on upload of h1_1679.60_O3aC01Cl1In0__O3ASHF1d_1680.00Hz_51816_0_0
Thu 28 Nov 2024 22:01:55 GMT | Einstein@Home | Temporarily failed upload of h1_1679.60_O3aC01Cl1In0__O3ASHF1d_1680.00Hz_51816_0_1: transient HTTP error
Thu 28 Nov 2024 22:01:55 GMT | Einstein@Home | Backing off 00:03:44 on upload of h1_1679.60_O3aC01Cl1In0__O3ASHF1d_1680.00Hz_51816_0_1
Thu 28 Nov 2024 22:01:55 GMT | Einstein@Home | Started upload of h1_1679.60_O3aC01Cl1In0__O3ASHF1d_1680.00Hz_51816_0_2
Thu 28 Nov 2024 22:01:56 GMT |               | Internet access OK - project servers may be temporarily down.
...
Thu 28 Nov 2024 22:04:27 GMT | Einstein@Home | Started download of Ter5_1_dns_cfbf00084_segment_5_dms_200_168.binary
Thu 28 Nov 2024 22:04:30 GMT | Einstein@Home | Finished download of Ter5_1_dns_cfbf00084_segment_5_dms_200_168.binary (9402308 bytes)
...
...
Fri 29 Nov 2024 01:12:24 GMT |      LHC@home | Started upload of fpyKDm1Cza6nsSi4ap6QjLDmwznN0nGgGQJmq4hLDmaXSLDmbNiYgn_0_r2141493864_ATLAS_result
Fri 29 Nov 2024 01:12:28 GMT |               | Project communication failed: attempting access to reference site
Fri 29 Nov 2024 01:12:28 GMT |      LHC@home | Temporarily failed upload of fpyKDm1Cza6nsSi4ap6QjLDmwznN0nGgGQJmq4hLDmaXSLDmbNiYgn_0_r2141493864_ATLAS_result: transient HTTP error
Fri 29 Nov 2024 01:12:28 GMT |      LHC@home | Backing off 00:55:36 on upload of fpyKDm1Cza6nsSi4ap6QjLDmwznN0nGgGQJmq4hLDmaXSLDmbNiYgn_0_r2141493864_ATLAS_result
Fri 29 Nov 2024 01:12:30 GMT |               | Internet access OK - project servers may be temporarily down.
...

And the simultaneous
/var/log/squid/access.log:
...
192.168.etc.etc 3128 - - [28/Nov/2024:22:00:14 +0000] "POST http://einstein4.aei.uni-hannover.de/EinsteinAtHome/cgi-bin/file_upload_handler_medium HTTP/1.1" 200 1474 "-" "BOINC client (x86_64-suse-linux-gnu 8.0.4)" TCP_MISS:HIER_DIRECT
192.168.etc.etc 3128 - - [28/Nov/2024:22:00:14 +0000] "POST http://einstein4.aei.uni-hannover.de/EinsteinAtHome/cgi-bin/file_upload_handler_medium HTTP/1.1" 200 1465 "-" "BOINC client (x86_64-suse-linux-gnu 8.0.4)" TCP_MISS:HIER_DIRECT
...
192.168.etc.etc 3128 - - [28/Nov/2024:22:01:53 +0000] "POST http://einstein3.aei.uni-hannover.de/EinsteinAtHome/cgi-bin/file_upload_handler_large HTTP/1.1" 200 1006 "-" "BOINC client (x86_64-suse-linux-gnu 8.0.4)" TCP_MISS:HIER_DIRECT
192.168.etc.etc 3128 - - [28/Nov/2024:22:01:53 +0000] "POST http://einstein3.aei.uni-hannover.de/EinsteinAtHome/cgi-bin/file_upload_handler_large HTTP/1.1" 200 1006 "-" "BOINC client (x86_64-suse-linux-gnu 8.0.4)" TCP_MISS:HIER_DIRECT
192.168.etc.etc 3128 - - [28/Nov/2024:22:01:54 +0000] "POST http://einstein3.aei.uni-hannover.de/EinsteinAtHome/cgi-bin/file_upload_handler_large HTTP/1.1" 100 189250 "-" "BOINC client (x86_64-suse-linux-gnu 8.0.4)" TCP_MISS_ABORTED:HIER_DIRECT
192.168.etc.etc 3128 - - [28/Nov/2024:22:01:54 +0000] "POST http://einstein3.aei.uni-hannover.de/EinsteinAtHome/cgi-bin/file_upload_handler_large HTTP/1.1" 100 189250 "-" "BOINC client (x86_64-suse-linux-gnu 8.0.4)" TCP_MISS_ABORTED:HIER_DIRECT
192.168.etc.etc 3128 - - [28/Nov/2024:22:01:55 +0000] "POST http://einstein3.aei.uni-hannover.de/EinsteinAtHome/cgi-bin/file_upload_handler_large HTTP/1.1" 200 3750 "-" "BOINC client (x86_64-suse-linux-gnu 8.0.4)" TCP_MISS:HIER_DIRECT
...
192.168.etc.etc 3128 - - [28/Nov/2024:22:04:30 +0000] "GET http://einstein8.aei.uni-hannover.de/EinsteinAtHome/download/2b4/Ter5_1_dns_cfbf00084_segment_5_dms_200_168.binary HTTP/1.1" 200 9402992 "-" "BOINC client (x86_64-suse-linux-gnu 8.0.4)" TCP_MISS:HIER_DIRECT
192.168.etc.etc 3128 - - [28/Nov/2024:22:04:30 +0000] "CONNECT scheduler.einsteinathome.org:443 HTTP/1.1" 200 102962 "-" "BOINC client (x86_64-suse-linux-gnu 8.0.4)" TCP_TUNNEL:HIER_DIRECT
...
...
192.168.etc.etc 3128 - - [29/Nov/2024:01:12:27 +0000] "POST http://lhcathome-upload.cern.ch/lhcathome_cgi/file_upload_handler HTTP/1.1" 0 196933 "-" "BOINC client (x86_64-suse-linux-gnu 8.0.4)" TCP_MISS_ABORTED:HIER_DIRECT
192.168.etc.etc 3128 - - [29/Nov/2024:01:12:27 +0000] "POST http://lhcathome-upload.cern.ch/lhcathome_cgi/file_upload_handler HTTP/1.1" 0 196933 "-" "BOINC client (x86_64-suse-linux-gnu 8.0.4)" TCP_MISS_ABORTED:HIER_DIRECT
...

I think a squid is a great idea. I just can't understand the security of TCP/IP. Any help with this would be so welcome.
Thanks.

Append:
the next post has squid logs showing it working and starting to fail.
14) Message boards : CMS Application : How do I set up a local HTTP proxy? (Message 51247)
Posted 5 Dec 2024 by Profile Guy
Post:
Yes, Thank you.

I have set up my LHC@home to run just 1 CMS task at a time with this app_config.xml
1 multi-threaded CMS task at a time is all my 8 core CPU can manage comfortably - a minimum of 4 cores for each individual task is a project requirement. So theoretically I could run 2 CMS tasks at a time. I have actually. But using all 8 cores of my CPU for crunching leaves no room for background OS tasks to run and overall the system slows down.
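
The app_config.xml itself is not reproduced in this post. As a stand-in (not the original file), here is a minimal Python sketch that builds one limiting the project to a single CMS task at a time. It assumes the application's short name is "CMS"; the function name and defaults are hypothetical. The app_config, app, name and max_concurrent elements are the standard BOINC ones.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name="CMS", max_concurrent=1):
    """Return app_config.xml text capping concurrent tasks for one app.

    app_name is an assumption; check the short name your project uses.
    """
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    ET.SubElement(app, "max_concurrent").text = str(max_concurrent)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Print the XML; save it as app_config.xml in the project's folder
    # inside the BOINC data directory for the client to pick it up.
    print(build_app_config())
```

The output is a single unindented line of XML, which BOINC should parse just the same as a pretty-printed file.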
Anyhoo. I was using squid successfully on Windows 10 a few months ago before I migrated my BOINC projects to Linux. Because of that I feel that squid helps. In fact it's really cool - if you can set it up properly.
I read the very helpful HowTo set up a local Squid and had little more to do than enter my ip address where it said.

Now it's not working on my Linux box (Tumbleweed 15.6). It caches what it's supposed to and passes through everything else. But nothing uploads.
Not a major problem because you can just stop your BOINC client using the squid http proxy in the "Options -> Other options" -> HTTP Proxy tab and it resumes communicating - as if it wasn't there in the first place with no problems at all.
I'd like to get it to work though.
There are a lot of changes posted in the new comments thread. I've tried to implement them. I've even had a look at firewalls. I'm not sure what the right thing to do is.

And I'm uncertain which LHC@home jobs use the proxy now.

Any insights or help would be welcome.
Thanks
15) Message boards : CMS Application : Short CMS-Tasks ok? (Message 51246)
Posted 5 Dec 2024 by Profile Guy
Post:
The server hardware is being swapped around. Major reconfigurations are taking place. In other words - the machines the volunteers connect to are taken off line sporadically during this.
There is no work available while the entire LHC@home crew is busy doing this. There's just no need to generate work for the BOINC volunteers while the system is mostly off line.
The work units that 'pop out' are just empty data transport vehicles with no actual LHC@home data for crunching in them. The data transport system is functioning but there are no "passengers" (in this case, data) in them.

There are gaps but often there is work available from the LHC@home project during this maintenance. I still have a very long Theory job running from last week. Last week saw ATLAS jobs available too. Sometimes you get all three at the same time. CMS, ATLAS & Theory!

These empty tasks are, some think, a bit of a waste of time. Stop them if you want.
I stopped pulling CMS work units a few days ago.
To do this:
Click on the "Project" item in the menu bar at the top of any LHC@home web page.
In the drop down list select "Preferences".
Click "Edit preferences".
Un-check the "CMS Simulation".
Un-check "If no work for selected applications is available, accept work from other applications?" (Leave everything else alone!)
Click "Update preferences".

At this point all you can do is keep an eye on the CMS Application (this) message board for news of new work available.
You could look at the "Computing -> Server status" page - but it doesn't say if the jobs are hollow or not. Check the message boards.

On the technical side, for example -
The errors I found logged in the stderr output of the various CMS simulations I downloaded revealed one LHC@home server after another going offline and coming back on again while the crew worked. Each job generates this stderr on the CERN servers upon completion.
To find this particular stderr output (yes, there's more than one for your task), follow these steps. It's best to do this in another browser tab while you read the instructions here.
Click on the "Project" item in the menu bar at the top of any LHC@home web page.
In the drop down list select "Account" to open your account page.
Click Tasks View
In the page that opens is a table of your current and recent tasks.
(IMHO it's not easy to tell which job you want in this list. You have to click each one's Task number and look at the "Name" or the "Date" to identify it.)
Find the job you're interested in examining and click on its number in the first column - that's its Task number.
The stderr output is only available for completed tasks, whether they ended in error or not.

This example snippet shows the error my computer logged to the stderr output on CERN's servers when one of those functioning but empty transport vehicles arrived last week.
It shows that a server called "HTCondor" was offline. - Yes - All this for just one server offline!
...
2024-11-17 20:03:37 (14664): Guest Log: [INFO] Testing connection to HTCondor
2024-11-17 20:03:53 (14664): Guest Log: [DEBUG] Status run 1 of up to 3: 1
2024-11-17 20:04:14 (14664): Guest Log: [DEBUG] Status run 2 of up to 3: 1
2024-11-17 20:04:39 (14664): Guest Log: [DEBUG] Status run 3 of up to 3: 1
2024-11-17 20:04:39 (14664): Guest Log: [DEBUG] run 1
2024-11-17 20:04:39 (14664): Guest Log: Ncat: Version 7.50 ( https://nmap.org/ncat )
2024-11-17 20:04:39 (14664): Guest Log: Ncat: Connection timed out.
2024-11-17 20:04:39 (14664): Guest Log: run 2
2024-11-17 20:04:39 (14664): Guest Log: Ncat: Version 7.50 ( https://nmap.org/ncat )
2024-11-17 20:04:39 (14664): Guest Log: Ncat: Connection timed out.
2024-11-17 20:04:39 (14664): Guest Log: run 3
2024-11-17 20:04:39 (14664): Guest Log: Ncat: Version 7.50 ( https://nmap.org/ncat )
2024-11-17 20:04:39 (14664): Guest Log: NCAT DEBUG: Using system default trusted CA certificates and those in /usr/share/ncat/ca-bundle.crt.
2024-11-17 20:04:39 (14664): Guest Log: NCAT DEBUG: Unable to load trusted CA certificates from /usr/share/ncat/ca-bundle.crt: error:02001002:system library:fopen:No such file or directory
2024-11-17 20:04:39 (14664): Guest Log: libnsock nsi_new2(): nsi_new (IOD #1)
2024-11-17 20:04:39 (14664): Guest Log: libnsock nsock_connect_tcp(): TCP connection requested to 137.138.156.85:9618 (IOD #1) EID 8
2024-11-17 20:04:39 (14664): Guest Log: libnsock nsock_trace_handler_callback(): Callback: CONNECT TIMEOUT for EID 8 [137.138.156.85:9618]
2024-11-17 20:04:39 (14664): Guest Log: Ncat: Connection timed out.
2024-11-17 20:04:39 (14664): Guest Log: [ERROR] Could not connect to vocms0840.cern.ch on port 9618
2024-11-17 20:04:39 (14664): Guest Log: [INFO] Testing connection to WMAgent
2024-11-17 20:04:39 (14664): Guest Log: [INFO] Testing connection to EOSCMS
2024-11-17 20:04:40 (14664): Guest Log: [INFO] Testing connection to CMS-Factory
2024-11-17 20:04:40 (14664): Guest Log: [INFO] Testing connection to CMS-Frontier
2024-11-17 20:04:40 (14664): Guest Log: [INFO] Testing connection to Frontier
2024-11-17 20:04:40 (14664): Guest Log: [DEBUG] Check your firewall and your network load
2024-11-17 20:04:40 (14664): Guest Log: [ERROR] Could not connect to all required network services
...
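If you have saved a stderr output like the one above to a text file, a few lines of Python can pick out which service was unreachable. This is only a minimal sketch: the function name `find_unreachable_services` is made up for illustration, and it assumes the "[ERROR] Could not connect to HOST on port PORT" wording seen in the log above stays the same in other tasks.

```python
import re

# Matches lines such as:
# "... Guest Log: [ERROR] Could not connect to vocms0840.cern.ch on port 9618"
# The wording is taken from the log above; other CMS tasks may log differently.
CONNECT_ERROR = re.compile(
    r"\[ERROR\] Could not connect to (?P<host>\S+) on port (?P<port>\d+)"
)


def find_unreachable_services(stderr_text):
    """Return a list of (host, port) pairs for each failed connection test."""
    failures = []
    for line in stderr_text.splitlines():
        match = CONNECT_ERROR.search(line)
        if match:
            failures.append((match.group("host"), int(match.group("port"))))
    return failures


# A few lines copied from the log above.
sample = """\
2024-11-17 20:04:39 (14664): Guest Log: Ncat: Connection timed out.
2024-11-17 20:04:39 (14664): Guest Log: [ERROR] Could not connect to vocms0840.cern.ch on port 9618
2024-11-17 20:04:39 (14664): Guest Log: [INFO] Testing connection to WMAgent
2024-11-17 20:04:40 (14664): Guest Log: [ERROR] Could not connect to all required network services
"""

print(find_unreachable_services(sample))
```

The summary line "Could not connect to all required network services" is deliberately not matched, because it names no host or port.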

So it's just a matter of time before we see a completion of the maintenance upgrades.
It's a big old system y'all. Patience needed by all.
16) Message boards : CMS Application : How do I limit the number of concurrent CMS VM's? (Message 51227)
Posted 30 Nov 2024 by Profile Guy
Post:
There is a similar post here - "More reasons to use an app_config.xml with your project." - with other reasons for using an app_config.xml described.
17) Message boards : CMS Application : Problems connecting to servers? (Message 51214)
Posted 28 Nov 2024 by Profile Guy
Post:
They must be dry-running the servers.
It would be nice if they kept that local.
18) Message boards : CMS Application : Problems connecting to servers? (Message 51212)
Posted 28 Nov 2024 by Profile Guy
Post:
OK thanks, Harri.
These "empty" CMS jobs still use 4 of my CPUs...
I'll stop pulling CMS tasks until work is available.
19) Message boards : CMS Application : Problems connecting to servers? (Message 51211)
Posted 28 Nov 2024 by Profile Guy
Post:
Yes. Thanks.
As noted - that failed utterly! Yikes!
20) Message boards : CMS Application : Problems connecting to servers? (Message 51209)
Posted 28 Nov 2024 by Profile Guy
Post:
Because CMS appears to be using only 1 CPU I tried adjusting my app_config.xml to use 1 CPU for CMS jobs.
They all failed!
CMS multithread jobs need 4 CPUs (minimum).
It "looked" like it was working... But all have since failed with this logged in stderr -
2024-11-28 11:15:41 (18379): Guest Log: [INFO] CMS application starting. Check log files.
2024-11-28 11:27:55 (18379): Guest Log: [ERROR] VM expects at least 4 CPUs but reports only 1.
Changed it back to 4 CPUs & threads. All OK now!
But this is a waste of compute units (CPUs). Three CPUs are doing exactly nothing.
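To avoid repeating that mistake, the CPU count can be checked before an app_config.xml is written. The following is a rough sketch only, not the project's recommended configuration: `build_app_config` is a hypothetical helper, the app name "CMS" passed to it is a placeholder (the real name should be looked up in your client's client_state.xml), and the minimum of 4 is taken purely from the error message above. The element names (`app_config`, `app`, `max_concurrent`, `app_version`, `avg_ncpus`) are the standard BOINC ones.

```python
import xml.etree.ElementTree as ET

# Assumption: taken from the "[ERROR] VM expects at least 4 CPUs" line above.
MIN_CMS_CPUS = 4


def build_app_config(app_name, ncpus, max_concurrent=None):
    """Return app_config.xml content as a string.

    Refuses CPU counts below MIN_CMS_CPUS, since the VM rejected 1 CPU.
    """
    if ncpus < MIN_CMS_CPUS:
        raise ValueError(
            f"{app_name} needs at least {MIN_CMS_CPUS} CPUs, got {ncpus}"
        )

    root = ET.Element("app_config")

    # Optional limit on how many tasks of this app run at the same time.
    if max_concurrent is not None:
        app = ET.SubElement(root, "app")
        ET.SubElement(app, "name").text = app_name
        ET.SubElement(app, "max_concurrent").text = str(max_concurrent)

    # Number of CPUs the client budgets for each task of this app.
    version = ET.SubElement(root, "app_version")
    ET.SubElement(version, "app_name").text = app_name
    ET.SubElement(version, "avg_ncpus").text = str(ncpus)

    return ET.tostring(root, encoding="unicode")


# "CMS" is a placeholder app name, not confirmed by this thread.
print(build_app_config("CMS", 4, max_concurrent=1))
```

Calling `build_app_config("CMS", 1)` raises a `ValueError` instead of producing a file that would make every task fail. A real app_config.xml for a VirtualBox app may need further elements (for example a plan class), so treat the output as a starting point.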


©2025 CERN