Setting up a local Squid to work with LHC@home

Author	Message
computezrmle Volunteer moderator Volunteer developer Volunteer tester Help desk expert Joined: 15 Jun 08 Posts: 2631 Credit: 270,474,540 RAC: 70,071	Message 42988 - Posted: 9 Jul 2020, 14:20:14 UTC Last modified: 9 Jul 2020, 14:23:40 UTC
This is a discussion thread for comments and questions regarding the Squid Setup HowTo: https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5473
Older comments regarding a Squid configuration can be found here: https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=4611

Greger Joined: 9 Jan 15 Posts: 151 Credit: 431,596,822 RAC: 0	Message 42991 - Posted: 9 Jul 2020, 18:38:35 UTC
Thanks computezrmle.

Henry Nebrensky Joined: 13 Jul 05 Posts: 169 Credit: 15,015,977 RAC: 195	Message 43003 - Posted: 10 Jul 2020, 12:42:46 UTC - in response to Message 42988.
Nice! You could add that for "Connecting the BOINC Client" the command-line version is:
    boinccmd --set_proxy_settings squid_hostname_or_IP 3128 '' '' '' '' '' '' ''
(Those are pairs of single quotes, to specify seven null parameters.)

computezrmle	Message 43022 - Posted: 11 Jul 2020, 7:39:51 UTC - in response to Message 43003.
Thanks for posting. Most of the work can be done using alternative methods. This thread is the perfect place to mention them.

Jim1348 Joined: 15 Nov 14 Posts: 602 Credit: 24,371,321 RAC: 0	Message 43051 - Posted: 13 Jul 2020, 18:56:15 UTC - in response to Message 42988.
As I recall (not very well), if you want to use Squid only on the same machine that is running BOINC, then in the "squid.conf" file you set:
    # Either enter a list of IPs representing your computers that are permitted to use the proxy.
    # Each IP on a separate line.
    acl crunchers src 127.0.0.1
Is that correct, or should the actual IP address of the machine be used?

computezrmle	Message 43053 - Posted: 13 Jul 2020, 20:35:23 UTC - in response to Message 43051.
It is redundant, since a bit further down "localhost" is explicitly allowed to use Squid:
    http_access allow localhost
"localhost" is a built-in keyword that Squid resolves to 127.0.0.1.

computezrmle	Message 43655 - Posted: 19 Nov 2020, 11:58:37 UTC
To all volunteers using the suggested squid.conf from here: https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5473
Since David Cameron switched ATLAS Frontier to use Cloudflare's openhtc.io, "extra section 2" is now obsolete. Feel free to remove the following part from your squid.conf, or disable all options by putting a "#" in front of each line (as shown here):
    #
    # Start of extra section 2
    # parent cache configuration
    #
    # ATLAS tasks route frontier requests via predefined WLCG proxy chains including load balancing and fail-over.
    # The following lines ensure those proxy chains are respected by a local squid as intended by the CERN ATLAS team.
    #acl request_via_atlasfrontier_chain url_regex -i ^http://+atlasfrontier[1-4]?-ai\.cern\.ch:8000/+[^/]+
    #cache_peer atlas-db-squid.grid.uio.no parent 3128 0 no-query no-digest weighted-round-robin no-netdb-exchange connect-timeout=7 connect-fail-limit=1
    #cache_peer_access atlas-db-squid.grid.uio.no allow request_via_atlasfrontier_chain
    #cache_peer dcache.ijs.si parent 3128 0 no-query no-digest weighted-round-robin no-netdb-exchange connect-timeout=7 connect-fail-limit=1
    #cache_peer_access dcache.ijs.si allow request_via_atlasfrontier_chain
    #cache_peer atlasfrontier-ai.cern.ch parent 8000 0 no-query no-digest no-netdb-exchange connect-fail-limit=1
    #cache_peer_access atlasfrontier-ai.cern.ch allow request_via_atlasfrontier_chain
    #never_direct allow request_via_atlasfrontier_chain
    # End of extra section 2
    #
Changes need to be activated by running:
    squid -k reconfigure

Toby Broom Volunteer moderator Joined: 27 Sep 08 Posts: 864 Credit: 716,601,802 RAC: 186,822	Message 43661 - Posted: 21 Nov 2020, 15:45:40 UTC
Can you have one server and multiple clients, or do you need one per PC? What about projects other than LHC?

computezrmle	Message 43663 - Posted: 21 Nov 2020, 16:30:40 UTC - in response to Message 43661.
A single Squid instance is enough for the whole LAN. Frontier experts at Fermilab suggest running a second instance if you have more than 500 worker slots. This is meant for fail-over, not because of the load.
Other projects send/receive work via the proxy as soon as it is configured in the BOINC client. Since most of them use HTTPS, Squid simply works as a gateway. It never touches the content of HTTPS traffic.
So far WCG is the only project known to me that requires a special setting. It fails if it requests files that are already in Squid's local cache. This can be fixed with a few configuration lines in squid.conf that tell Squid not to cache WCG. Example (already included in squid.conf):
    #
    # Start of extra section 1
    # Requests that need special handling
    # worldcommunitygrid doesn't like it if data is taken from the local cache
    acl wcg_nocache dstdomain .worldcommunitygrid.org
    cache deny wcg_nocache

Toby Broom	Message 43811 - Posted: 10 Dec 2020, 11:48:54 UTC Last modified: 10 Dec 2020, 11:57:59 UTC
How do I know it's working?
I also see in the cache log:
    2020/12/10 12:44:05 kid1| WARNING: DNS lookup for 'atlas-db-squid.grid.uio.no' failed!
    2020/12/10 12:44:05 kid1| WARNING: DNS lookup for 'dcache.ijs.si' failed!
    2020/12/10 12:44:05 kid1| WARNING: DNS lookup for 'atlasfrontier-ai.cern.ch' failed!
Seems like it breaks radioactive@home:
    Radioactive@Home | Scheduler request failed: HTTP service unavailable
I added it to the settings like WCG.
I can't upload any data from Theory?

computezrmle	Message 43815 - Posted: 10 Dec 2020, 12:39:04 UTC - in response to Message 43811.
> I also see in the cache log:
> 2020/12/10 12:44:05 kid1| WARNING: DNS lookup for 'atlas-db-squid.grid.uio.no' failed!
> 2020/12/10 12:44:05 kid1| WARNING: DNS lookup for 'dcache.ijs.si' failed!
> 2020/12/10 12:44:05 kid1| WARNING: DNS lookup for 'atlasfrontier-ai.cern.ch' failed!
Squid uses the DNS service provided by the OS/LAN. Errors like that point to a wrong DNS setup. You may check the IP of your DNS server in squid.conf, option "dns_nameservers ...". See: http://www.squid-cache.org/Versions/v3/3.5/cfgman/dns_nameservers.html
Independent of the DNS setup, those servers are now obsolete. See: https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5474&postid=43655
> How do I know it's working?
Did you test Squid with a common web browser (chapter 6 of the HowTo)?
> Seems like it breaks radioactive@home:
> Radioactive@Home | Scheduler request failed: HTTP service unavailable
Their Server Status Page currently mentions that the upload/download server is disabled.
> I can't upload any data from Theory?
Did the basic tests from chapter 6 succeed? Is Squid's name/IP configured via the BOINC client proxy form?

Yeti Volunteer moderator Joined: 2 Sep 04 Posts: 455 Credit: 210,993,103 RAC: 16,614	Message 43816 - Posted: 10 Dec 2020, 12:55:28 UTC Last modified: 10 Dec 2020, 12:57:56 UTC
Here are my experiences from switching to Squid:
Setting up Squid with the help of computezrmle: easy
Switching clients to use the proxy: tricky
What happened? At first all my clients but one were working fine and using Squid. The one that didn't really work seemed to be okay, but all ATLAS WUs failed within 20 minutes. Finally I found that I had to set up the proxy settings on that client with the proxy's full domain name, not only the machine name.
Okay, I wanted to be professional and changed all other clients to use the full domain name for the proxy. This was a bad idea, because now all the clients that had been working fine couldn't upload their results anymore. I had to flush the DNS cache on all clients, and since then all is working fine. Maybe a reboot would have solved it as well.
Perhaps this helps someone.
Oh, we checked what Squid is doing for my clients: in the last 3 weeks it has served 8.500.000 http(s) requests from its RAM cache.
Supporting BOINC, a great concept !

maeax Joined: 2 May 07 Posts: 2262 Credit: 175,581,097 RAC: 15	Message 43817 - Posted: 10 Dec 2020, 13:47:11 UTC
The CVMFS cache on a Linux VM needs to be checked when Squid is stopped. cvmfs_config probe failed for two servers: atlas-condb.cern.ch and alice.cern.ch
Saw this today, and have activated Squid for this VM again! Now it's ok.

Toby Broom	Message 43818 - Posted: 10 Dec 2020, 16:13:11 UTC - in response to Message 43815.
I hadn't done the browser test :(. I set the DNS server in the config to my router, and I now seem to be able to browse the web. Seems like it's working now after updating the DNS settings.

Toby Broom	Message 43819 - Posted: 10 Dec 2020, 16:27:07 UTC - in response to Message 43816.
I'm on localhost for testing, so it's easy at the moment :)
Can you see somewhere in the tasks whether they are using the cache? So far the var/cache folder is 72 bytes.
    localhost.localdomain 3128 - - [10/Dec/2020:17:16:12 +0100] "CONNECT lhcathome.cern.ch:443 HTTP/1.1" 200 172839 "-" "BOINC client (windows_x86_64 7.16.11)" TCP_TUNNEL:HIER_DIRECT
I could cache something like 64 GB worth of things in RAM. I was thinking of getting some Optane DIMMs; then I could take my host up to 640 GB of memory.

Yeti	Message 43820 - Posted: 10 Dec 2020, 16:44:00 UTC - in response to Message 43819. Last modified: 10 Dec 2020, 16:44:26 UTC
> I could cache something like 64 GB worth of things in RAM. I was thinking of getting some Optane DIMMs; then I could take my host up to 640 GB of memory.
Toby, sorry, this really makes no sense. I was on the same trip as you, and computezrmle told me not to do so. We kept his suggestion:
    # You don't believe this is enough?
    # For sure, it is!
    cache_mem 256 MB
    maximum_object_size_in_memory 24 KB
    memory_replacement_policy heap GDSF
My 8.500.000 hits are coming mostly from this "small" cache segment in memory.

Toby Broom	Message 43822 - Posted: 10 Dec 2020, 19:10:25 UTC - in response to Message 43820.
Thanks for the comment; seems like a small cache is fine then.

computezrmle	Message 43825 - Posted: 10 Dec 2020, 19:22:58 UTC
Some comments for clarification.
By default Squid binds to all available network interfaces at port 3128. This can be controlled by "http_port ...".
If the computer name is set to squid.home.arpa and the LAN IP 192.168.25.25 is used, an arbitrary app on the Squid box can send HTTP requests to any of:
1. localhost:3128
2. 127.0.0.1:3128
3. squid.home.arpa:3128
4. 192.168.25.25:3128
5. squid:3128
The recommended way is to use either (3.), (4.) or (5.).
(4.) should work even if name resolution is not correctly configured.
(1.) or (2.) should only be used for "manager" access, e.g. to get statistics data (see below).
(1.) and (2.) don't work if a request is made from any other computer. This includes requests made by LHC@home VMs on the Squid box. In this case (3.), (4.) or (5.) must be used.
Pitfall: If a BOINC client on the Squid box is configured to use a proxy at localhost/127.0.0.1, it will ask for work via the proxy, but the VMs will bypass the proxy.
In addition, Squid must be configured to allow incoming requests. This can be done in many different ways. Two of them (individual source IP and network IP range) are shown in the example squid.conf. All requests that don't match a permission rule will be denied.
To check whether Squid works, a browser can be used (see chapter 6 of the HowTo). It either displays arbitrary internet pages or returns useful error messages.
Many useful information pages can be retrieved via the manager interface, e.g. using the commands:
    squidclient mgr:info
    squidclient mgr:refresh
In addition, Squid's access.log and cache.log should be checked.
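
The localhost pitfall from the post above can be checked mechanically. What follows is a minimal sketch, not part of the HowTo: the function name `is_loopback_proxy` is hypothetical, and the rule it applies (anything that is a loopback address or a "localhost" name is only reachable from the Squid box itself) is a simplification of the explanation above. The host names and the IP are the examples used in the post.

```python
import ipaddress

# Names that conventionally point at the loopback interface.
# This list is an assumption for illustration; a hosts file may define others.
LOOPBACK_NAMES = {"localhost", "localhost.localdomain"}


def is_loopback_proxy(host: str) -> bool:
    """Return True if 'host' refers to the local machine only.

    Such a proxy entry works for apps running directly on the Squid box,
    but a VM resolves it to its own loopback interface and finds no proxy.
    """
    name = host.strip().lower().rstrip(".")
    if name in LOOPBACK_NAMES:
        return True
    try:
        # Covers 127.0.0.1, the rest of 127.0.0.0/8 and the IPv6 address ::1.
        return ipaddress.ip_address(name).is_loopback
    except ValueError:
        # Not an IP literal, so treat it as a regular host name.
        return False


if __name__ == "__main__":
    candidates = ["localhost", "127.0.0.1", "squid.home.arpa", "192.168.25.25", "squid"]
    for host in candidates:
        if is_loopback_proxy(host):
            verdict = "manager access only, VMs will bypass the proxy"
        else:
            verdict = "usable from VMs and other computers"
        print(f"{host}:3128 -> {verdict}")
```

The check only looks at the spelling of the proxy entry. It does not test whether Squid is reachable or whether its access rules permit the client.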

Toby Broom	Message 43827 - Posted: 10 Dec 2020, 19:49:46 UTC
I guess using localhost messed it up. Now it looks to be working after some tweaking of the .conf:
    Boron 3128 - - [10/Dec/2020:20:47:11 +0100] "GET http://s1cern-cvmfs.openhtc.io/cvmfs/alice.cern.ch/.cvmfspublished HTTP/1.1" 200 1570 "-" "curl/7.29.0" TCP_MEM_HIT:HIER_NONE
    Boron 3128 - - [10/Dec/2020:20:47:27 +0100] "GET http://s1cern-cvmfs.openhtc.io/cvmfs/sft.cern.ch/data/3c/c5fae334e92d6f8521ad59ee36661789fad42cC HTTP/1.1" 200 12189 "-" "cvmfs Fuse 2.5.2 11ac1fe9-e05a-44b5-b29e-ca3d2370db11" TCP_MISS:HIER_DIRECT
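
To judge from access.log whether the cache is actually being used, the result codes at the end of each line can be counted. This is a minimal sketch under one assumption: the log uses the format seen in the lines quoted in this thread, where the last field is `RESULT_CODE:HIERARCHY_CODE`. The function names are hypothetical, and counting only codes that contain "HIT" is a simplification.

```python
from collections import Counter

# Two access.log lines copied from the post above.
SAMPLE_ACCESS_LOG = [
    'Boron 3128 - - [10/Dec/2020:20:47:11 +0100] "GET http://s1cern-cvmfs.openhtc.io/cvmfs/alice.cern.ch/.cvmfspublished HTTP/1.1" 200 1570 "-" "curl/7.29.0" TCP_MEM_HIT:HIER_NONE',
    'Boron 3128 - - [10/Dec/2020:20:47:27 +0100] "GET http://s1cern-cvmfs.openhtc.io/cvmfs/sft.cern.ch/data/3c/c5fae334e92d6f8521ad59ee36661789fad42cC HTTP/1.1" 200 12189 "-" "cvmfs Fuse 2.5.2 11ac1fe9-e05a-44b5-b29e-ca3d2370db11" TCP_MISS:HIER_DIRECT',
]


def tally_result_codes(lines):
    """Count Squid result codes, assuming they are the last field of each line."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if not fields or ":" not in fields[-1]:
            continue  # skip blank lines and lines in another format
        result_code = fields[-1].partition(":")[0]
        counts[result_code] += 1
    return counts


def hit_ratio(counts):
    """Fraction of requests whose result code contains 'HIT' (simplified)."""
    total = sum(counts.values())
    hits = sum(n for code, n in counts.items() if "HIT" in code)
    return hits / total if total else 0.0


if __name__ == "__main__":
    counts = tally_result_codes(SAMPLE_ACCESS_LOG)
    for code, n in sorted(counts.items()):
        print(f"{code}: {n}")
    print(f"hit ratio: {hit_ratio(counts):.2f}")
```

HTTPS traffic shows up as TCP_TUNNEL, as in the CONNECT line quoted earlier in the thread. It is never served from the cache and so counts against the ratio here.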

computezrmle	Message 43836 - Posted: 10 Dec 2020, 21:18:21 UTC - in response to Message 43827.
> I guess using localhost messed it up. Now it looks to be working after some tweaking of the .conf:
> Boron 3128 - - [10/Dec/2020:20:47:11 +0100] "GET http://s1cern-cvmfs.openhtc.io/cvmfs/alice.cern.ch/.cvmfspublished HTTP/1.1" 200 1570 "-" "curl/7.29.0" TCP_MEM_HIT:HIER_NONE
> Boron 3128 - - [10/Dec/2020:20:47:27 +0100] "GET http://s1cern-cvmfs.openhtc.io/cvmfs/sft.cern.ch/data/3c/c5fae334e92d6f8521ad59ee36661789fad42cC HTTP/1.1" 200 12189 "-" "cvmfs Fuse 2.5.2 11ac1fe9-e05a-44b5-b29e-ca3d2370db11" TCP_MISS:HIER_DIRECT
This looks fine.
Typical logfile lines if you step into the localhost pitfall:
    2020-12-10 20:09:44 (10180): Guest Log: [INFO] Detected local proxy http://localhost:3128 in init_data.xml
    2020-12-10 20:09:44 (10180): Guest Log: [INFO] Testing connection to localhost on port 3128
    2020-12-10 20:09:44 (10180): Guest Log: Ncat: Connection refused.
    2020-12-10 20:09:44 (10180): Guest Log: [INFO] 1
    2020-12-10 20:09:44 (10180): Guest Log: [INFO] Local proxy can't be contacted and will be ignored
    . . .
    2020-12-10 20:09:46 (10180): Guest Log: Probing /cvmfs/sft.cern.ch... OK
    2020-12-10 20:09:46 (10180): Guest Log: VERSION PID UPTIME(M) MEM(K) REVISION EXPIRES(M) NOCATALOGS CACHEUSE(K) CACHEMAX(K) NOFDUSE NOFDMAX NOIOERR NOOPEN HITRATE(%) RX(K) SPEED(K/S) HOST PROXY ONLINE
    2020-12-10 20:09:46 (10180): Guest Log: 2.5.2.0 4087 0 27208 19355 3 1 264838 4096000 0 65024 0 0 n/a 5 25 http://s1cern-cvmfs.openhtc.io/cvmfs/sft.cern.ch DIRECT 1
This BOINC client configures "localhost" as HTTP proxy and forwards that configuration to the VM. The VM tests its internal "localhost" and, since this fails (there's no internal proxy), it configures its CVMFS to use a DIRECT connection to s1*-cvmfs.openhtc.io.
To solve this, the BOINC client's proxy entry has to be changed to "name_of_your_local_proxy" or "IP_of_your_local_proxy".
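
The two guest log messages quoted in the post above are enough to spot this situation in a task's log. This is a minimal sketch: the function name `find_ignored_proxy` is hypothetical, and it assumes the message texts are exactly those shown in the post, which may differ between VM versions.

```python
import re

# Message texts taken from the guest log lines in the post above.
DETECTED = re.compile(r"Detected local proxy (\S+) in init_data\.xml")
IGNORED = "Local proxy can't be contacted and will be ignored"

# Log excerpt copied from the post above.
SAMPLE_GUEST_LOG = [
    "2020-12-10 20:09:44 (10180): Guest Log: [INFO] Detected local proxy http://localhost:3128 in init_data.xml",
    "2020-12-10 20:09:44 (10180): Guest Log: [INFO] Testing connection to localhost on port 3128",
    "2020-12-10 20:09:44 (10180): Guest Log: Ncat: Connection refused.",
    "2020-12-10 20:09:44 (10180): Guest Log: [INFO] 1",
    "2020-12-10 20:09:44 (10180): Guest Log: [INFO] Local proxy can't be contacted and will be ignored",
]


def find_ignored_proxy(log_lines):
    """Return the proxy URL the VM detected and then ignored, or None."""
    detected = None
    for line in log_lines:
        match = DETECTED.search(line)
        if match:
            detected = match.group(1)  # remember the most recently detected proxy
        elif IGNORED in line and detected is not None:
            return detected
    return None


if __name__ == "__main__":
    proxy = find_ignored_proxy(SAMPLE_GUEST_LOG)
    if proxy:
        print(f"VM ignored proxy {proxy}; check the BOINC client's proxy entry")
    else:
        print("no ignored proxy found in this log")
```

A hit means the VM fell back to a DIRECT connection, so the proxy entry in the BOINC client is the first thing to check.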