HTTP-Proxy Setting(s)

Yeti Volunteer moderator Joined: 2 Sep 04 Posts: 468 Credit: 214,943,087 RAC: 46,848
Message 29272 - Posted: 14 Mar 2017, 19:30:54 UTC Last modified: 14 Mar 2017, 19:32:23 UTC

I'm wondering about the following: my clients have no HTTP proxy, and the old proxy setting is simply deactivated. So I was surprised to find this snippet in the boot sequence of an ATLAS task:

2017-03-14 18:58:32 (6308): VM state change detected. (old = 'paused', new = 'running')
2017-03-14 18:58:42 (6308): Guest Log: copied the webapp to /var/www
2017-03-14 18:58:42 (6308): Guest Log: set up http_proxy http://squid:8080
2017-03-14 18:58:42 (6308): Guest Log: ATHENA_PROC_NUMBER=5
2017-03-14 18:58:42 (6308): Guest Log: Starting ATLAS job. (PandaID=3281037544 taskID=10947180)

I'm asking because the host seems to do fine with ATLAS.

Supporting BOINC, a great concept !

David Cameron Project administrator Project developer Project scientist Joined: 13 May 14 Posts: 387 Credit: 15,314,184 RAC: 0
Message 29277 - Posted: 14 Mar 2017, 21:54:27 UTC - in response to Message 29272. Last modified: 14 Mar 2017, 21:55:15 UTC

Do you have something in your BOINC configuration that sets a proxy? The ATLAS scripts use the information in the init_data.xml file, which the BOINC client creates for each WU by reading the client configuration. See https://boinc.berkeley.edu/wiki/Client_configuration

The <proxy_info> section contains the settings that are put into init_data.xml and used by ATLAS.

computezrmle Volunteer moderator Volunteer developer Volunteer tester Help desk expert Joined: 15 Jun 08 Posts: 2683 Credit: 286,884,951 RAC: 55,805
Message 29283 - Posted: 15 Mar 2017, 7:19:26 UTC

Once set via the GUI or a configuration file, the BOINC client stores the proxy setting in client_state.xml:

<use_http_proxy/> # this tag is deleted if you switch the proxy usage off
<http_server_name>proxy.example.com</http_server_name> # this tag remains filled
<http_server_port>3128</http_server_port> # this tag remains filled

If ATLAS reads only <http_server_name> and <http_server_port> but ignores a missing <use_http_proxy/>, it will try to contact the project via the proxy and will fall back to a direct connection if the proxy is down.
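
The check described above can be sketched in a few lines of Python. This is only an illustration of the logic, not the actual ATLAS script: the function name `effective_http_proxy` and the two sample fragments are hypothetical, and the `<proxy_info>` wrapper is assumed from the BOINC client configuration page linked earlier in the thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragments modelled on the tags quoted in the post above.
# In SAMPLE_DISABLED the <use_http_proxy/> flag is missing, but host and
# port are still filled in, which is the situation described in this thread.
SAMPLE_DISABLED = """
<proxy_info>
    <http_server_name>proxy.example.com</http_server_name>
    <http_server_port>3128</http_server_port>
</proxy_info>
"""

SAMPLE_ENABLED = """
<proxy_info>
    <use_http_proxy/>
    <http_server_name>proxy.example.com</http_server_name>
    <http_server_port>3128</http_server_port>
</proxy_info>
"""


def effective_http_proxy(xml_text):
    """Return a proxy URL only if the proxy is switched on, else None."""
    root = ET.fromstring(xml_text)
    # The flag tag must be present; leftover host/port values alone
    # do not mean the proxy is active.
    if root.find("use_http_proxy") is None:
        return None
    name = (root.findtext("http_server_name") or "").strip()
    port = (root.findtext("http_server_port") or "").strip()
    if not name or not port:
        return None
    return f"http://{name}:{port}"


if __name__ == "__main__":
    print(effective_http_proxy(SAMPLE_DISABLED))
    print(effective_http_proxy(SAMPLE_ENABLED))
```

A reader that skips the flag test and looks only at the host and port would return a proxy URL in both cases, which would match the "set up http_proxy" line seen in the first post.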

Yeti Volunteer moderator Joined: 2 Sep 04 Posts: 468 Credit: 214,943,087 RAC: 46,848
Message 29296 - Posted: 15 Mar 2017, 11:34:07 UTC

Okay, this seems to be a minor bug. Since the HTTP proxy was deactivated in my BOINC settings, the client should pass this information correctly to init_data.xml, and/or the VirtualBox / ATLAS application should interpret it correctly. I cannot tell which side gets it wrong, but someone should take a closer look and fix it.

Is it possible / conceivable that the fallback will not work?

When is this parameter (in init_data.xml) set? At download of the WU or at first startup in the BOINC client?

Supporting BOINC, a great concept !

computezrmle Volunteer moderator Volunteer developer Volunteer tester Help desk expert Joined: 15 Jun 08 Posts: 2683 Credit: 286,884,951 RAC: 55,805
Message 29300 - Posted: 15 Mar 2017, 12:14:33 UTC

To be honest, I'm not really sure if the proxy setting works from inside an ATLAS VM, as I redirect all HTTP traffic from the VM to CERN through my proxy using a set of netfilter rules.

On the other hand: if your VMs have a proxy set and the tasks work although it is offline, then either the fallback works or the proxy setting is simply ignored.

gyllic Joined: 9 Dec 14 Posts: 202 Credit: 2,539,793 RAC: 175
Message 29524 - Posted: 22 Mar 2017, 12:12:11 UTC - in response to Message 29300.

Since my internet bandwidth is pretty low, I am thinking about using a proxy server. Will this reduce the internet traffic from BOINC and the ATLAS VMs and raise the efficiency of the tasks through shorter download times? If so, I have some questions:
- Which proxy software do you recommend? Squid?
- How do you configure the proxy to get the best performance for BOINC (i.e. ATLAS)? Is sudo apt-get install squid3 on, for example, Debian enough?
- How big is the benefit in reduced internet traffic and increased efficiency (if there is one)?

I tried to set up and use a squid proxy server, and the task shows in its logs that it is using it:

2017-03-20 17:14:15 (4408): Guest Log: Copied input files into RunAtlas.
2017-03-20 17:14:25 (4408): Guest Log: copied the webapp to /var/www
2017-03-20 17:14:25 (4408): Guest Log: set up http_proxy http://192.168.1.2:3128
2017-03-20 17:14:25 (4408): Guest Log: ATHENA_PROC_NUMBER=4
2017-03-20 17:14:25 (4408): Guest Log: Starting ATLAS job. (PandaID=3296332715 taskID=10995520)

But I did not notice any difference in running time or efficiency for a couple of tasks using the proxy compared to tasks without the proxy. So I am wondering whether I have configured the proxy server incorrectly for ATLAS or the benefit (if there is one) is so small that it is almost unnoticeable.

HerveUAE Joined: 18 Dec 16 Posts: 123 Credit: 37,495,365 RAC: 0
Message 29533 - Posted: 22 Mar 2017, 20:19:52 UTC

"or the benefit (if there is one) is so small that it is almost unnoticeable."

See this post, where computezrmle mentioned the expected benefit of using a local squid proxy:
https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=4146&postid=29287#29287

I have not tried it myself, but will when time permits.

We are the product of random evolution.

PHILIPPE Joined: 24 Jul 16 Posts: 88 Credit: 239,917 RAC: 0
Message 29541 - Posted: 22 Mar 2017, 21:19:39 UTC - in response to Message 29533.

Well, if I have understood the docs shared by David Cameron correctly, a multi-core WU seems to have three main phases while running:
1° The beginning, where a lot of communication takes place between server and client and where the client delays event operations to optimize the sharing of process memory inside the VM.
2° The intermediate phase, where each event is processed by a particular core inside the VM, independently of the other cores. (During this phase all cores are used: maximum efficiency.)
3° The ending, where the output of each core is merged as soon as that core has finished (but the cores do not finish at the same time, hence a loss of efficiency).

The influence of the squid proxy is probably limited to the beginning phase, during the initial communication. I don't see where else it could improve the situation.

To increase WU efficiency, it is possible to increase the number of events processed (lengthening the second phase and reducing the relative share of the less efficient first and third phases).

But why not stop the multi-core VM when the number of events still running falls below the number of cores used by the VM (cutting the idle time of the cores that finish their events first)? The few events left unprocessed could then be handled in another WU, and so on.

Is it possible? Is it worth it?
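
The argument about lengthening the second phase can be made concrete with a small calculation. This is a rough sketch under a loud simplifying assumption that is not from the ATLAS docs: exactly one core is busy during the first and third phases and all cores are busy during the second. The function name and all durations are hypothetical illustration values, not measurements.

```python
def cpu_efficiency(cores, startup_s, parallel_s, merge_s):
    """Fraction of available core-seconds spent doing work.

    Assumption (simplified model): one core is busy during startup and
    merge, and all cores are busy during the parallel event phase.
    """
    wall = startup_s + parallel_s + merge_s
    busy = startup_s + parallel_s * cores + merge_s
    return busy / (cores * wall)


if __name__ == "__main__":
    # Hypothetical durations in seconds, chosen only for illustration.
    short_run = cpu_efficiency(cores=4, startup_s=600, parallel_s=3600, merge_s=600)
    long_run = cpu_efficiency(cores=4, startup_s=600, parallel_s=7200, merge_s=600)
    print(f"short event phase: {short_run:.3f}")
    print(f"doubled event phase: {long_run:.3f}")
```

In this model, doubling the event phase raises the efficiency because the fixed startup and merge costs are spread over more useful work. A proxy can only shorten the startup term.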

computezrmle Volunteer moderator Volunteer developer Volunteer tester Help desk expert Joined: 15 Jun 08 Posts: 2683 Credit: 286,884,951 RAC: 55,805
Message 29652 - Posted: 26 Mar 2017, 15:33:54 UTC

Sorry for the late response. I was on vacation last week.

Regarding the use of a proxy, I wrote several comments on different CERN message boards. In general, a user with a slow internet connection and a high number of hosts would benefit most. Users with only 1 host but a slow internet connection would still see a significant speedup.

My squid serves 2 crunching hosts with ATLAS and CMS (besides non-CERN projects). Numbers vary but are typically inside the following ranges:
Requests per day: 80000-150000 (2300 non-CERN)
Request hits: 90-95%
Byte hits: 40-60%

I recommend a setup with squid (version 3.5) combined with a set of iptables rules and a special routing table to enable policy-based routing (PBR). I therefore recommend Linux as the base OS. If my latest information is correct, PBR is not easy, if not impossible, to set up on Windows. A workaround could be to run a Linux box as the standard gateway. Windows experts may have better proposals.

In my LAN a reactivated laptop with 2 GB RAM and a 2-core CPU does the job. 1 GB RAM would be enough, as squid needs no more than 128 MB of cache RAM.

Benefits are (once the data is cached):
- buffered downloads of the .vdi if there is more than one host or after project resets
- shorter initialisation phase
- buffered downloads during the calculation phase (this amount is surprisingly high)

Yeti Volunteer moderator Joined: 2 Sep 04 Posts: 468 Credit: 214,943,087 RAC: 46,848
Message 29653 - Posted: 26 Mar 2017, 15:57:11 UTC

When I started with ATLAS, I set up a squid proxy on Linux and routed all traffic through it. But I never succeeded in getting files buffered / cached. All PCs (up to 10) downloaded their own files, and even the VDIs never came from the squid cache.

So, after some time, I decided to switch off the proxy again.

Supporting BOINC, a great concept !

computezrmle Volunteer moderator Volunteer developer Volunteer tester Help desk expert Joined: 15 Jun 08 Posts: 2683 Credit: 286,884,951 RAC: 55,805
Message 29654 - Posted: 26 Mar 2017, 16:48:48 UTC

During my vacation I set ATLAS and CMS to NNT to avoid huge data transfers if the WUs throw errors. After my return I noticed that ATLAS is now non-beta at lhcathome, so I reset the project on both hosts to get a clean restart.

Here are the lines from my squid logfile:

"GET http://lhcathomeclassic.cern.ch/sixtrack/download/ATLASM_2017_03_01.vdi.gz HTTP/1.1" 200 709837529 "-" "BOINC client (x86_64-pc-linux-gnu 7.6.31)" TCP_MISS:HIER_DIRECT
"GET http://lhcathomeclassic.cern.ch/sixtrack/download/ATLASM_2017_03_01.vdi.gz HTTP/1.1" 200 709837535 "-" "BOINC client (x86_64-pc-linux-gnu 7.6.31)" TCP_REFRESH_UNMODIFIED:HIER_DIRECT
"GET http://lhcathomeclassic.cern.ch/sixtrack/download/CMS_2016_10_31.vdi.gz HTTP/1.1" 200 665580408 "-" "BOINC client (x86_64-pc-linux-gnu 7.6.31)" TCP_MISS:HIER_DIRECT
"GET http://lhcathomeclassic.cern.ch/sixtrack/download/CMS_2016_10_31.vdi.gz HTTP/1.1" 200 665580414 "-" "BOINC client (x86_64-pc-linux-gnu 7.6.31)" TCP_REFRESH_UNMODIFIED:HIER_DIRECT

Comments:
- the first ATLAS vdi was requested from the original server
- the first CMS vdi was requested from the original server, as the cached file had expired long ago and was therefore not present
- the second download was taken from the cache in both cases
- the file size differs slightly due to my log configuration (it's not an error)

And here are some "big dog" examples from this afternoon.
All of them were taken from the local cache:

"GET http://cvmfs-stratum-one.cern.ch/cvmfs/atlas.cern.ch/data/e2/4f258a36360f3a07323861b6a6dcfd0a1bf7e0C HTTP/1.1" 200 13802975 "-" "cvmfs Fuse 2.2.0 cde1ef90-c9c9-4e9e-9c99-e94dd884ad14" TCP_HIT:HIER_NONE
"GET http://cvmfs-stratum-one.cern.ch/cvmfs/sft.cern.ch/data/c4/5988eb00d1390e0046f8566b2d2255f0cdbafcC HTTP/1.1" 200 24079944 "-" "cvmfs Fuse 2.2.0 cde1ef90-c9c9-4e9e-9c99-e94dd884ad14" TCP_HIT:HIER_NONE
"GET http://cvmfs-stratum-one.cern.ch/cvmfs/atlas.cern.ch/data/92/d3f64b1d2337e95c2930c6bd2901d27b56fca1C HTTP/1.1" 200 39168735 "-" "cvmfs Fuse 2.2.0 cde1ef90-c9c9-4e9e-9c99-e94dd884ad14" TCP_HIT:HIER_NONE
"GET http://cvmfs-stratum-one.cern.ch/cvmfs/grid.cern.ch/data/b3/b3c804ebba27aac0ff6ffcd83c34642ee52b7aC HTTP/1.1" 200 15565132 "-" "cvmfs Fuse 2.2.0 cde1ef90-c9c9-4e9e-9c99-e94dd884ad14" TCP_HIT:HIER_NONE
"GET http://cvmfs-stratum-one.cern.ch/cvmfs/grid.cern.ch/data/01/893c59b64f904387f7d6e9845a7d0cd85a9b4aC HTTP/1.1" 200 15565255 "-" "cvmfs Fuse 2.2.0 cde1ef90-c9c9-4e9e-9c99-e94dd884ad14" TCP_HIT:HIER_NONE
"GET http://cvmfs-stratum-one.cern.ch/cvmfs/grid.cern.ch/data/01/893c59b64f904387f7d6e9845a7d0cd85a9b4aC HTTP/1.1" 200 15565255 "-" "cvmfs Fuse 2.2.0 cde1ef90-c9c9-4e9e-9c99-e94dd884ad14" TCP_HIT:HIER_NONE
"GET http://cvmfs-stratum-one.cern.ch/cvmfs/grid.cern.ch/data/01/893c59b64f904387f7d6e9845a7d0cd85a9b4aC HTTP/1.1" 200 15565318 "-" "cvmfs Fuse 2.2.0 6d0ae20b-f23e-4d5d-b5ca-600a8fb1d26c" TCP_HIT:HIER_NONE
"GET http://cvmfs-stratum-one.cern.ch/cvmfs/cms.cern.ch/data/c8/217f33a06b71de6b7f968ac3ec3993c94f69dfC HTTP/1.1" 200 12655983 "-" "cvmfs Fuse 2.2.0 6d0ae20b-f23e-4d5d-b5ca-600a8fb1d26c" TCP_HIT:HIER_NONE
"GET http://cvmfs-stratum-one.cern.ch/cvmfs/cms.cern.ch/data/72/04d37d95ead13c5c1416217df6cbead4ee1010C HTTP/1.1" 200 16129138 "-" "cvmfs Fuse 2.2.0 6d0ae20b-f23e-4d5d-b5ca-600a8fb1d26c" TCP_HIT:HIER_NONE
"GET http://cvmfs-stratum-one.cern.ch/cvmfs/cms.cern.ch/data/99/65445c8d12e2bfb607d983b35d63b3ff05c2a0C HTTP/1.1" 200 40265468 "-" "cvmfs Fuse 2.2.0 6d0ae20b-f23e-4d5d-b5ca-600a8fb1d26c" TCP_HIT:HIER_NONE
"GET http://cvmfs-stratum-one.cern.ch/cvmfs/cms.cern.ch/data/c9/67573daf3d0702d63cd9dc07192ee2241d3f81C HTTP/1.1" 200 10343694 "-" "cvmfs Fuse 2.2.0 6d0ae20b-f23e-4d5d-b5ca-600a8fb1d26c" TCP_HIT:HIER_NONE
"GET http://cvmfs-stratum-one.cern.ch/cvmfs/cms.cern.ch/data/bc/809383d667295b9145b275faf9258ae315ae8cC HTTP/1.1" 200 59210512 "-" "cvmfs Fuse 2.2.0 6d0ae20b-f23e-4d5d-b5ca-600a8fb1d26c" TCP_HIT:HIER_NONE
"GET http://cvmfs-stratum-one.cern.ch/cvmfs/cms.cern.ch/data/5a/87b1f559ad1c321ca2de8931a76d0911375a70C HTTP/1.1" 200 26384894 "-" "cvmfs Fuse 2.2.0 6d0ae20b-f23e-4d5d-b5ca-600a8fb1d26c" TCP_HIT:HIER_NONE
"GET http://cvmfs-stratum-one.cern.ch/cvmfs/cms.cern.ch/data/04/7e05e645cf1648b816032f56de9882f3416bdcC HTTP/1.1" 200 51276111 "-" "cvmfs Fuse 2.2.0 6d0ae20b-f23e-4d5d-b5ca-600a8fb1d26c" TCP_HIT:HIER_NONE
"GET http://cvmfs-stratum-one.cern.ch/cvmfs/cms.cern.ch/data/f7/a0b90ecea8de791333b9cd66f0587ea8cbc912C HTTP/1.1" 200 55211160 "-" "cvmfs Fuse 2.2.0 6d0ae20b-f23e-4d5d-b5ca-600a8fb1d26c" TCP_HIT:HIER_NONE
"GET http://cvmfs-stratum-one.cern.ch/cvmfs/cms.cern.ch/data/fb/9a3d704a7e91ffee4c2e6f72d46baa576ed915P HTTP/1.1" 200 11397272 "-" "cvmfs Fuse 2.2.0 6d0ae20b-f23e-4d5d-b5ca-600a8fb1d26c" TCP_HIT:HIER_NONE
"GET http://cvmfs-stratum-one.cern.ch/cvmfs/cms.cern.ch/data/f7/05406ac4ad8d393e5d32321723396d631701d2C HTTP/1.1" 200 12655967 "-" "cvmfs Fuse 2.2.0 6d0ae20b-f23e-4d5d-b5ca-600a8fb1d26c" TCP_HIT:HIER_NONE
"GET http://cvmfs-stratum-one.cern.ch/cvmfs/cms.cern.ch/data/ec/47fb249669f741a690e372968d2107b7f432b3C HTTP/1.1" 200 12655981 "-" "cvmfs Fuse 2.2.0 6d0ae20b-f23e-4d5d-b5ca-600a8fb1d26c" TCP_HIT:HIER_NONE
"GET http://cvmfs-stratum-one.cern.ch/cvmfs/atlas.cern.ch/data/10/b0a217e585a43164f6c6a28908795ebe879885C HTTP/1.1" 200 13344323 "-" "cvmfs Fuse 2.2.0 cde1ef90-c9c9-4e9e-9c99-e94dd884ad14" TCP_HIT:HIER_NONE
"GET http://cvmfs-stratum-one.cern.ch/cvmfs/cms.cern.ch/data/8a/d18585945dc9ff56a13b87ab284c0e11fc65d5C HTTP/1.1" 200 12655972 "-" "cvmfs Fuse 2.2.0 6d0ae20b-f23e-4d5d-b5ca-600a8fb1d26c" TCP_HIT:HIER_NONE
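
For anyone who wants to check their own cache, request-hit and byte-hit ratios like the ones quoted earlier in the thread can be estimated from such log lines. The following is a minimal sketch that assumes the custom log layout shown in the post above (request, status, bytes, referer, user agent, result code); a different squid logformat would need a different pattern. Counting TCP_REFRESH_UNMODIFIED as a hit is a choice made here to match the post's reading that the second download came from the cache. The names `ENTRY`, `CACHE_RESULTS` and `hit_ratios` are hypothetical.

```python
import re

# Pattern for the tail of one entry in the log layout shown in this thread:
#   "<request>" <status> <bytes> "<referer>" "<user agent>" <RESULT>:<HIERARCHY>
ENTRY = re.compile(
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<bytes>\d+) '
    r'"[^"]*" "[^"]*" (?P<result>[A-Z_]+):(?P<hier>[A-Z_]+)'
)

# Assumption: these squid result codes are treated as "served from cache".
CACHE_RESULTS = {"TCP_HIT", "TCP_MEM_HIT", "TCP_REFRESH_UNMODIFIED"}


def hit_ratios(lines):
    """Return (request_hit_ratio, byte_hit_ratio) for the given log lines."""
    requests = hits = total_bytes = hit_bytes = 0
    for line in lines:
        match = ENTRY.search(line)
        if match is None:
            continue  # skip lines that do not follow the assumed layout
        size = int(match.group("bytes"))
        requests += 1
        total_bytes += size
        if match.group("result") in CACHE_RESULTS:
            hits += 1
            hit_bytes += size
    if requests == 0 or total_bytes == 0:
        return 0.0, 0.0
    return hits / requests, hit_bytes / total_bytes


# The four .vdi.gz entries quoted in the post above.
SAMPLE = [
    '"GET http://lhcathomeclassic.cern.ch/sixtrack/download/ATLASM_2017_03_01.vdi.gz HTTP/1.1" 200 709837529 "-" "BOINC client (x86_64-pc-linux-gnu 7.6.31)" TCP_MISS:HIER_DIRECT',
    '"GET http://lhcathomeclassic.cern.ch/sixtrack/download/ATLASM_2017_03_01.vdi.gz HTTP/1.1" 200 709837535 "-" "BOINC client (x86_64-pc-linux-gnu 7.6.31)" TCP_REFRESH_UNMODIFIED:HIER_DIRECT',
    '"GET http://lhcathomeclassic.cern.ch/sixtrack/download/CMS_2016_10_31.vdi.gz HTTP/1.1" 200 665580408 "-" "BOINC client (x86_64-pc-linux-gnu 7.6.31)" TCP_MISS:HIER_DIRECT',
    '"GET http://lhcathomeclassic.cern.ch/sixtrack/download/CMS_2016_10_31.vdi.gz HTTP/1.1" 200 665580414 "-" "BOINC client (x86_64-pc-linux-gnu 7.6.31)" TCP_REFRESH_UNMODIFIED:HIER_DIRECT',
]

if __name__ == "__main__":
    request_ratio, byte_ratio = hit_ratios(SAMPLE)
    print(f"request hits: {request_ratio:.1%}, byte hits: {byte_ratio:.1%}")
```

If no entry from a known-repeated download ever shows a cache result code, files are probably not being stored at all, which would match the experience Yeti described.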

gyllic Joined: 9 Dec 14 Posts: 202 Credit: 2,539,793 RAC: 175
Message 29902 - Posted: 11 Apr 2017, 14:54:28 UTC - in response to Message 29652.

Thanks for the information and sorry for the late response.

"To be honest I'm not really sure if the proxy setting works from inside an ATLAS VM as I redirect all HTTP traffic from the VM to CERN through my proxy using a set of netfilter rules."

Is this the only reason why you use the approach with policy-based routing, or are there other benefits?
How do you manage to redirect only the VM traffic to the proxy?
Do you think a Banana Pi (similar to a Raspberry Pi) is powerful enough to do the job as proxy server without losing too much benefit because of the hardware?
Are you using iptables and iproute2?

computezrmle Volunteer moderator Volunteer developer Volunteer tester Help desk expert Joined: 15 Jun 08 Posts: 2683 Credit: 286,884,951 RAC: 55,805
Message 29904 - Posted: 11 Apr 2017, 16:35:07 UTC - in response to Message 29902.

"Is this the only reason why you use the approach with policy-based routing, or are there other benefits?"
I use policy-based routing mainly for the ATLAS and CMS VMs, although ATLAS is (now) able to read BOINC's proxy setting.

"How do you manage to redirect only the VM traffic to the proxy?"
I mark the relevant packets (from user: boinc; destination: CERN; to-port: 80, 3125, 3128) and send them via an additional routing table to my proxy instead of the standard gateway. The proxy is configured to handle normal traffic as well as intercepted traffic (extra port).

"Do you think a Banana Pi (similar to a Raspberry Pi) is powerful enough to do the job as proxy server without losing too much benefit because of the hardware?"
CPU: OK
RAM: OK (128 MB cache is more than enough to serve thousands of the small ATLAS or CMS files)
Disk: ?? I suggest allocating at least 10-30 GB for the big files
Network: speed should match your LAN
Configure squid to cache small files only in RAM and big files only on disk.

"Are you using iptables and iproute2?"
iptables, iproute2, conntrack-tools. It's all included in my Linux distribution.
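
The marking and routing steps described above could look roughly like the commands generated by the sketch below. This is not computezrmle's actual rule set, which was not posted. The mark value, the routing table number, the destination network (a documentation-only range standing in for CERN's real addresses) and the proxy address are all placeholder assumptions. The script only builds and prints the commands, it does not execute them, and the intercepting side on the proxy itself is not covered.

```python
def pbr_commands(boinc_user, cern_net, proxy_ip, mark=1, table=100,
                 ports=(80, 3125, 3128)):
    """Build (but do not run) commands for a simple policy-based routing setup.

    All argument values are supplied by the caller; nothing here is taken
    from a real configuration.
    """
    port_list = ",".join(str(port) for port in ports)
    return [
        # 1. Mark outgoing TCP packets of the BOINC user going to the
        #    given destination network on the given ports.
        ["iptables", "-t", "mangle", "-A", "OUTPUT",
         "-m", "owner", "--uid-owner", boinc_user,
         "-p", "tcp", "-d", cern_net,
         "-m", "multiport", "--dports", port_list,
         "-j", "MARK", "--set-mark", str(mark)],
        # 2. Send marked packets to an additional routing table.
        ["ip", "rule", "add", "fwmark", str(mark), "table", str(table)],
        # 3. In that table, use the proxy host as the default gateway.
        ["ip", "route", "add", "default", "via", proxy_ip, "table", str(table)],
    ]


if __name__ == "__main__":
    # Placeholder values: 192.0.2.0/24 is a documentation range, not CERN's
    # network, and 192.168.1.2 is just an example LAN address for the proxy.
    for command in pbr_commands("boinc", "192.0.2.0/24", "192.168.1.2"):
        print(" ".join(command))
```

Keeping the commands as data makes it easy to review them before running anything as root on the crunching host.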

gyllic Joined: 9 Dec 14 Posts: 202 Credit: 2,539,793 RAC: 175
Message 29964 - Posted: 18 Apr 2017, 17:08:10 UTC - in response to Message 29904.

Thanks for your help!

LHC@home