Thread 'Did cvmfs download ~150GB of data two days ago?'

Author	Message
wujj123456 Joined: 14 Sep 08 Posts: 53 Credit: 84,750,345 RAC: 50,295
Message 46561 - Posted: 31 Mar 2022, 8:13:59 UTC Last modified: 31 Mar 2022, 8:17:39 UTC

I set up native apps for the Theory application, and thus installed cvmfs. I just noticed that from 2022-03-29 01:16 GMT-7 to 2022-03-29 02:46 GMT-7 (accurate to a minute or two), my server that runs LHC downloaded at full speed for more than an hour, totaling around 150GB of data. From the logging on my router, I can see that all of the traffic came from 2606:4700:3033::6815:48a2, which is a Cloudflare address. I then retrieved the syslog for my system, and the cvmfs-related logs stood out: https://pastebin.com/rTVx9r3C. s1asgc-cvmfs.openhtc.io resolves to that exact address: https://pastebin.com/YfiR9SKB

Unfortunately I have a data cap from my ISP, so I need to be a bit more careful about such incidents. I've been running native Theory on the same server for a year or two now and this is the first time I've noticed such a thing happening. I haven't touched its setup for quite a while, so I am fairly confident nothing has changed on my end. Was this some one-time big update? A bug? Or is it expected from time to time?
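To put the figures from this post in perspective, here is a minimal Python sketch that converts the reported volume and time window into an average transfer rate. It assumes decimal units (1 GB = 10^9 bytes) and takes the window as exactly 90 minutes (01:16 to 02:46); the function name is only illustrative.

```python
def average_rate_mbit_per_s(total_bytes: float, seconds: float) -> float:
    """Average throughput in Mbit/s for a transfer of total_bytes over seconds."""
    return total_bytes * 8 / seconds / 1e6


# Figures taken from the post; decimal gigabytes are an assumption.
total_bytes = 150e9        # roughly 150 GB downloaded
duration_s = 90 * 60       # 01:16 to 02:46, i.e. 90 minutes

rate = average_rate_mbit_per_s(total_bytes, duration_s)
print(f"Average rate: {rate:.0f} Mbit/s")
```

Under these assumptions the download averaged a little over 220 Mbit/s for the whole window, which fits the description of a link running at full speed.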

computezrmle Volunteer moderator, Volunteer developer, Volunteer tester, Help desk expert Joined: 15 Jun 08 Posts: 2739 Credit: 301,884,168 RAC: 83,686
Message 46562 - Posted: 31 Mar 2022, 8:47:38 UTC - in response to Message 46561.

"s1asgc-cvmfs.openhtc.io"

Yes, this is a CVMFS server (more precisely: a proxy alias) hosted by Cloudflare on behalf of CERN/Fermilab. Large downloads happen from time to time, although 150 GB within 1 h is very unusual. CVMFS acts like a cache and tries to use as much data as possible from its local store.

"Unfortunately I have data cap from ISP ..."

Your task list shows lots of ATLAS tasks besides the Theory tasks. Be aware that each ATLAS task downloads an EVNT file of ~400 MB plus lots of smaller files from CVMFS/Frontier. It also uploads a result file of ~140 MB.

You may consider running a local Squid proxy to increase the cache hit ratio, and checking your CVMFS configuration. See:
https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5473
https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5474
https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5594
https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5595
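For planning against a data cap, the per-task figures quoted above can be turned into a rough monthly estimate. The following is only a sketch: the number of tasks per day is a hypothetical input, the "lots of smaller files" from CVMFS/Frontier are ignored, and decimal units (1 GB = 1000 MB) are assumed.

```python
def monthly_atlas_traffic_gb(tasks_per_day: float,
                             download_mb: float = 400.0,  # EVNT file size from the post
                             upload_mb: float = 140.0,    # result file size from the post
                             days: int = 30) -> tuple[float, float]:
    """Return (download_gb, upload_gb) per month for the large ATLAS files only."""
    download_gb = tasks_per_day * download_mb * days / 1000
    upload_gb = tasks_per_day * upload_mb * days / 1000
    return download_gb, upload_gb


# Hypothetical workload: 10 ATLAS tasks per day.
down_gb, up_gb = monthly_atlas_traffic_gb(tasks_per_day=10)
print(f"Download: {down_gb:.0f} GB/month, upload: {up_gb:.0f} GB/month")
```

Because the smaller CVMFS/Frontier files are left out, the real total would be somewhat higher than this estimate.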

wujj123456 Joined: 14 Sep 08 Posts: 53 Credit: 84,750,345 RAC: 50,295
Message 46563 - Posted: 31 Mar 2022, 9:14:26 UTC - in response to Message 46562. Last modified: 31 Mar 2022, 9:35:34 UTC

Thanks for the reply. For the data cap, I mostly need to understand how much data to allocate for BOINC. Regarding ATLAS, it was running on my other Windows machine, and I have set a concurrency limit. Its usage is indeed high but very predictable, and I've set aside enough for that. The server I mentioned here is S8026 in my list of computers, which only runs native Theory. Usually it has pretty low usage, but this download caught me a bit off guard.

I actually have Squid set up directly on my router, and the hit rate is superb, 99%+ in terms of bytes, for my Windows machine running both ATLAS and Theory in vbox. I didn't observe a similarly excessive download during the same period from the vbox WUs. Does that mean I might be better off forgoing the native installation and relying fully on Squid caching if I want predictable bandwidth usage?

wujj123456 Joined: 14 Sep 08 Posts: 53 Credit: 84,750,345 RAC: 50,295
Message 46564 - Posted: 31 Mar 2022, 9:32:37 UTC - in response to Message 46562. Last modified: 31 Mar 2022, 9:35:41 UTC

"Large downloads happen from time to time, although 150 GB within 1 h is very unusual. CVMFS acts like a cache and tries to use as much data as possible from its local store."

This reminded me of a few interesting details. The server only had 70-80GB of available space left, and it was not filled up AFAIK. There is no way it could have stored 150GB of data. Meanwhile, the Squid cache I configured is 32GB on disk, but 99%+ of hit bytes are served from the 4GB memory cache. It doesn't seem that I even need more than 4GB of data from cvmfs, assuming the vbox Theory workload is similar to native apart from the setup. I'm pretty curious what this download was actually doing. It kinda feels like a bug TBH...

computezrmle Volunteer moderator, Volunteer developer, Volunteer tester, Help desk expert Joined: 15 Jun 08 Posts: 2739 Credit: 301,884,168 RAC: 83,686
Message 46565 - Posted: 31 Mar 2022, 11:59:53 UTC - in response to Message 46564.

"The server I mentioned here is S8026"

Nobody but the owner can see a computer's name. If you want to refer to a specific machine, you can always use a link (it includes the DB ID) like: https://lhcathome.cern.ch/lhcathome/show_host_detail.php?hostid=10595991

"I actually have Squid setup at my router directly"

Did you check whether your squid correctly rejects requests initiated from outside your LAN? I suspect you intercept all HTTP traffic from inside your LAN directly at the router and force it through squid, right? This means at least traffic to destination port 80. Are you aware that some CVMFS/Frontier servers use ports 8000 or 8080? It would be worth routing them through squid as well.

"the Squid cache I configured is 32GB on disk but 99%+ of hit bytes are served from the 4GB memory cache."

Did you really configure 4 GB of RAM to be used by squid? That seems to be far too much if you mainly use it for BOINC. You may try out the tuning options from my HowTo and reduce this to 256 MB. This would leave more RAM for other uses on the squid box, e.g. for disk cache. A size of 4-10 GB is suggested for the CVMFS disk cache.

"vbox theory workload is similar to native"

If you run Theory vbox, each VM will set up its own CVMFS cache (which is by now old and degraded). Hence, each task will send out lots of update requests. The updates all get lost when the VM shuts down. Your squid should cover most of them, but it's more efficient to run Theory native and keep the data in the local CVMFS cache on the crunching box.

"It kinda feels like a bug TBH..."

It could have been a refresh after a maintenance restart at Cloudflare or at CERN/Fermilab. We might never find out what happened.

maeax Joined: 2 May 07 Posts: 2292 Credit: 179,134,044 RAC: 19,998
Message 46566 - Posted: 31 Mar 2022, 14:16:43 UTC - in response to Message 46564.

"I'm pretty curious what this download is actually doing. It kinda feels like a bug TBH..."

For comparison: within my LAN, the traffic between the RedHat Squid box and the PCs running ATLAS and Theory tasks came to 1.7 TByte sent and 1.7 TByte received in one month. I don't know what a normal transfer rate would be.

wujj123456 Joined: 14 Sep 08 Posts: 53 Credit: 84,750,345 RAC: 50,295
Message 46567 - Posted: 31 Mar 2022, 17:21:47 UTC - in response to Message 46565. Last modified: 31 Mar 2022, 17:23:21 UTC

"Did you check whether your squid correctly rejects requests initiated from outside your LAN? I suspect you intercept all HTTP traffic from inside your LAN directly at the router and force it through squid, right? This means at least traffic to destination port 80. Are you aware that some CVMFS/Frontier servers use ports 8000 or 8080? It would be worth routing them through squid as well."

Yes, the squid proxy only listens on internal interfaces and port 80. I can add a monitoring rule to check how much traffic there is on 8000 and 8080, but from what I have seen so far, I don't think there is any major traffic from the vbox WUs that the current setup fails to capture.

"You may try out the tuning options from my HowTo and reduce this to 256 MB. This would leave more RAM for other uses on the squid box, e.g. for disk cache. A size of 4-10 GB is suggested for the CVMFS disk cache."

Good to know. I intend to capture system updates and Steam updates too, which is why it's large. They are rare enough, though, that most of the time LHC simply gets a great hit rate. :-)

"If you run Theory vbox, each VM will set up its own CVMFS cache (which is by now old and degraded). Hence, each task will send out lots of update requests. The updates all get lost when the VM shuts down. Your squid should cover most of them, but it's more efficient to run Theory native and keep the data in the local CVMFS cache on the crunching box."

That's what I thought, and native also consumes much less memory. However, if such big downloads happen often enough, it would change the balance. Hence my question: I am trying to understand what happened and how likely or how often it could happen again.
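The monitoring check mentioned in this post (how much traffic goes to ports 8000 and 8080 and therefore bypasses a proxy that only intercepts port 80) could be evaluated along these lines. This is a sketch only: the flow records are invented sample data, and the (destination port, bytes) record format is an assumption, not the output of any particular router or tool.

```python
from collections import defaultdict

PROXIED_PORTS = {80}             # port the Squid setup in this thread intercepts
CANDIDATE_PORTS = {8000, 8080}   # ports mentioned for some CVMFS/Frontier servers


def bytes_by_port(flows):
    """Sum transferred bytes per destination port from (dst_port, n_bytes) records."""
    totals = defaultdict(int)
    for dst_port, n_bytes in flows:
        totals[dst_port] += n_bytes
    return dict(totals)


def unproxied_share(flows):
    """Fraction of bytes on the watched ports that went to 8000/8080 instead of 80."""
    totals = bytes_by_port(flows)
    watched = PROXIED_PORTS | CANDIDATE_PORTS
    total = sum(totals.get(port, 0) for port in watched)
    if total == 0:
        return 0.0
    missed = sum(totals.get(port, 0) for port in CANDIDATE_PORTS)
    return missed / total


# Hypothetical sample records, for illustration only.
sample_flows = [(80, 900), (8000, 60), (80, 100), (8080, 40), (443, 500)]
share = unproxied_share(sample_flows)
print(f"Share of watched traffic bypassing the proxy: {share:.1%}")
```

A share close to zero over a representative period would support the impression that little traffic escapes the current setup; a larger share would suggest routing those ports through squid too.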