Recommended CVMFS Configuration for Native Apps - Comments and Questions
Joined: 15 Jun 08 · Posts: 1965 · Credit: 139,563,493 · RAC: 85,129
This is a discussion thread for comments and questions regarding the CVMFS configuration used by LHC@home: https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5594

A previous version of the HowTo and older comments can be found here: https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5342
Joined: 8 May 17 · Posts: 12 · Credit: 11,235,723 · RAC: 0
Hello,

My systems are all configured to use at least 4 GB of local CVMFS cache and, according to the stats, they seem to reach pretty high hit ratios. Here is a look at a couple of them:

[screenshot: CVMFS cache hit statistics]

Would a proxy still help here? Also, are there specific Squid options one should set with regard to caching CVMFS data (i.e. max object size, retention or refresh policy)?
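(Editor's note: for readers wondering where the local cache size mentioned above is set - it normally lives in /etc/cvmfs/default.local. The values below are only an illustration, not the project's recommended settings:)

```
# /etc/cvmfs/default.local -- illustrative values only, adapt to your machine
CVMFS_QUOTA_LIMIT=4096            # soft quota for the local cache, in MiB (~4 GB)
CVMFS_CACHE_BASE=/var/lib/cvmfs   # default cache location on most distributions
```

After editing, `sudo cvmfs_config reload` applies the change, and `cvmfs_config stat atlas.cern.ch` shows cache usage and hit statistics for a repository.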
Joined: 15 Jun 08 · Posts: 1965 · Credit: 139,563,493 · RAC: 85,129
> My systems ... seem to reach pretty high hit ratios.

Yes. Here are two major reasons:

- Each CVMFS client only serves tasks that are running on the same box (or inside the same VM). A single Squid, in contrast, serves all boxes and all VMs running at your site.
- Tasks like ATLAS or CMS make heavy use of CERN's Frontier system. Frontier requests data via HTTP but, unlike CVMFS, it has no local cache of its own. A local Squid closes this gap and serves most Frontier requests from its cache.

> Also are there specific Squid options one should set in regards to caching CVMFS data (ie. max object size, retention or refresh policy)?

It's all covered by the squid.conf in the Squid HowTo. Some of Squid's original settings were made decades ago and focus on surfing the web via slow connections. The suggestions in this forum extend the original settings and are based on experience and on analysing the data flow created by LHC@home. Surfing arbitrary internet pages with these settings is still possible, but since most of them use HTTPS the hit rates for them would drop to 0%.

Questions regarding specific Squid options should be asked in the Squid thread.
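(Editor's note: as a rough illustration of the kind of directives the Squid HowTo covers - the values below are placeholders, not the HowTo's actual recommendations:)

```
# squid.conf fragment -- placeholder values; see the Squid HowTo for real ones
cache_mem 256 MB                              # in-RAM object cache
maximum_object_size 1024 MB                   # allow large CVMFS/Frontier objects on disk
cache_dir ufs /var/spool/squid 20000 16 256   # ~20 GB disk cache
```

All three are standard Squid directives; the point of raising `maximum_object_size` is that Squid's historic default is far too small for the large files CVMFS can fetch.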
Joined: 15 Nov 14 · Posts: 589 · Credit: 21,798,687 · RAC: 6,756
Very nice; thanks. But I think it should be pointed out that the automatic configuration download no longer applies, as far as I can see:

sudo wget https://lhcathome.cern.ch/lhcathome/download/default.local -O /etc/cvmfs/default.local

Maybe it could be updated?
Joined: 17 Feb 17 · Posts: 42 · Credit: 2,589,736 · RAC: 2,602
> Very nice; thanks.

I had this problem as well. Probing immediately failed. Perhaps that file could be updated with the minimum needed configuration, although I'm still unclear how one can actually optimize the configuration when it is just 1 or 2 machines on the same connection.
Joined: 15 Jun 08 · Posts: 1965 · Credit: 139,563,493 · RAC: 85,129
> Perhaps that file could be updated with the minimum needed configuration ...

The file on the server is already up to date. Be aware that it includes two optional settings (with proxy / without proxy) and one of them has to be activated by the user.

In general: native apps require more settings to be made by the user. This is easier, faster and more reliable than guessing certain values. In addition, some steps must be done by root.

> although I'm still unclear how one can actually optimize their configuration

The simple answer: cache as much as possible, as close as possible to the point where it is used. To avoid wasted effort, focus on the major bottlenecks first.

More LHC@home specific:

- CVMFS is heavily used but has its own cache - one cache instance per machine. A machine can't share its CVMFS cache with other machines, and each VM counts as an individual machine. Outdated or missing data is requested from the project servers.
- Frontier is heavily used by ATLAS and CMS. It has no local cache of its own; each app sends all Frontier requests to the project servers.
- Cloudflare's openhtc.io infrastructure helps to distribute CVMFS and Frontier data. They run a very fast worldwide network, and one of their proxy caches will most likely be located much closer to your clients than any project server. VBox apps use openhtc.io by default, but users running native apps have to set "CVMFS_USE_CDN=yes" in their CVMFS configuration. This is disabled in the default configuration because lots of computers in various datacenters use special connections and require this to be set off.
- A local HTTP proxy closes the gap between openhtc.io and the local clients. It can cache data for all local CVMFS and Frontier clients as well as offload openhtc.io and the project servers.
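(Editor's note: put together, a native-app /etc/cvmfs/default.local along these lines matches the points above. The proxy address is a placeholder for a local Squid; sites without one would use DIRECT only:)

```
# /etc/cvmfs/default.local -- example only; adapt to your site
CVMFS_USE_CDN=yes                                    # route requests via Cloudflare's openhtc.io
CVMFS_HTTP_PROXY="http://192.168.1.10:3128;DIRECT"   # placeholder local Squid, falling back to direct
```

In `CVMFS_HTTP_PROXY`, a `;` separates fallback groups, so the client tries the local proxy first and only goes direct if the proxy is unreachable.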
Joined: 17 Feb 17 · Posts: 42 · Credit: 2,589,736 · RAC: 2,602
> Perhaps that file could be updated with the minimum needed configuration ...

How does one go about caching as much as possible?

Not sure what happened in my case, then, since as soon as I downloaded https://lhcathome.cern.ch/lhcathome/download/default.local -O /etc/cvmfs/default.local I got immediate failures after probing. Running the listed items in the HowTo fixed my issues, and I believe I also added the line containing openhtc.io.

Thank you for the help and excellent clarification.
Joined: 2 May 07 · Posts: 1508 · Credit: 48,472,146 · RAC: 122,109
> VBox apps use openhtc.io by default but users running native apps have to set "CVMFS_USE_CDN=yes" in their CVMFS configuration.

The release notes in the CVMFS documentation are for 2.7.5. The ATLAS applet in Windows is using CVMFS 2.6.3.
Joined: 15 Jun 08 · Posts: 1965 · Credit: 139,563,493 · RAC: 85,129
> Atlas-Applet in Windows is using CVMFS 2.6.3.

CVMFS_USE_CDN makes it easier to switch between the traditional CVMFS server list and the Cloudflare server list. Older setups had to configure this manually, which is still possible. It's all fine as long as an application from this project uses Cloudflare servers. Even CMS VMs that use v2.4.4.0 work fine.

Related to CVMFS_USE_CDN, it's more important to use a recent cvmfs-config-default package than to upgrade the CVMFS client: http://ecsft.cern.ch/dist/cvmfs/cvmfs-config/
Joined: 12 Jun 18 · Posts: 108 · Credit: 37,970,761 · RAC: 0
What does this mean?

cvmfs_config stat
/usr/bin/cvmfs_config: line 907: cd: /cvmfs/atlas.cern.ch: Transport endpoint is not connected
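(Editor's note: "Transport endpoint is not connected" typically means the FUSE process backing /cvmfs/atlas.cern.ch has crashed or been killed, leaving a stale mount point. A common recovery, assuming root access and an installed CVMFS client, is:)

```
# Unmount all stale /cvmfs repositories, then remount and verify the one in question
sudo cvmfs_config umount
sudo cvmfs_config probe atlas.cern.ch
```

If the probe still fails, `sudo cvmfs_config chksetup` can point out configuration problems.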
©2022 CERN