slow gfal-copy
Author | Message |
---|---|
Send message Joined: 15 Jun 08 Posts: 2549 Credit: 255,331,238 RAC: 60,653 |
Result upload via gfal-copy is extremely slow: no more than 10 % of normal speed, although my internet connection is otherwise idle. <edit> Console messages: "Command timed out after 2400 seconds!" -> 48 MB were uploaded. "*newErr is not NULL impossible to overwrite ... old error wasHTTP 404 : File not found" </edit> <edit2> The following upload was successful. </edit2> |
Send message Joined: 29 Aug 05 Posts: 1065 Credit: 7,906,627 RAC: 12,738 |
We've seen the "impossible to overwrite" message before when a job has successfully completed but for some reason Condor thinks it hasn't. It then requeues the job, and when the job is re-run it finds the output already there, throws the overwrite message, and deletes the file. The third try then usually succeeds. I used to see this when we were running CRAB; with WMAgent I don't have access to the logs, unfortunately. |
Send message Joined: 15 Jun 08 Posts: 2549 Credit: 255,331,238 RAC: 60,653 |
What made me wonder was the slow upload rate. Only 48 MB had been uploaded in the 2400 s before the timeout cancelled the transfer. That is around 20 kB/s; I usually see 600 kB/s. The connections to CERN (esp. Condor) stayed established during the whole period, and the 2nd slot finished a job and uploaded at normal speed. What I did not notice is whether the upload was interrupted by BOINC, e.g. if the client paused the CMS VM for a short period to run another project. |
Send message Joined: 29 Aug 05 Posts: 1065 Credit: 7,906,627 RAC: 12,738 |
|
Send message Joined: 15 Jun 08 Posts: 2549 Credit: 255,331,238 RAC: 60,653 |
Hi Ivan, I just noticed some unusual gfal-copy activity in a VM instead of its regular end (it's not yet finished). In addition, the job overview page shows "something" that could be the beginning of a dip. Is it a false alarm or do the servers need a kick? |
Send message Joined: 15 Jun 08 Posts: 2549 Credit: 255,331,238 RAC: 60,653 |
"... Is it a false alarm or do the servers need a kick?" It finally finished. Maybe just a hiccup. |
Send message Joined: 29 Aug 05 Posts: 1065 Credit: 7,906,627 RAC: 12,738 |
|
Send message Joined: 15 Jun 08 Posts: 2549 Credit: 255,331,238 RAC: 60,653 |
"...You may have also noticed that since last night we have switched to a different proxy." You mean s1x-cvmfs.openhtc.io? They have worked like a charm ever since my hosts have been running v47.80. Fast and reliable, nothing to complain about, either DIRECT or as parents of my local Squid. :-) Cheers |
Send message Joined: 29 Aug 05 Posts: 1065 Credit: 7,906,627 RAC: 12,738 |
"...You may have also noticed that since last night we have switched to a different proxy." Yes, that seems to be the badger. I got this message yesterday: [Adding Ivan -- we're talking about commercial (although free) caching, see http://openhtc.io, so far for LHC@Home CMS CVMFS usage and for U.S. CMS Opportunistic usage] followed by: Ivan, could you please update that configuration? Change the line <proxy url="http://lhchomeproxy.cern.ch:3125"/> to the two lines <proxyconfig url="http://lhchomeproxy.cern.ch/wpad.dat"/> <proxyconfig url="http://lhchomeproxy.fnal.gov/wpad.dat"/> As I said earlier, we'll probably be changing between the two a few times to establish an optimum. There was a suggestion that I submit considerably shorter jobs for the study, but Laurence thinks he can extract the start-up times from existing log files. |
Send message Joined: 15 Jun 08 Posts: 2549 Credit: 255,331,238 RAC: 60,653 |
"... Laurence thinks he can extract the start-up times from existing log files." If you look at the statistics, the following numbers could be interesting. Timeframe: 2018-01-12 0:00 until 2018-01-12 22:48. Total requests from CMS VMs to s1x-cvmfs.openhtc.io (forced to be checked by my local Squid): 10520 (roughly 1.6 GB). Requests forwarded externally: 1485 (byte count not calculated by default but available in the logs). Request efficiency of the local cache: >85 %. This noticeably shortens the startup time of a VM. If other volunteers also use local proxies, it could have an impact on the overall timing statistics. |
Send message Joined: 29 Aug 05 Posts: 1065 Credit: 7,906,627 RAC: 12,738 |
|
Send message Joined: 15 Jun 08 Posts: 2549 Credit: 255,331,238 RAC: 60,653 |
CMS uploads are currently very slow - only 10-12 % of normal upload speed. As ATLAS uploads run at 100 %, it's most likely not caused by anything on my side. |
Send message Joined: 18 Dec 15 Posts: 1828 Credit: 119,573,359 RAC: 44,704 |
"CMS uploads are currently very slow - only 10-12 % of normal upload speed." Same here. |
Send message Joined: 15 Jun 08 Posts: 2549 Credit: 255,331,238 RAC: 60,653 |
Today my CMS uploads run at 100 % speed. :-) |
Send message Joined: 18 Dec 15 Posts: 1828 Credit: 119,573,359 RAC: 44,704 |
"Today my CMS uploads run at 100 % speed." Here too :-) |
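
For anyone who wants to re-check the figures quoted in this thread, here is a minimal sketch of the arithmetic. The function names are made up for illustration, decimal units (1 MB = 1000 kB) are assumed, and every request the local Squid forwarded externally is treated as a cache miss, which is an assumption rather than something the logs in this thread confirm.

```python
def transfer_rate_kb_per_s(megabytes: float, seconds: float) -> float:
    """Average transfer rate in kB/s, assuming decimal units (1 MB = 1000 kB)."""
    return megabytes * 1000.0 / seconds


def cache_hit_ratio(total_requests: int, forwarded_requests: int) -> float:
    """Fraction of requests served locally, assuming each forwarded request is a miss."""
    return 1.0 - forwarded_requests / total_requests


# Figures taken from the posts above.
rate = transfer_rate_kb_per_s(48, 2400)   # 48 MB uploaded before the 2400 s timeout
share_of_normal = rate / 600.0            # 600 kB/s is the usual rate reported
hit_ratio = cache_hit_ratio(10520, 1485)  # requests seen vs. forwarded by the local proxy

print(f"upload rate: {rate:.0f} kB/s")
print(f"share of normal speed: {share_of_normal:.1%}")
print(f"local cache hit ratio: {hit_ratio:.1%}")
```

With these inputs the rate works out to 20 kB/s, roughly 3 % of the usual 600 kB/s, and the local cache answers a little under 86 % of the requests, which agrees with the ">85 %" quoted above.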
©2025 CERN