Message boards : ATLAS application : WUs fail after requests to ccsqfatlasli01.in2p3.fr

computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Joined: 15 Jun 08
Posts: 1992
Credit: 143,857,330
RAC: 96,836
Message 29778 - Posted: 3 Apr 2017, 6:01:35 UTC

@David Cameron

ATLAS WUs try again to contact ccsqfatlasli01.in2p3.fr on TCP port 23128 and then fail.
IIRC from a thread on the old ATLAS board, this is caused by one or more failed services inside the CERN network.

Unfortunately, the old thread is no longer visible to me, so I can't link to it.
There is only one hint on the new board:
https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=4146&postid=29216
ID: 29778
computezrmle
Message 29783 - Posted: 3 Apr 2017, 10:59:47 UTC

@David Cameron

New WUs are starting now. That's the good news.

They still request access to ccsqfatlasli01.in2p3.fr, but now they ignore the firewall drop and fall back to another partner:
atlasfrontier-ai.cern.ch on TCP port 8000.

If atlasfrontier-ai.cern.ch:8000 is a permanent solution, TCP port 8000 should be mentioned on the FAQ page.
ID: 29783
David Cameron
Project administrator
Project developer
Project scientist

Joined: 13 May 14
Posts: 366
Credit: 13,262,778
RAC: 6,995
Message 29784 - Posted: 3 Apr 2017, 11:02:32 UTC - in response to Message 29783.  

Good point, thanks for mentioning it.

What I did this morning was to add atlasfrontier-ai.cern.ch as a fallback in case the normal database servers are down. I've had one completed task and one running for 30 mins so looks like this is working ok.
ID: 29784
computezrmle
Message 29785 - Posted: 3 Apr 2017, 11:06:13 UTC - in response to Message 29784.  

Good point, thanks for mentioning it.

What I did this morning was to add atlasfrontier-ai.cern.ch as a fallback in case the normal database servers are down. I've had one completed task and one running for 30 mins so looks like this is working ok.

So I guess it's permanent?
ID: 29785
Yeti
Volunteer moderator
Joined: 2 Sep 04
Posts: 431
Credit: 117,525,067
RAC: 0
Message 29786 - Posted: 3 Apr 2017, 11:38:39 UTC - in response to Message 29778.  

ATLAS WUs try again to contact ccsqfatlasli01.in2p3.fr on TCP port 23128 and then fail.

Is this a typo? 3128 instead of 23128?


Supporting BOINC, a great concept !
ID: 29786
computezrmle
Message 29787 - Posted: 3 Apr 2017, 11:48:10 UTC - in response to Message 29786.  

ATLAS WUs try again to contact ccsqfatlasli01.in2p3.fr on TCP port 23128 and then fail.

Is this a typo? 3128 instead of 23128?

No. The request is really to port 23128.
I guess it's a substitute for Squid's standard port 3128, used when more than one Squid instance runs on the same machine.
ID: 29787
David Cameron
Message 29789 - Posted: 3 Apr 2017, 13:35:43 UTC - in response to Message 29785.  

Good point, thanks for mentioning it.

What I did this morning was to add atlasfrontier-ai.cern.ch as a fallback in case the normal database servers are down. I've had one completed task and one running for 30 mins so looks like this is working ok.

So I guess it's permanent?


Yes, I will keep it there. But it should only be used in exceptional circumstances when the other two servers are not working.

I have added this port and 23128 (not a typo) to the firewall info page.
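
The fallback order described above can be illustrated with a small shell sketch. This is not how the ATLAS tasks implement it internally; it only shows the idea of trying servers in priority order and using the first one that answers. The function names and the PROBE hook are made up for this example, and the nc flags are the ones used elsewhere in this thread.

```shell
#!/bin/sh
# Sketch only: print the first reachable server from a priority-ordered list.
# probe_default, first_reachable and PROBE are hypothetical names for this example.

probe_default() {
    # $1 = host, $2 = port; succeeds if a TCP connection opens within 5 seconds
    nc -z -w 5 "$1" "$2" >/dev/null 2>&1
}

# The probe can be swapped out (for example in tests) by setting PROBE.
PROBE=${PROBE:-probe_default}

first_reachable() {
    # Arguments: host:port pairs, highest priority first.
    for target in "$@"; do
        host=${target%:*}     # everything before the last colon
        port=${target##*:}    # everything after the last colon
        if "$PROBE" "$host" "$port"; then
            printf '%s\n' "$target"
            return 0
        fi
    done
    return 1                  # no server answered
}
```

Called as `first_reachable ccsqfatlasli01.in2p3.fr:23128 atlasfrontier-ai.cern.ch:8000`, it prints the first entry that accepts a connection, so the fallback is only reached when the earlier entries fail.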
ID: 29789
computezrmle
Message 29790 - Posted: 3 Apr 2017, 14:04:10 UTC - in response to Message 29789.  

@David Cameron

I have added this port and 23128 (not a typo) to the firewall info page.

Thank you.


Sorry to bother you.
nc -z -v -w 5 atlasfrontier-ai.cern.ch 8000
Connection to atlasfrontier-ai.cern.ch 8000 port [tcp/irdmi] succeeded!

O.K.


nc -z -v -w 5 ccsqfatlasli01.in2p3.fr 23128
nc: connect to ccsqfatlasli01.in2p3.fr port 23128 (tcp) timed out: Operation now in progress

Not O.K. although my firewall is now open.
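
For anyone repeating these checks: a timeout means no answer came back at all, whereas "Connection refused" would mean the host answered but nothing listens on that port. With the local firewall open, a timeout therefore points at something beyond the local network. A rough sketch for sorting nc results, assuming the message wording shown above (other nc implementations may word it differently; the function names are made up for this example):

```shell
#!/bin/sh
# Sketch only: map one line of `nc -z -v` output to a short status word.
# The patterns assume OpenBSD-style nc messages like the ones quoted above.

classify_nc_output() {
    # $1 = one status line printed by nc -v
    case "$1" in
        *succeeded*)            echo "open" ;;       # TCP handshake completed
        *"timed out"*)          echo "no-answer" ;;  # packets dropped or host not reachable
        *"Connection refused"*) echo "closed" ;;     # host answered, port not listening
        *)                      echo "unknown" ;;
    esac
}

check_port() {
    # $1 = host, $2 = port
    # nc -v writes its status line to stderr, so merge stderr into stdout.
    line=$(nc -z -v -w 5 "$1" "$2" 2>&1 | tail -n 1)
    printf '%s:%s %s\n' "$1" "$2" "$(classify_nc_output "$line")"
}
```

`check_port atlasfrontier-ai.cern.ch 8000` then prints the host, the port, and one of the status words.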
ID: 29790
David Cameron
Message 29811 - Posted: 4 Apr 2017, 10:08:45 UTC - in response to Message 29790.  

Thanks for this, it's actually a mistake in our ATLAS@Home configuration. The server should be ccfrontier.in2p3.fr:23128. I've fixed this now for new tasks.
ID: 29811
computezrmle
Message 29813 - Posted: 4 Apr 2017, 10:52:04 UTC - in response to Message 29811.  

Thanks for this, it's actually a mistake in our ATLAS@Home configuration. The server should be ccfrontier.in2p3.fr:23128. I've fixed this now for new tasks.


That looks very good!
:-)

nc -z -v -w 5 ccfrontier.in2p3.fr 23128
Connection to ccfrontier.in2p3.fr 23128 port [tcp/*] succeeded!


And it delivers valid content:

telnet ccfrontier.in2p3.fr 23128
Trying 134.158.239.12...
Connected to ccfrontier.in2p3.fr.
Escape character is '^]'.
GET http://ccsqfatlasli01.in2p3.fr:23128/ccin2p3-AtlasFrontier/Frontier/type=frontier_request:1:DEFAULT&encoding=BLOBzip5&p1=eNplj8EKwyAMhl8l5DxGZ3ftIWi6ClaLEcZOvv9bzAlbIzsEku--85MIB7YFDsocS43JsXdAAjgQvHwdhR6joYOm280HpwM0.OnnupqbKi.pPkrxO3dVzQhrTjsglUBSbUohraE6K3jF1h1mmj4xphXCP3lunHn8b4HbDBTd8NMCpjN9djPe35fFTC8_ HTTP/1.0

HTTP/1.0 200 OK
Last-Modified: Thu, 09 Feb 2017 15:34:02 GMT
Content-Type: text/xml;charset=US-ASCII
Server: Apache-Coyote/1.1
Cache-Control: max-age=3000, stale-if-error=1
Date: Tue, 04 Apr 2017 10:02:21 GMT
Age: 2385
Content-Length: 596
X-Cache: HIT from ccosvms0012.in2p3.fr
Via: 1.1 ccosvms0012.in2p3.fr:23128 (squid/frontier-squid-2.7.STABLE9-26.1)
Connection: close

<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE frontier SYSTEM "http://frontier.fnal.gov/frontier.dtd">
<frontier version="3.34" xmlversion="1.0">
<transaction payloads="1">
<payload type="frontier_request" version="1" encoding="BLOBzip5">
<data>
<keepalive />eF5jY2Bg4A1wDHL1C4n383dx9XRhA4pw+YX6OrkGaRgaaIK4PFAFIY7u2OWdPTx9XHDr54bI49LO
HRwZHO/pFxzi6esK4nOEOQY5ezgGGbGDeEyGxiCK0QjCMUFwFI0MDA11DUx0DS3jDcysDEysDEz1
DCyNjE0tDQwMFNx9Q9gBzO8mxA==</data>
<quality error="0" md5="4ea0ac0c31d9a0422b9ceb3e7fcf47b4" records="1" full_size="223"/>
</payload>
</transaction>
</frontier>
Connection closed by foreign host.
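
In the headers above, "X-Cache: HIT" shows that the answer came from the Squid cache, and Age 2385 is still below max-age=3000, so the cached copy counted as fresh. To pull such fields out of a saved response without reading the whole dump, something like the following could be used. It is only a sketch; header_value is a made-up helper name, and it expects the response (status line first) on stdin.

```shell
#!/bin/sh
# Sketch only: print the value of one HTTP response header read from stdin.
# header_value is a hypothetical helper, not part of any ATLAS or Frontier tool.

header_value() {
    # $1 = header name, matched case-insensitively.
    # Reading stops at the blank line that ends the header block,
    # so text in the body can never be mistaken for a header.
    awk -v name="$1" '
        { sub(/\r$/, "") }              # tolerate CRLF line endings
        $0 == "" { exit }               # end of headers
        {
            i = index($0, ":")
            if (i > 0 && tolower(substr($0, 1, i - 1)) == tolower(name)) {
                value = substr($0, i + 1)
                sub(/^[ \t]+/, "", value)   # drop leading whitespace
                print value
                exit
            }
        }
    '
}
```

With the response saved to a file (response.txt is just an example name), `header_value X-Cache < response.txt` prints the cache status and `header_value Age < response.txt` prints the age in seconds.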
ID: 29813
