Message boards : ATLAS application : WUs fail after requests to ccsqfatlasli01.in2p3.fr

computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2386
Credit: 222,898,697
RAC: 138,219
Message 29778 - Posted: 3 Apr 2017, 6:01:35 UTC

@David Cameron

ATLAS WUs try again to contact ccsqfatlasli01.in2p3.fr on TCP port 23128 and then fail.
IIRC from a thread at the old ATLAS board this is due to one or more failed services inside the CERN network.

Unfortunately the old thread is no longer visible to me, so I can't link to it.
There is only one hint on the new board:
https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=4146&postid=29216
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2386
Credit: 222,898,697
RAC: 138,219
Message 29783 - Posted: 3 Apr 2017, 10:59:47 UTC

@David Cameron

New WUs are starting now. That's the good news.

They still request access to ccsqfatlasli01.in2p3.fr, but now they ignore the firewall drop and fall back to another server:
atlasfrontier-ai.cern.ch on TCP port 8000.

If atlasfrontier-ai.cern.ch:8000 is a permanent solution, TCP port 8000 should be mentioned on the FAQ page.
David Cameron
Project administrator
Project developer
Project scientist

Joined: 13 May 14
Posts: 387
Credit: 15,314,184
RAC: 0
Message 29784 - Posted: 3 Apr 2017, 11:02:32 UTC - in response to Message 29783.  

Good point, thanks for mentioning it.

What I did this morning was to add atlasfrontier-ai.cern.ch as a fallback in case the normal database servers are down. I've had one completed task and one running for 30 mins so looks like this is working ok.
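
The fallback order described here can be illustrated with a small shell sketch. It is not the actual ATLAS@Home configuration (which is not shown in this thread); the function name first_reachable is made up for illustration, and the probe reuses the nc options that appear elsewhere in this thread. It simply tries each host:port in the order given and prints the first one that accepts a TCP connection.

```shell
# Hypothetical helper: print the first "host:port" argument that accepts
# a TCP connection, trying the arguments in the order given.
# Returns non-zero if none of them is reachable.
first_reachable() {
    for endpoint in "$@"; do
        host=${endpoint%:*}    # text before the last colon
        port=${endpoint##*:}   # text after the last colon
        # -z: only probe, send no data; -w 5: give up after 5 seconds
        if nc -z -w 5 "$host" "$port" >/dev/null 2>&1; then
            printf '%s\n' "$endpoint"
            return 0
        fi
    done
    return 1
}
```

Called as first_reachable ccsqfatlasli01.in2p3.fr:23128 atlasfrontier-ai.cern.ch:8000, it would print the second endpoint whenever the first one times out, which mirrors the behaviour reported above. Listing the fallback last keeps it out of use while an earlier server still answers.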
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2386
Credit: 222,898,697
RAC: 138,219
Message 29785 - Posted: 3 Apr 2017, 11:06:13 UTC - in response to Message 29784.  

> Good point, thanks for mentioning it.
>
> What I did this morning was to add atlasfrontier-ai.cern.ch as a fallback in case the normal database servers are down. I've had one completed task and one running for 30 mins so looks like this is working ok.

So I guess it's permanent?
Yeti
Volunteer moderator
Joined: 2 Sep 04
Posts: 453
Credit: 193,369,412
RAC: 10,065
Message 29786 - Posted: 3 Apr 2017, 11:38:39 UTC - in response to Message 29778.  

> ATLAS WUs try again to contact ccsqfatlasli01.in2p3.fr on TCP port 23128 and then fail.

Is this a typo? 3128 instead of 23128?


Supporting BOINC, a great concept !
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Avatar

Send message
Joined: 15 Jun 08
Posts: 2386
Credit: 222,898,697
RAC: 138,219
Message 29787 - Posted: 3 Apr 2017, 11:48:10 UTC - in response to Message 29786.  

>> ATLAS WUs try again to contact ccsqfatlasli01.in2p3.fr on TCP port 23128 and then fail.
>
> Is this a typo? 3128 instead of 23128?

No. The request really is to port 23128.
I guess it's a substitute for Squid's standard port 3128, used when more than one Squid instance runs on the same machine.
David Cameron
Project administrator
Project developer
Project scientist

Joined: 13 May 14
Posts: 387
Credit: 15,314,184
RAC: 0
Message 29789 - Posted: 3 Apr 2017, 13:35:43 UTC - in response to Message 29785.  

>> Good point, thanks for mentioning it.
>>
>> What I did this morning was to add atlasfrontier-ai.cern.ch as a fallback in case the normal database servers are down. I've had one completed task and one running for 30 mins so looks like this is working ok.
>
> So I guess it's permanent?


Yes, I will keep it there. But it should only be used in exceptional circumstances when the other two servers are not working.

I have added this port and 23128 (not a typo) to the firewall info page.
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2386
Credit: 222,898,697
RAC: 138,219
Message 29790 - Posted: 3 Apr 2017, 14:04:10 UTC - in response to Message 29789.  

@David Cameron

> I have added this port and 23128 (not a typo) to the firewall info page.

Thank you.


Sorry to bother you.
nc -z -v -w 5 atlasfrontier-ai.cern.ch 8000
Connection to atlasfrontier-ai.cern.ch 8000 port [tcp/irdmi] succeeded!

O.K.


nc -z -v -w 5 ccsqfatlasli01.in2p3.fr 23128
nc: connect to ccsqfatlasli01.in2p3.fr port 23128 (tcp) timed out: Operation now in progress

Not O.K. although my firewall is now open.
David Cameron
Project administrator
Project developer
Project scientist

Joined: 13 May 14
Posts: 387
Credit: 15,314,184
RAC: 0
Message 29811 - Posted: 4 Apr 2017, 10:08:45 UTC - in response to Message 29790.  

Thanks for this, it's actually a mistake in our ATLAS@Home configuration. The server should be ccfrontier.in2p3.fr:23128. I've fixed this now for new tasks.
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2386
Credit: 222,898,697
RAC: 138,219
Message 29813 - Posted: 4 Apr 2017, 10:52:04 UTC - in response to Message 29811.  

> Thanks for this, it's actually a mistake in our ATLAS@Home configuration. The server should be ccfrontier.in2p3.fr:23128. I've fixed this now for new tasks.


That looks very good!
:-)

nc -z -v -w 5 ccfrontier.in2p3.fr 23128
Connection to ccfrontier.in2p3.fr 23128 port [tcp/*] succeeded!


And it delivers valid content:

telnet ccfrontier.in2p3.fr 23128
Trying 134.158.239.12...
Connected to ccfrontier.in2p3.fr.
Escape character is '^]'.
GET http://ccsqfatlasli01.in2p3.fr:23128/ccin2p3-AtlasFrontier/Frontier/type=frontier_request:1:DEFAULT&encoding=BLOBzip5&p1=eNplj8EKwyAMhl8l5DxGZ3ftIWi6ClaLEcZOvv9bzAlbIzsEku--85MIB7YFDsocS43JsXdAAjgQvHwdhR6joYOm280HpwM0.OnnupqbKi.pPkrxO3dVzQhrTjsglUBSbUohraE6K3jF1h1mmj4xphXCP3lunHn8b4HbDBTd8NMCpjN9djPe35fFTC8_ HTTP/1.0

HTTP/1.0 200 OK
Last-Modified: Thu, 09 Feb 2017 15:34:02 GMT
Content-Type: text/xml;charset=US-ASCII
Server: Apache-Coyote/1.1
Cache-Control: max-age=3000, stale-if-error=1
Date: Tue, 04 Apr 2017 10:02:21 GMT
Age: 2385
Content-Length: 596
X-Cache: HIT from ccosvms0012.in2p3.fr
Via: 1.1 ccosvms0012.in2p3.fr:23128 (squid/frontier-squid-2.7.STABLE9-26.1)
Connection: close

<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE frontier SYSTEM "http://frontier.fnal.gov/frontier.dtd">
<frontier version="3.34" xmlversion="1.0">
<transaction payloads="1">
<payload type="frontier_request" version="1" encoding="BLOBzip5">
<data>
<keepalive />eF5jY2Bg4A1wDHL1C4n383dx9XRhA4pw+YX6OrkGaRgaaIK4PFAFIY7u2OWdPTx9XHDr54bI49LO
HRwZHO/pFxzi6esK4nOEOQY5ezgGGbGDeEyGxiCK0QjCMUFwFI0MDA11DUx0DS3jDcysDEysDEz1
DCyNjE0tDQwMFNx9Q9gBzO8mxA==</data>
<quality error="0" md5="4ea0ac0c31d9a0422b9ceb3e7fcf47b4" records="1" full_size="223"/>
</payload>
</transaction>
</frontier>
Connection closed by foreign host.
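
The response headers above already show that the squid answered from its cache (X-Cache: HIT) and how old the cached entry is (Age: 2385 against max-age=3000). As a rough sketch, the same check can be scripted; the helper name cache_summary is made up for illustration, and it only reads header text on stdin, so it makes no assumption about how the headers were fetched.

```shell
# Hypothetical helper: read HTTP response headers on stdin and report
# whether a proxy cache served the reply and how long it stays fresh.
cache_summary() {
    awk '
        { sub(/\r$/, "") }                       # tolerate CRLF line ends
        tolower($1) == "x-cache:" { status = $2 }
        tolower($1) == "age:"     { age = $2 }
        tolower($1) == "cache-control:" {
            # pull the number out of "max-age=NNNN"
            if (match($0, /max-age=[0-9]+/))
                max_age = substr($0, RSTART + 8, RLENGTH - 8)
        }
        END {
            printf "cache=%s age=%d max_age=%d remaining=%d\n",
                   status, age, max_age, max_age - age
        }
    '
}
```

Fed the headers from the telnet session above, it prints cache=HIT age=2385 max_age=3000 remaining=615, i.e. the cached copy would be considered fresh for roughly another 615 seconds. One way to feed it, assuming curl is installed, would be curl -s -D - -o /dev/null -x http://ccfrontier.in2p3.fr:23128 "$url" | cache_summary, where $url stands for the Frontier request URL.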
