Websocket connection interrupted

Hello Everyone!

When I check a running job, I get this error in the browser developer tools.

Firefox can’t establish a connection to the server at wss://domain/websocket/. connectJobSocket.js:4:7
The connection to wss://domain/websocket/ was interrupted while the page was loading. connectJobSocket.js:4:7
Socket error:  
error { target: WebSocket, isTrusted: true, srcElement: WebSocket, currentTarget: WebSocket, eventPhase: 2, bubbles: false, cancelable: false, returnValue: true, defaultPrevented: false, composed: false, … }
 Disconnecting... connectJobSocket.js:41:12
Firefox can’t establish a connection to the server at wss://domain/websocket/. connectJobSocket.js:4:7
Socket closed. Reconnecting... 
close { target: WebSocket, isTrusted: true, wasClean: false, code: 1006, reason: "", srcElement: WebSocket, currentTarget: WebSocket, eventPhase: 2, bubbles: false, cancelable: false, … }
connectJobSocket.js:32:14
The connection to wss://domain/websocket/ was interrupted while the page was loading. connectJobSocket.js:4:7
Socket error:  
error { target: WebSocket, isTrusted: true, srcElement: WebSocket, currentTarget: WebSocket, eventPhase: 2, bubbles: false, cancelable: false, returnValue: true, defaultPrevented: false, composed: false, … }
 Disconnecting... connectJobSocket.js:41:12
Socket closed. Reconnecting... 
close { target: WebSocket, isTrusted: true, wasClean: false, code: 1006, reason: "", srcElement: WebSocket, currentTarget: WebSocket, eventPhase: 2, bubbles: false, cancelable: false, … }
connectJobSocket.js:32:14
Uncaught DOMException: An attempt was made to use an object that is not, or is no longer, usable
    <anonymous> connectJobSocket.js:17
    rye connectJobSocket.js:4
    onclose connectJobSocket.js:34
    setTimeout handler*rye/eye.onclose connectJobSocket.js:33
    rye connectJobSocket.js:29
    Fye JobOutput.js:271
    Fl React
    unstable_runWithPriority scheduler.production.min.js:18
    React 4
    unstable_runWithPriority scheduler.production.min.js:18
    React 5
    n useRequest.js:44
    u runtime.js:63
    _invoke runtime.js:294
    j runtime.js:119
    Babel 6
    j_e Job.js:113
    Fl React
    unstable_runWithPriority scheduler.production.min.js:18
    React 3
    M scheduler.production.min.js:16
    onmessage scheduler.production.min.js:12
    6813 scheduler.production.min.js:12
    Webpack 10
main.13ab8549.js:2
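
A note on the close events in the log above: code 1006 with wasClean: false is the reserved "abnormal closure" code from RFC 6455. It is never sent over the wire. The browser reports it when the connection ends without a closing handshake, so by itself it does not say which hop dropped the connection. A small lookup as a sketch (the helper name describe_close is only for illustration and is not part of AWX):

```python
# Subset of the WebSocket close codes defined in RFC 6455, section 7.4.1.
CLOSE_CODES = {
    1000: "normal closure",
    1001: "going away",
    1002: "protocol error",
    1003: "unsupported data",
    1005: "no status code present (reserved, never sent on the wire)",
    1006: "abnormal closure (reserved, never sent on the wire)",
    1007: "invalid payload data",
    1008: "policy violation",
    1009: "message too big",
    1010: "mandatory extension missing",
    1011: "internal server error",
    1015: "TLS handshake failure (reserved, never sent on the wire)",
}


def describe_close(code: int, was_clean: bool) -> str:
    """Turn the code/wasClean pair of a close event into a readable line."""
    meaning = CLOSE_CODES.get(code, "unknown or application-defined code")
    how = "after a closing handshake" if was_clean else "without a closing handshake"
    return f"{code}: {meaning}, connection ended {how}"


# Values taken from the close event in the log above.
print(describe_close(1006, False))
```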

But I can connect with Postman and wscat.

% wscat -c wss://domain/websocket/ --auth wsstest:asdfasdfasdf
Connected (press CTRL+C to quit)
< {"accept": true, "user": 10}
>

Any idea would be appreciated.

Hello Leon!

Unfortunately, in my experience WebSockets have had recurring issues since at least 15.1.0. There have been many “fixes” since then, including some very recent ones, but part of the problem seems to be outside of the AWX team’s control. Load balancers, ingress controllers, proxies, and anything else that handles routing between your browser (including the browser itself) and the AWX websocket relay service may affect whether connections succeed. In my case, the ingress controller of my Kubernetes platform doesn’t support WebSocket connections over HTTP/2, so the sockets sort of work, but I have to refresh the page periodically.
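
One way to narrow down which hop is at fault is to send the WebSocket opening handshake by hand and look at what comes back. Below is a minimal sketch using only the Python standard library. It assumes TLS on port 443, always speaks HTTP/1.1, and sends an Origin header the way a browser does. The function names and the example host, path, and origin are placeholders of mine, not anything from AWX. It only checks whether the upgrade gets through the load balancer and ingress; it sends no cookies or credentials, so AWX authentication is a separate question.

```python
import base64
import hashlib
import os
import socket
import ssl

# Fixed GUID from RFC 6455, used to derive Sec-WebSocket-Accept.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def make_key() -> str:
    """Random 16-byte nonce, base64-encoded, for Sec-WebSocket-Key."""
    return base64.b64encode(os.urandom(16)).decode("ascii")


def expected_accept(key: str) -> str:
    """Value the server must return in Sec-WebSocket-Accept for this key."""
    digest = hashlib.sha1((key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def build_handshake(host: str, path: str, key: str, origin: str) -> bytes:
    """HTTP/1.1 upgrade request, including the Origin header browsers send."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Key: {key}",
        "Sec-WebSocket-Version: 13",
        f"Origin: {origin}",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def upgrade_accepted(response_head: str, key: str) -> bool:
    """True if the response head is a valid 101 upgrade for the given key."""
    status_line, *header_lines = response_head.split("\r\n")
    headers = {}
    for line in header_lines:
        if ":" in line:
            name, value = line.split(":", 1)
            headers[name.strip().lower()] = value.strip()
    parts = status_line.split()
    return (
        len(parts) >= 2
        and parts[1] == "101"
        and headers.get("upgrade", "").lower() == "websocket"
        and headers.get("sec-websocket-accept") == expected_accept(key)
    )


def probe(host: str, path: str, origin: str, port: int = 443) -> bool:
    """Open a TLS connection, send the handshake, and check the reply."""
    key = make_key()
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=10) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            tls.sendall(build_handshake(host, path, key, origin))
            data = b""
            while b"\r\n\r\n" not in data:
                chunk = tls.recv(4096)
                if not chunk:
                    break
                data += chunk
    head = data.split(b"\r\n\r\n", 1)[0].decode("latin-1")
    return upgrade_accepted(head, key)


# Example call (placeholder values, not run here):
# probe("awx.example.com", "/websocket/", "https://awx.example.com")
```

If a probe like this succeeds with the Origin header but the browser still fails, the difference is more likely in what the browser adds on top (cookies, HTTP/2 negotiation, extensions) than in basic WebSocket forwarding.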

Can you share what version of AWX you’re running and what your deployment is like?

Hello Denney!

Thank you for your answer. I am currently on AWX 24.2 and AWX Operator 2.15.0 on OpenShift 4.14 with OVN as the CNI. We use NetScaler as the load balancer, and SSL termination happens on NetScaler rather than on OCP. I tried these new settings:

    - setting: USE_X_FORWARDED_HOST
      value: "True"

    - setting: USE_X_FORWARDED_PORT
      value: "True"

    - setting: BROADCAST_WEBSOCKET_PROTOCOL
      value: "http"

    - setting: BROADCAST_WEBSOCKET_PORT
      value: "80"

    - setting: BROADCAST_WEBSOCKET_VERIFY_CERT
      value: "False"

But it didn’t help…

Thanks!

I have a similar setup. We’re still on OCP 4.12 and have an HAProxy load balancer in front of our cluster, but I think we terminate SSL at the cluster. The ingress controller doesn’t support WebSockets over HTTP/2 in this version, but it should in 4.14, so I’m looking forward to when we upgrade.

Anyways, I don’t know anything about NetScaler specifically, but there is an FAQ about binding an HTTP profile with “Enable WebSocket connection” option enabled.
FAQ: NetScaler and WebSockets (citrix.com)

Have you configured NetScaler to forward WebSockets?

Also, since it just dawned on me, have you built your own AWX image, or are you just using the operator defaults?

Hey again Denney!

Yes, NetScaler is already configured. I assume it is working properly since I am able to connect with Postman, unless I’m missing something… And yes, we are using stock images, except for the EEs.

Thanks!

So, unless they’ve made changes since I posted this a few months ago:

AWX doesn’t really support OCP out of the box because of the random UIDs that each namespace runs under. My solution was to build my own AWX images, but I think you can instead change the SCC so that the AWX pods/ServiceAccounts are allowed runAsUser with type: RunAsAny.

For me, the latter wasn’t approved by management, so building from scratch was the only option.

Anyways, I can’t say for certain that this will fix your Web Socket issue specifically, but it may fix other weird bugs/quirks or otherwise wonky behavior.

Great piece of advice, Denney! I will keep you posted if I find the cause and fix it!

Thanks
