Blazor App: An error has occurred. This application may no longer respond until reloaded. Reload

Summary

Running a Blazor Server application and refreshing the page sometimes creates the following error for the user:

An error has occurred. This application may no longer respond until reloaded. Reload
Example error message displayed to the user

Turning on browser debugging may show connection errors similar to the following:

blazor.server.js:1 WebSocket connection to 'ws://helloblazor.test/_blazor?id=gsHIh62GVm39WvDVxjpJMg' failed: Error during WebSocket handshake: Unexpected response code: 404
---
Error: Failed to start the transport 'WebSockets': Error: There was an error with the transport.
---
GET http://helloblazor.test/_blazor?id=oobUaSlKLPyrC5RPZVg3uw&_=1597287612026 404 (Not Found)
---
Error: Failed to start the transport 'LongPolling': Error: Not Found
---
Error: Failed to start the connection: Error: Unable to connect to the server with any of the available transports. WebSockets failed: Error: There was an error with the transport. ServerSentEvents failed: Error: 'ServerSentEvents' does not support Binary. LongPolling failed: Error: Not Found
---
Error: Error: Unable to connect to the server with any of the available transports. WebSockets failed: Error: There was an error with the transport. ServerSentEvents failed: Error: 'ServerSentEvents' does not support Binary. LongPolling failed: Error: Not Found
---
Uncaught (in promise) Error: Cannot send data if the connection is not in the 'Connected' State.
    at e.send (blazor.server.js:1)
    at e.sendMessage (blazor.server.js:1)
    at e.sendWithProtocol (blazor.server.js:1)
    at blazor.server.js:1
    at new Promise (<anonymous>)
    at e.invoke (blazor.server.js:1)
    at e.<anonymous> (blazor.server.js:15)
    at blazor.server.js:15
    at Object.next (blazor.server.js:15)
    at blazor.server.js:15

These errors show that the client worked through its available transports and none of them connected: WebSockets and LongPolling were both attempted and rejected with 404 (Not Found), while ServerSentEvents was skipped because it does not support the binary message format.

This connection error can occur when the application is hosted in a load-balanced server environment. In that type of environment, configure the load balancer to use server affinity (also referred to as sticky sessions). Enabling server affinity ensures the user's connection is re-established to the same server on refreshes.

Description

Load balancing two (2) or more servers helps ensure application availability in case one of the servers experiences an outage. A server outage could be caused by component failure (hardware/software) or by the server being overloaded relative to its physical capacity (CPU, Memory, Disk, Network). Adding a second server in a load-balanced configuration allows users to be connected to any of the servers that are available to accept the request.

Load Balancer illustration for distributing load between two servers and to one server if the other becomes unavailable

For many web server type applications, the content the server provides to the user does not require the user to stay connected to the server for long durations. They work by a request-response pattern where the user’s browser requests content, such as ‘homepage.htm’, and the server responds by sending the content back. Once the request-response round trip has completed, the user has a local copy of the content and disconnects. Next, if the user refreshes the browser for the same homepage.htm, a 2nd request-response round trip is performed. As the user was disconnected from the server after the 1st request-response completed, a load balancer may forward the 2nd request to a different server to respond.

Illustration of 1st and 2nd request to a page being load balanced to 2 different web servers

When a load balancer operates in this mode where a request can receive a response from any server, it is referred to as having ‘no affinity’. When no affinity is used, the load balancer can have different ways (aka algorithms) for how it chooses to distribute the requests between the available servers. A common algorithm is round-robin where the load balancer gives each server a turn when a request is received. When all available servers have taken a turn, it starts over again from the first server in the round-robin list.

Load Balancer with No Affinity and Round-Robin algorithm
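
The round-robin behavior described above can be sketched in a few lines of Python. This is only an illustration, not how any particular load balancer is implemented; the class name `RoundRobinBalancer` and the server names are made up for the example.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hypothetical no-affinity balancer: each request goes to the next server in turn."""

    def __init__(self, servers):
        # cycle() starts over from the first server once every server has had a turn.
        self._turns = cycle(list(servers))

    def route(self, user):
        # The user is ignored on purpose: with no affinity it does not matter who is asking.
        return next(self._turns)


balancer = RoundRobinBalancer(["server1", "server2"])
first = balancer.route("alice")   # 1st request
second = balancer.route("alice")  # a refresh by the same user lands on the other server
third = balancer.route("alice")   # the list starts over from the first server
```

Note that the same user is sent to a different server on every request, which is harmless for plain request-response pages.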

Just as a load balancer can operate with ‘no affinity’, it can also operate with ‘affinity’. With affinity enabled the load balancer will stick all requests from a specific user to the same server. This is also referred to as ‘sticky sessions’. The ‘session’ part stems from the stickiness usually being temporary: it ends when the load balancer expires it or when the server becomes unavailable. If the server becomes unavailable, the load balancer will establish affinity to a new server for the user.

Load Balancer configured with Affinity
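
A minimal sketch of affinity, under the assumption that the load balancer can identify the user (real load balancers typically do this with a cookie or the client address). The class `StickyBalancer` and its members are hypothetical names for this example, and expiration of the affinity is left out.

```python
from itertools import cycle


class StickyBalancer:
    """Hypothetical balancer with affinity: a user keeps the same server while it is available."""

    def __init__(self, servers):
        self.available = set(servers)
        self._turns = cycle(list(servers))
        self._affinity = {}  # user -> server the user is stuck to

    def _next_available(self):
        if not self.available:
            raise RuntimeError("no servers available")
        # Skip servers that are currently unavailable.
        while True:
            server = next(self._turns)
            if server in self.available:
                return server

    def route(self, user):
        server = self._affinity.get(user)
        if server not in self.available:
            # New user, or the sticky server went away: establish affinity to another server.
            server = self._next_available()
            self._affinity[user] = server
        return server


balancer = StickyBalancer(["server1", "server2"])
alice_first = balancer.route("alice")
alice_second = balancer.route("alice")   # same server as before
bob_first = balancer.route("bob")        # a different user may get a different server
balancer.available.discard("server1")    # simulate an outage of server1
alice_after_outage = balancer.route("alice")
```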

With the background of load balancing and affinity rules behind us, let's come back to the beginning of this article where the user is experiencing an error in the Blazor Server application. What we reviewed on the web server request-response pattern above is referred to as one-way communication. The user calls the web server and the web server responds, but the web server does not call the user. This one-way communication is foundational to the HTTP protocol used on the Internet.

A Blazor Server application communicates over a persistent SignalR connection, which uses the WebSocket protocol when it is available. This protocol allows two-way communication where both the user and web server can initiate requests to each other. For this two-way communication to work, the user is the initiating party that starts a request to the server over the HTTP protocol and then negotiates a protocol transition to WebSocket if supported. Once the WebSocket connection is established, both the user and the web server can initiate a call to each other, thus enabling two-way communication.

For the web server to call the user, it needs to know which of its connections is connected to what user to ensure it is sending the request to the right recipient. This logic is provided by the Blazor Server framework and is transparent to the application code but conceptually looks something similar to the following:

Image depicting User connections on Web Server for WebSocket protocol
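
In code, the idea of a server keeping track of which connection belongs to which user could be pictured as a simple lookup table. The sketch below is purely conceptual and is not the Blazor Server framework's actual data structure; `ConnectionTable` and the connection ids are invented for illustration.

```python
class ConnectionTable:
    """Hypothetical per-server table mapping connection ids to users and pending messages."""

    def __init__(self):
        self._connections = {}

    def connect(self, connection_id, user):
        # Remember who is on the other end of this connection.
        self._connections[connection_id] = {"user": user, "outbox": []}

    def push(self, connection_id, message):
        # Server-initiated call: the id selects the right recipient.
        self._connections[connection_id]["outbox"].append(message)

    def outbox(self, connection_id):
        return self._connections[connection_id]["outbox"]


table = ConnectionTable()
table.connect("conn-1", "alice")
table.connect("conn-2", "bob")
table.push("conn-2", "render update")  # only bob's connection receives it
```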

As we experienced in the early days of cellular phones (thankfully less these days), dropped calls can happen. With cellular phones we call the party back and continue the conversation where we left off because we both have ‘memory’ of talking to each other. Like cellular phone communication, a WebSocket connection may experience a ‘drop’ and will need to be re-established. In order to keep the communication going from the point where it dropped, both the user and web server need ‘memory’ of each other, like people with cellular phones. This memory is held on both the user and web server side so when communication is re-established both parties remember each other and continue from where they dropped. All the user and web server memory and communication reconnects are done for us by the Blazor Server framework.

This memory and reconnect works fine when the load balancer only sends us to one web server as above. But what happens to our memory on a reconnect if we are in a multi-web server environment that is load balanced with no affinity?

Blazor Server application reconnect through load balancer with no affinity

As depicted above, if the user was having a two-way communication with Server 1 that drops, then the ‘memory’ of the communication is between the user and Server 1. When the connection reconnects to Server 2, Server 2 will have no memory of the prior communication. It’s like calling your friend back on the cellular phone and continuing the conversation only to realize you called the wrong number. The Blazor Server framework accounts for this situation, so instead of having Server 2 play along in a conversation it has no memory of, it responds with a ‘wrong number’ (404 Not Found) to inform the user that it will not establish the connection. This ultimately leads to a connection error on the user side which can be seen in the browser debugger with the following connection errors:

blazor.server.js:1 WebSocket connection to 'ws://helloblazor.test/_blazor?id=gsHIh62GVm39WvDVxjpJMg' failed: Error during WebSocket handshake: Unexpected response code: 404
---
Error: Failed to start the transport 'WebSockets': Error: There was an error with the transport.
---
GET http://helloblazor.test/_blazor?id=oobUaSlKLPyrC5RPZVg3uw&_=1597287612026 404 (Not Found)

That is a very polite gesture compared with playing a prank and attempting to play along in a conversation it knows nothing about 🙂
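
The ‘wrong number’ behavior can be modeled as each server only recognizing the connection ids it handed out itself. This is a simplified sketch and not the framework's code; the `Server` class, its method names, and the shortened connection id are assumptions made for the example. The status codes are the standard HTTP ones: 101 (Switching Protocols) for a successful WebSocket handshake and 404 (Not Found) for a refusal.

```python
SWITCHING_PROTOCOLS = 101  # a successful WebSocket handshake
NOT_FOUND = 404            # the 'wrong number' response


class Server:
    """Hypothetical web server that only remembers connections it negotiated itself."""

    def __init__(self, name):
        self.name = name
        self._memory = {}  # connection id -> what this server remembers about it

    def negotiate(self, connection_id, user):
        # First contact: the server starts remembering this connection.
        self._memory[connection_id] = {"user": user}

    def handshake(self, connection_id):
        # An unknown id is refused rather than played along with.
        if connection_id in self._memory:
            return SWITCHING_PROTOCOLS
        return NOT_FOUND


server1 = Server("server1")
server2 = Server("server2")
server1.negotiate("gsHIh62", "alice")

same_server = server1.handshake("gsHIh62")   # routed back to server1
other_server = server2.handshake("gsHIh62")  # routed to server2 with no affinity
```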

The Blazor Server framework comes with additional backup transports in case WebSockets fails, including ServerSentEvents (SSE) and long-polling. However, as the reconnect is happening to the wrong server, all three (3) transports fail as seen in the subsequent errors:

Error: Failed to start the transport 'LongPolling': Error: Not Found
---
Error: Failed to start the connection: Error: Unable to connect to the server with any of the available transports. WebSockets failed: Error: There was an error with the transport. ServerSentEvents failed: Error: 'ServerSentEvents' does not support Binary. LongPolling failed: Error: Not Found
---
Error: Error: Unable to connect to the server with any of the available transports. WebSockets failed: Error: There was an error with the transport. ServerSentEvents failed: Error: 'ServerSentEvents' does not support Binary. LongPolling failed: Error: Not Found

As indicated by the errors, ServerSentEvents never actually tried to connect: the transport errored out because it does not support the binary message format. Only WebSockets and long-polling actually attempted to reconnect.
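
The fallback sequence seen in the log can be approximated as a loop over the transports in order. This is a rough sketch of the idea rather than the client library's real logic; `start_connection`, `TRANSPORTS`, and the `try_connect` callback are hypothetical names, and the binary-support flags are inferred from the error messages above.

```python
# (transport name, supports binary messages) in the order the log shows them being considered.
TRANSPORTS = [("WebSockets", True), ("ServerSentEvents", False), ("LongPolling", True)]


def start_connection(transports, needs_binary, try_connect):
    """Return the first transport that connects, or raise listing every failure."""
    failures = []
    for name, supports_binary in transports:
        if needs_binary and not supports_binary:
            # Skipped without ever contacting the server.
            failures.append(f"{name} failed: does not support Binary.")
            continue
        status = try_connect(name)
        if status in (101, 200):
            return name
        failures.append(f"{name} failed: status {status}.")
    raise ConnectionError(
        "Unable to connect to the server with any of the available transports. "
        + " ".join(failures)
    )


attempted = []


def wrong_server(name):
    # Every attempt reaches a server with no memory of this connection.
    attempted.append(name)
    return 404


try:
    start_connection(TRANSPORTS, needs_binary=True, try_connect=wrong_server)
    outcome = "connected"
except ConnectionError as error:
    outcome = str(error)
```

In this model `attempted` ends up holding only WebSockets and LongPolling, matching the observation that ServerSentEvents never reached the server.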

This is analogous to first calling the friend back with the wrong number on the cellular phone (WebSocket) and then retrying the wrong number on a landline (long-polling). Both phones are connecting to the wrong number. The end result is a connection failure as both attempts failed:

Uncaught (in promise) Error: Cannot send data if the connection is not in the 'Connected' State.
    at e.send (blazor.server.js:1)
    at e.sendMessage (blazor.server.js:1)
    at e.sendWithProtocol (blazor.server.js:1)
    at blazor.server.js:1
    at new Promise (<anonymous>)
    at e.invoke (blazor.server.js:1)
    at e.<anonymous> (blazor.server.js:15)
    at blazor.server.js:15
    at Object.next (blazor.server.js:15)
    at blazor.server.js:15

With the understanding of load balancing affinity behavior we can then change the load balancer from ‘no affinity’ to ‘affinity’ to ensure that the load balancer always sends the user back to the same server upon reconnects.

Load Balancer configured with affinity allowing Blazor Server reconnects

With load balancer affinity enabled, we should now be able to reconnect successfully and verify in the browser debugger that the connection handshake succeeded:

Information: WebSocket connected to ws://helloblazor.test/_blazor?id=NfBNcfX6EMrOjIxWpDUwsg.

Missing affinity is not the only cause of the ‘An error has occurred’ message, but it is one worth exploring, along with reviewing the browser debugger output for additional information to help troubleshoot. As running multiple replicas is common for server availability and easy to do with Docker containerization, it is a good cause to rule out early in troubleshooting.

See Docker Blazor App on Linux ARM for additional information on containerizing Blazor Server applications.
