Automatically choose workspace-cluster based on lowest latency.  #5596

Open
@meysholdt

Context: #5534 (comment)

Problem Statement

We currently have workspace clusters in one region in the EU and one region in the US. To offer service at a good latency (e.g. < 100ms), we will need more clusters, maybe as many as one or two per continent. See https://gcping.com/ for your personal latency to every Google Cloud region. See the GCP network map for available regions and the connections between them.

Prior Art

Proposed Solution

The user's web browser should measure the latency to every available workspace cluster and send the measurements to the gitpod-server, so that the server can make an informed decision about which workspace-cluster is best for the user.

Considerations

  • latency measurement should not slow down workspace startup
  • the decision about which workspace-cluster to choose should remain with the gitpod-server, because in the future, factors other than latency may influence the decision, for example cluster health.
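
A minimal sketch of what a server-side decision could look like if it combined reported latency with cluster health. The names `ClusterInfo`, `chooseCluster`, the `healthy` flag, and the `fallback` parameter are hypothetical and only illustrate the idea; they are not existing gitpod-server APIs.

```typescript
// Cluster name -> measured round-trip time in milliseconds, as reported by the browser.
type Latencies = Record<string, number>;

// Hypothetical server-side view of a workspace cluster.
interface ClusterInfo {
  name: string;
  healthy: boolean;
}

// Pick the healthy cluster with the lowest reported latency.
// Falls back to a default cluster when no usable measurement exists.
function chooseCluster(
  clusters: ClusterInfo[],
  latencies: Latencies | undefined,
  fallback: string,
): string {
  let best: string | undefined;
  let bestMs = Infinity;
  for (const cluster of clusters) {
    if (!cluster.healthy) continue; // health overrides latency
    const ms: number | undefined = latencies ? latencies[cluster.name] : undefined;
    if (ms !== undefined && ms < bestMs) {
      best = cluster.name;
      bestMs = ms;
    }
  }
  return best !== undefined ? best : fallback;
}
```

Because the loop iterates over the clusters the server knows about, cluster names that appear only in the client-supplied measurements are ignored.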

Proposed Design Choices:

  • to keep workspace startup fast, the latency measurement should be cached, for example in a cookie in the web browser.
  • to keep workspace startup fast, the latency measurement should preferably not be done when a workspace starts, but when the user visits any Gitpod web page.
  • every workspace cluster should have a public endpoint that can be "pinged" from the web browser for latency measurement.
  • the server should make a cache-key and the ws-cluster endpoints available to the user. The cache-key should encode the public IP address of the user, so that the latency is measured again when the user changes their network.
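
One way the cache-key idea could work is sketched below, assuming the server derives the key from the client's public IP plus a server-side salt, and the browser re-measures whenever the key it stored differs from the key the server currently hands out. `makeCacheKey`, `needsRemeasure`, and the salt are hypothetical names, and the djb2-style string hash is used only to keep the example dependency-free; a real implementation would more likely use a keyed cryptographic hash so the IP cannot be guessed from the key.

```typescript
// Simple djb2-style string hash (hash * 33 + char code, wrapped to 32 bits).
// Illustration only: not cryptographically secure.
function djb2(input: string): number {
  let hash = 5381;
  for (let i = 0; i < input.length; i++) {
    hash = ((hash << 5) + hash + input.charCodeAt(i)) | 0;
  }
  return hash >>> 0; // convert to an unsigned 32-bit value
}

// Server side: derive an opaque key from the user's public IP and a secret salt.
function makeCacheKey(publicIp: string, salt: string): string {
  return djb2(salt + "|" + publicIp).toString(36).toUpperCase();
}

// Browser side: cached measurements are stale if there is no stored key
// or the stored key differs from the one the server just sent.
function needsRemeasure(storedKey: string | undefined, currentKey: string): boolean {
  return storedKey !== currentKey;
}
```

With this scheme the browser never needs to know its own public IP: a network change shows up as a different key in the next server response.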

Example Flow 1:

  1. the user visits gitpod.io/workspaces.
  2. the user's browser receives {"cache-key": "FJJDSKD", "clusters": {"us07": "https://us07.gitpod.io/ping", "sing01": "https://sing01.gitpod.io/ping"}}
  3. the user's browser measures the latency to all clusters in the background and stores the result in a cookie: {"us07": 230, "sing01": 60}
  4. When the user opens a workspace, the cookie will be sent to the gitpod-server and the server will use the latency measurement to choose the best workspace cluster.
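
The cookie round trip in this flow could be sketched roughly as follows. The cookie layout (cache-key stored next to the measurements) and the function names `encodeLatencyCookie` and `decodeLatencyCookie` are assumptions made for illustration. Since a cookie is client-controlled, the server-side decoder treats the content as untrusted and drops anything that does not look like a valid measurement.

```typescript
// Cluster name -> round-trip time in milliseconds.
type Latencies = Record<string, number>;

// Hypothetical cookie payload: the measurements plus the cache-key they were taken under.
interface LatencyCookie {
  cacheKey: string;
  latencies: Latencies;
}

// Browser side: serialize the payload into a cookie-safe string.
function encodeLatencyCookie(cookie: LatencyCookie): string {
  return encodeURIComponent(JSON.stringify(cookie));
}

// Server side: parse and validate the cookie value.
// Returns undefined if the value is malformed or was measured under another cache-key.
function decodeLatencyCookie(raw: string, expectedCacheKey: string): Latencies | undefined {
  let parsed: any;
  try {
    parsed = JSON.parse(decodeURIComponent(raw));
  } catch {
    return undefined; // malformed encoding or invalid JSON
  }
  if (typeof parsed !== "object" || parsed === null) return undefined;
  if (parsed.cacheKey !== expectedCacheKey) return undefined;
  const source = parsed.latencies;
  if (typeof source !== "object" || source === null) return undefined;

  const result: Latencies = {};
  for (const name of Object.keys(source)) {
    const ms = source[name];
    // Keep only finite, non-negative numbers.
    if (typeof ms === "number" && isFinite(ms) && ms >= 0) {
      result[name] = ms;
    }
  }
  return result;
}
```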

Example Flow 2:

  1. the user opens a workspace. The cookie is already there. No delay during workspace-start.

Example Flow 3:

  1. the user opens a workspace. The cookie is not yet there. This is the case we want to avoid, but I don't think it can be avoided all the time.
  2. measure the latency. Maybe the measurement can be aborted when the first workspace-cluster responds, because the first to respond will also be the one with the lowest latency (duh!). While there is the risk that a single measurement is slightly inaccurate and repeated measurements would be needed for more accurate results, it seems like a good compromise to preserve fast workspace startup time. This way, if no cookie is present, 15 to ~200 ms will be added to the workspace startup time.
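
The "stop at the first responder" idea could be sketched like this. `firstResponder` and the injectable `Ping` function are hypothetical; in a browser, `ping` might be a thin wrapper around `fetch` against each cluster's ping endpoint, but it is passed in here so the sketch does not depend on any particular environment.

```typescript
// Resolves once the given endpoint has answered, rejects if it is unreachable.
// In a browser this could wrap fetch(url); here it is an injected assumption.
type Ping = (url: string) => Promise<void>;

// Ping all clusters in parallel and resolve with the name of the first one that answers.
// Rejects only if there is nothing to measure or every cluster fails.
function firstResponder(clusters: Record<string, string>, ping: Ping): Promise<string> {
  const names = Object.keys(clusters);
  return new Promise<string>((resolve, reject) => {
    if (names.length === 0) {
      reject(new Error("no clusters to measure"));
      return;
    }
    let failures = 0;
    for (const name of names) {
      ping(clusters[name]).then(
        () => resolve(name), // only the first call has an effect; later ones are ignored
        () => {
          failures += 1;
          if (failures === names.length) {
            reject(new Error("all clusters unreachable"));
          }
        },
      );
    }
  });
}
```

This version does not cancel the slower requests; it only stops waiting for them. Actually aborting them would need something like an `AbortController` per request, which is left out to keep the sketch short.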

Metadata

Status: Scheduled