With version 5.5, Anaconda began to roll out support for the
installation of Anaconda Enterprise on customer-supplied Kubernetes
clusters—both cloud-hosted and on-premise. We intend for this to
become the preferred installation host for Anaconda Enterprise.
In this document, we explain some of our rationale and provide
guidance for our existing, Gravity-based customers.

Anaconda remains committed to maintaining support for Gravity for
the foreseeable future. But for reasons we explain below, we encourage
customers to begin investigating a migration path to an
alternative, internally supported Kubernetes platform.

Gravity is an open-source application delivery system, developed by
Teleport (formerly Gravitational). Gravity creates installers that
bundle application assets with Planet, a containerized Kubernetes
stack. Gravity has enabled us to deliver Anaconda Enterprise to
customers for installation on bare metal or virtual machine clusters
with no existing Kubernetes support.

The benefits of the Gravity approach come with a number of practical
challenges. The most important of these challenges stems from recent
changes to Gravity’s support model. Until recently, Teleport offered
paid commercial support, providing us with fast access to technical
experts when needed. In 2021, they chose to sunset that offering
completely. This change has important practical consequences:
- Reliance solely on community support limits our velocity and
  ability to diagnose low-level performance or functionality issues.
- We rely wholly on the upstream developers to deliver bug and
  security fixes for the platform. As a result, we are unable to
  offer firm deadlines for resolving CVEs related to Kubernetes
  or the underlying virtualization layer.
- While Teleport still relies heavily on Gravity to support their
  business, and continues to fund its development, we cannot guarantee
  that this will continue, nor can we be certain that their specific
  development priorities align with ours.
- Gravity’s support for newer versions of Kubernetes lags behind the
  official sources. The latest production version of Gravity ships
  with Kubernetes 1.17.9, while Kubernetes 1.19.15 and 1.21.5 are
  available in pre-release builds only. In contrast, the latest
  upstream Kubernetes release as of this writing is 1.23.3.
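As a rough illustration of the version gap described in the last item, the following Python sketch computes how many minor releases each Kubernetes version quoted above trails the upstream 1.23.3 release. The helper names (`parse_version`, `minor_lag`) and the dictionary labels are our own, chosen for illustration. Only the version numbers come from this document.

```python
def parse_version(text):
    """Split a 'major.minor.patch' string into a tuple of integers."""
    return tuple(int(part) for part in text.split("."))


def minor_lag(shipped, upstream):
    """Return how many minor releases `shipped` trails `upstream`.

    Assumes both versions share a major version, which holds for every
    Kubernetes 1.x release quoted in this document.
    """
    shipped_major, shipped_minor, _ = parse_version(shipped)
    upstream_major, upstream_minor, _ = parse_version(upstream)
    if shipped_major != upstream_major:
        raise ValueError("major versions differ; minor lag is undefined")
    return upstream_minor - shipped_minor


# Version numbers as quoted in the text above.
UPSTREAM = "1.23.3"
GRAVITY_KUBERNETES = {
    "production": "1.17.9",
    "pre-release (older)": "1.19.15",
    "pre-release (newer)": "1.21.5",
}

# Labels are illustrative; they are not official Gravity channel names.
lags = {label: minor_lag(version, UPSTREAM)
        for label, version in GRAVITY_KUBERNETES.items()}

for label, lag in lags.items():
    version = GRAVITY_KUBERNETES[label]
    print(f"{label}: Kubernetes {version} trails {UPSTREAM} by {lag} minor releases")
```

By this measure, the production build trails upstream by six minor releases, and the two pre-release builds trail it by four and two.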

In addition to these logistical challenges, our experience is that
Gravity performance is very sensitive to the precise underlying operating
system configuration. Our document "Understanding Anaconda Enterprise
system requirements" collects the challenges our customers have
experienced.

We have observed that many of these configuration issues are a
natural consequence of standard, legitimate IT policies that
are not designed with Kubernetes in mind. Transferring ownership of
Kubernetes uptime to your IT department, therefore, should help
ensure a stable, performant platform for users. It also allows them
to make security decisions about the Kubernetes stack that properly
balance application stability with security risk.

We recognize that, for many of our customers, the benefits of Gravity
outweigh these practical concerns. Their IT departments simply may
not be prepared to formally support Kubernetes infrastructure. For
these reasons, Anaconda intends to continue to support Gravity
as a valid destination for AE5. Here are our plans moving forward:
- The production build of AE 5.5.1 is currently built on top of
  Gravity 6.1.46, which uses Kubernetes 1.15.2.
- For AE 5.5.2, our preferred installer uses Gravity
  7.0.34 (k8s 1.17.9). This installer has received our full QA
  cycle, including tests of upgrades from an existing Gravity 6.1
  environment.
- In some environments, in-place upgrades that update the major
  version of Gravity can fail. For this reason, for AE 5.5.2 only,
  we are supporting a second version of the installer that uses
  Gravity 6.1, to ensure the feasibility of an in-place upgrade
  for this release. This installer has also received our full QA cycle.
- Subsequent releases of AE5 will offer only a single Gravity installer,
  using the latest stable version of the platform.
- Beta versions of Gravity 8.0 (k8s 1.19) and 9.0 (k8s 1.21) are
  currently available. We will test AE5 with these versions only
  once they exit beta.
- If a critical security vulnerability arises in our released versions of
  Gravity, and the Gravity maintainers release a supported patch update
  that addresses it, we will build a special “Gravity-only” installer
  that incorporates the patch. These installers perform in-place
  upgrades of Gravity itself without disturbing the installed
  application. We can only offer basic smoke testing for Gravity-only
  upgrades. A full QA cycle will be reserved for the next official
  AE5 release. Therefore, customers will need to weigh the urgency of
  the given patch against this concern. That said, our experience is
  that Gravity-only updates of this sort are reliable and quick to apply.

When a customer is ready to migrate from Gravity to an in-house
Kubernetes platform, Anaconda is committed to working with them
to ensure that this process proceeds smoothly. To that end,
Anaconda can support the following migration workflow:
1. In a standard pre-implementation meeting, we review our Kubernetes
   system requirements with your cluster administrators.
2. We assist in the installation of a new instance of AE on a
   customer-supplied Kubernetes cluster, running in parallel with
   an existing, Gravity-based cluster.
3. We use our standard DR & Sync tooling to transfer a snapshot
   of the current cluster’s content to the new cluster, so that
   users can exercise the new environment.
4. Once the customer is satisfied that the new cluster is ready
   to be promoted to production, we transfer a final snapshot,
   including the hostname and SSL certificate, and update the
   DNS records to point to the new cluster.
5. Once the cutover is complete, the customer is free to
   retire the old cluster.

Some logistical notes:
- We cannot provide direct assistance with the specification and
  provisioning of your new Kubernetes cluster. However, our
  documentation offers a set of templates and provisioning details
  for the major Kubernetes offerings, both cloud and on-premise,
  that your administrators are free to build from. Our installation
  process leverages Helm, making it compatible with all major
  Kubernetes platforms.
- We consider it essential that both clusters are running simultaneously
  for at least a brief validation period. For this reason, we do not
  support an in-place “upgrade” of a production Gravity-based cluster
  to an alternative Kubernetes platform.
- It is reasonable to consider initially under-provisioning the
  destination Kubernetes cluster. That is, the initial allocation
  of worker nodes to the new cluster can be smaller, in anticipation
  that the nodes from the Gravity cluster will be converted to
  additional workers once validation is complete. In this scenario,
  it would be necessary only to ensure that the new cluster has
  enough resources to complete the validation process before the cutover.
- Many Kubernetes administrators are accustomed to hosting workloads
  that are less resource intensive than a data science development
  session or machine learning model. In particular, our Docker image
  sizes and recommended resource profiles are likely to surprise them.
  We encourage you to review the BYOK8s section of our document
  "Understanding Anaconda Enterprise system requirements" with your
  Kubernetes team prior to firming up a migration plan. They
  should also review the BYOK8s installation requirements
  and the pre-install checklist.
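
The under-provisioning note above amounts to simple capacity arithmetic. The Python sketch below is one hypothetical way to estimate it, not an Anaconda sizing tool. The function name `workers_needed`, the per-session resource profile, the session counts, and the node sizes are all assumptions made for illustration. Real figures should come from the system requirements document. The estimate ignores system reservations and scheduling overhead, so treat the result as a lower bound.

```python
import math


def workers_needed(sessions, cpu_per_session, mem_gb_per_session,
                   node_cpu, node_mem_gb):
    """Estimate the worker nodes needed for `sessions` concurrent sessions.

    Returns the larger of the CPU-bound and memory-bound node counts,
    with a minimum of one node. This is a lower bound: it ignores system
    reservations and the fact that a session cannot span two nodes.
    """
    by_cpu = math.ceil(sessions * cpu_per_session / node_cpu)
    by_mem = math.ceil(sessions * mem_gb_per_session / node_mem_gb)
    return max(by_cpu, by_mem, 1)


# Hypothetical figures, for illustration only.
PROFILE = {"cpu_per_session": 2, "mem_gb_per_session": 8}
NODE = {"node_cpu": 16, "node_mem_gb": 64}

# A small group of users exercises the new cluster during validation.
validation_nodes = workers_needed(sessions=10, **PROFILE, **NODE)

# The full user base arrives after the cutover.
production_nodes = workers_needed(sessions=40, **PROFILE, **NODE)

# Nodes that could be converted from the retired Gravity cluster.
converted_later = production_nodes - validation_nodes

print(f"validation: {validation_nodes} nodes")
print(f"production: {production_nodes} nodes")
print(f"to convert after cutover: {converted_later} nodes")
```

With these assumed numbers, two workers cover the validation period, and three more would be added from the Gravity cluster after the cutover.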

Finally, let us address two concerns about cost.
- There will be no license charge levied for a parallel
  installation, as long as the intent is to fully migrate
  workloads to the new cluster, and decommission the original,
  once the installation is verified.
- When migrating from Gravity to an alternative Kubernetes cluster,
  particularly one with multiple tenants, you may find it necessary
  to allow AE5 to run on more worker nodes to support the same
  workload. There will be no additional per-node charges levied as
  a result of this migration. Upon renewal, we will cap your per-node fees.