sudo ./gravity install command, is actually performing the following steps:
/var/lib/gravity must be excluded from auditd monitoring.
/opt/anaconda should be excluded as well. That said, we do not have strong evidence that system instability can be tied to monitoring of that directory.
/var/lib/gravity must be excluded from on-demand scanning.
/var/lib/gravity, /opt/anaconda, and /tmp must be
respected. The installer includes disk space checks in its pre-flight checks.
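As a rough illustration of what a disk space check looks like, the following sketch reports any of the three directories with less free space than a configured minimum. It is not the installer's own pre-flight check. The function name check_free_space and the minimum values are assumptions made for this example; take the real figures from the system requirements.

```python
import shutil
from pathlib import Path

# Placeholder minimums in GiB, assumed for illustration only.
# Replace them with the figures from the official system requirements.
ASSUMED_MINIMUM_GIB = {
    "/var/lib/gravity": 300,
    "/opt/anaconda": 100,
    "/tmp": 30,
}


def nearest_existing(path):
    """Walk up from path until a directory that exists is found."""
    p = Path(path)
    while not p.exists():
        p = p.parent
    return p


def check_free_space(minimums_gib, usage=shutil.disk_usage):
    """Return (path, free_gib, required_gib) for every path that falls short."""
    shortfalls = []
    for path, required in minimums_gib.items():
        # The directory may not exist before installation, so measure the
        # filesystem of its nearest existing parent instead.
        free_gib = usage(nearest_existing(path)).free / 2**30
        if free_gib < required:
            shortfalls.append((path, round(free_gib, 1), required))
    return shortfalls


if __name__ == "__main__":
    for path, free, required in check_free_space(ASSUMED_MINIMUM_GIB):
        print(f"{path}: {free} GiB free, {required} GiB assumed minimum")
```

Running a check like this before launching the installer can surface an undersized volume early, but the installer's own checks remain the authority.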
With managed persistence, generous disk space allocations are even
more important. This disk holds a copy of every project (and one
copy for each collaborator), and every custom conda environment
created by users. A single conda environment can consume multiple
gigabytes. For this reason, we recommend that this disk start at
1 TB and, preferably, support live resizing.
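To see how quickly this adds up, here is a back-of-the-envelope estimate. The function estimate_persistence_gib and all of the input numbers are hypothetical; it only encodes the two facts stated above, that each project is stored once plus once per collaborator, and that every custom environment is stored on the same disk.

```python
def estimate_persistence_gib(projects, avg_collaborators, avg_project_gib,
                             custom_envs, avg_env_gib):
    """Rough size estimate for the managed persistence disk, in GiB."""
    # One copy of each project, plus one copy per collaborator.
    project_gib = projects * (1 + avg_collaborators) * avg_project_gib
    # Every custom conda environment lives on the same disk.
    env_gib = custom_envs * avg_env_gib
    return project_gib + env_gib


# Hypothetical workload: 200 projects of 0.5 GiB with 2 collaborators each,
# and 100 custom environments of 3 GiB each.
print(estimate_persistence_gib(200, 2, 0.5, 100, 3))  # prints 600.0
```

Even this modest hypothetical workload uses more than half of a 1 TB disk, which is why live resizing is worth having.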
/var/lib/gravity directory is essential for the stability of the platform.
In particular, the master node hosts the Kubernetes etcd key-value store there.
In practice, we have found that the use of platter disks for /var/lib/gravity
is a primary cause of system instability.
Use of an SSD for this directory is effectively required.
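One way to check whether a Linux block device is a spinning disk is to read its rotational flag in sysfs. The sketch below assumes you already know which device backs /var/lib/gravity; the device name sda is only a placeholder, and the helper is_rotational is hypothetical. Virtual and network-attached disks do not always report this flag accurately, so treat the result as a hint, not proof.

```python
from pathlib import Path


def is_rotational(device, sys_block="/sys/block"):
    """Return True for a rotational disk, False otherwise, None if unknown."""
    flag = Path(sys_block) / device / "queue" / "rotational"
    try:
        # The kernel writes "1" for rotational media and "0" otherwise.
        return flag.read_text().strip() == "1"
    except OSError:
        # The device does not exist or sysfs is unavailable.
        return None


if __name__ == "__main__":
    # "sda" is a placeholder; substitute the device backing /var/lib/gravity.
    print(is_rotational("sda"))
```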
Direct-attached storage is preferred whenever possible, but we
believe that a sufficiently performant network-attached storage
volume for /opt/anaconda is acceptable. Indeed, our positive experience
with shared storage for BYOK8s installations validates this belief.
/var/lib/gravity. When it is
necessary to use additional attached block volumes, respect
the IOPS recommendations in our system requirements.
Each cloud provider offers different mechanisms for
ensuring disk performance.
/var/lib/gravity are tied to the need to ensure a performant Kubernetes stack.
/opt/anaconda/storage volume does not have the same strict performance
requirements that /var/lib/gravity has on a Gravity installation.
However, we strongly encourage the use of a “premium” performance tier for
this volume if possible, as well as for the managed persistence volume.
A high-performance storage tier should be chosen for the managed persistence
volume as well. Remember, users will be interacting with that volume to
create Python environments and run data science workloads. Performance
limitations on this volume will directly impact the user experience.