The sudo ./gravity install command actually performs the following steps:
/var/lib/gravity must be excluded from auditd monitoring.
/opt/anaconda should be excluded as well. That said, we do not have strong evidence that system instability can be tied to monitoring of that directory.
/var/lib/gravity must be excluded from on-demand scanning.
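As a rough illustration of the auditd exclusions above, the following sketch prints "never" rules for the two directories. The helper name gen_audit_exclusions and the rules file name in the comments are hypothetical, and the exact rule set a site needs depends on its existing audit policy. Because auditd evaluates rules in order, exclusions generally need to load before any broader "always" rules.

```shell
#!/bin/sh
# Sketch: emit auditd "never" rules that exclude directories from monitoring.
# gen_audit_exclusions is a hypothetical helper, not part of the installer.

gen_audit_exclusions() {
    # Print one exclusion rule per directory argument.
    for dir in "$@"; do
        printf '%s%s\n' '-a never,exit -F dir=' "$dir"
    done
}

# Preview the rules for the two directories discussed above.
gen_audit_exclusions /var/lib/gravity /opt/anaconda

# To apply as root, the output would typically be written to a file under
# /etc/audit/rules.d/ and reloaded. The file name here is an assumption:
#   gen_audit_exclusions /var/lib/gravity /opt/anaconda \
#       > /etc/audit/rules.d/00-gravity-exclude.rules
#   augenrules --load
```

On-demand scanning exclusions are configured in the scanning product itself, so no equivalent sketch is given for them here.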
The disk space requirements for /var/lib/gravity, /opt/anaconda, and /tmp must be respected. The installer includes disk space checks in its pre-flight checks.
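The installer's own pre-flight checks are authoritative, but a quick manual check before starting can save time. The sketch below is one way to do this with df. The function names avail_gib and check_space are hypothetical, and the thresholds in the commented example are placeholders, not the documented requirements.

```shell
#!/bin/sh
# Sketch of a manual free-space check ahead of installation.
# avail_gib and check_space are hypothetical helper names.

avail_gib() {
    # Available space, in whole GiB, on the filesystem that holds path $1.
    # df -Pk reports 1024-byte blocks in a stable, one-line-per-fs format.
    df -Pk "$1" 2>/dev/null | awk 'NR == 2 { printf "%d\n", $4 / 1048576 }'
}

check_space() {
    # Usage: check_space PATH MIN_GIB
    # Returns 0 if enough space, 1 if too little, 2 if the path is unreadable.
    path=$1
    min_gib=$2
    have=$(avail_gib "$path")
    if [ -z "$have" ]; then
        echo "SKIP $path: could not read filesystem usage"
        return 2
    fi
    if [ "$have" -lt "$min_gib" ]; then
        echo "FAIL $path: ${have} GiB available, ${min_gib} GiB wanted"
        return 1
    fi
    echo "OK   $path: ${have} GiB available"
}

# Example invocation. Replace the placeholder numbers with the values
# from the system requirements before relying on the result:
#   check_space /var/lib/gravity 100
#   check_space /opt/anaconda    100
#   check_space /tmp             10
```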
With managed persistence, generous disk space allocations are even more important. This disk holds a copy of every project (plus one copy for each collaborator) and every custom conda environment created by users. A single conda environment can consume multiple gigabytes. For this reason, we recommend that this disk start at 1 TB and preferably support live resizing.
The performance of the /var/lib/gravity directory is essential for the stability of the platform. In particular, the master node hosts the Kubernetes etcd key-value store there.
In practice, we have found that the use of platter disks for /var/lib/gravity is a primary cause of system instability. Use of an SSD for this directory is effectively required. Direct-attached storage is preferred whenever possible, but we do believe that a sufficiently performant network-attached storage volume for /opt/anaconda is acceptable. Indeed, our positive experience with shared storage for BYOK8s installations validates this belief.
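To see whether a directory currently sits on a platter disk, one option is to look up the kernel's rotational flag for the backing device. The sketch below assumes a Linux host with findmnt and lsblk from util-linux. The function names are hypothetical. Note also that the flag may not reflect the underlying media on virtual machines or layered storage, so treat the result as a hint rather than proof.

```shell
#!/bin/sh
# Sketch: report whether the block device behind a path is rotational.
# is_rotational and check_disk_type are hypothetical helper names.

is_rotational() {
    # $1 is an lsblk ROTA value; "1" means a spinning disk, "0" does not.
    [ "$(printf '%s' "$1" | tr -d '[:space:]')" = "1" ]
}

check_disk_type() {
    # Resolve the device that backs path $1, then read its ROTA flag.
    dev=$(findmnt -n -o SOURCE --target "$1" 2>/dev/null)
    if [ -z "$dev" ]; then
        echo "SKIP $1: could not resolve a backing device"
        return 2
    fi
    rota=$(lsblk -d -n -o ROTA "$dev" 2>/dev/null | head -n 1)
    if [ -z "$rota" ]; then
        echo "SKIP $1: $dev is not a plain block device"
        return 2
    fi
    if is_rotational "$rota"; then
        echo "WARN $1: $dev reports as rotational"
        return 1
    fi
    echo "OK   $1: $dev reports as non-rotational"
}

# Example invocation:
#   check_disk_type /var/lib/gravity
```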
/var/lib/gravity. When it is necessary to use additional attached block volumes, respect the IOPS recommendations in our system requirements. Each cloud provider offers different mechanisms for ensuring disk performance.
The performance requirements for /var/lib/gravity are tied to the need to ensure a performant Kubernetes stack.
The /opt/anaconda/storage volume does not have the same strict performance requirements that /var/lib/gravity has on a Gravity installation. However, we strongly encourage the use of a “premium” performance tier for this volume if possible, as well as for the managed persistence volume.
A high-performance storage tier should be chosen for the managed persistence volume as well. Remember, users will be interacting with that volume to create Python environments and run data science workloads. Performance limitations on this volume will directly impact the user experience.