anaconda-ai

The anaconda-ai package exposes both a CLI and a Python SDK for working with Anaconda AI Navigator. With it, you can:

- Browse, inspect, and download quantized models
- Launch local inference servers with OpenAI-compatible endpoints
- Start and manage a local vector database
- Connect local models to third-party tools such as llm, LangChain, LlamaIndex, LiteLLM, DSPy, and Panel

Configuration options are read from ~/.anaconda/config.toml under [plugin.ai].
Parameter | Environment Variable | Description | Default |
---|---|---|---|
stop_server_on_exit | ANACONDA_AI_STOP_SERVER_ON_EXIT | Automatically stop servers when Python interpreter exits | true |
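For example, the setting above can be written in the config file as follows (a minimal sketch of the [plugin.ai] section):

```toml
# ~/.anaconda/config.toml
[plugin.ai]
stop_server_on_exit = true
```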
Models are referenced by the following components:

- <AUTHOR>: Model's publisher name (optional)
- <MODEL>: Model name
- <QUANT>: Quantization method (Q4_K_M, Q5_K_M, Q6_K, or Q8_0)
- <EXT>: File extension, usually .gguf (optional)

The quantization method can be separated from the model name by / or _.

Example model references
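For instance, the following hypothetical references all follow the pattern above (the author and extension can be omitted, and the quantization can be attached with / or _):

```
TheBloke/OpenHermes-2.5-Mistral-7B/Q4_K_M
TheBloke/OpenHermes-2.5-Mistral-7B_Q4_K_M.gguf
OpenHermes-2.5-Mistral-7B/Q4_K_M
OpenHermes-2.5-Mistral-7B_Q4_K_M
```

The model and author names here are illustrative only; use the models command or the SDK to see what is actually available.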
All CLI actions are subcommands of the anaconda ai command.
Subcommand | Description |
---|---|
create-table | Create a table in the vector database |
download | Download a model |
drop-table | Drop a table from the vector database |
launch | Launch an inference server for a model |
launch-vectordb | Start a vector database |
list-tables | List all tables in the vector database |
models | List model information |
remove | Remove a downloaded model |
servers | List running servers |
stop | Stop a running server |
stop-vectordb | Stop the vector database |
Append --help to any subcommand for more information.
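For example, a typical session might look like this (the model reference is illustrative, and the exact arguments each subcommand accepts can be checked with --help):

```
anaconda ai models                                      # list model information
anaconda ai download OpenHermes-2.5-Mistral-7B/Q4_K_M   # download a quantization
anaconda ai launch OpenHermes-2.5-Mistral-7B/Q4_K_M     # launch an inference server
anaconda ai servers                                     # list running servers
```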
The Python SDK provides two top-level accessors:

- .models - Model listing, metadata retrieval, and download.
- .servers - Server creation and control.

.models

The .models accessor provides methods for listing and downloading models.
Method | Return Type | Description |
---|---|---|
.list() | List[ModelSummary] | List all available and downloaded models |
.download('<MODEL>/<QUANT>') | None | Download a quantized model file |
.get('<MODEL>') | ModelSummary | Fetch model metadata |
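For example, a minimal sketch (assuming the SDK client is obtained with get_default_client from anaconda_ai; the exact import path and the model reference are assumptions):

```python
from anaconda_ai import get_default_client  # assumed import path

client = get_default_client()

# List every model that is available for download or already on disk
for model in client.models.list():
    print(model.id, model.name)

# Download one quantization of a model (hypothetical model reference)
client.models.download("OpenHermes-2.5-Mistral-7B/Q4_K_M")
```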
ModelSummary
Attribute/Method | Return | Description |
---|---|---|
.id | str | Model ID in format <author>/<model-name> |
.name | str | Model name |
.metadata | ModelMetadata | Model metadata and quantization files |
ModelMetadata
Attribute/Method | Return | Description |
---|---|---|
.numParameters | int | Number of model parameters |
.contextWindowSize | int | Context window length |
.trainedFor | str | Training purpose ('sentence-similarity' or 'text-generation') |
.description | str | Model description |
.files | List[ModelQuantization] | Available quantization files |
.get_quantization('<QUANT>') | ModelQuantization | Get metadata for a specific quantization |
ModelQuantization
Attribute/Method | Return | Description |
---|---|---|
.download() | None | Download the quantization file |
.id | str | SHA256 checksum of the model file |
.modelFileName | str | File name on disk |
.method | str | Quantization method |
.sizeBytes | int | File size in bytes |
.maxRamUsage | int | Required RAM in bytes |
.isDownloaded | bool | Download status |
.localPath | str | Local file path (if downloaded) |
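Combining the three classes, here is a sketch of inspecting a model and fetching a specific quantization (the client import and model name are assumptions, as above):

```python
from anaconda_ai import get_default_client  # assumed import path

client = get_default_client()

summary = client.models.get("OpenHermes-2.5-Mistral-7B")   # ModelSummary
meta = summary.metadata                                    # ModelMetadata
print(meta.numParameters, meta.contextWindowSize, meta.trainedFor)

quant = meta.get_quantization("Q4_K_M")                    # ModelQuantization
print(quant.method, quant.sizeBytes, quant.maxRamUsage)
if not quant.isDownloaded:
    quant.download()
print(quant.localPath)
```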
.servers
The .servers accessor provides methods for creating, listing, starting, and stopping servers.
Method | Return | Description |
---|---|---|
.list() | List[Server] | List running servers |
.match() | Server | Find a running server matching configuration |
.create() | Server | Create new server configuration |
.start('<server-id>') | None | Start the API server |
.status('<server-id>') | str | Get server status |
.stop('<server-id>') | None | Stop a running server |
.delete('<server-id>') | None | Remove server configuration |
.create

The .create method creates a new server configuration. By default, creating a server configuration downloads the model file (if it is not already downloaded) and selects a random, unused port for the server.
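For example, a minimal sketch (the client import, the model reference, and the Server .id attribute are assumptions):

```python
from anaconda_ai import get_default_client  # assumed import path

client = get_default_client()

# Creating the configuration downloads the model file if needed
# and picks a random, unused port.
server = client.servers.create("OpenHermes-2.5-Mistral-7B/Q4_K_M")  # hypothetical model reference

client.servers.start(server.id)           # assumes Server exposes its identifier as .id
print(client.servers.status(server.id))

# ... use the server ...

client.servers.stop(server.id)
```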
Parameters set to None are omitted from the server launch command and fall back to backend-defined defaults. The .create method accepts the following optional parameters:

Parameter | Type | Description |
---|---|---|
api_params | APIParams or dict | Parameters for how the server is configured |
load_params | LoadParams or dict | Parameters for how the model is loaded |
infer_params | InferParams or dict | Parameters for inference configuration |
Server objects returned by these methods provide the following attributes:

Attribute | Description |
---|---|
.url | Full server URL (example: http://127.0.0.1:8000) |
.openai_url | URL with /v1 for OpenAI compatibility |
.openai_client() | Pre-configured OpenAI client |
.openai_async_client() | Pre-configured Async OpenAI client |
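As a sketch, the pre-configured client from .openai_client() can be used directly with the OpenAI chat completions API (continuing from a server created as in the example above; the model value passed here is illustrative):

```python
openai_client = server.openai_client()   # base_url and api_key are already set

response = openai_client.chat.completions.create(
    model="OpenHermes-2.5-Mistral-7B/Q4_K_M",   # illustrative; the local server serves a single model
    messages=[{"role": "user", "content": "Give me one sentence about llamas."}],
)
print(response.choices[0].message.content)
```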
Install the llm package in your environment.

llm first ensures that the model has been downloaded, then starts the server using AI Navigator. Standard OpenAI and the SDK's server parameters are supported.
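For example, a sketch using llm's Python API (the model identifier that the integration registers is an assumption; run llm models to see the actual names):

```python
import llm

# Hypothetical model name; the names registered by the integration may differ.
model = llm.get_model("OpenHermes-2.5-Mistral-7B/Q4_K_M")
response = model.prompt("Write a haiku about local inference.")
print(response.text())
```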
For more information on using the llm package, see the official documentation.
Install langchain-openai in your environment:

pip install langchain-openai

Note: pip install packages in your conda environment once all other packages and their dependencies have been installed. For more information on installing pip packages in your conda environment, see Installing pip packages.

In addition to the standard arguments, the integration classes accept:

- api_params: Dict or the APIParams class described above
- load_params: Dict or the LoadParams class described above
- infer_params: Dict or the InferParams class described above (not supported by AnacondaQuantizedEmbedding)
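The integration classes wrap the SDK's server management. As a rough sketch of the underlying pattern, using only the pieces documented above, you can point ChatOpenAI at a server started with the SDK (the client import, model reference, Server .id attribute, and placeholder API key are assumptions; local servers typically do not validate the key):

```python
from langchain_openai import ChatOpenAI
from anaconda_ai import get_default_client  # assumed import path

client = get_default_client()
server = client.servers.create("OpenHermes-2.5-Mistral-7B/Q4_K_M")  # hypothetical model reference
client.servers.start(server.id)

chat = ChatOpenAI(
    base_url=server.openai_url,   # OpenAI-compatible /v1 endpoint (see the Server table above)
    api_key="unused",             # assumption: the local server does not check the key
    model="OpenHermes-2.5-Mistral-7B/Q4_K_M",
)
print(chat.invoke("What is a quantized model?").content)
```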
For more information on using the langchain-openai
package, see the official documentation.
Install the llama-index-llms-openai package:

pip install llama-index-llms-openai

Note: pip install packages in your conda environment once all other packages and their dependencies have been installed. For more information on installing pip packages in your conda environment, see Installing pip packages.

The AnacondaModel class supports the following arguments:
Parameter | Type | Description | Default |
---|---|---|---|
model | str | Model name | Required |
system_prompt | str | System prompt | None |
temperature | float | Sampling temperature | 0.1 |
max_tokens | int | Max tokens to predict | Model default |
api_params | dict or APIParams | Server configuration | None |
load_params | dict or LoadParams | Model loading | None |
infer_params | dict or InferParams | Inference config | None |
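For example, a sketch (the import path for AnacondaModel is an assumption; the model reference is illustrative):

```python
# Assumed import path; check the anaconda-ai package for the exact module.
from anaconda_ai.integrations.llama_index import AnacondaModel

llm = AnacondaModel(
    model="OpenHermes-2.5-Mistral-7B/Q4_K_M",   # hypothetical model reference
    system_prompt="You are a concise assistant.",
    temperature=0.2,
    max_tokens=256,
)
print(llm.complete("What is a quantized model?"))
```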
For more information on using the llama-index-llms-openai package, see the official documentation.
A custom provider is available for litellm. However, because litellm does not currently support entry points for registering providers, you must import the module first.
Models can be used with litellm.completion() for both standard and streamed completions (stream=True). Most OpenAI-compatible inference parameters are available and behave as expected, with the exception of the n parameter (for multiple completions), which is not currently supported.
You can also configure server behavior using the optional_params argument. This accepts a dictionary with api_params, load_params, and infer_params keys, matching the parameter schemas defined in the SDK:
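A sketch of what this can look like (the integration module path and the provider prefix in the model string are assumptions; the optional_params keys follow the SDK schemas described earlier):

```python
import litellm
import anaconda_ai.integrations.litellm  # assumed module path; importing registers the provider

response = litellm.completion(
    model="anaconda/OpenHermes-2.5-Mistral-7B/Q4_K_M",  # hypothetical provider prefix and model reference
    messages=[{"role": "user", "content": "Name three uses for a local LLM."}],
    optional_params={
        "api_params": {},    # fields from the SDK's APIParams schema
        "load_params": {},   # fields from LoadParams
        "infer_params": {},  # fields from InferParams
    },
)
print(response.choices[0].message.content)

# Streaming works the same way with stream=True
for chunk in litellm.completion(
    model="anaconda/OpenHermes-2.5-Mistral-7B/Q4_K_M",
    messages=[{"role": "user", "content": "Stream a short limerick."}],
    stream=True,
):
    print(chunk.choices[0].delta.content or "", end="")
```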
For more information on using the litellm package, see the official documentation.
The DSPy integration is built on litellm, allowing you to use quantized local models with any DSPy module that relies on the LM interface.

Install the dspy package:

pip install dspy

Note: pip install packages in your conda environment once all other packages and their dependencies have been installed. For more information on installing pip packages in your conda environment, see Installing pip packages.
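A sketch of wiring DSPy to a local model through the litellm provider (the integration module path and provider prefix are assumptions, as in the litellm example above):

```python
import dspy
import anaconda_ai.integrations.litellm  # assumed module path; registers the litellm provider

# Hypothetical provider prefix and model reference
lm = dspy.LM("anaconda/OpenHermes-2.5-Mistral-7B/Q4_K_M")
dspy.configure(lm=lm)

qa = dspy.Predict("question -> answer")
print(qa(question="What is quantization?").answer)
```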
For more information on using the dspy package, see the official documentation.
The ChatInterface callback requires panel, httpx, and numpy to be installed in your environment.

AnacondaModelHandler supports the following keyword arguments:
Parameter | Description |
---|---|
display_throughput | Show a speed dial next to the response. Default is False |
system_message | Default system message applied to all responses |
client_options | Optional dict passed as keyword arguments to chat.completions.create |
api_params | Optional dict or APIParams object |
load_params | Optional dict or LoadParams object |
infer_params | Optional dict or InferParams object |
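A sketch of hooking the handler into a Panel ChatInterface (the import path, the positional model argument, and the .callback attribute are assumptions; the model reference is illustrative):

```python
import panel as pn

from anaconda_ai.integrations.panel import AnacondaModelHandler  # assumed import path

pn.extension()

handler = AnacondaModelHandler(
    "OpenHermes-2.5-Mistral-7B/Q4_K_M",      # hypothetical model reference
    display_throughput=True,
    system_message="You are a concise assistant.",
)

# Assumption: the handler exposes a ChatInterface-compatible callback
chat = pn.chat.ChatInterface(callback=handler.callback)
chat.servable()
```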