AI Navigator SDK
The `anaconda-ai` package exposes both a CLI and a Python SDK for working with Anaconda AI Navigator. With it, you can:
- Download and manage quantized LLMs
- Launch and manage inference servers
- Integrate with popular frameworks like LangChain, LlamaIndex, DSPy, and more
Installation
Install the package into your conda environment:
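For example (the channel may vary with your setup):

```shell
conda install anaconda-ai
```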
Configuration
Configuration settings are defined in `~/.anaconda/config.toml` under `[plugin.ai]`.
Configurable parameters
| Parameter | Environment Variable | Description | Default |
|---|---|---|---|
| `stop_server_on_exit` | `ANACONDA_AI_STOP_SERVER_ON_EXIT` | Automatically stop servers when the Python interpreter exits | `true` |
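For example, to keep servers running after the interpreter exits:

```toml
# ~/.anaconda/config.toml
[plugin.ai]
stop_server_on_exit = false
```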
Model reference format
Quantized models follow this reference format:

`<AUTHOR>/<MODEL>/<QUANT><EXT>`

- `<AUTHOR>`: The model publisher's name (optional)
- `<MODEL>`: The model name
- `<QUANT>`: The quantization method (`Q4_K_M`, `Q5_K_M`, `Q6_K`, or `Q8_0`)
- `<EXT>`: The file extension, usually `.gguf` (optional)
The model name and quantization method must be separated by either `/` or `_`.
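For example, `OpenHermes-2.5-Mistral-7B/Q4_K_M` and `OpenHermes-2.5-Mistral-7B_Q4_K_M` (an illustrative model name) refer to the same quantization.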
AI Navigator CLI
The CLI provides functionality for managing models and servers using the `anaconda ai` command.
| Subcommand | Description |
|---|---|
| `create-table` | Create a table in the vector database |
| `download` | Download a model |
| `drop-table` | Drop a table from the vector database |
| `launch` | Launch an inference server for a model |
| `launch-vectordb` | Start a vector database |
| `list-tables` | List all tables in the vector database |
| `models` | List model information |
| `remove` | Remove a downloaded model |
| `servers` | List running servers |
| `stop` | Stop a running server |
| `stop-vectordb` | Stop the vector database |
Append `--help` to any subcommand for more information.
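A typical workflow using these subcommands (the model name is illustrative):

```shell
anaconda ai models                                     # list model information
anaconda ai download OpenHermes-2.5-Mistral-7B/Q4_K_M  # download a quantization
anaconda ai launch OpenHermes-2.5-Mistral-7B/Q4_K_M    # launch an inference server
anaconda ai servers                                    # list running servers
```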
AI Navigator SDK
The AI Navigator SDK provides a Python interface for managing models and inference servers. Use it to list and download quantized models, configure and launch API servers, and integrate directly into your application’s workflows.
Client initialization
Initializing the client exposes:

- `.models`: Model listing, metadata retrieval, and download
- `.servers`: Server creation and control
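A minimal initialization sketch (the import path and function name are assumptions; consult the package's API reference):

```python
from anaconda_ai import get_default_client  # function name assumed

client = get_default_client()
client.models   # model listing, metadata retrieval, and download
client.servers  # server creation and control
```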
.models
The `.models` accessor provides methods for listing and downloading models.
| Method | Return Type | Description |
|---|---|---|
| `.list()` | `List[ModelSummary]` | List all available and downloaded models |
| `.download('<MODEL>/<QUANT>')` | `None` | Download a quantized model file |
| `.get('<MODEL>')` | `ModelSummary` | Fetch model metadata |
ModelSummary
| Attribute/Method | Return | Description |
|---|---|---|
| `.id` | `str` | Model ID in the format `<author>/<model-name>` |
| `.name` | `str` | Model name |
| `.metadata` | `ModelMetadata` | Model metadata and quantization files |
ModelMetadata
| Attribute/Method | Return | Description |
|---|---|---|
| `.numParameters` | `int` | Number of model parameters |
| `.contextWindowSize` | `int` | Context window length |
| `.trainedFor` | `str` | Training purpose (`'sentence-similarity'` or `'text-generation'`) |
| `.description` | `str` | Model description |
| `.files` | `List[ModelQuantization]` | Available quantization files |
| `.get_quantization('<QUANT>')` | `ModelQuantization` | Get metadata for a specific quantization |
ModelQuantization
| Attribute/Method | Return | Description |
|---|---|---|
| `.download()` | `None` | Download the quantization file |
| `.id` | `str` | SHA256 checksum of the model file |
| `.modelFileName` | `str` | File name on disk |
| `.method` | `str` | Quantization method |
| `.sizeBytes` | `int` | File size in bytes |
| `.maxRamUsage` | `int` | Required RAM in bytes |
| `.isDownloaded` | `bool` | Download status |
| `.localPath` | `str` | Local file path (if downloaded) |
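A short sketch that walks these objects, using the attribute names from the tables above (the model name is illustrative, and `client` comes from the initialization sketch earlier):

```python
# list every model and print basic metadata
for model in client.models.list():
    meta = model.metadata
    print(model.id, meta.numParameters, meta.contextWindowSize)

# inspect one quantization of a specific model
quant = client.models.get("OpenHermes-2.5-Mistral-7B").metadata.get_quantization("Q4_K_M")
print(quant.method, quant.sizeBytes, quant.isDownloaded)
```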
Downloading models
Download a quantized model file using one of the following approaches:
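Both approaches, sketched (the model name is illustrative):

```python
# 1. by string reference
client.models.download("OpenHermes-2.5-Mistral-7B/Q4_K_M")

# 2. by ModelQuantization object
quant = client.models.get("OpenHermes-2.5-Mistral-7B").metadata.get_quantization("Q4_K_M")
client.models.download(quant)  # quant.download() is equivalent
```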
`client.models.download()` accepts either a string reference or a `ModelQuantization` object.
If the model has already been downloaded, the function returns immediately. Otherwise, a progress bar is displayed.
.servers
The `.servers` accessor provides methods for creating, listing, starting, and stopping servers.
| Method | Return | Description |
|---|---|---|
| `.list()` | `List[Server]` | List running servers |
| `.match()` | `Server` | Find a running server matching a configuration |
| `.create()` | `Server` | Create a new server configuration |
| `.start('<server-id>')` | `None` | Start the API server |
| `.status('<server-id>')` | `str` | Get server status |
| `.stop('<server-id>')` | `None` | Stop a running server |
| `.delete('<server-id>')` | `None` | Remove a server configuration |
Creating servers
The `.create` method creates a new server configuration. By default, creating a server configuration downloads the model file (if it is not already downloaded) and selects a random, unused port for the server. For example:
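A minimal sketch (assuming `.create()` takes a model reference string; the model name is illustrative):

```python
server = client.servers.create("OpenHermes-2.5-Mistral-7B/Q4_K_M")
print(server.url)  # see the server attributes table below
```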
If a server with the specified configuration is already running, the existing configuration is returned, and no new server is created.
Server configuration parameters
The optional parameters listed below can be passed as dictionaries and can be used to avoid automatically downloading the model file.
Parameters set to `None` are omitted from the server launch command and fall back to backend-defined defaults.
| Parameter | Type | Description |
|---|---|---|
| `api_params` | `APIParams` or dict | Parameters for how the server is configured |
| `load_params` | `LoadParams` or dict | Parameters for how the model is loaded |
| `infer_params` | `InferParams` or dict | Parameters for inference configuration |
For example, to create a server configuration that uses a specific port; customizes the context size, number of GPU layers, and temperature; and avoids downloading the model file, your code might look like this:
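A sketch of that configuration (the individual key names inside each dictionary are assumptions; per the note above, passing the parameters as dictionaries avoids the automatic download):

```python
server = client.servers.create(
    "OpenHermes-2.5-Mistral-7B/Q4_K_M",  # model name illustrative
    api_params={"port": 9999},            # key names assumed
    load_params={"ctx_size": 4096, "n_gpu_layers": 10},
    infer_params={"temperature": 0.2},
)
```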
Managing servers
New servers are not automatically started when their configuration is created. You can start or stop a server using the following methods:
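For example (assuming the `Server` object exposes its ID as `.id`):

```python
server = client.servers.create("OpenHermes-2.5-Mistral-7B/Q4_K_M")  # model name illustrative
client.servers.start(server.id)
print(client.servers.status(server.id))
client.servers.stop(server.id)
```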
Alternatively, use the server as a context manager, which automatically stops the server when exiting the indented block:
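A sketch, assuming the `Server` object supports the context-manager protocol:

```python
with client.servers.create("OpenHermes-2.5-Mistral-7B/Q4_K_M") as server:
    print(server.url)
    # ... make requests against the server ...
# the server is stopped automatically here
```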
Server attributes
| Attribute | Description |
|---|---|
| `.url` | Full server URL (example: `http://127.0.0.1:8000`) |
| `.openai_url` | Server URL with `/v1` appended for OpenAI compatibility |
| `.openai_client()` | Pre-configured OpenAI client |
| `.openai_async_client()` | Pre-configured async OpenAI client |
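For example, chatting through the pre-configured client (the model name is illustrative):

```python
openai_client = server.openai_client()
response = openai_client.chat.completions.create(
    model="OpenHermes-2.5-Mistral-7B/Q4_K_M",
    messages=[{"role": "user", "content": "What is pi?"}],
)
print(response.choices[0].message.content)
```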
Framework integrations
The SDK provides integrations with several popular AI frameworks:
LLM
Install the `llm` package in your environment:
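```shell
pip install llm
```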
To list the Anaconda AI models, run:
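```shell
llm models list
```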
When you invoke a model, `llm` first ensures that the model has been downloaded, then starts the server using AI Navigator. Standard OpenAI parameters and the SDK's server parameters are supported. For example:
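A sketch (the model name is illustrative):

```shell
llm -m OpenHermes-2.5-Mistral-7B/Q4_K_M -o temperature 0.2 "What is pi?"
```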
To view server parameters, run:
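```shell
llm models list --options
```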
For more information on using the `llm` package, see the official documentation.
LangChain
The LangChain integration provides Chat and Embedding classes that automatically manage downloading models and starting servers.
Install `langchain-openai` in your environment:
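```shell
pip install langchain-openai
```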
Only `pip install` packages in your conda environment once all other packages and their dependencies have been installed. For more information on installing pip packages in your conda environment, see Installing pip packages.
Here is a minimal setup example for using LangChain with Anaconda’s models:
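A minimal sketch (the import path and the chat class name are assumptions; `AnacondaQuantizedEmbedding` is the embedding class referenced below; model names are illustrative):

```python
from anaconda_ai.integrations.langchain import (  # import path assumed
    AnacondaQuantizedChatModel,  # chat class name assumed
    AnacondaQuantizedEmbedding,
)

llm = AnacondaQuantizedChatModel(model="OpenHermes-2.5-Mistral-7B/Q4_K_M")
print(llm.invoke("What is pi?").content)

embedding = AnacondaQuantizedEmbedding(model="all-MiniLM-L6-v2/Q4_K_M")
vector = embedding.embed_query("hello world")
```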
The following keyword arguments are supported:

- `api_params`: Dict or the `APIParams` class above
- `load_params`: Dict or the `LoadParams` class above
- `infer_params`: Dict or the `InferParams` class above (not supported by `AnacondaQuantizedEmbedding`)

The SDK's server parameter classes are supported when working with LangChain, with the exception of `AnacondaQuantizedEmbedding`, which does not accept `infer_params`.
For more information on using the `langchain-openai` package, see the official documentation.
LlamaIndex
Install the `llama-index-llms-openai` package:
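```shell
pip install llama-index-llms-openai
```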
Only `pip install` packages in your conda environment once all other packages and their dependencies have been installed. For more information on installing pip packages in your conda environment, see Installing pip packages.
Here is a minimal setup example for using LlamaIndex with Anaconda AI Navigator:
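A minimal sketch (the import path is an assumption; `AnacondaModel` and its arguments are documented below; the model name is illustrative):

```python
from anaconda_ai.integrations.llama_index import AnacondaModel  # import path assumed

llm = AnacondaModel(
    model="OpenHermes-2.5-Mistral-7B/Q4_K_M",
    temperature=0.1,
)
print(llm.complete("What is pi?"))
```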
The `AnacondaModel` class supports the following arguments:
| Parameter | Type | Description | Default |
|---|---|---|---|
| `model` | `str` | Model name | Required |
| `system_prompt` | `str` | System prompt | `None` |
| `temperature` | `float` | Sampling temperature | `0.1` |
| `max_tokens` | `int` | Max tokens to predict | Model default |
| `api_params` | dict or `APIParams` | Server configuration | `None` |
| `load_params` | dict or `LoadParams` | Model loading | `None` |
| `infer_params` | dict or `InferParams` | Inference config | `None` |
For more information on using the `llama-index-llms-openai` package, see the official documentation.
LiteLLM
The SDK provides a `CustomLLM` provider for use with `litellm`. However, because `litellm` does not currently support entrypoints for registering providers, you must import the integration module first:
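A sketch (the integration module path and the `anaconda/` model prefix are assumptions; the model name is illustrative):

```python
import litellm

import anaconda_ai.integrations.litellm  # noqa: F401  # module path assumed; importing registers the provider

response = litellm.completion(
    model="anaconda/OpenHermes-2.5-Mistral-7B/Q4_K_M",
    messages=[{"role": "user", "content": "What is pi?"}],
)
print(response.choices[0].message.content)
```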
This integration supports `litellm.completion()` for both standard and streamed completions (`stream=True`). Most OpenAI-compatible inference parameters are available and behave as expected, with the exception of the `n` parameter (for multiple completions), which is not currently supported.
You can also configure server behavior using the `optional_params` argument. This accepts a dictionary with `api_params`, `load_params`, and `infer_params` keys, matching the parameter schemas defined in the SDK:
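For example (the top-level keys follow the SDK parameter tables above; the key names inside each dictionary are assumptions):

```python
response = litellm.completion(
    model="anaconda/OpenHermes-2.5-Mistral-7B/Q4_K_M",  # model name illustrative
    messages=[{"role": "user", "content": "What is pi?"}],
    optional_params={
        "api_params": {"port": 9999},          # key names assumed
        "load_params": {"ctx_size": 4096},
        "infer_params": {"temperature": 0.2},
    },
)
```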
For more information on using the `litellm` package, see the official documentation.
DSPy
The SDK integrates with DSPy using `litellm`, allowing you to use quantized local models with any DSPy module that relies on the `LM` interface.
Install the `dspy` package:
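```shell
pip install dspy
```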
Only `pip install` packages in your conda environment once all other packages and their dependencies have been installed. For more information on installing pip packages in your conda environment, see Installing pip packages.
Here is an example of how to use DSPy with Anaconda’s models:
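A sketch (the `anaconda/` model prefix and integration import are assumptions carried over from the LiteLLM section; the model name is illustrative):

```python
import dspy

import anaconda_ai.integrations.litellm  # noqa: F401  # registers the provider; module path assumed

lm = dspy.LM("anaconda/OpenHermes-2.5-Mistral-7B/Q4_K_M")
dspy.configure(lm=lm)

predict = dspy.Predict("question -> answer")
print(predict(question="What is pi?").answer)
```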
For more information on using the `dspy` package, see the official documentation.
Panel
Use Panel’s `ChatInterface` callback to build a chatbot that uses one of Anaconda’s models by serving it through the SDK.

The `ChatInterface` callback requires `panel`, `httpx`, and `numpy` to be installed in your environment:
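```shell
pip install panel httpx numpy
```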
Here’s an example Panel chatbot application:
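A sketch of a chatbot application (the import path and the handler's `callback` attribute are assumptions; the model name is illustrative):

```python
import panel as pn

from anaconda_ai.integrations.panel import AnacondaModelHandler  # import path assumed

pn.extension()

llm = AnacondaModelHandler(
    "OpenHermes-2.5-Mistral-7B/Q4_K_M",  # model name illustrative
    display_throughput=True,
)

chat = pn.chat.ChatInterface(callback=llm.callback)  # callback attribute assumed
chat.send("Ask me anything!", user="System", respond=False)
chat.servable()
```

Serve it with `panel serve app.py`.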
`AnacondaModelHandler` supports the following keyword arguments:
| Parameter | Description |
|---|---|
| `display_throughput` | Show a speed dial next to the response. Default is `False` |
| `system_message` | Default system message applied to all responses |
| `client_options` | Optional dict passed as keyword arguments to `chat.completions.create` |
| `api_params` | Optional dict or `APIParams` object |
| `load_params` | Optional dict or `LoadParams` object |
| `infer_params` | Optional dict or `InferParams` object |
For more information on using Panel, see the official documentation.