Anaconda Desktop is currently available through a limited early access program. Anaconda AI Navigator provides many of the same capabilities and is available to all users.
The built-in API server enables you to run open-source large language models (LLMs) locally on your own machine. It facilitates direct interaction with your models through API calls, providing a powerful tool for testing your application’s AI workflows without the need for external cloud services. By hosting your own API server locally, you have full control over the model’s behavior and maintain the privacy and security of your data.

Understanding the API server

The API server is the core component of Anaconda Desktop that enables you to interact with your locally downloaded LLMs through API calls. Access the API server by selecting Model Servers from the left-hand navigation.
Hover over a tooltip to view information about the API server's configurable settings.

Server address

Your local server address, often referred to as localhost or 127.0.0.1, is the default address for the API server. It is a loopback address, meaning any network communication sent to this address remains within the same machine, keeping your data secure and private. This makes it a common choice for local application development.
localhost and 127.0.0.1 are interchangeable; localhost resolves to 127.0.0.1.
If you are working in an office network and would like to make your server accessible to other devices within the same network, you can set the server address to your machine’s local IP address. These are typically private networks, meaning they’re not routable on the public internet. Setting the server address to 0.0.0.0 configures it to accept connections on all network interfaces. This can expose your server to the public internet if your network is configured to allow external connections and route traffic from external sources to your server.
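To share the server on your network, you need your machine's local IP address. A minimal Python sketch for discovering it (the 8.8.8.8 address is only used for a routing lookup; connecting a UDP socket sends no packets):

```python
import socket

def local_ip() -> str:
    """Return this machine's local (LAN) IP address, best effort."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # Connecting a UDP socket sends no traffic; it only asks the OS
        # which interface it would route through to reach this address.
        s.connect(("8.8.8.8", 80))
        return s.getsockname()[0]
    except OSError:
        # No route available (for example, offline); fall back to loopback.
        return "127.0.0.1"
    finally:
        s.close()

if __name__ == "__main__":
    print(f"Other devices on your network can reach the server at: {local_ip()}")
```

The address this prints is the value you would enter as the server address to make the API server reachable from other devices on the same private network.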

Server port

The server port tells the built-in API server where to listen for incoming traffic (that is, where to listen and pick up API requests). For more information, see Cloudflare’s article on ports.
Your base URL for communicating with the API server combines the server address and port number from the Model Servers page. For example, http://localhost:8080/.
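The base URL can be derived programmatically from the two values on the Model Servers page. A minimal sketch; the /health endpoint in the usage example is an assumption based on common llama.cpp-style servers, so verify it exists in your build:

```python
import urllib.request

def base_url(address: str, port: int) -> str:
    """Combine the server address and port from the Model Servers page."""
    return f"http://{address}:{port}/"

if __name__ == "__main__":
    # Hypothetical liveness check: many llama.cpp-style servers expose /health.
    url = base_url("localhost", 8080) + "health"
    with urllib.request.urlopen(url) as resp:
        print(resp.status)
```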

API key

Establish an API key that must be passed as an Authorization: Bearer token in the header of requests made to your server. You can choose any unique string you want as the API key, but the value you enter on the Model Servers page must match what you use in the Authorization: Bearer header.
To test an API key:
  1. Select Model Servers from the left-hand navigation.
  2. Enter your API key in the API Key field.
  3. Load a model into the API server.
    You must use a text-generation type model.
  4. Open Anaconda Prompt (Terminal on macOS/Linux) and run the following command:
    curl --request POST ^
        --url http://localhost:8080/completion ^
        --header "Authorization: Bearer <API_KEY>" ^
        --header "Content-Type: application/json" ^
        --data "{\"prompt\": \"Hello, how are you?\"}"
    
    On macOS and Linux, replace each trailing ^ with \ and wrap the --data payload in single quotes instead of escaping the inner double quotes.
    
    Replace <API_KEY> with the value you entered on the Model Servers page.
This command sends a request to the local API server's /completion endpoint to interact with the model you have loaded. If you receive a response, your API key is working. If you receive a 401 Unauthorized error, double-check your command and try again.
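The same test can be run from Python using only the standard library. A minimal sketch of the request the curl command sends; the URL and API key placeholder mirror the example above:

```python
import json
import urllib.request
from urllib.error import HTTPError

API_KEY = "<API_KEY>"  # must match the key entered on the Model Servers page
URL = "http://localhost:8080/completion"

def build_request(url: str, api_key: str, prompt: str) -> urllib.request.Request:
    """Build the same POST request the curl example sends."""
    payload = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

if __name__ == "__main__":
    try:
        req = build_request(URL, API_KEY, "Hello, how are you?")
        with urllib.request.urlopen(req) as resp:
            print(json.loads(resp.read()))
    except HTTPError as err:
        if err.code == 401:
            print("401 Unauthorized: the API key does not match the server's key.")
        else:
            raise
```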

Loading a model

Use the Model dropdown to select the model you want to load into the API server and the File dropdown to choose the quantization level to interact with.

Starting the server

To start the API server, select Start.
To stop the API server, select Stop.

Viewing API server logs

The API server records all incoming traffic for the currently running server and displays relevant information in the Server Logs. To view your server logs, select Logs on the Model Servers page. The server logs provide information for the following metrics:
  • System information - Provides information about your system’s hardware capabilities.
  • Build information - Provides information about the version of the API server you are using.
  • Chat template - Shows the prompt template used to format the sequence of messages sent to the model.
  • Server listening - Displays the address and port the server is listening on.
  • Slot information - Displays the number of available slots for the server. Each slot is able to manage one user API request at a time.
  • Task information - Displays information such as time spent processing and responding to a request, the request task ID, and which slot fulfilled the task.
Once the server is stopped, the log is saved to a platform-specific location with a timestamped filename (for example, <YYYY><MM><DD><HH><MM><SS>_api-server.log), so you can efficiently locate specific server logs if necessary:
  • Windows: C:\Users\<USERNAME>\AppData\Roaming\anaconda-desktop\logs
  • macOS