Developing a chatbot

In this tutorial, you’ll learn to build a simple chatbot in Python that utilizes Anaconda Desktop’s OpenAI-compatible API server to process natural language queries. You will use conda to establish a working environment to develop the chatbot, build the chatbot application using the OpenAI Python library, and interact with the chatbot at the command line.

Prerequisites

Before you begin, ensure that you have conda installed on your machine. You can install conda by downloading either Anaconda Distribution or Miniconda.
You must have a text-generation model downloaded onto your local machine. Walk through the Getting started with Anaconda Desktop guide before you begin the tutorial.

Setting up your environment

When working on a new conda project, it is recommended that you create a new environment for development. Follow these steps to set up an environment for your chatbot:

Open Anaconda Prompt (Terminal on macOS/Linux).
This terminal can be opened from within an IDE (JupyterLab, PyCharm, VS Code, Spyder), if preferred.
Create the environment for your chatbot development and install the packages you’ll need by running the following command:
```
conda create --name chataconda python openai
```
Activate your newly created conda environment by running the following command:
```
conda activate chataconda
```

For more information and best practices for managing environments, see Environments.

Building the chatbot

Below, you’ll find the necessary code to build your chatbot, along with an explanation of each section to help you understand the functionality of the code. Using your preferred IDE, create a new file on your machine, and name it chatter-max.py.

Importing libraries

The application uses the OpenAI Python library to communicate with the API server. We’ll import the necessary modules for making API calls and handling errors. Make this the first line of code in your chatter-max.py file:

from openai import OpenAI, APIError, APIConnectionError

Setting the `base_url`

Your application sends natural language inputs to the API server and receives generated responses, so it needs to know where the server is listening. The base_url can be constructed by combining the Server Address and Server Port specified in Anaconda Desktop, like this: http://<SERVER_ADDRESS>:<SERVER_PORT>. Set the base_url to point to the default server address by adding the following line to your file:

base_url = 'http://localhost:8080'

On most systems, localhost resolves to the loopback address 127.0.0.1, so the two are interchangeable here.
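As a minimal sketch of the construction described above, the base URL can be assembled from the two values rather than typed by hand. The names SERVER_ADDRESS, SERVER_PORT, and build_base_url are illustrative assumptions, not part of Anaconda Desktop or the OpenAI library; the values shown are the defaults used in this tutorial.

```python
# Hypothetical names: SERVER_ADDRESS and SERVER_PORT mirror the fields
# shown in Anaconda Desktop; build_base_url is an illustrative helper.
SERVER_ADDRESS = "localhost"
SERVER_PORT = 8080


def build_base_url(address, port):
    """Combine a server address and port into an HTTP base URL."""
    return f"http://{address}:{port}"


base_url = build_base_url(SERVER_ADDRESS, SERVER_PORT)
print(base_url)
```

If you change the address or port in Anaconda Desktop, only the two constants need to change.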

Adding the API calls

The OpenAI chat completions API is the main endpoint you’ll use to interact with your model. For a full list of API endpoints and detailed information, see the OpenAI Chat Completions API documentation. To enable your application to communicate with the API server, you need to implement a function that makes the chat completion API call.

Creating the completion function

To interact with the model, you need a function that sends messages to the chat completion endpoint and receives responses. This function handles the API call and error management.

The messages parameter is a list of conversation messages, where each message has a role (system, user, or assistant) and content. This format allows the model to understand the full context of the conversation.

The parameters in client.chat.completions.create() control how the model generates responses. The stop parameter specifies tokens that tell the model when to stop generating text. The tokens <|im_end|> and <|im_start|> are special control markers used by the ChatML chat template to denote message boundaries. By including these as stop tokens, you end generation as soon as the model emits one of them, which keeps the markers out of the response and prevents the model from confusing the conversation structure or generating both sides of the conversation.

Add the following lines to your chatter-max.py file:

def get_completion(client, messages):
    """Send messages to the chat completion API and return the response."""
    try:
        response = client.chat.completions.create(
            model="",
            messages=messages,
            temperature=0.8,
            top_p=0.95,
            max_completion_tokens=400,
            stop=["<|im_end|>", "<|im_start|>"]
        )
        return response.choices[0].message.content.strip(), True
    except APIConnectionError as e:
        return f"Connection error: {e}", False
    except APIError as e:
        return f"API error: {e}", False

The model parameter can be left empty when using Anaconda Desktop’s API server, as the model is already loaded on the server.
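To make the messages format more concrete, here is a small sketch of what the list might look like partway through a conversation. The conversation text is invented for illustration; only the role and content keys come from the description above.

```python
# A sketch of the message list format; the conversation text is made up.
messages = [
    {"role": "system", "content": "You are a friendly AI assistant."},
    {"role": "user", "content": "What is conda?"},
    {"role": "assistant", "content": "Conda is a package and environment manager."},
    {"role": "user", "content": "How do I create an environment?"},
]

# Each message is a dictionary with a role and its content.
for message in messages:
    print(f"{message['role']:>9}: {message['content']}")
```

Because the whole list is sent with every request, the model sees the earlier turns when it answers the latest user message.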

Constructing the main function

The main function initiates the chatbot, handles user inputs, and manages the flow of the conversation. This is where you set the initial system prompt that defines the assistant’s behavior.

Play around with the system prompt to see how it impacts the responses you receive from your model!

If you set an API key value on the Model Servers page in Anaconda Desktop, you can pass it to the OpenAI client by replacing the empty string in the api_key parameter. Otherwise, leave it as an empty string to skip authentication.

Add the following lines to your chatter-max.py file:

def main():
    """Run the interactive chatbot loop."""
    client = OpenAI(base_url=base_url, api_key="")

    system_prompt = ('You are a friendly AI assistant designed to provide '
                      'helpful, succinct, and accurate information.')
    messages = [{'role': 'system', 'content': system_prompt}]

    print(f"Connected to: {base_url}")

    while True:
        user_input = input("Enter a prompt or type 'exit' to quit: ")
        if user_input.lower() == 'exit':
            break

        messages.append({'role': 'user', 'content': user_input})
        assistant_response, success = get_completion(client, messages)
        print('Assistant:', assistant_response)

        if success:
            messages.append({'role': 'assistant', 'content': assistant_response})
        else:
            messages.pop()


if __name__ == "__main__":
    main()
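If you do set an API key, one option is to read it from an environment variable instead of writing it into the file. The sketch below assumes a hypothetical variable name, CHATBOT_API_KEY, and a hypothetical helper, load_api_key; neither is defined by Anaconda Desktop or the OpenAI library.

```python
import os


def load_api_key(variable="CHATBOT_API_KEY"):
    """Return the API key from the environment, or an empty string if unset.

    CHATBOT_API_KEY is a hypothetical variable name chosen for this sketch.
    """
    return os.environ.get(variable, "")


api_key = load_api_key()
# In main(), this value could replace the empty string:
#     client = OpenAI(base_url=base_url, api_key=api_key)
print("API key set:", bool(api_key))
```

When the variable is unset, the helper returns an empty string, which matches the no-authentication case described earlier.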

Running the chatbot

With your chatbot constructed, it’s time to take your model for a test run!

Open Anaconda Desktop and load a model into the API server.
Leave the Server Address and Server Port at the default values and click Start.
In your terminal (with the chataconda environment activated), navigate to the directory where you stored your chatter-max.py file.
Initiate the chatbot by running the following command:
```
python chatter-max.py
```
If everything is set up correctly, you’ll see a confirmation message showing the connected URL, and you can start chatting with your model.

Make sure the API server is running in Anaconda Desktop before running your chatbot. If the server isn’t running, you’ll see a connection error.
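If you are unsure whether the server is running, a quick check with Python's standard library can tell you whether anything is accepting connections at the default address and port. This is a rough sketch: server_is_reachable is a hypothetical helper, and a successful TCP connection only shows that something is listening there, not that a model is loaded.

```python
import socket


def server_is_reachable(host="localhost", port=8080, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


if server_is_reachable():
    print("Something is listening on localhost:8080.")
else:
    print("Nothing is listening on localhost:8080. Start the API server first.")
```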

Having some fun with the model

Try adjusting the following parameters in the get_completion() function to see how they affect the output from the model.

Visit OpenAI’s official documentation for additional available parameters. Note that some parameters may not be supported by all available models on Anaconda Desktop.

`temperature`

Adjusting the temperature of your model can increase or decrease the randomness of the responses you receive from your prompts. Higher values (for example, 2.0) make the output more creative and varied. Lower values (for example, 0.2) make the output more deterministic and focused. The valid range is 0.0 to 2.0. Defaults to 0.8.
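To build intuition for why temperature changes randomness, the sketch below applies the common temperature-scaled softmax to a few made-up token scores. It illustrates the general technique only; how the model server applies temperature internally is not specified here, and the function name and scores are assumptions.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, scaled by a temperature > 0."""
    scaled = [value / temperature for value in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
for temperature in (0.2, 0.8, 2.0):
    probs = softmax_with_temperature(logits, temperature)
    print(temperature, [round(p, 3) for p in probs])
```

At low temperatures nearly all of the probability goes to the highest-scoring token, while at high temperatures the distribution flattens and less likely tokens are sampled more often.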

`top_p`

Limits token selection to the smallest set of most likely tokens whose cumulative probability reaches the top_p threshold, balancing creativity with coherence. Higher values (closer to 1.0) allow the model to provide more creative responses, and lower values enhance focus. The valid range is 0.0 to 1.0. Adjust top_p to see how it affects the model’s descriptiveness for the same prompt. Defaults to 0.95.
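The selection step can be sketched as follows, using the usual description of nucleus (top-p) sampling. The token probabilities are invented, nucleus is a hypothetical helper, and a real server may differ in details such as tie handling.

```python
def nucleus(probabilities, top_p):
    """Return the most likely tokens whose cumulative probability reaches top_p."""
    ranked = sorted(probabilities.items(), key=lambda item: item[1], reverse=True)
    kept, cumulative = [], 0.0
    for token, probability in ranked:
        kept.append(token)
        cumulative += probability
        if cumulative >= top_p:
            break
    return kept


# Made-up next-token probabilities for illustration.
probabilities = {"cat": 0.5, "dog": 0.3, "fish": 0.15, "emu": 0.05}
print(nucleus(probabilities, 0.5))
print(nucleus(probabilities, 0.9))
```

The model then samples only from the kept tokens, so a lower top_p removes the unlikely tail of the distribution.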

`max_completion_tokens`

Controls the maximum number of tokens the model can generate in its response. One token is roughly equivalent to 4 characters or 0.75 words in English. Increasing this value allows for longer responses, while decreasing it keeps responses more concise. Defaults to 400.
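For a rough sense of scale, the rule of thumb above can be turned into a quick estimate. The helper names are hypothetical and the results are approximations only; actual token counts depend on the model's tokenizer.

```python
CHARS_PER_TOKEN = 4     # rough figure for English text, from the text above
WORDS_PER_TOKEN = 0.75  # rough figure for English text, from the text above


def estimate_tokens(text):
    """Estimate the token count of English text from its character length."""
    return round(len(text) / CHARS_PER_TOKEN)


def estimate_words(token_budget):
    """Estimate how many English words fit in a token budget."""
    return round(token_budget * WORDS_PER_TOKEN)


print(estimate_tokens("Conda manages packages and environments."))
print(estimate_words(400))
```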

`frequency_penalty`

Reduces the likelihood of the model repeating the same words or phrases. Higher values (up to 2.0) make the model avoid repetition more strongly, while lower or negative values allow more repetition. Valid range is -2.0 to 2.0. Defaults to 0.0.

`presence_penalty`

Encourages the model to introduce new topics rather than continuing with existing ones. Higher values (up to 2.0) push the model to be more exploratory, while lower values allow it to stay on topic. Valid range is -2.0 to 2.0. Defaults to 0.0.
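The difference between the two penalties is easier to see side by side. The sketch below follows the adjustment described in OpenAI's documentation, where the frequency penalty scales with how often a token has already appeared and the presence penalty is applied once to any token that has appeared at all. Whether the model server behind Anaconda Desktop uses exactly this formula is an assumption, and the function name, scores, and tokens are made up.

```python
from collections import Counter


def apply_penalties(logits, generated_tokens,
                    frequency_penalty=0.0, presence_penalty=0.0):
    """Lower the scores of tokens that already appear in the generated text."""
    counts = Counter(generated_tokens)
    adjusted = {}
    for token, logit in logits.items():
        count = counts[token]
        adjusted[token] = (
            logit
            - count * frequency_penalty                       # grows with each repeat
            - (1.0 if count > 0 else 0.0) * presence_penalty  # flat, applied once
        )
    return adjusted


logits = {"the": 3.0, "cat": 2.0, "sat": 1.0}  # made-up scores
generated = ["the", "cat", "the"]              # tokens produced so far
print(apply_penalties(logits, generated, frequency_penalty=0.5, presence_penalty=0.2))
```

Tokens that have not appeared yet keep their original score, which is why positive penalties steer the model toward new words.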

Next steps

You can continue to develop and extend this chatbot by adding features like conversation history persistence, multi-turn context management, or integration with other tools. When you’re done, you can delete this file and clean up your conda environment by running the following commands:

conda deactivate
conda remove --name chataconda --all
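As one possible starting point for the conversation history persistence and context management ideas mentioned above, the sketch below saves the messages list to a JSON file and trims older turns. The file name chat_history.json and the helper names are hypothetical, and the trimming rule (keep system messages plus the most recent messages) is just one simple policy.

```python
import json
from pathlib import Path

HISTORY_FILE = Path("chat_history.json")  # hypothetical file name


def save_history(messages, path=HISTORY_FILE):
    """Write the conversation to disk as JSON."""
    path.write_text(json.dumps(messages, indent=2), encoding="utf-8")


def load_history(path=HISTORY_FILE):
    """Read a saved conversation, or return an empty list if none exists."""
    if not path.exists():
        return []
    return json.loads(path.read_text(encoding="utf-8"))


def trim_history(messages, max_messages=20):
    """Keep system messages plus the most recent non-system messages."""
    system = [m for m in messages if m["role"] == "system"]
    others = [m for m in messages if m["role"] != "system"]
    recent = others[-max_messages:] if max_messages > 0 else []
    return system + recent
```

In main(), you could call load_history() before the loop, pass trim_history(messages) to get_completion(), and call save_history(messages) when the user exits.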
