In this tutorial, you’ll learn to build a simple chatbot in Python that uses Anaconda Desktop’s OpenAI-compatible API server to process natural language queries. You will use conda to set up a working environment for the chatbot, build the chatbot application using the OpenAI Python library, and interact with the chatbot at the command line.
Prerequisites
Before you begin, ensure that you have conda installed on your machine. You can install conda by downloading either Anaconda Distribution or Miniconda.
You must have a text-generation type model downloaded onto your local machine. Walk through the Getting started with Anaconda Desktop guide before you begin the tutorial.
When working on a new conda project, it is recommended that you create a new environment for development. Follow these steps to set up an environment for your chatbot:
Below, you’ll find the code needed to build your chatbot, along with an explanation of each section to help you understand what the code does.
Using your preferred IDE, create a new file on your machine, and name it chatter-max.py.
The application uses the OpenAI Python library to communicate with the API server. We’ll import the necessary modules for making API calls and handling errors.
Make this the first line of code in your chatter-max.py file:
from openai import OpenAI, APIError, APIConnectionError
To process natural language inputs and generate responses, your application must know where to reach the API server.
The base_url is constructed by combining the Server Address and Server Port specified in Anaconda Desktop, like this: http://<SERVER_ADDRESS>:<SERVER_PORT>.
Set the base_url to point to the default server address by adding the following line to your file:
base_url = 'http://localhost:8080'
localhost and 127.0.0.1 both refer to your own machine, so you can use either one in the base_url.
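If your server uses a different address or port, the combination described above can be sketched as a small helper. The function name build_base_url and its port check are illustrative assumptions, not part of Anaconda Desktop or the OpenAI library.

```python
def build_base_url(server_address: str, server_port: int, scheme: str = "http") -> str:
    """Combine a server address and port into a base URL string.

    Hypothetical helper for illustration; the values should match the
    Server Address and Server Port shown in Anaconda Desktop.
    """
    if not 0 < server_port < 65536:
        raise ValueError(f"Port out of range: {server_port}")
    return f"{scheme}://{server_address}:{server_port}"


# Default values used in this tutorial.
base_url = build_base_url("localhost", 8080)
print(base_url)  # http://localhost:8080
```

Keeping the address and port as separate values makes it easier to change one of them later without editing the URL string by hand.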
The OpenAI chat completions API is the main endpoint you’ll use to interact with your model. For a full list of API endpoints and detailed information, see the OpenAI Chat Completions API documentation.
To enable your application to communicate with the API server, you need to implement a function that makes the chat completion API call.
To interact with the model, you need a function that sends messages to the chat completion endpoint and receives responses. This function handles the API call and error management.
The messages parameter is a list of conversation messages, where each message has a role (system, user, or assistant) and content. This format allows the model to understand the full context of the conversation.
The parameters in client.chat.completions.create() control how the model generates its responses.
The stop parameter specifies sequences that tell the model when to stop generating text. The tokens <|im_end|> and <|im_start|> are special control markers used by the ChatML chat template to denote message boundaries. By including these as stop sequences, you make generation halt as soon as the model emits one of these markers. This keeps the markers out of the response and prevents the model from generating both sides of the conversation.
Add the following lines to your chatter-max.py file:
def get_completion(client, messages):
    """Send messages to the chat completion API and return the response."""
    try:
        response = client.chat.completions.create(
            model="",
            messages=messages,
            temperature=0.8,
            top_p=0.95,
            max_completion_tokens=400,
            stop=["<|im_end|>", "<|im_start|>"]
        )
        return response.choices[0].message.content.strip(), True
    except APIConnectionError as e:
        return f"Connection error: {e}", False
    except APIError as e:
        return f"API error: {e}", False
The model parameter can be left empty when using Anaconda Desktop’s API server, as the model is already loaded on the server.
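To see why the ChatML markers make sensible stop sequences, it can help to look at roughly how a messages list is laid out as text before the model reads it. The sketch below is illustrative only: the server applies the chat template for you, the exact template depends on the model, and render_chatml is a hypothetical helper, not part of any library.

```python
def render_chatml(messages):
    """Render role/content messages as ChatML-style text (illustration only)."""
    parts = [
        f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>"
        for message in messages
    ]
    # The trailing open assistant turn is where the model starts generating.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)


messages = [
    {"role": "system", "content": "You are a friendly AI assistant."},
    {"role": "user", "content": "What is conda?"},
]
print(render_chatml(messages))
```

Each turn is wrapped in <|im_start|> and <|im_end|>, so a model that keeps generating past the end of its own turn would emit one of these markers next. Stopping on them ends the response at the turn boundary.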
The main function initiates the chatbot, handles user inputs, and manages the flow of the conversation. This is where you set the initial system prompt that defines the assistant’s behavior.
Play around with the system prompt to see how it impacts the responses you receive from your model!
If you set an API key value on the Model Servers page in Anaconda Desktop, you can pass it to the OpenAI client by replacing the empty string in the api_key parameter. Otherwise, leave it as an empty string to skip authentication.
Add the following lines to your chatter-max.py file:
def main():
    """Run the interactive chatbot loop."""
    client = OpenAI(base_url=base_url, api_key="")
    system_prompt = ('You are a friendly AI assistant designed to provide '
                     'helpful, succinct, and accurate information.')
    messages = [{'role': 'system', 'content': system_prompt}]
    print(f"Connected to: {base_url}")
    while True:
        user_input = input("Enter a prompt or type 'exit' to quit: ")
        if user_input.lower() == 'exit':
            break
        messages.append({'role': 'user', 'content': user_input})
        assistant_response, success = get_completion(client, messages)
        print('Assistant:', assistant_response)
        if success:
            messages.append({'role': 'assistant', 'content': assistant_response})
        else:
            messages.pop()

if __name__ == "__main__":
    main()
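The loop in main() keeps the conversation history consistent: a successful reply is appended to the history, while a failed call removes the user message that went unanswered. The sketch below isolates that logic so you can try it without a running server. The names run_turn, fake_ok, and fake_fail are hypothetical stand-ins, and the stub functions only mimic the (text, success) pair that get_completion() returns.

```python
def run_turn(messages, user_input, complete):
    """Apply one conversation turn; `complete` returns (text, success)."""
    messages.append({"role": "user", "content": user_input})
    reply, success = complete(messages)
    if success:
        messages.append({"role": "assistant", "content": reply})
    else:
        # Discard the unanswered user message so the history stays
        # in alternating user/assistant order.
        messages.pop()
    return reply


def fake_ok(messages):
    """Stub that always succeeds by echoing the last user message."""
    return f"echo: {messages[-1]['content']}", True


def fake_fail(messages):
    """Stub that simulates a connection failure."""
    return "Connection error: stub", False


history = [{"role": "system", "content": "You are a test assistant."}]
run_turn(history, "hello", fake_ok)
run_turn(history, "are you there?", fake_fail)
print([message["role"] for message in history])  # ['system', 'user', 'assistant']
```

Because the failed turn is rolled back, the next request does not send two user messages in a row.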
Try adjusting the following parameters in the get_completion() function to see how they affect the output from the model.
Visit OpenAI’s official documentation for additional available parameters. Note that some parameters may not be supported by all available models on Anaconda Desktop.
Adjusting the temperature of your model can increase or decrease the randomness of the responses you receive from your prompts. Higher values (for example, 2.0) make the output more creative and varied. Lower values (for example, 0.2) make the output more deterministic and focused. Valid range is 0.0 to 2.0. Defaults to 0.8.
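The effect of temperature can be illustrated with a small calculation. The sketch below assumes the common formulation in which the model’s raw scores (logits) are divided by the temperature before being converted to probabilities. The logits are made-up numbers, and a real server may handle a temperature of 0.0 as a special case (always picking the most likely token), which this sketch does not cover.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after scaling by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [value / temperature for value in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


# Made-up scores for three candidate tokens.
logits = [2.0, 1.0, 0.1]
for temperature in (0.2, 0.8, 2.0):
    probs = softmax_with_temperature(logits, temperature)
    print(temperature, [round(p, 3) for p in probs])
```

At low temperature almost all of the probability lands on the top-scoring token. At high temperature the probabilities move closer together, so less likely tokens are sampled more often.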
The top_p parameter limits token selection to the smallest set of tokens whose cumulative probability reaches a threshold, which balances creativity with coherence. Higher values (closer to 1.0) allow the model to provide more creative responses, and lower values enhance focus. Valid range is 0.0 to 1.0. Adjust top_p to see how it affects the model’s descriptiveness for the same prompt. Defaults to 0.95.
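The cumulative-probability cutoff used by top_p (often called nucleus sampling) can be sketched as follows. This is a simplified illustration with made-up token probabilities, and top_p_filter is a hypothetical helper. A real server performs this step internally and its implementation may differ in detail.

```python
def top_p_filter(probs, top_p):
    """Keep the smallest set of tokens whose cumulative probability reaches top_p."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept = {}
    cumulative = 0.0
    for token, probability in ranked:
        kept[token] = probability
        cumulative += probability
        if cumulative >= top_p:
            break
    # Renormalize so the kept probabilities sum to 1 again.
    total = sum(kept.values())
    return {token: probability / total for token, probability in kept.items()}


# Made-up next-token probabilities.
probs = {"blue": 0.5, "grey": 0.3, "green": 0.15, "plaid": 0.05}
print(sorted(top_p_filter(probs, 0.9)))  # ['blue', 'green', 'grey']
print(sorted(top_p_filter(probs, 0.5)))  # ['blue']
```

With the higher threshold the unlikely token is dropped but three options remain. With the lower threshold only the single most likely token survives, which makes the output more focused.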
The max_completion_tokens parameter controls the maximum number of tokens the model can generate in its response. One token is roughly equivalent to 4 characters or 0.75 words in English. Increasing this value allows for longer responses, while decreasing it keeps responses more concise. Defaults to 400.
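The rule of thumb above can be turned into a quick estimate when you are choosing a token limit. The helpers below are hypothetical and only apply the 4-characters and 0.75-words approximations. Actual token counts depend on the model’s tokenizer.

```python
def estimate_tokens(text, chars_per_token=4.0):
    """Rough token estimate from character count (heuristic only)."""
    if not text:
        return 0
    return max(1, round(len(text) / chars_per_token))


def estimate_words(max_tokens, words_per_token=0.75):
    """Rough number of English words that fit in a token budget."""
    return int(max_tokens * words_per_token)


print(estimate_words(400))              # 300
print(estimate_tokens("hello world!"))  # 3
```

With the limit of 400 tokens used in this tutorial, a response tops out at roughly 300 English words.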
The frequency_penalty parameter reduces the likelihood of the model repeating the same words or phrases. Higher values (up to 2.0) make the model avoid repetition more strongly, while lower or negative values allow more repetition. Valid range is -2.0 to 2.0. Defaults to 0.0.
The presence_penalty parameter encourages the model to introduce new topics rather than continuing with existing ones. Higher values (up to 2.0) push the model to be more exploratory, while lower values allow it to stay on topic. Valid range is -2.0 to 2.0. Defaults to 0.0.
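The two penalties described above correspond to the frequency_penalty and presence_penalty parameters of the chat completions API. OpenAI’s documentation describes them as adjustments to each token’s score: the frequency penalty grows with the number of times a token has already appeared, while the presence penalty is applied once to any token that has appeared at all. The sketch below applies that formula to made-up scores. The function name and sample data are illustrative assumptions, and a local model server may implement the penalties somewhat differently.

```python
from collections import Counter


def apply_penalties(logits, generated_tokens,
                    frequency_penalty=0.0, presence_penalty=0.0):
    """Lower the scores of tokens that already appear in the generated text."""
    counts = Counter(generated_tokens)
    adjusted = {}
    for token, logit in logits.items():
        count = counts.get(token, 0)
        seen = 1.0 if count > 0 else 0.0
        adjusted[token] = logit - count * frequency_penalty - seen * presence_penalty
    return adjusted


# Made-up scores: all three tokens start out equally likely.
logits = {"cat": 2.0, "dog": 2.0, "bird": 2.0}
generated = ["cat", "cat", "dog"]
print(apply_penalties(logits, generated,
                      frequency_penalty=0.5, presence_penalty=0.25))
# {'cat': 0.75, 'dog': 1.25, 'bird': 2.0}
```

The token used twice is penalized the most, the token used once is penalized less, and the unused token keeps its original score.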
You can continue to develop and extend this chatbot by adding features like conversation history persistence, multi-turn context management, or integration with other tools. When you’re done, you can delete this file and clean up your conda environment by running the following command: