In this tutorial, you’ll learn to build a simple chatbot in Python that uses Anaconda AI Navigator’s built-in API server to process natural language queries. You will use conda to establish a working environment for developing the chatbot, build the chatbot’s API calls from the snippets provided, interact with the chatbot at the command line, and view the API server logs to verify that the application is functioning properly.

Prerequisites
Before you begin, ensure that you have conda installed on your machine. You can install conda using either Anaconda Distribution or Miniconda.
You must have a text-generation type model downloaded onto your local machine. Walk through the Getting started with Anaconda AI Navigator guide before you begin the tutorial.
When working on a new conda project, it is recommended that you create a new environment for development. Follow these steps to set up an environment for your chatbot:
Open Anaconda Prompt (Terminal on macOS/Linux).
This terminal can be opened from within an IDE (JupyterLab, PyCharm, VSCode, Spyder), if preferred.
Create the conda environment for your chatbot development and install the packages you’ll need by running the following command:
conda create --name chataconda python requests
Activate your newly created conda environment by running the following command:
conda activate chataconda
For more information and best practices for managing environments, see Environments.
Below, you’ll find the necessary code snippets to build your chatbot, along with an explanation of each snippet to help you understand the functionality of the code.

Using your preferred IDE, create a new file on your machine and name it chatter-max.py.
The application we are building is simple, so we are only importing the requests package, which enables Python to make HTTP requests to the API server and receive responses.

Make this the first line of code in your chatter-max.py file:
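The import described above:

```python
import requests  # enables Python to make HTTP requests to the API server
```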
In order for your application to programmatically process natural language inputs to generate responses, run server health checks, and perform other actions, it is crucial that you properly structure your application to interact with the API server and its endpoints.

The URLs for these API endpoints are constructed by combining a base_url with a specific /endpoint for each function. The base_url is constructed by combining the Server Address and Server Port specified in Anaconda AI Navigator, like this: http://&lt;SERVER_ADDRESS&gt;:&lt;SERVER_PORT&gt;.

Set the base_url to point to the default server address by adding the following line to your file:
base_url = 'http://localhost:8080'
localhost and 127.0.0.1 are equivalent; both point to your local machine.
The most common API endpoints are described in this tutorial. For a full list of API endpoints and detailed information on how to use them effectively, see the official llama.cpp HTTP server documentation.

To enable your application to communicate with the API server, you must implement functions that make API calls in a way that the server can understand.
Before sending any requests to the server, it’s wise to verify that the server is operational. This function sends a GET request to the /health endpoint and returns a JSON response that tells you the server’s status.

Add the following lines to your chatter-max.py file:
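A minimal sketch of the health-check function described above (in your file, `requests` and `base_url` already exist from the earlier steps; they are repeated here only so the snippet stands alone):

```python
import requests

base_url = 'http://localhost:8080'  # defined earlier in the tutorial

def get_server_health():
    """Send a GET request to /health and return the server's JSON status."""
    response = requests.get(f"{base_url}/health")
    return response.json()
```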
To interact with the model, you must have a function that prompts the server’s /completion endpoint. This function sends the user input to the model loaded into the API server and receives a generated response.

The prompt construction here provides context that sets the tone for how you would like the model to respond to your users. In essence, this is the initial prompt to the model. We’ll revisit this later.

The separation of User: and Assistant: inputs onto new lines, delineated by their respective labels, helps the model distinguish between parts of the dialogue. Without this distinction, the model will assume that the user wants it to complete their input, rather than respond to it.

The data dictionary is a structured collection of parameters that control how the model generates responses based on the user’s input. These parameters dictate the model’s behavior during the completion process. The dictionary is converted to JSON and sent as the body of the request.

Add the following lines to your chatter-max.py file:
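A sketch of such a completion function, assuming the llama.cpp server’s /completion request and response shape (the sampling values shown are that server’s documented defaults; your tutorial file already defines `requests` and `base_url`, repeated here so the snippet stands alone):

```python
import requests

base_url = 'http://localhost:8080'  # defined earlier in the tutorial

def post_completion(context, user_input):
    """Send the conversation context plus the new input to /completion."""
    # User: and Assistant: labels on separate lines help the model respond
    # to the input rather than complete it.
    prompt = f"{context}\nUser: {user_input}\nAssistant:"
    data = {
        "prompt": prompt,
        "temperature": 0.8,   # randomness of the output
        "top_k": 40,          # sample only from the 40 most probable tokens
        "top_p": 0.95,        # nucleus-sampling threshold
        "n_predict": 128,     # maximum number of tokens to generate
        "stop": ["\nUser:"],  # stop before the model writes the user's next turn
    }
    # The dictionary is sent as the JSON body of the request
    response = requests.post(f"{base_url}/completion", json=data)
    return response.json().get("content", "").strip()
```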
After each interaction, you’ll want to update the context of the conversation to help the model produce coherent dialogue. This function updates the context by appending the latest user input and the assistant’s response, keeping the model engaged in the conversation.

Add the following lines to your chatter-max.py file:
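A minimal sketch of the context-update function described above:

```python
def update_context(context, user_input, assistant_response):
    """Append the latest exchange so the model sees the whole conversation."""
    return f"{context}\nUser: {user_input}\nAssistant: {assistant_response}"
```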
The main function initiates the chatbot, handles user inputs, and manages the flow of the conversation. This is where you set the initial value for context.
Play around with the context to see how it impacts the responses you receive from your model!
Add the following lines to your chatter-max.py file:
def main():
    context = "You are a friendly AI assistant designed to provide helpful, succinct, and accurate information."

    health = get_server_health()
    print('Server Health:', health)

    if health.get('status') == 'ok':
        while True:
            user_input = input("Enter a prompt or type 'exit' to quit: ")
            if user_input.lower() == 'exit':
                break
            assistant_response = post_completion(context, user_input)
            print('Assistant:', assistant_response)
            context = update_context(context, user_input, assistant_response)
    else:
        print("Server is not ready for requests.")

if __name__ == "__main__":
    main()
Leave the Server Address and Server Port at the default values and click Start.
Open a terminal and navigate to the directory where you stored your chatter-max.py file.
Initiate the chatbot by running the following command:
python chatter-max.py
View the Anaconda AI Navigator API server logs. If everything is set up correctly, the server logs will populate with traffic from your chatbot application, starting with a health check and the initial context prompt for the model.
Adjusting the temperature of your model increases or decreases the randomness of the responses you receive from your prompts. Higher values (for example, 1.0) make the output more free-flowing and creative; lower values (for example, 0.2) make it more deterministic and focused. Defaults to 0.8.
Limiting the top_k parameter confines the model’s response to the k most probable tokens. Lowering the available tokens is like limiting the words the model has to choose from when attempting to guess which word comes next. top_k defaults to 40. Try setting the top_k to higher and lower values to see how the model responds to the same prompt.
The top_p parameter limits token selection to a subset of tokens with a cumulative probability above a threshold, balancing creativity with coherence. Higher values allow the model to provide more creative responses, and lower values enhance focus. Adjust top_p to see how it affects the model’s descriptiveness for the same prompt. Defaults to 0.95.
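For example, a more deterministic configuration of the data dictionary might look like this (illustrative values; the prompt shown is a placeholder):

```python
# Tighter sampling settings for more focused, repeatable output
data = {
    "prompt": "You are a friendly AI assistant.\nUser: Hello!\nAssistant:",
    "temperature": 0.2,  # lower than the 0.8 default: less randomness
    "top_k": 10,         # tighter than the 40 default: fewer candidate tokens
    "top_p": 0.5,        # tighter than the 0.95 default: smaller nucleus
}
```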
You can continue to develop and extend this chatbot by including other endpoints for more advanced usage, such as tokenization or slot management, or you can delete this file and clean up your conda environment by running the following command:
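The cleanup step would look something like this, assuming the environment name used earlier in the tutorial:

```shell
conda deactivate
conda remove --name chataconda --all
```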