Before you begin, make sure you have the following:

- `conda` installed using either Anaconda Distribution or Miniconda
- A text-generation type model downloaded onto your local machine

Walk through the Getting started with Anaconda AI Navigator guide before you begin the tutorial. Then create a file named `chatter-max.py`.
This tutorial uses the `requests` package, which enables Python to make HTTP requests to the API server and receive responses. Make this the first line of code in your `chatter-max.py` file:
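That first line is the import itself:

```python
# requests lets the script talk to the API server over HTTP
import requests
```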
Next, establish the `base_url`. Each function you write combines the `base_url` with a specific `/endpoint`. The `base_url` can be constructed by combining the Server Address and Server Port specified in Anaconda AI Navigator, like this: `http://<SERVER_ADDRESS>:<SERVER_PORT>`.
Set the `base_url` to point to the default server address by adding the following line to your file:
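A minimal version of that line, assuming the default port shown in Anaconda AI Navigator is `8080` (substitute the Server Port your own UI displays):

```python
# Default server address; replace the port if your Server Port differs
base_url = "http://localhost:8080"
```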
`localhost` and `127.0.0.1` are semantically identical.

Next, define a function that checks whether the server is running. It calls the `/health` endpoint and returns a JSON response that tells you the server's status.
Add the following lines to your `chatter-max.py` file:
Next, create a function that sends a POST request to the `/completion` endpoint. This function sends the user input to the model loaded into the API server and receives a generated response.
The prompt construction here provides context that sets the tone for how you would like the model to respond to your users. In essence, this is the initial prompt to the model. We'll revisit this later.
The separation of `User:` and `Assistant:` inputs onto new lines, delineated by their respective labels, helps the model distinguish between parts of the dialogue. Without this distinction, the model will assume that the user wants it to complete their input rather than respond to it.
The `data` dictionary is a structured collection of parameters that control how the model generates responses based on the user's input. These parameters dictate the model's behavior during the completion process. The dictionary is converted to JSON and sent as the body of the request.
Add the following lines to your `chatter-max.py` file:
Next, define a function that updates the `context` by appending the latest user input and the assistant's response, keeping the model engaged in the conversation.
Add the following lines to your `chatter-max.py` file:
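A minimal version of that helper (the name `update_context` is illustrative):

```python
def update_context(context, user_input, assistant_response):
    """Append the latest exchange so the model sees the whole conversation next turn."""
    return f"{context}\nUser: {user_input}\nAssistant: {assistant_response}"
```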
The `main` function initiates the chatbot, handles user inputs, and manages the flow of the conversation. This is where you set the initial value for `context`.
Try experimenting with the initial `context` to see how it impacts the responses you receive from your model! Add the following lines to your `chatter-max.py` file:
Save and run your `chatter-max.py` file.
Try updating the initial `context` prompt for the model.
Adjust the parameters in the `/completion` endpoint's `data` dictionary to see how they affect the output from the model.
`temperature`: Higher values (for example, `1.0`) make the output more free-flowing and creative. Lower values (for example, `0.2`) make the output more deterministic and focused. Defaults to `0.8`.
`top_k`: The `top_k` parameter confines the model's response to the `k` most probable tokens. Lowering the available tokens is like limiting the words the model has to choose from when guessing which word comes next. `top_k` defaults to `40`. Try setting `top_k` to higher and lower values to see how the model responds to the same prompt.
`top_p`: The `top_p` parameter restricts sampling to the smallest set of tokens whose cumulative probability exceeds `p`. Try adjusting `top_p` to see how it affects the model's descriptiveness for the same prompt. Defaults to `0.95`.
`stream`: Set `stream` to `true` to see the model's responses as they come in, token by token. Streaming is set to `false` by default.