The LangChain integration provides Chat and Embedding classes that automatically manage downloading models and starting servers. Install langchain-openai in your environment:
pip install langchain-openai
Only install pip packages into your conda environment after all conda packages and their dependencies have been installed. For more information on installing pip packages in your conda environment, see Installing pip packages.
The AnacondaQuantizedLLM class is not currently supported due to llama.cpp limitations with the v1/completions endpoint. Use AnacondaQuantizedModelChat for chat completions and AnacondaQuantizedModelEmbeddings for embeddings. See the llama.cpp discussion for details.
Here is a minimal setup example for using LangChain with Anaconda’s models:
from langchain.prompts import ChatPromptTemplate
from anaconda_ai.integrations.langchain import AnacondaQuantizedModelChat, AnacondaQuantizedModelEmbeddings

prompt = ChatPromptTemplate.from_template("tell me a joke about {topic}")
model = AnacondaQuantizedModelChat(model_name='meta-llama/llama-2-7b-chat-hf_Q4_K_M.gguf')

chain = prompt | model

message = chain.invoke({'topic': 'python'})
You can pass server configuration options using the extra_options parameter:
model = AnacondaQuantizedModelChat(
    model_name='meta-llama/llama-2-7b-chat-hf_Q4_K_M.gguf',
    extra_options={
        'ctx_size': 4096,
        'n_gpu_layers': 20,
        'temperature': 0.7
    }
)
These options are passed directly to the inference server during startup. For more information on using the langchain-openai package, see the official documentation.