litellm. Since litellm does not currently support entry points for provider registration, you must import the provider module before use; the import registers the provider with litellm as a side effect.
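A minimal sketch of that registration step, assuming a hypothetical provider module name (`my_provider` below; replace it with the real one). The helper reports whether the import, and therefore the registration side effect, succeeded:

```python
import importlib

# Hypothetical module name; replace with the actual provider module.
PROVIDER_MODULE = "my_provider"

def ensure_provider_registered(module_name: str = PROVIDER_MODULE) -> bool:
    """Import the provider module so its litellm registration runs.

    Returns True if the import succeeded, False if the module is missing.
    """
    try:
        importlib.import_module(module_name)
        return True
    except ImportError:
        return False
```

Call `ensure_provider_registered()` once at startup, before the first litellm call.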
litellm.completion() for both standard and streamed completions (stream=True). Most OpenAI-compatible inference parameters are supported and behave as expected; the one exception is the n parameter (multiple completions per request), which is not currently supported.
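A sketch of a streamed call, assuming a hypothetical model identifier (`my-provider/my-model`). The `completion` parameter is an assumption of this sketch, added so the helper can be exercised without a live backend; it defaults to `litellm.completion`:

```python
def stream_chat(prompt, model="my-provider/my-model", completion=None):
    """Collect a streamed completion into a single string."""
    if completion is None:
        import litellm  # deferred so the helper imports without litellm installed
        completion = litellm.completion
    parts = []
    for chunk in completion(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    ):
        # Streamed chunks follow the OpenAI delta shape; content may be None.
        delta = chunk.choices[0].delta.content or ""
        parts.append(delta)
    return "".join(parts)
```

A non-streamed call is the same `litellm.completion(...)` invocation without `stream=True`, returning a single response object instead of an iterator.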
You can also configure server behavior using the optional_params argument. Pass server configuration options under the "server" key:
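For instance, with hypothetical option names (host and port below are illustrative; the actual keys depend on the provider's server configuration):

```python
# Server options go under the "server" key of optional_params.
server_options = {
    "server": {
        "host": "127.0.0.1",  # hypothetical option name
        "port": 8080,         # hypothetical option name
    }
}

# Passed alongside the usual arguments (requires litellm and the provider):
# litellm.completion(model=..., messages=..., optional_params=server_options)
```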
litellm package, see the official documentation.