
Integrate as a Model Provider

This guide focuses on how to set up the classes and configuration necessary to act as a chat provider.

Please read this guide first, then look at the existing code in the codebase to understand how to act as a different kind of provider, e.g. one handling embeddings or image generation.


Overview

The way LiteLLM works from a provider's perspective is simple.

LiteLLM acts as a wrapper: it takes OpenAI-format requests and routes them to your API, then adapts your API's output into a standard response.

To integrate as a provider, you write a module that slots your API in and acts as an adapter between the LiteLLM API and your API.

The module you will be writing acts as both a config and a means to adapt requests and responses.

Your objective is to write this module so that it adapts inputs for your API and adapts outputs for the calling LiteLLM code.

It includes methods that:

  • Validate the request
  • Transform (adapt) requests into requests sent to your API
  • Transform (adapt) responses from your API into responses given back to the calling LiteLLM code
  • + a few others (a rough skeleton follows this list)
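
To give a sense of the shape before diving into the steps, here is a rough, non-authoritative skeleton of such a config class (method bodies and signatures are elided; the real signatures come from BaseConfig and are covered in step 4):

from litellm.llms.base_llm.chat.transformation import BaseConfig


class MyProviderChatConfig(BaseConfig):
    # Signatures are abbreviated with *args/**kwargs here; match BaseConfig's
    # actual method signatures when implementing for real.
    def validate_environment(self, *args, **kwargs):
        ...  # check the API key and build the request headers

    def get_complete_url(self, *args, **kwargs):
        ...  # build the final URL the request is sent to

    def transform_request(self, *args, **kwargs):
        ...  # adapt the OpenAI-style input into your API's format

    def transform_response(self, *args, **kwargs):
        ...  # adapt your API's raw output into a LiteLLM ModelResponse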

1. Create Your Config Class

Create a new directory with your provider name

litellm/llms/your_provider_name_here

Inside it, add a file for your chat configuration

litellm/llms/your_provider_name_here/chat/transformation.py

The transformation.py file will contain a configuration class that dictates how your API slots into the LiteLLM API.

Define your config class extending BaseConfig:

from litellm.llms.base_llm.chat.transformation import BaseConfig

class MyProviderChatConfig(BaseConfig):
    def __init__(self):
        ...

We will fill in the abstract methods at a later point.


2. Add Yourself To Various Places In The Code Base

LiteLLM is working to streamline this process, but currently you need to do the following:

litellm/__init__.py

Near the top of the file, add your provider's key to the list of optional keys

azure_key: Optional[str] = None
anthropic_key: Optional[str] = None
replicate_key: Optional[str] = None
bytez_key: Optional[str] = None
cohere_key: Optional[str] = None
infinity_key: Optional[str] = None
clarifai_key: Optional[str] = None
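
Once that attribute exists, callers can set the key programmatically instead of exporting an environment variable. A minimal sketch, using the bytez_key example from the snippet above:

import litellm

# Either of these is picked up by the key-resolution chain added to main.py below.
litellm.bytez_key = "YOUR_API_KEY"
# or: os.environ["BYTEZ_API_KEY"] = "YOUR_API_KEY"  (requires `import os`)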

Import your config

from .llms.bytez.chat.transformation import BytezChatConfig
from .llms.custom_llm import CustomLLM
from .llms.bedrock.chat.converse_transformation import AmazonConverseConfig
from .llms.openai_like.chat.handler import OpenAILikeChatConfig

litellm/main.py

Add yourself to main.py so requests can be routed to your config class

from .llms.bedrock.chat import BedrockConverseLLM, BedrockLLM
from .llms.bedrock.embed.embedding import BedrockEmbedding
from .llms.bedrock.image.image_handler import BedrockImageGeneration
from .llms.bytez.chat.transformation import BytezChatConfig
from .llms.codestral.completion.handler import CodestralTextCompletion
from .llms.cohere.embed import handler as cohere_embed
from .llms.custom_httpx.aiohttp_handler import BaseLLMAIOHTTPHandler

base_llm_http_handler = BaseLLMHTTPHandler()
base_llm_aiohttp_handler = BaseLLMAIOHTTPHandler()
sagemaker_chat_completion = SagemakerChatHandler()
bytez_transformation = BytezChatConfig()

Then much lower in the code

elif custom_llm_provider == "bytez":
    api_key = (
        api_key
        or litellm.bytez_key
        or get_secret_str("BYTEZ_API_KEY")
        or litellm.api_key
    )

    response = base_llm_http_handler.completion(
        model=model,
        messages=messages,
        headers=headers,
        model_response=model_response,
        api_key=api_key,
        api_base=api_base,
        acompletion=acompletion,
        logging_obj=logging,
        optional_params=optional_params,
        litellm_params=litellm_params,
        timeout=timeout,  # type: ignore
        client=client,
        custom_llm_provider=custom_llm_provider,
        encoding=encoding,
        stream=stream,
    )

    pass

NOTE: you can rely on LiteLLM passing each of these args/kwargs through to your config via the .completion() call

litellm/constants.py

Add yourself to the list of LITELLM_CHAT_PROVIDERS

LITELLM_CHAT_PROVIDERS = [
    "openai",
    "openai_like",
    "bytez",
    "xai",
    "custom_openai",
    "text-completion-openai",

Add yourself to the if statement chain of providers here

litellm/litellm_core_utils/get_llm_provider_logic.py

elif model == "*":
    custom_llm_provider = "openai"
# bytez models
elif model.startswith("bytez/"):
    custom_llm_provider = "bytez"

if not custom_llm_provider:
    if litellm.suppress_debug_info is False:
        print()  # noqa

litellm/litellm_core_utils/streaming_handler.py

If you are doing something custom with streaming, this needs to be updated, e.g.

    def handle_bytez_chunk(self, chunk):
        try:
            is_finished = False
            finish_reason = ""

            return {
                "text": chunk,
                "is_finished": is_finished,
                "finish_reason": finish_reason,
            }
        except Exception as e:
            raise e

Then lower in the file

elif self.custom_llm_provider and self.custom_llm_provider == "bytez":
    response_obj = self.handle_bytez_chunk(chunk)
    completion_obj["content"] = response_obj["text"]
    if response_obj["is_finished"]:
        self.received_finish_reason = response_obj["finish_reason"]
    pass

3. Write a test file to iterate on your code

Add a test file somewhere in the project, e.g. tests/test_litellm/llms/my_provider/chat/test.py

Add the following to it:

import os
from litellm import completion

os.environ["MY_PROVIDER_KEY"] = "KEY_GOES_HERE"

completion(model="my_provider/your-model", messages=[...], api_key="...")
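
For reference, a filled-in version might look like the sketch below; the model name and message are placeholders, not real provider values:

import os

from litellm import completion

# Placeholder key and model — substitute your provider's real values.
os.environ["MY_PROVIDER_KEY"] = "KEY_GOES_HERE"

response = completion(
    model="my_provider/your-model",
    messages=[{"role": "user", "content": "Hello, who are you?"}],
)

print(response.choices[0].message.content)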

If you want to run it with the VS Code debugger, you can do so with this config file (recommended):

.vscode/launch.json

{
    // Use IntelliSense to learn about possible attributes.
    // Hover to view descriptions of existing attributes.
    // For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Python Debugger: Current File",
            "type": "debugpy",
            "request": "launch",
            "program": "${file}",
            "console": "integratedTerminal",
            "env": {
                "PYTHONPATH": "${workspaceFolder}",
                "MY_PROVIDER_API_KEY": "YOUR_API_KEY"
            }
        }
    ]
}

If you run with the debugger, then after you update "MY_PROVIDER_API_KEY": "YOUR_API_KEY" in launch.json you can remove this line from the test script:

os.environ["MY_PROVIDER_KEY"] = "KEY_GOES_HERE"


4. Implement Required Methods

It's wise to follow completion() in litellm/llms/custom_httpx/llm_http_handler.py

You will see it calls each of the methods defined in the base class.

The debugger is your friend.

validate_environment

Set up headers and validate the key/model:

def validate_environment(...):
    headers.update({
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json"
    })
    return headers
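
The snippet above only sets headers. If you also want the "validate" part, here is a minimal sketch (the parameter list is abbreviated and the ValueError is my choice for illustration, not a LiteLLM convention):

def validate_environment(self, headers, api_key=None, **kwargs):
    # Fail fast if no key was resolved from litellm.<provider>_key / env / kwargs.
    if api_key is None:
        raise ValueError(
            "Missing API key: set MY_PROVIDER_API_KEY or pass api_key=..."
        )
    headers.update({
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    })
    return headers
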
get_complete_url

Return the final request URL:

def get_complete_url(...):
    return f"{api_base}/{model}"
transform_request

Adapt OpenAI-style input into provider-specific format:

def transform_request(...):
    data = {"messages": messages, "params": optional_params}
    return data
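
If your API does not accept OpenAI-shaped fields directly, transform_request is also where you rename and regroup them. A hypothetical mapping is sketched below; the "prompt_messages" and "sampling" keys are invented for illustration, and the parameter list should be matched to BaseConfig's actual signature:

def transform_request(self, model, messages, optional_params, litellm_params, headers):
    # Map OpenAI-style inputs onto a made-up provider schema.
    return {
        "prompt_messages": messages,
        "sampling": {
            "temperature": optional_params.get("temperature"),
            "max_tokens": optional_params.get("max_tokens"),
        },
    }
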
transform_response

Process and map the raw provider response:

def transform_response(...):
    json = raw_response.json()
    model_response.model = model
    model_response.choices[0].message.content = json.get("output")
    return model_response
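
If your API also reports token counts, it is worth mapping those onto the response object too. A sketch, assuming "output", "input_tokens" and "output_tokens" are fields your provider returns (they are not a real schema) and keeping the parameter list abbreviated:

from litellm.types.utils import Usage

def transform_response(self, model, raw_response, model_response, **kwargs):
    body = raw_response.json()

    model_response.model = model
    model_response.choices[0].message.content = body.get("output")
    model_response.choices[0].finish_reason = "stop"

    tokens = body.get("usage", {})
    setattr(
        model_response,
        "usage",
        Usage(
            prompt_tokens=tokens.get("input_tokens", 0),
            completion_tokens=tokens.get("output_tokens", 0),
            total_tokens=tokens.get("input_tokens", 0) + tokens.get("output_tokens", 0),
        ),
    )
    return model_response
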
get_sync_custom_stream_wrapper / get_async_custom_stream_wrapper

If you need to do something custom with streaming, these are here for you. See the litellm/llms/sagemaker/chat/transformation.py or the litellm/llms/bytez/chat/transformation.py implementations to better understand how to use them.

Use CustomStreamWrapper + httpx streaming client to yield content.
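
As a rough, non-authoritative sketch of that pattern (the parameter list is abbreviated, the provider is assumed to stream line-delimited chunks, and the arguments CustomStreamWrapper expects should be double-checked against the sagemaker/bytez implementations above):

import httpx

from litellm.litellm_core_utils.streaming_handler import CustomStreamWrapper

def get_sync_custom_stream_wrapper(self, model, custom_llm_provider, logging_obj,
                                   api_base, headers, data, **kwargs):
    def chunk_iterator():
        # Stream raw chunks from the provider; each chunk is later routed to the
        # handle_<provider>_chunk logic added in streaming_handler.py.
        with httpx.stream("POST", api_base, headers=headers, json=data) as response:
            for line in response.iter_lines():
                if line:
                    yield line

    return CustomStreamWrapper(
        completion_stream=chunk_iterator(),
        model=model,
        custom_llm_provider=custom_llm_provider,
        logging_obj=logging_obj,
    )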


🧪 Tests

Create tests in tests/test_litellm/llms/my_provider/chat/test.py. Iterate until you are satisfied with the quality!


Spare thoughts

If you get stuck, look at the other provider implementations; ctrl + shift + f and ctrl + p are your friends!

You can also visit the Discord feedback channel.