Universal Endpoint
https://gateway.ai.cloudflare.com/v1/ACCOUNT_TAG/GATEWAY
AI Gateway offers multiple endpoints for each Gateway you create: one endpoint per provider, and one Universal Endpoint. The Universal Endpoint requires some changes to your request schema, but supports additional features, such as retrying a request if it fails the first time, or configuring a fallback model or provider when a request fails.
You can use the Universal Endpoint to contact every provider. The payload is expected to be an array of messages, and each message is an object with the following parameters:
provider: the name of the provider you would like to direct this message to. Can be openai, huggingface, or replicate.
endpoint: the pathname of the provider API you’re trying to reach. For example, on OpenAI it can be chat/completions, and for HuggingFace this might be bigcode/starcoder. See more in the sections that are specific to each provider.
authorization: the content of the Authorization HTTP header that should be used when contacting this provider. This usually starts with “Token” or “Bearer”.
query: the payload as the provider expects it in their official API.
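As a minimal sketch of how the four parameters above fit together, the following shell snippet assembles one message object and wraps it in the array the Universal Endpoint expects. The `build_message` helper and the `example-token` value are hypothetical, introduced only for illustration. The helper also assumes that the provider, endpoint, and authorization values contain no characters that need JSON escaping.

```shell
# Hypothetical helper (not part of AI Gateway): prints one message object.
# Arguments: provider, endpoint, authorization, query (already valid JSON).
# Assumes the first three values need no JSON escaping.
build_message() {
  printf '{"provider":"%s","endpoint":"%s","authorization":"%s","query":%s}' \
    "$1" "$2" "$3" "$4"
}

# Placeholder token for illustration; replace with a real provider key.
TOKEN="example-token"

message=$(build_message "replicate" "predictions" "Token $TOKEN" \
  '{"input":{"prompt":"What is Cloudflare?"}}')

# The payload is an array of messages, so wrap the object in brackets.
payload="[$message]"
printf '%s\n' "$payload"
```

A payload built this way could then be passed to curl as `--data "$payload"`. The double quotes matter: inside single quotes the shell does not expand variables such as `$TOKEN`.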
Request
curl https://gateway.ai.cloudflare.com/v1/ACCOUNT_TAG/GATEWAY -X POST \
  --header 'Content-Type: application/json' \
  --data '[
  {
    "provider": "huggingface",
    "endpoint": "bigcode/starcoder",
    "headers": {
      "Authorization": "Bearer $TOKEN",
      "Content-Type": "application/json"
    },
    "query": {
      "input": "console.log"
    }
  },
  {
    "provider": "openai",
    "endpoint": "chat/completions",
    "headers": {
      "Authorization": "Bearer $TOKEN",
      "Content-Type": "application/json"
    },
    "query": {
      "model": "gpt-3.5-turbo",
      "stream": true,
      "messages": [
        {
          "role": "user",
          "content": "What is Cloudflare?"
        }
      ]
    }
  },
  {
    "provider": "replicate",
    "endpoint": "predictions",
    "authorization": "Token $TOKEN",
    "query": {
      "version": "2796ee9483c3fd7aa2e171d38f4ca12251a30609463dcfd4cd76703f22e96cdf",
      "input": {
        "prompt": "What is Cloudflare?"
      }
    }
  }
]'
The request above is sent to the HuggingFace Inference API first. If that request fails, the gateway proceeds to OpenAI, and then to Replicate.
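The fallback order described here can be modelled locally as "try each provider in array order and stop at the first success". The sketch below only illustrates that ordering and is not the gateway's implementation. It makes no network calls: `try_provider` is a hypothetical stand-in that succeeds only for the provider named in the `WORKING` variable.

```shell
# Hypothetical stand-in for a provider call: succeeds only when the
# provider name matches $WORKING. No request is actually sent.
try_provider() {
  [ "$1" = "$WORKING" ]
}

# Walk the providers in the order given and print the first one that succeeds.
# Returns non-zero if every provider fails.
first_success() {
  for provider in "$@"; do
    if try_provider "$provider"; then
      printf '%s\n' "$provider"
      return 0
    fi
  done
  return 1
}

# Simulate HuggingFace failing and OpenAI succeeding.
WORKING="openai"
answered_by=$(first_success huggingface openai replicate)
printf 'answered by: %s\n' "$answered_by"
```

In this simulation Replicate is never tried, because the loop stops at the first provider that succeeds.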