AI providers & models supported by GPT for Work
GPT for Work supports models from Anthropic, Azure, DeepSeek, Google, Mistral, OpenAI, OpenRouter, Perplexity, and xAI. GPT for Work also supports open-source models through Ollama and any OpenAI-compatible API endpoint. The tables below show which models you can use with and without an API key, in which GPT for Work add-ons, and at what price.
Space admins can control model availability (Microsoft accounts only). Learn more.
Models you can use without an API key
You can use models from OpenAI, Google, Anthropic, and Perplexity.
Reasoning models, vision models, and web search models typically cost much more than regular, text-only models. Learn more.
Models that support prompt caching get a 75% discount on cached input tokens.
Models you can use with an API key
You can use models from OpenAI, Perplexity, Google, Anthropic, OpenRouter, DeepSeek, Mistral, Azure, xAI, and open-source models through Ollama and any OpenAI-compatible API endpoint.
Reasoning models, vision models, and web search models typically cost much more than regular, text-only models. Learn more.
You pay the API cost directly to the AI provider.
Models available through dedicated API endpoints (Azure, Ollama, other local servers and cloud-based platforms) do not consume credits, but require a positive balance or a valid subscription.
Notes
Reasoning models
Reasoning models are trained to think before they answer, producing an internal chain of thought before responding to a prompt. Reasoning models generate two types of tokens:
- 
Completion tokens make up the model's response. 
- 
Reasoning tokens make up the model's internal chain of thought. 
You are billed for both types of tokens.
Vision models
Vision models can process images as input. The following features support vision models:
- 
Custom prompt bulk AI tool 
- 
Prompt images (Vision) bulk AI tool 
- 
GPT_VISION function 
Image inputs are measured and charged in tokens, just like text inputs. How images are converted to text tokens depends on the model. You can find more information about the conversion in the OpenAI documentation and Anthropic documentation.
Web search models
Web search models can gather the latest information from the web and use it as context when generating responses. The larger the context, the more information a model can retrieve from each web source, producing richer and more detailed responses. For some models, you can select the context size in the model settings.
How web search model pricing works:
- 
Cost without an API key: Token cost + search cost 
- 
Cost with an API key: Token cost 
For more information about token cost and search cost, see Models you can use without an API key and Models you can use with an API key. The search cost varies by AI provider and context size.
API endpoints
You can use any OpenAI-compatible API endpoint with GPT for Work. You can connect to two main types of services:
- 
Cloud-based LLM platforms provide access to models over the internet with no software installation or setup required on your part. Popular examples include Anyscale, Fireworks AI, and Together AI. The available models vary from platform to platform. 
- 
Local LLM servers run on a local machine, such as your own computer or another computer on a local network. Popular examples include LM Studio, LocalAI, and Open WebUI. The available models depend on what's installed on the server you're using. 
What's next
The tables on this page were created with Awesome Table Apps.