At OpenAI DevDay, Microsoft unveiled significant expansions to Azure AI Foundry, introducing a suite of new AI models and capabilities designed to empower developers to build multimodal, agentic AI solutions at scale.
The update includes the launch of GPT-image-1-mini, GPT-realtime-mini, and GPT-audio-mini, alongside safety upgrades to GPT-5, providing businesses and developers with faster, more efficient, and more affordable tools for text, image, audio, and video generation.
The newly announced models are rolling out in Azure AI Foundry, with most customers able to access them starting October 7, 2025. These launches complement Microsoft’s recent innovations, including the Microsoft Agent Framework (now in preview), multi-agent workflows in the Foundry Agent Service (private preview), unified observability, general availability of the Voice Live API, and new Responsible AI capabilities.
The Microsoft Agent Framework, available on GitHub, is an open-source SDK and runtime for orchestrating multi-agent systems, combining Semantic Kernel’s foundations with AutoGen’s multi-agent capabilities. The framework is intended to help developers build intelligent, scalable agentic solutions quickly.
The new GPT-image-1-mini model is optimized for high-quality text-to-image and image-to-image generation while consuming fewer computational resources, making it ideal for organizations looking to deploy multimodal AI at scale without compromising performance.
Similarly, GPT-realtime-mini and GPT-audio-mini are engineered for fast, cost-effective audio generation and real-time voice interaction. Their lightweight architecture is designed for low latency and rapid inference, supporting applications such as voice-based chatbots, real-time translation, and dynamic audio content creation while reducing operational costs.
Microsoft emphasized that these models are part of a broader vision to give developers unparalleled flexibility and choice, enabling the creation of agentic AI systems capable of addressing complex business needs.
Looking ahead, Sora 2, which is soon to be integrated into Azure AI Foundry, will bring advanced video and audio generation together in a single API, enabling features such as physics-driven animation, synchronized dialogue, and cameo effects, all accessible through a unified platform.
With these innovations, Microsoft positions Azure AI Foundry as a comprehensive, multimodal AI development hub, giving enterprises and developers the tools to accelerate experimentation, scale AI solutions, and deliver immersive generative experiences across text, image, audio, and video.
This expansion underscores Microsoft’s commitment to building responsible, enterprise-ready AI that drives innovation while maintaining performance and efficiency.