Nvidia
OpenAI
25-04-15
GPT-4.1
The GPT-4.1 family (GPT-4.1, GPT-4.1 mini, and GPT-4.1 nano) outperforms GPT-4o and GPT-4o mini. Context window of up to 1 million tokens.
Code: SWE-bench Verified: 54.6%
Instruction following: MultiChallenge: 38.3%
Long Context: Video-MME: 72%
GPT-4.1 Prompting Guide: https://cookbook.openai.com/examples/gpt4-1_prompting_guide
25-04-16
OpenAI o3 and o4-mini
OpenAI o3 is OpenAI's most powerful reasoning model, pushing the frontier across coding, math, science, visual perception, and more.
Code: SWE-bench Verified: 69.1%
Instruction following: MultiChallenge: 56.51%
OpenAI o4-mini is a smaller model optimized for fast, cost-efficient reasoning—it achieves remarkable performance for its size and cost, particularly in math, coding, and visual tasks.
Code: SWE-bench Verified: 68.1%
Instruction following: MultiChallenge: 42.99%
https://openai.com/index/introducing-o3-and-o4-mini/
Google
25-03
ReAct agent
Create a ReAct agent using Google's Gemini 2.5 Pro or Gemini 2.0 Flash and LangGraph from scratch. https://github.com/philschmid/gemini-samples/blob/main/guides/langgraph-react-agent.ipynb?linkId=13991564
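The linked notebook builds the agent with Gemini and LangGraph. As a dependency-free illustration of the underlying ReAct loop (reason, act, observe, repeat), here is a minimal sketch. The `scripted_model` function is a hypothetical stand-in for a real LLM call, and the `Action: tool[input]` text format is an assumption made for this example, not the notebook's actual interface.

```python
import re
from typing import Callable, Dict

# Assumed text protocol for this sketch: "Action: tool_name[tool input]"
ACTION_RE = re.compile(r"Action:\s*(\w+)\[(.*?)\]")


def scripted_model(transcript: str) -> str:
    """Hypothetical stand-in for an LLM: asks for a tool, then answers."""
    if "Observation:" not in transcript:
        return "Thought: I need to compute this.\nAction: calculator[6 * 7]"
    observation = transcript.rsplit("Observation:", 1)[1].strip()
    return f"Thought: I have the result.\nFinal Answer: {observation}"


def calculator(expression: str) -> str:
    """Toy tool: multiplies two integers written as 'a * b'."""
    left, right = expression.split("*")
    return str(int(left) * int(right))


def react(
    question: str,
    model: Callable[[str], str],
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = 5,
) -> str:
    """Run the reason/act/observe loop until the model gives a final answer."""
    transcript = f"Question: {question}"
    for _ in range(max_steps):
        reply = model(transcript)
        transcript += "\n" + reply
        if "Final Answer:" in reply:
            return reply.split("Final Answer:", 1)[1].strip()
        match = ACTION_RE.search(reply)
        if match is None:
            raise ValueError("reply had neither an action nor a final answer")
        tool_name, tool_input = match.groups()
        # Feed the tool result back so the next model call can use it.
        transcript += f"\nObservation: {tools[tool_name](tool_input)}"
    raise RuntimeError("no final answer within max_steps")


answer = react("What is 6 * 7?", scripted_model, {"calculator": calculator})
print(answer)
```

In a real agent the scripted function would be replaced by a model call, and a framework such as LangGraph would manage the loop as graph nodes and edges.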
25-04
Veo 2
Google's most advanced video model, Veo 2, is now available in Gemini Advanced and Whisk. In the Gemini app, you can create high-resolution videos up to 8 seconds long from text prompts. The videos feature smooth character movement and realistic scenes, and support multiple styles. https://labs.google/fx/tools/whisk/unsupported-country
25-04-17
Gemini 2.5 Flash
Gemini 2.5 Flash is Google's first fully hybrid reasoning model, giving developers the ability to turn thinking on or off.
https://developers.googleblog.com/en/start-building-with-gemini-25-flash/
25-04-18
Gemma 3 QAT
Gemma 3 models optimized with Quantization-Aware Training (QAT) can run locally on consumer-grade GPUs; for example, Gemma 3 27B fits on an NVIDIA RTX 3090.
https://developers.googleblog.com/en/gemma-3-quantized-aware-trained-state-of-the-art-ai-to-consumer-gpus/
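To see why quantization matters for a 24 GB card, here is a rough back-of-the-envelope calculation. It counts weight storage only, assumes a nominal 27 billion parameters taken from the model name, and assumes 4-bit weights for the quantized case; the KV cache, activations, and runtime overhead are ignored, so real usage will be higher.

```python
def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


GEMMA_3_27B_PARAMS = 27e9  # nominal count from the model name (assumption)
RTX_3090_VRAM_GB = 24      # memory size of an NVIDIA RTX 3090

for label, bits in [("bf16", 16), ("int8", 8), ("int4", 4)]:
    size = weight_memory_gb(GEMMA_3_27B_PARAMS, bits)
    verdict = "fits" if size < RTX_3090_VRAM_GB else "does not fit"
    print(f"{label}: {size:.1f} GB of weights, {verdict} in {RTX_3090_VRAM_GB} GB")
```

Under these assumptions only the 4-bit weights (about 13.5 GB) leave headroom on a 24 GB GPU, while 16-bit weights (about 54 GB) do not fit at all.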
Claude
25-04-16
Claude research
Anthropic introduced Research, along with a Google Workspace integration that connects email, calendar, and documents to Claude. https://www.anthropic.com/news/research
Grok
25-04
Grok Studio
Grok has released Grok Studio, adding new features such as code execution and Google Drive support. http://grok.com
Microsoft
25-04-16
BitNet b1.58 2B4T
The first open-source, native 1-bit LLM at the 2-billion parameter scale.
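The "b1.58" in the name reflects ternary weights: each weight takes one of three values, and log2(3) is about 1.58 bits. As a minimal sketch of that idea, assuming the absmean rounding scheme described for BitNet b1.58 (scale by the mean absolute weight, round, clip to -1, 0, or 1), the function below quantizes a small weight list. It illustrates the quantizer only; the function name is hypothetical, and the released model applies this during training rather than as a post-processing step.

```python
import math
from typing import List, Tuple


def absmean_ternary(weights: List[float], eps: float = 1e-8) -> Tuple[List[int], float]:
    """Quantize weights to {-1, 0, 1} using the mean absolute value as scale."""
    scale = sum(abs(w) for w in weights) / len(weights)
    ternary = [max(-1, min(1, round(w / (scale + eps)))) for w in weights]
    return ternary, scale


weights = [0.9, -0.05, 0.4, -1.2]
ternary, scale = absmean_ternary(weights)
print(ternary, scale)

# Information content of one three-valued weight, in bits.
print(math.log2(3))
```

Multiplying the ternary values by the stored scale gives a coarse reconstruction of the original weights, which is why only the ternary codes and one scale per tensor need to be kept.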
Cohere
25-04
Embed 4
Embed 4 delivers state-of-the-art accuracy and efficiency, helping enterprises securely retrieve their multimodal data to build agentic AI applications. https://cohere.com/blog/embed-4
ByteDance
Seed-Thinking-v1.5
Seed-Thinking-v1.5 reasons through a thinking phase before responding, which improves its performance on a wide range of benchmarks. https://github.com/ByteDance-Seed/Seed-Thinking-v1.5
QWEN
25-04
Wan2.1-FLF2V-14B
Tongyi Qianwen has launched and open-sourced its first large model for first-frame and last-frame video generation, which generates a video from a given first frame and last frame. https://wan.video
Zhipu AI (智谱AI)
25-04-15
GLM
Zhipu AI open-sourced GLM models at 32B and 9B. GLM-Z1-32B-0414 is reported as the fastest commercial model at release (200 tokens/s), at 1/30 the price of DeepSeek-R1. https://chat.z.ai
Kling AI (可灵AI)
25-04-15
Kling 2.0
Kling 2.0 (Master Edition) and Kolors 2.0. https://app.klingai.com/cn/release-notes
Manycore Tech (群核科技)
25-04
SpatialLM
SpatialLM is a 3D large language model designed to process 3D point cloud data and generate structured 3D scene understanding outputs. These outputs include architectural elements like walls, doors, windows, and oriented object bounding boxes with their semantic categories. https://huggingface.co/manycore-research
Kunlun Tech (昆仑万维)
Billed as the world's first film generation model of unlimited duration, built on the Diffusion-Forcing framework. https://www.skyreels.ai/home