What's New in AI - Wanda's Personal Web Page

Nvidia

OpenAI

25-04-15

GPT-4.1

The GPT-4.1 family (GPT-4.1, GPT-4.1 mini, and GPT-4.1 nano) outperforms GPT-4o and GPT-4o mini across the board. Context window: up to 1 million tokens.
Code: SWE-bench Verified: 54.6%
Instruction following: MultiChallenge: 38.3%
Long Context: Video-MME: 72%
GPT-4.1 Prompting Guide: https://cookbook.openai.com/examples/gpt4-1_prompting_guide

25-04-16

OpenAI o3 and o4-mini

OpenAI o3 is OpenAI's most powerful reasoning model, pushing the frontier across coding, math, science, visual perception, and more.
Code: SWE-bench Verified: 69.1%
Instruction following: MultiChallenge: 56.51%
OpenAI o4-mini is a smaller model optimized for fast, cost-efficient reasoning—it achieves remarkable performance for its size and cost, particularly in math, coding, and visual tasks.
Code: SWE-bench Verified: 68.1%
Instruction following: MultiChallenge: 42.99%
https://openai.com/index/introducing-o3-and-o4-mini/

Google

25-03

ReAct agent

Create a ReAct agent using Google's Gemini 2.5 Pro or Gemini 2.0 Flash and LangGraph from scratch. https://github.com/philschmid/gemini-samples/blob/main/guides/langgraph-react-agent.ipynb?linkId=13991564
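
For readers new to the pattern, here is a minimal sketch of the ReAct loop (thought, action, observation, repeat). It is not the code from the linked notebook: the notebook uses Gemini and LangGraph, while this sketch swaps in a scripted stand-in for the model so it runs offline. `scripted_model`, `get_weather`, and `run_react` are hypothetical names chosen for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical tool: a toy lookup standing in for a real API call.
def get_weather(city: str) -> str:
    return {"Berlin": "12 C, cloudy"}.get(city, "unknown")

TOOLS: Dict[str, Callable[[str], str]] = {"get_weather": get_weather}

def scripted_model(history: List[str]) -> Tuple[str, str, str]:
    """Stand-in for an LLM call (assumption: a real agent would query Gemini here).

    Returns (thought, action, argument); action is a tool name or "finish".
    """
    prefix = "Observation: "
    observations = [h for h in history if h.startswith(prefix)]
    if not observations:
        # No tool result yet, so ask for one.
        return ("I need the current weather.", "get_weather", "Berlin")
    latest = observations[-1][len(prefix):]
    return ("I have what I need.", "finish", f"Weather in Berlin: {latest}")

def run_react(question: str, max_steps: int = 5) -> str:
    """Alternate reasoning and tool calls until the model decides to finish."""
    history = [f"Question: {question}"]
    for _ in range(max_steps):
        thought, action, arg = scripted_model(history)
        history.append(f"Thought: {thought}")
        if action == "finish":
            return arg
        observation = TOOLS[action](arg)  # Act, then record what came back.
        history.append(f"Action: {action}({arg})")
        history.append(f"Observation: {observation}")
    return "Stopped: step limit reached."

answer = run_react("What is the weather in Berlin?")
print(answer)
```

In a real agent, the scripted function would be replaced by a model call that sees the full history, and the step limit guards against loops that never reach a final answer.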

25-04

Veo 2

Google's most advanced video model, Veo 2, has officially landed on Gemini Advanced and Whisk. In the Gemini app, you can create high-resolution videos up to 8 seconds long from text prompts. The videos feature smooth character movement and realistic scenes, and support multiple styles. https://labs.google/fx/tools/whisk/unsupported-country

25-04-17

Gemini 2.5 Flash

Gemini 2.5 Flash is Google's first fully hybrid reasoning model, giving developers the ability to turn thinking on or off.
https://developers.googleblog.com/en/start-building-with-gemini-25-flash/
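
As a rough illustration of what "turning thinking on or off" could look like in a request, the sketch below only builds a JSON body and sends nothing. The field names `generationConfig`, `thinkingConfig`, and `thinkingBudget`, and the convention that a budget of 0 disables thinking, are assumptions based on the Gemini API documentation as I understand it; check the linked post before relying on them. `build_request` is a hypothetical helper.

```python
import json

def build_request(prompt: str, thinking_budget: int) -> dict:
    """Build a request body for a generateContent-style call.

    Assumption: thinking_budget=0 turns thinking off, and a positive value
    caps the number of tokens the model may spend on thinking.
    """
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {
            "thinkingConfig": {"thinkingBudget": thinking_budget},
        },
    }

fast = build_request("Summarize this paragraph.", thinking_budget=0)
careful = build_request("Prove this inequality.", thinking_budget=1024)
print(json.dumps(fast, indent=2))
```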

25-04-18

Gemma 3 QAT

Gemma 3 is optimized with Quantization-Aware Training that enables you to run Gemma 3 27B locally on consumer-grade GPUs like the NVIDIA RTX 3090.
https://developers.googleblog.com/en/gemma-3-quantized-aware-trained-state-of-the-art-ai-to-consumer-gpus/
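
A back-of-the-envelope calculation shows why quantization matters here. The sketch assumes 4-bit weights for the quantized model and counts weight storage only, ignoring the KV cache, activations, and runtime overhead, so real usage will be higher. `weight_memory_gb` is a hypothetical helper.

```python
def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate memory for model weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9

PARAMS_27B = 27e9
RTX_3090_VRAM_GB = 24  # the RTX 3090 ships with 24 GB of VRAM

bf16_gb = weight_memory_gb(PARAMS_27B, 16)  # 16-bit weights
int4_gb = weight_memory_gb(PARAMS_27B, 4)   # assumed 4-bit quantized weights

print(f"bf16: {bf16_gb:.1f} GB, fits on a 3090: {bf16_gb < RTX_3090_VRAM_GB}")
print(f"int4: {int4_gb:.1f} GB, fits on a 3090: {int4_gb < RTX_3090_VRAM_GB}")
```

At 16 bits the weights alone exceed the card's memory, while at 4 bits they leave headroom for the cache and activations.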

Claude

25-04-16

Claude Research

Anthropic introduced Research, along with a Google Workspace integration that connects email, calendar, and documents to Claude. https://www.anthropic.com/news/research

Grok

25-04

Grok Studio

Grok has released Grok Studio, adding new features such as code execution and Google Drive support. http://grok.com

Microsoft

25-04-16

BitNet b1.58 2B4T

The first open-source, native 1-bit LLM at the 2-billion parameter scale.
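
The "1.58" in the name refers to ternary weights: each weight takes one of three values (-1, 0, or 1), which carries log2(3) ≈ 1.58 bits of information. The sketch below illustrates the idea with an absmean-style rounding scheme on a plain list. It is a simplified illustration under that assumption, not Microsoft's implementation, and `absmean_ternary` is a hypothetical name.

```python
import math

def absmean_ternary(weights, eps=1e-8):
    """Quantize weights to {-1, 0, 1} by scaling with the mean absolute value.

    Returns the ternary weights and the scale needed to approximate the originals.
    """
    scale = sum(abs(w) for w in weights) / len(weights) + eps
    ternary = [max(-1, min(1, round(w / scale))) for w in weights]
    return ternary, scale

weights = [0.9, -0.05, -1.2, 0.3]
ternary, scale = absmean_ternary(weights)
print(ternary)       # every entry is -1, 0, or 1
print(math.log2(3))  # information per ternary weight, in bits
```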

Cohere

25-04

Embed 4

Embed 4 delivers state-of-the-art accuracy and efficiency, helping enterprises securely retrieve their multimodal data to build agentic AI applications. https://cohere.com/blog/embed-4

ByteDance

Seed-Thinking-v1.5

Seed-Thinking-v1.5 reasons through a thinking phase before responding, which improves its performance on a wide range of benchmarks. https://github.com/ByteDance-Seed/Seed-Thinking-v1.5

QWEN

25-04

Wan2.1-FLF2V-14B

Alibaba's Tongyi team has launched and open-sourced Wan2.1-FLF2V-14B, its first large model for first-and-last-frame video generation. https://wan.video

Zhipu AI

25-04-15

GLM

Zhipu AI has open-sourced its 32B and 9B GLM models: https://chat.z.ai
GLM-Z1-32B-0414 is reported to be the fastest commercial model currently available (200 tokens/s), at 1/30 the price of DeepSeek-R1.

Kling AI

25-04-15

Kling 2.0

Kling 2.0 (Master Edition) and Kolors 2.0 https://app.klingai.com/cn/release-notes

Manycore Tech

25-04

SpatialLM

SpatialLM is a 3D large language model designed to process 3D point cloud data and generate structured 3D scene understanding outputs. These outputs include architectural elements like walls, doors, windows, and oriented object bounding boxes with their semantic categories. https://huggingface.co/manycore-research

Kunlun Tech

The world's first film generation model supporting unlimited duration, built on the Diffusion Forcing framework. https://www.skyreels.ai/home