AI Tech Daily - 2026-06-18 | Recsys Frontier

type

Post

status

Published

date

Jun 18, 2026 04:31

slug

ai-daily-en-2026-06-18

summary

📊 Today's Overview

AI hit multiple inflection points today. Noam Shazeer, co-author of the original Transformer paper, left Google for OpenAI — a decade-long pursuit finally realized. Vercel launched its eve agent framework with a full stack of components, while AWS and Hugging Face both unveiled critical agent infrastructure: Context for enterprise knowledge graphs and ARD for runtime tool discovery. On the research front, NVIDIA open-sourced Nemotron 3 Ultra, a 550B hybrid Mamba-Transformer model with 6x throughput gains, and Inclusion AI released Ling-2.6/Ring-2.6, a trillion-parameter family with hybrid linear attention. CMU challenged the Bitter Lesson with V-pretraining, and OpenAI's AI chemist improved reaction yields by 57% — a landmark for AI-driven science.

🔥 Trend Insights

Agent infrastructure goes mainstream: AWS Context, Hugging Face ARD, and Vercel eve all launched today — the ecosystem is moving from ad-hoc agent building to standardized discovery, context, and deployment layers.

Architecture innovation beyond scaling: Nemotron 3 Ultra's hybrid Mamba-Transformer and Ling-2.6's hybrid linear attention show the industry is actively exploring alternatives to pure Transformer architectures for efficiency.

AI for science hits production: OpenAI's AI chemist autonomously improved reaction yields in medicinal chemistry by 57%, and Radical AI's self-driving lab synthesized 1,200 alloys in 6 months. AI-driven discovery is no longer theoretical.

🐦 X/Twitter Highlights

📈 Hot Topics and Trends

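Noam Shazeer (co-author of the Transformer paper) leaves Google for OpenAI - Sam Altman (OpenAI CEO) said he had wanted to work with Shazeer since OpenAI was founded, and that it finally happened 10 years later. Shazeer previously led Google's conversational model team. @sama @NoamShazeer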

🔧 Tools and Products

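Vercel releases the eve agent framework and a full set of Agent Stack components - Vercel (web development and deployment platform) launched eve, which is organized as a directory structure (agent/tools/skills/sandbox/schedules). Released alongside it: AI SDK, AI Gateway, Workflow SDK, Sandbox, Chat SDK, and Vercel Connect (gives agents secure access to external data through short-lived tokens), plus an enterprise deployment offering (authentication, audit logs). @vercel @vercel @vercel @vercel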

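Cursor launches cloud agents: prompt from your phone and run agents in parallel - Cursor (AI coding assistant company) lets users move a local agent to the cloud so that work continues after the laptop is closed. Agents can be triggered from a phone, and multiple agents can run in parallel in a single run. @cursor_ai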

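Replit integrates with Claude Design, turning designs directly into deployable apps - Replit (online IDE and deployment platform) supports sending a design from Claude Design to Replit to generate a working app. @Replit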

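LlamaIndex releases LiteParse, an open-source PDF parsing tool that makes Claude Code more efficient at understanding PDFs - The team of Jerry Liu (LlamaIndex founder/CEO) released LiteParse. By analyzing traces of how Claude Code operates on PDFs, it avoids repeated parsing, unnecessary OCR, and screenshots, and adds BM25 retrieval. Compared with raw Claude Code, cost drops by 37% and accuracy is higher. @jerryjliu0 @llama_index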

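MiniMax M3 gets Day 0 support on the vLLM inference engine, with NVIDIA and Inferact contributing optimizations - vLLM (open-source inference engine from UC Berkeley) worked with NVIDIA, Inferact, and SemiAnalysis to provide out-of-the-box inference support for the MiniMax M3 model. Disaggregated inferencing and a FlashInfer M3 MoE kernel will follow. @vllm_project @SemiAnalysis_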

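DeepLearning.AI launches a free course on voice agent integration - DeepLearning.AI partnered with VocalBridge to teach three patterns: embedding voice in an application, layering voice on top of an existing agent, and exposing voice as a callable tool for placing outbound calls. @DeepLearningAI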

⚙️ Technical Practice

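NVIDIA GEAR open-sources ENPIRE: 8 Codex agents autonomously control a robot cluster to run experimental research - Jim Fan (lead of NVIDIA GEAR, the embodied AI research team) introduced the ENPIRE system, in which 8 Codex agents operate a robot cluster, allocate GPU and token budgets, and autonomously complete high-precision tasks (fastening zip ties, sorting fine needles, installing GPUs). The system has two layers of hard safety constraints (range-of-motion limits plus compliant grippers), fixes the reward-function objective before any repairs are made, and defines metrics such as robot utilization and token utilization. The team will open-source all of the code. @DrJimFan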

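LMSYS uses SGLang-JAX to optimize inference for a 1T-parameter MoE model on TPU v7x, reducing MoE prefill by 53% - LMSYS Org (LLM systems evaluation organization) worked with Inclusion AI on optimizations for Ling-2.6-1T (a 1T hybrid MoE model), including a fused MoE V2 kernel (compute/communication overlap), a hybrid memory pool (MLA KV plus recurrent state), and chunked parallel prefill for GLA linear attention. @lmsysorg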

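Jacob Li proposes Machine Studying: autonomously developing domain expertise from a document collection - Jacob Li (independent researcher) defines the "Machine Studying" problem: given a document corpus, how should an AI system autonomously develop expertise in a new domain? Omar Khattab (Stanford professor and DSPy author) commented that this gives him the first measurable definition of agent intelligence he finds satisfying, with "study compute" as the unit of measurement. Khattab also recommended OBLIQ-Bench (recall@k) and StudyBench (expertise) as among the few credible long-context benchmarks today. @jacobli99 @lateinteraction @lateinteraction @lateinteraction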

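Ai2 releases MolmoMotion: a 3D motion prediction model that forecasts an object's future trajectory from a single frame - Ai2 (Allen Institute for AI) released MolmoMotion. Given one or more video frames, 3D points on an object, and an instruction (such as "put the white bowl on the table"), it predicts how those points will move over the next few seconds in a shared 3D world coordinate frame. @allen_ai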

⭐ Featured Content

AWS launches the Context service: automatically maps enterprise data into a knowledge graph to give agents runtime context ｜ Agent infrastructure innovation

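At its New York Summit, AWS announced the AWS Context service, which automatically maps existing enterprise data into a knowledge graph and gives AI agents runtime context search. The service builds on the mature knowledge graph technology in Amazon Quick and lets data administrators manage inferred relationships, business rules, and domain knowledge through the console. Key features: the graph learns from agent usage and improves over time, it supports the open Apache Iceberg format, and it integrates with Glue, SageMaker, and Lake Formation. There is no infrastructure to manage, and it can be enabled in a few clicks. This is a significant piece of infrastructure for agent engineering: it directly targets the pain point of fragmented agent context and is directly deployable for teams building enterprise agent applications.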

Sources: AWS

Hugging Face releases Agentic Resource Discovery (ARD): lets agents dynamically search for tools and skills at runtime ｜ Key infrastructure for the agent ecosystem

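Hugging Face released the Agentic Resource Discovery (ARD) specification and a reference implementation, hf-discover, which let agents dynamically search for tools, skills, and other agents at runtime with no pre-configuration. ARD defines a static manifest, ai-catalog.json, and a dynamic registry API, POST /search, that accepts natural-language queries. The Hub implementation already indexes thousands of Skills, MCP Servers, and Spaces. This is a key step for the agent ecosystem from "install and use" to "search by intent": it addresses the core pain points of tool discovery and registration and complements the MCP ecosystem.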

Sources: Hugging Face
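To make the "search by intent" idea concrete, here is a minimal local sketch of a catalog lookup. It is not the ARD schema or the hf-discover API: the manifest fields (`resources`, `name`, `kind`, `description`), the `search` function, and the word-overlap ranking are all assumptions made for illustration, standing in for whatever ai-catalog.json and POST /search actually define.

```python
import json

# Hypothetical manifest: field names are illustrative, not the ARD schema.
CATALOG_JSON = """
{
  "resources": [
    {"name": "pdf-extractor", "kind": "skill",
     "description": "extract text and tables from pdf files"},
    {"name": "weather-mcp", "kind": "mcp-server",
     "description": "current weather and forecast lookup"},
    {"name": "image-captioner", "kind": "space",
     "description": "generate captions for images"}
  ]
}
"""


def search(catalog, query, limit=2):
    """Rank resources by word overlap between the query and each description.

    A toy stand-in for a natural-language search endpoint; a real registry
    would likely use embeddings or a proper text index.
    """
    query_words = set(query.lower().split())
    ranked = []
    for resource in catalog["resources"]:
        description_words = set(resource["description"].lower().split())
        overlap = len(query_words & description_words)
        if overlap:
            ranked.append((overlap, resource["name"]))
    # Highest overlap first; ties broken alphabetically for determinism.
    ranked.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in ranked[:limit]]


catalog = json.loads(CATALOG_JSON)
results = search(catalog, "extract tables from a pdf")
print(results)
```

The point of the sketch is the shape of the interaction: the agent states what it needs in plain language and gets back candidate resources, instead of having every tool configured ahead of time.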

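CMU challenges the Bitter Lesson: proposes V-pretraining, which uses downstream feedback to construct self-supervised tasks dynamically ｜ A new direction for the pretraining paradigm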

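A CMU blog post challenges Sutton's Bitter Lesson, arguing that current pretraining follows scaling laws for model training but still relies on hand-fixed objectives (such as one-hot next-token prediction) for task construction. It proposes V-pretraining: use downstream feedback to train a lightweight task designer that dynamically constructs self-supervised tasks (such as adaptive top-K soft targets), so that pretraining gradients align with downstream gradients. On Qwen2.5-0.5B, GSM8K rises from 22.20 to 29.60, and vision tasks also improve. The method keeps the updates self-supervised and does not use downstream gradients directly. It is an interesting new direction for the pretraining paradigm and directly relevant to practitioners doing LLM pretraining research.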

Sources: CMU Blog
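As a rough illustration of how a top-K soft target differs from a one-hot target, the sketch below mixes the one-hot label with a softmax over the K highest-scored tokens. This is not the method from the CMU post: the function names, the `gold_weight` mixing coefficient, and the idea that a task designer supplies `designer_scores` and `k` are assumptions made for the example.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_soft_target(scores, gold, k, gold_weight=0.7):
    """Mix a one-hot target on `gold` with a softmax over the top-k scores.

    `scores` plays the role of a (hypothetical) task designer's per-token
    scores; tokens outside the top k receive zero mass unless they are gold.
    """
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    probs = softmax([scores[i] for i in top])
    target = [0.0] * len(scores)
    for idx, p in zip(top, probs):
        target[idx] += (1.0 - gold_weight) * p
    target[gold] += gold_weight
    return target


def cross_entropy(target, logits):
    """Cross-entropy between a target distribution and model logits."""
    log_probs = [math.log(p) for p in softmax(logits)]
    return -sum(t * lp for t, lp in zip(target, log_probs))


designer_scores = [2.0, 1.5, -1.0, 0.1]  # hypothetical designer output
model_logits = [1.0, 0.5, 0.0, -0.5]     # hypothetical model output
one_hot = [1.0, 0.0, 0.0, 0.0]
soft = top_k_soft_target(designer_scores, gold=0, k=2)

print(soft)
print(cross_entropy(one_hot, model_logits), cross_entropy(soft, model_logits))
```

In this toy setup the soft target moves some probability mass to the second-ranked token, so the loss also rewards the model for ranking plausible alternatives. Making `k` and the scores depend on downstream feedback is the part the post attributes to the task designer.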

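OpenAI releases an AI chemist: GPT-5.4 autonomously improves a medicinal chemistry reaction, raising yield by 57% ｜ A milestone for AI-driven scientific discovery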

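OpenAI published results from a collaboration with Molecule.one: GPT-5.4, connected to the automated chemistry lab Maria, autonomously proposed and optimized a Chan-Lam coupling reaction and found that adding TEMPO as an additive raised yield from 16.6% to 25.2%, with yield improving for 88% of substrates. The system demonstrates a closed loop from hypothesis generation to experimental validation and is an important milestone for AI-driven scientific discovery. For practitioners following AI for Science, it is a representative case of the AI-scientist workflow (hypothesis generation → experiment design → execution → analysis → iteration).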

Sources: OpenAI
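When reading yield numbers like these, it helps to separate percentage points from relative change. The short calculation below uses only the two yields quoted in this summary (16.6% and 25.2%). They work out to roughly a 52% relative gain, so the 57% headline figure is presumably computed on a different basis that this summary does not state.

```python
def absolute_gain(before, after):
    """Difference in percentage points."""
    return after - before


def relative_gain(before, after):
    """Change expressed as a fraction of the starting value."""
    return (after - before) / before


yield_before = 16.6  # percent, as quoted in the summary
yield_after = 25.2   # percent, as quoted in the summary

abs_points = absolute_gain(yield_before, yield_after)
rel_percent = 100 * relative_gain(yield_before, yield_after)

print(f"absolute: {abs_points:.1f} points, relative: {rel_percent:.1f}%")
```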

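Nature Medicine study: general-purpose LLMs outperform specialized clinical AI tools across the board ｜ A counterintuitive result that challenges industry assumptions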

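Nature Medicine published a study that systematically evaluates general-purpose LLMs (GPT-5.2, Gemini 3.1 Pro, Claude Opus 4.6) against specialized clinical AI tools (OpenEvidence, UpToDate Expert AI) on medical benchmarks. Across three benchmarks, MedQA, HealthBench, and real clinical queries (RCQ), with 12 clinicians blind-reviewing 1,800 outputs, the general-purpose LLMs outperformed the specialized tools across the board, and the specialized tools performed on par with Google Search AI Overview. The result challenges the assumption that specialized models are necessarily better and underscores the need for independent real-world evaluation. It is a direct reference point for practitioners working on LLM application evaluation and vertical-domain deployment.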

Sources: Nature Medicine

Anthropic pauses token-based billing for the Claude Agent SDK and moves to a flat rate ｜ A shift in the cost model for agent development

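Anthropic has paused token-based billing for the Claude Agent SDK and moved to a flat rate. The move responds to developer complaints that token billing is unpredictable, especially because token consumption inside agent loops is hard to estimate. A flat rate reduces cost uncertainty and may push more developers to adopt the Claude Agent SDK. For agent engineering practitioners, it is an important signal for SDK selection and cost modeling.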

Sources: Ars Technica

Latent Space in-depth interview: a self-driving lab synthesizes 1,200 alloys in 6 months and open-sources TorchSim ｜ Frontier practice in AI for Materials

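The Latent Space podcast interviews Radical AI founder Joseph Krause in depth on how the company's self-driving lab raised the pace of alloy discovery to nearly 10x that of the DARPA/GE MACH program: it produced and characterized 1,200 alloys in 6 months, 10 of which have novel advanced properties. Core insight: experimental data is the moat, and AI scientists combined with an automated closed-loop system enable research in parallel. The company also open-sourced TorchSim, a PyTorch molecular dynamics simulation tool. The piece also covers the US-China competitive landscape in materials and public-private partnership strategy. Recommended for practitioners interested in AI for Science, automated experimentation, and materials informatics.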

Sources: Latent Space

Allen AI releases MolmoMotion: a language-guided 3D motion prediction model ｜ Where multimodal meets robotics

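Allen AI released MolmoMotion, a language-guided 3D motion prediction model. Given video frames, 3D points on an object, and a text instruction, it predicts the 3D point trajectories over the next few seconds. Also released: MolmoMotion-1M (1.16 million videos), the largest dataset of 3D point trajectories with motion descriptions, and PointMotionBench, a human-verified benchmark. The model is built on the Molmo 2 backbone and shows promise in downstream tasks such as robot planning and controllable video generation. Weights, data, and code are open-source, giving practitioners in multimodal, robotics, or video generation work directly usable resources.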

Sources: Hugging Face

🎙️ Podcast Picks

E240｜OpenAI teams up with PE to put down $4 billion: a look at FDE, Silicon Valley's hottest new job title

📍 Source: 硅谷101 | ⭐⭐⭐⭐ | 🏷️ LLM, Agent, Product | ⏱️ 51:24

Discusses how model companies like OpenAI and Anthropic are forming deployment companies, creating the new FDE (Forward Deployed Engineer) role. Guests Jove (Cresta FDE lead) and Oliver (ex-McKinsey) break down FDE responsibilities, the Palantir connection, PE partnership models, and key challenges in enterprise AI deployment.

💡 Why Listen: First deep dive on the hottest new AI job title. If you're wondering how enterprise AI actually gets deployed — and what career paths are emerging — this is your playbook.

🔬 The Self-Driving Lab — Joseph Krause, Radical AI

📍 Source: Latent Space | ⭐⭐⭐⭐ | 🏷️ Research, Infra, Interview | ⏱️ 1:16:50

Radical AI founder Joseph Krause on accelerating materials discovery with AI. Core thesis: materials science is harder to accelerate than biology due to supply chain and microstructure complexity. The self-driving lab (SDL) — combining AI scientists with automated experiments in a closed loop — is the key. Experimental data is the real moat.

💡 Why Listen: Concrete case study of AI in materials science, not hype. The 1,200 alloys in 6 months stat is staggering. For anyone building AI for science, this is required listening.

‘Hard Fork’ Live Part 2: Dylan Field on Standing Out in the A.I. Era

📍 Source: Hard Fork | ⭐⭐⭐ | 🏷️ Product, Interview | ⏱️ 31:14

Figma CEO Dylan Field discusses positioning a design company in the AI era, including the 'Design Is Dead' campaign and an Anthropic executive resigning from Figma's board. Field shares his views on how AI is reshaping the design industry and Figma's strategy.

💡 Why Listen: Lightweight but insightful take on AI's impact on creative tools. Good for product folks thinking about how AI changes design workflows.

📄 Paper Highlights

Nemotron 3 Ultra: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning

NVIDIA ｜ 🏷️ Architecture, Training, Inference, MoE

NVIDIA's 550B hybrid Mamba-Transformer with 55B active params achieves SOTA agentic reasoning at 6x higher throughput than comparable models. Open-source checkpoints, training data, and recipe — a major release for efficient agent deployment.

Ling and Ring 2.6 Technical Report: Efficient and Instant Agentic Intelligence at Trillion-Parameter Scale

Inclusion AI ｜ 🏷️ Architecture, Training, Agent Framework, Reasoning

Hybrid linear attention (Lightning Attention + MLA) and the KPop RL framework enable a 1T-param model family with instant responses and deep reasoning. All checkpoints are open-source, making it the most practical trillion-parameter agent system to date.

Models Take Notes at Prefill: KV Cache Can Be Editable and Composable

Pine AI ｜ 🏷️ Inference, KV Cache, Agentic Workflow

KV cache as editable "notes": edit a field's value without full recompute, achieving up to 14.9x lower latency while maintaining decision identity. Validated across 12 models, quantization, MoE, and multimodal caches — a paradigm shift for agent inference.

🐙 GitHub Trending

MolmoMotion ｜ Language-guided 3D motion prediction

Allen AI's model predicts 3D point trajectories from video frames and text instructions. Comes with MolmoMotion-1M (1.16M videos) and PointMotionBench. Open-source weights, data, and code — immediately useful for robotics planning and video generation.

GitHub ｜ ⭐ New ｜ 🗣️ Python ｜ 🏷️ Multimodal, Robotics, Video Generation

LiteParse ｜ PDF parsing optimized for Claude Code

LlamaIndex's open-source PDF parser analyzes Claude Code's trace to avoid redundant OCR and screenshots. Combined with BM25 retrieval, it cuts costs by 37% while improving accuracy over raw Claude Code.

GitHub ｜ ⭐ New ｜ 🗣️ Python ｜ 🏷️ Tool, PDF, LLM
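For readers unfamiliar with BM25, the retrieval step mentioned above can be sketched in a few lines. This is a generic Okapi BM25 scorer written from the standard formula, not LiteParse's code: the `tokenize` and `bm25_scores` helpers, the whitespace tokenizer, the sample pages, and the `k1`/`b` defaults are all assumptions for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a deliberately simple assumption)."""
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with Okapi BM25."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized) / n_docs

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for toks in tokenized:
        doc_freq.update(set(toks))

    scores = []
    for toks in tokenized:
        term_freq = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in term_freq:
                continue
            idf = math.log(
                1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5)
            )
            tf = term_freq[term]
            # Length normalization penalizes longer-than-average documents.
            norm = tf + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical parsed PDF pages.
pages = [
    "invoice total amount due 1200 usd",
    "shipping address and contact details",
    "terms and conditions of the invoice payment",
]
scores = bm25_scores("invoice total", pages)
best_page = max(range(len(pages)), key=scores.__getitem__)
print(best_page, scores)
```

Ranking pages this way lets an agent read only the most relevant parsed text instead of re-parsing or screenshotting the whole document, which is the kind of saving the summary describes.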

ENPIRE ｜ Autonomous robot swarm with Codex agents

NVIDIA GEAR's system lets 8 Codex agents control a robot cluster, allocate GPU/token budgets, and complete high-precision tasks autonomously. Two-layer safety constraints and full open-source code.

GitHub ｜ ⭐ New ｜ 🗣️ Python ｜ 🏷️ Robotics, Agent, Autonomy