- Tags:
- AI (120)
- Daily (96)
- Tech Trends (96)
- Tech Trends (技术趋势) (18)
- Weekly (周报) (17)
- Recommendation Systems (推荐系统) (16)
- Daily (日报) (15)
- Recommendation Systems (10)
- Weekly (10)
- Papers (10)
- Agentic Engineering (7)
- Reflections (6)
- Papers (论文) (6)
- Deep Learning (4)
- Tools (3)
- Harness Engineering (3)
- Recommendations (2)
- Reinforcement Learning (1)
- Mental Models (1)
- Transformer (1)
- LLM (1)
- Management (1)
- Generative (1)
AI hit major milestones today: Axiom Math's system scored a perfect 120 on the Putnam exam, beating top human undergraduates and DeepSeek with formal verification. NVIDIA dropped Nemotron 3 Ultra, a 550B MoE with Mamba-Attention that delivers 5x inference speedup for agent workflows. OpenAI upgraded
AI funding hit record highs and evaluation methods faced a reckoning today. DeepSeek is closing ~$7B in funding at a $30B+ valuation, while Alphabet raised ~$85B through equity financing with $10B from Berkshire Hathaway. Google dropped Gemma 4 12B — an encoder-free multimodal model that runs on a l
AI hit a major inflection point today: Microsoft released MAI-Thinking-1, its first self-trained reasoning model, alongside 6 other models and an Agent Control Specification open standard — a full-stack AI strategy rollout. GitHub's COO revealed that AI agents have driven a 1,400% surge in code comm
AI hit a major capital markets milestone today: Anthropic filed its S-1, kicking off the IPO race with OpenAI. Meanwhile, MiniMax dropped M3 — a model that beats GPT-5.5 and Gemini 3.1 Pro on key benchmarks at just 5-10% the cost, marking the first time a Chinese model has topped US frontier models.
AI's center of gravity shifted today on multiple fronts. OpenAI kicked off its Robotics hiring push under Aditya Ramesh, while MiniMax dropped M3 — the first open-weight model combining coding, 1M context, and native multimodality. NVIDIA's N1X PC SoC announcement signals its expansion from GPU to C
AI security hit a milestone — attackers used an LLM agent for real post-exploitation, completing a full cloud breach in under an hour. vLLM v0.22.0 landed with DeepSeek V4 support and 28.9% latency reduction, while NVIDIA's DynoSim simulates inference stacks 1500x faster than real-time. On the busin
This week's AI narrative converges on one core theme: agents have shifted from "helping developers write code" to "working independently in the background," with inference efficiency, safety evaluation, and capital spending all accelerating in parallel. Anthropic's Opus 4.8 and Dynamic Workflows push parallel sub-agent counts into the hundreds. OpenAI's Codex expands to Windows and adds remote monitoring from mobile. xAI launches grok-build-0.1 at rock-bottom pricing, purpose-built for agentic coding. None of these are "better Tab completion"; they mark a new paradigm in which agents participate as asynchronous teammates. Latent Space's interview with the Cognition and OpenInspect founders maps the evolution from Copilot (first wave) to local agents (second wave) to async agents (third wave). The "third era" that Cursor's CEO described was validated by multiple real-world deployments this week. Capital follows the same vector: Anthropic closes a $96.5B Series H at a $965B valuation, with $47B annualized revenue. Cognition raises a $1B Series D at a $26B valuation, expecting year-end ARR over $1B. The model layer updates just as fast: Claude Opus 4.8 beats GPT-5.5 on multiple coding and agent benchmarks, with a ~4x honesty improvement. MiniMax-M2 has 229.9B total parameters with only 9.8B active via MoE. Qwen-VLA unifies vision, language, and action in a single model, reaching SOTA on 7 robotics benchmarks. On inference efficiency: vLLM integrates fastokens, a Rust BPE tokenizer, to remove long-context tokenization bottlenecks. MobileMoE delivers a 1.8–3.8× speedup on commodity phones. Orbit infrastructure can train trillion-parameter models with RL on a single 8×B200 node. Safety also progresses: OpenAI publishes a handbook for third-party evaluations. Redpanda proposes out-of-band metadata channels for agent safety governance. Onyx Security launches enterprise-grade agent monitoring. Below are four detailed themes.
Anthropic shattered expectations today, closing a $65B Series H at a $96.5B valuation — surpassing OpenAI to become the world's most valuable AI startup — while simultaneously launching Claude Opus 4.8, its strongest coding model yet. Meanwhile, Meta's SilverTorch redefined recommendation system ret
AI coding and agent infrastructure dominated the news cycle. Cognition AI raised $1B at a $26B valuation, while Fireworks AI is reportedly in talks at $15B — the AI coding race is heating up fast. On the technical side, NVIDIA open-sourced Polar for GRPO training across agent tools, Hugging Face sla
AI's commercial landscape flipped today: Anthropic's revenue likely surpassed OpenAI by at least 35%, driven by enterprise preference for safety and reliability. Meanwhile, AI infrastructure hit a new milestone — Fireworks AI ($15B) and Baseten ($11B) became decacorns, marking the "inference inflect
AI hit major milestones today: OpenAI and Google DeepMind both cracked decades-old Erdős math problems — the first time AI has made such a fundamental mathematical breakthrough. On the efficiency front, HRM-Text trained a SOTA 1B model for just $1,500, challenging the scaling law orthodoxy, while De