Xiaohu AI Daily: 2026-05-10

2026-05-10 · Generated 22:26:15

Sources: 175 · Articles: 498 · High scores (8+): 6 · Clusters: 0

🌟 Today's Headline

Fields Medalist says ChatGPT 5.5 Pro delivered "PhD-level" math research in under two hours with zero human help

Fields Medalist Timothy Gowers used ChatGPT 5.5 Pro on open number theory problems, with the model improving an exponential bound to polynomial in under an hour using what an MIT researcher called 'completely original' reasoning, demonstrating AI's capability for independent cutting-edge mathematical contributions.

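💬 Editor's Comment

What matters is not the speed but the shift in authority. The moment one of the highest authorities in mathematics acknowledged AI's originality, AI was promoted from "tool" to "researcher". The very definition of what scholarship can be is being rewritten.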

🔥 Today's Highlights

01

Nvidia has already committed $40B to equity AI deals this year

9/10 News

Nvidia continues to be a big investor in the AI ecosystem.

02

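Google has opened up the all-new Google Health API for Fitbit Air! Fitbit Air launched just yesterday, but the bigger news is that it ships with the all-new @googlehealth AP…

9/10 New Product

Alongside the new Fitbit Air, Google released an all-new Health API and opened it to developers. The API provides 31 health data points covering exercise, sleep, heart rate, blood oxygen, and other dimensions. It supports real-time data push via Webhooks, fine-grained read/write permission control, and querying and aggregating data by time range.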

03

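Introducing Pareto Code: a new, free, experimental coding router. Set `min_coding_score` in your req…

9/10 New Product

Introducing Pareto Code: a new, free, experimental coding router. Set `min_coding_score` in your request to route to the lowest-cost coding model that meets your bar, with rankings provided by @ArtificialAnlys. Watch the Pareto frontier shift in real time👇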
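The routing rule described here (pick the cheapest model whose coding score meets a threshold) can be illustrated with a small sketch. This is not the router's actual implementation or API: the `Model` type, the `route` and `pareto_frontier` helpers, and every name, score, and price in `CATALOG` are hypothetical and exist only to show the idea. Only the `min_coding_score` parameter name comes from the announcement.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Model:
    name: str            # hypothetical model identifier
    coding_score: float  # hypothetical benchmark score, higher is better
    cost: float          # hypothetical price per million tokens, lower is better


def pareto_frontier(models: list[Model]) -> list[Model]:
    """Return the models that no other model beats on both score and cost."""
    frontier = []
    for m in models:
        dominated = any(
            o.coding_score >= m.coding_score
            and o.cost <= m.cost
            and (o.coding_score > m.coding_score or o.cost < m.cost)
            for o in models
        )
        if not dominated:
            frontier.append(m)
    return sorted(frontier, key=lambda m: m.cost)


def route(models: list[Model], min_coding_score: float) -> Optional[Model]:
    """Pick the cheapest model whose score meets the threshold, or None."""
    eligible = [m for m in models if m.coding_score >= min_coding_score]
    return min(eligible, key=lambda m: m.cost) if eligible else None


# Made-up catalog, for illustration only.
CATALOG = [
    Model("model-a", coding_score=40.0, cost=0.5),
    Model("model-b", coding_score=55.0, cost=2.0),
    Model("model-c", coding_score=50.0, cost=3.0),  # dominated by model-b
    Model("model-d", coding_score=62.0, cost=8.0),
]

if __name__ == "__main__":
    print([m.name for m in pareto_frontier(CATALOG)])
    print(route(CATALOG, min_coding_score=50.0))
```

In this sketch the cheapest eligible model always lies on the Pareto frontier, which is presumably why a shifting frontier changes where requests are routed.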

04

Ranked No. 1 in benchmarks. Lightning speed. Native A/V sync. The era of waiting in line for AI vi…

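9/10 New Product

Ranked No. 1 in benchmarks. Lightning speed. Native audio-video sync. The era of waiting in line for AI video is over. HappyHorse is now live on Alibaba Cloud Model Studio. While others are still rendering, you're already done. Build now: https://int.alibabacloud.com/m/1000412167/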

05

"OncoAgent： A Dual-Tier Multi-Agent Framework for Privacy-Preserving Oncology Clinical Decision Support"

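7/10 Tech

A research team has released OncoAgent, an open-source oncology clinical decision support system. The system uses a dual-tier multi-agent framework that combines a LangGraph topology with a four-stage Corrective RAG pipeline and retrieves from more than 70 authoritative clinical guidelines. Depending on query complexity, it routes tasks to either a 9B-parameter speed-optimized model or a 27B-parameter deep-reasoning model, both via QLoRA on AMD MI3…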

06

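SpaceXAI is officially announced! @xFreeze posted a screenshot of the USPTO trademark application: "SpaceXAI" was formally filed on May 6, 2026 (serial number 99808217) and is currently in LIV…

7/10 Industry Analysis

Trademark filings show that "SpaceXAI" was filed on May 6, 2026, and is currently pending examination. The date coincides with Elon Musk's announcement that xAI would be merged into SpaceX. It signals that xAI's AI capabilities will be unified under one brand with SpaceX's aerospace business, with the aim of combining two goals under a single entity: building a multiplanetary civilization and developing superintelligence.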

📖 Worth a Deep Read

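🕐 About 3 min · Tutorial 7/10

It was always the case that agency was self-compounding, but AI is magnifying the effect. Low-agency…

💡 Can be expanded into tutorial material

Agency has always been self-reinforcing, and AI is amplifying the effect. Low-agency AI users lose even more agency, while high-agency AI users gain even more.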

🕐 About 3 min · Tutorial 7/10

Here's how you can integrate GPT-Realtime-2 to bring voice control to a CRM workflow.

💡 Can be expanded into tutorial material

A walkthrough of integrating GPT-Realtime-2 to add voice control to a CRM workflow.

📂 Browse by Category

New Products

ZAYA1-8B Technical Report

6

Zyphra presents ZAYA1-8B, a reasoning-focused mixture-of-experts model with 700M active parameters from 8B total, trained on AMD infrastructure. It matches or exceeds DeepSeek-R1-0528 on math and coding benchmarks despite having under 1B active parameters.

4

Claude Code released version 2.1.138, a routine maintenance update focused on internal improvements and stability enhancements. This point release does not introduce any new user-facing features or major functionality changes.

4

Ollama v0.30.0-rc11 release candidate brings critical fixes for Windows build systems and developer workflows. Specifically, it resolves issues where compiler paths containing spaces would cause build failures, a widespread problem on Windows machines where default installation directories often include spaces in their names. These path issues have prevented successful compilation for many users.

Opinion

DBMSolver: A Training-free Diffusion Bridge Sampler for High-Quality Image-to-Image Translation

6

DBMSolver is a training-free sampler for Diffusion Bridge Models that accelerates image-to-image translation by exploiting semi-linear SDE/ODE structure through exponential integrators, achieving 1st and 2nd-order solutions while significantly reducing required function evaluations (NFEs).

Horizon-Constrained Rashomon Sets for Chaotic Forecasting

6

Introduces horizon-constrained Rashomon sets, a theoretical framework characterizing how model multiplicity evolves with prediction horizon in chaotic systems, showing exponential growth unlike static prediction tasks, providing new insights into forecasting under uncertainty.

Robustness of Graph Self-Supervised Learning to Real-World Noise: A Case Study on Text-Driven Biomedical Graphs

6

Examines robustness of Graph Self-Supervised Learning (GSSL) methods trained on automatically extracted knowledge graphs from text containing real-world noise, filling a gap in prior research that assumed clean, curated graph data.

Industry Analysis

Auction-Based Regulation for Artificial Intelligence

6

Proposes a rigorous mathematical framework for AI regulation based on auction mechanisms, addressing gaps in regulatory approaches to AI safety, bias, and legal compliance, offering structured methodology for governing AI deployment.

Intelligent CCTV for Urban Design: AI-Based Analysis of Soft Infrastructure at Intersections

5

Leverages existing CCTV networks and computer vision to measure real-world impact of urban design interventions (temporary pedestrian refuges, curb extensions) on vehicle speed and safety. Deep learning models enabled perspective-corrected speed analysis before and after each intervention.

Tutorials

BioMedArena: An Open-source Toolkit for Building and Evaluating Biomedical Deep Research Agents

6

BioMedArena is an open-source toolkit that simplifies building and evaluating biomedical deep research agents by providing a unified evaluation harness and tool registry, reducing per-paper engineering overhead and enabling more efficient foundation model integration.

MTL-MAD: Multi-Task Learners are Effective Medical Anomaly Detectors

6

MTL-MAD uses multiple self-supervised and pseudo-labeling tasks within a Mixture-of-Experts framework for medical image anomaly detection without anomaly samples during training, achieving state-of-the-art performance through proxy task integration.

Fast and Efficient Gossip Algorithms for Robust and Non-smooth Decentralized Learning

6

Develops gossip-based algorithms for decentralized learning on resource-constrained edge devices that are communication-efficient and robust to data corruption, combining benefits of prior methods that previously required tradeoffs.

📭 Skipped Today

Filtered automatically. See the reasons below:

DBMSolver: A Training-free Diffusion Bridge Sampler for High-Quality Image-to-Image Translation
→ Single-source paper, low value for general readers
0.131.0-alpha.4
→ alpha/beta/rc minor release, no new features
rust-v0.131.0-alpha.3
→ alpha/beta/rc minor release, no new features
0.131.0-alpha.2
→ alpha/beta/rc minor release, no new features
BioMedArena: An Open-source Toolkit for Building and Evaluating Biomedical Deep Research Agents
→ Single-source paper, low value for general readers
Horizon-Constrained Rashomon Sets for Chaotic Forecasting
→ Single-source paper, low value for general readers
Robustness of Graph Self-Supervised Learning to Real-World Noise: A Case Study on Text-Driven Biomedical Graphs
→ Single-source paper, low value for general readers
Steering Visual Generation in Unified Multimodal Models with Understanding Supervision
→ Single-source paper, low value for general readers

📎 Long Tail (358)

Best Arm Identification in Generalized Linear Bandits via Hybrid Feedback 5

Measuring Black-Box Confidence via Reasoning Trajectories: Geometry, Coverage, and Verbalization 5

Rethinking Vacuity for OOD Detection in Evidential Deep Learning 5

Structural Instability of Feature Composition 5

GRALIS: A Unified Canonical Framework for Linear Attribution Methods via Riesz Representation 5

Accelerating LMO-Based Optimization via Implicit Gradient Transport 5

Fourier Feature Methods for Nonlinear Causal Discovery: FFML Scoring and FFCI Testing in Mixed Data 5

Tuning Derivatives for Causal Fairness in Machine Learning 5

The Weight Gram Matrix Captures Sequential Feature Linearization in Deep Networks 5

Operator-Guided Invariance Learning for Continuous Reinforcement Learning 5

Accelerating Discrete Facility Layout Optimization: A Hybrid CDCL and CP-SAT Architecture 5

CatNet: Controlling the False Discovery Rate in LSTM with SHAP Feature Importance and Gaussian Mirrors 5

Amortized Linear-time Exact Shapley Value for Product-Kernel Methods 5

Generalised Linear Models in Deep Bayesian RL with Learnable Basis Functions 5

StableTTA: Improving Vision Model Performance by Training-free Test-Time Adaptation Methods 5

Understanding Annotator Safety Policy with Interpretability 5

Partial Evidence Bench: Benchmarking Authorization-Limited Evidence in Agentic Systems 5

Intelligent CCTV for Urban Design: AI-Based Analysis of Soft Infrastructure at Intersections 5

Intentionality is a Design Decision: Measuring Functional Intentionality for Accountable AI Systems 5

FoodCHA: Multi-Modal LLM Agent for Fine-Grained Food Analysis 5

Prober.ai: Gated Inquiry-Based Feedback via LLM-Constrained Personas for Argumentative Writing Development 5

Large Vision-Language Models Get Lost in Attention 5

MAT-Cell: A Multi-Agent Tree-Structured Reasoning Framework for Batch-Level Single-Cell Annotation 5

Knee Osteoarthritis Severity Grading Using Optimized Deep Learning and LLM-Driven Intelligent AI on Computationally Limited Systems 5

Multimodal Fact-Level Attribution for Verifiable Reasoning 5

SANet: A Semantic-aware Agentic AI Networking Framework for Cross-layer Optimization in 6G 5

PREFER: Personalized Review Summarization with Online Preference Learning 5

Intentmaking and Sensemaking: Human Interaction with AI-Guided Mathematical Discovery 5

HaM-World: Soft-Hamiltonian World Models with Selective Memory for Planning 5

Back to the Beginning of Heuristic Design: Bridging Code and Knowledge with LLMs 5

P-Guide: Parameter-Efficient Prior Steering for Single-Pass CFG Inference 5

Graphlets as Building Blocks for Structural Vocabulary in Knowledge Graph Foundation Models 5

Towards Annotation-Free Validation of MLLMs: A Vision-Language Logical Consistency Metric 5

Joint Consistency: A Unified Test-Time Aggregation Framework via Energy Minimization 5

Proactive Instance Navigation with Comparative Judgment for Ambiguous User Queries 5

Price of Fairness in Short-Term and Long-Term Algorithmic Selections 5

A Regime Theory of Controller Class Selection for LLM Action Decisions 5

Prediction and Empowerment: A Theory of Agency through Bridge Interfaces 5

SCRuB: Social Concept Reasoning under Rubric-Based Evaluation 5

Probabilistic Dating of Historical Manuscripts via Evidential Deep Regression on Visual Script Features 5

MedMamba: Recasting Mamba for Medical Time Series Classification 5

Layout-Aware Representation Learning for Open-Set ID Fraud Discovery 5

MidSteer: Optimal Affine Framework for Steering Generative Models 5

Adaptive Computation Depth via Learned Token Routing in Transformers 5

MACS: Modality-Aware Capacity Scaling for Efficient Multimodal MoE Inference 5

Decision-aware User Simulation Agent for Evaluating Conversational Recommender Systems 5

Automated Population-Level Audit Assurance via AI-Based Document Intelligence 5

Maximizing Rollout Informativeness under a Fixed Budget: A Submodular View of Tree Search for Tool-Use Agentic Reinforcement Learning 5

Open-SAT: LLM-Guided Query Embedding Refinement for Open-Vocabulary Object Retrieval in Satellite Imagery 5

Making AI Drafts Count: A Quality Threshold in Audio Description Workflows 5

Tamaththul3D: High-Fidelity 3D Saudi Sign Language Avatars from Monocular Video 5

SPADE: Faster Drug Discovery by Learning from Sparse Data 5

Two Steps Are All You Need: Efficient 3D Point Cloud Anomaly Detection with Consistency Models 5

AstroAlertBench: Evaluating the Accuracy, Reasoning, and Honesty of Multimodal LLMs in Astronomical Classification 5

Irminsul: MLA-Native Position-Independent Caching for Agentic LLM Serving 5

MASPO: Joint Prompt Optimization for LLM-based Multi-Agent Systems 5

Auto Research with Specialist Agents Develops Effective and Non-Trivial Training Recipes 5

LCC-LLM: Leveraging Code-Centric Large Language Models for Malware Attribution 5

CITE: Anytime-Valid Statistical Inference in LLM Self-Consistency 5

LLM-Driven Design Space Exploration of FPGA-based Accelerators 5

Towards Reliable LLM Evaluation: Correcting the Winner's Curse in Adaptive Benchmarking 5

Adding Thermal Awareness to Visual Systems in Real-Time via Distilled Diffusion Models 5

Optimal Transport for LLM Reward Modeling from Noisy Preference 5

OBLIQ-Bench: Exposing Overlooked Bottlenecks in Modern Retrievers with Latent and Implicit Queries 5

Attributions All the Way Down? The Metagame of Interpretability 5

Fine-Tuning Small Language Models for Solution-Oriented Windows Event Log Analysis 5

TinyBayes: Closed-Form Bayesian Inference via Jacobi Prior for Real-Time Image Classification on Edge Devices 5

eXplaining to Learn (eX2L): Regularization Using Contrastive Visual Explanation Pairs for Distribution Shifts 5

Sparkle: Realizing Lively Instruction-Guided Video Background Replacement via Decoupled Guidance 5

Separation Assurance between Heterogeneous Fleets of Small Unmanned Aerial Systems via Multi-Agent Reinforcement Learning 5

The Structural Origin of Attention Sink: Variance Discrepancy, Super Neurons, and Dimension Disparity 5

Are We Making Progress in Multimodal Domain Generalization? A Comprehensive Benchmark Study 5

Optimizer-Model Consistency: Full Finetuning with the Same Optimizer as Pretraining Forgets Less 5

Games for AI Control: Models of Safety Evaluations of AI Deployment Protocols 5

CORE: Concept-Oriented Reinforcement for Bridging the Definition-Application Gap in Mathematical Reasoning 5

BioAgent Bench: An AI Agent Evaluation Suite for Bioinformatics 5

Latent Generative Solvers for Generalizable Long-Term Physics Simulation 5

Neuro-Symbolic Proof Generation for Scaling Systems Software Verification 5

Supervising Ralph Wiggum: Exploring a Metacognitive Co-Regulation Agentic AI Loop for Engineering Design 5

Problem Reductions at Scale: Agentic Integration of Computationally Hard Problems 5

AEM: Adaptive Entropy Modulation for Multi-Turn Agentic Reinforcement Learning 5

Position: agentic AI orchestration should be Bayes-consistent 5

New Bounds for Zarankiewicz Numbers via Reinforced LLM Evolutionary Search 5

Segment-Aligned Policy Optimization for Multi-Modal Reasoning 5

DeTrigger: A Gradient-Centric Approach to Backdoor Attack Mitigation in Federated Learning 5

SoccerMaster: A Vision Foundation Model for Soccer Understanding 5

Perceptive Humanoid Parkour: Chaining Dynamic Human Skills via Motion Matching 5

AI Agents Alone Are Not (Yet) Sufficient for Social Simulation 5

PEPA: a Persistently Autonomous Embodied Agent with Personalities 5

Path Dependence under Adaptive AI Delegation 5

DC-DiT: Adaptive Compute and Elastic Inference for Visual Generation via Dynamic Chunking 5

Unsupervised Anomaly Detection in Wearable Foot Sensor Data: A Baseline Feasibility Study Towards Diabetic Foot Ulcer Prevention 5

ChArtist: Generating Pictorial Charts with Unified Spatial and Subject Control 5

P^2O: Joint Policy and Prompt Optimization 5

Spectral Edge Dynamics: An Analytical-Empirical Study of Phase Transitions in Neural Network Training 5

Frequency-Enhanced Diffusion Models: Curriculum-Guided Semantic Alignment for Zero-Shot Skeleton Action Recognition 5

Mochi: Aligning Pre-training and Inference for Efficient Graph Foundation Models via Meta-Learning 5

AgriKD: Cross-Architecture Knowledge Distillation for Efficient Leaf Disease Classification 5

Fonttrio Launches as Open-Source Font Pairing Registry for shadcn/ui 5

AWS Improves Aurora Serverless: 45% Faster Ramp-Up, 30% Higher Throughput 5

Hackable Robot Lawn Mower Unlocks a New Nightmare 5

The Mismeasure of Open Source 5

BALAR: A Bayesian Agentic Loop for Active Reasoning 5

On Time, Within Budget: Constraint-Driven Online Resource Allocation for Agentic Workflows 5

Can RL Teach Long-Horizon Reasoning to LLMs? Expressiveness Is Key 5

One Turn Too Late: Response-Aware Defense Against Hidden Malicious Intent in Multi-Turn Dialogue 5

PersonaKit (PK): A Plug-and-Play Platform for User Testing Diverse Roles in Full-Duplex Dialogue 5

HNC: Leveraging Hard Negative Captions towards Models with Fine-Grained Visual-Linguistic Comprehension Capabilities 5

Log-Likelihood, Simpson's Paradox, and the Detection of Machine-Generated Text 5

Measuring Evaluation-Context Divergence in Open-Weight LLMs: A Paired-Prompt Protocol with Pilot Evidence of Alignment-Pipeline-Specific Heterogeneity 5

Is Escalation Worth It? A Decision-Theoretic Characterization of LLM Cascades 5

When No Benchmark Exists: Validating Comparative LLM Safety Scoring Without Ground-Truth Labels 5

Flexible Agent Alignment with Goal Inference from Open-Ended Dialog 5

ProAgent: Harnessing On-Demand Sensory Contexts for Proactive LLM Agent Systems in the Wild 5

Autogenesis: A Self-Evolving Agent Protocol 5

WaferSAGE: Large Language Model-Powered Wafer Defect Analysis via Synthetic Data Generation and Rubric-Guided Reinforcement Learning 5

GR-Ben: A General Reasoning Benchmark for Evaluating Process Reward Models 5

Zero-Shot Confidence Estimation for Small LLMs: When Supervised Baselines Aren't Worth Training 5

Evaluating Prompting and Execution-Based Methods for Deterministic Computation in LLMs 5

What Happens Inside Agent Memory? Circuit Analysis from Emergence to Diagnosis 5

Correct Is Not Enough: Training Reasoning Planners with Executor-Grounded Rewards 5

Attribution-Guided Pruning for Insight and Control: Circuit Discovery and Targeted Correction in Small-scale LLMs 5

MediEval: A Unified Medical Benchmark for Patient-Contextual and Knowledge-Grounded Reasoning in LLMs 5

Conversation for Non-verifiable Learning: Self-Evolving LLMs through Meta-Evaluation 5

Quantifying Hallucinations in Language Models on Medical Textbooks 5

MetaKE: Meta-Learning for Knowledge Editing Toward a Better Accuracy-Editability Trade-off 5

Alternating Reinforcement Learning with Contextual Rubric Rewards: Beyond the Scalarization Strategy 5

Bringing Up a Bilingual BabyLM: Investigating Multilingual Language Acquisition Using Small-Scale Models 5

Handling and Interpreting Missing Modalities in Patient Clinical Trajectories via Autoregressive Sequence Modeling 5

Information Aggregation with AI Agents 5

Unified 4D World Action Modeling from Video Priors with Asynchronous Denoising 5

Structured Progressive Knowledge Activation for LLM-Driven Neural Architecture Search 5

Supercharging LLM inference on Google TPUs: Achieving 3X speedups with diffusion-style speculative decoding 5

Troubleshoot performance issues faster with the new Grafana Assistant integration for Database Observability 5

Google's "Preferred Sources" feature is a free pass for more garbage in search 4

LaTA: A Drop-in, FERPA-Compliant Local-LLM Autograder for Upper-Division STEM Coursework 4

Text-Graph Synergy: A Bidirectional Verification and Completion Framework for RAG 4

Resolving the bias-precision paradox with stochastic causal representation learning for personalized medicine 4

HyperLens: Quantifying Cognitive Effort in LLMs with Fine-grained Confidence Trajectory 4

CircuitFormer: A Circuit Language Model for Analog Topology Design from Natural Language Prompt 4

Sheet as Token: A Graph-Enhanced Representation for Multi-Sheet Spreadsheet Understanding 4

On the Role of Language Representations in Auto-Bidding: Findings and Implications 4

Taklif.AI: LLM-Powered Platform for Interest-Based Personalized College Assignments 4

AirQualityBench: A Realistic Evaluation Benchmark for Global Air Quality Forecasting 4

Agentic, Context-Aware Risk Intelligence in the Internet of Value 4

Wisteria: A Unified Multi-Scale Feature Learning Framework for DNA Language Model 4

Which Are the Low-Resource Languages of the Semantic Web? 4

Temporal Smoothness Doubly Robust Learning for Debiased Knowledge Tracing 4

From Coordinate Matching to Structural Alignment: Rethinking Prototype Alignment in Heterogeneous Federated Learning 4

ReasonSTL: Bridging Natural Language and Signal Temporal Logic via Tool-Augmented Process-Rewarded Learning 4

From Token Lists to Graph Motifs: Weisfeiler-Lehman Analysis of Sparse Autoencoder Features 4

Market-Alignment Risk in Pricing Agents: Trace Diagnostics and Trace-Prior RL under Hidden Competitor State 4

SpatialEpiBench: Benchmarking Spatial Information and Epidemic Priors in Forecasting 4

Improved techniques for fine-tuning flow models via adjoint matching: a deterministic control pipeline 4

A Note on TurboQuant and the Earlier DRIVE/EDEN Line of Work 4

Are Flat Minima an Illusion? 4

Physics-Informed Neural Networks with Learnable Loss Balancing and Transfer Learning 4

Evolutionary fine tuning of quantized convolution-based deep learning models 4

Governed Metaprogramming for Intelligent Systems: Reclassifying Eval as a Governed Effec 4

Graph Normalization: Fast Binarizing Dynamics for Differentiable MWIS 4

Feature Starvation as Geometric Instability in Sparse Autoencoders 4

Two-Stage Learned Decomposition for Scalable Routing on Multigraphs 4

Creative Robot Tool Use by Counterfactual Reasoning 4

On Semantic Loss Fine-Tuning Approach for Preventing Model Collapse in Causal Reasoning 4

The Pedagogy of AI Mistakes: Fostering Higher-Order Thinking 4

MOSAIC: Module Discovery via Sparse Additive Identifiable Causal Learning for Scientific Time Series 4

Nearly Optimal Attention Coresets 4

Leveraging Image Generators to Address Training Data Scarcity: The Gen4Regen Dataset for Forest Regeneration Mapping 4

The Missing Evaluation Axis: What 10,000 Student Submissions Reveal About AI Tutor Effectiveness 4

EGA: Adapting Frozen Encoders for Vector Search with Bounded Out-of-Distribution Degradation 4

CFE-PPAR: Compression-friendly encryption for privacy-preserving action recognition leveraging video transformers 4

Budgeted Attention Allocation: Cost-Conditioned Compute Control for Efficient Transformers 4

CRAFT: Forgetting-Aware Intervention-Based Adaptation for Continual Learning 4

CoMemNet: Contrastive Sampling with Memory Replay Network for Continual Traffic Prediction 4

The autoPET3 Challenge -- Automated Lesion Segmentation in Whole-Body PET/CT - Multitracer Multicenter Generalization 4

Revealing Modular Gradient Noise Imbalance in LLMs: Calibrating Adam via Signal-to-Noise Ratio 4

Q-MMR: Off-Policy Evaluation via Recursive Reweighting and Moment Matching 4

iPhoneBlur: A Difficulty-Stratified Benchmark for Consumer Device Motion Deblurring 4

LicenseGPT: A Fine-tuned Foundation Model for Publicly Available Dataset License Compliance 4

T2I-VeRW: Part-level Fine-grained Perception for Text-to-Image Vehicle Retrieval 4

Quantizing With Randomized Hadamard Transforms: Efficient Heuristic Now Proven 4

Quantum Kernels for Audio Deepfake Detection Using Spectrogram Patch Features 4

Causal Reinforcement Learning for Complex Card Games: A Magic The Gathering Benchmark 4

Normalized Architectures are Natively 4-Bit 4

VISD: Enhancing Video Reasoning via Structured Self-Distillation 4

Beyond Autoregressive RTG: Conditioning via Injection Outside Sequential Modeling in Decision Transformer 4

Dynamic Pondering Sparsity-aware Mixture-of-Experts Transformer for Event Stream based Visual Object Tracking 4

Continuous Expert Assembly: Instance-Conditioned Low-Rank Residuals for All-in-One Image Restoration 4

BUILD-AND-FIND: An Effort-Aware Protocol for Evaluating Agent-Managed Codebases 4

Autoregressive Visual Generation Needs a Prologue 4

Listwise Policy Optimization: Group-based RLVR as Target-Projection on the LLM Response Simplex 4

SymDrift: One-Shot Generative Modeling under Symmetries 4

Unifying Goal-Conditioned RL and Unsupervised Skill Learning via Control-Maximization 4

Entropy-Regularized Adjoint Matching for Offline RL 4

In-Context Black-Box Optimization with Unreliable Feedback 4

EA-WM: Event-Aware Generative World Model with Structured Kinematic-to-Visual Action Fields 4

Taming the Entropy Cliff: Variable Codebook Size Quantization for Autoregressive Visual Generation 4

FunctionalAgent: Towards end-to-end on-top functional design 4

When to Trust Imagination: Adaptive Action Execution for World Action Models 4

Soft Deterministic Policy Gradient with Gaussian Smoothing 4

Band Together: Untargeted Adversarial Training with Multimodal Coordination against Evasion-based Promotion Attacks 4

Cumulative-Goodness Free-Riding in Forward-Forward Networks: Real, Repairable, but Not Accuracy-Dominant 4

Inference-Time Refinement Closes the Synthetic-Real Gap in Tabular Diffusion 4

Pro-KLShampoo: Projected KL-Shampoo with Whitening Recovered by Orthogonalization 4

NavOne: One-Step Global Planning for Vision-Language Navigation on Top-Down Maps 4

Memory Efficient Full-gradient Attacks (MEFA) Framework for Adversarial Defense Evaluations 4

Flow Matching with Arbitrary Auxiliary Paths 4

Continuous-Time Distribution Matching for Few-Step Diffusion Distillation 4

MinMax Recurrent Neural Cascades 4

Asymmetric On-Policy Distillation: Bridging Exploitation and Imitation at the Token Level 4

Consistent Geometric Deep Learning via Hilbert Bundles and Cellular Sheaves 4

ORTHOBO: Orthogonal Bayesian Hyperparameter Optimization 4

3D MRI Image Pretraining via Controllable 2D Slice Navigation Task 4

DINORANKCLIP: DINOv3 Distillation and Injection for Vision-Language Pretraining with High-Order Ranking Consistency 4

Concept-Based Abductive and Contrastive Explanations for Behaviors of Vision Models 4

BAMI: Training-Free Bias Mitigation in GUI Grounding 4

Multi-Modality Distillation via Learning the teacher's modality-level Gram Matrix 4

An Efficient Insect-inspired Approach for Visual Point-goal Navigation 4

Beyond Factual Correctness: Mitigating Preference-Inconsistent Explanations in Explainable Recommendation 4

Making AI Evaluation Deployment Relevant Through Context Specification 4

Disentangled Generative Graph Representation Learning 4

ReMAP: Neural Reparameterization for Scalable MAP Inference in Arbitrary-Order Markov Random Fields 4

Multi-Scale Spectral Attention Module-based Hyperspectral Segmentation in Autonomous Driving Scenarios 4

REMAP: Regularized Matching and Partial Alignment of Video Embeddings 4

Beyond Value Elicitation: Towards Moral Profiles in Early Requirements Engineering via Role-Playing Games and Anthropologist LLMs 4

A Practitioner's Guide to Kolmogorov-Arnold Networks 4

AsyncVLA: Asynchronous Flow Matching for Vision-Language-Action Models 4

Continually Evolving Skill Knowledge in Vision Language Action Model 4

Interpretability-Guided Bi-objective Optimization: Aligning Accuracy and Explainability 4

Switchcodec: Adaptive residual-expert sparse quantization for high-fidelity neural audio coding 4

PixelGen: Improving Pixel Diffusion with Perceptual Supervision 4

Visual Para-Thinker: Divide-and-Conquer Reasoning for Visual Comprehension 4

Spectral Alignment in Forward-Backward Representations via Temporal Abstraction 4

Caracal: Causal Architecture via Spectral Mixing 4

Reading List 05/09/2026 4

Using a Python 3 LSP server with Python 2 code works (more or less) 4

Decodable but Not Corrected by Fixed Residual-Stream Linear Steering: Evidence from Medical LLM Failure Regimes 4

Rethinking Adapter Placement: A Dominant Adaptation Module Perspective 4

OPSD Compresses What RLVR Teaches: A Post-RL Compaction Stage for Reasoning Models 4

Patch-Effect Graph Kernels for LLM Interpretability 4

Counterargument for Critical Thinking as Judged by AI and Humans 4

Generating Query-Focused Summarization Datasets from Query-Free Summarization Datasets 4

Adaptive Selection of LoRA Components in Privacy-Preserving Federated Learning 4

IRC-Bench: Recognizing Entities from Contextual Cues in First-Person Reminiscences 4

Contrastive Identification and Generation in the Limit 4

TIDE: Every Layer Knows the Token Beneath the Context 4

Linear Semantic Segmentation for Low-Resource Spoken Dialects 4

WavCube: Unifying Speech Representation for Understanding and Generation via Semantic-Acoustic Joint Modeling 4

E = T*H/(O+B): A Dimensionless Control Parameter for Mixture-of-Experts Ecology 4

Litespark Inference on Consumer CPUs: Custom SIMD Kernels for Ternary Neural Networks 4

Recursive Agent Optimization 4

Benchmarking PNW Model for MedMNIST to 100% Accuracy 4

Learning Lifted Action Models from Unsupervised Visual Traces 4

An Agent-Oriented Pluggable Experience-RAG Skill for Experience-Driven Retrieval Strategy Orchestration 4

Searching the Internet for Challenging Benchmarks at Scale 4

Catch Your Breath: Adaptive Computation for Self-Paced Sequence Production 4

DialectLLM: A Dialect-Aware Dialog[ue] Generation Framework Beyond Standard American English 4

CAMEL: Confidence-Gated Reflection for Reward Modeling 4

Adaptive Greedy Frame Selection for Long Video Understanding 4

Structural Sensitivity in Compressed Transformers: Relative Error Propagation and Layer Removal 4

Screening Is Enough 4

Reward Score Matching: Unifying Reward-based Fine-tuning for Flow and Diffusion Models 4

Low-Rank Adaptation for Critic Learning in Off-Policy Reinforcement Learning 4

Enhancing Speaker Verification with Whispered Speech via Post-Processing 4

Enhancing Science Classroom Discourse Analysis through Joint Multi-Task Learning for Reasoning-Component Classification 4

How Fast Should a Model Commit to Supervision? Training Reasoning Models on the Tsallis Loss Continuum 4

RSAT: Structured Attribution Makes Small Language Models Faithful Table Reasoners 4

Probe-Geometry Alignment: Erasing the Cross-Sequence Memorization Signature Below Chance 4

Predictive and Prescriptive AI toward Optimizing Wildfire Suppression 4

Continual Knowledge Updating in LLM Systems: Learning Through Multi-Timescale Memory Dynamics 4

The Real Singularity is the Friends We Made Along the Way 3

Authorization Propagation in Multi-Agent AI Systems: Identity Governance as Infrastructure 3

Agentic Discovery of Exchange-Correlation Density Functionals 3

LANTERN: LLM-Augmented Neurosymbolic Transfer with Experience-Gated Reasoning Networks 3

Housing Potential Common Data Model and City Digital Twin 3

BitCal-TTS: Bit-Calibrated Test-Time Scaling for Quantized Reasoning Models 3

Locality-aware Private Class Identification for Domain Adaptation with Extreme Label Shift 3

Retrieval-Conditioned Topology Selection with Provable Budget Conservation for Multi-Agent Code Generation 3

Attractor Geometry of Transformer Memory: From Conflict Arbitration to Confident Hallucination 3

GCCM: Enhancing Generative Graph Prediction via Contrastive Consistency Model 3

Knowledge-Graph Paths as Intermediate Supervision for Self-Evolving Search Agents 3

SDFlow: Similarity-Driven Flow Matching for Time Series Generation 3

Evaluating Explainability in Safety-Critical ATR Systems: Limitations of Post-Hoc Methods and Paths Toward Robust XAI 3

Confidence is the key: how conformal prediction enhances the generative design of permeable peptides 3

HEDP: A Hybrid Energy-Distance Prompt-based Framework for Domain Incremental Learning 3

Von Neumann Networks 3

Long-Horizon Q-Learning: Accurate Value Learning via n-Step Inequalities 3

XDecomposer: Learning Prior-Free Set Decomposition for Multiphase X-ray Diffraction 3

Null Space Constrained Contrastive Visual Forgetting for MLLM Unlearning 3

Topology-Driven Anti-Entanglement Control for Soft Robots 3

PPO-Based Dynamic Positioning of HAPS-BS in Wind-Disturbed Stratospheric Maritime Networks 3

Memory-Efficient EDA Denoising via Knowledge Distillation for Wearable IoT Under Severe Motion Artifacts and Underwater Conditions 3

Enhancing Cryo-EM Density Map Segmentation in Phenix for Improved Atomic Model Building 3

Towards an Inferentialist Account of Information Through Proof-theoretic Semantics 3

When Semantic Communication Meets Queueing: Cross-Layer Latency and Task Fidelity Optimization 3

A Testable Certificate for Constant Collapse in Teacher-Guided VAEs 3

VARS-FL: Validation-Aligned Client Selection for Non-IID Federated Learning in IoT Systems 3

Architecture-agnostic Lipschitz-constant Bayesian header and its application to resolve semantically proximal classification errors with vision transformers 3

Beyond Uniform Credit Assignment: Selective Eligibility Traces for RLVR 3

CredibleDFGO: Differentiable Factor Graph Optimization with Credibility Supervision 3

Learning Discrete Autoregressive Priors with Wasserstein Gradient Flow 3

Super-Level-Set Regression: Conditional Quantiles via Volume Minimization 3

A Topological Sorting Criterion for Random Causal Directed Acyclic Graphs 3

On the Security of Research Artifacts 3

On the Implicit Reward Overfitting and the Low-rank Dynamics in RLVR 3

Directional Consistency as a Complementary Optimization Signal: The GONO Framework 3

Towards Metric-Faithful Neural Graph Matching 3

Goal-Driven Query Answering over First- and Second-Order Dependencies with Equality 3

Refining Gelfond Rationality Principle: Towards More Comprehensive Foundational Principles for Answer Set Semantics 3

Are Large Language Models Robust in Understanding Code Against Semantics-Preserving Mutations? 3

Cohort-Based Active Modality Acquisition 3

Practical Adversarial Attacks on Stochastic Bandits via Fake Data Injection 3

Leveraging Analytic Gradients in Provably Safe Reinforcement Learning 3

A Survey of Personalized Federated Foundation Models for Privacy-Preserving Recommendation 3

Multivariate Standardized Residuals for Conformal Prediction 3

Toward Practical Equilibrium Propagation: Brain-inspired Recurrent Neural Network with Feedback Regulation and Residual Connections 3

Frictional Q-Learning 3

On the optimization dynamics of RLVR: Gradient gap and step size thresholds 3

Mapping Human Anti-collusion Mechanisms to Multi-agent AI Systems 3

CSMCIR: CoT-Enhanced Symmetric Alignment with Memory Bank for Composed Image Retrieval 3

Dynamic Expert-Guided Model Averaging for Causal Discovery 3

Theoretically Optimal Attention/FFN Ratios in Disaggregated LLM Serving 3

Keep Rehearsing and Refining: Lifelong Learning Vehicle Routing under Continually Drifting Tasks 3

The Illusion of Forgetting: Attack Unlearned Diffusion via Initial Latent Variable Optimization 3

SMI: Statistical Membership Inference for Reliable Unlearned Model Auditing 3

DOGMA: Weaving Structural Information into Data-centric Single-cell Transcriptomics Analysis 3

AROpt: An Optimization Method for Autoregressive Time Series Forecasting 3

It's Not a Lottery, It's a Race: Understanding How Gradient Descent Adapts the Network's Capacity to the Task 3

Parity, Sensitivity, and Transformers 3

A Theoretical Analysis of Test-Driven Code Generation 3

Action-to-Action Flow Matching 3

AceGRPO: Adaptive Curriculum Enhanced Group Relative Policy Optimization for Autonomous Machine Learning Engineering 3

Risk Horizons: Structured Hypothesis Spaces for Longitudinal Clinical Prediction 3

On the Rate-Distortion-Complexity Tradeoff for Semantic Communication 3

Molecular Design beyond Training Data with Novel Extended Objective Functionals of Generative AI Models Driven by Quantum Annealing Computer 3

Decentralized Attention Fails Centralized Signals: Rethinking Transformers for Medical Time Series 3

Same Words, Different Judgments: How Preferences Vary Across Modalities 3

A Detection-Gated Pipeline for Robust Glottal Area Waveform Extraction and Clinical Pathology Assessment 3

Pluralistic: Trump's fruitless search for a goreable ox (09 May 2026) 3

Notes on using GNU Emacs' Tramp system in an unusual shell environment 3

When and Why SignSGD Outperforms SGD: A Theoretical Study Based on $\ell_1$-norm Lower Bounds 3

Learning Reasoning Rewards from Expert Demonstrations with Inverse Reinforcement Learning 3

Cataract-LMM: Large-Scale Multi-Source Multi-Task Benchmark for Deep Learning in Surgical Video Analysis 3

PulseLM: A Foundation Dataset and Benchmark for PPG-Text Learning 3

From Documents to Spans: Scalable Supervision for Evidence-Based ICD Coding with LLMs 3

asRoBallet: Closing the Sim2Real Gap via Friction-Aware Reinforcement Learning for Underactuated Spherical Dynamics 3

Superlinear Returns 3

How to Do Great Work 3

How to Get New Ideas 3

Eliminate noisy log lines with Adaptive Logs drop rules 3

0.131.0-alpha.4 2

rust-v0.131.0-alpha.3 2

0.131.0-alpha.2 2

Speeding Up AI: Bringing Google Colossus to PyTorch via GCSFS and Rapid Bucket 2