1 comment

  • tingjunchen 4 hours ago
    LLMs and agentic AI systems show immense promise for automated software development. However, applying them to hardware-in-the-loop (HIL) embedded and IoT systems is notoriously difficult due to the tight coupling between software logic, timing constraints, and physical hardware behavior. Code that compiles successfully often fails on real devices.

    To bridge this gap, we introduce an open-source, skills-based agentic AI framework for embedded and IoT systems development, along with a comprehensive benchmark, IoT-SkillsBench. Key highlights:

    1. A skills-based agentic framework: a principled approach for injecting structured, domain-specific knowledge into LLM-based agents for reliable embedded and IoT systems development.
    2. IoT-SkillsBench: a comprehensive benchmark designed to evaluate AI agents in real-world embedded programming settings, spanning 3 platforms, 23 peripherals, and 42 tasks across 3 difficulty levels.
    3. 378 hardware-in-the-loop (HIL) experiments: each task is evaluated under three agent configurations (no-skills, LLM-generated skills, and human-expert skills) and validated on real, physical hardware, demonstrating that structured human-expert skills achieve near-perfect success rates without reliance on retrieval or long-context reasoning.
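
    One way to read the 378 figure is as 3 platforms x 42 tasks x 3 agent configurations. The comment does not spell out this breakdown, so the sketch below is only an assumption that every task runs once on every platform under every configuration; the function name is hypothetical.

```python
# Counts stated in the comment above.
PLATFORMS = 3
TASKS = 42
AGENT_CONFIGS = ("no-skills", "LLM-generated skills", "human-expert skills")


def total_experiments(platforms: int, tasks: int, configs: int) -> int:
    """Full-factorial run count.

    Assumption (not confirmed by the comment): each task is executed once
    per platform and per agent configuration.
    """
    return platforms * tasks * configs


total = total_experiments(PLATFORMS, TASKS, len(AGENT_CONFIGS))
print(total)  # 378
```

    Under that assumption the product matches the reported number of HIL experiments; a different split (for example, repeated trials on a subset of platforms) could produce the same total.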