vlsi-moe-yarn is a domain-specialized LLM for VLSI and chip design — RTL generation, timing constraints, verification, and architecture reasoning across a 262K-token context window.
Capabilities
General-purpose LLMs often struggle with VLSI tasks: they lack RTL-specific knowledge, lose coherence on long specifications, and hallucinate timing numbers. vlsi-moe-yarn is trained on a VLSI-domain corpus via knowledge distillation and uses YaRN context extension to hold an entire SoC specification in a single pass.
AgentIC Pipeline
AgentIC is a multi-agent orchestration framework where each specialized agent calls vlsi-moe-yarn as its reasoning engine. The 262K context window carries full design state across the entire pipeline without losing earlier decisions.
All agents call vlsi-moe-yarn through an OpenAI-compatible API.
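The idea of carrying design state through the pipeline can be sketched as a single chat history that every stage appends to, so later agents see all earlier decisions. This is a minimal illustration, not AgentIC's actual implementation: the stage names, the prompts, and the `run_pipeline` and `call_model` names are hypothetical.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]
# Any function that maps a chat history to the model's reply text.
ModelFn = Callable[[List[Message]], str]

# Hypothetical stages; the real AgentIC agent roster is not specified here.
STAGES = [
    ("architect", "Propose a block-level architecture for the specification."),
    ("rtl", "Write SystemVerilog RTL for the architecture above."),
    ("verification", "Write a testbench plan for the RTL above."),
]


def run_pipeline(spec: str, call_model: ModelFn) -> List[Message]:
    """Run each stage in order against one shared, growing chat history."""
    history: List[Message] = [
        {"role": "system", "content": "You are a VLSI design expert."}
    ]
    for index, (name, instruction) in enumerate(STAGES):
        prompt = f"[{name}] {instruction}"
        if index == 0:
            # The specification is sent once and stays in context afterwards.
            prompt = f"Design specification:\n{spec}\n\n{prompt}"
        history.append({"role": "user", "content": prompt})
        reply = call_model(history)
        history.append({"role": "assistant", "content": reply})
    return history
```

With an OpenAI-compatible client, `call_model` could be a small wrapper around `client.chat.completions.create(model="vlsi-moe-yarn", messages=history)` that returns `choices[0].message.content`. Keeping the model call behind a plain function also makes the pipeline easy to exercise with a stub.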
Usage
The model is served with vLLM on AMD Instinct MI300X and exposes OpenAI-compatible endpoints. Any agent framework that already uses the OpenAI client can adopt it by changing only the base URL and model name.
```python
from openai import OpenAI

# Drop-in OpenAI client: just change base_url
client = OpenAI(
    base_url="http://YOUR_SERVER:8000/v1",
    api_key="EMPTY",
)

response = client.chat.completions.create(
    model="vlsi-moe-yarn",
    messages=[
        {"role": "system", "content": "You are a VLSI design expert."},
        {"role": "user", "content": "Generate a parameterized FIFO in SystemVerilog."},
    ],
    max_tokens=1024,
)
print(response.choices[0].message.content)
```
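When prompts approach the 262,144-token limit, it helps to cap `max_tokens` so that prompt and completion together still fit in the window. The helper below is a small sketch of that arithmetic. The `completion_budget` name is hypothetical, and it assumes the caller already knows the prompt's token count, for example from the model's tokenizer.

```python
MAX_CONTEXT = 262_144  # maximum context length listed for vlsi-moe-yarn


def completion_budget(prompt_tokens: int, requested: int,
                      max_context: int = MAX_CONTEXT) -> int:
    """Return a max_tokens value that keeps prompt + completion in the window."""
    if prompt_tokens >= max_context:
        raise ValueError("prompt alone fills the context window")
    # Never ask for more tokens than the window has left after the prompt.
    return min(requested, max_context - prompt_tokens)
```

The result can be passed directly as `max_tokens` in the request above.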
Architecture
| Property | Value |
|---|---|
| Base | Qwen encoder-decoder backbone |
| Modification | 10% of FFN layers → reasoning-optimized FFN blocks |
| Context extension | YaRN (Yet another RoPE extensioN) |
| Max context | 262,144 tokens |
| dtype | bfloat16 · fp8 KV cache |
| Training method | Knowledge distillation on VLSI-domain corpus |
| Inference engine | vLLM v0.17.1 |
| Hardware | AMD Instinct MI300X (192 GB HBM3) |
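For readers unfamiliar with YaRN, the sketch below shows how it rescales RoPE frequencies, following the "NTK-by-parts" interpolation and attention temperature described in the YaRN paper. It is illustrative only and is not taken from this model's code. The original context length (32,768), scale factor (8), head dimension (128), and RoPE base (10,000) are assumptions chosen so that the extended window comes out to 262,144 tokens; the `alpha` and `beta` defaults are the values the paper suggests for LLaMA-family models.

```python
import math
from typing import List


def yarn_inv_freq(head_dim: int, base: float, orig_ctx: int, scale: float,
                  alpha: float = 1.0, beta: float = 32.0) -> List[float]:
    """Blend interpolated and original RoPE frequencies per dimension pair."""
    inv_freq = []
    for i in range(0, head_dim, 2):
        theta = base ** (-i / head_dim)  # original RoPE frequency
        # Full rotations this dimension completes over the original context.
        rotations = orig_ctx * theta / (2 * math.pi)
        # Ramp: 0 means fully interpolate, 1 means leave unchanged.
        gamma = min(1.0, max(0.0, (rotations - alpha) / (beta - alpha)))
        inv_freq.append((1 - gamma) * theta / scale + gamma * theta)
    return inv_freq


def yarn_attention_scale(scale: float) -> float:
    """Factor applied to the rotary embeddings: 0.1 * ln(s) + 1."""
    return 0.1 * math.log(scale) + 1.0


# Assumed values for illustration only.
ORIG_CTX, SCALE = 32_768, 8
freqs = yarn_inv_freq(head_dim=128, base=10_000.0, orig_ctx=ORIG_CTX, scale=SCALE)
```

High-frequency dimensions, which rotate many times within the original context, keep their frequencies. Low-frequency dimensions are divided by the scale factor so that positions beyond the original window stay within the range seen during training.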
Hugging Face
The full model weights, model card, and technical details are hosted on Hugging Face.
Model weights · Model card · Architecture details · License: MIT