Prepared: 2026-03-18
Author: COO / Claude Code Stack
Audience: Daniel Guterman — Technical Decision-Maker
NVIDIA announced two products at GTC 2026 (March 16) that directly impact our OpenClaw evaluation:
Nemotron — NVIDIA's open-source LLM family. Hybrid Mamba-Transformer architecture with Mixture-of-Experts. Models range from 4B to ~500B parameters, but only activate 1B–50B at inference time, making them fast and efficient. Free, permissive license. Designed for agentic AI workloads at scale.
NemoClaw — An enterprise security wrapper that installs OpenClaw + Nemotron models inside a sandboxed runtime with network isolation, filesystem restrictions, and privacy routing. Apache 2.0 license. Currently alpha.
Together, they transform the OpenClaw equation. Our original evaluation flagged OpenClaw as a CRITICAL security risk. NemoClaw addresses the top concerns (network exposure, filesystem access, credential leakage), and Nemotron provides free local inference, eliminating API costs for routine tasks.
Bottom line: NemoClaw + Nemotron makes OpenClaw deployable in a way raw OpenClaw never was — but it's alpha software with no third-party audits yet, and our core objections (Jack's allergy safety, prompt-based security) still apply.
Nemotron is NVIDIA's family of open foundation models, spanning four generations since 2024. The current generation (Nemotron 3) introduces a breakthrough hybrid Mamba-Transformer Mixture-of-Experts architecture that delivers frontier-class quality at a fraction of the compute cost.
NVIDIA's strategy is clear: give away the models to sell the hardware. But the models are genuinely good, and the licensing is among the most permissive in the industry.
| Model | Total Params | Active Params | Context Window | Target Use Case |
|---|---|---|---|---|
| Nano 4B | 4B | ~1B | 1M tokens | Edge devices, mobile, IoT |
| Nano 30B | 30B | 3B | 1M tokens | Efficient agent tasks, local workstations |
| Super 120B | 120B | 12B | 1M tokens | Multi-agent workflows, complex reasoning |
| Ultra ~500B | ~500B | ~50B | 1M tokens | Frontier reasoning (expected H1 2026) |
The Nemotron 3 architecture combines three paradigms that have individually proven successful:
1. Mamba-2 Layers (Linear-Time Sequence Processing)
2. Transformer Attention Layers (Precise Associative Recall)
3. Mixture-of-Experts (Parameter Efficiency)
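The practical consequence of the MoE paradigm is that only a handful of experts run per token, which is why a 30B-parameter model can infer like a 3B one. The sketch below illustrates top-k routing in pure Python; it is illustrative only, not NVIDIA's implementation, and the expert count, `k`, and per-expert parameter figure are made-up numbers:

```python
import random

NUM_EXPERTS = 128            # hypothetical total expert count
TOP_K = 8                    # experts activated per token
EXPERT_PARAMS = 200_000_000  # hypothetical parameters per expert

def route(token_scores: list[float], k: int = TOP_K) -> list[int]:
    """Pick the k highest-scoring experts for this token."""
    ranked = sorted(range(len(token_scores)),
                    key=lambda i: token_scores[i], reverse=True)
    return ranked[:k]

# One token's (random) router scores — only TOP_K experts fire.
scores = [random.random() for _ in range(NUM_EXPERTS)]
active = route(scores)

total_params = NUM_EXPERTS * EXPERT_PARAMS
active_params = TOP_K * EXPERT_PARAMS
print(f"active experts: {sorted(active)}")
print(f"compute per token: {active_params / total_params:.1%} of expert params")
```

With these toy numbers, only 8 of 128 experts (about 6% of expert parameters) do work per token — the same total-vs-active split the model table above describes.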
Novel Innovation — Latent MoE:
Multi-Token Prediction (MTP):
NVFP4 Native Training:
Phase 1 — Pretraining:
Phase 2 — Supervised Fine-Tuning:
Phase 3 — Multi-Environment Reinforcement Learning:
| Benchmark | Result | What It Measures |
|---|---|---|
| PinchBench | 85.6% (best open model in class) | Agent reasoning and planning |
| AIME 2025 | Leading in size class | Advanced mathematics |
| SWE-Bench Verified | Leading in size class | Real-world software engineering |
| Terminal Bench | Leading in size class | Command-line task completion |
| Throughput | 5x previous Nemotron | Raw inference speed |
| Benchmark | Nemotron 70B | GPT-4o | Claude 3.5 Sonnet |
|---|---|---|---|
| Arena Hard | 85.0 | 79.3 | 79.2 |
| AlpacaEval 2 LC | 57.6 | — | — |
| MT-Bench | 8.98 | — | — |
| Aider (coding) | 55.0% | 72.9% | — |
Honest assessment: Nemotron wins on alignment/chat benchmarks; Claude and GPT-4o still lead on coding and complex reasoning. Nemotron 3 Super is more competitive on coding (SWE-Bench leading in class), but a detailed head-to-head against Claude Opus/Sonnet has not yet been published.
Where Nemotron truly excels: Throughput. When you need many parallel agents doing moderate-complexity tasks, Nemotron's MoE architecture delivers more tokens per second per dollar than any competitor.
| Variant | Purpose |
|---|---|
| Nemotron 3 Omni | Multimodal — audio + vision + language in one model |
| Nemotron 3 VoiceChat | Real-time simultaneous listen-and-respond |
| Nemotron Nano VL 12B | Vision-language for image understanding |
| Nemotron RAG | Retrieval and embedding (leading ViDoRe, MTEB leaderboards) |
| Nemotron Safety | Content moderation and guardrails |
| Nemotron Speech | Automatic speech recognition and text-to-speech |
NVIDIA Open Model License:
| Platform | Access |
|---|---|
| Hugging Face | All models (BF16, FP8 variants) |
| NVIDIA NIM | API via build.nvidia.com |
| Ollama | Nemotron 3 Super for local inference |
| NeMo Framework | Full training and fine-tuning |
| GitHub | Developer assets at NVIDIA-NeMo/Nemotron |
Announced at GTC 2026 — a first-of-its-kind global collaboration:
Members: Black Forest Labs, Cursor, LangChain, Mistral AI, Perplexity, Reflection AI, Sarvam, Thinking Machines Lab
Goal: Collaboratively build the next generation of open frontier models across six families:
Accenture, Cadence, CrowdStrike, Cursor, Deloitte, EY, Oracle Cloud Infrastructure, Palantir, Perplexity, ServiceNow, Siemens, Synopsys, Zoom
NemoClaw is an open-source software stack that wraps OpenClaw with enterprise-grade security, privacy, and isolation controls. It is not a separate agent — it is OpenClaw running inside NVIDIA's security infrastructure.
Jensen Huang, GTC 2026 keynote: "30 years of NVIDIA computing, distilled into an agent platform."
Peter Steinberger (OpenClaw creator, now at OpenAI): "With NVIDIA and the broader ecosystem, we're building the claws and guardrails that let anyone create powerful, secure AI assistants."
```shell
curl -fsSL https://nvidia.com/nemoclaw.sh | bash
```
This installs:
Two-component design:
| Component | Language | Role |
|---|---|---|
| CLI Plugin | TypeScript | Integrates with OpenClaw CLI, user-facing |
| Blueprint | Python | Orchestrates OpenShell resources, manages sandbox |
This is NemoClaw's core value proposition — the direct answer to OpenClaw's CRITICAL security rating.
- `openclaw-sandbox.yaml` policy file (human-readable, version-controlled)
- Filesystem writes limited to `/sandbox` and `/tmp`

| Spec | Minimum | Recommended |
|---|---|---|
| CPU | 4 vCPU | 4+ vCPU |
| RAM | 8 GB | 16 GB |
| Disk | 20 GB | 40 GB |
| OS | Ubuntu 22.04 LTS+ | Ubuntu 22.04 LTS+ |
| Runtime | Node.js 20+, Docker | Node.js 20+, Docker |
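For concreteness, a sandbox policy of the kind described above might look like the following. This is a hypothetical sketch — NemoClaw's actual schema is not published in this report, and every field name here is invented:

```yaml
# openclaw-sandbox.yaml — hypothetical example, NOT the real NemoClaw schema
network:
  mode: whitelist          # no 0.0.0.0 binding; only listed hosts reachable
  allow:
    - api.anthropic.com
    - build.nvidia.com
filesystem:
  writable:
    - /sandbox
    - /tmp
  read_only: true          # everything outside the sandbox
models:
  local: nemotron-3-nano-30b
  cloud_fallback: claude
```

The point of a file like this is that the policy is version-controlled and reviewable, rather than buried in runtime flags.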
Hardware agnostic — does not require NVIDIA GPUs (though optimized for them). Supported on:
| Detail | Value |
|---|---|
| Announced | March 16, 2026 (GTC keynote) |
| License | Apache 2.0 |
| GitHub | github.com/NVIDIA/NemoClaw |
| Stars | ~6.7K (first 2 days) |
| Forks | 739 |
| Contributors | ~26 |
| Status | Alpha — "Expect rough edges" |
| Tech Stack | TypeScript 37.7%, Shell 30.6%, JS 25.7%, Python 4.9% |
NVIDIA's own docs: "Interfaces, APIs, and behavior may change without notice as the design iterates."
Being pursued for NemoClaw integrations: Salesforce, Cisco, Google, Adobe, CrowdStrike, SAP, JFrog (supply chain security)
For full details, see our Deep Research Report.
| Attribute | Detail |
|---|---|
| What | Open-source autonomous AI agent (TypeScript/Node.js) |
| GitHub Stars | 234K+ |
| License | MIT |
| Creator | Peter Steinberger (now at OpenAI) |
| Governance | Moving to open-source foundation |
| Runtime | Long-lived Gateway daemon on port 18789 |
| Messaging | 22+ platforms (WhatsApp, Signal, Telegram, Discord, iMessage, Slack, Teams, etc.) |
| AI Models | 20+ providers (Claude, GPT, Gemini, DeepSeek, Ollama, etc.) |
| Skills | 10,700+ community skills on ClawHub |
| Integrations | 50+ (chat, smart home, music, productivity, browser, cron) |
| Security Rating | CRITICAL — 512 vulns, 8+ critical CVEs, 20% malicious marketplace skills |
┌─────────────────────────────────────────────────────┐
│ USER INTERFACE │
│ WhatsApp Signal Telegram Discord iMessage │
└──────────────────────┬──────────────────────────────┘
│
┌──────────────────────▼──────────────────────────────┐
│ OPENCLAW │
│ Agent Runtime · Skills · Memory · Integrations │
│ (TypeScript, Gateway daemon, port 18789) │
└──────────────────────┬──────────────────────────────┘
│
┌──────────────────────▼──────────────────────────────┐
│ NEMOCLAW │
│ Security Wrapper · Sandbox · Policy Engine │
│ │
│ ┌─────────────┐ ┌──────────────┐ ┌────────────┐ │
│ │ Network │ │ Filesystem │ │ Process │ │
│ │ Isolation │ │ Restrictions │ │ Protection │ │
│ │ (whitelist) │ │ (/sandbox │ │ (OpenShell │ │
│ │ │ │ /tmp only) │ │ K3s) │ │
│ └─────────────┘ └──────────────┘ └────────────┘ │
│ │
│ ┌──────────────────────────────────────────────┐ │
│ │ PRIVACY ROUTER │ │
│ │ Sensitive → Local Non-sensitive → Cloud │ │
│ └──────────────────────────────────────────────┘ │
└──────────────────────┬──────────────────────────────┘
│
┌──────────────┼──────────────┐
▼ ▼ ▼
┌──────────────┐ ┌──────────┐ ┌──────────────┐
│ NEMOTRON │ │ Claude │ │ GPT / etc │
│ (Local LLM) │ │ (Cloud) │ │ (Cloud) │
│ Free, fast │ │ Smart │ │ Optional │
│ Private │ │ Capable │ │ │
└──────────────┘ └──────────┘ └──────────────┘
| Layer | Provides | Without It |
|---|---|---|
| OpenClaw | Agent brain, messaging, skills, integrations, always-on daemon | No agent — just raw model APIs |
| NemoClaw | Security sandbox, network isolation, filesystem lock, privacy routing | OpenClaw runs naked — CRITICAL risk |
| Nemotron | Free local inference, private data stays local, no API costs for routine tasks | Pay per token to cloud providers for everything |
Before NemoClaw: Deploying OpenClaw required accepting CRITICAL security risk. Our evaluation said "do not deploy without full isolation" — which meant building your own sandbox, firewall rules, Docker hardening, and credential isolation manually.
After NemoClaw: NVIDIA built exactly the isolation we specified. One command gets you a sandboxed OpenClaw with:
With Nemotron: Routine queries (scheduling, reminders, simple lookups, home automation) run on free local models. Only complex reasoning (coding, analysis, financial) routes to Claude. This dramatically reduces API costs and keeps private data off external servers.
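How a privacy router might make that decision is sketched below. This is a toy keyword classifier, not NemoClaw's actual logic — the keyword lists and routing targets are our assumptions:

```python
# Toy privacy router: sensitive or routine → local Nemotron; complex → cloud Claude.
# Keyword lists are illustrative assumptions, not NemoClaw's real classifier.
SENSITIVE = {"password", "ssn", "bank", "medical", "address"}
COMPLEX = {"refactor", "analyze", "architecture", "forecast", "debug"}

def route_query(text: str) -> str:
    words = set(text.lower().split())
    if words & SENSITIVE:
        return "local:nemotron"   # private data never leaves the machine
    if words & COMPLEX:
        return "cloud:claude"     # complex reasoning goes to the strongest model
    return "local:nemotron"       # routine default: free and fast

print(route_query("what is my bank balance"))      # local:nemotron
print(route_query("refactor the gateway module"))  # cloud:claude
print(route_query("remind me at 5pm"))             # local:nemotron
```

Note the residual risk flagged later in this report: the whole scheme is only as good as the classification step, which is why credential-leakage risk drops to MEDIUM rather than LOW.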
| Dimension | Our Stack (Claude Code + COO) | NVIDIA Stack (OpenClaw + NemoClaw + Nemotron) |
|---|---|---|
| Runtime | Ephemeral CLI sessions | Always-on daemon (24/7) |
| Interface | Terminal + Discord (limited) | 22+ messaging platforms |
| AI Model | Claude only (Anthropic) | Multi-model (Claude + GPT + Gemini + local Nemotron) |
| Security Model | No daemon = minimal attack surface | 4-layer sandbox (NemoClaw) |
| Privacy | All queries go to Anthropic API | Privacy router — sensitive stays local |
| Cost | Pro plan + API usage | Nemotron free locally; API only for complex tasks |
| Coding | Best-in-class (Claude Code) | Weaker — Nemotron trails Claude on coding |
| Orchestration | C-suite agent hierarchy (COO/CTO/CFO/CISO/CMO) | Flat — single agent with skills |
| Memory | File-based + session persistence | SQLite vector + daily logs + MEMORY.md |
| Smart Home | Home Assistant MCP | Home Assistant (same underlying) |
| Network Mgmt | UniFi MCP (direct UDM Pro control) | No equivalent |
| Financial | Monarch Money MCP (real bank data) | No equivalent |
| Food Safety | Hardcoded allergy rules (rosey-bot) | Prompt-based only — UNACCEPTABLE for Jack |
| Voice | None | Wake word, push-to-talk, TTS |
| Music | None | Spotify, Sonos, Shazam |
| Scheduling | Manual (pending items only) | Cron, scheduled automation |
| Browser | Firecrawl (scraping) | Full Chromium CDP automation |
| Messaging | Discord + Mattermost only | WhatsApp, Signal, Telegram, iMessage, Slack, Teams + 16 more |
NVIDIA stack is stronger for:
1. Family Messaging Hub
2. Always-On Home Automation
3. Proactive Scheduling
4. Voice Interface
5. Music Control
6. Local AI for Private Tasks
Our stack is stronger for:

1. Software Development → Claude Code + developer/reviewer agents
2. Financial Analysis → CFO agent + Monarch Money MCP
3. Network Management → CTO agent + UniFi MCP
4. Security Review → CISO agent reviews before execution
5. Meal Planning → rosey-bot with hardcoded allergy rules (NEVER move to prompt-based)
Run both stacks, each doing what it's best at:
| Task Type | Handled By | Why |
|---|---|---|
| Coding, development | Claude Code + COO | Best-in-class coding, agent hierarchy |
| Finance, budgets | CFO agent + Monarch | Real bank data, structured analysis |
| Network, infrastructure | CTO agent + UniFi | Direct hardware control |
| Security review | CISO agent | Architectural review before execution |
| Meal planning | rosey-bot | Hardcoded allergy safety |
| Family messaging | OpenClaw + NemoClaw | 22+ platforms, always-on |
| Home automation | OpenClaw + HA | Scheduled, always-on |
| Voice, music | OpenClaw | No equivalent in our stack |
| Private/sensitive queries | Nemotron (local) | Never leaves the machine |
| Quick lookups, reminders | OpenClaw + Nemotron | Free, fast, local |
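The division of labor above can be expressed as a simple dispatch table. A sketch of how we might wire it — the task-type keys and handler names are ours, invented for illustration:

```python
# Hypothetical dispatch table mirroring the hybrid-split table above.
STACK = {
    "coding": "claude-code",
    "finance": "cfo-agent",
    "network": "cto-agent",
    "meal_planning": "rosey-bot",       # hardcoded allergy rules — never prompt-based
    "family_messaging": "openclaw",
    "home_automation": "openclaw",
    "private_query": "nemotron-local",
}

def dispatch(task_type: str) -> str:
    # Unknown task types fall back to the local model: cheap and private.
    return STACK.get(task_type, "nemotron-local")

print(dispatch("coding"))         # claude-code
print(dispatch("meal_planning"))  # rosey-bot
```

The fallback choice matters: defaulting unknown work to the local model fails cheap and private rather than expensive and exposed.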
| Original Risk | Rating | NemoClaw Mitigation | Residual Risk |
|---|---|---|---|
| Network exposure (40K+ instances) | CRITICAL | Whitelist-only networking, no 0.0.0.0 binding | LOW — if policy is correctly configured |
| Filesystem access (SSH keys, creds) | CRITICAL | Write-only to /sandbox and /tmp | LOW — host filesystem isolated |
| Credential leakage to external APIs | HIGH | Privacy router, all API calls through OpenShell | MEDIUM — depends on classification accuracy |
| Arbitrary code execution | HIGH | OpenShell K3s container, digest-verified blueprints | LOW — container escape is hard |
| Prompt injection | HIGH | NOT ADDRESSED — still prompt-based security | HIGH — fundamental architectural flaw |
| Malicious marketplace skills | CRITICAL | PARTIALLY ADDRESSED — JFrog partnership for supply chain | MEDIUM — skill vetting still incomplete |
| Data at rest (memory stores PII) | HIGH | Sandbox isolation limits what's stored | MEDIUM — data in /sandbox still unencrypted |
Prompt injection — The fundamental flaw. If a crafted message can hijack the agent's instructions, the sandbox doesn't help because the agent is already authorized to act. NemoClaw limits the blast radius but doesn't prevent the hijack.
Malicious skills — ClawHub still has vetting issues. JFrog partnership is announced but not implemented. Installing community skills remains risky.
Jack's allergy safety — Moving hardcoded allergy rules to prompt-based instructions is STILL unacceptable. A prompt injection could override "never recommend foods containing almonds, sesame, milk, eggs, or peanuts." This is a life-safety issue that sandboxing does not address.
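This is why the allergy rules must stay in code. A hardcoded filter runs after any model output and cannot be talked out of its list. A minimal sketch — the allergen set matches the one quoted above; the function and structure are ours, not rosey-bot's actual implementation:

```python
# Deterministic post-filter: no prompt, injected or otherwise, can change this set.
JACK_ALLERGENS = frozenset({"almonds", "sesame", "milk", "eggs", "peanuts"})

def is_safe_for_jack(ingredients: list[str]) -> bool:
    """Reject any recipe containing a listed allergen. Runs outside the LLM."""
    return not (JACK_ALLERGENS & {i.lower() for i in ingredients})

# A prompt injection can alter what the model *suggests*, but the suggestion
# still has to pass this check before it ever reaches a person.
print(is_safe_for_jack(["rice", "chicken", "broccoli"]))  # True
print(is_safe_for_jack(["flour", "milk", "sugar"]))       # False
```

The sandbox limits blast radius; only a deterministic check like this removes the model from the safety-critical path entirely.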
Alpha software — No third-party security audits. NVIDIA's security claims are design documents, not battle-tested facts.
curl | bash install — The installation method itself is a security anti-pattern. Mitigated by reviewing the script before running, but still concerning.
If deploying NemoClaw + OpenClaw:
| Spec | EQR2 Current | Requirement |
|---|---|---|
| CPU | TBD | 4+ vCPU |
| RAM | TBD | 16 GB recommended |
| GPU | None required | Optional (Nemotron runs on CPU, faster on GPU) |
| Disk | TBD | 40 GB for NemoClaw + models |
| Network | Tailscale | Already configured |
Pros: Separate machine from main infrastructure; Tailscale already set up.
Cons: May not have a GPU for fast Nemotron inference.
Pros: EQR1 has resources; Docker available.
Cons: Shares a host with critical infrastructure; adds attack surface to the primary machine.
Not recommended — isolation is the whole point. Don't put the experiment next to production.
NVIDIA's new personal AI supercomputer. Designed specifically for NemoClaw + Nemotron.
Pros: Purpose-built; maximum Nemotron performance; dedicated hardware.
Cons: $3,000; delivery timeline uncertain; may be overkill for an evaluation.
Spin up a cloud VM (any provider) with:
Pros: Zero hardware commitment; easy to tear down.
Cons: Monthly cost; data leaves our network (partially offset by the privacy router).
| Model | VRAM (FP16) | VRAM (Quantized) | CPU-Only? | Speed |
|---|---|---|---|---|
| Nano 4B | ~8 GB | ~2-4 GB | Yes (slow) | Fast on any GPU |
| Nano 30B (3B active) | ~6 GB active | ~2-3 GB active | Yes (usable) | Good on RTX 3060+ |
| Super 120B (12B active) | ~24 GB active | ~8-12 GB active | Slow | Needs RTX 4090 or better |
For our use case: Nano 30B is the sweet spot. 3B active params, runs on modest hardware, handles routine tasks well. Route complex queries to Claude via API.
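The table's VRAM figures follow from a rule of thumb: active parameters × bytes per parameter. A sketch of the arithmetic — rule of thumb only; real runtimes add KV-cache and framework overhead on top:

```python
def vram_gb(active_params_b: float, bytes_per_param: float) -> float:
    """Rough weight-memory estimate: active params (in billions) × bytes each.
    1B params at 1 byte each is ~1 GB (ignoring cache and overhead)."""
    return active_params_b * bytes_per_param

# Nano 30B has 3B active params:
print(vram_gb(3, 2.0))  # FP16 → ~6 GB, matching the table
print(vram_gb(3, 0.5))  # 4-bit quantized → ~1.5 GB of weights, plus overhead
```

The quantized table entry (~2-3 GB) is the 1.5 GB weight figure plus that overhead, which is why quantization is what puts Nano 30B within reach of modest hardware.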
| Item | Monthly Cost |
|---|---|
| Anthropic Pro Plan | $20/mo |
| API overages (if any) | Variable |
| Total | ~$20/mo |
| Item | Monthly Cost |
|---|---|
| Hardware (if buying DGX Spark) | $3,000 one-time |
| Hardware (if cloud VM) | $20-50/mo |
| Hardware (if existing EQR2) | $0 |
| Nemotron models | Free (open-source) |
| NemoClaw software | Free (Apache 2.0) |
| OpenClaw software | Free (MIT) |
| Claude API for complex routing | Reduced — routine queries go to free Nemotron |
| Total (EQR2 deploy) | ~$0 additional |
| Total (cloud VM) | ~$20-50/mo additional |
With Nemotron handling routine queries locally:
Estimated 60-80% of household queries could run locally on Nemotron, significantly reducing API costs if we move beyond the Pro plan flat rate.
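That savings claim is easy to sanity-check. A sketch with made-up usage numbers — the per-token price and query volume below are placeholders for illustration, not measured data:

```python
def monthly_api_cost(queries: int, tokens_per_query: int,
                     usd_per_mtok: float, local_fraction: float) -> float:
    """Cloud API spend after routing `local_fraction` of queries to free local inference."""
    cloud_queries = queries * (1 - local_fraction)
    return cloud_queries * tokens_per_query * usd_per_mtok / 1_000_000

# Hypothetical: 3,000 household queries/mo, 2,000 tokens each, $10 per 1M tokens.
baseline = monthly_api_cost(3000, 2000, 10.0, 0.0)
with_local = monthly_api_cost(3000, 2000, 10.0, 0.7)  # 70% handled by Nemotron
print(f"${baseline:.2f}/mo -> ${with_local:.2f}/mo")
```

At a 70% local fraction, API spend scales down linearly to 30% of baseline — the mechanism behind the 60-80% estimate above, whatever the actual per-token prices turn out to be.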
Action: Wait and watch. Do not deploy yet.
Action: Test Nemotron locally on EQR1 or EQR2.
Action: Evaluate NemoClaw deployment if:
The ideal end state is a dual-stack architecture:
┌──────────────────────────────────────────────────────┐
│ DANIEL'S AI INFRASTRUCTURE │
│ │
│ ┌─────────────────────┐ ┌───────────────────────┐ │
│ │ CLAUDE CODE + COO │ │ NEMOCLAW + OPENCLAW │ │
│ │ │ │ │ │
│ │ Coding │ │ Family messaging │ │
│ │ Finance │ │ Home automation │ │
│ │ Network mgmt │ │ Voice / music │ │
│ │ Security review │ │ Scheduling / cron │ │
│ │ Project mgmt │ │ Quick lookups │ │
│ │ Complex reasoning │ │ Private queries │ │
│ │ │ │ │ │
│ │ Model: Claude │ │ Models: Nemotron │ │
│ │ Interface: CLI │ │ (local) + Claude │ │
│ │ │ │ (cloud, complex) │ │
│ │ runs on: EQR1 │ │ Interface: WhatsApp │ │
│ │ │ │ Signal, Discord │ │
│ │ │ │ │ │
│ │ │ │ runs on: EQR2 or │ │
│ │ │ │ dedicated hardware │ │
│ └─────────────────────┘ └───────────────────────┘ │
│ │
│ ┌─────────────────────┐ │
│ │ ROSEY-BOT │ ← Allergy safety STAYS │
│ │ Hardcoded rules │ HERE. Never moves. │
│ │ Meal plans │ │
│ │ Discord channels │ │
│ └─────────────────────┘ │
└──────────────────────────────────────────────────────┘
Each component does what it's best at. No single point of failure. Jack's safety rules stay hardcoded. Private data stays local. Complex work uses the best model available.
Prepared by COO / Claude Code Stack — 2026-03-18
For internal use by Daniel Guterman
Published via ts_publish — 2026-03-18 02:32 PM ET