Cortex is live on GitHub. Repo

Products: Coming Soon

Dopove products are undergoing final staging. Release is coming soon.

Runtime Launch

Local AI at Maximum Velocity

Run massive models on your laptop. Zero cloud overhead, absolute privacy, unmatched throughput.

Current version: 0.24.0 | Free Software

Capabilities

The Core Trinity

Bare-Metal Velocity

Bypasses standard bottlenecks with advanced memory routing and L3 cache maximization.

Air-Gapped Privacy

Your data never leaves your device. Secure, local, and enterprise-ready by default.

Hardware Agnostic

Deeply optimized for NVIDIA, AMD, Apple Silicon, and standard x86 CPUs.

Proven Velocity

Benchmark Metrics

benchmark_matrix.sh
Legacy Inference: 16.79 tok/s
GoingMerry Engine: 0 tok/s

Architecture

Under the Hood.

GoingMerry's speed is not an accident; it's the result of a deliberate, performance-first architecture designed to extract maximum throughput from your hardware.

LAYER 01

Go Orchestrator

Async Scheduler & Memory Broker

LAYER 02

C/C++ Tensor Engine

Hardware Compiler & L3 Cache Control

LAYER 03

CUDA / Apple Metal / AVX

Bare-Metal Local Operations

01 / COMPILER OPTIMIZATIONS

Hardware-Aware Toolchain

Custom C/C++ compiler toolchain implementing branchless hot-path optimization, AVX-512/NEON vectorization on CPUs, and tensor core instructions on GPUs.

02 / CACHE ALIGNMENT

Mastering Memory Latency

Cache-aware runtime that organizes model weights to align with CPU pre-fetching mechanisms, establishing high-bandwidth L3 cache tunnels.
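
One plausible reading of "aligning weights with pre-fetching" is padding each weight row so it starts on a cache-line boundary, which keeps sequential reads from straddling lines. The sketch below only shows the offset arithmetic; it is not GoingMerry's layout code. The function names are hypothetical, and the 64-byte default is an assumption (a common line size on x86 CPUs; other hardware may differ).

```python
def aligned_row_stride(row_elems, elem_bytes, cache_line_bytes=64):
    """Return the byte stride of one weight row, rounded up to a cache-line multiple.

    cache_line_bytes defaults to 64, an assumed value that is typical for x86.
    """
    if row_elems <= 0 or elem_bytes <= 0 or cache_line_bytes <= 0:
        raise ValueError("all sizes must be positive")
    row_bytes = row_elems * elem_bytes
    # Ceiling division without floats: -(-a // b) == ceil(a / b) for positive ints.
    lines_needed = -(-row_bytes // cache_line_bytes)
    return lines_needed * cache_line_bytes


def row_offsets(num_rows, row_elems, elem_bytes, cache_line_bytes=64):
    """Byte offset of each row in a padded buffer; every offset is line-aligned."""
    stride = aligned_row_stride(row_elems, elem_bytes, cache_line_bytes)
    return [r * stride for r in range(num_rows)]


if __name__ == "__main__":
    # Hypothetical example: 3 rows of 100 half-precision (2-byte) weights.
    print(aligned_row_stride(100, 2))
    print(row_offsets(3, 100, 2))
```

The padding costs some memory (here 56 bytes per 200-byte row), which is the usual trade-off for predictable, line-aligned access.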

03 / REAL-TIME METRICS

Predictive Performance

An empirical forecasting model that predicts local tokens-per-second output based on target memory bandwidth and model parameter sizes.
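
For intuition about why memory bandwidth and parameter count are the natural inputs, a common first-order estimate treats token generation as memory-bound: producing one token reads roughly every weight once, so throughput is capped near bandwidth divided by model size in bytes. The sketch below illustrates that rule of thumb only; it is not GoingMerry's forecasting model. The function name, the `efficiency` factor, and the example figures are all assumptions.

```python
def estimate_tokens_per_second(params_billions, bytes_per_param,
                               bandwidth_gb_s, efficiency=0.7):
    """First-order decode throughput estimate for a memory-bound model.

    params_billions: parameter count in billions.
    bytes_per_param: storage per weight (2.0 for FP16, 0.5 for 4-bit).
    bandwidth_gb_s:  sustained memory bandwidth in GB/s.
    efficiency:      assumed fraction of peak bandwidth actually achieved.
    """
    if params_billions <= 0 or bytes_per_param <= 0 or bandwidth_gb_s <= 0:
        raise ValueError("sizes and bandwidth must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    # 1e9 params * bytes per param / 1e9 bytes per GB = model size in GB.
    model_gb = params_billions * bytes_per_param
    return efficiency * bandwidth_gb_s / model_gb


if __name__ == "__main__":
    # Hypothetical: an 8B-parameter model at 4-bit on a 100 GB/s memory bus.
    print(f"{estimate_tokens_per_second(8, 0.5, 100):.1f} tok/s")
```

An empirical model would presumably fit terms like `efficiency` from measurements rather than fixing them by hand, and would need to account for effects this estimate ignores, such as KV-cache reads and compute-bound prompt processing.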

04 / CORE STACK

Go + C++ Hybrid Engine

A polished, high-level Go orchestration layer for API and network logic driving a low-level C++ tensor inference engine compiled for speed.

Local Ecosystem & Models
Llama 4
Gemma 4
DeepSeek-V3
Phi 4
Qwen 3.0
Mistral Large 3
Hermes Agent
Command R7
LLaVA 2
Claude Code
OpenClaw
Continue.dev
Open WebUI
Enchanted
Lobe Chat
Page Assist
Twinny
Obsidian Copilot
LangChain
LlamaIndex
CrewAI
LiteLLM
ChromaDB
Autogen
LangGraph
Dify
Flowise
Pricing Plans

Flexible Licensing for Every Scale

Choose the deployment model that fits your engineering needs. From individual developers to full enterprise clusters.

🐏 Community Edition

Unrestricted local inference runtime. Run models offline on your hardware with absolute performance and zero limits.

Free
  • Free local execution runtime
  • Bare-metal CPU/GPU hardware acceleration
  • Air-gapped security and data privacy
  • Supports Llama 4, Gemma 4, DeepSeek, and more
Download Free
Enterprise Control Plane

🛡️ Enterprise Edition

Deploy, secure, and monitor local models at scale. Built for the CTO to manage AI across thousands of employee nodes or server farms.

Custom Pricing
  • Advanced SSO (Okta, Entra ID) & Granular RBAC
  • Fleet Management: Central Dashboard & Kubernetes Operator
  • Private Registry: Air-gapped self-hosted hub & weight encryption
  • GoingMerry Swarm: Multi-node distributed sharding
  • Advanced Observability & Department Chargeback Analytics
  • 24/7/365 Dedicated SLAs & Commercial Indemnification
Contact Sales

Up and running soon.

We are calibrating local compiler targets and model sharding frameworks. Join the waitlist to get notified the second GoingMerry becomes available for local installation.

Merry! 🐏