Koli Code: Multi-Model AI Coding Assistant CLI
Impact Summary
Built a unified CLI for AI-assisted coding that abstracts away provider differences, letting developers switch seamlessly between OpenAI, Anthropic, Google Gemini, and xAI Grok while keeping tooling consistent through Model Context Protocol (MCP) support.
Role
Creator & Maintainer
Timeline
2025-Present
Scale
- Multi-provider architecture
- Real-time streaming
- Extensible plugin system
Problem
The landscape of AI coding assistants has exploded, but developers face a fragmented ecosystem. Each AI provider—OpenAI, Anthropic, Google, xAI—offers unique strengths, pricing models, and capabilities. Yet most tools lock you into a single provider, making it cumbersome to leverage the best model for a given task or to experiment with alternatives.
Beyond provider lock-in, I noticed that existing AI coding tools were often designed for IDE integration or web interfaces, leaving terminal-centric workflows underserved. As someone who spends significant time in the terminal, I wanted an AI assistant that felt native to that environment—concise output, streaming responses, and the ability to connect to external tools and data sources without leaving my shell.
I also saw an opportunity to embrace the emerging Model Context Protocol (MCP) standard, which promises a unified way to extend AI assistants with external capabilities. Building MCP support from the ground up would future-proof the tool as the ecosystem matures.
Approach
I architected Koli Code around a model abstraction layer that treats all AI providers uniformly. This design allows the CLI to switch between GPT-4, Claude, Gemini, and Grok with a simple configuration change, while maintaining consistent behavior and streaming capabilities across all backends.
The TypeScript codebase is structured for extensibility. The model abstraction handles the nuances of each provider’s API—different authentication patterns, streaming formats, and response structures—while exposing a clean, unified interface to the rest of the application.
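The shape of that abstraction can be sketched roughly as follows. This is an illustrative sketch only; the interface and class names (`ModelProvider`, `EchoProvider`, `collect`) are hypothetical stand-ins, not Koli Code's actual API:

```typescript
// Illustrative sketch of a provider-agnostic model interface.
// Names are hypothetical, not Koli Code's real API surface.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Every provider module conforms to this contract; the rest of the
// CLI only ever talks to ModelProvider, never to a vendor SDK directly.
interface ModelProvider {
  readonly name: string;
  // Responses stream token-by-token as an async iterable of text deltas.
  stream(messages: ChatMessage[]): AsyncIterable<string>;
}

// A stub provider showing the shape of one implementation.
class EchoProvider implements ModelProvider {
  readonly name = "echo";
  async *stream(messages: ChatMessage[]): AsyncIterable<string> {
    const last = messages[messages.length - 1];
    if (!last) return;
    for (const word of last.content.split(" ")) {
      yield word + " ";
    }
  }
}

// Consumers can drain any provider the same way.
async function collect(
  provider: ModelProvider,
  messages: ChatMessage[]
): Promise<string> {
  let out = "";
  for await (const chunk of provider.stream(messages)) out += chunk;
  return out.trimEnd();
}
```

Because the CLI depends only on the interface, swapping GPT-4 for Claude or Gemini is a matter of constructing a different `ModelProvider` implementation.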
Key Design Elements
- Provider Agnosticism: Each AI provider is implemented as a module conforming to a common interface, making it straightforward to add new models as they emerge
- Streaming-First Architecture: All responses stream in real time, which is critical for terminal UX, where waiting for a complete response makes the tool feel unresponsive
- MCP Integration: The tool connects to external resources—databases, APIs, file systems—through the Model Context Protocol, enabling rich contextual interactions beyond simple prompt/response
- Security Guardrails: Built-in rules help prevent generation of potentially malicious code patterns, addressing a real concern with AI-generated code
- Terminal-Optimized Output: Prompts are engineered for concise, actionable responses rather than verbose explanations
Configuration follows the principle of sensible defaults with full customization available via environment variables or .env files, letting developers set their preferred default model while maintaining flexibility to switch on demand.
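The defaults-with-overrides pattern might look like the following sketch. The `KOLI_PROVIDER`/`KOLI_MODEL` variable names and the fallback values are hypothetical, chosen only to illustrate the pattern:

```typescript
// Hypothetical sketch of sensible-defaults configuration.
// KOLI_PROVIDER / KOLI_MODEL and the fallbacks are illustrative names,
// not Koli Code's real configuration keys.
interface KoliConfig {
  provider: string;
  model: string;
}

type Env = Record<string, string | undefined>;

function loadConfig(env: Env): KoliConfig {
  return {
    // Environment (or .env-sourced) values win; otherwise fall back
    // to the built-in defaults.
    provider: env["KOLI_PROVIDER"] ?? "anthropic",
    model: env["KOLI_MODEL"] ?? "default",
  };
}
```

With this shape, `loadConfig(process.env)` gives a working setup out of the box, while a single exported variable switches the active model on demand.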
Outcomes
- Unified multi-model interface that abstracts provider differences, allowing developers to choose the right model for each task without changing tools
- Extensible tooling through MCP that positions the CLI to grow with the ecosystem as more MCP-compatible tools and data sources become available
- Terminal-native experience optimized for developers who prefer command-line workflows over IDE plugins or web interfaces
Key Contributions
- Designed unified model abstraction layer that normalizes API differences across OpenAI, Anthropic, Google Gemini, and xAI Grok
- Implemented real-time streaming with consistent behavior regardless of the underlying provider’s streaming format
- Built MCP protocol support for connecting to external tools, databases, and APIs
- Developed security-first architecture with built-in rules to mitigate risks from AI-generated code
- Created optimized prompt templates specifically tuned for terminal use cases with minimal, actionable output
- Established extensible plugin architecture making it straightforward to add new models and tools as the ecosystem evolves
Key Takeaways
- Unified interface enabling seamless switching between four major AI providers
- Extensible architecture allowing integration with external databases, APIs, and tools via MCP
- Terminal-optimized output for efficient developer workflows
Related Projects
AWS Security Group Mapper: Visual Analysis Tool for Cloud Security
A Python tool for visualizing AWS security group relationships and generating interactive graphs to help understand complex security architectures.
Fighters Paradise: Modern Game Engine Reimplementation in Rust
A modern Rust reimplementation of the MUGEN 2D fighting game engine with full backward compatibility for existing community content.
Agent-Eval: CI Evaluation Harness for Multi-Agent Development
Behavioral regression testing framework for detecting drift in AI agent instruction files across multi-agent development environments.