2025, Solo

MCP core for an LLM assistant

Backend core for an LLM assistant with plugin execution, hot reload, tool chains, and explicit context handoff.

Plugin model: Hot reload
Execution: Context handoff
Surface: CLI + API
Execution core

Plugins can change without restarting the whole assistant.

The useful part is predictable execution, not just model access.

1. Plugin lifecycle separated from the core runtime
2. Hot reload keeps iteration quick
3. Tool chains pass context explicitly
Role

AI Tooling Developer

Stack
Python, FastAPI, MCP, Tool calling, LLM APIs
Problem

When an assistant grows beyond a demo, plugins and tool calls become hard to evolve without breaking the whole runtime.

Solution

Built a FastAPI execution core with plugin lifecycle management, hot reload, cascading tool calls, and a CLI client with execution history.
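
To make the hot-reload idea concrete, here is a minimal sketch using only the standard library. It is not the project's actual code: the `PluginRegistry` class, the `run(**kwargs)` plugin contract, and change detection by content hash are all assumptions made for illustration. The point it shows is that a plugin file can be re-executed into a fresh module object while the core process keeps running.

```python
import hashlib
from pathlib import Path
from types import ModuleType
from typing import Any


class PluginRegistry:
    """Hypothetical registry: loads plugin files and reloads them when they change."""

    def __init__(self, plugin_dir: Path) -> None:
        self.plugin_dir = plugin_dir
        self._modules: dict[str, ModuleType] = {}
        self._digests: dict[str, str] = {}

    def _load(self, path: Path, source: bytes) -> ModuleType:
        # Execute the source into a brand-new module object, so the
        # previous version of the plugin is simply replaced.
        module = ModuleType(f"plugin_{path.stem}")
        module.__file__ = str(path)
        exec(compile(source, str(path), "exec"), module.__dict__)
        return module

    def refresh(self) -> list[str]:
        """Load new or modified plugins; return the names that were (re)loaded."""
        reloaded: list[str] = []
        for path in sorted(self.plugin_dir.glob("*.py")):
            source = path.read_bytes()
            digest = hashlib.sha256(source).hexdigest()
            # Assumption: a content hash decides whether a plugin changed.
            if self._digests.get(path.stem) != digest:
                self._modules[path.stem] = self._load(path, source)
                self._digests[path.stem] = digest
                reloaded.append(path.stem)
        return reloaded

    def call(self, name: str, **kwargs: Any) -> Any:
        # Assumed plugin contract: every plugin module exposes run(**kwargs).
        return self._modules[name].run(**kwargs)
```

In a sketch like this, the core would call `refresh()` on a timer or from a file-watcher callback, and only `call()` touches plugin code, which keeps the plugin lifecycle separate from the runtime.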

Impact

New tools can be added without rewriting the assistant core, and tool execution is easier to inspect.

What I did
  • Implemented plugin hot reload without restarting the core process.
  • Built explicit context handoff between chained tools.
  • Separated runtime responsibilities from plugin responsibilities.
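
Explicit context handoff between chained tools can be sketched as follows. This is an illustration under assumptions, not the project's implementation: `ToolContext`, `run_chain`, and the two example tools are hypothetical names. The idea shown is that each tool receives a context object and returns a new one, so what passes between steps is visible rather than hidden in shared state.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass(frozen=True)
class ToolContext:
    """Hypothetical context object handed from one tool to the next."""

    data: dict[str, Any] = field(default_factory=dict)
    history: tuple[str, ...] = ()

    def extend(self, tool_name: str, **updates: Any) -> "ToolContext":
        # Return a new context instead of mutating the old one, so the
        # input of every step stays available for inspection.
        return ToolContext({**self.data, **updates}, self.history + (tool_name,))


Tool = Callable[[ToolContext], ToolContext]


def run_chain(tools: list[Tool], ctx: ToolContext) -> ToolContext:
    """Run tools in order, passing each one the context the previous returned."""
    for tool in tools:
        ctx = tool(ctx)
    return ctx


# Two illustrative tools (hypothetical): the second depends on the first's output.
def fetch_city(ctx: ToolContext) -> ToolContext:
    return ctx.extend("fetch_city", city=ctx.data["query"].strip().title())


def build_reply(ctx: ToolContext) -> ToolContext:
    return ctx.extend("build_reply", reply=f"Looking up weather for {ctx.data['city']}")


result = run_chain([fetch_city, build_reply], ToolContext({"query": " lisbon "}))
print(result.history)        # which tools ran, in order
print(result.data["reply"])  # value produced by the last tool
```

Recording `history` alongside the data is one simple way to make a chain's execution inspectable after the fact.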
What it shows
  • AI tooling becomes useful when execution behavior is predictable.
  • A clean plugin boundary matters more than the number of models connected.
Related work

More projects

2025-2026

nnzen model catalog

Solo

A live LLM catalog that collects model data, normalizes it, and makes model comparison faster.

Python, FastAPI, LLM APIs, RAG, Vector DB / pgvector
2024-2025

FastAPI backend performance cleanup

Solo / contract-style work

Optimization of a FastAPI API path: async I/O, connection reuse, Redis cache, request validation, metrics, and removal of serial bottlenecks.

Python, FastAPI, asyncio, Pydantic, Redis