Production-grade multi-agent orchestration system that decomposes complex tasks and dispatches them to parallel worker agents with sandboxed tool execution.
Complex tasks require decomposition and parallel execution across multiple specialized agents. Existing orchestration solutions lock users into specific vendor ecosystems with per-token pricing, limited tool sandboxing, and no interoperability between agent frameworks.
// SOLUTION
A self-hosted multi-agent orchestrator that uses LLM-driven task decomposition to identify independent sub-tasks, dispatches them to sandboxed workers across a bounded thread pool, and exposes the entire system via a streamable HTTP MCP server — allowing any MCP-compatible client to leverage the orchestration capability. Workers execute with 7 sandboxed tools under configurable path-root security. SQLite-backed task tracking and configuration provide persistence without external databases.
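The dispatch pattern can be sketched in a few lines — a minimal illustration, not the actual implementation: `decompose` and `run_worker` below are stand-ins for the LLM-driven decomposition step and the real sandboxed workers, and `resolve_in_root` shows the spirit of the path-root check rather than its exact rules.

```python
import concurrent.futures
import os

MAX_WORKERS = 4  # bounded pool keeps concurrency (and infrastructure cost) capped


def resolve_in_root(root: str, relative: str) -> str:
    """Resolve a worker-supplied path and refuse anything escaping the root."""
    real_root = os.path.realpath(root)
    candidate = os.path.realpath(os.path.join(real_root, relative))
    if os.path.commonpath([candidate, real_root]) != real_root:
        raise PermissionError(f"path escapes sandbox root: {relative}")
    return candidate


def decompose(task: str) -> list[str]:
    """Stand-in for the LLM-driven decomposition into independent sub-tasks."""
    return [f"{task}: part {i}" for i in range(3)]


def run_worker(subtask: str) -> str:
    """Stand-in for a sandboxed worker agent executing one sub-task."""
    return f"done: {subtask}"


def orchestrate(task: str) -> list[str]:
    """Decompose, fan out across the bounded pool, gather results in order."""
    subtasks = decompose(task)
    with concurrent.futures.ThreadPoolExecutor(max_workers=MAX_WORKERS) as pool:
        return list(pool.map(run_worker, subtasks))
```

`ThreadPoolExecutor(max_workers=…)` is what makes the concurrency bound a hard guarantee: however many sub-tasks decomposition produces, at most `MAX_WORKERS` run at once.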
// OUTCOME
Minimal SaaS dependencies — entirely self-hosted on a single VPS with no per-token vendor markup beyond base LLM API costs. MCP protocol ensures interoperability with any compatible client or agent framework. Bounded thread-pool concurrency prevents runaway infrastructure costs. No external database service required (SQLite). No proprietary orchestration platform fees.
STACK:
DeepSeek API
OpenAI-compatible API
SQLite
Docker
Linux
FastMCP
Python
ThreadPoolExecutor
Nginx
Systemd
HTTP SSE
MCP Protocol
REST API
SSH
Ripgrep
Metro commuters navigate unreliable bus systems where static schedules fail to account for real-time disruptions — traffic congestion, severe weather, and missed connections cascade into missed appointments. Existing transit apps provide point-to-point static routing with no quantified risk assessment or proactive re-planning.
// SOLUTION
A 13-state ride finite state machine with composite route scoring that factors duration, transfers, walking effort, weather conditions, traffic data, and connection-miss probability weighted by rider priority. The system proactively pre-computes optimal trips hourly via background scheduling, integrating GTFS-RT real-time transit data and public NWS/FDOT data sources. A strict architecture limits decision-making to only three modules (Orchestrator, RouteScorer, TripScheduler).
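The composite scoring reduces to a weighted sum over cost and risk factors; the sketch below illustrates the shape of it with made-up weights — a real rider-priority profile would rescale them, and the factor values would come from GTFS-RT, NWS, and FDOT feeds rather than literals.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    duration_min: float
    transfers: int
    walk_min: float
    weather_penalty: float   # 0 (clear) .. 1 (severe), derived from NWS data
    traffic_penalty: float   # 0 .. 1, derived from FDOT 511 data
    miss_probability: float  # estimated chance of missing a connection

# Illustrative weights only; a rider priority profile would rescale these.
WEIGHTS = {"duration": 1.0, "transfers": 8.0, "walk": 1.5,
           "weather": 20.0, "traffic": 15.0, "miss": 40.0}


def score(c: Candidate, w=WEIGHTS) -> float:
    """Lower is better: a weighted composite of cost and risk factors."""
    return (w["duration"] * c.duration_min
            + w["transfers"] * c.transfers
            + w["walk"] * c.walk_min
            + w["weather"] * c.weather_penalty
            + w["traffic"] * c.traffic_penalty
            + w["miss"] * c.miss_probability)


def best(candidates: list[Candidate]) -> Candidate:
    """Pick the lowest-scoring (least costly, least risky) route."""
    return min(candidates, key=score)
```

Weighting miss probability heavily is the point of the design: a nominally faster route with a fragile connection scores worse than a slightly slower but robust one.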
// OUTCOME
Minimal SaaS footprint — relies only on free public APIs (GTFS-RT transit feeds, NWS weather, FDOT 511 traffic). All routing computation is self-hosted — no per-query mapping or routing vendor costs. SQLite for persistence eliminates managed database expenses. Proactive hourly pre-computation means real-time queries are lightweight, keeping infrastructure overhead low.
STACK:
FDOT 511 API
NWS API
SQLite
Docker
Finite State Machine
Java
Python
WorkManager
Real-time Data
Android
Nginx
GTFS-RT
REST API
On-Device AI Assistant Platform
Production · On-Device AI Assistant · Android AI Application
A full-featured Android AI assistant with 46 packages, 40+ tools, on-device model inference, MCP/Skill marketplace, voice interaction, and GUI automation.
AI assistants typically require constant cloud connectivity, incurring latency, privacy exposure, and recurring subscription costs. Users need an offline-capable AI with deep device integration, extensibility, and no mandatory external service dependencies.
// SOLUTION
An Android AI assistant with on-device model inference via MNN and llama.cpp (GGUF models) for fully offline operation, complemented by optional cloud LLM connectivity. Includes 46 functional packages, 40+ integrated tools, a modular MCP/Skill marketplace for extensibility, multi-engine voice interaction with wake-word activation, visual workflow automation, persistent memory system, and dual-channel GUI automation. A complete Linux user-space environment (proot-based) enables local development.
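One way to picture the MCP/Skill marketplace's extensibility model is a tool registry keyed by name, with skills self-registering and dispatched dynamically — a hypothetical sketch, not the app's actual plugin API; the `set_alarm` skill is made up for illustration.

```python
from collections.abc import Callable


class SkillRegistry:
    """Minimal registry: skills self-register and are dispatched by name."""

    def __init__(self):
        self._skills: dict[str, Callable[..., str]] = {}

    def register(self, name: str):
        """Decorator that installs a skill under the given name."""
        def wrap(fn):
            self._skills[name] = fn
            return fn
        return wrap

    def invoke(self, name: str, **kwargs) -> str:
        """Dispatch a call to a registered skill; unknown names fail loudly."""
        if name not in self._skills:
            raise KeyError(f"unknown skill: {name}")
        return self._skills[name](**kwargs)


registry = SkillRegistry()


@registry.register("set_alarm")
def set_alarm(time: str) -> str:
    # A real skill would call into Android APIs; this one just echoes.
    return f"alarm set for {time}"
```

Because skills are looked up by name at call time, new ones can be installed (or community-contributed) without touching the dispatcher — the property the marketplace depends on.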
// OUTCOME
Near-zero infrastructure overhead — core inference runs on-device with no server costs. Cloud APIs are entirely optional and user-configurable. No mandatory SaaS subscriptions. User data remains local by default. The MCP/Skill marketplace allows community-driven extension without platform fees. Single APK distribution with no backend server dependency for core functionality.
STACK:
SQLite
Docker
Ubuntu
proot
Memory System
Workflow Automation
MNN
llama.cpp
C++
Java
Kotlin
Python
AutoGLM
UI Tree
GGUF
Android
WebView
MCP Protocol
REST API
SSH
WebSocket
STT
TTS
Wake Word Detection
Retrieval-Augmented Chatbot
Production · Conversational AI & Knowledge Retrieval · RAG Chatbot with Agent Spawning
A web-based RAG chatbot with vector database semantic retrieval, cloud LLM generation, natural-language-to-script agent spawning, and email-authenticated sessions.
Users need a knowledge-grounded chatbot that can retrieve from personal documents and autonomously spawn task-specific agents, without being locked into a particular vector database vendor, embedding service, or LLM provider.
// SOLUTION
A web-based chatbot combining ChromaDB vector storage with HuggingFace embedding models for semantic document retrieval, LangChain for LLM orchestration, and a cloud inference endpoint for response generation. Features a natural-language agent-spawning command ('/spawn') that generates executable Python scripts. Email-authenticated sessions tie user identity to interactions.
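The retrieval step can be illustrated with a toy in-memory version — the deployed system uses ChromaDB with HuggingFace embeddings, but the ranking logic reduces to nearest-neighbor search over vectors. The 3-dimensional "embeddings" and document names below are invented for the sketch.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy corpus: document name -> tiny made-up embedding vector.
DOCS = {
    "invoice policy": [0.9, 0.1, 0.0],
    "travel guide":   [0.1, 0.8, 0.3],
    "api reference":  [0.0, 0.2, 0.9],
}


def retrieve(query_vec: list[float], k: int = 1) -> list[str]:
    """Return the k documents whose embeddings are closest to the query."""
    ranked = sorted(DOCS, key=lambda d: cosine(query_vec, DOCS[d]), reverse=True)
    return ranked[:k]
```

Swapping the vector store or embedding model leaves this logic untouched — which is exactly the provider-independence claim in the outcome section.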
// OUTCOME
Fully swappable component architecture — vector database, embedding model, and LLM can each be replaced independently without rewriting the application. Self-hosted ChromaDB eliminates vector database SaaS costs. Only the LLM inference API is external, and any provider can be substituted. LangChain abstraction prevents provider lock-in. Single-server deployment with minimal resource footprint.
STACK:
Gmail API
Google API
Groq API
HuggingFace Embeddings
ChromaDB
Vector Database
Linux
Gradio
LangChain
Python
Systemd
OAuth 2.0
Web-Based Multi-Model AI Chat Interface
Production · Conversational AI & Knowledge Retrieval · LLM Frontend Gateway
A Docker-deployed web-based AI chat interface providing a multi-model, multi-conversation frontend with document upload, prompt library, and model switching.
Users managing multiple AI models need a unified chat interface that supports conversation management, document upload, and model switching — without being tied to any single LLM provider's proprietary frontend or per-seat SaaS pricing.
// SOLUTION
A Docker-deployed web-based AI chat interface providing multi-model, multi-conversation management with document upload, prompt library, system prompt customization, and per-model parameter tuning. Serves as a general-purpose gateway to any OpenAI-compatible API endpoint.
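Because the gateway targets any OpenAI-compatible endpoint, switching providers amounts to changing a base URL — the request body keeps the standard chat-completions shape. A sketch of the payload construction (the model name and endpoint path in the comment are placeholders):

```python
import json


def build_chat_request(model: str, messages: list[dict],
                       temperature: float = 0.7) -> dict:
    """Standard OpenAI-format chat payload; any compatible backend accepts it."""
    return {"model": model, "messages": messages, "temperature": temperature}


payload = build_chat_request(
    model="example-model",
    messages=[{"role": "system", "content": "You are helpful."},
              {"role": "user", "content": "Hello"}],
)
body = json.dumps(payload)  # POSTed to <base_url>/v1/chat/completions
```

Per-model parameter tuning in the UI simply maps onto fields like `temperature` here; nothing provider-specific leaks into the stored conversations.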
// OUTCOME
Self-hosted Docker deployment eliminates per-seat SaaS fees. Compatible with any OpenAI-format API — no vendor lock-in. Single-container deployment keeps infrastructure minimal. No external frontend service dependency; all chat data stays on the host server.
STACK:
OpenAI-compatible API
Docker
Linux
JavaScript
Python
Nginx
REST API
WebSocket
AI-Driven Sales Engagement System
Production · Business Automation · Lead Qualification & Follow-Up
An AI-powered business development agent that automates lead qualification, multi-step follow-up sequencing, and CRM-integrated pipeline management.
Automotive sales BDCs (business development centers) spend excessive manual effort on lead qualification, follow-up sequencing, and pipeline management. Existing sales automation platforms charge per-user or per-lead fees that scale poorly.
// SOLUTION
An AI-powered business development agent that automates lead qualification, multi-step follow-up sequencing across email and SMS, and CRM-integrated pipeline tracking. Handles inbound lead qualification and outbound engagement with automated response handling.
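Multi-step sequencing boils down to expanding a cadence into scheduled touches and halting the sequence when the lead replies. The cadence, templates, and channel names below are hypothetical — a sketch of the pattern, not the deployed configuration.

```python
from datetime import datetime, timedelta

# Hypothetical cadence: (delay after lead received, channel, message template).
SEQUENCE = [
    (timedelta(minutes=5), "sms",   "thanks_for_inquiry"),
    (timedelta(days=1),    "email", "vehicle_details"),
    (timedelta(days=3),    "sms",   "schedule_test_drive"),
]


def plan_followups(lead_id: str, received_at: datetime) -> list[dict]:
    """Expand the cadence into concrete scheduled touches for one lead."""
    return [{"lead": lead_id, "send_at": received_at + delay,
             "channel": channel, "template": template}
            for delay, channel, template in SEQUENCE]


def halt_on_reply(touches: list[dict], replied_at: datetime) -> list[dict]:
    """Keep only touches already due before the reply; the rest are cancelled
    so a human can take over the conversation."""
    return [t for t in touches if t["send_at"] <= replied_at]
```

Email and SMS gateways stay interchangeable because the plan records only a channel name; the sender that consumes these records is the only component bound to a provider.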
// OUTCOME
Self-hosted engine with no per-lead or per-user pricing — costs are fixed infrastructure only. CRM integration via standard APIs avoids vendor-specific connectors. Email and SMS gateways are interchangeable. Single-server deployment scales across multiple clients without incremental SaaS costs.
STACK:
Email API
SMS API
SQL
Docker
Linux
Python
CRM Integration
Nginx
Systemd
REST API
Large codebases become increasingly difficult to navigate as they grow — developers waste time searching for relevant modules, understanding architecture, and identifying cross-cutting concerns. External code indexing services require uploading proprietary code to third-party servers.
// SOLUTION
A local SQLite-backed codebase indexing engine that parses repository structures, generates embeddings for semantic search, and produces AI-generated module descriptions covering architecture, known issues, and configuration patterns. Supports intent-based retrieval of relevant code modules without external services.
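The storage-and-search core can be sketched with nothing but the standard library — SQLite holds module summaries and embeddings, and retrieval is a cosine-similarity scan. Schema, column names, and the 2-dimensional toy embeddings are illustrative assumptions, not the engine's actual layout.

```python
import json
import math
import sqlite3


def create_index(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the flat-file index database."""
    db = sqlite3.connect(path)
    db.execute("""CREATE TABLE IF NOT EXISTS modules (
        path TEXT PRIMARY KEY, summary TEXT, embedding TEXT)""")
    return db


def add_module(db, path: str, summary: str, embedding: list[float]) -> None:
    """Store a module's AI-generated summary and its embedding vector."""
    db.execute("INSERT OR REPLACE INTO modules VALUES (?, ?, ?)",
               (path, summary, json.dumps(embedding)))


def search(db, query_vec: list[float], k: int = 1) -> list[tuple[str, str]]:
    """Rank modules by cosine similarity of embeddings to the query."""
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        return dot / (math.sqrt(sum(x * x for x in a))
                      * math.sqrt(sum(y * y for y in b)))

    rows = db.execute("SELECT path, summary, embedding FROM modules").fetchall()
    rows.sort(key=lambda r: cos(query_vec, json.loads(r[2])), reverse=True)
    return [(p, s) for p, s, _ in rows[:k]]
```

A brute-force scan is deliberately adequate here: at repository scale (thousands of modules, not millions of documents), it keeps the engine dependency-free and the database a single flat file.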
// OUTCOME
Zero external SaaS — pure local SQLite storage with no vector database service required. Embeddings generated locally. No proprietary code leaves the host. Flat-file database means zero administration overhead. Queryable via simple SQL or programmatic API — no specialized search infrastructure needed.
STACK:
OpenAI-compatible API
SQLite
Linux
Semantic Search
Python
Embeddings
REST API
A suite of 4 backend microservices providing session persistence, headless browser automation, vector embeddings, and external API gateway integration.
Internal tooling requires session persistence for long-running LLM tasks, headless browser automation, vector embedding generation, and external API gateway functionality. Purchasing these as separate SaaS products would incur significant per-request costs and introduce multiple vendor dependencies.
// SOLUTION
Four self-hosted microservices on a single VPS: (1) Auto-Continue Proxy for LLM session persistence; (2) Browser Automation Service with headless browser control; (3) Embedding Service for vector generation; (4) MCP Integration Gateway connecting to external API ecosystems (email, calendar, drive) via streamable HTTP.
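The Auto-Continue Proxy's core loop can be pictured as: keep re-prompting while the upstream model stops for length, stitching the pieces together. The `call_llm` stub below simulates a truncating upstream so the loop is runnable — it stands in for the real HTTP call, and the chunking numbers are arbitrary.

```python
def call_llm(messages: list[dict]) -> dict:
    """Stub upstream: emits a fixed text in 9-char chunks to simulate truncation."""
    full = "part-one part-two part-three"
    already = "".join(m["content"] for m in messages if m["role"] == "assistant")
    remaining = full[len(already):]
    finish = "stop" if len(remaining) <= 9 else "length"
    return {"content": remaining[:9], "finish_reason": finish}


def auto_continue(prompt: str, max_rounds: int = 10) -> str:
    """Re-prompt until the model finishes naturally, concatenating output."""
    messages = [{"role": "user", "content": prompt}]
    pieces = []
    for _ in range(max_rounds):
        reply = call_llm(messages)
        pieces.append(reply["content"])
        if reply["finish_reason"] == "stop":
            break
        # Feed the partial answer back and ask the model to keep going.
        messages.append({"role": "assistant", "content": reply["content"]})
        messages.append({"role": "user", "content": "continue"})
    return "".join(pieces)
```

The `max_rounds` cap matters in production: it bounds token spend even when an upstream model never emits a natural stop.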
// OUTCOME
Complete self-hosting eliminates per-request pricing from browser automation services, embedding APIs, and API gateway vendors. Single VPS runs all four services — no multi-service cloud sprawl. MCP protocol standardizes integration. OAuth-based API access uses free-tier quotas where available. Each service is independently replaceable without affecting others.
STACK:
Google API
Docker
Linux
Node.js
Python
Embedding Models
Playwright
Nginx
Systemd
HTTP Proxy
MCP Protocol
OAuth 2.0
Browser Automation