Production-grade multi-agent orchestration system that decomposes complex tasks and dispatches them to parallel worker agents with sandboxed tool execution.
Complex tasks require decomposition and parallel execution across multiple specialized agents. Existing orchestration solutions lock users into specific vendor ecosystems with per-token pricing, limited tool sandboxing, and no interoperability between agent frameworks.
// SOLUTION
A self-hosted multi-agent orchestrator that uses LLM-driven task decomposition to identify independent sub-tasks, dispatches them to sandboxed workers across a bounded thread pool, and exposes the entire system via a streamable HTTP MCP server — allowing any MCP-compatible client to leverage the orchestration capability. Workers execute with 7 sandboxed tools under configurable path-root security. SQLite-backed task tracking and configuration provide persistence without external databases.
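The dispatch pattern can be sketched in a few lines — a minimal illustration, not the actual implementation: `decompose` and `run_worker` below are stand-ins for the LLM-driven decomposition step and the real sandboxed workers, and `resolve_in_root` shows the spirit of the path-root check rather than its exact rules.

```python
import concurrent.futures
import os

MAX_WORKERS = 4  # bounded pool keeps concurrency (and infrastructure cost) capped


def resolve_in_root(root: str, relative: str) -> str:
    """Resolve a worker-supplied path and refuse anything escaping the root."""
    real_root = os.path.realpath(root)
    candidate = os.path.realpath(os.path.join(real_root, relative))
    if os.path.commonpath([candidate, real_root]) != real_root:
        raise PermissionError(f"path escapes sandbox root: {relative}")
    return candidate


def decompose(task: str) -> list[str]:
    """Stand-in for the LLM-driven decomposition into independent sub-tasks."""
    return [f"{task}: part {i}" for i in range(3)]


def run_worker(subtask: str) -> str:
    """Stand-in for a sandboxed worker agent executing one sub-task."""
    return f"done: {subtask}"


def orchestrate(task: str) -> list[str]:
    """Decompose, fan out across the bounded pool, gather results in order."""
    subtasks = decompose(task)
    with concurrent.futures.ThreadPoolExecutor(max_workers=MAX_WORKERS) as pool:
        return list(pool.map(run_worker, subtasks))
```

`ThreadPoolExecutor(max_workers=…)` is what makes the concurrency bound a hard guarantee: however many sub-tasks decomposition produces, at most `MAX_WORKERS` run at once.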
// OUTCOME
Minimal SaaS dependencies — entirely self-hosted on a single VPS with no per-token vendor markup beyond base LLM API costs. MCP protocol ensures interoperability with any compatible client or agent framework. Bounded thread-pool concurrency prevents runaway infrastructure costs. No external database service required (SQLite). No proprietary orchestration platform fees.
STACK:
DeepSeek API
OpenAI-compatible API
SQLite
Docker
Linux
FastMCP
Python
ThreadPoolExecutor
Nginx
Systemd
HTTP SSE
MCP Protocol
REST API
SSH
Ripgrep
Metro commuters navigate unreliable bus systems where static schedules fail to account for real-time disruptions — traffic congestion, severe weather, and missed connections cascade into missed appointments. Existing transit apps provide point-to-point static routing with no quantified risk assessment or proactive re-planning.
// SOLUTION
A 13-state ride finite state machine with composite route scoring that factors duration, transfers, walking effort, weather conditions, traffic data, and connection-miss probability weighted by rider priority. The system proactively pre-computes optimal trips hourly via background scheduling, integrating GTFS-RT real-time transit data and public NWS/FDOT data sources. A strict architecture limits decision-making to only three modules (Orchestrator, RouteScorer, TripScheduler).
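The composite scoring reduces to a weighted sum over cost and risk factors; the sketch below illustrates the shape of it with made-up weights — a real rider-priority profile would rescale them, and the factor values would come from GTFS-RT, NWS, and FDOT feeds rather than literals.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    duration_min: float
    transfers: int
    walk_min: float
    weather_penalty: float   # 0 (clear) .. 1 (severe), derived from NWS data
    traffic_penalty: float   # 0 .. 1, derived from FDOT 511 data
    miss_probability: float  # estimated chance of missing a connection

# Illustrative weights only; a rider priority profile would rescale these.
WEIGHTS = {"duration": 1.0, "transfers": 8.0, "walk": 1.5,
           "weather": 20.0, "traffic": 15.0, "miss": 40.0}


def score(c: Candidate, w=WEIGHTS) -> float:
    """Lower is better: a weighted composite of cost and risk factors."""
    return (w["duration"] * c.duration_min
            + w["transfers"] * c.transfers
            + w["walk"] * c.walk_min
            + w["weather"] * c.weather_penalty
            + w["traffic"] * c.traffic_penalty
            + w["miss"] * c.miss_probability)


def best(candidates: list[Candidate]) -> Candidate:
    """Pick the lowest-scoring (least costly, least risky) route."""
    return min(candidates, key=score)
```

Weighting miss probability heavily is the point of the design: a nominally faster route with a fragile connection scores worse than a slightly slower but robust one.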
// OUTCOME
Minimal SaaS footprint — relies only on free public APIs (GTFS-RT transit feeds, NWS weather, FDOT 511 traffic). All routing computation is self-hosted — no per-query mapping or routing vendor costs. SQLite for persistence eliminates managed database expenses. Proactive hourly pre-computation means real-time queries are lightweight, keeping infrastructure overhead low.
STACK:
FDOT 511 API
NWS API
SQLite
Docker
Finite State Machine
Java
Python
WorkManager
Real-time Data
Android
Nginx
GTFS-RT
REST API
On-Device AI Assistant Platform
Production · On-Device AI Assistant · Android AI Application
A full-featured Android AI assistant with 46 packages, 40+ tools, on-device model inference, MCP/Skill marketplace, voice interaction, and GUI automation.
AI assistants typically require constant cloud connectivity, incurring latency, privacy exposure, and recurring subscription costs. Users need an offline-capable AI with deep device integration, extensibility, and no mandatory external service dependencies.
// SOLUTION
An Android AI assistant with on-device model inference via MNN and llama.cpp (GGUF models) for fully offline operation, complemented by optional cloud LLM connectivity. Includes 46 functional packages, 40+ integrated tools, a modular MCP/Skill marketplace for extensibility, multi-engine voice interaction with wake-word activation, visual workflow automation, persistent memory system, and dual-channel GUI automation. A complete Linux user-space environment (proot-based) enables local development.
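One way to picture the MCP/Skill marketplace's extensibility model is a tool registry keyed by name, with skills self-registering and dispatched dynamically — a hypothetical sketch, not the app's actual plugin API; the `set_alarm` skill is made up for illustration.

```python
from collections.abc import Callable


class SkillRegistry:
    """Minimal registry: skills self-register and are dispatched by name."""

    def __init__(self):
        self._skills: dict[str, Callable[..., str]] = {}

    def register(self, name: str):
        """Decorator that installs a skill under the given name."""
        def wrap(fn):
            self._skills[name] = fn
            return fn
        return wrap

    def invoke(self, name: str, **kwargs) -> str:
        """Dispatch a call to a registered skill; unknown names fail loudly."""
        if name not in self._skills:
            raise KeyError(f"unknown skill: {name}")
        return self._skills[name](**kwargs)


registry = SkillRegistry()


@registry.register("set_alarm")
def set_alarm(time: str) -> str:
    # A real skill would call into Android APIs; this one just echoes.
    return f"alarm set for {time}"
```

Because skills are looked up by name at call time, new ones can be installed (or community-contributed) without touching the dispatcher — the property the marketplace depends on.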
// OUTCOME
Near-zero infrastructure overhead — core inference runs on-device with no server costs. Cloud APIs are entirely optional and user-configurable. No mandatory SaaS subscriptions. User data remains local by default. The MCP/Skill marketplace allows community-driven extension without platform fees. Single APK distribution with no backend server dependency for core functionality.
STACK:
SQLite
Docker
Ubuntu
proot
Memory System
Workflow Automation
MNN
llama.cpp
C++
Java
Kotlin
Python
AutoGLM
UI Tree
GGUF
Android
WebView
MCP Protocol
REST API
SSH
WebSocket
STT
TTS
Wake Word Detection
Retrieval-Augmented Chatbot
Production · Conversational AI & Knowledge Retrieval · RAG Chatbot with Agent Spawning
A web-based RAG chatbot with vector database semantic retrieval, cloud LLM generation, natural-language-to-script agent spawning, and email-authenticated sessions.
Users need a knowledge-grounded chatbot that can retrieve from personal documents and autonomously spawn task-specific agents, without being locked into a particular vector database vendor, embedding service, or LLM provider.
// SOLUTION
A web-based chatbot combining ChromaDB vector storage with HuggingFace embedding models for semantic document retrieval, LangChain for LLM orchestration, and a cloud inference endpoint for response generation. Features a natural-language agent-spawning command ('/spawn') that generates executable Python scripts. Email-authenticated sessions tie user identity to interactions.
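The retrieval step can be illustrated with a toy in-memory version — the deployed system uses ChromaDB with HuggingFace embeddings, but the ranking logic reduces to nearest-neighbor search over vectors. The 3-dimensional "embeddings" and document names below are invented for the sketch.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy corpus: document name -> tiny made-up embedding vector.
DOCS = {
    "invoice policy": [0.9, 0.1, 0.0],
    "travel guide":   [0.1, 0.8, 0.3],
    "api reference":  [0.0, 0.2, 0.9],
}


def retrieve(query_vec: list[float], k: int = 1) -> list[str]:
    """Return the k documents whose embeddings are closest to the query."""
    ranked = sorted(DOCS, key=lambda d: cosine(query_vec, DOCS[d]), reverse=True)
    return ranked[:k]
```

Swapping the vector store or embedding model leaves this logic untouched — which is exactly the provider-independence claim in the outcome section.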
// OUTCOME
Fully swappable component architecture — vector database, embedding model, and LLM can each be replaced independently without rewriting the application. Self-hosted ChromaDB eliminates vector database SaaS costs. Only the LLM inference API is external, and any provider can be substituted. LangChain abstraction prevents provider lock-in. Single-server deployment with minimal resource footprint.
STACK:
Gmail API
Google API
Groq API
HuggingFace Embeddings
ChromaDB
Vector Database
Linux
Gradio
LangChain
Python
Systemd
OAuth 2.0
Web-Based Multi-Model AI Chat Interface
Production · Conversational AI & Knowledge Retrieval · LLM Frontend Gateway
A Docker-deployed web-based AI chat interface providing a multi-model, multi-conversation frontend with document upload, prompt library, and model switching.
Users managing multiple AI models need a unified chat interface that supports conversation management, document upload, and model switching — without being tied to any single LLM provider's proprietary frontend or per-seat SaaS pricing.
// SOLUTION
A Docker-deployed web-based AI chat interface providing multi-model, multi-conversation management with document upload, prompt library, system prompt customization, and per-model parameter tuning. Serves as a general-purpose gateway to any OpenAI-compatible API endpoint.
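Because the gateway targets any OpenAI-compatible endpoint, switching providers amounts to changing a base URL — the request body keeps the standard chat-completions shape. A sketch of the payload construction (the model name and endpoint path in the comment are placeholders):

```python
import json


def build_chat_request(model: str, messages: list[dict],
                       temperature: float = 0.7) -> dict:
    """Standard OpenAI-format chat payload; any compatible backend accepts it."""
    return {"model": model, "messages": messages, "temperature": temperature}


payload = build_chat_request(
    model="example-model",
    messages=[{"role": "system", "content": "You are helpful."},
              {"role": "user", "content": "Hello"}],
)
body = json.dumps(payload)  # POSTed to <base_url>/v1/chat/completions
```

Per-model parameter tuning in the UI simply maps onto fields like `temperature` here; nothing provider-specific leaks into the stored conversations.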
// OUTCOME
Self-hosted Docker deployment eliminates per-seat SaaS fees. Compatible with any OpenAI-format API — no vendor lock-in. Single-container deployment keeps infrastructure minimal. No external frontend service dependency; all chat data stays on the host server.
STACK:
OpenAI-compatible API
Docker
Linux
JavaScript
Python
Nginx
REST API
WebSocket
AI-Driven Sales Engagement System
Production · Business Automation · Lead Qualification & Follow-Up
An AI-powered business development agent that automates lead qualification, multi-step follow-up sequencing, and CRM-integrated pipeline management.
Automotive sales BDCs (business development centers) spend excessive manual effort on lead qualification, follow-up sequencing, and pipeline management. Existing sales automation platforms charge per-user or per-lead fees that scale poorly.
// SOLUTION
An AI-powered business development agent that automates lead qualification, multi-step follow-up sequencing across email and SMS, and CRM-integrated pipeline tracking. Handles inbound lead qualification and outbound engagement with automated response handling.
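Multi-step sequencing boils down to expanding a cadence into scheduled touches and halting the sequence when the lead replies. The cadence, templates, and channel names below are hypothetical — a sketch of the pattern, not the deployed configuration.

```python
from datetime import datetime, timedelta

# Hypothetical cadence: (delay after lead received, channel, message template).
SEQUENCE = [
    (timedelta(minutes=5), "sms",   "thanks_for_inquiry"),
    (timedelta(days=1),    "email", "vehicle_details"),
    (timedelta(days=3),    "sms",   "schedule_test_drive"),
]


def plan_followups(lead_id: str, received_at: datetime) -> list[dict]:
    """Expand the cadence into concrete scheduled touches for one lead."""
    return [{"lead": lead_id, "send_at": received_at + delay,
             "channel": channel, "template": template}
            for delay, channel, template in SEQUENCE]


def halt_on_reply(touches: list[dict], replied_at: datetime) -> list[dict]:
    """Keep only touches already due before the reply; the rest are cancelled
    so a human can take over the conversation."""
    return [t for t in touches if t["send_at"] <= replied_at]
```

Email and SMS gateways stay interchangeable because the plan records only a channel name; the sender that consumes these records is the only component bound to a provider.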
// OUTCOME
Self-hosted engine with no per-lead or per-user pricing — costs are fixed infrastructure only. CRM integration via standard APIs avoids vendor-specific connectors. Email and SMS gateways are interchangeable. Single-server deployment scales across multiple clients without incremental SaaS costs.
STACK:
Email API
SMS API
SQL
Docker
Linux
Python
CRM Integration
Nginx
Systemd
REST API
Large codebases become increasingly difficult to navigate as they grow — developers waste time searching for relevant modules, understanding architecture, and identifying cross-cutting concerns. External code indexing services require uploading proprietary code to third-party servers.
// SOLUTION
A local SQLite-backed codebase indexing engine that parses repository structures, generates embeddings for semantic search, and produces AI-generated module descriptions covering architecture, known issues, and configuration patterns. Supports intent-based retrieval of relevant code modules without external services.
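The storage-and-search core can be sketched with nothing but the standard library — SQLite holds module summaries and embeddings, and retrieval is a cosine-similarity scan. Schema, column names, and the 2-dimensional toy embeddings are illustrative assumptions, not the engine's actual layout.

```python
import json
import math
import sqlite3


def create_index(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the flat-file index database."""
    db = sqlite3.connect(path)
    db.execute("""CREATE TABLE IF NOT EXISTS modules (
        path TEXT PRIMARY KEY, summary TEXT, embedding TEXT)""")
    return db


def add_module(db, path: str, summary: str, embedding: list[float]) -> None:
    """Store a module's AI-generated summary and its embedding vector."""
    db.execute("INSERT OR REPLACE INTO modules VALUES (?, ?, ?)",
               (path, summary, json.dumps(embedding)))


def search(db, query_vec: list[float], k: int = 1) -> list[tuple[str, str]]:
    """Rank modules by cosine similarity of embeddings to the query."""
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        return dot / (math.sqrt(sum(x * x for x in a))
                      * math.sqrt(sum(y * y for y in b)))

    rows = db.execute("SELECT path, summary, embedding FROM modules").fetchall()
    rows.sort(key=lambda r: cos(query_vec, json.loads(r[2])), reverse=True)
    return [(p, s) for p, s, _ in rows[:k]]
```

A brute-force scan is deliberately adequate here: at repository scale (thousands of modules, not millions of documents), it keeps the engine dependency-free and the database a single flat file.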
// OUTCOME
Zero external SaaS — pure local SQLite storage with no vector database service required. Embeddings generated locally. No proprietary code leaves the host. Flat-file database means zero administration overhead. Queryable via simple SQL or programmatic API — no specialized search infrastructure needed.
STACK:
OpenAI-compatible API
SQLite
Linux
Semantic Search
Python
Embeddings
REST API
A suite of 4 backend microservices providing session persistence, headless browser automation, vector embeddings, and external API gateway integration.
Internal tooling requires session persistence for long-running LLM tasks, headless browser automation, vector embedding generation, and external API gateway functionality. Purchasing these as separate SaaS products would incur significant per-request costs and introduce multiple vendor dependencies.
// SOLUTION
Four self-hosted microservices on a single VPS: (1) Auto-Continue Proxy for LLM session persistence; (2) Browser Automation Service with headless browser control; (3) Embedding Service for vector generation; (4) MCP Integration Gateway connecting to external API ecosystems (email, calendar, drive) via streamable HTTP.
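The Auto-Continue Proxy's core loop can be pictured as: keep re-prompting while the upstream model stops for length, stitching the pieces together. The `call_llm` stub below simulates a truncating upstream so the loop is runnable — it stands in for the real HTTP call, and the chunking numbers are arbitrary.

```python
def call_llm(messages: list[dict]) -> dict:
    """Stub upstream: emits a fixed text in 9-char chunks to simulate truncation."""
    full = "part-one part-two part-three"
    already = "".join(m["content"] for m in messages if m["role"] == "assistant")
    remaining = full[len(already):]
    finish = "stop" if len(remaining) <= 9 else "length"
    return {"content": remaining[:9], "finish_reason": finish}


def auto_continue(prompt: str, max_rounds: int = 10) -> str:
    """Re-prompt until the model finishes naturally, concatenating output."""
    messages = [{"role": "user", "content": prompt}]
    pieces = []
    for _ in range(max_rounds):
        reply = call_llm(messages)
        pieces.append(reply["content"])
        if reply["finish_reason"] == "stop":
            break
        # Feed the partial answer back and ask the model to keep going.
        messages.append({"role": "assistant", "content": reply["content"]})
        messages.append({"role": "user", "content": "continue"})
    return "".join(pieces)
```

The `max_rounds` cap matters in production: it bounds token spend even when an upstream model never emits a natural stop.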
// OUTCOME
Complete self-hosting eliminates per-request pricing from browser automation services, embedding APIs, and API gateway vendors. Single VPS runs all four services — no multi-service cloud sprawl. MCP protocol standardizes integration. OAuth-based API access uses free-tier quotas where available. Each service is independently replaceable without affecting others.
STACK:
Google API
Docker
Linux
Node.js
Python
Embedding Models
Playwright
Nginx
Systemd
HTTP Proxy
MCP Protocol
OAuth 2.0
Browser Automation