A selection of agentic AI systems, self-hosted infrastructure, and ML engineering work. All production-deployed, all without vendor lock-in.
Agentic AI · Live
SteveBot — AI Digital Twin
A privacy-first AI assistant that answers questions about me, books meetings, and demonstrates production-grade agentic architecture.
Built entirely self-hosted: multi-model orchestration with GLM-4.7 and GLM-5, RAG with PostgreSQL + pgvector, SSE streaming, visitor fingerprinting, feedback loops, and full observability. No OpenAI, no cloud lock-in.
Outcome: Demonstrates end-to-end agentic system design in production with real users
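As a minimal sketch of the SSE streaming mentioned above, the snippet below frames a chunk of model output as a Server-Sent Events message. The function name `formatSseEvent` and the `token` event name are hypothetical and chosen for illustration; SteveBot's actual streaming code is not shown on this page.

```typescript
// Sketch only: frames text as one Server-Sent Events message.
// `formatSseEvent` is a hypothetical helper, not SteveBot's real code.
function formatSseEvent(data: string, event?: string): string {
  const lines: string[] = [];
  // An optional "event:" field lets the client dispatch on event type.
  if (event) lines.push(`event: ${event}`);
  // Each line of the payload needs its own "data:" field.
  for (const line of data.split("\n")) lines.push(`data: ${line}`);
  // A blank line terminates the message.
  return lines.join("\n") + "\n\n";
}
```

A server would write each returned string to a response sent with `Content-Type: text/event-stream`, so the browser can render tokens as they arrive.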
ML Infrastructure · Shipped
Multi-Model LLM Orchestrator
A lightweight orchestration layer that routes queries to the best-fit model based on task type, latency budget, and context size.
Routes structured tasks to GLM-4.7 (fast, precise) and open-ended reasoning to GLM-5 (slower, deeper). Includes parallel execution paths, result merging, and automatic fallback. Deployed in Docker with full request tracing.
Outcome: 40% reduction in average response latency vs. single-model routing
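The routing and fallback behaviour described above could look roughly like the sketch below. The model identifiers, both thresholds, and the `ModelCall` signature are assumptions made for illustration; the orchestrator's real rules are not given here.

```typescript
// Hypothetical model identifiers; real deployment names may differ.
type ModelId = "glm-4.7" | "glm-5";

interface RoutingRequest {
  taskType: "structured" | "open-ended";
  latencyBudgetMs: number; // how long the caller is willing to wait
  contextTokens: number;   // approximate prompt size
}

// Assumed thresholds, picked only to make the example concrete.
const DEEP_MODEL_MIN_BUDGET_MS = 3000;
const FAST_MODEL_MAX_CONTEXT_TOKENS = 32000;

function chooseModel(req: RoutingRequest): ModelId {
  const fitsFastModel = req.contextTokens <= FAST_MODEL_MAX_CONTEXT_TOKENS;
  // Structured tasks go to the fast model when the prompt fits.
  if (req.taskType === "structured" && fitsFastModel) return "glm-4.7";
  // Open-ended tasks with a tight latency budget also stay on the fast model.
  if (req.latencyBudgetMs < DEEP_MODEL_MIN_BUDGET_MS && fitsFastModel) return "glm-4.7";
  return "glm-5";
}

// Caller-supplied function that actually talks to an inference server.
type ModelCall = (model: ModelId, prompt: string) => Promise<string>;

async function completeWithFallback(
  req: RoutingRequest,
  prompt: string,
  call: ModelCall
): Promise<{ model: ModelId; text: string }> {
  const primary = chooseModel(req);
  const fallback: ModelId = primary === "glm-5" ? "glm-4.7" : "glm-5";
  try {
    return { model: primary, text: await call(primary, prompt) };
  } catch {
    // Automatic fallback: retry once on the other model.
    return { model: fallback, text: await call(fallback, prompt) };
  }
}
```

Keeping `chooseModel` a pure function makes the routing rules easy to unit test without a running model.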
AI/ML Engineering · Live
RAG Pipeline with pgvector
A retrieval-augmented generation system that grounds LLM responses in a curated knowledge base using semantic similarity search.
Implements chunking, embedding generation, cosine similarity search, and context injection. The knowledge base is admin-manageable via a web UI. Embeddings are computed locally for full data privacy. Supports hybrid keyword + semantic search.
PostgreSQL · pgvector · OpenAI-compatible embeddings · TypeScript · Next.js API Routes
Outcome: Halved hallucination rate in domain-specific Q&A vs. baseline LLM
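To illustrate the retrieval step, here is a small in-memory sketch of cosine similarity ranking and context injection. In a pgvector deployment the ranking would normally run in SQL (pgvector's `<=>` operator computes cosine distance); the `Chunk` shape, function names, and prompt wording below are assumptions, not the pipeline's actual code.

```typescript
// Hypothetical shape of a stored knowledge-base chunk.
interface Chunk {
  id: string;
  text: string;
  embedding: number[];
}

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Treat zero vectors as having no similarity to anything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k chunks most similar to the query embedding.
function topK(query: number[], chunks: Chunk[], k: number): Chunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((scored) => scored.chunk);
}

// Context injection: prepend the retrieved chunks to the question.
function buildPrompt(question: string, context: Chunk[]): string {
  const ctx = context.map((c, i) => `[${i + 1}] ${c.text}`).join("\n");
  return `Answer using only the context below.\n\nContext:\n${ctx}\n\nQuestion: ${question}`;
}
```

Numbering the injected chunks gives the model something to cite, which makes grounded answers easier to audit.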
Infrastructure · Live
Self-Hosted LLM Stack
A full production stack for running local language models: inference server, reverse proxy, auth, monitoring, and automated backups.
Runs Z.ai-compatible models on-premise behind an Nginx reverse proxy with JWT auth. Includes Umami analytics, Sentry error tracking, automated SQLite backups to local storage, and a health check dashboard. Zero cloud dependencies.
Docker Compose · Nginx · Sentry · Umami · SQLite · GitHub Actions CI
Outcome: Full production workload with 99.9% uptime, zero vendor dependency
DevOps · Shipped
Automated Database Backup System
Scheduled backup automation for SQLite and PostgreSQL databases with retention policies, compression, and health monitoring.
Runs on a cron schedule inside Docker, compresses and timestamps backups, enforces a configurable retention window (default: 7 days), and reports status to the health check endpoint. Supports both local filesystem and remote S3-compatible storage.
Node.js · Docker · Shell scripting · SQLite · PostgreSQL · Health API
Outcome: Automated backups every 6 hours, limiting worst-case data loss to the most recent 6-hour window
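The timestamping and retention logic can be sketched as two small functions. The file naming scheme, the `BackupFile` shape, and the function names are hypothetical; only the 7-day default comes from the description above. A 6-hour cadence corresponds to the standard cron expression `0 */6 * * *`.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical naming scheme: <db>-<UTC timestamp>.gz
function backupName(db: string, at: Date): string {
  // Replace ":" and "." so the name is safe on common filesystems.
  const stamp = at.toISOString().replace(/[:.]/g, "-");
  return `${db}-${stamp}.gz`;
}

interface BackupFile {
  name: string;
  createdAt: Date;
}

// Return the backups that fall outside the retention window.
function selectExpired(
  files: BackupFile[],
  now: Date,
  retentionDays = 7 // default taken from the description above
): BackupFile[] {
  const cutoff = now.getTime() - retentionDays * DAY_MS;
  return files.filter((f) => f.createdAt.getTime() < cutoff);
}
```

Separating selection from deletion means the same rule can drive both local filesystem cleanup and an S3-compatible bucket.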
Frontend Engineering · Live
PWA with Offline Support
Progressive Web App with service worker caching, push notifications, and full offline capability for the AI chat interface.
Implements a Workbox-compatible service worker that caches static assets and provides offline fallbacks. Push notification subscription management via Web Push API with VAPID keys. Installable on iOS and Android home screens.
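One way to picture the caching behaviour is as a per-request strategy decision, sketched below as a pure function. The strategy names, the extension list, and `chooseStrategy` itself are illustrative assumptions; the service worker's actual rules and its Workbox configuration are not shown here.

```typescript
// Hypothetical strategy labels for illustration.
type Strategy =
  | "cache-first"
  | "network-first-with-offline-fallback"
  | "network-only";

// Assumed set of static asset extensions worth caching.
const STATIC_EXTENSIONS = [".js", ".css", ".png", ".svg", ".woff2"];

function chooseStrategy(pathname: string, isNavigation: boolean): Strategy {
  // Static assets rarely change between deploys, so serve them from cache.
  if (STATIC_EXTENSIONS.some((ext) => pathname.endsWith(ext))) {
    return "cache-first";
  }
  // Page loads try the network and fall back to an offline page.
  if (isNavigation) return "network-first-with-offline-fallback";
  // Everything else (for example API calls) always goes to the network.
  return "network-only";
}
```

Inside a service worker, a `fetch` event handler would call a function like this and dispatch to the matching caching logic.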