
Adding Molt Bot (Clawd) to Home Assistant

Connect Any OpenAI-Compatible LLM to Home Assistant Voice

This guide shows how to use a powerful LLM (like Claude, GPT-4, or any OpenAI-compatible API) as your Home Assistant voice assistant brain, replacing local models like Ollama/Llama.

Why?

Smarter responses — Cloud LLMs understand context better than small local models

Fast device control — Proxy handles common commands instantly without LLM roundtrip

Best of both worlds — Quick local responses for home control, powerful LLM for complex questions

Architecture

Wake Word → Whisper STT → Ollama Proxy → Your LLM API → Piper TTS
                              ↓
              (Fast path for device control, weather, time queries)
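Conceptually, the proxy's main job is a small translation: accept an Ollama-style chat body from Home Assistant and forward an OpenAI-style one to your LLM. A minimal sketch of that translation (function name and defaults are illustrative, not from the script below):

```python
def ollama_to_openai(ollama_body, model="gpt-4", system_prompt="You are a voice assistant."):
    """Translate an Ollama /api/chat request body into an OpenAI /v1/chat/completions body."""
    messages = ollama_body.get("messages") or [{}]
    # Home Assistant puts the spoken command in the last message
    user_message = messages[-1].get("content", "")
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_message},
        ],
    }
```

The full script below does this plus a fast path that skips the LLM entirely for common commands.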

Prerequisites

– Home Assistant with voice pipeline set up (Wyoming protocol)

– Whisper (faster-whisper) for speech-to-text

– Piper for text-to-speech  

– OpenWakeWord for wake word detection

– Python 3.10+ on a server (can be same machine as HA or separate)

– An OpenAI-compatible API endpoint (OpenAI, Claude via proxy, local LLM with OpenAI API, etc.)

Step 1: Create the Ollama Proxy

This Python script makes your LLM look like an Ollama server to Home Assistant.

Create `ollama-proxy.py`:

```python
#!/usr/bin/env python3
"""Ollama API Proxy – makes any OpenAI-compatible LLM look like Ollama to Home Assistant."""

import json
import re
import requests
import urllib3
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)
from flask import Flask, request, jsonify

app = Flask(__name__)

# ============== CONFIGURATION ==============

# Your LLM API endpoint (OpenAI-compatible)
LLM_URL = "https://api.openai.com/v1/chat/completions"  # Or your endpoint
LLM_TOKEN = "your-api-key-here"
LLM_MODEL = "gpt-4"  # Or claude-3-opus, etc.

# Home Assistant API (for device control)
HA_URL = "https://homeassistant.local:8123"
HA_TOKEN = "your-long-lived-access-token"

# System prompt for voice responses
SYSTEM_PROMPT = """You are a voice assistant for Home Assistant. Keep responses concise and conversational - this is voice, not text.
Important: Respond in 1-2 sentences max. Be helpful and natural."""

# ============== DEVICE MAPPINGS ==============
# Customize these for YOUR home

LIGHTS = {
    "living room": "light.living_room",
    "kitchen": "light.kitchen",
    "bedroom": "light.bedroom",
    # Add your lights here
}

SWITCHES = {
    "fan": "switch.fan",
    # Add your switches here
}

COVERS = {
    "garage": "cover.garage_door",
    "blinds": "cover.blinds",
}

CLIMATE = {
    "thermostat": "climate.thermostat",
}

COLORS = {
    "white": {"color_temp_kelvin": 4000},
    "warm": {"color_temp_kelvin": 2700},
    "red": {"hs_color": [0, 100]},
    "green": {"hs_color": [120, 100]},
    "blue": {"hs_color": [240, 100]},
}

# ============== HELPER FUNCTIONS ==============

def call_ha_service(domain, service, data):
    """Call a Home Assistant service."""
    try:
        resp = requests.post(
            f"{HA_URL}/api/services/{domain}/{service}",
            json=data,
            headers={"Authorization": f"Bearer {HA_TOKEN}"},
            verify=False,
            timeout=10
        )
        return resp.status_code == 200
    except Exception:
        return False

def get_weather(location):
    """Get weather from wttr.in (no API key needed)."""
    try:
        location = location.strip().replace(" ", "+")
        resp = requests.get(f"https://wttr.in/{location}?format=%C+%t", timeout=5)
        if resp.status_code == 200:
            return f"Weather in {location.replace('+', ' ')}: {resp.text.strip()}"
    except Exception:
        pass
    return None

def find_entity(text, device_map):
    """Find a matching entity from a device map (longest name wins)."""
    text_lower = text.lower()
    for name in sorted(device_map.keys(), key=len, reverse=True):
        if name in text_lower:
            return name, device_map[name]
    return None, None

# ============== FAST PATH HANDLERS ==============

def handle_time(text):
    """Handle time queries instantly."""
    text_lower = text.lower()
    if "what time" in text_lower:
        from datetime import datetime
        return datetime.now().strftime("It's %I:%M %p.")
    if "what day" in text_lower or "what's the date" in text_lower:
        from datetime import datetime
        return datetime.now().strftime("It's %A, %B %d.")
    return None

def handle_weather(text):
    """Handle weather queries."""
    text_lower = text.lower()
    patterns = [
        r"weather\s+(?:in|for|at)\s+(.+)",
        r"what(?:'s| is)\s+(?:the\s+)?weather\s+(?:in|for|at|like in)\s+(.+)",
    ]
    for pattern in patterns:
        match = re.search(pattern, text_lower)
        if match:
            location = match.group(1).strip().rstrip("?.,!")
            return get_weather(location)
    return None

def handle_lights(text):
    """Handle light commands."""
    text_lower = text.lower()
    name, entity = find_entity(text, LIGHTS)
    if not entity:
        return None
    params = {"entity_id": entity}
    # Check for color
    for color_name, color_data in COLORS.items():
        if color_name in text_lower:
            params.update(color_data)
            if call_ha_service("light", "turn_on", params):
                return f"{name.title()} lights set to {color_name}."
    # Check for brightness
    brightness_match = re.search(r"(\d+)\s*%", text_lower)
    if brightness_match:
        params["brightness_pct"] = int(brightness_match.group(1))
        if call_ha_service("light", "turn_on", params):
            return f"{name.title()} lights set to {params['brightness_pct']}%."
    # On/off
    if any(w in text_lower for w in ["turn off", "off"]):
        if call_ha_service("light", "turn_off", params):
            return f"{name.title()} lights off."
    elif any(w in text_lower for w in ["turn on", "on"]):
        if call_ha_service("light", "turn_on", params):
            return f"{name.title()} lights on."
    return None

def handle_thermostat(text):
    """Handle thermostat commands."""
    text_lower = text.lower()
    name, entity = find_entity(text, CLIMATE)
    if not entity:
        return None
    temp_match = re.search(r"(\d+)\s*(?:degrees|°)?", text_lower)
    if temp_match:
        temp = int(temp_match.group(1))
        if call_ha_service("climate", "set_temperature", {"entity_id": entity, "temperature": temp}):
            return f"Thermostat set to {temp} degrees."
    return None

def handle_covers(text):
    """Handle cover commands (garage, blinds, etc.)."""
    text_lower = text.lower()
    name, entity = find_entity(text, COVERS)
    if not entity:
        return None
    params = {"entity_id": entity}
    if "open" in text_lower:
        if call_ha_service("cover", "open_cover", params):
            return f"Opening {name}."
    elif "close" in text_lower:
        if call_ha_service("cover", "close_cover", params):
            return f"Closing {name}."
    return None

# ============== MAIN HANDLER ==============

def handle_command(text):
    """Try fast-path handlers first, then fall back to the LLM."""
    # Try each fast handler
    for handler in [handle_time, handle_weather, handle_lights, handle_thermostat, handle_covers]:
        result = handler(text)
        if result:
            return result
    # Fall back to the LLM
    try:
        resp = requests.post(
            LLM_URL,
            json={
                "model": LLM_MODEL,
                "messages": [
                    {"role": "system", "content": SYSTEM_PROMPT},
                    {"role": "user", "content": text}
                ]
            },
            headers={
                "Authorization": f"Bearer {LLM_TOKEN}",
                "Content-Type": "application/json"
            },
            timeout=120
        )
        if resp.status_code == 200:
            return resp.json()["choices"][0]["message"]["content"]
    except Exception as e:
        print(f"LLM error: {e}")
    return "Sorry, I couldn't process that."

# ============== OLLAMA API ROUTES ==============

@app.route("/api/chat", methods=["POST"])
def chat():
    """Handle Ollama chat requests from Home Assistant."""
    data = request.json
    messages = data.get("messages", [])
    if messages:
        user_message = messages[-1].get("content", "")
        response = handle_command(user_message)
        return jsonify({
            "model": "assistant",
            "created_at": "",
            "message": {"role": "assistant", "content": response},
            "done": True
        })
    return jsonify({"error": "No message provided"}), 400

@app.route("/api/tags", methods=["GET"])
def tags():
    """Return available models (Ollama compatibility)."""
    return jsonify({
        "models": [{
            "name": "assistant:latest",
            "model": "assistant:latest",
            "modified_at": "2024-01-01T00:00:00Z",
            "size": 4661235994,
            "digest": "abc123",
            "details": {
                "format": "gguf",
                "family": "llama",
                "parameter_size": "8B",
                "quantization_level": "Q4_0"
            }
        }]
    })

@app.route("/api/version", methods=["GET"])
def version():
    return jsonify({"version": "0.1.0"})

@app.route("/", methods=["GET"])
def health():
    return "Ollama Proxy OK"

if __name__ == "__main__":
    print("Starting Ollama Proxy on port 11435...")
    app.run(host="0.0.0.0", port=11435)
```

Step 2: Install Dependencies and Run

```bash
# Install dependencies
pip install flask requests

# Run the proxy
python ollama-proxy.py
```

Step 3: Create a Systemd Service (Optional)

Create `/etc/systemd/system/ollama-proxy.service`:

```ini
[Unit]
Description=Ollama LLM Proxy
After=network.target

[Service]
Type=simple
User=your-username
WorkingDirectory=/path/to/script
ExecStart=/usr/bin/python3 /path/to/ollama-proxy.py
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
```

Enable and start:

```bash
sudo systemctl daemon-reload
sudo systemctl enable --now ollama-proxy
```

Step 4: Configure Home Assistant

Add the Proxy as an Ollama Service

1. Go to **Settings → Devices & Services → Add Integration**

2. Search for **Ollama**

3. Enter the URL: `http://YOUR_SERVER_IP:11435`

4. Click Submit

Create a Conversation Agent

1. Go to the new Ollama integration

2. Click Add conversation agent

3. Select model: `assistant:latest`

4. Uncheck “Prefer handling commands locally”

5. Save

Configure Voice Assistant

1. Go to Settings → Voice assistants

2. Edit your assistant (or create new)

3. Set Conversation agent to your new Ollama agent

4. Ensure STT is set to Whisper and TTS to Piper

Point Your Voice Device to the Assistant

For HA Voice devices or Wyoming Satellites:

1. Find the device’s Assistant selector entity

2. Set it to your new voice assistant

Step 5: Test

Say your wake word, then:

– “Turn on the living room lights”

– “What’s the weather in Seattle?”

– “Set the thermostat to 72”

– “What time is it?”

– “Tell me a joke” (goes to LLM)

Customization

Add More Devices

Edit the device mappings in the script:

```python
LIGHTS = {
    "living room": "light.living_room",
    "kitchen": "light.kitchen",
    "garage": "light.garage",
    # Add yours
}
```
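A typo in an entity ID fails silently at runtime, so it can be worth sanity-checking your mappings after editing. A hedged sketch (the mappings here are placeholders, substitute your own):

```python
# Hypothetical mappings for illustration – substitute your own
LIGHTS = {"living room": "light.living_room", "garage": "light.garage"}
SWITCHES = {"fan": "switch.fan"}
COVERS = {"garage": "cover.garage_door"}

def check_domains():
    """Verify every mapped entity ID carries the expected domain prefix."""
    for domain, mapping in [("light", LIGHTS), ("switch", SWITCHES), ("cover", COVERS)]:
        for name, entity_id in mapping.items():
            assert entity_id.startswith(domain + "."), f"'{name}' maps to {entity_id}, expected {domain}.*"
    return True
```

Run it once after each edit; a mismatched prefix (say, a switch listed under LIGHTS) trips the assertion immediately instead of failing silently by voice.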

Add More Fast Handlers

Create new handler functions for device types you use frequently:

```python
def handle_music(text):
    if "play music" in text.lower():
        call_ha_service("media_player", "media_play", {"entity_id": "media_player.speaker"})
        return "Playing music."
    return None
```

Add to the handler list in `handle_command()`.
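The dispatch loop simply returns the first handler that produces a response. A minimal, self-contained sketch of the pattern, with stubs standing in for the real handlers and HA calls:

```python
def handle_music(text):
    # Stub standing in for the real handler, which also calls call_ha_service()
    if "play music" in text.lower():
        return "Playing music."
    return None

def handle_time(text):
    if "what time" in text.lower():
        return "It's 3:00 PM."  # stub; the real handler formats datetime.now()
    return None

def handle_command(text):
    # First handler to return a non-None string wins; otherwise fall back to the LLM
    for handler in [handle_time, handle_music]:
        result = handler(text)
        if result:
            return result
    return "LLM fallback"
```

Because ordering decides ties, put cheap, specific handlers before broad ones.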

Adjust LLM Timeout

If responses are slow, the HA voice pipeline may timeout. Options:

– Increase fast-path coverage for common commands

– Use a faster LLM model

– Adjust HA’s timeout settings (if available)
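One low-effort pattern (a sketch, not part of the original script) is to wrap the LLM call so a timeout degrades into a short spoken apology instead of a dead pipeline:

```python
def ask_llm_with_fallback(call_llm, fallback="Sorry, that's taking too long."):
    """Run the LLM call; on timeout or any other error, return a short spoken fallback."""
    try:
        return call_llm()
    except Exception:
        return fallback
```

In the proxy, `call_llm` would be a small function wrapping the `requests.post` call, ideally with a `timeout=` well under the script's default 120 seconds so the fallback fires before the voice pipeline gives up.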

Troubleshooting

“No such entity” errors

– Check device mappings match your actual HA entity IDs

– Verify HA_TOKEN has permission to control devices

Proxy not responding

– Check firewall allows port 11435

– Verify proxy is running: `curl http://localhost:11435/api/tags`

Voice assistant times out

– Add more fast-path handlers for common queries

– Check LLM API latency

Wake word not detected

– Check OpenWakeWord is running

– Verify wake word model is loaded

– Adjust microphone sensitivity

This approach was developed for https://github.com/clawdbot/clawdbot, a personal AI assistant framework. The proxy pattern works with any OpenAI-compatible API.

Home Automation

I have gone down a rabbit hole with Home Assistant.

AI wake-word speech control, tablets in key locations around the house, even some views for the Tesla’s browser so I can control basic functionality like the garage and cameras from the car.

I have completely integrated AI into the system, using the “Hey Jarvis” wake word and running Whisper, Piper, and OpenWakeWord on a local Ubuntu machine with old hardware. At some point I want to update the hardware: it’s an old Titan XP GPU, so STT, inference, and TTS take a few seconds. It can do lights and basic things, and even responds to colors like “make the lights red,” but if I ask it to turn on the theater or say “Start Theater,” it doesn’t work.

It works about 30% of the time. I consider this a great success. When it works, it’s as fast as Alexa, if not faster, and that’s on an old Titan XP GPU in an i7 Ubuntu machine.

This did take a lot of back and forth with Gemini to get it done. A Lot.

I’m a bit obsessed with my home theater, so I 3D printed some light guards to stop the splash on my screen, plus automations to turn everything on and off, including running a 50-foot ethernet cable to my Epson LS12000 projector just to turn it on and off. Worth it!

Weekend Vibe Coding

This past weekend, I did what every developer dreams about but rarely executes: I went from zero to deployed on not one, but two complete web applications. Armed with Google AI Studio, a Git repository, and Vercel’s deployment pipeline, I turned coffee and curiosity into live websites pointing at my own domains.

Here’s how it went down.

The Setup: Modern Dev Stack in Minutes

The beauty of today’s development ecosystem is how quickly you can go from concept to production. My stack was deliberately simple:

  • Google AI Studio for rapid prototyping and AI integration
  • Git for version control (because we’re not animals)
  • Vercel for deployment and hosting
  • Custom domains to make it official

The entire setup—from initializing repos to configuring DNS—took less time than my morning coffee routine.

Project One: Top-Down Vector Shooter at ijduncan.com

First up was something purely for fun: a browser-based, top-down vector shooter. Think classic arcade aesthetics meets modern web technologies.

The Build

Using Google AI Studio, I scaffolded out the game logic quickly. The vector-based graphics gave it that clean, geometric feel—crisp lines and smooth movement. The physics were surprisingly satisfying to dial in, and within a few hours I had:

  • Smooth player movement and controls
  • Enemy spawn mechanics
  • Collision detection
  • A scoring system
  • Vector-based particle effects for that retro-future vibe

The Deploy

I pushed from Google AI Studio directly to my Git repo and then over to Vercel:

  1. Import Git repository
  2. Configure build settings
  3. Deploy
  4. Point ijduncan.com at the deployment

15 minutes later, I had a playable game running on my own domain.

Project Two: AI Fitness Coach at jeffphillipsfitness.com

The second project was more practical but equally satisfying: a single-page website for my personal trainer friend Jeff, complete with an embedded Gemini-powered chat agent.

The Vision

Jeff needed a web presence that could actually help potential clients. Not just a static brochure site, but something interactive that could answer questions about training, nutrition, and fitness goals 24/7.

The Build

Google AI Studio’s Gemini integration made this almost trivially easy. I built out:

  • A clean, single-page layout focused on Jeff’s services
  • An embedded chat interface (bottom right corner, naturally)
  • A Gemini-powered agent trained on fitness and training knowledge
  • Prompt engineering to keep responses professional, encouraging, and on-brand

The chat agent can discuss:

  • Training methodologies and program design
  • Nutrition basics and macro planning
  • Injury prevention and mobility work
  • Scheduling and service inquiries

I also used gen AI to create many of the images and the video banner. There was no video of Jeff; that was created with image-to-video in Veo.

The Magic of AI Agents

What’s fascinating is how quickly you can create a specialized AI assistant. With the right system prompts, the Gemini model stays in character as a knowledgeable fitness professional, steering conversations toward health and training topics while maintaining Jeff’s approachable, no-BS coaching style.
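For the curious, the persona pattern looks roughly like this. Both the prompt text and the payload field names below are illustrative sketches, not the production prompt or an exact mirror of the Gemini API; check the current Gemini docs for the real request shape:

```python
# Hypothetical system prompt – illustrative only, not the production prompt
SYSTEM_PROMPT = (
    "You are the assistant for Jeff Phillips Fitness. "
    "Answer questions about training, nutrition, and scheduling in an "
    "encouraging, no-BS coaching tone. Politely steer off-topic "
    "conversations back to fitness."
)

def build_request(user_message):
    """Assemble a chat payload pairing the persona prompt with the user's message."""
    return {
        "system_instruction": SYSTEM_PROMPT,
        "contents": [{"role": "user", "parts": [{"text": user_message}]}],
    }
```

The system instruction rides along with every request, which is what keeps the agent in character across an entire conversation.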

Deploy and Point

Same smooth process:

Vercel handled the build and deployment. Just add a Google API key for Gemini and that was it. Point jeffphillipsfitness.com at it, and suddenly Jeff has a live site with an AI receptionist that never sleeps.

Watching It All Come Together

There’s something deeply satisfying about the modern deployment workflow. Push to Git, watch Vercel’s build logs stream by, see the deployment go live, hit your custom domain, and there it is, your thing, live on the internet, accessible to anyone.

No server configuration. No certificate headaches. No deployment scripts to debug. Just code, commit, deploy, done.

What I Learned

1. Vercel’s Developer Experience Is Unmatched

The integration between Git and Vercel is seamless. Every push triggers a new deployment. The production deployment happens automatically on merge to main. DNS configuration is straightforward. It’s how deployment should always feel.

2. AI Studio Accelerates Prototyping

Google AI Studio let me iterate rapidly on both projects. For the shooter, I could quickly test game logic variations. For the fitness site, I could refine the chat agent’s personality and knowledge base without rebuilding infrastructure.

The Weekend Scorecard

Time invested: ~6 hours across both projects
Deployment headaches: Zero
Live sites: Two
Satisfaction level: Maximum

What’s Next

Both sites are live and functional, but they’re far from finished. The shooter needs sound effects, power-ups, and maybe a leaderboard. The fitness site could use a blog section, and deeper integration with Jeff’s actual booking calendar.

But that’s the beauty of this workflow—iteration is easy. Each improvement is just another commit away from being live.

Try Them Out

And if you’re thinking about a weekend build of your own, my advice is simple: pick your stack, start typing, and let the deployment pipeline handle the rest. The modern web makes it easier than ever to go from idea to live site in a weekend.

Freeman AI Work

I took some images from the film I made and used a combo of Gemini’s Nano Banana and Veo 3 to create some new renderings of Bernhard Forcher as Gordon Freeman.

This is an experiment to see if I can create some kind of video LoRA using footage from the film to create new scenes.

I used Kling, which produced the best outputs. Sora looked video-gamey and far too uncanny. Kling was pretty good.

This was generated from Kling using a still.

This was generated from Kling using a still. It gets his profile perfectly right.

I did some AI animation conversion to see what Freeman would look like as an animated series. This got some totally unwarranted major hate on YouTube.

Also, believe it or not, when I asked Nano Banana to create an image of Freeman, it did two versions.

I swear on my children’s lives this is what it delivered.

Polartropica – Music Video

I VFX supervised and finished this music video. First time using the DaVinci cloud database for shared remote workflows, and I have to say it works pretty fraking well.

A hypnotic adventure into Wonderland with a dark and bloodthirsty twist sets the stage for LA indie artist Polartropica’s new music video for What’s Your Fantasy, directed by Vanessa Marzaroli.


The Midnight Cub

Worked on this with Mas FX. Great fun.

Here’s some stuff that didn’t make it in. Generated with Midjourney.

Yeah thanks Midjourney

Not the cats. Really? That was Coco. Cyborg cat.