Adding Molt Bot (Clawd) to Home Assistant

Connect Any OpenAI-Compatible LLM to Home Assistant Voice

This guide shows how to use a powerful LLM (like Claude, GPT-4, or any OpenAI-compatible API) as your Home Assistant voice assistant brain, replacing local models like Ollama/Llama.

Why?

Smarter responses — Cloud LLMs understand context better than small local models

Fast device control — Proxy handles common commands instantly without LLM roundtrip

Best of both worlds — Quick local responses for home control, powerful LLM for complex questions

Architecture

Wake Word → Whisper STT → Ollama Proxy → Your LLM API → Piper TTS
                              ↓
              (Fast path for device control, weather, time queries)

Prerequisites

– Home Assistant with voice pipeline set up (Wyoming protocol)

– Whisper (faster-whisper) for speech-to-text

– Piper for text-to-speech  

– OpenWakeWord for wake word detection

– Python 3.10+ on a server (can be same machine as HA or separate)

– An OpenAI-compatible API endpoint (OpenAI, Claude via proxy, local LLM with OpenAI API, etc.)

Step 1: Create the Ollama Proxy

This Python script makes your LLM look like an Ollama server to Home Assistant.

Create `ollama-proxy.py`:

```python
#!/usr/bin/env python3
"""Ollama API Proxy - makes any OpenAI-compatible LLM look like Ollama to Home Assistant."""

import json
import re

import requests
import urllib3
from flask import Flask, request, jsonify

urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

app = Flask(__name__)

# ============== CONFIGURATION ==============

# Your LLM API endpoint (OpenAI-compatible)
LLM_URL = "https://api.openai.com/v1/chat/completions"  # Or your endpoint
LLM_TOKEN = "your-api-key-here"
LLM_MODEL = "gpt-4"  # Or claude-3-opus, etc.

# Home Assistant API (for device control)
HA_URL = "https://homeassistant.local:8123"
HA_TOKEN = "your-long-lived-access-token"

# System prompt for voice responses
SYSTEM_PROMPT = """You are a voice assistant for Home Assistant. Keep responses concise and conversational - this is voice, not text.

Important: Respond in 1-2 sentences max. Be helpful and natural."""

# ============== DEVICE MAPPINGS ==============

# Customize these for YOUR home
LIGHTS = {
    "living room": "light.living_room",
    "kitchen": "light.kitchen",
    "bedroom": "light.bedroom",
    # Add your lights here
}

SWITCHES = {
    "fan": "switch.fan",
    # Add your switches here
}

COVERS = {
    "garage": "cover.garage_door",
    "blinds": "cover.blinds",
}

CLIMATE = {
    "thermostat": "climate.thermostat",
}

COLORS = {
    "white": {"color_temp_kelvin": 4000},
    "warm": {"color_temp_kelvin": 2700},
    "red": {"hs_color": [0, 100]},
    "green": {"hs_color": [120, 100]},
    "blue": {"hs_color": [240, 100]},
}

# ============== HELPER FUNCTIONS ==============

def call_ha_service(domain, service, data):
    """Call a Home Assistant service."""
    try:
        resp = requests.post(
            f"{HA_URL}/api/services/{domain}/{service}",
            json=data,
            headers={"Authorization": f"Bearer {HA_TOKEN}"},
            verify=False,
            timeout=10,
        )
        return resp.status_code == 200
    except requests.RequestException:
        return False

def get_weather(location):
    """Get weather from wttr.in (no API key needed)."""
    try:
        location = location.strip().replace(" ", "+")
        resp = requests.get(f"https://wttr.in/{location}?format=%C+%t", timeout=5)
        if resp.status_code == 200:
            return f"Weather in {location.replace('+', ' ')}: {resp.text.strip()}"
    except requests.RequestException:
        pass
    return None

def find_entity(text, device_map):
    """Find a matching entity from a device map, longest name first."""
    text_lower = text.lower()
    for name in sorted(device_map.keys(), key=len, reverse=True):
        if name in text_lower:
            return name, device_map[name]
    return None, None

# ============== FAST PATH HANDLERS ==============

def handle_time(text):
    """Handle time queries instantly."""
    from datetime import datetime
    text_lower = text.lower()
    if "what time" in text_lower:
        return datetime.now().strftime("It's %I:%M %p.")
    if "what day" in text_lower or "what's the date" in text_lower:
        return datetime.now().strftime("It's %A, %B %d.")
    return None

def handle_weather(text):
    """Handle weather queries."""
    text_lower = text.lower()
    patterns = [
        r"weather\s+(?:in|for|at)\s+(.+)",
        r"what(?:'s| is)\s+(?:the\s+)?weather\s+(?:in|for|at|like in)\s+(.+)",
    ]
    for pattern in patterns:
        match = re.search(pattern, text_lower)
        if match:
            location = match.group(1).strip().rstrip("?.,!")
            return get_weather(location)
    return None

def handle_lights(text):
    """Handle light commands."""
    text_lower = text.lower()
    name, entity = find_entity(text, LIGHTS)
    if not entity:
        return None
    params = {"entity_id": entity}

    # Check for color
    for color_name, color_data in COLORS.items():
        if color_name in text_lower:
            params.update(color_data)
            if call_ha_service("light", "turn_on", params):
                return f"{name.title()} lights set to {color_name}."

    # Check for brightness
    brightness_match = re.search(r"(\d+)\s*%", text_lower)
    if brightness_match:
        params["brightness_pct"] = int(brightness_match.group(1))
        if call_ha_service("light", "turn_on", params):
            return f"{name.title()} lights set to {params['brightness_pct']}%."

    # On/off
    if any(w in text_lower for w in ["turn off", "off"]):
        if call_ha_service("light", "turn_off", params):
            return f"{name.title()} lights off."
    elif any(w in text_lower for w in ["turn on", "on"]):
        if call_ha_service("light", "turn_on", params):
            return f"{name.title()} lights on."
    return None

def handle_thermostat(text):
    """Handle thermostat commands."""
    text_lower = text.lower()
    name, entity = find_entity(text, CLIMATE)
    if not entity:
        return None
    temp_match = re.search(r"(\d+)\s*(?:degrees|°)?", text_lower)
    if temp_match:
        temp = int(temp_match.group(1))
        if call_ha_service("climate", "set_temperature", {"entity_id": entity, "temperature": temp}):
            return f"Thermostat set to {temp} degrees."
    return None

def handle_covers(text):
    """Handle cover commands (garage, blinds, etc.)."""
    text_lower = text.lower()
    name, entity = find_entity(text, COVERS)
    if not entity:
        return None
    params = {"entity_id": entity}
    if "open" in text_lower:
        if call_ha_service("cover", "open_cover", params):
            return f"Opening {name}."
    elif "close" in text_lower:
        if call_ha_service("cover", "close_cover", params):
            return f"Closing {name}."
    return None

# ============== MAIN HANDLER ==============

def handle_command(text):
    """Try fast-path handlers first, then fall back to the LLM."""
    # Try each fast handler
    for handler in [handle_time, handle_weather, handle_lights, handle_thermostat, handle_covers]:
        result = handler(text)
        if result:
            return result

    # Fall back to LLM
    try:
        resp = requests.post(
            LLM_URL,
            json={
                "model": LLM_MODEL,
                "messages": [
                    {"role": "system", "content": SYSTEM_PROMPT},
                    {"role": "user", "content": text}
                ]
            },
            headers={
                "Authorization": f"Bearer {LLM_TOKEN}",
                "Content-Type": "application/json"
            },
            timeout=120
        )
        if resp.status_code == 200:
            return resp.json()["choices"][0]["message"]["content"]
    except Exception as e:
        print(f"LLM error: {e}")
    return "Sorry, I couldn't process that."

# ============== OLLAMA API ROUTES ==============

@app.route("/api/chat", methods=["POST"])
def chat():
    """Handle Ollama chat requests from Home Assistant."""
    data = request.json
    messages = data.get("messages", [])
    if messages:
        user_message = messages[-1].get("content", "")
        response = handle_command(user_message)
        return jsonify({
            "model": "assistant",
            "created_at": "",
            "message": {"role": "assistant", "content": response},
            "done": True
        })
    return jsonify({"error": "No message provided"}), 400

@app.route("/api/tags", methods=["GET"])
def tags():
    """Return available models (Ollama compatibility)."""
    return jsonify({
        "models": [{
            "name": "assistant:latest",
            "model": "assistant:latest",
            "modified_at": "2024-01-01T00:00:00Z",
            "size": 4661235994,
            "digest": "abc123",
            "details": {
                "format": "gguf",
                "family": "llama",
                "parameter_size": "8B",
                "quantization_level": "Q4_0"
            }
        }]
    })

@app.route("/api/version", methods=["GET"])
def version():
    return jsonify({"version": "0.1.0"})

@app.route("/", methods=["GET"])
def health():
    return "Ollama Proxy OK"

if __name__ == "__main__":
    print("Starting Ollama Proxy on port 11435...")
    app.run(host="0.0.0.0", port=11435)
```

Step 2: Install Dependencies and Run

```bash
# Install dependencies
pip install flask requests

# Run the proxy
python ollama-proxy.py
```
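Before wiring it into Home Assistant, you can sanity-check the proxy by hand. The sketch below (the localhost URL and function name are assumptions for illustration) builds the same kind of JSON body that Home Assistant's Ollama integration POSTs to `/api/chat`:

```python
import json
import urllib.request

# The chat request body the proxy's /api/chat route expects.
payload = {
    "model": "assistant:latest",
    "messages": [{"role": "user", "content": "what time is it"}],
}

def smoke_test(base_url="http://localhost:11435"):
    """POST one chat request to a running proxy and return the reply text."""
    req = urllib.request.Request(
        f"{base_url}/api/chat",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)["message"]["content"]
```

Call `smoke_test()` while the proxy is up; a time query like this one should return via the fast path without ever touching the LLM.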

Step 3: Create a Systemd Service (Optional)

Create `/etc/systemd/system/ollama-proxy.service`:

```ini
[Unit]
Description=Ollama LLM Proxy
After=network.target

[Service]
Type=simple
User=your-username
WorkingDirectory=/path/to/script
ExecStart=/usr/bin/python3 /path/to/ollama-proxy.py
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
```

Enable and start:

```bash
sudo systemctl daemon-reload
sudo systemctl enable --now ollama-proxy
```

Step 4: Configure Home Assistant

Add the Proxy as an Ollama Service

1. Go to **Settings → Devices & Services → Add Integration**

2. Search for **Ollama**

3. Enter the URL: `http://YOUR_SERVER_IP:11435`

4. Click Submit

Create a Conversation Agent

1. Go to the new Ollama integration

2. Click Add conversation agent

3. Select model: `assistant:latest`

4. Uncheck “Prefer handling commands locally”

5. Save

Configure Voice Assistant

1. Go to Settings → Voice assistants

2. Edit your assistant (or create new)

3. Set Conversation agent to your new Ollama agent

4. Ensure STT is set to Whisper and TTS to Piper

Point Your Voice Device to the Assistant

For HA Voice devices or Wyoming Satellites:

1. Find the device’s Assistant selector entity

2. Set it to your new voice assistant

Step 5: Test

Say your wake word, then:

– “Turn on the living room lights”

– “What’s the weather in Seattle?”

– “Set the thermostat to 72”

– “What time is it?”

– “Tell me a joke” (goes to LLM)

Customization

Add More Devices

Edit the device mappings in the script:

```python
LIGHTS = {
    "living room": "light.living_room",
    "kitchen": "light.kitchen",
    "garage": "light.garage",
    # Add yours
}
```
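Typos in entity IDs are the most common failure here. A small sanity check (the regex is an assumption based on Home Assistant's `domain.object_id` naming convention) can catch them before you restart the proxy:

```python
import re

# Hypothetical mapping to validate; point this at your real LIGHTS/SWITCHES/COVERS dicts.
LIGHTS = {
    "living room": "light.living_room",
    "kitchen": "light.kitchen",
    "garage": "light.garage",
}

# Home Assistant entity IDs follow the domain.object_id pattern.
ENTITY_ID = re.compile(r"^[a-z_]+\.[a-z0-9_]+$")

def bad_entities(device_map):
    """Return the values that do not look like well-formed entity IDs."""
    return [v for v in device_map.values() if not ENTITY_ID.match(v)]

print(bad_entities(LIGHTS))  # [] when every entry is well-formed
```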

Add More Fast Handlers

Create new handler functions for device types you use frequently:

```python
def handle_music(text):
    if "play music" in text.lower():
        call_ha_service("media_player", "media_play", {"entity_id": "media_player.speaker"})
        return "Playing music."
    return None
```

Add to the handler list in `handle_command()`.
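The dispatch pattern is worth seeing in isolation: handlers run in order, the first non-None result wins, and anything unmatched falls through to the LLM. A minimal sketch, with stub handlers standing in for the real ones:

```python
def handle_time(text):
    # Stub: the real handler formats the current time.
    return "It's noon." if "time" in text.lower() else None

def handle_music(text):
    # Stub: the real handler calls a Home Assistant service first.
    return "Playing music." if "play music" in text.lower() else None

def handle_command(text, handlers=(handle_time, handle_music)):
    """First handler to return a string wins; None means 'not my command'."""
    for handler in handlers:
        result = handler(text)
        if result:
            return result
    return "LLM fallback"  # The real script calls the LLM API here

print(handle_command("play music please"))  # Playing music.
print(handle_command("tell me a joke"))     # LLM fallback
```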

Adjust LLM Timeout

If responses are slow, the HA voice pipeline may time out. Options:

– Increase fast-path coverage for common commands

– Use a faster LLM model

– Adjust HA’s timeout settings (if available)
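To see whether the LLM round trip is actually what eats the budget, time it in isolation (a sketch with a stand-in callable; substitute your real `requests.post` call):

```python
import time

def time_call(fn):
    """Return (result, elapsed_seconds) for one call."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start

# Stand-in for the real LLM request.
reply, seconds = time_call(lambda: "It's sunny.")
print(f"LLM round trip: {seconds:.3f}s")
```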

Troubleshooting

“No such entity” errors

– Check device mappings match your actual HA entity IDs

– Verify HA_TOKEN has permission to control devices

Proxy not responding

– Check firewall allows port 11435

– Verify proxy is running: `curl http://localhost:11435/api/tags`

Voice assistant times out

– Add more fast-path handlers for common queries

– Check LLM API latency

Wake word not detected

– Check OpenWakeWord is running

– Verify wake word model is loaded

– Adjust microphone sensitivity

This approach was developed for https://github.com/clawdbot/clawdbot, a personal AI assistant framework. The proxy pattern works with any OpenAI-compatible API.