Self-hosted voice assistant with mobile app
from eager_eagle@lemmy.world to selfhosted@lemmy.world on 21 Feb 20:44
https://lemmy.world/post/43424949

Any experiences with a self-hosted assistant like the modern Google Assistant? Looking for something LLM-powered that is smarter than older assistants, which would just try to call 3rd-party tools directly and miss or misunderstand requests half the time.

I’d like integration with a mobile app to use it from the phone and while driving. I see Home Assistant has an Android Auto integration. Has anyone used this, or another similar option? Any blatant limitations?

#selfhosted


artyom@piefed.social on 21 Feb 20:59

Multi-billion dollar companies like Google and Apple can’t even figure this shit out, doubt some nerd is gonna do it for free.

eager_eagle@lemmy.world on 21 Feb 21:20

1. They can and do; 2. LLMs can do tool calling just fine, even self-hosted ones.
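For what it's worth, here's a rough sketch of what tool calling against a self-hosted model looks like through Ollama's OpenAI-compatible endpoint; the model tag, port, and the `turn_on_light` tool are made-up placeholders, not anything from a real setup:

```python
# Hedged sketch: tool calling against a local Ollama server via its
# OpenAI-compatible API. The model tag, port, and the "turn_on_light"
# tool are placeholders for illustration only.
import json
from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

tools = [{
    "type": "function",
    "function": {
        "name": "turn_on_light",  # hypothetical tool, not a real service call
        "description": "Turn on a light in a given room",
        "parameters": {
            "type": "object",
            "properties": {"room": {"type": "string"}},
            "required": ["room"],
        },
    },
}]

resp = client.chat.completions.create(
    model="llama3.1:8b",  # any local model that supports tool calling
    messages=[{"role": "user", "content": "Turn on the kitchen light"}],
    tools=tools,
)

# If the model decided to use the tool, the structured call comes back here
# instead of free text, which is what makes this more reliable than the old
# keyword/intent matching approach.
for call in resp.choices[0].message.tool_calls or []:
    print(call.function.name, json.loads(call.function.arguments))
```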
artyom@piefed.social on 21 Feb 21:37

LOL they can’t even reliably turn the lights on, WTF are you talking about?

eager_eagle@lemmy.world on 21 Feb 22:03

maybe the last time you tried it was over 6 months ago, maybe you’re using the old Google Assistant, or idk, but it definitely works for me

artyom@piefed.social on 21 Feb 22:33

Everything I’ve read says Gemini is like 10x worse than Google Assistant.

wildbus8979@sh.itjust.works on 21 Feb 21:04

Home Assistant can absolutely do that. If you are OK with simple intent-based phrasing, it’ll do it out of the box. If you want complex understanding and reasoning, you’ll have to run a local LLM, like Llama, on top of it.
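Not the built-in integration, just a sketch of where the LLM hands off: once it has turned “turn on the kitchen light” into a concrete intent, acting on it is a single call to Home Assistant’s documented REST API. The URL, token, and entity ID below are placeholders for your own setup.

```python
# Sketch only: executing a resolved intent against Home Assistant's REST API.
# HA_URL, HA_TOKEN, and the entity ID are placeholders for your own instance.
import requests

HA_URL = "http://homeassistant.local:8123"
HA_TOKEN = "<long-lived access token>"  # created under your HA user profile

def call_service(domain: str, service: str, entity_id: str) -> None:
    """Call a Home Assistant service such as light.turn_on for one entity."""
    resp = requests.post(
        f"{HA_URL}/api/services/{domain}/{service}",
        headers={"Authorization": f"Bearer {HA_TOKEN}"},
        json={"entity_id": entity_id},
        timeout=10,
    )
    resp.raise_for_status()

# e.g. what the assistant does after understanding "turn on the kitchen light"
call_service("light", "turn_on", "light.kitchen")
```

As far as I know, Home Assistant’s Assist pipeline handles this plumbing for you once entities are exposed to it; the sketch is just to show there’s no magic between the LLM and the lights.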

eager_eagle@lemmy.world on 21 Feb 21:21

yeah, that’s what I’m looking for. Do you know of a way to integrate ollama with HA?

lyralycan@sh.itjust.works on 21 Feb 21:48

I don’t think there’s a straightforward way like a HACS integration yet, but you can access Ollama from the web with open-webui and save the page to your homepage:

[screenshot: https://sh.itjust.works/pictrs/image/ae206b55-421f-4035-ac6a-a51774739fcd.jpeg]

Just be warned, you’ll need a lot of resources depending on which model you choose and its parameter count (4B, 7B, etc.). Gemma3 4B uses around 3GB of storage, 0.5GB of RAM, and 4GB of VRAM to respond; it’s a compromise since I can’t get replacement RAM, and it tends to be wildly inaccurate with large responses. The one I’d rather use, Dolphin-Mixtral 22B, takes 80GB of storage and a minimum of 17GB of RAM, the latter of which I can’t afford to take from my other services.
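If you want to sanity-check whether a model is even usable on your hardware before wiring it into anything, you can hit the local Ollama HTTP API directly; port 11434 is Ollama’s default, and the model tag is just an example:

```python
# Quick check of a local model's output and latency via Ollama's HTTP API.
# Port 11434 is Ollama's default; the model tag is just an example.
import time
import requests

start = time.time()
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "gemma3:4b",
        "prompt": "In one sentence, what does Home Assistant do?",
        "stream": False,  # return one JSON object instead of streamed chunks
    },
    timeout=300,
)
resp.raise_for_status()
print(resp.json()["response"])
print(f"took {time.time() - start:.1f}s")
```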

penguin@lemmy.pixelpassport.studio on 21 Feb 21:10

Home Assistant can do that; the quality will really depend on what hardware you have to run the LLM. If you only have a CPU, you’ll be waiting 20 seconds for a response, which could also be pretty poor if you have to run a small quantized model.

Kirk@startrek.website on 21 Feb 21:16

Maybe things have improved, but the last time I tried the Home Assistant, er, assistant, it was garbage at anything other than the most basic commands given perfectly.