Do you guys think there will ever be a FOSS voice assistant?

milkytoast@kbin.social to Free and Open Source Software@beehaw.org – 62 points –

There's Dicio, which honestly does all that I need a voice assistant to do, but I have to open the app to use it; I can't just say "Hey Dicio" or whatever. Is something like that possible?

Home Assistant invested quite a bit into the technology to create a FOSS voice assistant over the past year. It still needs quite a bit of work, but the foundation is there; it supports wake words ("Hey ..."), speech-to-text to hear your command, interpretation and command processing, and text-to-speech to return results.
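
To make those four stages concrete, here's a rough Python sketch of how they chain together. Everything in it is made up for illustration (the wake phrase, the function names, the two commands), and plain text stands in for real audio, so this is not Home Assistant's API, just the shape of the pipeline:

```python
from dataclasses import dataclass
from typing import Optional

WAKE_WORD = "hey assistant"  # hypothetical wake phrase


@dataclass
class Intent:
    name: str
    target: Optional[str] = None


def detect_wake_word(audio: str) -> Optional[str]:
    """Return whatever follows the wake word, or None if it was not heard."""
    text = audio.lower().strip()
    if text.startswith(WAKE_WORD):
        return text[len(WAKE_WORD):].strip(" ,")
    return None


def speech_to_text(audio: str) -> str:
    """Stand-in for a real speech-to-text engine; the 'audio' is already text."""
    return audio


def interpret(command: str) -> Intent:
    """Map a transcribed command onto a tiny, hard-coded command library."""
    for prefix, name in (("turn on ", "turn_on"), ("turn off ", "turn_off")):
        if command.startswith(prefix):
            return Intent(name, command[len(prefix):])
    return Intent("unknown")


def run_command(intent: Intent) -> str:
    """Pretend to execute the intent and produce a reply sentence."""
    if intent.name == "turn_on":
        return f"Turned on {intent.target}."
    if intent.name == "turn_off":
        return f"Turned off {intent.target}."
    return "Sorry, I don't know that command."


def text_to_speech(reply: str) -> str:
    """Stand-in for a real text-to-speech engine; just returns the text."""
    return reply


def handle(audio: str) -> Optional[str]:
    """Wake word -> speech-to-text -> interpretation -> text-to-speech."""
    command_audio = detect_wake_word(audio)
    if command_audio is None:
        return None  # no wake word, so the assistant stays silent
    intent = interpret(speech_to_text(command_audio))
    return text_to_speech(run_command(intent))
```

Calling `handle("Hey assistant, turn on the kitchen light")` walks through all four stages, while input without the wake phrase is ignored. The small `interpret` table is also a fair picture of what a "fairly small command library" means in practice.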

The downsides are that it's still quite technical to set up, primarily due to the lack of commercially available hardware, and that the command library is fairly small at this point.

With some of this foundational work out of the way, I expect Home Assistant to move forward quickly to improve, and other projects can work off the same pieces if they desire to as well.

Here's their year-end post about it: https://www.home-assistant.io/blog/2023/12/13/year-of-the-voice-chapter-5/

Should have clarified: I'm not looking for a home assistant, I'm looking for a voice assistant on my phone. Either way, super excited to see where they take this.

I don't see how being Home Assistant excludes it from working on your phone. The only difference is that your phone acts as the "satellite" rather than a stationary device.

I have been trying to get Home Assistant's voice assistant to work in my Kubernetes cluster. The documentation is nearly nonexistent for configuring it without using their dedicated core OS version with the add-on store.

The second issue is the $13 ESPHome voice assistant setup they have: the integration requires a UDP port for every audio stream. Home Assistant currently picks a random UDP port, which sucks for Kubernetes; with Docker you have to use host networking mode. Someone made a patch that allows you to specify your own range, but I haven't gotten it working with the patch yet. It looks like there may be an issue with the ESPHome device not using the correct channel for the microphone, so nothing is being recorded.
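
For anyone wondering why a fixed range matters: with a random port there is nothing you can list ahead of time in a Kubernetes Service or a Docker port mapping, whereas a known range can be exposed up front. Here's a rough Python sketch of the general idea. To be clear, this is not Home Assistant's code or the patch mentioned above; the function name and the port numbers are just assumptions for illustration:

```python
import socket


def bind_udp_in_range(first_port: int, last_port: int,
                      host: str = "127.0.0.1") -> socket.socket:
    """Bind a UDP socket to the first free port in [first_port, last_port].

    Hypothetical helper: the caller decides the range, so the same range
    can be exposed in the container or cluster configuration.
    """
    for port in range(first_port, last_port + 1):
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        try:
            sock.bind((host, port))
            return sock
        except OSError:
            sock.close()  # port already in use, try the next one
    raise OSError(f"no free UDP port between {first_port} and {last_port}")
```

Each audio stream would get its own socket from the range, e.g. `bind_udp_in_range(47000, 47020)`, and inside a container you would likely bind to `0.0.0.0` rather than the loopback default used here.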

I don't know if they fit all the prerequisites of FOSS, but there's Mycroft and there's also Jasper.

But I have no idea how advanced they are, or how good their 3rd party integrations are.

Mycroft is defunct

Source?

Probably for the best. They'd been spinning their wheels while sucking most of the oxygen out of the room for several years now. Time for somebody else to give it a go.

Home Assistant is getting into voice assistants. I'm considering getting a few to try it out.

I don't know Dicio, but I mean can you just leave the app open? Because that's essentially what the other assistants are, just devices with the app always open.

If you can leave the app open, and it otherwise complies with your requirements, then we already have a FOSS voice assistant, it just doesn't have its own dedicated hardware yet. But if you would dedicate some hardware to it, like an old phone, then it could be largely equivalent.

There could be a software implementation that works perfectly fine on desktop PCs, especially Linux, but the problem is hardware. I don't see commercial smartphone manufacturers giving access for the kind of 'unauthorized uses' that FOSS projects usually rely on.

You're right. The 'open source' Android phones are the perfect example. But FOSS needs to stop relying on these fascist hardware stacks and opt for better open, modular platforms. We have examples of such things, like the Framework laptops or Fairphones. It's somewhat tolerable for laptops, but we are still too far behind in terms of the mobiles and desk boxes needed for these sorts of projects.

Considering Android allows you to actively change the default assistant, it won't be a problem. We already have plenty of FOSS apps that use overlays, so that's not an issue either. I really have no idea what you think would be locked down here.

Dicio is just kind of a clunky app.

I'm using https://rhasspy.readthedocs.io/en/latest/ together with Home Assistant, which does what you describe. It combines a lot of different things into one nice UI; one of those things is listening for a wake word with the help of one of these:

  • Raven
  • Porcupine
  • Snowboy
  • Mycroft Precise
  • Pocketsphinx
  • External Command

With some of them you can even train it to use your own wake word.

There kind of is: you can get KoboldAI, which is open source, get an actually FOSS model to run with it, and slap the program Kobold Assistant on top of it.

There is no reason why it should not exist other than the fact that there really is no interest. Except for a few uses here and there (driving for example), voice assistants are just gimmicks.

I mean, that's like your personal opinion and not some objective fact.

Not really. Even Amazon, Apple and Google have been investing in assistants less and less. They have had massive layoffs from voice assistant teams.

https://www.theregister.com/2022/11/23/voice_assistants_fail/

https://www.bbc.com/news/business-64371426

"But it's not clear whether they are money-making opportunities. Reports say most interactions are relatively simple tasks like checking the weather, or playing music.

More broadly, according to one report, over the past three years voice assistant use has been falling and another report suggests that the adoption of smart speakers is slowing."

So no, it's not just "my opinion". But sure, just downvote and fuck off.

I mean, that only says something about the money-making part and not the pure usability. For many people the commercial options are a gimmick, sure. But are these options, with a clear focus on milking the customers for money, really the ultimate state for voice assistants? I'd argue they are not. There is a space for free voice assistants that let users control their data and that still provide value. Beyond users with disabilities that make it hard or impossible to use computers, voice assistants won't ever do something you cannot do with a computer (which includes smartphones). If that makes you consider them a gimmick, then I don't have an argument. But I think it is nice and convenient to be able to use a computer with your voice while doing something else.

I think that just means there's not much point in developing them further. They're still great for the simple tasks, like texting while driving.

There’s one you can use with Home Assistant that works pretty well for home automation commands.

I’ve just found that I don’t really like using voice control for things…