Best way to mask or disguise your voice in real time to avoid voice biometrics?

I have no choice: I am required to use Microsoft Teams. I don’t want Microsoft or anyone else to store my voice biometrics when the Microsoft account is already tied to my real identity and real name. Is buying a cheap microphone the best way to counter that?

Is there a good way to use a voice changer that doesn’t make it obvious I am using one, just enough to affect the voiceprint? I’ve seen microphones with built-in hardware for changing your voice, so maybe something like that would help. These are the same people I will be meeting in person, so my voice should not sound too different or it will look suspicious.

What would be the best approach that also won’t embarrass me? I don’t know if the technology is that advanced or if I am just being paranoid.

Honestly, I think that, unfortunately, AIs nowadays are clever enough to fingerprint your voice regardless (even if you use a cheap mic or some software).
Not that AI was needed for that in the first place. :sweat_smile:

Second piece of bad news: I don’t think there is a FOSS tool that helps with that either.
So overall, there is no solution to this specific problem.

Also, even if you do find one, it doesn’t change the problem: everybody in the call would be sending and receiving your voice too.
So even if you protect your voice on your end somehow, it would still be uploaded to a server by the other attendees of the call.


In general, be defensive with your voice though:

  • visit your bank and tell them that you need to be there in person to approve any money transfer
  • tell your family members that you will never call them on their phone number, only on Signal (for example), which might reduce the risk of them being deceived by a random phone call

It mostly depends on why you want to protect your voice in the first place. Are you a high-priority target, or are you afraid of losing something if people imitate your voice? :wink:


I was hoping there would be a way to at least distort it enough to not sound too recognizable and to reduce the accuracy of any voiceprint software. I thought maybe talking under a cloth and slightly further from the mic could help.

My voice has been included in several semi-popular videos, and I wouldn’t want anyone to be able to link me to those videos, since the Microsoft account is under my real identity.

Are you willing to put up with closed-source voice changers? There are open source projects, but they are either basic, archived, or unmaintained.

I’m talking about things like Voicemod, tbh.

Edit: for Android there is Voicesmith
(Voicesmith | F-Droid - Free and Open Source Android App Repository; you could get it from the repo too, if I’m not mistaken), and from there maybe you can figure out how to route it into Teams.


Realistically speaking, for it to be effective I think it would require quite a few things:

  • changing the pitch/tone
  • shortening/elongating your words/sentences with some post-processing delay
  • changing your manner of speaking, i.e. your choice of words and speech patterns
  • adding “useful noise”, as in making the signal dirty (not just a blowing fan) to alter the audio frequencies in a way that is not audible to your ear yet has a real impact on the output
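To make the first, second, and last points a bit more concrete, here is a minimal sketch using only NumPy. It is an illustration, not a tested defense against any real voiceprint system: the function names, the 1.1 shift factor, and the 30 dB noise level are all made up for the example, and a sine tone stands in for a voice. Note that naive resampling changes pitch and duration together, which a real voice changer would handle separately.

```python
import numpy as np

def resample_shift(signal: np.ndarray, factor: float) -> np.ndarray:
    """Naive pitch shift by resampling.

    factor > 1 raises the pitch and shortens the clip by the same ratio.
    """
    n_out = int(round(len(signal) / factor))
    old_idx = np.arange(len(signal))
    new_idx = np.linspace(0, len(signal) - 1, n_out)
    return np.interp(new_idx, old_idx, signal)

def add_noise(signal: np.ndarray, snr_db: float, seed: int = 0) -> np.ndarray:
    """Add white Gaussian noise at roughly the requested signal-to-noise ratio."""
    rng = np.random.default_rng(seed)
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise

def dominant_frequency(signal: np.ndarray, sample_rate: int) -> float:
    """Return the frequency (Hz) of the strongest spectral peak."""
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    return float(freqs[np.argmax(spectrum)])

# Stand-in for one second of voice: a 200 Hz tone sampled at 16 kHz.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 200.0 * t)

shifted = resample_shift(tone, 1.1)   # hypothetical 10% shift
noisy = add_noise(shifted, snr_db=30)  # hypothetical noise level

print(dominant_frequency(tone, sample_rate))
print(dominant_frequency(noisy, sample_rate))
```

Whether a shift this small changes what a voiceprint model sees is exactly the open question in this thread; the sketch only shows what the processing steps look like.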

Even in interviews on Darknet Diaries, I feel like the voices don’t get a lot of processing, or at least it is done in post-production.

So overall, the best option for real-time alteration on a livestream would be some kind of speech-to-speech setup, as in:

  • you talk into the mic
  • a script runs in the background
  • it outputs a synthetic voice
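The three steps above could be wired together roughly like this. It is only a sketch of the plumbing: it assumes the background script works by transcribing each chunk and re-synthesizing it (a direct voice-conversion model would be another way), and `speech_to_speech`, `fake_transcribe`, and `fake_synthesize` are hypothetical names. The two fake functions are placeholders for whatever local speech-to-text and text-to-speech models you end up running.

```python
from typing import Callable, Iterable, Iterator

def speech_to_speech(
    chunks: Iterable[bytes],
    transcribe: Callable[[bytes], str],
    synthesize: Callable[[str], bytes],
) -> Iterator[bytes]:
    """Turn microphone chunks into synthetic-voice chunks, one at a time."""
    for chunk in chunks:
        text = transcribe(chunk)      # step 2: the script running in the background
        if text.strip():              # skip silence / empty transcriptions
            yield synthesize(text)    # step 3: the synthetic voice comes out

# Placeholder stand-ins so the sketch runs without any model installed.
def fake_transcribe(chunk: bytes) -> str:
    return chunk.decode("utf-8")

def fake_synthesize(text: str) -> bytes:
    return f"<voice>{text}</voice>".encode("utf-8")

mic_chunks = [b"hello team", b"   ", b"see you tomorrow"]  # step 1: you talk into the mic
spoken = list(speech_to_speech(mic_chunks, fake_transcribe, fake_synthesize))
print(spoken)
```

Because your real audio never leaves this loop, only the synthesized chunks would reach the call, which is the whole point of the approach.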

In practice, that would come across like a donation message on a Twitch livestream or a VTuber kind of situation. :wink:
Here are 2 projects that I looked at in the past; maybe they could be of use:

Overall, it’s feasible to obfuscate your voice, but it will require some skill and time to set up the server locally. Even if the result is impersonal, it will at least hide your voice effectively, unlike a cloth, a filter, or any other naive solution that could be undone by the most basic of voice algorithms. :blush:

It will also help with reducing voiceprint accuracy because, given enough time, the accuracy of any tool will reach 100%.
Solution: don’t give it any real data, by using a virtual voice.


EDIT: I lost the exact link to the Asian person doing live speech output in English, but I know it’s achievable. You mostly need to train the model on your real voice first, then stream your speech through the mic, and after a small-ish delay it should output the voice according to the model.
Here is a quick video about the general idea.

And here are 2 relevant repos:

I know it’s a lot of work and might not be accessible, but this would be your best defense when it comes to hiding your voice[1].
Don’t use any kind of proprietary tool or “fancy service” if you want it to be truly privacy-respecting.


  1. it still wouldn’t change the way you speak tho ↩︎


I would urge you not to use any of the GitHub voice changers or the second text-to-speech project, as they have been archived or not maintained for a while.

This is why we’re not recommending open source at the moment.


What you’ve listed here is definitely bulletproof against any voice recognition. But the issue is that if anyone finds out I’m using TTS, I’d be embarrassed and would face huge consequences. A trained ear would probably notice quickly enough. This would be the best option when staying anonymous online, which is what I should’ve done before my voice recordings were published.

I was under the impression that, no matter what algorithms voice recognition software uses, it still has to work with the audio it is given, which here would be slightly distorted, low-quality audio. If we compare A, a clear, high-quality 30-second recording, with B, a distorted, bad-quality 30-second recording, I was assuming the accuracy could be around 95-99% for A and maybe 50% for B. But I am probably stretching it.

EDIT: I don’t know if it makes any difference, but all the videos that had my voice in them were recorded through Discord by the other party. I’m not sure how accurate this is, but I’ve read that the automatic audio processing Discord applies noticeably affects the accuracy of voice recognition software, compared to a direct recording.