Your voice, their voiceprint

by Naaxiom

Voiceprints – the fingerprints for identifying your voice – have recently been making headlines throughout the surveillance and privacy community. The technology has developed to the point that you can now be identified by your voice alone, regardless of which telephone number, email address, or YouTube account you are using. This is by no means a new phenomenon in the biometrics field, but voiceprints have recently gained a lot of traction in the surveillance industry due to the increasing breadth of technology that uses one’s voice, e.g., smartphones, Skype, and voice search.

Many companies, such as Google and Apple, have publicly declared that they store users’ voice data collected through their services (Google Voice and Siri). Both stores are seemingly at the disposal of Law Enforcement Agencies (LEAs), which can subpoena user data not only to build a case but also to extend their surveillance capabilities using this biometric. Many VoIP providers also have voice retention built in through partnerships with speech analysis firms.

How is this done?

There are two phonetic factors involved in matching a voiceprint to an identity. The first is the formants: the frequencies at which the amplitude of the speaker’s sounds peaks. Listeners distinguish vowels such as [i], [u] and [a] (shown below) by the different frequencies of these formants, which are produced by a combination of the speaker’s physical vocal tract and the position of their tongue and lips as they shape the air passage to make the desired sound. Formants are not identical across speakers; they vary in frequency with the physical characteristics of each speaker’s vocal tract. This first step helps voiceprint software identify the speaker’s characteristics, such as male vs. female, tall vs. short, and other distinguishing traits.

spectrograph of vowels [i], [u] and [a]

The other phonetic factor involved is the speaker’s distinct articulations, including dialectal factors like having a “caught” vs. “cot” distinction, glottalization (‘dropping’ the [t] in words such as “bottle”), and so on. [2]
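
To make the first factor more concrete, here is a minimal Python sketch of what "finding formants" means. It is not how any vendor's voiceprint software works; real systems typically use more robust methods such as linear predictive coding. The function names (synth_vowel, magnitude_at, estimate_formants), the 100 Hz pitch, and the formant values 300 Hz and 2300 Hz are all assumptions chosen for illustration, loosely resembling an [i]-like vowel rather than measured data.

```python
import math

def synth_vowel(formants, f0=100.0, sr=8000, n=800, bandwidth=80.0):
    """Build a toy vowel: harmonics of f0, louder the closer they sit to a formant."""
    samples = [0.0] * n
    k = 1
    while k * f0 < sr / 2:
        freq = k * f0
        # Resonance-style weight (an assumption): near 1 at a formant, small elsewhere.
        amp = sum(1.0 / (1.0 + ((freq - f) / bandwidth) ** 2) for f in formants)
        for t in range(n):
            samples[t] += amp * math.sin(2 * math.pi * freq * t / sr)
        k += 1
    return samples

def magnitude_at(samples, freq, sr):
    """Magnitude of the signal's spectrum at one frequency (naive single-bin DFT)."""
    re = sum(s * math.cos(2 * math.pi * freq * t / sr) for t, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * freq * t / sr) for t, s in enumerate(samples))
    return math.hypot(re, im)

def estimate_formants(samples, sr, f0=100.0, count=2):
    """Return the `count` strongest local amplitude peaks among the harmonics of f0."""
    freqs = [k * f0 for k in range(1, int(sr / 2 / f0))]
    mags = [magnitude_at(samples, f, sr) for f in freqs]
    peaks = [
        (mags[i], freqs[i])
        for i in range(1, len(mags) - 1)
        if mags[i] > mags[i - 1] and mags[i] > mags[i + 1]
    ]
    peaks.sort(reverse=True)          # strongest peaks first
    return sorted(f for _, f in peaks[:count])

# Hypothetical [i]-like vowel with formants near 300 Hz and 2300 Hz.
vowel = synth_vowel([300.0, 2300.0])
print(estimate_formants(vowel, sr=8000))  # [300.0, 2300.0]
```

Two speakers saying the same vowel would produce peaks at somewhat different frequencies, and that difference is the kind of signal the matching step relies on.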

Who is taking this information?

The full extent to which organizations are harvesting voice data is not known, but a few companies have come out publicly claiming to hold massive databases of unknowing users’ voiceprints. One such company, Russian-based SpeechPro, has bragged about storing “millions” of Americans’ voices to provide LEAs with intelligence for a fee. Many other companies that harvest user data, such as Google, Apple, and now Facebook with its voice chat, are actively storing users’ voices for future analysis. Any government telephone service openly records voices and at least has the courtesy of notifying the caller; however, many centralized call centers and service providers do the same thing, permitted by the fine print of their respective terms of agreement.

Are there ways to “encrypt” my voice?

Aside from conventional encryption methods, such as sending a recording via encrypted email using PGP/GPG or using an over-the-air encryption service such as CellCrypt, there are ways of scrambling your voice at the source to increase your security and privacy. These methods can be cumbersome, but they add an extra layer of security in case you are using an unencrypted connection. They are not foolproof, and I would be open to hearing about any other source-based voice scrambling techniques.

  • Modify the formants using a vocal transformer with a randomized algorithm and real-time adjustments. This will change the formants of your voiceprint to make it more difficult to narrow down your sex and physical characteristics. Without randomized adjustments, vocal transformers can be easily reversed.
  • Modify your personal speaking style and change your dialect. This will take some research and vocal skill, but an easy way to do it is simply to imitate a famous person with a characteristic speaking style.
  • Add background noise. Vocal analysis requires a clean sound source, and although background noise reduction technology is advancing at a quick pace, added noise will still help. Many smartphones today have built-in noise reduction, so the best way to keep the background noise in the signal is to use a computer or standalone microphone rather than a phone microphone.
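
As a rough illustration of the background-noise idea, the Python sketch below mixes noise into a recording at a chosen signal-to-noise ratio (SNR). The helper names rms and add_noise and the 6 dB figure are assumptions made for this example, and white noise is only a stand-in here: it is probably the easiest kind for noise reduction to strip out, so treat this as a way to reason about levels, not as a tested countermeasure.

```python
import math
import random

def rms(samples):
    """Root-mean-square level of a list of audio samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def add_noise(voice, snr_db, seed=None):
    """Mix white Gaussian noise into `voice` at the requested SNR in dB."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in voice]
    # Scale the noise so that 20 * log10(rms(voice) / rms(noise)) equals snr_db.
    target_noise_rms = rms(voice) / (10 ** (snr_db / 20))
    gain = target_noise_rms / rms(noise)
    return [v + gain * n for v, n in zip(voice, noise)]

# Hypothetical one-second "voice": a 200 Hz tone sampled at 8 kHz.
voice = [math.sin(2 * math.pi * 200 * t / 8000) for t in range(8000)]
noisy = add_noise(voice, snr_db=6.0, seed=1)
```

A lower snr_db means louder noise relative to the voice; at some point the call becomes hard for the person on the other end to follow, so there is a trade-off between intelligibility and how much the analysis is degraded.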

@naaxiom

References:

[1] “Comparing Formants”: http://speechpro.com/media/news/2012-10-15

[2] “Voiceprint Identification”: http://expertpages.com/news/voiceprint_identification.htm

Changing formants: http://documentation.apple.com/en/soundtrackpro/effectsreference/index.html

Whitepapers:

http://www.sersc.org/journals/IJHIT/vol4_no2_2011/6.pdf

http://www.ll.mit.edu/publications/journal/pdf/vol08_no2/8.2.4.speakerrecognition.pdf

http://www.nuance.com/ucmprod/groups/enterprise/@web-enus/documents/collateral/nc_018621.pdf

Vendors: http://daon.com, http://persay.com, http://speechpro-usa.com
