There is no use case for this technology. Not everything that can be invented should be invented.
Michael Crichton taught us that, over and over again.
I feel the same way about Sora - there isn’t a single good reason for something like that to exist. Outlaw it.
If it works, it can make Hollywood-quality stuff for near free. Are you kidding me? And I kind of hate OpenAI.
What OpenAI really want is regulatory intervention that forces them and their competitors to follow KYC (know your customer) rules like banks do. That would keep the plethora of new start-ups from entering the field. The technology is pretty much a commodity at this stage and they know it. Without that regulatory intervention they’ll be just another model among dozens of others.
The comms pre-brief across video and audio is there to get the media to start projecting fear of what this technology will do unchecked. If they get their way, all the existing voice-generation models will have to tackle the harder problem of navigating regulatory authorities, which you can only do at a scale that benefits them and a select few others.
Your other article, about tech companies creating problems that “only they can solve” if only they were given more money or less regulation (or allowed to self-regulate), rings true here.
Very helpful point, thanks.
Maybe the researchers should train the model on Sam Altman and show why this should be available to the public. I'm sure the CEO won't mind his voice being available for everyone to use.
That would be a good demo, as he speaks with a distinct vocal fry.
The reason we're all going to die is that too many people were either too stupid or too gullible to realize this: there is NO "good" use case for any of these technologies that comes even close to making up for *completely and utterly breaking the idea of Objective Reality*.
Totally fallacious to jump from: "I'm going to clone my voice to help me create 10 presentations about X, Y, Z for which I have the text" or "I'm going to clone my voice with X variation to sound more sophisticated or work on my accent, then practice it over time and adapt it as my real voice" or "I'm going to clone my voice and connect it to this translation app to be able to talk to X people in Y country with my real voice" to "We're all going to die" and "We're breaking the idea of objective reality". People need to unplug from their phones and laptops if that's what they think is going to happen with this kind of harmless technology, and not absorb so much content.

First of all, the subset of users who will use this technology to try to do harm is strikingly small (and service providers or intelligence agencies will target those malicious users). It's already happening on a very small scale, and we're quickly getting the hang of it.

Second, we're going to build more robust security and authentication systems to deal with this problem (you can't just call your bank and be authenticated like that, you have to have a secure bank ID app linked to your personal device like in Sweden to be able to do that).

Third, if someone is unlucky enough to have someone else target their parents or family with this trick, then either they have shared too much voice information and sensitive data online publicly (which they should not have to begin with), or they are very wealthy and worthy of being targeted (in which case they are a tiny data point in the universe of points), or they are a politician or powerful target and have security in place to deal with the threat, or they are just facing the "you're being trolled" type of situation.

If you're talking about mass disinformation and misinformation and how it will affect people, then that's on them for being so gullible and not filtering their feeds, and they should disconnect or unplug from social media and stop watching radicalized news outlets.
People are going to need to meditate a little bit, but please, stop the fear mongering about petty technology. Please focus on automated drones and "AI girlfriends", those are harmful technologies because they directly impact your well-being. Voice cloning is not harmful, period.
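The "secure bank ID app linked to your personal device" mentioned above is essentially device-bound challenge-response authentication. Here is a minimal sketch of that idea, assuming an Ed25519 key pair enrolled with the bank; the names and flow are illustrative, not any real bank's API.

```python
# Sketch: why a voice sample alone can't authenticate a caller.
# Assumes a BankID-style scheme where a key pair lives on the customer's device.
import os
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# Enrollment: the customer's phone generates a key pair; the bank stores the public key.
device_key = Ed25519PrivateKey.generate()
bank_stored_public_key = device_key.public_key()

# Authentication: the bank sends a random challenge that only the enrolled device can sign.
challenge = os.urandom(32)
signature = device_key.sign(challenge)                 # happens on the customer's device
bank_stored_public_key.verify(signature, challenge)    # raises InvalidSignature if forged

# A cloned voice carries no secret, so it cannot produce this signature;
# that is the point of tying authentication to the device rather than to the voice.
```

Under this kind of scheme, whether the caller sounds like you is irrelevant to whether the bank believes it is you.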
" People need to unplug from their phones and laptops if that's what they think is going to happen with this kind of harmless technology, and not absorb so much content. "
This technology *has already been used in furtherance of crimes.* https://www.npr.org/2023/03/22/1165448073/voice-clones-ai-scams-ftc
I said: "It's already happening on a very small scale, and we're quickly getting the hang of it."
Yes, if someone calls you asking for money, hang up and call them directly to verify the story. As for the CEO and bank stories, it's just a "skills issue" as the new generations say (no security and weak training against spear phishing). The problem will probably get worse when the technology is obscured and banned, because people (rightly) do not want to be controlled or told what to do by the government.
You clearly don't have an aging parent living alone.
"It's already happening on a small scale" is not the same as "voice cloning is not harmful period". *Totally fallacious jump*, you might say.
It's suspicious that you think voice can only be cloned if it was recorded online. People still talk in meatspace.
"The use of voice cloning to do harm" is not the same as "voice cloning is harmful". Automated drones (weaponized ones) are harmful because they are used for the purpose of harming people, while AI girlfriends are also harmful because they are used to create a false narrative and sense of relationship that does not exist, and to invade the user's privacy. On the other hand, I have given really useful ideas on how voice cloning can be used without its purpose or primary application being to cause harm. Does that make sense, or do you still disagree? I did not mean to sound adversarial, so I apologize if I came across a bit hostile.
It just feels like you are really stressed about the whole thing. I used to be there about other issues, like privacy, China, cybersecurity, and big tech. It is not real, and it does not help you or anybody to feel that way or to project that state of mind. We need to be rational and realistic. Voice cloning is not the end of society.
On the other hand, I can also respect your opinion and freedom of identity and expression, even if we disagree. Have a great day Chaos Goblin.
So far, all OpenAI has are some really expensive party tricks. I suppose they can cruise on hoovered-up VC money for a while, but they've gotta start selling a product at some point, and I don't know if disrupting commercial art, help desks, and corporate writing is the big payoff for them. They need a cell phone: something ubiquitous, something EVERYBODY uses. What is it? You could tie the voice, video, and writing blocks together and crank out entertainment: Marilyn Monroe and Rudolph Valentino in a remake of A Star Is Born. Doesn't seem big enough. I think they want to build Deep Thought, a computer that can give the Ultimate Answer.
(Always appreciate a good Hitchhiker's reference)
It seems like the perpetual-motion-machine here is that OpenAI converts billions in Microsoft money into tens of billions in Microsoft stock valuation. That works so long as the hype bubble stays inflated.
https://www.techradar.com/computing/artificial-intelligence/chatgpt-might-get-its-own-dedicated-personal-device-with-jony-ives-help
I LOVE the smell of Ranting in the morning! It keeps me sane for the time being to know there is someone who can articulate a first-cause problem with clarity. I am personally now dealing with a robotic voice caller that keeps reiterating the same message to my landline phone dozens of times throughout the 24-hour period. So I had to disconnect my landline and receive NO CALLS at all. How nice is that for a new problem I should NOT need to deal with? I have AT&T, my son has T-Mobile, and both have the same problem. How much worse could it get, using God knows what as a means to harass the public? UGH!
Doesn't AI voice cloning already exist? Several news reports on people being impersonated to solicit ransom money from their relatives were published this past year, like this one on 60 Minutes: https://www.cbsnews.com/news/how-digital-theft-targets-people-from-millennials-to-seniors-60-minutes-2023-05-21/.
Why is this reported as a new technology OpenAI came up with?
The 15-second claim is new, but how they accomplish it is deliberately left unsaid. I suspect it has to do with building up a giant backlog of existing recordings of voices, so that there are always enough statistically similar phonemes to patch what comes from the new sample. One of the few concrete details OpenAI released about this is that it relies on 'a combination of licensed and publicly available data', presumably that backlog.
And the only real reason you would do that, of course, is that you're not actually concerned with 'cloning' individual voices so much as providing enough plausible facsimiles for fraudsters, who I honestly believe are the intended target audience.
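For what it's worth, here is a purely speculative sketch of the architecture that comment describes: a synthesizer pretrained on a large corpus of voices, conditioned at inference time on an embedding extracted from the short sample. Every function below is a hypothetical stand-in for illustration, not OpenAI's actual system.

```python
# Illustrative sketch only: a speaker-embedding approach to few-shot voice cloning.
# The models here are toy stubs; real systems use large neural networks trained
# on many speakers, which is exactly why so little data is needed per new voice.
import numpy as np

def encode_speaker(audio: np.ndarray) -> np.ndarray:
    """Hypothetical speaker encoder: maps a short clip to a fixed-size voice vector."""
    return audio.reshape(-1, 160).mean(axis=0)  # stand-in for a learned embedding

def synthesize(text: str, speaker_vec: np.ndarray) -> np.ndarray:
    """Hypothetical pretrained synthesizer: generates audio for `text`, steered by
    the speaker vector. The 'backlog' of licensed/public recordings lives in this
    model's weights, not in the 15-second sample."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.standard_normal(16000) * speaker_vec.mean()  # placeholder waveform

fifteen_second_sample = np.random.default_rng(0).standard_normal(15 * 16000)
voice_vector = encode_speaker(fifteen_second_sample)
cloned_audio = synthesize("Hi, it's me. I need you to wire some money.", voice_vector)
```

If something like this is roughly right, the 15-second sample only has to locate a speaker in a voice space the model already learned from its 'licensed and publicly available data'; the heavy lifting happened during pretraining.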
Right, this is the key point. They're taking a thing that was sketchy, flawed, and still required a fair amount of data and boasting that they've made it much more polished, requiring far less data.
That's a bad thing though. They should feel bad.
Like the man said, comms baby, comms. Like all the tech Apple "invented", if you have the biggest megaphone you can claim anything.
Back in the 1970s, I worked for a technology group that used some primitive image modification technology to make a mouth look like it was speaking arbitrary phonemes. If I remember correctly, they used circa mid-1970s audio technology, probably analog, to fake our sponsor's voice and synched it up with the fake video of him speaking. It was an impressive demo for its day, and we were told to delete it. It was garbage by modern standards, but we've had 50 years of improved technology. As best I can tell, the delete command still works.
Quite. It kind of looks like inventing a terrible biological virus, and not because you actually want to develop protection against such a virus. What is true for the 'gene space' should probably have an equivalent in the 'meme space'.
Having said that, I strongly suspect that Hanlon's Razor holds here for the most part. Sam seems shifty enough, but many of his fellow travelers are probably honestly convinced of the things they (and he) say.
In the end, the issue is not so much that these systems are or will be intelligent, but that we humans aren't (enough).
Hot damn Dave, you nail it. That Falcon Heavy ride to Mars needs plenty of seating, methinks.
This is such a bad idea that it really needs to be killed with fire.
...why do these "humans" hate humans so much?...
Of course, you are aware that several others have already come to market with this tech, right?
Names you've never heard of, companies you've never heard of, releases you've never heard of and they never got in any spotlight, never commanded the world's attention to discuss the risks or solutions, never blogged about it or sat down with reporters.
I'm not quibbling with the problems of the technology, and I'm not saying Altman is driven by altruism.
But, the fact that Altman is out there on the stage is the only reason most of us even know about it and have something to rant about.
Otherwise, it is just a genie escaped from a bottle you never saw.
While we rant, let us also give thanks to someone for spoon-feeding the present to us so we can opine about the future and the past here.
In the end, this article makes Altman's argument for him: let's impede the competition, because if Altman is bad, the ones we don't hear from are even worse.
Indeed, they are not the protagonist in this case; that character has been working the room in silence already, and Altman's voice is
I think the key difference (which, granted, I should've added a paragraph on) is that the versions of this product that have already come to market require more data.
(I see Tyler makes this point elsewhere in the thread. Thanks Tyler!)
So this strikes me as running parallel to the AI-Content-Farms problem that I wrote about last year. https://davekarpf.substack.com/p/what-are-we-going-to-do-about-generative
The big companies aren't treating Generative AI as a problem to be solved. They're treating it as a race to be won.
OpenAI isn't the first company to produce AI voice cloning. But they are bragging that they can now do it with only 15 seconds of data. That's not a good thing!
And I agree that we should impede the competition. But let's impede Altman too! We should have serious regulatory frameworks, with serious penalties, and Sam Altman shouldn't be in charge of setting them as he sees fit.
I find your comment very hostile. I would like to be able to clone my voice for presentations, to automate tasks, or to express creativity (if I can come up with a temporary voice variation that sounds really funny or cool and use it later), or eventually to have a device speak aloud with my voice in another language. How do we balance usability and security? Also, if OpenAI does not brag about its powerful technologies, how will it raise money? Please keep in mind that the development of AGI will be associated with the development of very uncomfortable (subjectively) technologies. I am not sure that this kind of venting is fruitful or will have a positive effect on society.
I see a future sci-fi series: “Customer of Interest.” You’re being recorded. OpenAI has a system, a machine that mimics your voice and lets criminals use it every hour of every day. I know because I’m Sam Altman, I paid a bunch of techies to build it. I wanted it to make money for me but now it’s creating windfalls for every grifter, fraudster and terrorist with a laptop and no moral code - like me.
“Chat GPT, Kill Sam Altman. Chat GPT, Kill Sam Altman. Chat GPT, Kill Sam Altman.”