OpenAI introduces Sora, its text-to-video AI model

Technology@lemmy.world – 424 points – 9 months ago

I know people have been scared by new technology since technology, but I've never before fallen into that camp until now. I have to admit, this really does frighten me.

Boo!

What’s wild to me is how Yann LeCun doesn’t seem to see this as an issue at all. Many other leading researchers (Yoshua Bengio, Geoffrey Hinton, Frank Hutter, etc.) signed that letter on the threats of AI and LeCun just posts on Twitter and talks about how we’ll just “not build” potentially harmful AI. Really makes me lose trust in anything else he says.

There with you. This is really worrying to me. This technology is advancing way faster than we're adjusting to it. I haven't even gotten over how amazing GPT3.5 is but most people already seem to be taking it for granted. We didn't have anything even close to this just a few years prior.

To make that statement a little more accurate, I'm afraid of the humans that will abuse this technology and society's ability to adapt to it. There's some amazingly cool things that can come about from this, like all the small indie creators that lack the connections and project management skills to make their ambitions come to life will be able to achieve their vision, and that's really cool and I'm excited for that, but my excitement is smashed from knowing all the bad that will come with this.

Only the third most confusing entry in the Kingdom Hearts series

Lol And KH4 is gonna be about Sora being in the real world. This storyline is getting out of hand.

The folks with access to this must be looking at some absolutely fantastic porn right now!

Oh it's going to be fantastic all right.

Fantastical chimera monster porn, at least for the beginning.

'obama giving birth', 'adam sandler with big feet', 'five nights at freddy's but everyone's horny'

possibilities are endless

I don't think they would make a model like this uncensored.

Honestly, let's make it mainstream. Get it to a point where it's more profitable to mass produce Ai porn than exploit young women from god knows where.

This is so much better than all text-to-video models currently available. I'm looking forward to reading the paper but I'm afraid they won't say much about how they did this. Even if the examples are cherry picked, this is mind blowing!

I'm looking forward to reading the paper

You mean the 100 page technical report

Just get ChatGPT to summarize it. Big brain time.

Full circle.

Eventually, the internet will just be AI criticizing itself to create a better version of itself...

Hang on...

How do you know you're not AI?

Doo^doo doodoo doo^doo doodoo doo^doo doodoo

Can I get sora to create a video from the summary?

Looking forward to the day I can just copy paste the Silmarillion into a program and have it spit out a 20 hour long movie.

I was thinking exactly this but with the Bible. Not because I like the Bible but because I'd love to see how AI interprets one of the most important books in human history.

But yeah, the Silmarillion is basically a Bible from another universe.

Which is why christians are scared of them. It will open people's eyes to how anyone can write a fairytale. And so much better ones, too.

I wonder if in the 1800s people saw the first photograph and thought… “well, that’s the end of painters.” Others probably said “look! it’s so shitty it can’t even reproduce colors!!!”.

What it was the end of was talentless painters who were just copying what they saw. Painting stopped being for service and started being for art. That is where software development is going.

I have worked with hundreds of software developers in the last 20 years, half of them were copy pasters who got into software because they tricked people into thinking it was magic. In the future we will still code, just don’t bother with the thing the Prompt Engineer can do in 5 seconds.

What it was the end of was talentless painters who were just copying what they saw. Painting stopped being for service and started being for art. That is where software development is going.

I think a better way of saying this is that it was the end of people who were just doing it for a job, not because of a lot of talent or passion for painting.

But doing something just because it is a job is what a lot of people have to do to survive. Not everyone can have a profession that they love and have a passion for.

That's where the problem comes in when it comes to these generative AI.

And then the problem here is capitalism and NOT AI art. The capitalists are ALWAYS looking for ways to not pay us, if it wasn't AI art, it was always going to be something else.

I think that's a bad analogy because of the whole being able to think part.

I'll be interested in seeing what (if anything) humans will be able to do better.

It was exactly the same as with AI art. The same histrionics about the end of art and the dangers to society. It's really embarrassing how unoriginal all this is.

Charles Baudelaire, father of modern art criticism, in 1859:

As the photographic industry was the refuge of every would-be painter, every painter too ill-endowed or too lazy to complete his studies, this universal infatuation bore not only the mark of a blindness, an imbecility, but had also the air of a vengeance. I do not believe, or at least I do not wish to believe, in the absolute success of such a brutish conspiracy, in which, as in all others, one finds both fools and knaves; but I am convinced that the ill-applied developments of photography, like all other purely material developments of progress, have contributed much to the impoverishment of the French artistic genius, which is already so scarce.

What it was the end of was talentless painters who were just copying what they saw. Painting stopped being for service and started being for art.

This attitude is not new, either. He addressed it thus:

I know very well that some people will retort, “The disease which you have just been diagnosing is a disease of imbeciles. What man worthy of the name of artist, and what true connoisseur, has ever confused art with industry?” I know it; and yet I will ask them in my turn if they believe in the contagion of good and evil, in the action of the mass on individuals, and in the involuntary, forced obedience of the individual to the mass.

The hardest part of coding is managing the project, not writing the content of one function. By the time LLMs can do that it's not just programming jobs that will be obsolete, it will be all office jobs.

This is still so bizarre to me. I've worked on 3D rendering engines trying to create realistic lighting and even the most advanced 3D games are pretty artificial. And now all of a sudden this stuff is just BAM super realistic. Not just that, but as a game designer you could create an entire game by writing text and some logic.

In my experience as a game designer, the code that LLMs spit out is pretty shit. It won't even compile half the time, and when it does, it won't do what you want without significant changes.

The correct usage of LLMs in coding imo is for a single use case at a time, building up to what you need from scratch. It requires skill in talking to the AI so it gives you what you want, in knowing how to build up to it, in reading the code it spits out so that you know when it goes south, and in actually knowing how to build the bigger-picture software from little pieces. But if you are an intermediate dev who is stuck on something, it is a great help.

That or for rubber ducky debugging, it's also great at that.

That sounds like more effort than just... writing the code.

It's situationally useful.

ChatGPT once insisted my JSON was actually YAML.

Technically it is, but I agree that is imprecise and nobody would say so IRL. Unless they are being a pedantic nerd, like I am right now.

Keep in mind that this isn't creating 3D volumes at all. While immensely impressive, the thing being created by this architecture is a series of 2D frames.

Because it's trained on videos of the real world, not on 3d renderings.

Lol you don't know how cruel that is. For decades programmers have devoted their passion to creating hyperrealistic games and 3D graphics in general, and now poof it's here like with a magic wand and people say "yeah well you should have made your 3D engine look like the real world, not to look like shit" :D

Welcome to the club my friend... Expert after expert is having this experience as AI develops in the past couple years and we discover that the job can be automated way more than we thought.

First it was the customer service chat agents. Then it was the writers. Then it was the programmers. Then it was the graphic design artists. Now it's the animators.

Another programmer here. The bottleneck in most jobs isn't in getting boilerplate out, which is where AI excels, it's in that first and/or last 10-20%, alongside dictating what patterns are suitable for your problem, what proprietary tooling you'll need to use, what APIs you're hitting and what has changed in recent weeks/months.

What AI is achieving is impressive, but as someone that works in AI, I think that we're seeing a two-fold problem: we're seeing a limit of what these models can accomplish with their training data, and we're seeing employers hedge their bets on weaker output with AI over specialist workers.

The former is a great problem, because this tooling could be adjusted to make workers' lives far easier/faster, in the same way that many tools have done so already. The latter is a huge problem, as in many skilled worker industries we've seen waves of layoffs, and years of enshittification resulting in poorer products.

The latter is also where I think we'll see a huge change in culture. IMO, we'll see existing companies bet it all and die from supporting AI over people, and a new wave of companies focus on putting output of a certain standard to take on larger companies.

This is a really balanced take, thank you

Writer here, absolutely not having this experience. Generative AI tools are bad at writing, but people generally have a pretty low bar for what they think is good enough.

These things are great if you care about tech demos and not quality of output. If you actually need the end result to be good though, you’re gonna be waiting a while.

If you actually need the end result to be good though, you’re gonna be waiting a while.

I agree with everything you said, but it seems in the context of AI development "a while" is like, a few years.

That remains to be seen. We have yet to see one of these things actually get good at anything, so we don’t know how hard that last part is to do. I don’t think we can assume there will be continuous linear progress. Maybe it’ll take one year, maybe it’ll take 10, maybe it’ll just never reach that point.

Yeah a real problem here is how you get an AI which doesn't understand what it is doing to create something complete and still coherent. These clips are cool and all, and so are the tiny essays put out by LLMs, but what you see is literally all you are getting; there are no thoughts, ideas or abstract concepts underlying any of it. There is no meaning or narrative to be found which connects one scene or paragraph to another. It's a puzzle laid out by an idiot following generic instructions.

That which created the woman walking down that street doesn't know what either of those things are, and so it can simply not use those concepts to create a coherent narrative. That job still falls onto the human instructing the AI, and nothing suggests that we are anywhere close to replacing that human glue.

Current AI can not conceptualise -- much less realise -- ideas, and so they can not be creative or create art by any sensible definition. That isn't to say that what is produced using AI can't be posed as, mistaken for, or used to make art. I'd like to see more of that last part and less of the former two, personally.

Current AI can not conceptualise – much less realise – ideas, and so they can not be creative or create art by any sensible definition.

I kinda 100% agree with you on the art part since it can't understand what it's doing... On the other hand, I could swear that if you look at some generated AI images it's kind of mocking us. It's a reflection of our society in a weird mirror. Like a completely mad or autistic artist that is creating interesting imagery but has no clue what it means. Of course that exists only in my perception.

But in the sense of "inventive" or "imaginative" or "fertile" I find AI images absolutely creative. As such it's telling us something about the nature of creative process, about the "limits" of human creativity - which is in itself art.

When you sit there thinking up or refining prompts you're basically outsourcing the imaginative visualizing part of your brain. An "AI artist" might not be able to draw well or even have the imagination, but he might have a purpose or meaning that he's trying to visualize with the help of AI. So AI generation is at least some portion of the artistic or creative process but not all of it.

Imagine we could have a brain computer interface that lets us perceive virtual reality like with some extra pair of eyes. It could scan our thoughts and allow us to "write text" with our brain, and then immediately feed back a visual AI generated stream that we "see". You'd be a kind of creative superman. Seeing / imagining things in their head is of course what many people do their whole life but not in that quantity or breadth. You'd hear a joke and you would not just imagine it, you'd see it visualized in many different ways. Or you'd hear a tragedy and...

Like a completely mad or autistic artist that is creating interesting imagery but has no clue what it means.

Autists usually have no trouble understanding the world around them. Many are just unable to interface with it the way people normally do.

It’s a reflection of our society in a weird mirror.

Well yes, it's trained on human output. Cultural biases and shortcomings in our species will be reflected in what such an AI spits out.

When you sit there thinking up or refining prompts you’re basically outsourcing the imaginative visualizing part of your brain. [...] So AI generation is at least some portion of the artistic or creative process but not all of it.

We use a lot of devices in our daily lives, whether for creative purposes or practical. Every such device is an extension of ourselves; some supplement our intellectual shortcomings, others physical. That doesn't make the devices capable of doing any of the things we do. We just don't attribute actions or agency to our tools the way we do to living things. Current AI possess no more agency than a keyboard does, and since we don't consider our keyboards to be capable of authoring an essay, I don't think one can reasonably say that current AI is, either.

A keyboard doesn't understand the content of our essay, it's just there to translate physical action into digital signals representing keypresses; likewise, an LLM doesn't understand the content of our essay, it's just translating a small body of text into a statistically related (often larger) body of text. An LLM can't create a story any more than our keyboard can create characters on a screen.

Only once/if ever we observe AI behaviour indicative of agency can we start to use words like "creative" in describing its behaviour. For now (and I suspect for quite some time into the future), all we have is sophisticated statistical random content generators.
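
For anyone who wants to see what a bare-bones "statistical random content generator" looks like, here is a minimal sketch: a toy bigram sampler in Python. It is nowhere near how an actual LLM works (those are large neural networks, not lookup tables); the corpus, the function names `build_bigram_table` and `generate`, and the fixed seed are all made up for illustration.

```python
import random
from collections import defaultdict


def build_bigram_table(text):
    """Map each word to the list of words observed right after it."""
    words = text.split()
    table = defaultdict(list)
    for current, following in zip(words, words[1:]):
        table[current].append(following)
    return table


def generate(table, start, length, seed=0):
    """Emit up to `length` words by repeatedly sampling an observed follower."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same text
    out = [start]
    for _ in range(length - 1):
        followers = table.get(out[-1])
        if not followers:  # dead end: this word never had a successor
            break
        out.append(rng.choice(followers))
    return " ".join(out)


# Hypothetical toy corpus, purely for illustration.
corpus = "the cat sat on the mat and the dog sat on the rug"
table = build_bigram_table(corpus)
print(generate(table, "the", 8))
```

Every word pair in the output was seen in the corpus, yet nothing in the table knows what a cat or a mat is, which is roughly the point being made above.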

Still waiting on the programmer part. In a nutshell AI being say 90% perfect means you have 90% working code IE 10% broken code. Images and video (but not sound) is way easier cause human eyes kinda just suck. Couple of the videos they've released pass even at a pretty long glance. You only notice funny businesses once you look closer.
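
To put a rough number on the "90% working code" point, here is a back-of-the-envelope sketch in Python. It assumes (and this is my assumption, not a measured figure) that a program is made of independent parts, each of which is correct with the same probability; the function name `chance_all_parts_work` is just illustrative.

```python
def chance_all_parts_work(per_part_success, parts):
    """Probability that every one of `parts` independent pieces is correct."""
    return per_part_success ** parts


# How quickly "90% right per part" decays as the program grows.
for parts in (1, 5, 10, 20):
    print(parts, round(chance_all_parts_work(0.9, parts), 3))
```

Under that simplification, ten parts at 90% each all work together only about 35% of the time, which is why "mostly right" code still tends to need a human going over it.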

I can't imagine that digital artists/animators have reason to worry. At the upper end, animated movies will simply get flashier, eating up all the productivity gains. In live action, more effects will be pure CGI. At the bottom end, we may see productions hiring VFX artists, just as naturally as they hire makeup artists now.

When something becomes cheaper, people buy more of it, until their demand is satisfied. With food, we are well past that point. I don't think we are anywhere near that point with visual effects.

It seems to me that AI won't completely replace jobs yet (though it will in 10-20 years), but it will reduce demand for workers because of oversaturation and ultra-productivity with AI. Moreover, AI will continue to improve. The work of a team of 30 people will be done by just 3 people.

Yeah. And it's not just how good the images look it's also the creativity. Everyone tries to downplay this but I've read texts and those videos and just from the prompts there is a "creative spark" there. It's not very bright spark lol but it's there.

I should get into this stuff but I feel old lol. I imagine you could generate interesting levels with obstacles and riddles and "story beats" too.

Because sometimes the generator just replicates bits of its training data wholesale. The "creative spark" isn't its own, it's from a human artist left uncredited and uncompensated.

Artists are "inspired" by existing art or things they see in real life all the time. So that they can replicate art doesn't mean they can't generate art. It's a non sequitur. But I'm sure people are going to keep insisting on this so let's not argue back and forth on this :D

Besides the few glitched ones I wouldn't be able to tell they were generated. I didn't expect it this quick.

At least we can remake the last three star wars movies with a decent story line.

If you read Japanese, it's really obvious the Tokyo one is AI; the signage largely makes no sense, has incorrect characters, has weird mixing of characters, etc.

Someone wrote a decent story line for those??

Back to ChatGPT for that.

There are tons of books. Afaik the main storyline was an extragalactic invasion by a super evil swarm. Also explains why the emperor built so many ships.

There have been books out for years that Disney just didn't bother with. They can't be worse than what we got.

Unpopular opinion, but I actually liked the high level story of it, but I think it could have been told way way better.

The mammoth one is uncanny valley for me.

Would be good if openai could focus on things that are useful to humanity rather than trying to just do what we can do already, but with less jobs.

We already knew how to farm before John Deere; should we have focused away from agricultural industrialization in order to preserve jobs?

looks at the immense harm that agricultural industrialization has had on the climate, the environment and society

Apparently yes.

Working less is a great ideal for humanity.

Americans have this thing that their job defines them, but we work less than we did before, let's keep going.

Except the gains technology and automation bring are rarely evenly distributed in society. Just compare how productive a worker is today and how much we make compared to 50 years ago.

We make a lot more. Improvements are good.

You think people should be taxed more, vote for politicians trying to tax rich people more.

1. Generally people want to work. People don't want to be exploited by capitalists for a capitalist society where they barely make rent. Humans are generally workers. 2. This isn't working less, this isn't productivity improvement. This is less humanity in art, and all just so employers don't need to spend money on workers.

Nothing is stopping anyone working for work's sake. Personally I think that's a waste of time but people are free to do what they want.

Yes it is. It's the same as the printing press, or the electric switchboard, computers, cars, containerisation, 3D rendering versus drawing. Work used to be done by humans; now the labour has been replaced to make something of better quality, for a lower price, with fewer workers.

Removing the artist is not "replacing the labor like the printing press".

No it is.

Why pursue any of the arts if they do not benefit humanity?

Because they look good enough for the web stories or RP I make

Ai generated images are not art.

Yes and no.

Currently you could say that ai is just efficiently guessing what we would want to see from pixel to pixel.

An artist may tune their style to be more similar to the art that they sold before in hopes of repeat buyers.

An AI looks at countless images and seeks out patterns which it refines. It mimics things and duplicates patterns.

An artist spends countless hours absorbed in the art of others to learn styles. Frequently they may mimic other works and iterate off of existing ideas.

Fan art, tracing, compositing - these are all things understood in the art community. If someone makes fan art of someone else's character does that invalidate their work as art?

AI invokes a reaction because it's getting "close." AI is receiving a lot of the same criticism that digital artists got for not using traditional mediums back in that technology's infancy.

Art is in the eye of the beholder. What defines art? Everything is relative. At present? AI is a tool. A bit unpolished and raw but so was CGI in the movie industry. Look how quickly that evolved.

AI could well be a tool for creating art in the future but as of yet it is not a tool I have ever seen to create anything I would consider art. Well, certainly not good art. Admittedly, every time I've been aware that it's been used at all it's because there are obvious AI errors present which make things look shit.

Without question. Early tablets and digital art couldn't hold a candle to traditional mediums. Even if the same artist created content for both. The tools are certainly rough.... but considering how young the technology is, and how far it has already come, I think we may soon arrive at a point where people may have issues distinguishing between the two.

Either way it's a fun topic to discuss. It's deeply interesting to see the variety of responses to it.

If nature carves a stone to look pretty, that's not art.

If a human carves a stone to look pretty, that's art. It has care and detail, it has something about humanity in it as it has a human behind it and everything that shaped them, shaped that stone.

It's that simple. AI can no more make art than the wind can.

I understand where you are coming from but to be fair the wind isn't using art as a reference. This is why I suggested it was a complex issue... and provided the examples that I did. There are quite a few similarities between ai models producing art and artists. Surely there are differences - but objectively speaking they do have quite a few similarities.

Art is specific to the beholder. Does what is before you evoke an emotional response? Was it produced for that purpose? If you provided paint and paper to an ape - would it be considered art? What about a child who has no concept of art?

From a non image perspective: music is art. Is a mashup music? What about other sample heavy music? Some people might argue that x genre isn't really music.

Back to prompt driven ai generated art: what if someone spent 70 hours tuning and modifying a prompt until the art fit their vision? 200 hours? What if they lacked the ability to draw or paint?

I genuinely don't believe this is a black and white issue. I do understand the implications of what ai tools have to the workforce - but that is a separate topic.

If the wind blows cut-up pieces of art magazines around and they land in a pile, that isn't art. It's just cut-up pieces of someone else's art.

If a person cuts up a magazine and pieces the parts together with intention and meaning, that can be art.

Art is not "I like this visually", art is not "you did this well." Art is human expression.

If the wind blows cut-up pieces of art magazines around and they land in a pile, that isn't art. It's just cut-up pieces of someone else's art.

I can't really agree with this example. I think you're suggesting the AI is completely independent of human expression and is completely random in its application of its training data (the cut up pieces I suppose?)

Generative AI is driven by a human prompt (description) and refined by further prompts which push the result in the direction of the prompter's vision.

If a person cuts up a magazine and pieces the parts together with intention and meaning, that can be art.

This is in essence what is occurring above. I view this process as someone being provided a chisel and a block of stone:

The sculpture is already complete within the marble block, before I start my work. It is already there, I just have to chisel away the superfluous material.

-Michelangelo

As I suggested above AI is a tool that makes accessing art and expression available to anyone. The AI is the chisel. They cut the stone with words. It isn't just random clipart being thrown around either: The 'stone' is the culmination of all of the art the model has 'seen.' It has taken that data and found the patterns that different styles contain. You might describe this as the distillation of human expression into something new.

The source is art - human expression

The prompt gives it form - human expression

Further prompts drive the form to fit the user's vision - human expression

There is intent and meaning.

Is it art in the traditional sense? Perhaps not in the same vein as ink and canvas but ... I believe, while it is certainly rough and unrefined, it can still be considered a tool to create art.

If you want, you can say that "prompt engineering" is an art. The act of engineering those prompts to get a picture, maybe that has a skill we might call art.

But no, the jpeg isn't art. It's a million cut-up images formed to make our monkey brains go "I enjoy".

Do you do this prompt engineering? The last time I had this conversation it turned out I was talking to someone that called themselves an artist because they put words into an ai.

Let's see if we can keep this civil, shall we?

First and foremost the model isn't compositing bits and pieces of other pictures - it's predicting what the next pixel should look like based on its training data. It is generating the image. In layman's terms: it's drawing based on what it has 'learned' by looking at other art. It's pretty interesting honestly.

I do have a background in art, though it is not my profession. Regardless of that- there are no requirements to create nor appreciate art.

A few good excerpts from wikipedia:

There is no generally agreed definition of what constitutes art, and its interpretation has varied greatly throughout history and across cultures.

Art can connote a sense of trained ability or mastery of a medium. Art can also refer to the developed and efficient use of a language to convey meaning with immediacy or depth. Art can be defined as an act of expressing feelings, thoughts, and observations.

Everyone is entitled to their opinions. Ours seem to differ- and that's fine. My views are simple: if someone can express themselves through a medium- it is a form of art.

Prompt engineering may be a form of expression, but an AI-generated image, composed by copying what the model saw previously and repeating it, is not art. It has no humanity in it. The brush strokes have nothing to say. It is really no different than the wind blowing about cut-up versions of other people's art.

If someone intentionally opens a window that allows the wind to blow in and shuffle up the cut-up art into a new image, that may be performance art, the act of opening the window. But the final result, which has no humanity to it, is not art. Will never be art.

Prompt engineering is no different than opening that window and then letting the ai wind shuffle up everything it knew over and over until you find an esthetic you like. It's not art. It will never be art. Because it fundamentally can not be art.

I know I expressed this already but the wind analogy doesn't work here. It isn't random nor undirected.

As far as copying goes - considering your staunch stance on what is and isn't art I think it's fair to say you have some involvement with it.

Regardless of the medium we all start the same way. Imitation. In traditional art we are trained by observing what the masses find pleasing. When we observe most artists' work we can identify these roots. Very few artists' art is not based in the works of those before them.

This article does a fine job of expressing the above.

AI assisted (generative) art is a tool that provides a user access to a compendium of learned styles. It lowers the barrier of entry to express yourself through art.

I posit that this is such a divisive topic because there is so little difference between how we learn and how these models do. It garners a lot of the same negativity that a prodigy might. "Why is it so easy for them when I worked so hard. They don't appreciate it as much as me."

In the end art belongs to nobody and everybody. Art is amorphous; formless. Art and artistic expression can exist anywhere- even here. I personally am not so high minded to gatekeep such a broad field.

Good luck keeping up that attitude as AI is advancing at this pace. You already can't tell them apart from human created images and it'll just keep getting better. Stop kidding yourself.

Art is not about how believable it is. It's not a gauge of believability that an ai made this or not. There is no Turing test for art.

If the natural state of technology is that there aren't enough jobs to sustain an economy, then our economic system is broken, and trying to preserve obsolete jobs is just preserving the broken status quo that primarily benefits the rich. Over time I'm thinking more and more that instead of trying to prop up an outdated economic system we should just let it fail, and then we have no choice but to rethink it.

Oh yes yes I'm sure that we will totally rethink our economic systems that's absolutely what will happen and it will totally result in the utopia you're dreaming of. I'm sure that will happen I'm sure it's not just the ultra wealthy noting how they can make even more profit whilst everyone else suffers can't be that I'm sure the government will do something we all have faith in that we know it's obvious that will happen

You think pushing the status quo is going to result in change? The sweet spot for the rich is to have everyone struggle while they enrich themselves, but not struggle so hard that it leads to an upheaval. We've tried patching up a broken system and it doesn't fix anything, it just slows the decline. I think an upheaval is the only answer, dunno when we'll hit the breaking point, but it will happen, it's inevitable. For the economy to fundamentally change it will require it becoming completely impossible to survive in the existing economy, otherwise nobody would want to risk a fundamental rethink of how things work.

Ah yes, this definitely won’t have any negative ramifications.

The quality is really superior to what was shown with Lumiere. Even if this is cherry picking it seems miles above the competition.

I can't understand how the shadows and reflections are so accurate (not perfect, but convincing) like here or here.

The first one is easy as you don't need coherence between reflected and non-reflected stuff: only the reflection is visible. The second one has lots of inconsistencies: it works kinda well if the reflected thing and reflection are close together in the image, and it does tend to copy over uniformly-coloured tall lights, but OTOH it also invents completely new things.

Do people notice? Well, it depends. People do notice screen-space reflections being off in traditional rendering pipelines, not always, but it happens and those AI reflections are the same kind of "mostly there in most situations but let's cheap out to make it computationally feasible" type of deal: Ultimately processing information, tracking influence of one piece of data throughout the whole scene, comes with a minimum amount of required computational complexity and neither AI nor SSR do it.

Yeah, we won't be needing proper raytracing with this kind of tech. It's mind-blowing.

After seeing the horrific stuff my demented friends have made DALL-E barf out, I’m excited and afraid at the same time.

The example videos are both impressive (insofar as they exist) and dreadful. Two-legged horses everywhere, lots of random half-human-half-horse hybrids, walls change materials constantly, etc.

It really feels like all this does is generate 60 DALL-E images per second and little else.

For the limitations visual AI tends to have, this is still better than what I've seen. Objects and subjects seem pretty stable from frame to frame, even if those objects are quite nightmarish.

I think "Will Smith eating spaghetti" was only like a year ago.

This would work very well with a text adventure game, though. A lot of them are already set in fantasy worlds with cosmic horrors everywhere, so this would fit well to animate what's happening in the game

I mean, it took a couple months for AI to mostly figure out that hand situation. Video is, I'd assume, a different beast, but I can't imagine it won't improve almost as fast.

It will get better, but in the meantime you just manually tell the AI to try again or adjust your prompt. I don't get the negativity about it not being perfect right off the bat. When the magic wand tool originally came out, it had tons of jagged edges. That didn't make it useless, it just meant it did a good chunk of the work for you and you just needed to manually get it the rest of the way there. With Stable Diffusion, if I get a bad hand I just inpaint and regenerate it until it's fixed. If you don't get the composition you want, just generate parts of the scene, combine them in an image editor, then have it use that as a base image to generate on top of.

They're showing you the raw output to show off the capabilities of the base model. In practice you would review the output and manually fix anything that's broken. Sure, you'll get people too lazy to even do that, but non-lazy people will be able to do really impressive things with this even in its current state.

YouTube is about to get flooded by the weirdest meme videos. We thought it was bad already, we ain't seen nothing yet.

If this goes well, future video compression might take a massive leap. Imagine downloading 2-hour movies with just a 20 KB file size because it's just a bunch of prompts under the hood.

This would be the most GPU intensive compression algorithm of all time :)

And the largest ever decoder, since it'll need the whole model to work. I'm not particularly knowledgeable on AI but I'll assume this will occupy hundreds of gigabytes, correct me if I'm wrong there. In comparison, libdav1d, an AV1 decoder, weighs less than 2 MB.
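
To put rough numbers on that, here's a back-of-the-envelope sketch in Python. The 20 KB prompt file and the 2 MB libdav1d bound come from the comments above; the 200 GB model size and the 2 GB size for a conventionally encoded movie are purely assumed placeholders, not known figures for Sora or any real encode.

```python
import math

KB = 1000
MB = 1000 ** 2
GB = 1000 ** 3

prompt_file_bytes = 20 * KB      # figure quoted in the thread
libdav1d_bytes = 2 * MB          # upper bound quoted in the thread
assumed_model_bytes = 200 * GB   # ASSUMPTION: stand-in for "hundreds of gigabytes"
assumed_movie_bytes = 2 * GB     # ASSUMPTION: a conventionally encoded 2-hour movie


def decoder_ratio(model_bytes: int, decoder_bytes: int) -> float:
    """How many times larger the model 'decoder' is than a classic decoder."""
    return model_bytes / decoder_bytes


def movies_to_break_even(model_bytes: int, movie_bytes: int, prompt_bytes: int) -> int:
    """Movies you'd need to download before the smaller files offset the model download."""
    saved_per_movie = movie_bytes - prompt_bytes
    return math.ceil(model_bytes / saved_per_movie)


ratio = decoder_ratio(assumed_model_bytes, libdav1d_bytes)
break_even = movies_to_break_even(
    assumed_model_bytes, assumed_movie_bytes, prompt_file_bytes
)
print(f"model is ~{ratio:,.0f}x the size of libdav1d")
print(f"break-even after ~{break_even} movies (bandwidth only, ignoring GPU cost)")
```

Under those assumptions the download only pays for itself after on the order of a hundred movies, and that's before counting the compute needed to "decode" each one.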

If you randomize the seed it'll be a different render of the movie every time.
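
A toy sketch of that seed point, in Python. `toy_render` is a made-up stand-in that just draws pseudo-random numbers as fake "frames"; it is not how Sora or any real video model works, it only illustrates that the same prompt plus the same seed reproduces the same output while a different seed does not.

```python
import random


def toy_render(prompt: str, seed: int, frames: int = 8) -> list[int]:
    """Hypothetical stand-in for a video model: returns fake 'frames' as integers."""
    # Seeding with prompt + seed makes the output a pure function of both.
    rng = random.Random(f"{prompt}|{seed}")
    return [rng.randrange(256) for _ in range(frames)]


prompt = "a cat walking through a museum"

# Fixed seed: the "movie" comes out identical every time.
same_a = toy_render(prompt, seed=42)
same_b = toy_render(prompt, seed=42)

# Randomized seed: almost certainly a different "cut" on each run.
random_seed = random.SystemRandom().randrange(2 ** 32)
other = toy_render(prompt, seed=random_seed)

print("fixed seed reproducible:", same_a == same_b)
print("random seed matches fixed:", other == same_a)
```

So a prompt-only "file" would also need to ship the seed (and pin the exact model version) if everyone is supposed to see the same movie.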

"But you haven't seen the ultimate limited edition fan version action cut of the director's cut"

Sounds like you already saw Madame Web

Looks good but still has the AI hallmarks: rotating legs, f’ed up gait... Impressive though, and it’s going to be wild to see what results from this latest pox on the tubes.

Imagine VR with an AI-generated world. It would be Ready Player One IRL.

The compute power it would take to do that in realtime at the framerates required for VR to be comfortable for two separate perspectives would be absolutely beyond insane. But at the rate hardware improves and the breakneck speed these AI models are developing maybe it's not as far off as I think.

An AI-generated VR world would be a single map environment generated in the same way you wait at loading screens when a game starts or you move to an entirely new map.

A text-to-3D game asset AI wouldn't regenerate a new 3D world on every frame, in the same way you wouldn't ask AI to draw a picture of an orange cat and then ask it to draw another picture of an orange cat shifted one pixel to the left if you wanted the cat moved a pixel. The result would be a totally different picture.

I think we're talking about different kinds of implementations.

One being an AI-generated 'video' that is interactive, generating new frames continuously to simulate a 3D space that you can move around in. That seems pretty hard to accomplish for the reasons you're describing. These models are not particularly stable or consistent between frames. The software does not have an understanding of the physical rules, just how a scene might look based on its training data.

Another and probably more plausible approach is likely to come from the same frame generation technology in use today with things like DLSS and FSR. I'm imagining a sort of post-processing that can draw details on top of traditional 3D geometry. You could classically render a simple scene and allow AI to draw on top of the geometry in realtime to sort of fake higher levels of detail. This is already possible, but it seems reasonable to imagine that these tools could get more creative and turn a simple blocky undetailed 3D model into a photo-realistic object. Still insanely computationally expensive, but grounding the AI with classic rendering to stabilize its output could be really interesting.

I recently played a game where people found immortality and each individual just lived in their own personal virtual reality for thousands of years. It's kinda creepy seeing the recent advances in technology today lining up to that, minus the immortality part.

What game was that?

It's a spoiler to reveal the game so...

SPOILER: Sorry, I don't know how to do spoiler tags on this app but I'm referring to the antagonists in Horizon Forbidden West. Here's another sentence just to help hide the game for anyone scrolling by.

Her legs rotate around themselves and flip sides at 16s in. It's still very impressive, but ...yeah.

Wow didn't see that the first time

This is a base model. Just because it's 90% there on its own doesn't mean you can't improve on it by adding extra safeguards. For example, you can get LLMs to be more accurate by asking another LLM to proofread the work. I am frankly amazed that the base models are this good to begin with. I was totally expecting to need way more safeguards from the get-go, but we're getting a lot even without them. But I fully expect there to be AI tools that are specialized to identify where the base model messes up and then correct it.

Shit posting 2.0 is here fellas

The cat video is funny, the cat has 5 legs :D

Seeing the 5 legged cat was the moment I started to believe this stuff really was AI generated.

I'm really impressed by the demo, but yes, let's see how well it works when it's made public.

People who don't think AI will take a lot of jobs may have to rethink...

This is the best summary I could come up with:

Sora is capable of creating “complex scenes with multiple characters, specific types of motion, and accurate details of the subject and background,” according to OpenAI’s introductory blog post.

The company also notes that the model can understand how objects “exist in the physical world,” as well as “accurately interpret props and generate compelling characters that express vibrant emotions.”

Many have some telltale signs of AI — like a suspiciously moving floor in a video of a museum — and OpenAI says the model “may struggle with accurately simulating the physics of a complex scene,” but the results are overall pretty impressive.

A couple of years ago, it was text-to-image generators like Midjourney that were at the forefront of models’ ability to turn words into images.

But recently, video has begun to improve at a remarkable pace: companies like Runway and Pika have shown impressive text-to-video models of their own, and Google’s Lumiere figures to be one of OpenAI’s primary competitors in this space, too.

It notes that the existing model might not accurately simulate the physics of a complex scene and may not properly interpret certain instances of cause and effect.

The original article contains 395 words, the summary contains 190 words. Saved 52%. I'm a bot and I'm open source!

I'm pretty sure that's a model tho.

The demo looks pretty good, yes - but I won't believe it 'till I try it!

Shit is going too far, as expected, and governments don't give a fuck about societies. Only in the EU are there a few humane moves.

Who's benefiting from this? Why is this even a fucking thing?

The most obvious, immediate use is better CGI in shows and movies. Personally, I like to be entertained, so I consider myself as benefitting from this.

The less immediate use is AI with an understanding of time and space, real world physics, cause and effect,...

Just for starters, imagine your TV commercials are fully customized to what you are triggered by and what you like...

Beyond that, they won't even need commercials... the actual content you're viewing will have advertising and product placement embedded.

If you can't think of a beneficial use for this stuff, don't worry, plenty of others will.

That doesn't really sound beneficial to me

Sounds beneficial to the ad company

Don't worry, plenty of others will apparently...

How is that good?

I salute you for not adding the /s. Steady under fire like a good soldier.

This can only be bad for artists, and if you are happy about it you are a fascist.

I honestly can't see this replacing anyone. This is the equivalent of stock footage. It's just gonna replace Shutterstock, I guess.