A conversation with the director of the award-winning documentary Age of AI and the AI film Seeing Red
A little background: Juan is a formidable director, designer and animator. I’ve known him for many years and we’ve worked together many times in the past. He has since moved to LA, and we’ve stayed in touch.
In late 2022, Juan directed the film “Age of AI and Our Human Future,” which brought to the screen a discussion between Eric Schmidt (ex-CEO of Google), Daniel Huttenlocher (Dean of the MIT Schwarzman College of Computing) and Henry Kissinger on the future and implications of AI. While the film centered on broader AI implications like robotics and future employment issues, Juan chose to bring this important conversation to life using generative AI in many of the sequences. You can see the film here.
Juan, how did this documentary come about? The production company Sandblast brought me on. Honestly, I don’t think they were aware of the AI explosion that was about to happen in 2022. The AI capabilities were literally coming to life in front of my eyes. I found that whatever I was doing was becoming obsolete – in a matter of days!
One thing I found is that images and video generated with AI can look beautiful, but AI keeps offering me things that are not what I’m asking for, and you have to fight it. You have to put in very specific parameters or it’s not going to work. Right now, AI is a little stubborn.
How did you work with the prompts? Well, first, the order of the prompts is very important. Strangely, you can write the same prompt an hour later and you’ll get a totally different result. AI is even better at certain times of the year or moments in the day!
Earlier this year, Juan made a film called “Seeing Red” that was created entirely with generative AI. For this 5-minute piece, viewable on Juan’s Instagram, he generated over 100 separate shots – a significant step in creating original video using generative AI. The piece is pre-Sora (which is still to be released), and what it took for Juan to create this dazzling array of images is the subject of our hour-long interview.
Astonishingly, this entire piece took just 3 days. When you step back and think about achieving such a complex film any other way, the days would stretch into weeks.
Juan, why did you decide to make this film? My journey started in the strangest way. I initially planned to create a film inspired by Goya’s painting “Saturn Devouring His Son.” This dark and grotesque artwork has always captured my imagination. One day, as I was thinking about this project, I noticed a red plastic bag dancing in the air. Intrigued, I grabbed it and started filming the bag in various scenarios, captivated by its movement and the vividness of its color.
As is often my process, I put the project to sleep for about three years. When I revisited it, I realized that with the advent of AI-generated images and video, I could finally bring this project to life in a new and exciting way. While I didn’t end up using any of the original material, the core idea evolved, and I came to feel that we humans are a bit like Saturn – consuming and being consumed. This blend of traditional artistic influence and modern technology shaped “Seeing Red” into what it is now.
What was your general approach? Because AI is a moving target and keeps improving, my general approach was not to be too precious - otherwise it would only be perfect in some infinite future! So the piece was really a collection of ideas. Sometimes to get the right shot you need at least 30 tries, so you need a lot of iterations. I think we need to understand that in generative AI, words have value. So it’s interesting that talking and conveying ideas clearly matters again – we’re all going to have to become better talkers!
What AI programs did you use? Stable Video Diffusion was very good with camera moves. Pika was the most surprising, and RunwayML was the most reliable of them all. I also used Topaz, as well as Midjourney, Gemini and DALL-E 3.
Tell me about some of the techniques you used. I sometimes used different AI programs on the same shot. Also, we visual artists are not necessarily the best copywriters, so I used image references. I find a visual reference and then spend a lot of time creating the perfect reference in Photoshop – it’s almost an art form in itself. With programs like Midjourney or Dream Studio, you can give them a reference and dial up how close or far you want the result to be from the reference image.
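For readers who want to try this reference-plus-prompt approach, here is a minimal sketch of the same idea using Stable Diffusion’s image-to-image mode via Hugging Face’s diffusers library. This is not Juan’s exact workflow – the model ID, file names and parameter values are illustrative – and Midjourney or Dream Studio expose the equivalent control as an image-weight setting.

```python
# A minimal sketch of the image-reference technique using Stable Diffusion's
# image-to-image mode (Hugging Face diffusers). Model ID, file names and
# parameter values are illustrative, not a record of Juan's actual setup.
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # any Stable Diffusion checkpoint works here
    torch_dtype=torch.float16,
).to("cuda")

# The reference image carefully built in Photoshop.
reference = Image.open("red_bag_reference.jpg").convert("RGB").resize((768, 512))

result = pipe(
    prompt="a red plastic bag drifting through an empty city street, cinematic lighting",
    image=reference,
    strength=0.4,        # the "dial": lower stays close to the reference, higher drifts further
    guidance_scale=7.5,  # how strongly the text prompt steers the result
).images[0]

result.save("red_bag_generated.png")
```

The strength parameter plays the role of the dial Juan describes: a low value keeps the output close to the Photoshop reference, while a higher value lets the text prompt take over.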
How much work did you do in post? Every shot has been treated – especially the color – along with varying the speed and removing artifacts from each shot. I used After Effects, Photoshop and Cinema 4D.
How do you see the future of AI in production – especially as Sora and the other applications from Google and Microsoft get released? It’s not so much about what AI can do – which is limitless – I’m much more interested in how we as humans digest it. People will initially be amazed at what they can do and even get drunk on the process and results. That’s going to reduce the visual impact, so the way to engagement is for us humans to have a say in the outcome. As generative video AI gets faster, it will become near real-time (audio is already there) and video will become more like a videogame for the audience.
What type of artist is going to be most effective in this area? My superpower is that I can actually convey ideas through drawings. You can even put in a prompt “I want an award-winning photograph”. In the end, what makes you effective is the collection of choices that you make. Those choices are consistent and have a specific mission and objective that is going to result in a really good outcome. For the video I made, anybody could come up with these images, but it’s the way I put it together, the choice of music, the choice of color, etc. – all those choices are mine.
One thing that’s particularly beautiful about working in Discord with Midjourney is that you can see everybody around the world coming up with their own prompts, which can help you craft your own. It’s quite humbling.
I think a key part of every artist’s day is to allocate 20% of your time to learning.
Juan Delcan, thank you! I have posted this article on my blog as well as on Medium and have included links to all the programs and applications discussed here. There are growing communities of AI filmmakers. As the new generation of applications and capabilities gets released, films like the two Juan created will be markers in the rapid development of generative AI film. How a visual artist like Juan can bend AI to his artistic will demonstrates a pathway to integrating generative AI and human creativity.
Peter Corbett
LINKS
· Age of AI and Our Human Future
APPLICATIONS
· DALL-E 3
· RunwayML
· Sora (pre-release)
· Topaz
· Pika
· Gemini
· ChatGPT
Last week Anthony Vagnoni interviewed me for the Simian blog. Here's the link: https://www.gosimian.com/blog/profiles-features/Peter-Corbett-Takes-a-Close-Look-at-AI
Meanwhile, think about this:
Moore’s Law famously states that the number of transistors on a microchip will double every 2 years.
AI is doubling in power and capability every 6 months. Thus, with Sora and ChatGPT5 launching by late summer and Adobe’s Firefly for video launching later this year,
· AI will be twice as capable by this October.
· Four times more capable by this time next year.
· Eight times more capable by the end of next year.
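A quick back-of-the-envelope check of that math, as a minimal Python sketch – the six-month doubling period is this newsletter’s assumption, not a measured figure:

```python
# Back-of-the-envelope sketch of the doubling math above. The six-month
# doubling period is the newsletter's assumption, not a measured figure.
DOUBLING_PERIOD_MONTHS = 6

def capability_multiplier(months_from_now: float) -> float:
    """How many times more capable AI would be after the given number of months."""
    return 2 ** (months_from_now / DOUBLING_PERIOD_MONTHS)

for months in (6, 12, 18):
    print(f"In {months} months: {capability_multiplier(months):.0f}x as capable")
# In 6 months: 2x as capable
# In 12 months: 4x as capable
# In 18 months: 8x as capable
```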
Right now, clients and even their agencies are very nervous about using AI. Some of this is (legitimate) copyright fear. BUT just wait for the first all-AI, mind-blowing creative TV spot to make its inevitable impact on the industry, and the gold rush will be on!
There’s been much wailing and gnashing of teeth in the advertising production and post industry. The twin threats of growing in-house agency production and AI have caused some to say, “It’s just a cycle that will swing back, as it always has,” while others lament, “It was a fun business while it lasted!”
Over the past 8 months or so, I’ve been attempting to gauge how our industry is going to shake out in the future. For those who don’t know me, I’m a long-time DGA director and DP. I founded or co-founded several production and post companies including Click 3X, Sound Lounge, Heard City, ClickFire Media and Bonfire. Recently I worked with a Madrid-based company to develop their virtual cloud-based workstation business.
Several months ago, I began helping a creative editorial company research, in very specific terms, where AI can be instrumental in post. This was less about generative AI - creating images and video from a set of prompts - and more about how AI can work in post right now.
Literally every day, I scan the half-dozen new AI apps that appear, then check out and test those that impact media production. Here’s what I see.
There are new applications, as well as existing tools that have been AI-infused, that dramatically alter how we should approach post. From color correction to beauty retouching to rotoscoping to editing to audio recording to stock footage selection and manipulation to CGI to design to all manner of VFX to animation - even the way production is managed - everything has changed. Every skill listed here has an AI-fueled solution that requires far less craft to execute.
The AI capabilities in post flow directly to production. If AI can up-res shots, create slow motion, clean up faces, add lighting, remove unwanted objects, replace skies, automatically rotoscope anything and replace entire environments, then “We’ll fix it in post” will be replaced by “We’ll reconstruct it all with AI.” The production cost savings on many shoots could be seismic.
As an industry, we have relied on brilliant original creative inspiration to concept, direct, design and sell the treatment, and then on an entire host of craftspeople to carry out the execution. We make some money on the original concept, but we make most of it on the execution, including versioning, client changes, etc. AI replaces much of this less expensively, more efficiently and, most important, without the investment in capital, infrastructure and teams that form the backbone of so much of our business.
So, who owns these AI processes and prompt skills? The holding companies are investing hundreds of millions in AI initiatives - including acquisitions - in order to own this. They too are in a race with their clients to make sure they remain relevant. Two-thirds of brands already have in-house agencies, which are also well positioned to invest in these AI systems.
While other technologies have taken years to filter through the production process, AI is moving so incredibly fast that new uses and applications will come on the market fully developed in months – not years. Nvidia just announced delivery of its new AI superchip, Blackwell, which packs 208 billion transistors and which Nvidia says is up to 30 times faster than its previous most advanced chip. So, whatever you think of Sora, imagine AI video production 20 or 30 times more advanced. BTW, Sora, along with ChatGPT5, is set to launch in the market as early as this fall.
In conclusion, my view is that the business of production/post for agencies and brands will inevitably shrink significantly. There are still thousands of clients and agencies who will not have near-term access to this new wave, and TV commercials will continue to thrive as an industry as more and more streamers become ad-supported. But the writing (voice prompt?) is on the wall, and entire craft categories and careers will virtually disappear. I wouldn’t want to count on making a living as an extra or a rotoscope artist!
I do think that, as best we can, the production/post industry needs to familiarize itself with and USE the new tools – not just experiment with them. This won’t stop the inevitable, but it will postpone the coming storm.
In future postings, I’ll list some of the developments and tools that I think directly impact production and post.
Peter Corbett
While everyone is focused on generative AI’s ability to create incredible video, useful AI can be found in the applications we have in our toolbox right now.
Firstly, I believe that in-house agency production and in-house brand agencies are not likely to have any significant advantage in the implementation of visual generative AI. Agencies are spending hundreds of millions on AI, but much of this investment relates to targeting, media and consumer research. When it comes to Runway ML, Midjourney, Pika, etc., everyone is generating from the same underlying models. While these agencies may control the work and continue to steer projects in-house, they will have little executional advantage.
The breathtaking speed of progress simply demands that production and post shops broadly embrace AI. Four areas:
1. Production – live-action filming. AI capabilities that are readily accessible and offer potentially huge production efficiencies are embedded in the software we use now. Blackmagic’s Resolve is a good example; Topaz ($299) is another. These programs use AI to up-res footage to 8K, interpolate frames to create slow-motion shots without high-speed camera equipment, create actual depth maps - even add lighting effects where none existed. The potential savings and efficiencies, once a production company filters these new capabilities down to the shoot itself, will be significant.
2. Post. AI can not only rotoscope but also replace environments - even with dynamic camera moves. Color correction, beauty retouching, sky replacements, etc. - which can be a substantial component of a budget - can be achieved at far lower cost than before. Adobe’s new version of Premiere, being released this year, will enable the editor to magically extend scenes, eliminate objects within a shot and even add and replace objects with AI - all within the edit timeline. Since Premiere will also integrate with generative AI platforms like Sora, Runway ML and Pika, a substantial part of creating a film could be in the hands of the editor.
3. Stills to video. The capability of AI programs like Runway ML to take an existing still image and create motion opens a huge opportunity to build new shots and sequences. Imagine a product shot that would normally require a setup with a full camera crew: now you can start from a still and literally bring the image to cinematic life - add foliage rustling in a breeze, cloud movement, a dynamic camera move, drops of condensation rolling gently down the side of the product, dynamic lighting effects, etc. All possible from a single still! (A minimal code sketch of this stills-to-video step follows this list.)
4. Creating new shots and sequences from scratch. Sora is being released in a few months as part of ChatGPT5. Right now, working with Sora is a little like a roulette wheel: you enter the prompts, never sure exactly what you’re going to get, and trying to noodle and finesse can result in frustrating random changes. However, this is where OpenAI is focusing its efforts to improve these programs. Adobe’s Firefly is a good example of such improvement: you can now dictate focal length, f-stop, depth of field, lighting style, etc.
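To make the stills-to-video idea in area 3 concrete, here is a minimal sketch using Stable Video Diffusion via Hugging Face’s diffusers library. This is not the Runway ML workflow itself, and the file names and parameter values are placeholders.

```python
# A minimal stills-to-video sketch using Stable Video Diffusion (diffusers).
# File names and parameter values are placeholders, not a production recipe.
import torch
from diffusers import StableVideoDiffusionPipeline
from diffusers.utils import load_image, export_to_video

pipe = StableVideoDiffusionPipeline.from_pretrained(
    "stabilityai/stable-video-diffusion-img2vid-xt", torch_dtype=torch.float16
).to("cuda")

# The existing product still, resized to the resolution the model expects.
still = load_image("product_shot.png").resize((1024, 576))

frames = pipe(
    still,
    decode_chunk_size=8,      # decode a few frames at a time to limit VRAM use
    motion_bucket_id=127,     # higher values add more motion to the clip
    noise_aug_strength=0.02,  # how far the clip may drift from the still
).frames[0]

export_to_video(frames, "product_shot_motion.mp4", fps=7)
```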
Since Sora can build spectacular shots and full sequences that would otherwise be impossible or extremely expensive, integrating this immeasurable production value into a story is now available not only to the filmmaker, but also to the editor, the client and even the team cranking out an Instagram brand post! There’s a revealing interview on Sora with Mira Murati, OpenAI’s CTO, in The Wall Street Journal (subscription required). In the interview, the question was raised of where the videos used to train the model came from. OpenAI has a deal with Shutterstock, but the issue of copyright safety is going to loom large.
I know that as soon as I put out this piece, there’s a high likelihood it will almost immediately be outdated. There is now an entire community of AI filmmakers who are producing all-AI films. Out of this community will emerge directors and editors who will inevitably filter into the commercial production business. OpenAI opened up Sora to several directors to try making short films using only prompts to generate the shots. I’d like to share one example: “Air Head” by Walter Woodman of Shy Kids. You can also check out the behind-the-scenes here.
Finally, to the many of you already experimenting with and using AI tools: I would encourage you to comment and share your own experiences and advice.