So just to be clear, the technical postmortem is very interesting. I’d never heard of hyperframes [1] either, so I’ll have to check that out.
But (and again, I know this isn’t the point of the article) the big problem is that it’s practically a bingo card of all the standard AI catchphrases: the whole “This isn’t X, it’s Y” thing (“This isn’t just food, it’s a promise”), followed by short strings of little one-liners like “one customer, one box, zero mistakes.”
And of course it went for middle-of-the-road neo-noir cyberpunk, and to hammer home the "cyberpunk aspect" it threw in some rather unnatural-sounding Chinese/Japanese? (未才東京...?)
For anyone with even a passing familiarity with AI, it’s an absolute banshee's wail of LLM-style prosaic banality.
[1] - https://github.com/heygen-com/hyperframes
Glad you enjoyed the technical postmortem!
I agree with what you said, but I think it's reductive for a few reasons:
- It was (and still is) amazing to me that GPT Image 2.0 was this good at making a coherent manga page in one shot.
- This project was focused mainly on the animation pipeline, not the manga page content.
- As with all things AI, it is up to the user's creative direction to make the output good. I did not give much creative direction at all to the example manga page, thus the AI catchphrases. With some guidance, this seems to be a very promising way to get your manga ideas onto "paper" quickly if you desire.
Yeah, I agree. I think specificity is key: if you leave an LLM, or any AI model, to its own devices, it gravitates to the figurative "mean of its training data," which tends to be pretty sterile.
I actually have some examples of my own comics that I used to put gpt-image-2 through its paces, and the results were surprisingly good as well:
https://mordenstar.com/other/gpt-2-comics