Is Stable Diffusion 2.0 a Step Backward from 1.5? Prompt Experiments Give You the Truth

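Stability.ai released the Stable Diffusion 2.0 model a little over a week ago. It is the biggest update since Stable Diffusion 1.4 came out in August. But amid fierce competition among AI image generation models, the community does not seem to be buying it. SD 2.0 has been widely mocked on Reddit: people complain that prompts written for older SD versions not only stop working in 2.0 but produce visibly worse results, with distorted, jumbled anatomy and odd textures. Put it next to the crowd-pleasing, low-barrier Midjourney v4 and it looks like a nightmare.
The community has even floated a "conspiracy" theory: the open-source 2.0 model that was released first is a very basic version put out by Emad / the SD team, who also have an art-oriented hypernetwork/model set that will not be made public and will instead be reserved for their own commercial service DreamStudio or sold through an API. If the community wants the good stuff, it will have to finetune for itself.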

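My first impression of SD2 was much the same as the community's: considerable frustration and disappointment. Few of my treasured old prompts produced anything worth looking at. But after dropping my old habits and running several sets of prompt experiments, my confidence came back, and I found many highlights and strengths in Stable Diffusion 2.0.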
fine-art photography of a Clear crystal cube, floating highly on the sky, the tumultuous sea, Arctic Ocean, sunset, magic time, HDR, Minimalism, artistic, atmospheric, Centered symmetrical composition, conceptual design, futuristic, cinematic, hyper-detailed, 8K wallpaper -H 960 -S 3033668822
fine-art landscape and nature photography of ocean, Stunning Photos of breaking Ocean Wave, close-up view, High-speed photography, HDR, artistic, Minimalism Photography, cloudy sky, magic time, sunset, golden shining, atmospheric, depressing, masterpiece, golden ratio composition, 8K, wallpaper, -H 1024 -S 6820342731
fine-art Photography a beautiful eye, blue and golden pupil, super close-up view, dark clear background, Minimalism, artistic, atmospheric, masterpiece, HDR, golden ratio composition, hyper-detailed, 500px, -W 960 -S 4972926877
fine-art underwater photography of swiming pool, Stunning Photos of running horse underwater, HDR, artistic, magic time, atmospheric, masterpiece, golden ratio composition, 500px, 8K, wallpaper, -H 1024 -S 9854093032
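Below are the results and lessons from roughly 4 hours of experiments. For generation I used the Discord bot of DFserver, a distributed backend AI pipeline server built on Huggingface Diffusers that my partner @virushuo and I developed together.
Every image in this article comes with its prompt and seed (see the image caption). All of them are my own originals, and you are welcome to reproduce them and explore further from there. Note that I used diffusers with the 2.0 model, so the same seed may give a different result on DreamStudio.
All results are generated purely from prompts: no init image, no post-processing, and no negative prompt (using one might make things even more fun).
Generation parameters for all images:
  • CFG scale: 9
  • Steps: 25
  • Size: 768 * 1024 or 768 * 960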
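As a rough sketch of how a caption from this article could be replayed with plain Huggingface Diffusers, the snippet below splits a caption into prompt text and flags and maps them to pipeline arguments. It assumes that -H, -W and -S in the captions mean height, width and seed, that a side which is not given stays at 768 px, and that the public checkpoint is named stabilityai/stable-diffusion-2. The helpers parse_caption and generate are hypothetical and not part of DFserver; a caption without -S simply gets a random seed.

```python
import re

# Assumed meaning of the caption flags used in this article (not taken from DFserver docs).
FLAG_TO_ARG = {"H": "height", "W": "width", "S": "seed"}
DEFAULT_SIDE = 768  # assumption: the side that is not given stays at 768 px


def parse_caption(caption):
    """Split 'prompt text -H 960 -S 123' into (prompt, settings dict)."""
    settings = {"height": DEFAULT_SIDE, "width": DEFAULT_SIDE, "seed": None}
    for flag, value in re.findall(r"(?:^|\s)-([HWS])\s+(\d+)", caption):
        settings[FLAG_TO_ARG[flag]] = int(value)
    # Remove the flags, then trim spaces and a trailing comma from the prompt text.
    prompt = re.sub(r"(?:^|\s)-[HWS]\s+\d+", "", caption).strip().rstrip(",").strip()
    return prompt, settings


def generate(caption, model_id="stabilityai/stable-diffusion-2", device="cuda"):
    """Render one caption with the article's settings (CFG scale 9, 25 steps)."""
    # Heavy imports stay inside the function so the parser works without a GPU.
    import torch
    from diffusers import StableDiffusionPipeline

    prompt, s = parse_caption(caption)
    pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
    pipe = pipe.to(device)
    generator = None
    if s["seed"] is not None:
        generator = torch.Generator(device).manual_seed(s["seed"])
    result = pipe(
        prompt,
        height=s["height"],
        width=s["width"],
        num_inference_steps=25,
        guidance_scale=9,
        generator=generator,
    )
    return result.images[0]
```

Calling generate(...) with one of the captions should give an image in the same spirit, but it may not match pixel for pixel, since the scheduler, the numeric precision, and the device on which the seed is drawn all affect the output.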
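The biggest improvement in SD2.0 is that the base model offers a higher resolution (up from 512 to 768 px) and reaches very good results in fewer steps (down from 50 to 25). Image quality and richness of detail have also improved markedly. Especially striking is its handling of light sources, shadows, cast shadows, diffuse and environmental reflections on object surfaces, and depth of field, which to my eye surpasses every model currently on the market.
Take the three images below of transparent crystals over the sea: look at how the orange light of the setting sun forms beautiful reflections and refractions on the water and on the surface and interior of the crystal, how it acts differently on a highly transparent crystal and a translucent block of ice, and how accurately the spherical distortion is rendered on the transparent crystal ball.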

fine-art photography of a Clear crystal cube, reflecting the seaface, floating on the tumultuous sea, Arctic Ocean, sunset, magic time, by Andreas Rocha, Minimalism, artistic, atmospheric, masterpiece, golden ratio composition, hyper-detailed, 8K wallpaper -H 960 -S 1717647526
fine-art photography of a small Clear crystal ice cube, floating above the horizon, the tumultuous sea, Sea floating with broken ice, sunset, magic time, HDR, Minimalism, artistic, atmospheric, Centered symmetrical composition, conceptual design, futuristic, cinematic, hyper-detailed, 8K wallpaper -H 960 -S 3033668822
fine-art photography of a Clear crystal ball, floating highly on the sky, the tumultuous sea, Arctic Ocean, sunset, magic time, HDR, Minimalism, artistic, atmospheric, Centered symmetrical composition, conceptual design, futuristic, cinematic, hyper-detailed, 8K wallpaper -H 960 -S 3033668822
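The next set of experiments generates underwater scenes. Underwater rendering and water simulation are crown-jewel-level difficult in CG, and I was amazed that AI generation can get this far. Leaving aside the complex lighting and the reflections off the ripples, in the image of the horse running underwater you can even see the effect of buoyancy.
You may feel that the results so far lean toward oversaturation, a bit too HDR. That can still be addressed by adjusting the prompt, using a negative prompt, or doing some post-processing to pull down the curves or the saturation.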
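To make the post-processing option concrete, here is a minimal sketch that pulls down the saturation of a single RGB pixel using Python's standard colorsys module. The function name desaturate and the factor 0.8 are illustrative assumptions; in practice the same idea would be applied to a whole image with an image library or a photo editor.

```python
import colorsys


def desaturate(rgb, factor):
    """Scale the HSV saturation of one 8-bit RGB pixel by `factor` (0.0 to 1.0)."""
    r, g, b = (channel / 255.0 for channel in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    r2, g2, b2 = colorsys.hsv_to_rgb(h, s * factor, v)
    return tuple(round(channel * 255) for channel in (r2, g2, b2))


# Example: tone down a hot sunset orange by 20 percent (illustrative values).
print(desaturate((255, 120, 0), 0.8))
```

A factor of 1.0 leaves the pixel unchanged and 0.0 removes all color, so values slightly below 1.0 are the range to try against an over-HDR result.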
fine-art underwater photography of swiming pool, Stunning Photos of running horse underwater, HDR, artistic, magic time, atmospheric, masterpiece, golden ratio composition, 8K, wallpaper, -H 1024 -S 6548921907
fine-art underwater photography of swiming pool, Stunning Photos of a smiling baby swiming underwater, High-speed photography, HDR, artistic, Minimalism Photography, magic time, sunset, golden shining, atmospheric, depressing, masterpiece, golden ratio composition, 8K, wallpaper, -H 1024 -S 2625550821
fine-art underwater photography of Sea Palace underwater, HDR, artistic, magic time, atmospheric, masterpiece, golden ratio composition, 8K, wallpaper, -S 8516022692
fine-art underwater photography of swiming pool, Stunning Photos of Sea Palace underwater, HDR, artistic, magic time, atmospheric, masterpiece, golden ratio composition, 8K, wallpaper, -H 1024 -S 7506193104
stunning fine-art underwater photography of a sunken pirate ship, close-up view, the Tyndall effect,HDR, artistic, magic time, atmospheric, masterpiece, golden ratio composition, brown and Ultramarine color, 8K, wallpaper, Fantasy style -W 1024 -S 2567103212
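Very few SD 1.5 prompts survive being copied straight into 2.0, so prompt engineering for SD2 may call for a different approach. Clearly, prompts that are too short or too long both work poorly in SD2. You cannot get a beautiful result from something as short as "Fire fox chibi", as you can in Midjourney v4; nor do you need the previously common practice of piling up "modifiers" or "reference artists" to roll the dice on a mixed-platter result.
You can also stop using "prayers to the AI gods" such as trending on artstation or 500px; in my tests they made little difference to the result.
My impression from these experiments is that SD2 responds to modifiers more sensitively and accurately than earlier versions. That means greater and finer controllability. It makes goal-directed prompt design more feasible and better targeted, a step out of the era of blindfolded alchemy. For players who enjoy a challenge, that is undoubtedly a gift.
The four images below are my experiments in generating a black liquid-metal texture (liquid metal, dark).
The first looks like glossy impasto acrylic medium under strong light, which is not quite what I expected.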
liquid metal, dark, close-up view, hyper-detailed, photorealistic, studio light, amazing texture, -S 9368172487
For the second, I added the modifiers flowing and Ribbon-like shine; it feels a little too silky.
liquid metal, flowing, dark, Ribbon-like shine, hyper-detailed, photorealistic, studio light, amazing texture -S 9363724119
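For the next two I also added the modifier Solidified lava, which comes fairly close to the effect I wanted.
SD2 seemed quite sensitive to these three rounds of added modifiers; the changes are obvious to the naked eye.
Also, I did not stack rendering-type modifiers or throw in a pile of 3D engines.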
liquid metal, flowing, dark, Solidified lava, Ribbon-like shine hyper-detailed, photorealistic, studio light, amazing texture -S 8857093629
liquid metal, flowing, dark, Solidified lava, Ribbon-like shine hyper-detailed, photorealistic, studio light, amazing texture -S 2378293576
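The three images below are progressive refinements of a black-and-white sand dune photography prompt, all with the same seed. I liked the composition of the first and wanted to keep it, but the contrast of the sand waves looked too fake. I tried adding "perfect brightness and contrast balance", and unexpectedly it worked (second image). But then the curves of the sand waves became jittery, so I added "Extremely artistic curve" (third image).
I did not run many trials, so luck may have played a part in the improvement from these two modifiers. But it did show me the possibility of fine-grained editing.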
wild sand dune, epic, sand wave, night, coast, black and white photography, by adam ansel, hyper-detailed, masterpiece, Golden ratio composition, -S 6206530932
wild sand dune, epic, sand wave, night, coast, black and white photography, by adam ansel, hyper-detailed, masterpiece, Golden ratio composition, perfect brightness and contrast balance, -S 6206530932
wild sand dune, epic, sand wave, night, coast, black and white photography, by adam ansel, hyper-detailed, masterpiece, Golden ratio composition, Extremely artistic curve, perfect brightness and contrast balance, -S 6206530932
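The same-seed workflow above can be sketched as a small helper that grows a prompt one modifier at a time while the seed stays fixed, so each change can be compared against the previous image. progressive_prompts is a hypothetical helper; it simply appends each modifier at the end, whereas the third caption above places the new modifier before the earlier one.

```python
def progressive_prompts(base, modifiers):
    """Return the base prompt followed by one variant per added modifier."""
    variants = [base]
    current = base
    for modifier in modifiers:
        current = f"{current}, {modifier}"
        variants.append(current)
    return variants


SEED = 6206530932  # kept identical across all variants
base = ("wild sand dune, epic, sand wave, night, coast, black and white photography, "
        "by adam ansel, hyper-detailed, masterpiece, Golden ratio composition")
steps = progressive_prompts(
    base,
    ["perfect brightness and contrast balance", "Extremely artistic curve"],
)
for prompt in steps:
    print(f"{prompt} -S {SEED}")
```

Changing one modifier per run with a fixed seed makes it easier to attribute a visual change to that modifier, though a handful of runs cannot rule out luck.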
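This next set observes the response to different artists' styles: 6 iceberg landscape paintings on the same theme, with only the artist changed.
  • Michael Whelan is a master of fantasy illustration with bright colors and clean compositions.
  • Bruce Pennington is a science fiction illustrator with a retro style and a taste for rich, heavy color.
  • Chesley Bonestell is an illustrator of alien landscapes and space subjects, with bold brushwork.
  • Andreas Rocha is a digital painter working in games and concept art, with a more modern, lighter style (I love using him).
The new version is quite sensitive to artist styles, and which artist yields which effect has become more predictable, which makes purposeful prompt experimentation and design feasible. So in SD2 I have not used more than 3 artists in a prompt.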
fine-art oil landscape painting of Iceberg cliffs, Arctic Ocean, lonely island, sunset, magic time, by Michael Whelan, Minimalism, epic perspective, artistic, atmospheric, masterpiece, vivid color, HDR, darker shadow, high contrast, golden ratio composition, hyper-detailed -W 960 -S 6753514390
fine-art oil landscape painting of Iceberg cliffs, Arctic Ocean, sunset, magic time, by Michael Whelan, Minimalism, artistic, atmospheric, masterpiece, vivid color, HDR, darker shadow, high contrast, golden ratio composition, hyper-detailed -W 960 -S 3297248311
fine-art oil landscape painting of Iceberg cliffs, Arctic Ocean, lonely island, sunset, magic time, by Bruce_Pennington, Minimalism, epic perspective, artistic, atmospheric, masterpiece, vivid color, HDR, darker shadow, high contrast, golden ratio composition, hyper-detailed -W 960 -S 6753514390
fine-art oil landscape painting of Iceberg cliffs, Arctic Ocean, sunset, magic time, by Bruce Pennington, artistic, atmospheric, masterpiece, vivid color, HDR, darker shadow, high contrast, golden ratio composition, hyper-detailed -W 960 -S 5695592645
fine-art oil landscape painting of Iceberg cliffs, Arctic Ocean, sunset, magic time, by Chesley Bonestell, artistic, atmospheric, masterpiece, vivid color, HDR, golden ratio composition, hyper-detailed -W 960 -S 3071271062
fine-art watercolor landscape painting of Iceberg cliffs, Arctic Ocean, sunset, magic time, by Andreas Rocha, Minimalism, artistic, atmospheric, masterpiece, darker shadow, high contrast, golden ratio composition, hyper-detailed -W 960 -S 9252202207
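One way to run this kind of comparison systematically is to hold the prompt and seed fixed and substitute only the artist. The template below is a simplified, hypothetical version of the captions above (the original captions also differ in a few modifiers and in their seeds), and artist_variants is an illustrative helper.

```python
TEMPLATE = ("fine-art oil landscape painting of Iceberg cliffs, Arctic Ocean, sunset, "
            "magic time, by {artist}, artistic, atmospheric, masterpiece, "
            "golden ratio composition, hyper-detailed")
ARTISTS = ["Michael Whelan", "Bruce Pennington", "Chesley Bonestell", "Andreas Rocha"]


def artist_variants(template, artists, seed, width=960):
    """Build one caption per artist, all sharing the same seed and width."""
    return {
        artist: f"{template.format(artist=artist)} -W {width} -S {seed}"
        for artist in artists
    }


captions = artist_variants(TEMPLATE, ARTISTS, seed=6753514390)
for artist, caption in captions.items():
    print(artist, "->", caption)
```

With everything except the artist held constant, differences between the resulting images are easier to attribute to the artist reference alone.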
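This set tests color-scheme modifiers, all with Kaethe Butcher's pen portraits as the artist reference. I casually wrote a few clashing color schemes (red and blue, yellow and blue, cyan vs burnt sienna), and the results were unexpectedly accurate, with a strong artistic feel.
As portraits, the facial anatomy is quite accurate, and the vertical format did not produce two stacked faces. The 4 results below were selected from fewer than 20 generations in total.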
fine-art pen portrait drawing by Kaethe Butcher, artistic, atmospheric, masterpiece, vivid color, HDR, golden ratio composition, hyper-detailed blue and red color, -H 960 -S 2295938736
fine-art pen portrait drawing by Kaethe Butcher, artistic, atmospheric, masterpiece, blue and yellow color, vivid color, HDR, golden ratio composition, hyper-detailed -H 960 -S 8374475260
fine-art pen portrait drawing, side view, sad pretty face, young lady, by Kaethe Butcher, artistic, atmospheric, masterpiece, vivid color, HDR, golden ratio composition, hyper-detailed, -H 960 -S 8239847668
fine-art pen portrait drawing, side view, sad pretty face, young lady, by Kaethe Butcher, artistic, atmospheric, masterpiece, vivid color, HDR, golden ratio composition, hyper-detailed, -H 960 -S 8239847668
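The next two sets try two painting media, one dry and one wet: oil and watercolor. The brushstroke qualities and edge rendering of each medium, and the simulation of the canvas or paper surface, are all stunning. I love the depiction of the transparent glassware and the copperware.
In oil, the lemon peel mimics the craquelure of oil paint. In watercolor, in the last image, both wet and dry techniques are simulated convincingly.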
fine-art oil painting of still life, glass bowl and lemons, close-up view, clear dark background, by dan mumford, by james Jean, artistic, atmospheric, masterpiece, vivid color, dark shadow, high contrast, HDR, golden ratio composition, hyper-detailed -W 1024 -S 6786310330
fine-art oil painting of still life, glass bowl and lemons, close-up view, Minimalism, clear dark background, by dan mumford, by james Jean, artistic, atmospheric, masterpiece, vivid color, dark shadow, high contrast, golden ratio composition, hyper-detailed -H 1024 -S 2895457491
fine-art watercolor painting of still life, glass goblets and copper teapot, lemons and dead flowers, Minimalism, clear dark background, by John Singer Sargent, by Sherree Valentine Daines, artistic, atmospheric, masterpiece, blue and yellow color, vivid color, HDR, golden ratio composition, hyper-detailed -H 1024 -S 8362409777
I paint watercolors myself, and I can hardly tell whether the image below is a scan of an original or AI-generated.
fine-art watercolor painting of still life, glass goblets and copper teapot, lemons and dead flowers, Minimalism, clear dark background, by John Singer Sargent, by Sherree Valentine Daines, artistic, atmospheric, masterpiece, blue and yellow color, vivid color, HDR, golden ratio composition, hyper-detailed -H 1024 -S 7401752247
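This set again compares oil and watercolor, the two classical fine-art media, this time with a landscape theme, even though the reference artist, Andreas Rocha, is a master who works only in digital painting.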
fine-art oil landscape painting of Iceberg cliffs, Arctic Ocean, sunset, magic time, by Andreas Rocha, Minimalism, artistic, atmospheric, masterpiece, vivid color, HDR, darker shadow, high contrast, golden ratio composition, hyper-detailed -W 960 -S 8104947253
fine-art watercolor landscape painting of Iceberg cliffs, Arctic Ocean, sunset, magic time, by Andreas Rocha, Minimalism, artistic, atmospheric, masterpiece,darker shadow, high contrast, golden ratio composition, hyper-detailed -W 960 -S 1950272985
fine-art watercolor landscape painting of Iceberg cliffs, Arctic Ocean, sunset, magic time, by Andreas Rocha, Minimalism, artistic, atmospheric, masterpiece, darker shadow, high contrast, golden ratio composition, hyper-detailed -W 960 -S 5369511481
fine-art watercolor landscape painting of Iceberg cliffs, Arctic Ocean, sunset, magic time, by Andreas Rocha, Minimalism, artistic, atmospheric, masterpiece, darker shadow, high contrast, golden ratio composition, hyper-detailed -W 960 -S 1950272985
fine-art watercolor landscape painting of Iceberg cliffs, Arctic Ocean, sunset, magic time, by Andreas Rocha, Minimalism, artistic, atmospheric, masterpiece, darker shadow, high contrast, golden ratio composition, hyper-detailed
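One controversy after the SD2 release is that the community found controversial celebrity portraits had been removed from its training set. Portraits generated with a celebrity's name as a keyword no longer show a strong likeness (yes, in 2.0 you may never again be able to generate Emma Watson or Gal Gadot with a mermaid tail, though Obama still seems to work).
But my view is that if you need it, the likeness of any particular person is easy to obtain through custom finetuning. For a foundational open model, I personally agree with SD's approach: give ethical controversies a bit more thought, and remove contested data from the training set the earlier the better.
I have little interest in reworking celebrities, but I did try a few famous faces from art history in a printmaking style. Their features are so distinctive that you can tell at a glance who they are.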
fine-art woodcut colorful printmaking of portraint of Christ ,with the crown of thorns, close-up view, Minimalism, clear dark background, by dan mumford, by aaron horkey, artistic, atmospheric, masterpiece, vivid color, dark shadow, high contrast, golden ratio composition, hyper-detailed -H 960
fine-art woodcut colorful pringmaking of close-up portraint of Frankenstein, Minimalism, clear dark background, by dan mumford, by aaron horkey, by bernie wrightson, artistic, atmospheric, masterpiece, vivid color, long shadow, high contrast, golden ratio composition, hyper-detailed -H 1024 -S 7356494580
fine-art woodcut colorful pringmaking of close-up portrait of van gogh, Minimalism, clear dark background, by dan mumford, by aaron horkey, artistic, atmospheric, masterpiece, vivid color, dark shadow, high contrast, golden ratio composition, hyper-detailed -H 960
fine-art woodcut colorful pringmaking of close-up portrait of Mona Lisa, Minimalism, clear dark background, by dan mumford, by aaron horkey, artistic, atmospheric, masterpiece, vivid color, dark shadow, high contrast, golden ratio composition, hyper-detailed -H 960 -S 3140988994
This set tests how well the details of different natural materials are rendered: ice, snow, sand, ocean waves, and wave foam.
fine-art photography of a small Clear crystal ice cube, floating above the horizon, snow field, sunset, magic time, HDR, Minimalism, artistic, atmospheric, Centered symmetrical composition, conceptual design, futuristic, cinematic, hyper-detailed, 8K wallpaper -H 960 -S 3033668822
fine-art landscape and nature photography of Undulating sand dunes in Death Valley, HDR, artistic, Minimalism Photography, sand wave, cloudy sky, magic time, golden shining, atmospheric, depressing, photography by Erez Marom, golden and Ultramarine color, masterpiece, golden ratio composition, 500px, 8K, wallpaper, -H 1024 -S 3097727685
fine-art landscape and nature photography of ocean, Stunning Photos of breaking Ocean Wave, close-up view, High-speed photography, HDR, artistic, Minimalism Photography, cloudy sky, magic time, sunset, golden shining, atmospheric, depressing, masterpiece, golden ratio composition, 8K, wallpaper, -H 1024
fine-art landscape and nature photography of ocean, Stunning Photos of breaking Ocean Wave, bird view, High-speed photography, HDR, artistic, Minimalism Photography, magic time, sunset, golden shining, atmospheric, depressing, masterpiece, golden ratio composition, 8K, wallpaper, -H 1024 -S 6820342731
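Next I will keep experimenting with more styles in SD 2.0, as well as the depth2img and inpainting models and custom finetuning, and share the results.
For AI generative models to enter more serious application areas as professional tools, maturity in three directions is an important condition: guiding color scheme and composition with a draft image, the fine-grained editing needed during iteration, and low-barrier model finetuning.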
Grand Canyon, moonrise, dead old trees, black and white photography, by adam ansel, hyper-detailed, masterpiece -S 9477247668
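I will close with a moonrise over the Grand Canyon in the style of the immortal Ansel Adams (spelled "adam ansel" in my prompts). Thank you for reading. This is the first successful result I got out of SD2.0.
In my last update I mentioned that I had just released Kalos.art, an app designed for AI artists and enthusiasts. Article link: AI Can Finally Make Money for Me
All the images I published today are on my Kalos account. If you would like to purchase a license to use these works, please click "Read the original". Or, if you just want to show support, come leave me a tip and a like.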


Other useful links:

DFserver, the open-source distributed AI model pipeline backend service my partner and I developed: https://github.com/huo-ju/dfserver
The open-sourced Stable Diffusion 2.0 model: https://github.com/Stability-AI/stablediffusion
Stability.ai's official paid online AI image generation service: https://beta.dreamstudio.ai
Huggingface Diffusers: https://huggingface.co/docs/diffusers/index
