
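1. Introduction
1.1 Background of AIGC development
- By the amount of color and grayscale information they carry, images can be classified as binary images, grayscale images, indexed images, and RGB images. Image generation models can convert between these image types.
- In practice, a model's performance shows mainly in the quality and the diversity of the images it generates. Image generation is widely used in graphic design, game production, and animation, and it also has considerable potential in medical image synthesis and analysis, compound synthesis, and drug discovery.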
1.2 Key stages of technical development

- GAN generation stage
- Autoregressive generation stage
- Diffusion model generation stage
1.3 How the mainstream models work, with their strengths and weaknesses
- Diffusion model (Diffusion Model)

- CLIP (Contrastive Language-Image Pre-training)

1.4 Current trends in the AIGC industry

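2. Application scenarios

2) Based on trip information, the large model is expected to automatically generate stylized material such as the following for the automotive content community and push it to users.

To maximize consumer-side traffic, the automaker also set very high requirements for AIGC capability, paying particular attention to the following details of the generated images:
- Stylization: whether the generated image fully follows the instructions
- Color differences around the car logo and along edges
- Whether the background and the car model are composited without looking out of place, and so on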
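3. Implementation in practice
3.1 Choosing an AIGC image generation tool
- For SDXL model inference it is much better optimized than other UIs: image generation is 10%~25% faster than with webui.
- It is highly customizable. Users can control the whole image generation process more precisely and at a finer granularity, and advanced users can produce better images more easily with ComfyUI.
- Workflows are easy to share as json or as images, which improves efficiency.
- It is developer-friendly. A Workflow can be invoked through the API from any language simply by loading a json file in the same API format.

Figure: the ComfyUI workflow configuration page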
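As a rough illustration of the "load an API-format json file and call it" point above, the sketch below wraps a workflow in the request body accepted by ComfyUI's `/prompt` endpoint and posts it with the Python standard library. The server address `127.0.0.1:8188` and the idea of a local `workflow_api.json` file are assumptions for illustration, not values from this article; a hosted deployment would typically also need its own endpoint and authentication header.

```python
import json
import urllib.request
import uuid


def build_prompt_request(workflow: dict, server: str = "127.0.0.1:8188") -> urllib.request.Request:
    """Wrap an API-format workflow in the body expected by POST /prompt."""
    payload = {"prompt": workflow, "client_id": str(uuid.uuid4())}
    return urllib.request.Request(
        f"http://{server}/prompt",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def queue_workflow(path: str, server: str = "127.0.0.1:8188") -> dict:
    """Load a workflow json file, queue it, and return the server's json reply."""
    with open(path, encoding="utf-8") as f:
        workflow = json.load(f)
    with urllib.request.urlopen(build_prompt_request(workflow, server)) as resp:
        return json.loads(resp.read())
```

Building the request is kept separate from sending it so that the payload can be inspected or logged without a running server.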
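3.2 Confirming the business process

The first step, and one that is often overlooked, is to design the complete workflow from the business requirements.
3.3 Developing custom nodes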
class Example:
    """
    An example node

    Class methods
    -------------
    INPUT_TYPES (dict):
        Tell the main program input parameters of nodes.
    IS_CHANGED:
        optional method to control when the node is re-executed.

    Attributes
    ----------
    RETURN_TYPES (`tuple`):
        The type of each element in the output tuple.
    RETURN_NAMES (`tuple`):
        Optional: The name of each output in the output tuple.
    FUNCTION (`str`):
        The name of the entry-point method. For example, if `FUNCTION = "execute"` then it will run Example().execute()
    OUTPUT_NODE ([`bool`]):
        If this node is an output node that outputs a result/image from the graph. The SaveImage node is an example.
        The backend iterates on these output nodes and tries to execute all their parents if their parent graph is properly connected.
        Assumed to be False if not present.
    CATEGORY (`str`):
        The category the node should appear in the UI.
    DEPRECATED (`bool`):
        Indicates whether the node is deprecated. Deprecated nodes are hidden by default in the UI, but remain
        functional in existing workflows that use them.
    EXPERIMENTAL (`bool`):
        Indicates whether the node is experimental. Experimental nodes are marked as such in the UI and may be subject to
        significant changes or removal in future versions. Use with caution in production workflows.
    execute(s) -> tuple || None:
        The entry point method. The name of this method must be the same as the value of property `FUNCTION`.
        For example, if `FUNCTION = "execute"` then this method's name must be `execute`, if `FUNCTION = "foo"` then it must be `foo`.
    """
    def __init__(self):
        pass

    @classmethod
    def INPUT_TYPES(s):
        """
        Return a dictionary which contains config for all input fields.
        Some types (string): "MODEL", "VAE", "CLIP", "CONDITIONING", "LATENT", "IMAGE", "INT", "STRING", "FLOAT".
        Input types "INT", "STRING" or "FLOAT" are special values for fields on the node.
        The type can be a list for selection.

        Returns: `dict`:
            - Key input_fields_group (`string`): Can be either required, hidden or optional. A node class must have property `required`
            - Value input_fields (`dict`): Contains input fields config:
                * Key field_name (`string`): Name of an entry-point method's argument
                * Value field_config (`tuple`):
                    + First value is a string indicating the type of field, or a list for selection.
                    + Second value is a config for type "INT", "STRING" or "FLOAT".
        """
        return {
            "required": {
                "image": ("IMAGE",),
                "int_field": ("INT", {
                    "default": 0,
                    "min": 0,  # Minimum value
                    "max": 4096,  # Maximum value
                    "step": 64,  # Slider's step
                    "display": "number",  # Cosmetic only: display as "number" or "slider"
                    "lazy": True  # Will only be evaluated if check_lazy_status requires it
                }),
                "float_field": ("FLOAT", {
                    "default": 1.0,
                    "min": 0.0,
                    "max": 10.0,
                    "step": 0.01,
                    "round": 0.001,  # The value representing the precision to round to, will be set to the step value by default. Can be set to False to disable rounding.
                    "display": "number",
                    "lazy": True
                }),
                "print_to_screen": (["enable", "disable"],),
                "string_field": ("STRING", {
                    "multiline": False,  # True if you want the field to look like the one on the ClipTextEncode node
                    "default": "Hello World!",
                    "lazy": True
                }),
            },
        }

    RETURN_TYPES = ("IMAGE",)
    #RETURN_NAMES = ("image_output_name",)
    FUNCTION = "test"
    #OUTPUT_NODE = False
    CATEGORY = "Example"

    def check_lazy_status(self, image, string_field, int_field, float_field, print_to_screen):
        """
        Return a list of input names that need to be evaluated.
        This function will be called if there are any lazy inputs which have not yet been
        evaluated. As long as you return at least one field which has not yet been evaluated
        (and more exist), this function will be called again once the value of the requested
        field is available.
        Any evaluated inputs will be passed as arguments to this function. Any unevaluated
        inputs will have the value None.
        """
        if print_to_screen == "enable":
            return ["int_field", "float_field", "string_field"]
        else:
            return []

    def test(self, image, string_field, int_field, float_field, print_to_screen):
        if print_to_screen == "enable":
            print(f"""Your input contains:
                string_field aka input text: {string_field}
                int_field: {int_field}
                float_field: {float_field}
            """)
        # Example processing: invert the image
        image = 1.0 - image
        return (image,)

    """
        The node will always be re-executed if any of the inputs change but
        this method can be used to force the node to execute again even when the inputs don't change.
        You can make this node return a number or a string. This value will be compared to the one returned the last time the node was
        executed, if it is different the node will be executed again.
        This method is used in the core repo for the LoadImage node where they return the image hash as a string, if the image hash
        changes between executions the LoadImage node is executed again.
    """
    #@classmethod
    #def IS_CHANGED(s, image, string_field, int_field, float_field, print_to_screen):
    #    return ""
# Set the web directory, any .js file in that directory will be loaded by the frontend as a frontend extension
# WEB_DIRECTORY = "./somejs"

# Add custom API routes, using router
from aiohttp import web
from server import PromptServer

@PromptServer.instance.routes.get("/hello")
async def get_hello(request):
    return web.json_response("hello")

# A dictionary that contains all nodes you want to export with their names
# NOTE: names should be globally unique
NODE_CLASS_MAPPINGS = {
    "Example": Example
}

# A dictionary that contains the friendly/humanly readable titles for the nodes
NODE_DISPLAY_NAME_MAPPINGS = {
    "Example": "Example Node"
}
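To make the role of `FUNCTION` and `NODE_CLASS_MAPPINGS` more concrete, here is a minimal sketch of how a host program could look up a node class by name and call its entry-point method. It is a simplified imitation for illustration, not ComfyUI's actual executor; the `InvertNumber` node and the `run_node` helper are hypothetical and work on plain floats so the example runs without ComfyUI or torch.

```python
class InvertNumber:
    """Toy node with the same structure as a ComfyUI node, operating on floats."""

    @classmethod
    def INPUT_TYPES(cls):
        return {"required": {"value": ("FLOAT", {"default": 0.25})}}

    RETURN_TYPES = ("FLOAT",)
    FUNCTION = "invert"
    CATEGORY = "Example"

    def invert(self, value):
        return (1.0 - value,)


NODE_CLASS_MAPPINGS = {"InvertNumber": InvertNumber}


def run_node(class_type: str, **inputs):
    """Look up a node class by name and call the method named by FUNCTION."""
    node_cls = NODE_CLASS_MAPPINGS[class_type]
    # Fill in defaults for required inputs that the caller did not supply.
    for name, spec in node_cls.INPUT_TYPES()["required"].items():
        if name not in inputs and len(spec) > 1 and "default" in spec[1]:
            inputs[name] = spec[1]["default"]
    entry_point = getattr(node_cls(), node_cls.FUNCTION)
    return entry_point(**inputs)


print(run_node("InvertNumber", value=0.75))  # prints (0.25,)
```

The return value is always a tuple whose elements line up with `RETURN_TYPES`, which is why even a single output is written as `(result,)`.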
The qwen-max plugin node
from http import HTTPStatus
import dashscope
import json


class 旅行文字生成:  # "travel text generation"
    def __init__(self):
        dashscope.api_key = ""  # Fill in your DashScope API key

    @classmethod
    def INPUT_TYPES(s):
        return {
            "required": {
                # Default system prompt (kept in Chinese). It asks the model to turn the
                # trip description into a concrete prompt for a painting AI: a beautiful
                # landscape scene in a realistic, high-definition style. The model should
                # 1) pick one city on the route and use its best-known landmark,
                # 2) extract weather words, 3) extract mood and driving-experience
                # descriptions, and 4) infer the season from the date; the last three
                # decide the color tone of the picture.
                "system_prompt": ("STRING", {
                    "default": """請根據我輸入的中文描述,生成符合主題的完整提示詞。生成後的內容服務於一個繪畫AI,它只能理解具象的提示詞而非抽象的概念。請嚴格遵守以下規則,規則如下:
#內容
根據文字生成一張與風景相關的優美的畫面。
#風格
真實、高畫質、寫實
#action
1.提取途徑城市之一,根據此地點搜尋當地最著名的景點或建築,例如:上海,可提取上海東方明珠
2.提取有關天氣的詞彙,會決定於整個畫面的色調
3.提取有關心情、駕駛體驗的描述,與天氣同時決定畫面的色調
4.提取日期,判斷季節,作為畫面的主要色調參考
""",
                    "multiline": True
                }),
                # Default user query (kept in Chinese): one sample trip record with emoji tag,
                # user text, trip time, driving time, distance, start and end points with
                # their weather, cities passed through, team name, and vehicle.
                "query_prompt": ("STRING", {
                    "default": """- 使用者標記emoji:出遊
- 使用者文字:新司機的五一出遊!
- 出行時間:2024/5/2 下午10:38-2024/5/5 下午6:57
- 總駕駛時長:14小時28分鐘
- 公里數:645.4km
- 起點:上海市黃浦區中山南路1891號-1893號
- 起點天氣:晴天
- 終點:上海市閔行區申長路688號
- 終點天氣:多雲
- 途徑城市:湖州市 無錫市 常州市
- 組隊資訊:歐陽開心的隊伍
- 車輛資訊:黑色一代
""",
                    "multiline": True
                })
            },
        }

    RETURN_TYPES = ("STRING",)
    FUNCTION = "生成繪畫提示詞"  # "generate painting prompt"
    CATEGORY = "旅行文字生成"

    def 生成繪畫提示詞(self, system_prompt, query_prompt):
        messages = [
            {'role': 'system', 'content': system_prompt},
            {'role': 'user', 'content': query_prompt}
        ]
        response = dashscope.Generation.call(
            model="qwen-max",
            messages=messages,
            result_format='message'
        )
        if response.status_code == HTTPStatus.OK:
            # With result_format='message', the generated prompt is in output.choices[0].message.content
            painting_prompt = response.output.choices[0].message.content
        else:
            raise Exception('Request failed: Request id: %s, Status code: %s, error code: %s, error message: %s' % (
                response.request_id, response.status_code,
                response.code, response.message
            ))
        return (painting_prompt,)


# A dictionary that contains all nodes you want to export with their names
NODE_CLASS_MAPPINGS = {
    "旅行文字生成": 旅行文字生成
}

# A dictionary that contains the friendly/humanly readable titles for the nodes
NODE_DISPLAY_NAME_MAPPINGS = {
    "旅行文字生成": "生成旅行本文提示詞"  # "Generate travel text prompt"
}
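The node above takes the trip record as a block of "- label: value" lines. As a hedged sketch of how an upstream service might produce such a block from structured data, the helper below renders a small record into that layout. The `Trip` dataclass, its field names, and the English labels are all hypothetical; the article does not show how the trip text is actually assembled.

```python
from dataclasses import dataclass


@dataclass
class Trip:
    # Hypothetical record; field names are illustrative only.
    emoji: str
    text: str
    start_time: str
    end_time: str
    distance_km: float
    cities: list


def render_query_prompt(trip: Trip) -> str:
    """Render a trip record as '- label: value' lines, one field per line."""
    lines = [
        f"- User emoji tag: {trip.emoji}",
        f"- User text: {trip.text}",
        f"- Trip time: {trip.start_time} - {trip.end_time}",
        f"- Distance: {trip.distance_km}km",
        f"- Cities passed through: {' '.join(trip.cities)}",
    ]
    return "\n".join(lines) + "\n"
```

The resulting string can be passed as `query_prompt` in place of the default sample record.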
The Wanx 2.0 plugin node
from http import HTTPStatus
from urllib.parse import urlparse, unquote
from pathlib import PurePosixPath
import requests
import dashscope
from dashscope import ImageSynthesis
import random


class ImageSynthesisNode:
    """
    A node for generating images based on a provided prompt.

    Class methods
    -------------
    INPUT_TYPES (dict):
        Define the input parameters of the node.
    IS_CHANGED:
        Optional method to control when the node is re-executed.

    Attributes
    ----------
    RETURN_TYPES (`tuple`):
        The type of each element in the output tuple.
    FUNCTION (`str`):
        The name of the entry-point method.
    CATEGORY (`str`):
        The category the node should appear in the UI.
    """
    @classmethod
    def INPUT_TYPES(s):
        return {
            "required": {
                "prompt": ("STRING", {
                    "default": "",
                    "multiline": True
                })
            },
        }

    RETURN_TYPES = ("STRING",)
    FUNCTION = "generate_image_url"
    CATEGORY = "Image Synthesis"

    def __init__(self):
        # Set the API key
        dashscope.api_key = ""

    def generate_image_url(self, prompt):
        negative_prompt_str = '(car:1.4), NSFW, nude, naked, porn, (worst quality, low quality:1.4), deformed iris, deformed pupils, (deformed, distorted, disfigured:1.3), cropped, out of frame, poorly drawn, bad anatomy, wrong anatomy, extra limb, missing limb, floating limbs, cloned face, (mutated hands and fingers:1.4), disconnected limbs, extra legs, fused fingers, too many fingers, long neck, mutation, mutated, ugly, disgusting, amputation, blurry, jpeg artifacts, watermark, watermarked, text, Signature, sketch'
        # Use a random seed so that repeated calls produce different images
        random_int = random.randint(1, 4294967290)
        rsp = ImageSynthesis.call(
            model='wanx2-t2i-lite',
            prompt=prompt,
            negative_prompt=negative_prompt_str,
            n=1,
            size='768*960',
            extra_input={'seed': random_int}
        )
        if rsp.status_code == HTTPStatus.OK:
            # Get the URL of the generated image
            image_url = rsp.output.results[0].url
        else:
            raise Exception('Request failed: Status code: %s, code: %s, message: %s' % (
                rsp.status_code, rsp.code, rsp.message
            ))
        return (image_url,)


# A dictionary that contains all nodes you want to export with their names
NODE_CLASS_MAPPINGS = {
    "ImageSynthesisNode": ImageSynthesisNode
}

# A dictionary that contains the friendly/humanly readable titles for the nodes
NODE_DISPLAY_NAME_MAPPINGS = {
    "ImageSynthesisNode": "Image Synthesis Node"
}

# Example call
if __name__ == '__main__':
    prompt = "A beautiful and realistic high-definition landscape scene, featuring the famous landmark of Wuxi, the Turtle Head Isle Park, as it is one of the cities passed through during the journey. The weather transitions from a clear, sunny day in the starting point, Shanghai, to a cloudy sky at the destination, also in Shanghai. The overall tone of the image reflects the transition from a bright, cheerful start to a more serene, calm atmosphere, with lush greenery and blooming flowers indicating the early summer season. The harmonious blend of natural beauty and man-made structures, along with the changing weather, creates a picturesque and tranquil setting"
    node = ImageSynthesisNode()
    # The node returns a one-element tuple, so unpack it before printing
    (image_url,) = node.generate_image_url(prompt)
    print(f"Generated Image URL: {image_url}")
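The node above returns only the image URL, while its imports (`urlparse`, `unquote`, `PurePosixPath`, `requests`) go unused; one plausible follow-up step is to save the image locally. The sketch below derives a file name from the URL and downloads the file. It is an assumption about a possible next step rather than part of the original node, and it uses `urllib.request` from the standard library instead of `requests` so that it has no third-party dependency. The helper names are hypothetical.

```python
from pathlib import PurePosixPath
from urllib.parse import unquote, urlparse
import urllib.request


def filename_from_url(url: str) -> str:
    """Return the last path component of a URL, ignoring the query string
    and decoding percent-escapes."""
    return PurePosixPath(unquote(urlparse(url).path)).name


def download_image(url: str, out_dir: str = ".") -> str:
    """Download the image at `url` into `out_dir` and return the local path."""
    target = f"{out_dir}/{filename_from_url(url)}"
    urllib.request.urlretrieve(url, target)
    return target
```

Generated-image URLs are often signed and time-limited, so stripping the query string before choosing a file name keeps the saved name short and stable.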
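3.4 PAI service deployment and compute options
- Standard edition: suited to a single user working in the WebUI, or to API calls served by a single instance. Videos can be generated through the WebUI, and the service can also be called through the API. Requests bypass the EAS interface: the front end passes them directly to the backend server, and all requests are handled by the same backend instance.
- API edition: the system automatically converts the service to asynchronous mode, which suits high-concurrency scenarios. It can be called only through the API. If you need multiple instances, the API edition is recommended.
Service configuration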
{
  "cloud": {
    "computing": {
      "instance_type": "ecs.gn8is-2x.8xlarge"
    },
    "networking": {
      "security_group_id": "sg-uf626dg02ts498gqoa2n",
      "vpc_id": "vpc-uf6usys7jvf2p7ugcyq1j",
      "vswitch_id": "vsw-uf6lv36zo7kkzyq9blyc6"
    }
  },
  "containers": [
    {
      "image": "eas-registry-vpc.cn-shanghai.cr.aliyuncs.com/pai-eas/comfyui:1.7-beta",
      "port": 8000,
      "script": "python main.py --listen --port 8000 --data-dir /deta-code-oss"
    }
  ],
  "metadata": {
    "cpu": 32,
    "enable_webservice": true,
    "gpu": 2,
    "instance": 1,
    "memory": 256000,
    "name": "jiashu16"
  },
  "name": "jiashu16",
  "options": {
    "enable_cache": true
  },
  "storage": [
    {
      "mount_path": "/deta-code-oss",
      "oss": {
        "path": "oss://ai4d-k4kulrqkyt37jhz1mv/482832/data-205381316445420758/",
        "readOnly": false
      },
      "properties": {
        "resource_type": "model"
      }
    }
  ]
}
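In the configuration above, the container start script passes `--data-dir /deta-code-oss`, and the same path appears as the storage `mount_path`. A small pre-deployment check along the following lines could catch a mismatch between the two. This is a sketch under the assumption that the config is available as a parsed dict; the helper name `data_dir_is_mounted` is hypothetical and not part of any PAI tooling.

```python
import shlex


def data_dir_is_mounted(config: dict) -> bool:
    """Check that every container's --data-dir points at a declared storage mount_path."""
    mounts = {s["mount_path"] for s in config.get("storage", [])}
    for container in config.get("containers", []):
        args = shlex.split(container.get("script", ""))
        if "--data-dir" in args:
            idx = args.index("--data-dir")
            # Fail if the flag has no value or the value is not a mounted path.
            if idx + 1 >= len(args) or args[idx + 1] not in mounts:
                return False
    return True
```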
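3.5 Mounting nodes and models

- /custom_nodes: stores ComfyUI plugins. Once written, the qwen-max plugin node and the Wanx 2.0 plugin node must be uploaded to this folder.
- /models: stores model files.
- /output: where the final outputs of the workflow are saved.
3.6 Building the service interface on the workflow json

Sample workflow API json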
{
  "4": {
    "inputs": {
      "ckpt_name": "基礎模型XL _xl_1.0.safetensors"
    },
    "class_type": "CheckpointLoaderSimple",
    "_meta": {
      "title": "Checkpoint載入器(簡易)"
    }
  },
  "6": {
    "inputs": {
      "text": [
        "149",
        0
      ],
      "speak_and_recognation": true,
      "clip": [
        "145",
        1
      ]
    },
    "class_type": "CLIPTextEncode",
    "_meta": {
      "title": "CLIP文字編碼器"
    }
  },
  "7": {
    "inputs": {
      "text": "*I* *Do* *Not* *Use* *Negative* *Prompts*",
      "speak_and_recognation": true,
      "clip": [
        "145",
        1
      ]
    },
    "class_type": "CLIPTextEncode",
    "_meta": {
      "title": "CLIP文字編碼器"
    }
  }
}
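In this API format, each top-level key is a node id, and an input is either a literal value or a two-element link `[source_node_id, output_index]` (for example `["145", 1]` above). A service built on such a template usually needs to overwrite a few inputs per request and to know which nodes feed which. The sketch below shows both operations on a parsed workflow dict; the helper names are hypothetical.

```python
import copy


def set_node_input(workflow: dict, node_id: str, name: str, value) -> dict:
    """Return a copy of the workflow with one input of one node replaced."""
    patched = copy.deepcopy(workflow)
    patched[node_id]["inputs"][name] = value
    return patched


def linked_nodes(workflow: dict, node_id: str) -> list:
    """List the ids of upstream nodes that `node_id` takes inputs from.

    A link is a two-element list [source_node_id, output_index].
    """
    links = []
    for value in workflow[node_id]["inputs"].values():
        if isinstance(value, list) and len(value) == 2 and isinstance(value[0], str):
            links.append(value[0])
    return links
```

Working on a deep copy keeps the loaded template unchanged, so one template can serve many concurrent requests.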
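3.7 Engineering architecture and stability assurance
- This section focuses on the architecture design and stability assurance of the part built on PAI ComfyUI + Bailian qwen + Bailian Wanx. The upper-layer application is deployed on ACS or ECS and can be adjusted to the customer's actual environment and to the reuse of existing resources.

- In addition, every generated image has to track the time of the user's latest trip, so image demand is seasonal (peak and off-peak travel seasons). The whole system architecture, from the model layer to the application layer, therefore supports high QPS and elastic scaling.
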
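4. Pitfalls to avoid in technical service
- Synchronous calls (Standard edition): the Standard edition service supports only synchronous calls. The client sends a request and waits synchronously for the result to return.
- Asynchronous calls (API edition): the API edition service supports only asynchronous calls. The client uses the EAS queue service to send requests to the input queue and retrieves results from the output queue by subscription.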
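The difference between the two calling styles can be illustrated with a small in-process simulation. It uses Python's standard `queue` and `threading` modules only to show the pattern (block and wait versus input queue and output queue); it does not use the EAS queue service SDK, and `generate` is a stand-in for the real image service.

```python
import queue
import threading


def generate(prompt: str) -> str:
    """Stand-in for the image service; returns a fake result string."""
    return f"image-for:{prompt}"


def call_sync(prompt: str) -> str:
    """Synchronous style: the caller blocks until the result comes back."""
    return generate(prompt)


# Asynchronous style: requests go to an input queue, a worker writes results
# to an output queue, and the caller collects them later by request id.
input_q: queue.Queue = queue.Queue()
output_q: queue.Queue = queue.Queue()


def worker() -> None:
    while True:
        item = input_q.get()
        if item is None:  # sentinel: stop the worker
            break
        request_id, prompt = item
        output_q.put((request_id, generate(prompt)))


def call_async(prompts: list) -> dict:
    """Submit all prompts, wait for the worker to finish, and gather results."""
    t = threading.Thread(target=worker)
    t.start()
    for request_id, prompt in enumerate(prompts):
        input_q.put((request_id, prompt))
    input_q.put(None)
    t.join()
    results = {}
    while not output_q.empty():
        request_id, result = output_q.get()
        results[request_id] = result
    return results
```

Because results arrive on a separate queue, the asynchronous caller has to carry a request id to match each result to its request, which is the main change client code needs when moving from the Standard edition to the API edition.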