I have seriously worked through every major release of langchain and llama-index, and along the way also tried Dify and FastGPT, but honestly I have never warmed to how these open-source projects are designed: first, the architectures feel convoluted; second, tools aimed directly at building AI agents should be as simple as possible, yet they keep growing more complex, with new modules introduced, deprecated, and rebuilt from scratch (the move from llama-agents to llama-deploy, for example).
So after trying each release, I would set it aside, believing that once model capabilities improve, many of the design ideas in these projects will have to change substantially.
That changed when I tried OpenAI Swarm a while ago. OpenAI describes it as a community-maintained project unaffiliated with OpenAI proper, but I was still drawn to its simplicity.
With a very simple pipeline, it implements AI news search, summarization, and translation.
from duckduckgo_search import DDGS
from swarm import Swarm, Agent
from datetime import datetime
from dotenv import load_dotenv

load_dotenv()

MODEL = "qwen2.5:72b-instruct"
client = Swarm()

def search_news(topic):
    """Search for news articles using DuckDuckGo"""
    with DDGS() as ddg:
        results = ddg.text(f"{topic} news {datetime.now().strftime('%Y-%m')}", max_results=10)
        if results:
            news_results = "\n\n".join([
                f"Title: {result['title']}\nURL: {result['href']}\nSummary: {result['body']}"
                for result in results
            ])
            return news_results
    return f"No news found for {topic}."

search_agent = Agent(
    name="News Searcher",
    instructions="""
    You are a news search specialist. Your task is to:
    1. Search for the most relevant and recent news on the given topic
    2. Ensure the results are from reputable sources
    3. Return the raw search results in a structured format
    """,
    functions=[search_news],  # Swarm's Agent takes callables via `functions`
    model=MODEL
)

synthesis_agent = Agent(
    name="News Synthesizer",
    instructions="""
    You are a news synthesis expert. Your task is to:
    1. Analyze the raw news articles provided
    2. Identify the key themes and important information
    3. Combine information from multiple sources
    4. Create a comprehensive but concise synthesis
    5. Focus on facts and maintain journalistic objectivity
    6. Write in a clear, professional style
    Provide a 2-3 paragraph synthesis of the main points.
    """,
    model=MODEL
)

summary_agent = Agent(
    name="News Summarizer",
    instructions="""
    You are an expert news summarizer combining AP and Reuters style clarity with digital-age brevity.
    Your task:
    1. Core Information:
    - Lead with the most newsworthy development
    - Include key stakeholders and their actions
    - Add critical numbers/data if relevant
    - Explain why this matters now
    - Mention immediate implications
    2. Style Guidelines:
    - Use strong, active verbs
    - Be specific, not general
    - Maintain journalistic objectivity
    - Make every word count
    - Explain technical terms if necessary
    Format: Create a single paragraph of 250-400 words that informs and engages.
    Pattern: [Major News] + [Key Details/Data] + [Why It Matters/What's Next]
    Focus on answering: What happened? Why is it significant? What's the impact?
    IMPORTANT: Provide ONLY the summary paragraph. Do not include any introductory phrases,
    labels, or meta-text like "Here's a summary" or "In AP/Reuters style."
    Start directly with the news content.
    """,
    model=MODEL
)

translate_agent = Agent(
    name="Translator",
    # Instruction (in Chinese): "You are a translation bot; translate the English input into Chinese."
    instructions="""
    你是一个翻译机器人,你的任务是将输入的英文翻译成中文。
    """,
    model=MODEL
)

# Search
search_response = client.run(
    agent=search_agent,
    messages=[{"role": "user", "content": "Generative AI"}],
)

# Synthesize
synthesis_response = client.run(
    agent=synthesis_agent,
    messages=[{"role": "user", "content": search_response.messages[-1]["content"]}],
)

# Summarize
summary_response = client.run(
    agent=summary_agent,
    messages=[{"role": "user", "content": synthesis_response.messages[-1]["content"]}],
)

# Translate
translate_response = client.run(
    agent=translate_agent,
    messages=[{"role": "user", "content": summary_response.messages[-1]["content"]}],
)

print(search_response.messages[-1]["content"])
print(synthesis_response.messages[-1]["content"])
print(summary_response.messages[-1]["content"])
print(translate_response.messages[-1]["content"])
But OpenAI Swarm has two problems: first, its plain-string inputs and outputs still need some massaging; second, as a community proof-of-concept, its future is uncertain.
So Swarm is a decent choice for quickly building small AI tools, but as the foundation of anything larger, it has real problems.
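The string massaging mentioned above can be as small as a helper that extracts the final assistant message from a Swarm-style response and normalizes it. A minimal sketch, where FakeResponse is my own stand-in for the object client.run() returns; only the .messages shape from the example above is assumed:

```python
# Minimal sketch: pulling and cleaning the last message from a Swarm-style
# response. FakeResponse mimics only the .messages attribute of a real run result.
from dataclasses import dataclass, field

@dataclass
class FakeResponse:
    messages: list = field(default_factory=list)

def last_content(response) -> str:
    """Return the final message's content, stripped of surrounding whitespace."""
    if not response.messages:
        return ""
    return (response.messages[-1].get("content") or "").strip()

resp = FakeResponse(messages=[
    {"role": "user", "content": "Generative AI"},
    {"role": "assistant", "content": "  Title: ...\nSummary: ...  \n"},
])
print(last_content(resp))
```

In the real pipeline each `response.messages[-1]["content"]` access would go through a helper like this, so leading/trailing noise never leaks into the next agent's prompt.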
Rewind two or three months: Prefect, the lightweight workflow tool I had long been using, released Prefect 3.0, along with ControlFlow, an open-source AI workflow project built on top of it. I was busy setting up infrastructure at the time, so I shelved it. After trying OpenAI Swarm, I came back to ControlFlow, because I believe the biggest obstacle to putting AI into production is integrating it with existing workflows and data flows. Model capability is part of that constraint, but tooling matters even more.
Back in 2020-2022, Prefect was my favorite ETL tool: lightweight, trivially easy to build workflows with, Kubernetes-ready, and well-integrated (it connects easily with distributed compute engines such as Dask and Ray).
Now at version 3.0, the UI and APIs remain largely unchanged, and ControlFlow turned out to be even easier to use than I expected.
The screenshot below shows the UI after I ported the AI news-search code from OpenAI Swarm to ControlFlow and ran the workflow once.
This page is even more useful for AI development than it was for ETL:
1. In the ETL era, data flows were predefined and the number of times a task would be invoked was known in advance, so debugging mostly meant inspecting input/output data structures, hunting for "dirty data" that caused bugs, and profiling each step's compute cost for performance tuning.
2. In the agent era, with what is now fashionably called the autonomous agentic workflow, agents work on their own. The administrator or user issues a broad text instruction, and once the agent interprets it, execution may involve repeated calls or even unnecessary duplicated work. The scheduling diagram above makes every step of the agent's execution clearly visible, leaving huge room for efficiency tuning. (Admittedly, this is also a little unsettling: it is like a boss checking the monitors to see whether employees are slacking off. But that is exactly the point of AI agents, to work autonomously, like people.)
The run shown above is already the result of several rounds of tuning. Without any model assistance, environment setup plus porting the code took me half an hour, but the tuning took an entire evening, and some issues I still have not figured out. In the AI-agent era, the bar for raw coding skill is clearly dropping, while the bar for data structures, algorithmic optimization, and understanding of business processes is rising sharply. Under the old division of labor among users, product managers, and developers, every interface between roles could, and had to, be specified precisely, which is what made the division meaningful. Debugging an AI agent workflow simply cannot be specified across such a division; it demands a complete grasp of the business, the algorithms, and the data all at once.
Beyond the flow debugging shown on the page above, what attracts me most about ControlFlow is:
1. ControlFlow does not wrap the model-calling layer itself; it reuses a few langchain interfaces, and only those. In my example it uses langchain_ollama. This keeps the interface stable and the code architecture simple while providing excellent compatibility (if it can call langchain, it can call llamaindex, and others later);
2. Near-perfect pydantic integration (langchain and llamaindex have it too, to be fair), which keeps model inputs and outputs structurally stable. In the code below, a News class inheriting from BaseModel fixes the input and output structure for every agent;
3. Rich runtime controls: set a task's output structure directly (result_type=News), limit an agent to a single LLM call, define termination conditions, and so on. Skimming the langchain and llamaindex docs again, neither seems to match ControlFlow's flexibility and conciseness;
4. Full compatibility with Prefect's Flow, which makes mixing ETL and AI agents trivially easy; all the legacy Prefect code can be ported over seamlessly.
Here is the ported code:
import controlflow as cf
from pydantic import BaseModel, Field
from langchain_ollama import ChatOllama
from dotenv import load_dotenv
from duckduckgo_search import DDGS
from datetime import datetime

load_dotenv()

MODEL = ChatOllama(model="qwen2.5:72b-instruct")

class News(BaseModel):
    title: str = Field(description="The title of the news article")
    url: str = Field(description="The URL of the news article")
    summary: str = Field(description="A summary of the news article")

def search_news(topic: str):
    """Search for news articles using DuckDuckGo"""
    with DDGS() as ddg:
        results = ddg.text(f"{topic} news {datetime.now().strftime('%Y-%m')}", max_results=10)
        if results:
            news_results = [
                News(title=result['title'], url=result['href'], summary=result['body'])
                for result in results
            ]
            return news_results
    return []

search_agent = cf.Agent(
    name="News Searcher",
    instructions="""
    You are a news search specialist. Your task is to:
    1. Search for the most relevant and recent news on the given topic
    2. Ensure the results are from reputable sources
    3. Return the raw search results in a structured format
    """,
    model=MODEL,
)

synthesis_agent = cf.Agent(
    name="News Synthesizer",
    instructions="""
    You are a news synthesis expert. Your task is to:
    1. Analyze the raw news articles provided
    2. Identify the key themes and important information
    3. Combine information from multiple sources
    4. Create a comprehensive but concise synthesis
    5. Focus on facts and maintain journalistic objectivity
    6. Write in a clear, professional style
    Provide a 2-3 paragraph synthesis of the main points.
    """,
    model=MODEL,
)

summary_agent = cf.Agent(
    name="News Summarizer",
    instructions="""
    You are an expert news summarizer combining AP and Reuters style clarity with digital-age brevity.
    Your task:
    1. Core Information:
    - Lead with the most newsworthy development
    - Include key stakeholders and their actions
    - Add critical numbers/data if relevant
    - Explain why this matters now
    - Mention immediate implications
    2. Style Guidelines:
    - Use strong, active verbs
    - Be specific, not general
    - Maintain journalistic objectivity
    - Make every word count
    - Explain technical terms if necessary
    Format: Create a single paragraph of 250-400 words that informs and engages.
    Pattern: [Major News] + [Key Details/Data] + [Why It Matters/What's Next]
    Focus on answering: What happened? Why is it significant? What's the impact?
    IMPORTANT: Provide ONLY the summary paragraph. Do not include any introductory phrases,
    labels, or meta-text like "Here's a summary" or "In AP/Reuters style."
    Start directly with the news content.
    """,
    model=MODEL,
)

translate_agent = cf.Agent(
    name="Translator",
    # Instruction (in Chinese): "You are a translation bot; translate the English input into Chinese."
    instructions="""
    你是一个翻译机器人,你的任务是将输入的英文翻译成中文。
    """,
    model=MODEL,
)

@cf.flow
def ai_search_flow(topic: str):
    news_results = search_news(topic)
    translate_response = []
    for news in news_results:
        Synthesize = cf.run(
            "Synthesize the news articles",
            agents=[synthesis_agent],
            context=dict(News=news),
            result_type=News,
            max_llm_calls=1,
            max_agent_turns=1,
        )
        Summarize = cf.run(
            "Summarize the news articles",
            agents=[summary_agent],
            context=dict(News=Synthesize),
            result_type=News,
            max_llm_calls=1,
            max_agent_turns=1,
        )
        Translate = cf.run(
            "Translate the news articles to Chinese",
            agents=[translate_agent],
            context=dict(News=Summarize),
            result_type=News,
            max_llm_calls=1,
            max_agent_turns=1,
        )
        translate_response.append(Translate)
    return translate_response

print(ai_search_flow("Generative AI"))
Below is the output of one of the subtasks:
And here is the final, fully translated structured output:
A single pydantic class is all it takes to regularize inputs and outputs. It really is elegant.
ControlFlow is the first open-source option I have seen that an enterprise can deploy with confidence: 1. lightweight enough; 2. robust, having been battle-tested for several years; 3. it unifies ETL and AI agents; 4. highly extensible; 5. strong community and documentation support.
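To illustrate why that one class does so much work: the same News model both parses raw model output into a typed object and rejects malformed output loudly, instead of letting a broken string propagate to the next agent. A minimal sketch, with the JSON strings hand-written as stand-ins for LLM output:

```python
# Minimal sketch: one pydantic model validating hypothetical LLM output.
from pydantic import BaseModel, Field, ValidationError

class News(BaseModel):
    title: str = Field(description="The title of the news article")
    url: str = Field(description="The URL of the news article")
    summary: str = Field(description="A summary of the news article")

# Well-formed output parses into a typed object with attribute access...
raw = '{"title": "Example", "url": "https://example.com", "summary": "A demo."}'
news = News.model_validate_json(raw)
print(news.title)

# ...while malformed output (here, missing "url") raises immediately.
try:
    News.model_validate_json('{"title": "Example", "summary": "A demo."}')
except ValidationError:
    print("validation failed")
```

This is exactly what result_type=News buys in the flow above: every task boundary is a validation boundary.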