
2025 / 07 / 24
Smart Vision Paper Release: Introduces New Enterprise-Level LLM Agent Planning Paradigm “Routine” to Greatly Improve Execution Stability and Accuracy

Recently, the Smart Vision research team under Digital China published a paper on arXiv titled “Routine: A Structural Planning Framework for LLM-Agent System in Enterprise”. The paper introduces a structured planning framework called “Routine” that targets three core challenges LLM agents face in enterprise scenarios: incorrect processes caused by a lack of domain knowledge, unstable execution caused by inconsistent planning formats, and poor user experience resulting from low-code methods that are not AI-native.


Authors of the paper: Zeng Guancheng, Chen Xueyi, Hu Jiawang, Qi Shaohua, Mao Yaxuan, Wang Zhantao, Nie Yifan, Li Shuang, Feng Qiuyang, Qiu Pengxu, Wang Yujia, Han Wenqiang, Huang Linyan, Li Gang, Mo Jingjing, Hu Haowen (Corresponding Author)

Pain Points in Enterprise LLM-Agent Systems

The paper identifies significant bottlenecks in the real-world deployment of enterprise-level agent systems:

1. Knowledge gaps and chaotic tool orchestration: General-purpose models lack enterprise-specific scenario knowledge, so they struggle to arrange the tool chain correctly and often omit key tool types.

2. Non-standard planning leading to execution deviations: The model relies on generalized understanding and produces plans in non-standard formats, which makes the transition from planning to execution unstable.

3. Limitations of low-code solutions: Traditional low-code methods pose high barriers for non-technical personnel, the workflows built with them are hard to reuse across scenarios, and non-AI-native methods are inefficient.

To address these issues, the Digital China team innovatively proposed the “Routine” planning paradigm.

A Routine consists of multiple smaller, more specific sub-tasks, each expressed as execution steps. The sub-tasks are independent but interrelated, so each Routine step must carry enough information for the agent to follow the plan stably. The complete composition of a Routine step is:

Step x. < Step Name >: < Step Behavior Description >, this step takes < Input Parameter Description >, outputs < Output Parameter Description >, and uses < Step Tool > tool;
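As a rough illustration of the step template above, the following Python sketch models one step as a small data class and renders it back into the same sentence pattern. The class name, field names, and the `query_order` tool are hypothetical and chosen only for this example; the paper defines the text format, not this code.

```python
from dataclasses import dataclass


@dataclass
class RoutineStep:
    """One Routine step. Field names are illustrative, not taken from the paper."""
    number: int
    name: str
    behavior: str
    inputs: str
    outputs: str
    tool: str

    def render(self) -> str:
        # Follows the template order: name, behavior, inputs, outputs, tool.
        return (
            f"Step {self.number}. {self.name}: {self.behavior}, "
            f"this step takes {self.inputs}, outputs {self.outputs}, "
            f"and uses {self.tool} tool;"
        )


step = RoutineStep(
    number=1,
    name="Look up order",
    behavior="Retrieve the order record for the customer",
    inputs="the order ID",
    outputs="the order details",
    tool="query_order",  # hypothetical tool name
)
print(step.render())
```

Keeping every field mandatory reflects the point made above: a step is only useful to the executing model if it carries all the information needed to act on it.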

Routines for similar scenarios may contain overlapping steps and differ only in certain process segments, much like different branches of the same workflow. In this case, branch steps and execution conditions can be used to merge several similar workflows into a single Routine.

Step x. < Step Name >: This step performs a branch condition check:

Branch x-1 Step 1. < Step Name >: If < Step Condition >, ..., use the < Step Tool > tool;

Branch x-1 Step 2. < Step Name >: ..., use the < Step Tool > tool;

Branch x-2 Step 1. < Step Name >: If < Step Condition >, ..., use the < Step Tool > tool;

Step y. < Step Name >: ..., use the < Step Tool > tool;

Step z. < Step Name >: ..., use the < Step Tool > tool, and complete the workflow;
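The branch layout above can be sketched in Python as a branch point that holds several alternative step sequences and numbers them in the "Branch x-n Step m" style. This is a minimal sketch under assumptions: the class names, the refund scenario, and the tool names are all invented for illustration and do not come from the paper.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BranchStep:
    name: str
    action: str
    tool: str
    condition: Optional[str] = None  # typically set on the first step of a branch


@dataclass
class BranchPoint:
    number: int
    name: str
    branches: List[List[BranchStep]] = field(default_factory=list)

    def render(self) -> List[str]:
        lines = [
            f"Step {self.number}. {self.name}: "
            "This step performs a branch condition check:"
        ]
        for b, steps in enumerate(self.branches, start=1):
            for s, step in enumerate(steps, start=1):
                # Only steps that carry a condition get the "If ..." prefix.
                prefix = f"If {step.condition}, " if step.condition else ""
                lines.append(
                    f"Branch {self.number}-{b} Step {s}. {step.name}: "
                    f"{prefix}{step.action}, use the {step.tool} tool;"
                )
        return lines


check = BranchPoint(
    number=2,
    name="Route request",
    branches=[
        [
            BranchStep("Auto approve", "approve the refund", "approve_refund",
                       condition="the amount is under the limit"),
            BranchStep("Notify customer", "send a confirmation", "send_email"),
        ],
        [
            BranchStep("Escalate", "open a review ticket", "create_ticket",
                       condition="the amount exceeds the limit"),
        ],
    ],
)
rendered = check.render()
print("\n".join(rendered))
```

Two workflows that share everything except the handling of the refund amount end up in one Routine, which is the merging effect described above.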

The "Routine" format serves as an intermediate representation layer between the plan generated by the large language model and the actual execution engine. It expresses each tool invocation step in a standardized format, covering key fields such as tool name, parameters, dependencies, and execution status. This improves how accurately the execution model follows the plan and guides the agent in completing tasks across diverse scenarios.


The core architecture of the Routine intelligent agent system

Centered on the Routine mechanism, the research team redesigned the agent system around four key modules:

1. Planning Module:

Standardized format: A Routine consists of clear sub-tasks, each with a step number, name, behavior description, input/output parameter descriptions, and the name of the tool used; it also supports branching flows.

AI generation and optimization: Business experts provide a draft of the process, and the model refines it with a dedicated prompt template to output a structured natural-language Routine. Ablation experiments show that AI-optimized Routines significantly improve execution accuracy, approaching or even surpassing the manually annotated baseline.

2. Execution Module:

Small Parameter Model: A small-parameter model is trained through methods such as instruction fine-tuning and reinforcement learning. Reward functions for multi-step tool invocation guide the model to adapt to the scenario and strengthen its instruction-following ability.

Context Engineering Mechanism: The research team identified the information and configuration the agent system needs to solve tasks and built corresponding context templates. These templates include not only common elements such as role definitions, task backgrounds, and behavioral norms, but also key information such as system parameters, the Routine plan for the problem, variable memory dictionaries, and tool lists.

3. Tool Call Instructions:

MCP Server: An MCP server serves as the standardized tool layer, uniformly defining and managing tool names, parameters, and return formats.

4. Memory Module:

Process Memory: It stores the set of Routines created by experts or optimized by AI and dynamically retrieves the most relevant ones based on similarity to the user's task, which avoids cramming every Routine into the context.

Variable Memory: It stores long text parameters under variable keys. During execution, these keys are automatically replaced with the actual values, which significantly reduces context pressure on the model and minimizes symbol errors in parameter passing.

These modules work collaboratively to form a fully functional intelligent agent system, as shown in the figure:

[Figure: architecture of the Routine agent system]
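To make the interplay of the modules more concrete, the sketch below assembles a single execution context from the elements listed under the Context Engineering Mechanism: role definition, task background, behavioral norms, system parameters, the retrieved Routine, variable memory, and the tool list. The function name, section titles, and sample values are assumptions made for illustration; the actual templates used by the team are not reproduced in this article.

```python
from typing import Dict, List


def build_context(
    role: str,
    task_background: str,
    behavior_rules: List[str],
    system_params: Dict[str, str],
    routine: str,
    variable_keys: List[str],
    tools: List[str],
) -> str:
    """Assemble one prompt from the context elements named in the article."""
    sections = [
        ("Role", role),
        ("Task background", task_background),
        ("Behavioral norms", "\n".join(f"- {rule}" for rule in behavior_rules)),
        ("System parameters", "\n".join(f"{k}: {v}" for k, v in system_params.items())),
        ("Routine", routine),
        # Assumption: only keys are exposed, so long values stay out of the context.
        ("Variable memory keys", ", ".join(variable_keys)),
        ("Available tools", ", ".join(tools)),
    ]
    return "\n\n".join(f"[{title}]\n{body}" for title, body in sections)


context = build_context(
    role="You are an enterprise service agent.",
    task_background="The user wants a refund for an order.",
    behavior_rules=["Follow the Routine step by step.", "Call one tool per step."],
    system_params={"locale": "en", "user_id": "u-001"},
    routine="Step 1. Look up order: ..., use the query_order tool;",
    variable_keys=["VAR_1"],
    tools=["query_order", "approve_refund"],
)
print(context)
```

In this arrangement the planning module supplies the Routine text, the memory module supplies the variable keys, and the tool layer supplies the tool list, while the execution model receives them as one consistently structured prompt.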

The paper summarizes its main contributions as follows

1. Structured Planning Paradigm: The team designed "Routine", a structured, standardized planning format that significantly improves the stability of agents solving complex problems through multi-step tool invocation. Validation in real enterprise scenarios showed that Routine substantially improved the accuracy of model tool invocation, raising GPT-4o from 41.1% to 96.3% and Qwen3-14B from 32.6% to 83.3%.

2. Routine Compliance Ability Training: To further verify the effectiveness of the Routine framework, the research team built a training dataset for following Routine-format instructions. Through instruction fine-tuning and reinforcement learning, accuracy on the scenario-specific evaluation rose to 88.2%, indicating that the framework significantly improves the model's adherence to plans during execution.

3. Data Distillation Based on Routine: A scenario-specific multi-step tool invocation dataset was generated through knowledge distillation. Fine-tuning on this distilled dataset raised model accuracy to 95.5%, approaching the level of GPT-4o. These results demonstrate the effectiveness of the Routine framework in optimizing the usage patterns of domain-specific tools and enhancing the model's ability to adapt to new scenarios.

Subsequent applications

With AI for Process as its core direction, the Routine framework significantly enhances the adaptability of agent systems in enterprise scenarios. It not only optimizes the usage patterns of domain-specific tools but also strengthens the model's ability to handle complex tasks, providing a robust and reliable solution for intelligent, automated enterprise processes. Going forward, introducing reinforcement learning into the training process can further improve the model's ability to generalize across processes in diverse scenarios. With continued research and optimization, the Routine framework is expected to further enhance the autonomy and adaptability of agents and promote their widespread adoption in enterprise environments.

Smart Vision will continue to explore scenario-based knowledge engines and agent collaboration technologies in depth and is committed to building enterprise-level agents with stronger process cognition and self-adaptive evolution capabilities. This work aims to bridge the path from complex business logic to agile AI implementation and to provide solid, flexible, and scalable technical support for the intelligent transformation of enterprises.