Zero-Error Precision, Input Prices Slashed! OpenAI Announces Structured Outputs in the API, With 100% JSON Accuracy

5802 views, 2024-08-07 18:55

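Good news for programmers! OpenAI's newly released model API now fully supports Structured Outputs, with a JSON Schema match rate of 100%, and the input price is cut in half on top of that.

Still racking your brain over piles of prompts, only to get a headache from the wildly inconsistent output that comes back?

OpenAI has finally heard the crowd and delivered the No. 1 feature developers have long been waiting for.

OpenAI announced today that the new feature is live: the OpenAI API now supports structured JSON output.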

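JSON (JavaScript Object Notation) is an industry-standard format for files and data interchange, because it is easy for humans to read and easy for machines to parse.

LLMs, however, often work against JSON. They frequently hallucinate, producing either responses that follow the instructions only partially or gibberish that cannot be parsed at all.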

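Developers therefore have to rely on assorted open-source tooling, experiment with different prompts, or repeat requests until the output is usable, all of which costs time and effort.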
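
The "repeat the request" workaround typically looks something like the loop below. This is only a minimal sketch: `generate` is a hypothetical stand-in for whatever function calls the model, and the required keys are made up for illustration.

```python
import json
from typing import Callable


def get_json_with_retries(
    generate: Callable[[str], str],  # hypothetical stand-in for a model call
    prompt: str,
    required_keys: set,
    max_attempts: int = 3,
) -> dict:
    """Call `generate` until it returns a JSON object with the required keys."""
    for _ in range(max_attempts):
        raw = generate(prompt)
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # unparseable output: ask again
        if isinstance(data, dict) and required_keys.issubset(data):
            return data  # parseable and has every key we need
    raise RuntimeError(f"no usable JSON after {max_attempts} attempts")
```

Every failed attempt costs another round trip and more tokens, which is the overhead that schema-constrained output is meant to remove.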

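With Structured Outputs released today, these thorny problems go away: the feature ensures that model-generated output matches the schema specified in a JSON Schema.

Structured Outputs has long been the feature developers ask for most, and Sam Altman said in a post that this release came by popular demand.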

OpenAI's new feature clearly struck a chord with many developers, who agreed that "This is a big deal."

Many left comments full of praise, with replies such as "Excellent!"

Not everyone is celebrating, though: this update once again raises worries that OpenAI will swallow up startups.

Many ordinary users, however, care more about when GPT-5 will finally be released. As for JSON Schema: "What's that?"

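After all, with no news of GPT-5, OpenAI's DevDay this fall may turn out much quieter than last year's.

Schema conformance made easy

With Structured Outputs, you only need to define a JSON Schema, and the model stops being "willful" and returns data in the requested shape.

Beyond making the model more obedient, the feature also greatly improves the reliability of the output.

In OpenAI's evals of complex JSON schema following, the new model gpt-4o-2024-08-06 with Structured Outputs scores a perfect 100%. By comparison, gpt-4-0613 scores less than 40%.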

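In fact, OpenAI introduced JSON mode at last year's DevDay. JSON mode improves the chances of getting valid JSON, but it does not guarantee that the output matches a particular schema.

Now OpenAI is extending this capability in the API so that model-generated output exactly matches the JSON Schema supplied by the developer.

Generating structured data from unstructured inputs is one of the core use cases for AI in today's applications.

Developers use the OpenAI API to build powerful assistants that fetch data and answer questions via function calling, extract structured data for data entry, and build multi-step agentic workflows that allow LLMs to take actions.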

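How it works

OpenAI took a two-part approach to improving how reliably model output matches a JSON Schema.

First, the latest model, gpt-4o-2024-08-06, was trained to better understand complicated schemas and to produce output that matches them.

Although model performance improved substantially, reaching 93% on OpenAI's benchmark, model behavior is inherently non-deterministic.

To keep developers' applications robust, OpenAI therefore added a second, more exact method that constrains the model's output and achieves 100% reliability.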

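Constrained decoding

OpenAI uses a technique known as constrained sampling or constrained decoding. By default, a model is entirely unconstrained when it generates output and may select any token from the vocabulary as the next one.

That flexibility is what allows mistakes, such as inserting an invalid character partway through otherwise valid JSON.

To rule out such errors, OpenAI uses dynamic constrained decoding: the set of valid tokens is recomputed after each generated token, so every token in the output remains valid under the supplied schema.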
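
The idea can be illustrated with a toy decoder. This sketch is not OpenAI's implementation: the vocabulary, the two permitted outputs, and the `score` callable (a stand-in for model logits) are all invented, and a real system would derive the allowed tokens from a grammar rather than from an enumerated list of valid strings.

```python
# Toy setting (assumption): only these two strings are valid outputs.
VALID_OUTPUTS = ['{"ok": true}', '{"ok": false}']
VOCAB = ["{", "}", '"ok"', ": ", "true", "false", "maybe", "<end>"]


def allowed_tokens(prefix: str) -> list:
    """Return the tokens that keep `prefix` extendable to a valid output."""
    allowed = []
    for tok in VOCAB:
        if tok == "<end>":
            if prefix in VALID_OUTPUTS:  # may stop only on a complete output
                allowed.append(tok)
        elif any(v.startswith(prefix + tok) for v in VALID_OUTPUTS):
            allowed.append(tok)
    return allowed


def constrained_greedy_decode(score) -> str:
    """Greedy decoding where invalid tokens are masked out at every step."""
    prefix = ""
    while True:
        candidates = allowed_tokens(prefix)  # recomputed after each token
        tok = max(candidates, key=lambda t: score(prefix, t))
        if tok == "<end>":
            return prefix
        prefix += tok


def biased_score(prefix: str, tok: str) -> int:
    """A deliberately bad 'model' that likes the invalid token 'maybe' best."""
    return {"maybe": 3, "true": 2}.get(tok, 0)


result = constrained_greedy_decode(biased_score)
```

Even though the scorer prefers `maybe`, that token is never among the candidates, so `result` is still one of the valid outputs.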

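To do this, OpenAI converts the supplied JSON Schema into a context-free grammar (CFG).

For each JSON Schema, OpenAI computes a grammar that represents the schema and preprocesses its components so that they can be accessed efficiently during sampling.

This approach makes the output more accurate and also avoids unnecessary latency. The first request with a new schema may incur extra processing time, but subsequent requests are fast because the preprocessed artifacts are cached.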

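Alternative approaches

Besides the CFG approach, other implementations of constrained decoding typically use finite state machines (FSMs) or regular expressions.

These, however, are limited in how they can dynamically update the set of valid tokens. FSMs in particular generally struggle with deeply nested or recursive data structures.

OpenAI's CFG approach can express a broader class of schemas. For example, a JSON Schema with recursive definitions is supported in the OpenAI API but cannot be expressed with an FSM.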
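
The limitation comes down to memory. A recursive schema permits nesting to any depth, and checking that every opening bracket is eventually closed requires remembering an unbounded number of open brackets, which a machine with a fixed number of states cannot do. A stack, the extra ingredient a grammar-based recognizer has, handles it directly. The following sketch illustrates only that point; for brevity it ignores brackets that appear inside JSON string values.

```python
def is_balanced(text: str) -> bool:
    """Check that {} and [] are properly nested, using an explicit stack."""
    closers = {"}": "{", "]": "["}
    stack = []
    for ch in text:
        if ch in "{[":
            stack.append(ch)  # remember what must be closed later
        elif ch in closers:
            if not stack or stack.pop() != closers[ch]:
                return False  # closed something that was not open
    return not stack  # everything opened must have been closed


def nested(depth: int) -> str:
    """Build a document nested `depth` levels deep, e.g. [[[]]] for depth 3."""
    return "[" * depth + "]" * depth
```

`is_balanced(nested(n))` holds for any `n`, whereas a finite-state recognizer has to pick a maximum depth in advance.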

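Input costs cut in half

Structured Outputs is available on all models that support function calling, including the latest GPT-4o and GPT-4o-mini models, as well as fine-tuned models.

The feature is available in the Chat Completions API, Assistants API, and Batch API, and it is compatible with vision inputs.

Compared with gpt-4o-2024-05-13, gpt-4o-2024-08-06 is also cheaper: developers save 50% on input ($2.50 per 1M tokens) and 33% on output ($10.00 per 1M tokens).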
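
As a quick check of the arithmetic: the quoted percentages imply previous prices of $5.00 per 1M input tokens and $15.00 per 1M output tokens for gpt-4o-2024-05-13. Those two figures are inferred from the savings above rather than stated in the article, and the workload in the example is made up.

```python
# USD per 1M tokens. The 2024-05-13 prices are inferred from the quoted savings.
PRICES = {
    "gpt-4o-2024-05-13": {"input": 5.00, "output": 15.00},
    "gpt-4o-2024-08-06": {"input": 2.50, "output": 10.00},
}


def cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Total cost of a workload at the per-million-token prices above."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


def saving_pct(old: float, new: float) -> int:
    """Percentage saved when the price drops from `old` to `new`."""
    return round((old - new) / old * 100)


# Hypothetical workload: 2M input tokens and 0.5M output tokens.
old_cost = cost_usd("gpt-4o-2024-05-13", 2_000_000, 500_000)  # 17.5
new_cost = cost_usd("gpt-4o-2024-08-06", 2_000_000, 500_000)  # 10.0
```

For this input-heavy workload the bill drops from $17.50 to $10.00, about 43% overall.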

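How to use Structured Outputs

Structured Outputs is exposed in the API in two forms:

Function calling

Structured Outputs via tools is enabled by setting strict: true in the function definition.

This works with every model that supports tools, which means gpt-4-0613, gpt-3.5-turbo-0613, and all later models.

When Structured Outputs is enabled, the model's output will match the supplied tool definition.

Example request: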

POST /v1/chat/completions
{
  "model": "gpt-4o-2024-08-06",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant. The current date is August 6, 2024. You help users query for the data they are looking for by calling the query function."
    },
    {
      "role": "user",
      "content": "look up all my orders in may of last year that were fulfilled but not delivered on time"
    }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "query",
        "description": "Execute a query.",
        "strict": true,
        "parameters": {
          "type": "object",
          "properties": {
            "table_name": {
              "type": "string",
              "enum": ["orders"]
            },
            "columns": {
              "type": "array",
              "items": {
                "type": "string",
                "enum": [
                  "id",
                  "status",
                  "expected_delivery_date",
                  "delivered_at",
                  "shipped_at",
                  "ordered_at",
                  "canceled_at"
                ]
              }
            },
            "conditions": {
              "type": "array",
              "items": {
                "type": "object",
                "properties": {
                  "column": {
                    "type": "string"
                  },
                  "operator": {
                    "type": "string",
                    "enum": ["=", ">", "<", ">=", "<=", "!="]
                  },
                  "value": {
                    "anyOf": [
                      {
                        "type": "string"
                      },
                      {
                        "type": "number"
                      },
                      {
                        "type": "object",
                        "properties": {
                          "column_name": {
                            "type": "string"
                          }
                        },
                        "required": ["column_name"],
                        "additionalProperties": false
                      }
                    ]
                  }
                },
                "required": ["column", "operator", "value"],
                "additionalProperties": false
              }
            },
            "order_by": {
              "type": "string",
              "enum": ["asc", "desc"]
            }
          },
          "required": ["table_name", "columns", "conditions", "order_by"],
          "additionalProperties": false
        }
      }
    }
  ]
}

Example output:

{
  "table_name": "orders",
  "columns": ["id", "status", "expected_delivery_date", "delivered_at"],
  "conditions": [
    {
      "column": "status",
      "operator": "=",
      "value": "fulfilled"
    },
    {
      "column": "ordered_at",
      "operator": ">=",
      "value": "2023-05-01"
    },
    {
      "column": "ordered_at",
      "operator": "<",
      "value": "2023-06-01"
    },
    {
      "column": "delivered_at",
      "operator": ">",
      "value": {
        "column_name": "expected_delivery_date"
      }
    }
  ],
  "order_by": "asc"
}
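
Because the arguments are guaranteed to have this shape, application code can consume them without defensive parsing of the structure. The sketch below turns them into a parameterized SQL string; it is one possible consumer, not part of OpenAI's example. Two things are assumptions: ordering by the first selected column (the schema supplies only a direction), and the explicit allow-lists, which are still needed because `conditions[].column` and `column_name` are free-form strings in the schema above.

```python
ALLOWED_TABLES = {"orders"}
ALLOWED_COLUMNS = {
    "id", "status", "expected_delivery_date", "delivered_at",
    "shipped_at", "ordered_at", "canceled_at",
}
ALLOWED_OPERATORS = {"=", ">", "<", ">=", "<=", "!="}


def check_column(name: str) -> str:
    """Reject column names that are not on the allow-list."""
    if name not in ALLOWED_COLUMNS:
        raise ValueError(f"unknown column: {name!r}")
    return name


def build_sql(args: dict) -> tuple:
    """Return (sql, params) for the tool-call arguments shown above."""
    if args["table_name"] not in ALLOWED_TABLES:
        raise ValueError(f"unknown table: {args['table_name']!r}")
    if not args["columns"]:
        raise ValueError("at least one column is required")
    if args["order_by"] not in {"asc", "desc"}:
        raise ValueError(f"bad sort direction: {args['order_by']!r}")
    columns = [check_column(c) for c in args["columns"]]
    clauses, params = [], []
    for cond in args["conditions"]:
        if cond["operator"] not in ALLOWED_OPERATORS:
            raise ValueError(f"bad operator: {cond['operator']!r}")
        value = cond["value"]
        if isinstance(value, dict):  # comparison against another column
            rhs = check_column(value["column_name"])
        else:  # literal value: bind it as a parameter
            rhs = "?"
            params.append(value)
        clauses.append(f"{check_column(cond['column'])} {cond['operator']} {rhs}")
    sql = f"SELECT {', '.join(columns)} FROM {args['table_name']}"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    # Assumption: sort by the first selected column.
    sql += f" ORDER BY {columns[0]} {args['order_by'].upper()}"
    return sql, params
```

Literal values are bound as `?` parameters instead of being interpolated, so the schema guarantee and the SQL-injection defense stay separate concerns.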

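A new option for the response_format parameter

Developers can now supply a JSON Schema through json_schema, a new option for the response_format parameter.

This is useful when the model is not calling a tool but is responding to the user in a structured way.

The option works with the latest GPT-4o models: gpt-4o-2024-08-06, released today, and gpt-4o-mini-2024-07-18.

When response_format is supplied with strict: true, the model's output will match the supplied schema.

Example request: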

POST /v1/chat/completions
{
  "model": "gpt-4o-2024-08-06",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful math tutor."
    },
    {
      "role": "user",
      "content": "solve 8x + 31 = 2"
    }
  ],
  "response_format": {
    "type": "json_schema",
    "json_schema": {
      "name": "math_response",
      "strict": true,
      "schema": {
        "type": "object",
        "properties": {
          "steps": {
            "type": "array",
            "items": {
              "type": "object",
              "properties": {
                "explanation": {
                  "type": "string"
                },
                "output": {
                  "type": "string"
                }
              },
              "required": ["explanation", "output"],
              "additionalProperties": false
            }
          },
          "final_answer": {
            "type": "string"
          }
        },
        "required": ["steps", "final_answer"],
        "additionalProperties": false
      }
    }
  }
}

Example output:

{
  "steps": [
    {
      "explanation": "Subtract 31 from both sides to isolate the term with x.",
      "output": "8x + 31 - 31 = 2 - 31"
    },
    {
      "explanation": "This simplifies to 8x = -29.",
      "output": "8x = -29"
    },
    {
      "explanation": "Divide both sides by 8 to solve for x.",
      "output": "x = -29 / 8"
    }
  ],
  "final_answer": "x = -29 / 8"
}

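Developers can use Structured Outputs to generate an answer step by step, guiding the model toward the intended output.

According to OpenAI, developers no longer need to validate or retry incorrectly formatted responses, and the feature allows simpler prompting.

Native SDK support

OpenAI says its Python and Node SDKs have been updated with native support for Structured Outputs.

Supplying a schema for tools or as a response format is as easy as supplying a Pydantic or Zod object. The SDKs convert the data type into a supported JSON Schema, automatically deserialize the JSON response into the typed data structure, and parse refusals.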

from enum import Enum
from typing import Union

from pydantic import BaseModel

import openai
from openai import OpenAI


class Table(str, Enum):
    orders = "orders"
    customers = "customers"
    products = "products"


class Column(str, Enum):
    id = "id"
    status = "status"
    expected_delivery_date = "expected_delivery_date"
    delivered_at = "delivered_at"
    shipped_at = "shipped_at"
    ordered_at = "ordered_at"
    canceled_at = "canceled_at"


class Operator(str, Enum):
    eq = "="
    gt = ">"
    lt = "<"
    le = "<="
    ge = ">="
    ne = "!="


class OrderBy(str, Enum):
    asc = "asc"
    desc = "desc"


class DynamicValue(BaseModel):
    column_name: str


class Condition(BaseModel):
    column: str
    operator: Operator
    value: Union[str, int, DynamicValue]


class Query(BaseModel):
    table_name: Table
    columns: list[Column]
    conditions: list[Condition]
    order_by: OrderBy


client = OpenAI()

completion = client.beta.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=[
        {
            "role": "system",
            "content": "You are a helpful assistant. The current date is August 6, 2024. You help users query for the data they are looking for by calling the query function.",
        },
        {
            "role": "user",
            "content": "look up all my orders in may of last year that were fulfilled but not delivered on time",
        },
    ],
    tools=[
        openai.pydantic_function_tool(Query),
    ],
)

print(completion.choices[0].message.tool_calls[0].function.parsed_arguments)

Native Structured Outputs support is also available for response_format.

from pydantic import BaseModel

from openai import OpenAI


class Step(BaseModel):
    explanation: str
    output: str


class MathResponse(BaseModel):
    steps: list[Step]
    final_answer: str


client = OpenAI()

completion = client.beta.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=[
        {"role": "system", "content": "You are a helpful math tutor."},
        {"role": "user", "content": "solve 8x + 31 = 2"},
    ],
    response_format=MathResponse,
)

message = completion.choices[0].message
if message.parsed:
    print(message.parsed.steps)
    print(message.parsed.final_answer)
else:
    print(message.refusal)
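
For readers without the SDK at hand, the "deserialize into typed data structures" step can be approximated with the standard library alone. This is a dependency-free sketch of the idea, not how the OpenAI SDK or Pydantic is implemented; the dataclasses simply mirror the `Step` and `MathResponse` models above, and `parse_math_response` is a name made up for this example.

```python
import json
from dataclasses import dataclass


@dataclass
class Step:
    explanation: str
    output: str


@dataclass
class MathResponse:
    steps: list
    final_answer: str


def parse_math_response(raw: str) -> MathResponse:
    """Turn the JSON text of a response into typed objects."""
    data = json.loads(raw)
    return MathResponse(
        # Step(**step) fails loudly if a key is missing or unexpected.
        steps=[Step(**step) for step in data["steps"]],
        final_answer=data["final_answer"],
    )
```

With strict schema matching, the missing-key and extra-key failures this code would raise should not occur for complete, non-refused responses.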

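Other use cases

Developers frequently use OpenAI's models to generate structured data for a variety of use cases.

Some additional examples include:

- Dynamically generating user interfaces based on the user's intent

Developers can use Structured Outputs to build applications that generate code or UI.

With the same response_format, different UIs can be generated from different user inputs.

For example, creating a "landing page for a gardener":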

It was generated from the following JSON:

{
  "type": "div",
  "label": "",
  "children": [
    {
      "type": "header",
      "label": "",
      "children": [
        {
          "type": "div",
          "label": "Green Thumb Gardening",
          "children": [],
          "attributes": [{ "name": "className", "value": "site-title" }]
        },
        {
          "type": "div",
          "label": "Bringing Life to Your Garden",
          "children": [],
          "attributes": [{ "name": "className", "value": "site-tagline" }]
        }
      ],
      "attributes": [{ "name": "className", "value": "header" }]
    },
    {
      "type": "section",
      "label": "",
      "children": [
        {
          "type": "div",
          "label": "",
          "children": [
            {
              "type": "div",
              "label": "About Us",
              "children": [
                {
                  "type": "div",
                  "label": "At Green Thumb Gardening, we specialize in transforming your outdoor spaces into beautiful, thriving gardens. Our team has decades of experience in horticulture and landscape design.",
                  "children": [],
                  "attributes": [
                    { "name": "className", "value": "about-description" }
                  ]
                }
              ],
              "attributes": [{ "name": "className", "value": "about-section" }]
            }
          ],
          "attributes": [{ "name": "className", "value": "content" }]
        }
      ],
      "attributes": [{ "name": "className", "value": "about-container" }]
    },
    {
      "type": "section",
      "label": "",
      "children": [
        {
          "type": "div",
          "label": "",
          "children": [
            {
              "type": "div",
              "label": "Our Services",
              "children": [
                {
                  "type": "div",
                  "label": "Garden Design",
                  "children": [],
                  "attributes": [
                    { "name": "className", "value": "service-item" }
                  ]
                },
                {
                  "type": "div",
                  "label": "Plant Care & Maintenance",
                  "children": [],
                  "attributes": [
                    { "name": "className", "value": "service-item" }
                  ]
                },
                {
                  "type": "div",
                  "label": "Seasonal Cleanup",
                  "children": [],
                  "attributes": [
                    { "name": "className", "value": "service-item" }
                  ]
                },
                {
                  "type": "div",
                  "label": "Custom Landscaping",
                  "children": [],
                  "attributes": [
                    { "name": "className", "value": "service-item" }
                  ]
                }
              ],
              "attributes": [{ "name": "className", "value": "services-list" }]
            }
          ],
          "attributes": [{ "name": "className", "value": "content" }]
        }
      ],
      "attributes": [{ "name": "className", "value": "services-container" }]
    }
  ],
  "attributes": [{ "name": "className", "value": "landing-page" }]
}
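
A tree like this still has to be turned into something a browser can display. The recursive renderer below is a rough sketch of that step under a few assumptions that do not come from the article: the allowed tag set is limited to the three tags seen in the example, `className` is mapped to the HTML `class` attribute, and labels are emitted as escaped text before the children.

```python
from html import escape

# Assumption: only the tags that appear in the example are permitted.
ALLOWED_TAGS = {"div", "header", "section"}


def render_attrs(attributes: list) -> str:
    """Render a list of {"name", "value"} pairs as HTML attributes."""
    parts = []
    for attr in attributes:
        name = "class" if attr["name"] == "className" else attr["name"]
        if not name.replace("-", "").isalnum():
            raise ValueError(f"suspicious attribute name: {name!r}")
        parts.append(' {}="{}"'.format(name, escape(attr["value"], quote=True)))
    return "".join(parts)


def render(node: dict) -> str:
    """Recursively render one UI node and all of its children."""
    if node["type"] not in ALLOWED_TAGS:
        raise ValueError(f"unsupported tag: {node['type']!r}")
    children = "".join(render(child) for child in node["children"])
    return "<{tag}{attrs}>{label}{children}</{tag}>".format(
        tag=node["type"],
        attrs=render_attrs(node["attributes"]),
        label=escape(node["label"]),
        children=children,
    )
```

The schema guarantees the shape of every node, which is why `render` can index `children`, `label`, and `attributes` without checking that they exist. Escaping is still required because the string contents are model-generated.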

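- Separating the final answer from supporting reasoning or additional commentary

Giving the model a separate field for its chain of thought can improve the final quality of the response.

Request: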

{
  "model": "gpt-4o-2024-08-06",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant"
    },
    {
      "role": "user",
      "content": "9.11 and 9.9 -- which is bigger?"
    }
  ],
  "response_format": {
    "type": "json_schema",
    "json_schema": {
      "name": "reasoning_schema",
      "strict": true,
      "schema": {
        "type": "object",
        "properties": {
          "reasoning_steps": {
            "type": "array",
            "items": {
              "type": "string"
            },
            "description": "The reasoning steps leading to the final conclusion."
          },
          "answer": {
            "type": "string",
            "description": "The final answer, taking into account the reasoning steps."
          }
        },
        "required": ["reasoning_steps", "answer"],
        "additionalProperties": false
      }
    }
  }
}

Structured output:

{
  "reasoning_steps": [
    "First step is to compare the numbers 9.11 and 9.9.",
    "Both numbers have the same whole number part, which is 9.",
    "To compare the decimal parts, convert them to the same number of decimal places.",
    "9.11 has two decimal places: it is 9.11.",
    "9.9 has one decimal place: it can be rewritten as 9.90.",
    "Now, compare 9.11 and 9.90 by looking at the decimal parts.",
    "Compare 11 with 90.",
    "90 is greater than 11, so 9.90 is greater than 9.11."
  ],
  "answer": "9.9 is bigger than 9.11."
}
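
One practical benefit of the separate fields is that an application can show only the answer and keep the reasoning for logs or debugging. A small sketch of that split follows; the function names are made up for illustration.

```python
import json


def split_answer(raw: str) -> tuple:
    """Return (answer, reasoning_steps) from a reasoning_schema response."""
    data = json.loads(raw)
    return data["answer"], list(data["reasoning_steps"])


def format_for_log(steps: list) -> str:
    """Number the reasoning steps so they read well in a log file."""
    return "\n".join(f"{n}. {step}" for n, step in enumerate(steps, start=1))
```

Here the user interface would display the first element of the tuple, while `format_for_log` output goes to the application log.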

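- Extracting structured data from unstructured data

For example, instructing the model to extract items such as to-dos, due dates, and assignments from meeting notes.

Request: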

POST /v1/chat/completions
{
  "model": "gpt-4o-2024-08-06",
  "messages": [
    {
      "role": "system",
      "content": "Extract action items, due dates, and owners from meeting notes."
    },
    {
      "role": "user",
      "content": "...meeting notes go here..."
    }
  ],
  "response_format": {
    "type": "json_schema",
    "json_schema": {
      "name": "action_items",
      "strict": true,
      "schema": {
        "type": "object",
        "properties": {
          "action_items": {
            "type": "array",
            "items": {
              "type": "object",
              "properties": {
                "description": {
                  "type": "string",
                  "description": "Description of the action item."
                },
                "due_date": {
                  "type": ["string", "null"],
                  "description": "Due date for the action item, can be null if not specified."
                },
                "owner": {
                  "type": ["string", "null"],
                  "description": "Owner responsible for the action item, can be null if not specified."
                }
              },
              "required": ["description", "due_date", "owner"],
              "additionalProperties": false
            },
            "description": "List of action items from the meeting."
          }
        },
        "required": ["action_items"],
        "additionalProperties": false
      }
    }
  }
}

Structured output:

{
  "action_items": [
    {
      "description": "Collaborate on optimizing the path planning algorithm",
      "due_date": "2024-06-30",
      "owner": "Jason Li"
    },
    {
      "description": "Reach out to industry partners for additional datasets",
      "due_date": "2024-06-25",
      "owner": "Aisha Patel"
    },
    {
      "description": "Explore alternative LIDAR sensor configurations and report findings",
      "due_date": "2024-06-27",
      "owner": "Kevin Nguyen"
    },
    {
      "description": "Schedule extended stress tests for the integrated navigation system",
      "due_date": "2024-06-28",
      "owner": "Emily Chen"
    },
    {
      "description": "Retest the system after bug fixes and update the team",
      "due_date": "2024-07-01",
      "owner": "David Park"
    }
  ]
}
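
Downstream code can then work with the extracted items directly. The sketch below sorts them by due date and puts undated items last. It assumes the dates are ISO-formatted (YYYY-MM-DD) as in the example; the schema itself only says "string or null", so a production version would need to handle other formats.

```python
import json
from datetime import date


def sort_action_items(raw: str) -> list:
    """Return action items ordered by due date, with undated items last."""
    items = json.loads(raw)["action_items"]
    dated = [item for item in items if item["due_date"] is not None]
    undated = [item for item in items if item["due_date"] is None]
    # Assumption: due dates are ISO strings such as "2024-06-30".
    dated.sort(key=lambda item: date.fromisoformat(item["due_date"]))
    return dated + undated
```

Because `due_date` and `owner` are listed under `required` but typed as nullable, the keys are always present and only the `None` case needs handling.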

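Safe Structured Outputs

Safety is a top priority for OpenAI. The new Structured Outputs functionality abides by OpenAI's existing safety policies and still allows the model to refuse an unsafe request.

To make development simpler, API responses carry a new refusal string value, which lets developers detect programmatically whether the model generated a refusal instead of output matching the schema.

When the response contains no refusal and the model's response was not cut off prematurely (as indicated by finish_reason), the response will reliably be valid JSON that matches the supplied schema.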

{
  "id": "chatcmpl-9nYAG9LPNonX8DAyrkwYfemr3C8HC",
  "object": "chat.completion",
  "created": 1721596428,
  "model": "gpt-4o-2024-08-06",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "refusal": "I'm sorry, I cannot assist with that request."
      },
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 81,
    "completion_tokens": 11,
    "total_tokens": 92
  },
  "system_fingerprint": "fp_3407719c7f"
}
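
Put together, the two conditions described above suggest a handler along these lines. It is a sketch that operates on the raw response as a plain dictionary shaped like the example above; the exception classes and the function name are invented for illustration.

```python
import json


class ModelRefusal(Exception):
    """The model declined the request instead of returning data."""


class IncompleteOutput(Exception):
    """Generation stopped early, so the JSON may be truncated."""


def extract_structured(response: dict) -> dict:
    """Return the parsed JSON content, or raise if it cannot be trusted."""
    choice = response["choices"][0]
    message = choice["message"]
    if message.get("refusal"):
        raise ModelRefusal(message["refusal"])
    if choice["finish_reason"] != "stop":
        # For example "length" when the token limit cut the output short.
        raise IncompleteOutput(choice["finish_reason"])
    return json.loads(message["content"])
```

Only after both checks pass does the code parse `content`, which is the point at which the schema guarantee applies.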

References:

https://openai.com/index/introducing-structured-outputs-in-the-api/

https://x.com/sama/status/1820881534909300769

https://venturebeat.com/ai/openai-has-finally-released-the-no-1-feature-developers-have-been-desperate-for/

This article comes from the WeChat official account 新智元; author: 新智元.

Keywords: OpenAI, AI, ChatGPT, large models
