跳转到主要内容

标签(标签)

资源精选(342) Go开发(108) Go语言(103) Go(99) angular(83) LLM(79) 大语言模型(63) 人工智能(53) 前端开发(50) LangChain(43) golang(43) 机器学习(39) Go工程师(38) Go程序员(38) Go开发者(36) React(34) Go基础(29) Python(24) Vue(23) Web开发(20) Web技术(19) 精选资源(19) 深度学习(19) Java(18) ChatGTP(17) Cookie(16) android(16) 前端框架(13) JavaScript(13) Next.js(12) 安卓(11) 聊天机器人(10) typescript(10) 资料精选(10) NLP(10) 第三方Cookie(9) Redwoodjs(9) ChatGPT(9) LLMOps(9) Go语言中级开发(9) 自然语言处理(9) PostgreSQL(9) 区块链(9) mlops(9) 安全(9) 全栈开发(8) OpenAI(8) Linux(8) AI(8) GraphQL(8) iOS(8) 软件架构(7) RAG(7) Go语言高级开发(7) AWS(7) C++(7) 数据科学(7) 智能体(6) whisper(6) Prisma(6) 隐私保护(6) JSON(6) DevOps(6) 数据可视化(6) wasm(6) 计算机视觉(6) 算法(6) Rust(6) 微服务(6) 隐私沙盒(5) FedCM(5) 语音识别(5) Angular开发(5) 快速应用开发(5) 提示工程(5) Agent(5) LLaMA(5) 低代码开发(5) Go测试(5) gorm(5) REST API(5) kafka(5) 推荐系统(5) WebAssembly(5) GameDev(5) CMS(5) CSS(5) machine-learning(5) 机器人(5) 游戏开发(5) Blockchain(5) Web安全(5) nextjs(5) Kotlin(5) 低代码平台(5) 机器学习资源(5) Go资源(5) Nodejs(5) PHP(5) Swift(5) RAG架构(4) devin(4) Blitz(4) javascript框架(4) Redwood(4) GDPR(4) 生成式人工智能(4) Angular16(4) Alpaca(4) 编程语言(4) SAML(4) JWT(4) JSON处理(4) Go并发(4) 移动开发(4) 移动应用(4) security(4) 隐私(4) spring-boot(4) 物联网(4) 网络安全(4) API(4) Ruby(4) 信息安全(4) flutter(4) 专家智能体(3) Chrome(3) CHIPS(3) 3PC(3) SSE(3) 人工智能软件工程师(3) LLM Agent(3) Remix(3) Ubuntu(3) GPT4All(3) 软件开发(3) 问答系统(3) 开发工具(3) 最佳实践(3) RxJS(3) SSR(3) Node.js(3) Dolly(3) 移动应用开发(3) 低代码(3) IAM(3) Web框架(3) CORS(3) 基准测试(3) Go语言数据库开发(3) Oauth2(3) 并发(3) 主题(3) Theme(3) earth(3) nginx(3) 软件工程(3) azure(3) keycloak(3) 生产力工具(3) gpt3(3) 工作流(3) C(3) jupyter(3) 认证(3) prometheus(3) GAN(3) Spring(3) 逆向工程(3) 应用安全(3) Docker(3) Django(3) R(3) .NET(3) 大数据(3) Hacking(3) 渗透测试(3) C++资源(3) Mac(3) 微信小程序(3) Python资源(3) JHipster(3) 语言模型(2) 可穿戴设备(2) JDK(2) SQL(2) Apache(2) Hashicorp Vault(2) Spring Cloud Vault(2) Go语言Web开发(2) Go测试工程师(2) WebSocket(2) 容器化(2) AES(2) 加密(2) 输入验证(2) ORM(2) Fiber(2) Postgres(2) Gorilla Mux(2) Go数据库开发(2) 模块(2) 泛型(2) 指针(2) HTTP(2) PostgreSQL开发(2) Vault(2) K8s(2) Spring boot(2) R语言(2) 深度学习资源(2) 半监督学习(2) semi-supervised-learning(2) architecture(2) 普罗米修斯(2) 嵌入模型(2) productivity(2) 编码(2) Qt(2) 前端(2) Rust语言(2) NeRF(2) 神经辐射场(2) 元宇宙(2) CPP(2) 数据分析(2) spark(2) 流处理(2) Ionic(2) 人体姿势估计(2) human-pose-estimation(2) 视频处理(2) deep-learning(2) kotlin语言(2) kotlin开发(2) burp(2) Chatbot(2) npm(2) quantum(2) OCR(2) 游戏(2) game(2) 内容管理系统(2) MySQL(2) python-books(2) pentest(2) opengl(2) IDE(2) 漏洞赏金(2) Web(2) 知识图谱(2) PyTorch(2) 数据库(2) reverse-engineering(2) 数据工程(2) swift开发(2) rest(2) robotics(2) ios-animation(2) 知识蒸馏(2) 安卓开发(2) nestjs(2) solidity(2) 爬虫(2) 面试(2) 容器(2) C++精选(2) 人工智能资源(2) Machine Learning(2) 备忘单(2) 编程书籍(2) angular资源(2) 速查表(2) cheatsheets(2) SecOps(2) mlops资源(2) R资源(2) DDD(2) 架构设计模式(2) 量化(2) Hacking资源(2) 强化学习(2) flask(2) 设计(2) 性能(2) Sysadmin(2) 系统管理员(2) Java资源(2) 机器学习精选(2) android资源(2) android-UI(2) Mac资源(2) iOS资源(2) Vue资源(2) flutter资源(2) JavaScript精选(2) JavaScript资源(2) Rust开发(2) deeplearning(2) RAD(2)

category

In the previous part, we covered basic architecture concepts for using Large Language Models (LLMs) like GPT inside smart features and products. Today, we focus on how our application can interpret the completion before returning a response to the user.

Table of Contents

1. Overview
2. Asking for JSON completions
3. Managing output consistency
4. Function calling
5. Tools (plugins)

01. Overview

After we’ve (1) prepared our prompt and (2) sent it to the LLM to generate a response, (3) our application can read the completion to determine follow-up actions like triggering events, executing functions, calling databases, or making other LLM requests.

Basic client/server AI chat workflow.

02. Asking for JSON completions

To programmatically trigger actions based on the LLM’s response, we don’t want to play a guessing game, trying to discern intent through scattered words and phrases in the completion.

A more reliable solution is to prompt the model to respond in JSON, which is a widely-used text format for representing structured data, supported by all major programming languages, and human-readable. 🧐

Here is an example of JSON describing criteria for a trip:

{
  "departureCity": "New York, USA",
  "destination": "Paris, France",
  "numberOfTravelers": 2,
  "businessTrip": false,
  "specialRequests": [
    "Room with view of eiffel tower",
    "Guided tour arrangements"
  ],
  "budgetRange": {
    "min": 3000,
    "max": 4000
  }
}

Describing our desired structure

Letting the AI improvize the data structure won’t be reliable or actionable. Our prompt must describe what we’re looking for.

💡 For this purpose, even if we want JSON outputs, the common practice with GPT is to use Typescript-style type definition. A simple syntax that can describe data types (text strings, numbers, booleans, etc.), optional properties (using ?), and comments to guide the LLM (starting with //).

Example of address extracting prompt.

This “technical” prompt engineering might require someone with coding knowledge. 🤝 But if multiple people work on different parts of your prompts, make sure they collaborate closely, as their instructions could potentially interfere with one another.

03. Managing output consistency

The amazing non-deterministic capabilities of LLMs come at a cost: they do not always follow format instructions. 😩 There are several factors at play:

Code fluency of the model

The less code your model was initially trained on, the wordier instruction it might require. If it didn’t learn any at all, I don’t think you make up for that with prompting.

Fine-tuning

On the other hand, if you fine-tune an LLM with only examples of the JSON structure you’re looking for, it will learn to do this task and won’t require format instruction anymore.

Prompt engineering

With capable enough models like GPT, output consistency can still be influenced by:

  • The phrasing of the instruction
  • The position of the instructions in the prompt
  • And the rest of the prompt/chat history, as they could contain sentences that confuse the model.

👍 Tips with GPT:

  • When using large prompts, place your format instructions at the end, close to the expected response.
  • Ask the model to wrap the JSON between the ```json and ``` Markdown delimiters, which seem to help it write valid code and facilitate the extraction.
  • Get inspiration from tools like LangChain‘s Structured Output Parser, but ensure it fits your specific needs and the model you use.

Hot take on LangChain 🦜🔥

LangChain is the most popular Python/JavaScript framework simplifying the development LLM applications by providing plug-and-play prompt chaining, RAG, agents/tools, along with multi-AI-provider support.

The community behind it curates prompts and architecture best practices from the latest research papers. It’s a great source of inspiration, ideal for prototyping and solving common use cases (e.g., chat with a knowledge base).

But I consider that real-life production applications require more control of your system, which is not rocket science BTW (and the reason for this series of articles). 🤓

Using LangChain, you are dependent on their underlying choices and updates like this format-instructions prompt that doubled in token size in May (see before and after). If you’re only using GPT, it doesn’t need to be explained what JSON is. You’re just paying extra tokens, reducing your context window, and risking confusing the model’s attention. 🍗🙄

04. Function calling

Let’s go back to our app! Now, if we prompt the model with a list of possible actions and request a JSON-formatted completion, our app can easily interpret it and execute the corresponding code for each action.

This allows the AI to interact with various components of our application (memory, database, API calls, etc.)

Simple shopping list assistant.

Security concerns

Obviously, if uncontrolled, letting a non-deterministic AI play with your system exposes you to security issues like prompt injection (a hacker crafting a message that gets the model to do what they want). To alleviate that:

  • Avoid letting the AI execute arbitrary code like running its own SQL queries. Instead, limit its interactions to a set of preapproved actions.
  • Execute AI actions with user-level permissions to make sure it cannot access data from other users.
  • Make sure you understand what AI tools can do before using them. Engineering expertise and critical thinking are crucial here.

OpenAI’s function calling

In June 2023, OpenAI released access to their GPT function-calling feature, which, much like chat prompts, introduced a new proprietary API syntax abstracting what could already be done with prompting, coupled with a fine-tuning of their model for more reliability.

☝ Their implementation is reliable but not yet a standard across AI models and providers. Just like LangChain, it represents a tradeoff between ease of use and control over your tech.

05. Tools (plugins)

Combining JSON responses, function calling, and prompt chaining, we can assemble a smart assistant able to complete tasks involving external resources.

Example of a Calendar Manager AI assistant.
  1. We start by running a first LLM query, providing the model with a task and a set of tools.
  2. Based on its response, we execute the corresponding function and feed the result back to the model in a subsequent query.
  3. And we repeat the process, letting the AI decide on the next step until it reaches a final answer.
You can find and copy/paste the full prompt here.

👀 With GPT chat prompts, I use “user” messages for everything that represents the environment perceived by the AI, even if it’s not a message from the user. It seems more natural with the way the model was fine-tuned.

🙃 I didn’t use Typescript-style format instruction here because the JSON is part of an example (few-shot prompting) and didn’t require optional properties or comments. The only rule is to experiment and find what’s appropriate for your specific use case.

🧠 Prompting the AI to verbalize its thought process is known to radically increase the output’s quality. This technique is called Chain of Thought.

📆 The LLM doesn’t know the current date and time. If your use case requires it, add it to your prompt (just like ChatGPT).

👩🏽 Finally, you don’t need to show the user all of the AI's internal thoughts. But you can reduce waiting time perception by giving action feedback (e.g., “checking calendar”), just like ChatGPT does with their plugins.