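Image input

File attachment (type: "file"): provide an absolute path; the runtime reads the file from disk, converts it to base64, and sends it to the LLM.
Blob attachment (type: "blob"): provide base64-encoded data directly; useful when the image is already in memory (for example a screenshot, a generated image, or data from an API).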

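Overview

Diagram: a sequence diagram showing the process described.

Concept	Description
File attachment	An attachment with `type: "file"` and an absolute `path` to an image on disk
Blob attachment	An attachment with `type: "blob"`, base64-encoded `data`, and a `mimeType`; no disk I/O needed
Automatic encoding	For file attachments, the runtime reads the image and converts it to base64 automatically
Automatic resizing	The runtime automatically resizes, or reduces the quality of, images that exceed model-specific limits
Vision capability	The model must have `capabilities.supports.vision = true` to process images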

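Quick start: file attachment

Use the file attachment type to attach an image file to any message. The path must be an absolute path to an image on disk.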

TypeScript

import { CopilotClient } from "@github/copilot-sdk";

const client = new CopilotClient();
await client.start();

const session = await client.createSession({
    model: "gpt-4.1",
    onPermissionRequest: async () => ({ kind: "approve-once" }),
});

await session.send({
    prompt: "Describe what you see in this image",
    attachments: [
        {
            type: "file",
            path: "/absolute/path/to/screenshot.png",
        },
    ],
});

Python

from copilot import CopilotClient, PermissionDecisionApproveOnce

client = CopilotClient()
await client.start()

session = await client.create_session(
    on_permission_request=lambda req, inv: PermissionDecisionApproveOnce(),
    model="gpt-4.1",
)

await session.send(
    "Describe what you see in this image",
    attachments=[
        {
            "type": "file",
            "path": "/absolute/path/to/screenshot.png",
        },
    ],
)

Go

package main

import (
    "context"
    copilot "github.com/github/copilot-sdk/go"
    "github.com/github/copilot-sdk/go/rpc"
)

func main() {
    ctx := context.Background()
    client := copilot.NewClient(nil)
    client.Start(ctx)

    session, _ := client.CreateSession(ctx, &copilot.SessionConfig{
        Model: "gpt-4.1",
        OnPermissionRequest: func(req copilot.PermissionRequest, inv copilot.PermissionInvocation) (rpc.PermissionDecision, error) {
            return &rpc.PermissionDecisionApproveOnce{}, nil
        },
    })

    path := "/absolute/path/to/screenshot.png"
    session.Send(ctx, copilot.MessageOptions{
        Prompt: "Describe what you see in this image",
        Attachments: []copilot.Attachment{
            &copilot.UserMessageAttachmentFile{
                DisplayName: "screenshot.png",
                Path:        path,
            },
        },
    })
}

.NET

using GitHub.Copilot;
using GitHub.Copilot.Rpc;

public static class ImageInputExample
{
    public static async Task Main()
    {
        await using var client = new CopilotClient();
        await using var session = await client.CreateSessionAsync(new SessionConfig
        {
            Model = "gpt-4.1",
            OnPermissionRequest = (req, inv) =>
                Task.FromResult(PermissionDecision.ApproveOnce()),
        });

        await session.SendAsync(new MessageOptions
        {
            Prompt = "Describe what you see in this image",
            Attachments = new List<UserMessageAttachment>
            {
                new UserMessageAttachmentFile
                {
                    Path = "/absolute/path/to/screenshot.png",
                    DisplayName = "screenshot.png",
                },
            },
        });
    }
}

Java

import com.github.copilot.sdk.CopilotClient;
import com.github.copilot.sdk.events.*;
import com.github.copilot.sdk.json.*;
import java.util.List;

try (var client = new CopilotClient()) {
    client.start().get();

    var session = client.createSession(
        new SessionConfig()
            .setModel("gpt-4.1")
            .setOnPermissionRequest(PermissionHandler.APPROVE_ALL)
    ).get();

    session.send(new MessageOptions()
        .setPrompt("Describe what you see in this image")
        .setAttachments(List.of(
            new Attachment("file", "/absolute/path/to/screenshot.png", "screenshot.png")
        ))
    ).get();
}
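
Because file attachments need an absolute path, it can help to normalize user-supplied paths before attaching them. The following is a minimal Python sketch, assuming a hypothetical helper named `file_attachment` (it is not part of the SDK) that produces the same dictionary shape used in the Python example above:

```python
from pathlib import Path


def file_attachment(path: str) -> dict:
    """Build a file attachment dict with an absolute path.

    Hypothetical helper, not part of the SDK. It only mirrors the
    {"type": "file", "path": ...} shape shown in the examples.
    """
    # expanduser() handles a leading "~"; resolve() makes the path
    # absolute relative to the current working directory.
    absolute = Path(path).expanduser().resolve()
    return {"type": "file", "path": str(absolute)}
```

The returned dictionary could then be passed in the `attachments` list of `session.send`.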

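Quick start: blob attachment

If the image data is already in memory (for example a screenshot captured by your app or an image fetched from an API), use a blob attachment to send it directly without writing it to disk.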

TypeScript

import { CopilotClient } from "@github/copilot-sdk";

const client = new CopilotClient();
await client.start();

const session = await client.createSession({
    model: "gpt-4.1",
    onPermissionRequest: async () => ({ kind: "approve-once" }),
});

const base64ImageData = "..."; // your base64-encoded image
await session.send({
    prompt: "Describe what you see in this image",
    attachments: [
        {
            type: "blob",
            data: base64ImageData,
            mimeType: "image/png",
            displayName: "screenshot.png",
        },
    ],
});

Python

from copilot import CopilotClient, PermissionDecisionApproveOnce

client = CopilotClient()
await client.start()

session = await client.create_session(
    on_permission_request=lambda req, inv: PermissionDecisionApproveOnce(),
    model="gpt-4.1",
)

base64_image_data = "..."  # your base64-encoded image
await session.send(
    "Describe what you see in this image",
    attachments=[
        {
            "type": "blob",
            "data": base64_image_data,
            "mimeType": "image/png",
            "displayName": "screenshot.png",
        },
    ],
)

Go

package main

import (
    "context"
    copilot "github.com/github/copilot-sdk/go"
    "github.com/github/copilot-sdk/go/rpc"
)

func main() {
    ctx := context.Background()
    client := copilot.NewClient(nil)
    client.Start(ctx)

    session, _ := client.CreateSession(ctx, &copilot.SessionConfig{
        Model: "gpt-4.1",
        OnPermissionRequest: func(req copilot.PermissionRequest, inv copilot.PermissionInvocation) (rpc.PermissionDecision, error) {
            return &rpc.PermissionDecisionApproveOnce{}, nil
        },
    })

    base64ImageData := "..."
    mimeType := "image/png"
    displayName := "screenshot.png"
    session.Send(ctx, copilot.MessageOptions{
        Prompt: "Describe what you see in this image",
        Attachments: []copilot.Attachment{
            &copilot.UserMessageAttachmentBlob{
                Data:        base64ImageData,
                MIMEType:    mimeType,
                DisplayName: &displayName,
            },
        },
    })
}

.NET

using GitHub.Copilot;
using GitHub.Copilot.Rpc;

public static class BlobAttachmentExample
{
    public static async Task Main()
    {
        await using var client = new CopilotClient();
        await using var session = await client.CreateSessionAsync(new SessionConfig
        {
            Model = "gpt-4.1",
            OnPermissionRequest = (req, inv) =>
                Task.FromResult(PermissionDecision.ApproveOnce()),
        });

        var base64ImageData = "...";
        await session.SendAsync(new MessageOptions
        {
            Prompt = "Describe what you see in this image",
            Attachments = new List<UserMessageAttachment>
            {
                new UserMessageAttachmentBlob
                {
                    Data = base64ImageData,
                    MimeType = "image/png",
                    DisplayName = "screenshot.png",
                },
            },
        });
    }
}

Java

import com.github.copilot.sdk.CopilotClient;
import com.github.copilot.sdk.events.*;
import com.github.copilot.sdk.json.*;
import java.util.List;

try (var client = new CopilotClient()) {
    client.start().get();

    var session = client.createSession(
        new SessionConfig()
            .setModel("gpt-4.1")
            .setOnPermissionRequest(PermissionHandler.APPROVE_ALL)
    ).get();

    var base64ImageData = "..."; // your base64-encoded image
    session.send(new MessageOptions()
        .setPrompt("Describe what you see in this image")
        .setAttachments(List.of(
            new BlobAttachment()
                .setData(base64ImageData)
                .setMimeType("image/png")
                .setDisplayName("screenshot.png")
        ))
    ).get();
}
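
The examples above leave the base64 string as a placeholder. As a rough Python sketch of how raw image bytes might be turned into a blob attachment, assuming a hypothetical helper named `blob_attachment` (not part of the SDK) and the dictionary keys shown in the Python example:

```python
import base64


def blob_attachment(image_bytes: bytes, mime_type: str, display_name: str) -> dict:
    """Build a blob attachment dict from raw image bytes.

    Hypothetical helper, not part of the SDK.
    """
    # b64encode returns bytes; decode to ASCII to get a plain string.
    data = base64.b64encode(image_bytes).decode("ascii")
    return {
        "type": "blob",
        "data": data,
        "mimeType": mime_type,
        "displayName": display_name,
    }
```

The caller is responsible for passing a MIME type that matches the bytes, since nothing in this sketch inspects the image content.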

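Supported formats

Supported image formats include JPG, PNG, GIF, and other common image types. For file attachments, the runtime reads the image from disk and converts it as needed. For blob attachments, you provide the base64 data and MIME type directly. Use PNG or JPEG for best results, because these are the most widely supported formats.

The model's capabilities.limits.vision.supported_media_types field lists the exact MIME types it accepts.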
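
One way to use that list is to check a file's MIME type before attaching it. The sketch below is Python and guesses the type from the file extension with the standard `mimetypes` module; the helper name `is_supported_media_type` is hypothetical, and the list of supported types is assumed to have already been read from the model's capabilities:

```python
import mimetypes
from typing import Sequence


def is_supported_media_type(path: str, supported_media_types: Sequence[str]) -> bool:
    """Return True if the file's guessed MIME type is in the supported list.

    Hypothetical helper. The guess is based on the file extension only,
    not on the file's actual contents.
    """
    mime_type, _encoding = mimetypes.guess_type(path)
    return mime_type is not None and mime_type in supported_media_types
```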

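Automatic processing

The runtime automatically processes images to fit the model's constraints. No manual resizing is required.

Images that exceed the model's dimension or size limits are automatically resized (preserving aspect ratio) or reduced in quality.
If an image still cannot fit within the limits after processing, it is skipped and not sent to the LLM.
The model's capabilities.limits.vision.max_prompt_image_size field indicates the maximum image size in bytes.

You can inspect these limits at runtime through the model capabilities object. For the best experience, use reasonably sized PNG or JPEG images.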
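
Although the runtime handles oversized images itself, an application may want to know in advance that an image is likely to be resized or degraded. A minimal Python sketch, assuming the limit is compared against the decoded byte size of the image (the source does not say whether the limit applies to raw or encoded bytes) and using hypothetical helper names:

```python
import base64


def decoded_size(base64_data: str) -> int:
    """Return the size in bytes of the image once base64 is decoded."""
    return len(base64.b64decode(base64_data))


def exceeds_size_limit(base64_data: str, max_prompt_image_size: int) -> bool:
    """Return True if the decoded image is larger than the given limit.

    Assumption: max_prompt_image_size refers to decoded bytes.
    """
    return decoded_size(base64_data) > max_prompt_image_size
```

A caller could log a warning when `exceeds_size_limit` returns True and still send the image, leaving the actual resizing to the runtime.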

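Vision model capabilities

Not all models support vision. Check the model's capabilities before sending images.

Capability fields

Field	Type	Description
`capabilities.supports.vision`	`boolean`	Whether the model can process image input
`capabilities.limits.vision.supported_media_types`	`string[]`	MIME types the model accepts (for example `["image/png", "image/jpeg"]`)
`capabilities.limits.vision.max_prompt_images`	`number`	Maximum number of images per prompt
`capabilities.limits.vision.max_prompt_image_size`	`number`	Maximum image size in bytes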
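
These fields can be combined into a simple pre-flight check. The Python sketch below assumes the capabilities are available as a plain dictionary that mirrors the field names in the table; how the SDK actually exposes the object is not shown here, and `can_send_images` is a hypothetical name:

```python
def can_send_images(capabilities: dict, image_count: int) -> bool:
    """Return True if the model supports vision and accepts this many images.

    Assumption: capabilities is a dict shaped like
    {"supports": {"vision": bool}, "limits": {"vision": {...}}}.
    """
    if not capabilities.get("supports", {}).get("vision", False):
        return False
    vision_limits = capabilities.get("limits", {}).get("vision")
    if vision_limits is None:
        # Vision is supported but no limits were reported.
        return True
    return image_count <= vision_limits["max_prompt_images"]
```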

Vision limits type

interface VisionCapabilities {
    vision?: {
        supported_media_types: string[];
        max_prompt_images: number;
        max_prompt_image_size: number; // bytes
    };
}

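Receiving image results

When a tool returns an image (for example a screenshot or a generated chart), the result contains "image" content blocks with base64-encoded data.

Field	Type	Description
`type`	`"image"`	Content block type discriminator
`data`	`string`	Base64-encoded image data
`mimeType`	`string`	MIME type (for example `"image/png"`)

These image blocks appear in the result of tool.execution_complete events. For the full event lifecycle, see the Streaming session events guide.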
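
As an illustration of consuming such blocks, the Python sketch below decodes every "image" block in a list of content blocks. It assumes the blocks have already been extracted from the event as a list of dictionaries with the three fields in the table; the path from the event object to that list is not covered here, and `extract_images` is a hypothetical name:

```python
import base64
from typing import List, Tuple


def extract_images(content_blocks: List[dict]) -> List[Tuple[str, bytes]]:
    """Return (mimeType, raw bytes) for each image block.

    Assumption: each block is a dict with "type", and image blocks
    also carry "data" (base64) and "mimeType".
    """
    images = []
    for block in content_blocks:
        if block.get("type") == "image":
            images.append((block["mimeType"], base64.b64decode(block["data"])))
    return images
```

Blocks of any other type are ignored, so the same list can be passed in unfiltered.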

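Tips and limitations

Tip	Details
Use PNG or JPEG directly	Avoids conversion overhead; these are sent to the LLM as-is
Keep images reasonably sized	Large images may be reduced in quality, which can lose important detail
Use absolute paths for file attachments	The runtime reads files from disk; relative paths may not resolve correctly
Use blob attachments for in-memory data	If you already have base64 data (for example screenshots or API responses), a blob avoids unnecessary disk I/O
Check vision support first	Sending images to a non-vision model wastes tokens and yields no visual understanding
Multiple images are supported	Attach several attachments in one message, up to the model's `max_prompt_images` limit
SVG is not supported	SVG files are text-based and are excluded from image processing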
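
Two of these limits, the SVG exclusion and the per-prompt image cap, are easy to apply before building the attachment list. A minimal Python sketch, with a hypothetical helper name and the `max_prompt_images` value assumed to come from the model's capabilities:

```python
from pathlib import Path
from typing import List


def select_image_paths(paths: List[str], max_prompt_images: int) -> List[str]:
    """Drop SVG files and keep at most max_prompt_images paths, in order."""
    # Compare the extension case-insensitively so ".SVG" is also excluded.
    non_svg = [p for p in paths if Path(p).suffix.lower() != ".svg"]
    return non_svg[:max_prompt_images]
```

Truncating silently is only one possible policy; an application might prefer to raise an error or split the images across several messages.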

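See also

Streaming session events: the event lifecycle, including tool result content blocks
Steering and queueing: sending follow-up messages with attachments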
