Insert vectors

Vectorize indexes allow you to insert vectors at any time. Vectorize optimizes the index behind the scenes so that vector search remains efficient, even as new vectors are added or existing vectors are updated.

Supported vector formats

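Vectorize supports vectors in three formats:

  • An array of floating point numbers (a JavaScript number[] array).
  • A Float32Array.
  • A Float64Array.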

In most cases, a number[] array is the easiest when dealing with other APIs, and is the return type of most machine-learning APIs.
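As a minimal sketch of moving between formats, the snippet below assumes the accepted formats are number[], Float32Array and Float64Array. The VectorValues type and the toNumberArray helper are illustrative names, not part of the Vectorize API.

```typescript
// Assumed union of the accepted value formats (illustrative type name).
type VectorValues = number[] | Float32Array | Float64Array;

// Normalize any of the accepted formats to a plain number[].
function toNumberArray(values: VectorValues): number[] {
  return Array.from(values);
}

// Embeddings returned by most machine-learning APIs arrive as number[].
const fromApi: number[] = [0.5, 0.25, 1];

// The same values as typed arrays, which are more compact in memory.
const f32 = new Float32Array(fromApi);
const f64 = Float64Array.from(fromApi);

console.log(toNumberArray(f32), toNumberArray(f64));
```

Float32Array stores single-precision values, so converting from number[] can round values that are not exactly representable in 32 bits.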

Metadata

Metadata is an optional set of key-value pairs that can be attached to a vector on insert or upsert, and allows you to embed or co-locate data about the vector itself.

Metadata keys cannot be empty, contain the dot character (.), contain the double-quote character ("), or start with the dollar character ($).
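The key rules above can be checked on the client before sending vectors. The following is a sketch only: metadataKeyProblems and invalidMetadataKeys are hypothetical helper names, and Vectorize performs its own validation regardless.

```typescript
// Returns the rules a metadata key violates; an empty list means the key is valid.
function metadataKeyProblems(key: string): string[] {
  const problems: string[] = [];
  if (key.length === 0) problems.push("key is empty");
  if (key.includes(".")) problems.push("key contains a dot (.)");
  if (key.includes('"')) problems.push('key contains a double quote (")');
  if (key.startsWith("$")) problems.push("key starts with a dollar sign ($)");
  return problems;
}

// Lists every key in a metadata object that breaks at least one rule.
function invalidMetadataKeys(metadata: Record<string, unknown>): string[] {
  return Object.keys(metadata).filter((key) => metadataKeyProblems(key).length > 0);
}

// "url" is fine; "image.format" and "$type" are rejected.
console.log(invalidMetadataKeys({ url: "/a.png", "image.format": "png", $type: "img" }));
```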

Metadata can be used to store:

  • The object storage key, database UUID or other identifier needed to look up the content that the vector embedding represents.
  • The raw content itself (up to the metadata limits), which lets you skip additional lookups for smaller content.
  • Dates, timestamps, or other details that describe when or how the vector embedding was generated.

For example, a vector embedding representing an image could include the path to the R2 object it was generated from, the format, and a category lookup:
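Such a vector might look like the following sketch. The ID, the values, the bucket path and the metadata field names (path, format, category) are all hypothetical and chosen only for illustration.

```typescript
// Shape of a vector carrying image metadata (illustrative type, not the official one).
interface ImageVectorSketch {
  id: string;
  values: number[];
  metadata: { path: string; format: string; category: string };
}

const imageVector: ImageVectorSketch = {
  id: "image-1", // hypothetical ID
  values: [0.12, 0.45, 0.67], // shortened; real embeddings have many more dimensions
  metadata: {
    path: "r2://bucket-name/path/to/image.png", // hypothetical R2 object location
    format: "png",
    category: "profile_image",
  },
};

console.log(imageVector.metadata.path);
```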

Namespaces

Namespaces provide a way to segment the vectors within your index, for example by customer, merchant, or store ID.

To associate vectors with a namespace, you can optionally provide a namespace: string value when performing an insert or upsert operation. When querying, you can pass the namespace to search within as an optional parameter to your query.

A namespace can be up to 63 characters (bytes) in length and you can have up to 1000 namespaces per index. Refer to the Limits documentation for more details.

When a namespace is provided, only vectors within that namespace are used for the search. Namespace filtering is applied before vector search, not after.
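The effect of filtering before search can be illustrated with a small in-memory model. This is not how Vectorize is implemented: it is a brute-force sketch that assumes cosine similarity as the metric, and every name in it is hypothetical.

```typescript
interface StoredVectorSketch {
  id: string;
  values: number[];
  namespace?: string;
}

// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Filter by namespace first, then rank only the remaining vectors.
function searchNamespace(all: StoredVectorSketch[], query: number[], namespace: string, topK: number): string[] {
  return all
    .filter((v) => v.namespace === namespace)
    .map((v) => ({ id: v.id, score: cosineSimilarity(v.values, query) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK)
    .map((match) => match.id);
}

const stored: StoredVectorSketch[] = [
  { id: "a", values: [1, 0], namespace: "store-1" },
  { id: "b", values: [1, 0], namespace: "store-2" },
  { id: "c", values: [0, 1], namespace: "store-1" },
];

// "b" matches the query exactly but is never considered: it is in another namespace.
console.log(searchNamespace(stored, [1, 0], "store-1", 2)); // ["a", "c"]
```

Because the filter runs first, topK results always come from the requested namespace, rather than being the survivors of a wider search.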

To insert vectors with a namespace:
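A minimal sketch of such an insert follows. The interfaces are simplified stand-ins for the index binding available in a Worker, and withNamespace and insertIntoNamespace are hypothetical helpers; the only behavior relied on is that each vector may carry an optional namespace string.

```typescript
interface NamespacedVectorSketch {
  id: string;
  values: number[];
  namespace?: string;
}

// Simplified stand-in for an index binding that exposes insert().
interface InsertableIndexSketch {
  insert(vectors: NamespacedVectorSketch[]): Promise<unknown>;
}

// Returns copies of the vectors tagged with the given namespace.
function withNamespace(vectors: NamespacedVectorSketch[], namespace: string): NamespacedVectorSketch[] {
  return vectors.map((v) => ({ ...v, namespace }));
}

// Inserts every vector into a single namespace, for example one per store ID.
async function insertIntoNamespace(
  index: InsertableIndexSketch,
  vectors: NamespacedVectorSketch[],
  namespace: string,
): Promise<unknown> {
  return index.insert(withNamespace(vectors, namespace));
}
```

Inside a Worker, this could be called as insertIntoNamespace(env.YOUR_INDEX, vectors, "store-42"), where YOUR_INDEX is whatever binding name you configured.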

To query vectors within a namespace:
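A sketch of such a query is shown below. It assumes the index's query() method accepts the query vector plus an options object with topK and namespace fields. The interfaces are simplified stand-ins and queryNamespace is a hypothetical helper.

```typescript
interface NamespaceQueryOptionsSketch {
  topK?: number;
  namespace?: string;
}

// Simplified stand-in for an index binding that exposes query().
interface QueryableIndexSketch {
  query(vector: number[], options?: NamespaceQueryOptionsSketch): Promise<unknown>;
}

// Searches only the vectors stored under the given namespace.
async function queryNamespace(
  index: QueryableIndexSketch,
  queryVector: number[],
  namespace: string,
  topK: number = 3,
): Promise<unknown> {
  return index.query(queryVector, { topK, namespace });
}
```

Omitting the namespace option leaves the search unrestricted by namespace.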

Examples

Workers API

Use the insert() and upsert() methods available on an index from within a Cloudflare Worker to insert vectors into the current index.
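The sketch below shows the general shape of such a call. The binding name VECTORIZE_INDEX, the simplified interfaces and the storeVectors helper are assumptions for illustration, and upsert() is assumed to have the usual insert-or-update meaning. In a real Worker, the index type comes from the Workers runtime types and the call would sit inside your fetch handler.

```typescript
interface WorkerVectorSketch {
  id: string;
  values: number[];
  metadata?: Record<string, string | number | boolean>;
}

// Simplified stand-in for the index binding.
interface WorkerIndexSketch {
  insert(vectors: WorkerVectorSketch[]): Promise<unknown>;
  upsert(vectors: WorkerVectorSketch[]): Promise<unknown>;
}

interface WorkerEnvSketch {
  VECTORIZE_INDEX: WorkerIndexSketch; // hypothetical binding name
}

// Shortened sample vectors; real embeddings have many more dimensions.
const sampleVectors: WorkerVectorSketch[] = [
  { id: "1", values: [32.4, 74.1, 3.2], metadata: { url: "/products/sku/1" } },
  { id: "2", values: [15.1, 19.2, 15.8], metadata: { url: "/products/sku/2" } },
];

// Calls upsert() when existing IDs should be updated, insert() otherwise.
async function storeVectors(env: WorkerEnvSketch, updateExisting: boolean): Promise<unknown> {
  return updateExisting
    ? env.VECTORIZE_INDEX.upsert(sampleVectors)
    : env.VECTORIZE_INDEX.insert(sampleVectors);
}
```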

Refer to Vectorize API for additional examples.

wrangler CLI

You can bulk upload vector embeddings directly from a file:

  • The file must be in newline-delimited JSON (NDJSON) format: each complete vector must be on its own line, and must not be wrapped in an enclosing array or object.
  • Vectors must be complete and include a unique string id per vector.

An example NDJSON formatted file:
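The sketch below generates such a file's contents in TypeScript. The toNdjson helper and the sample IDs, values and metadata are hypothetical. The helper writes one JSON object per line and rejects duplicate IDs before anything is uploaded.

```typescript
interface NdjsonVectorSketch {
  id: string;
  values: number[];
  metadata?: Record<string, string | number | boolean>;
}

// Serializes vectors as NDJSON: one complete JSON object per line, no enclosing array.
function toNdjson(vectors: NdjsonVectorSketch[]): string {
  const seenIds = new Set<string>();
  for (const v of vectors) {
    if (seenIds.has(v.id)) {
      throw new Error(`duplicate vector id: ${v.id}`);
    }
    seenIds.add(v.id);
  }
  return vectors.map((v) => JSON.stringify(v)).join("\n");
}

const fileContents = toNdjson([
  { id: "4444", values: [175.1, 167.1, 129.9], metadata: { url: "/products/sku/1" } },
  { id: "5555", values: [158.8, 116.7, 311.4], metadata: { url: "/products/sku/2" } },
]);

console.log(fileContents);
// {"id":"4444","values":[175.1,167.1,129.9],"metadata":{"url":"/products/sku/1"}}
// {"id":"5555","values":[158.8,116.7,311.4],"metadata":{"url":"/products/sku/2"}}
```

Once saved, for example as embeddings.ndjson, the file can be passed to wrangler's vectorize insert command. Check wrangler vectorize insert --help for the exact arguments in your wrangler version.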

HTTP API

Vectorize also supports inserting vectors via the REST API, which allows you to operate on a Vectorize index from existing machine-learning tooling and languages (including Python).
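As a rough sketch of what such a request involves, the helper below assembles the pieces of an insert call without sending it. The endpoint path and the application/x-ndjson content type are assumptions; confirm both against the Cloudflare API reference for your index version. The helper name and parameters are hypothetical.

```typescript
interface InsertRequestSketch {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Describes an HTTP insert request carrying an NDJSON body.
// ASSUMPTION: the endpoint path and content type below may differ for your index version.
function buildInsertRequest(
  accountId: string,
  indexName: string,
  apiToken: string,
  ndjsonBody: string,
): InsertRequestSketch {
  const url =
    "https://api.cloudflare.com/client/v4/accounts/" +
    encodeURIComponent(accountId) +
    "/vectorize/v2/indexes/" +
    encodeURIComponent(indexName) +
    "/insert";
  return {
    url,
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiToken}`,
      "Content-Type": "application/x-ndjson",
    },
    body: ndjsonBody,
  };
}

const insertRequest = buildInsertRequest(
  "your-account-id",
  "your-index-name",
  "your-api-token",
  '{"id":"1","values":[0.1,0.2]}\n{"id":"2","values":[0.3,0.4]}',
);

// To send it:
// await fetch(insertRequest.url, { method: insertRequest.method, headers: insertRequest.headers, body: insertRequest.body });
console.log(insertRequest.url);
```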

For example, to insert embeddings in NDJSON format directly from a Python script:

This code would insert the vectors defined in embeddings.ndjson into the provided index. Python libraries, including Pandas, also support the NDJSON format: the read_json function reads it when called with lines=True, for example pd.read_json('embeddings.ndjson', lines=True).