Add OpenSearch Serverless (AOSS) support by norrishuang · Pull Request #802 · zilliztech/VectorDBBench

norrishuang · 2026-06-18T12:36:17Z

Summary

Add support for Amazon OpenSearch Serverless (AOSS) to the awsopensearch client.

Changes

config.py: Add is_serverless and aws_region fields; use AWS SigV4 authentication via requests-aws4auth for AOSS connections
cli.py: Add --serverless flag and --aws-region option
aws_opensearch.py: Adapt client for AOSS constraints:
- Skip unsupported operations (cluster settings, force merge, manual refresh, replica updates, warmup API)
- Use smaller batch size (100) for bulk inserts
- Store id as a document field (AOSS doesn't support custom _id)
- Retrieve id from _source in search results
- Remove engine and encoder from method config (AOSS manages internally)
- Disable http_compress to avoid SigV4 checksum verification failures
README.md: Add OpenSearch Serverless section with usage example and notes
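
The connection changes in config.py can be pictured roughly as follows. This is a minimal sketch rather than the PR's code: `build_client_kwargs` and `make_sigv4_auth` are hypothetical helper names, and the sketch assumes the keyword arguments of opensearch-py's `OpenSearch` constructor, the `AWS4Auth(access_id, secret_key, region, service, session_token=...)` signature from requests-aws4auth, and `aoss` as the SigV4 service name.

```python
from typing import Any, Optional


def build_client_kwargs(
    host: str,
    port: int,
    is_serverless: bool,
    http_auth: Optional[Any] = None,
) -> dict:
    """Assemble keyword arguments for opensearchpy.OpenSearch (hypothetical helper)."""
    kwargs: dict = {
        "hosts": [{"host": host, "port": port}],
        "use_ssl": True,
        "verify_certs": True,
    }
    if http_auth is not None:
        kwargs["http_auth"] = http_auth
    if is_serverless:
        # The PR disables compression for AOSS to avoid SigV4 checksum failures.
        kwargs["http_compress"] = False
    return kwargs


def make_sigv4_auth(aws_region: str) -> Any:
    """Build a SigV4 signer from the default AWS credential chain."""
    # Imported lazily so build_client_kwargs works without the AWS packages.
    import boto3
    from requests_aws4auth import AWS4Auth

    creds = boto3.Session().get_credentials()
    return AWS4Auth(
        creds.access_key,
        creds.secret_key,
        aws_region,
        "aoss",  # SigV4 service name for OpenSearch Serverless
        session_token=creds.token,
    )
```

With both packages installed, the client would then be created along the lines of `OpenSearch(**build_client_kwargs(host, 443, True, make_sigv4_auth("us-east-1")), connection_class=RequestsHttpConnection)`; the requests-based connection class is what allows opensearch-py to accept a requests-style auth object.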

Usage

vectordbbench awsopensearch --db-label aoss \
  --serverless --aws-region us-east-1 \
  --host <collection-id>.aoss.us-east-1.on.aws --port 443 \
  --case-type Performance768D1M \
  --m 16 --ef-construction 200 --ef-search 40 \
  --number-of-shards 8 --number-of-replicas 0 \
  --engine faiss --metric-type cosine \
  --num-concurrency 80,100,120

Prerequisites

AWS credentials configured
requests-aws4auth installed
IAM identity policy with aoss:APIAccessAll
AOSS Data Access Policy granting index/collection permissions
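
For illustration, the two policies could look like the documents built below. This is a sketch under assumptions: the PR does not list the exact data access permissions, so the set chosen here is a guess at what a load-and-search benchmark needs, and the function names, collection name, and ARNs are placeholders.

```python
import json


def identity_policy(collection_arn: str) -> dict:
    """IAM identity policy allowing data-plane API calls on one collection."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "aoss:APIAccessAll",
                "Resource": collection_arn,
            }
        ],
    }


def data_access_policy(collection_name: str, principal_arn: str) -> list:
    """AOSS data access policy; the permission list is an assumption."""
    return [
        {
            "Rules": [
                {
                    "ResourceType": "index",
                    "Resource": [f"index/{collection_name}/*"],
                    "Permission": [
                        "aoss:CreateIndex",
                        "aoss:DeleteIndex",
                        "aoss:UpdateIndex",
                        "aoss:DescribeIndex",
                        "aoss:ReadDocument",
                        "aoss:WriteDocument",
                    ],
                },
                {
                    "ResourceType": "collection",
                    "Resource": [f"collection/{collection_name}"],
                    "Permission": ["aoss:DescribeCollectionItems"],
                },
            ],
            "Principal": [principal_arn],
        }
    ]


if __name__ == "__main__":
    # Placeholder identifiers; substitute real ARNs and names.
    print(json.dumps(identity_policy("arn:aws:aoss:us-east-1:111122223333:collection/EXAMPLE"), indent=2))
    print(json.dumps(data_access_policy("my-collection", "arn:aws:iam::111122223333:role/bench"), indent=2))
```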

Testing

Tested against a live AOSS collection with 1M 768-dim vectors (Cohere dataset). Data loading and index creation verified successfully.

- Add --serverless and --aws-region CLI options
- Use AWS SigV4 authentication via requests-aws4auth for AOSS
- Skip unsupported operations for serverless: cluster settings, force merge, manual refresh, replica updates, warmup API
- Use smaller batch size (100) for serverless bulk inserts
- Store id as document field (serverless doesn't support custom _id)
- Retrieve id from _source in search results for serverless
- Remove 'engine' and 'encoder' from index method config for serverless (AOSS manages these internally)
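
The id handling and batching described in this commit message can be sketched as below. The helper names and the `embedding` field name are hypothetical, not taken from the PR; the body layout follows the standard bulk format of alternating action and document entries.

```python
from typing import Iterator, Sequence

SERVERLESS_BATCH_SIZE = 100  # batch size the PR uses for AOSS bulk inserts


def batched(items: Sequence, size: int) -> Iterator[Sequence]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def bulk_body(index: str, ids, vectors, serverless: bool,
              vector_field: str = "embedding") -> list:
    """Build the alternating action/document list for one bulk request."""
    body = []
    for doc_id, vec in zip(ids, vectors):
        if serverless:
            # No custom _id on AOSS: keep the id as an ordinary field.
            body.append({"index": {"_index": index}})
            body.append({"id": doc_id, vector_field: vec})
        else:
            body.append({"index": {"_index": index, "_id": doc_id}})
            body.append({vector_field: vec})
    return body


def hit_ids(response: dict, serverless: bool) -> list:
    """Extract benchmark ids from a search response."""
    hits = response["hits"]["hits"]
    if serverless:
        return [int(h["_source"]["id"]) for h in hits]
    return [int(h["_id"]) for h in hits]
```

Reading the id back from `_source` only works if the search request returns that field, so a serverless query in this sketch would need `_source` to include `id`.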

XuanYang-cn

I left a few inline comments for the AOSS paths that look like they still need fixes before merge.

- Route serverless through single-client insert path (AOSS doesn't support custom _id; the multi-client path would send _id and fail)
- prepare_filter now filters NumGE on the stored 'id' field for serverless, and mappings store 'id' as a numeric (long) field so range queries work
- Add boto3 and requests-aws4auth to the opensearch extra in pyproject.toml and to install/requirements_py3.11.txt
- Update README serverless prerequisites to reference the opensearch extra and mention boto3
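
A rough picture of the filter and mapping change, assuming a NumGE filter translates to an OpenSearch `range` query with `gte`. The function names and the `embedding` field name are hypothetical, and the mapping shows only the parts relevant to the id field.

```python
def id_range_filter(min_id: int, serverless: bool) -> dict:
    """Range clause for 'id >= min_id' on the field that actually holds the id."""
    field = "id" if serverless else "_id"
    return {"range": {field: {"gte": min_id}}}


def serverless_properties(dim: int, vector_field: str = "embedding") -> dict:
    """Mapping properties with the id stored as a numeric long field."""
    return {
        "id": {"type": "long"},  # numeric type so range queries compare as numbers
        vector_field: {"type": "knn_vector", "dimension": dim},
    }
```

Mapping `id` as `long` matters because a range query over a text or keyword field would compare lexicographically, so `"10"` would sort before `"9"`.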

norrishuang · 2026-07-02T14:56:04Z

Thanks for the review, @XuanYang-cn! I've addressed all three comments in 57f7448:

Multi-client insert path: Serverless now always routes through the single-client insert path (insert_embeddings() short-circuits when serverless), so AOSS loads no longer hit the multi-client path that sends a custom _id. This keeps the no-custom-_id / _source.id / small-batch behavior for all serverless loads regardless of number_of_indexing_clients.
Filters: prepare_filter() now builds the NumGE range on the stored id field for serverless (instead of _id), and the mapping stores id as a numeric long field so range queries work. Filtered cases like NewIntFilterPerformanceCase will now filter on the inserted IDs correctly.
Dependencies: Added boto3 and requests-aws4auth to the opensearch extra in pyproject.toml (and to install/requirements_py3.11.txt), and updated the README serverless prerequisites to reference pip install 'vectordb-bench[opensearch]' and mention boto3. A clean vectordb-bench[opensearch] install now includes everything needed for serverless.
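
The routing described in the first item amounts to a guard like the one below. This is only a sketch: `LoaderSketch` and its private methods are hypothetical stand-ins that return the name of the chosen path, not the client's real insert logic.

```python
class LoaderSketch:
    """Hypothetical stand-in showing how insert_embeddings() picks a path."""

    def __init__(self, is_serverless: bool, number_of_indexing_clients: int = 1):
        self.is_serverless = is_serverless
        self.number_of_indexing_clients = number_of_indexing_clients

    def insert_embeddings(self, embeddings: list, metadata: list) -> str:
        # Serverless short-circuits to the single-client path no matter how
        # many indexing clients are configured, so no custom _id is ever sent.
        if self.is_serverless or self.number_of_indexing_clients <= 1:
            return self._insert_single_client(embeddings, metadata)
        return self._insert_multi_client(embeddings, metadata)

    def _insert_single_client(self, embeddings: list, metadata: list) -> str:
        return "single-client"

    def _insert_multi_client(self, embeddings: list, metadata: list) -> str:
        return "multi-client"
```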

PTAL when you get a chance.

/assign @XuanYang-cn

XuanYang-cn

/lgtm

sre-ci-robot · 2026-07-03T07:03:33Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: norrishuang, XuanYang-cn

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details

Needs approval from an approver in each of these files:

~~OWNERS~~ [XuanYang-cn]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

norrishuang added 4 commits June 17, 2026 15:55

Add OpenSearch Serverless section to README

7c7ff32

Disable http_compress for serverless to fix SigV4 checksum verification

5df3e26

Format code with black

6c54466

XuanYang-cn reviewed Jul 2, 2026

Comment thread vectordb_bench/backend/clients/aws_opensearch/aws_opensearch.py

Comment thread vectordb_bench/backend/clients/aws_opensearch/aws_opensearch.py

Comment thread vectordb_bench/backend/clients/aws_opensearch/config.py

XuanYang-cn approved these changes Jul 3, 2026

XuanYang-cn merged commit 224e8b9 into zilliztech:main Jul 3, 2026
4 checks passed

XuanYang-cn merged 5 commits into zilliztech:main from norrishuang:aoss-serverless

3 participants
