feat: add commit message and system table utilities #119

Open
lucasfang wants to merge 1 commit into apache:main from lucasfang:migrate_5

@lucasfang

Contributor

Purpose

Linked issue: No linked issue

This change adds commit message handling and system table utilities.

Included changes:

  • Commit Message Module (include/paimon/, src/paimon/core/table/sink/):

    • Adds public API header commit_message.h for commit message definitions.
    • Adds commit_message.cpp and commit_message_impl.h/.cpp for core implementation.
    • Adds commit_message_serializer.h/.cpp for serialization/deserialization logic.
    • Adds test coverage in commit_message_test.cpp and commit_message_impl_test.cpp.
  • System Table Module (src/paimon/core/table/system/):

    • Adds base system table abstraction with system_table.h/.cpp.
    • Adds schema management via system_table_schema.h/.cpp.
    • Adds scan capabilities with system_table_scan.h/.cpp.
    • Adds options system table implementation in options_system_table.h/.cpp.
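To illustrate the kind of path handling a system table loader needs, here is a minimal sketch that splits a path of the form `<table path>$<system table name>`. The `SystemTablePath` struct and `SplitSystemTablePath` function are hypothetical names for illustration only, not the API added in system_table.h, and the choice to split on the last `$` is an assumption.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical result type: the base table path plus the system table name.
struct SystemTablePath {
    std::string table_path;
    std::string system_table_name;
};

// Hypothetical helper (not the PR's API): splits on the last '$'.
// Returns std::nullopt when there is no '$' or when either side is empty.
std::optional<SystemTablePath> SplitSystemTablePath(const std::string& path) {
    const std::string::size_type pos = path.rfind('$');
    if (pos == std::string::npos || pos == 0 || pos + 1 == path.size()) {
        return std::nullopt;
    }
    return SystemTablePath{path.substr(0, pos), path.substr(pos + 1)};
}

// Usage example with an illustrative path.
void SplitSystemTablePathExample() {
    const auto parsed = SplitSystemTablePath("/warehouse/db/orders$options");
    assert(parsed.has_value());
    assert(parsed->table_path == "/warehouse/db/orders");
    assert(parsed->system_table_name == "options");
}
```

A real loader would presumably also check the parsed name against the set of supported system tables before constructing one.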

Tests

Not run. Local compile, CMake, and gtest environment checks are not part of this PR description.

Test coverage included in this change:

  • commit_message_test.cpp
  • commit_message_impl_test.cpp

API and Format

This change adds public API in include/paimon/commit_message.h.

No storage format or protocol changes.

Documentation

No documentation changes required.

Generative AI tooling

Migrate-by: Aone Copilot (Qwen3.7-Max)

Copilot AI review requested due to automatic review settings June 25, 2026 06:18

Copilot AI left a comment


Pull request overview

This PR introduces two new core capabilities in the Paimon C++ codebase: (1) a public CommitMessage API with serialization/deserialization compatible with multiple historical Java commit-message versions, and (2) foundational “system table” infrastructure plus an initial options system table implementation.

Changes:

  • Added CommitMessage public API (include/paimon/commit_message.h) and core implementation/serializer with extensive backward-compat tests.
  • Added core system-table abstractions (SystemTable, loader, schema wrapper, scan split/plan) to enable reading special $... tables.
  • Implemented the options system table to expose table options as a readable system table.
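As a rough sketch of what "exposing table options as a readable system table" can mean, the snippet below flattens an options map into key/value rows. `OptionRow` and `OptionsToRows` are hypothetical names, and using a plain `std::map` with a `std::vector` of rows is an assumption made for illustration; the PR's read path returns rows through its own `TableRead` and schema types.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical row type: one table option per row.
struct OptionRow {
    std::string key;
    std::string value;
};

// Hypothetical helper (not the PR's API): one row per option.
// std::map iterates in key order, so the rows come out sorted by key.
std::vector<OptionRow> OptionsToRows(const std::map<std::string, std::string>& options) {
    std::vector<OptionRow> rows;
    rows.reserve(options.size());
    for (const auto& [key, value] : options) {
        rows.push_back(OptionRow{key, value});
    }
    assert(rows.size() == options.size());
    return rows;
}

// Usage example with illustrative option names.
void OptionsToRowsExample() {
    const std::map<std::string, std::string> options = {{"file.format", "orc"}, {"bucket", "4"}};
    const std::vector<OptionRow> rows = OptionsToRows(options);
    assert(rows.size() == 2);
    assert(rows[0].key == "bucket" && rows[0].value == "4");
    assert(rows[1].key == "file.format" && rows[1].value == "orc");
}
```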

Reviewed changes

Copilot reviewed 16 out of 16 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| src/paimon/core/table/system/system_table.h | Adds SystemTable interface and loader APIs. |
| src/paimon/core/table/system/system_table.cpp | Implements system table support checks, path parsing, and loading from a system-table path. |
| src/paimon/core/table/system/system_table_schema.h | Adds SystemTableSchema wrapper implementing SystemSchema. |
| src/paimon/core/table/system/system_table_schema.cpp | Implements Arrow schema export and field introspection for system tables. |
| src/paimon/core/table/system/system_table_scan.h | Adds system-table scan + split types. |
| src/paimon/core/table/system/system_table_scan.cpp | Implements a trivial plan (single split) for system tables. |
| src/paimon/core/table/system/options_system_table.h | Declares OptionsSystemTable implementation. |
| src/paimon/core/table/system/options_system_table.cpp | Implements options system table schema + read path returning key/value rows. |
| include/paimon/commit_message.h | Adds public CommitMessage API (serialize/deserialize/debug string). |
| src/paimon/core/table/sink/commit_message.cpp | Implements CommitMessage API functions using serializer + memory segments. |
| src/paimon/core/table/sink/commit_message_serializer.h | Declares versioned commit-message serializer. |
| src/paimon/core/table/sink/commit_message_serializer.cpp | Implements serialization/deserialization across versions (v3–v11). |
| src/paimon/core/table/sink/commit_message_impl.h | Adds internal CommitMessageImpl representation. |
| src/paimon/core/table/sink/commit_message_impl.cpp | Implements CommitMessageImpl stringification and equality helpers. |
| src/paimon/core/table/sink/commit_message_test.cpp | Adds extensive compatibility + roundtrip tests. |
| src/paimon/core/table/sink/commit_message_impl_test.cpp | Adds a ToString() regression/format test. |

Comment on lines +151 to +154
Result<std::unique_ptr<TableRead>> OptionsSystemTable::NewRead(
    const std::shared_ptr<ReadContext>& context) const {
    return std::make_unique<OptionsTableRead>(table_schema_->Options(), context->GetMemoryPool());
}
Comment on lines +246 to +248
} else if (version <= 2) {
    return Status::NotImplemented("deserialize 08 not implemented");
} else {
Comment on lines +43 to +47
Result<std::string> CommitMessage::Serialize(const std::shared_ptr<CommitMessage>& commit_message,
                                             const std::shared_ptr<MemoryPool>& pool) {
    CommitMessageSerializer serializer(pool);
    MemorySegmentOutputStream out(MemorySegmentOutputStream::DEFAULT_SEGMENT_SIZE, pool);
    PAIMON_RETURN_NOT_OK(serializer.Serialize(commit_message, &out));
Comment on lines +53 to +58
Result<std::string> CommitMessage::SerializeList(
    const std::vector<std::shared_ptr<CommitMessage>>& commit_messages,
    const std::shared_ptr<MemoryPool>& pool) {
    CommitMessageSerializer serializer(pool);
    MemorySegmentOutputStream out(MemorySegmentOutputStream::DEFAULT_SEGMENT_SIZE, pool);
    PAIMON_RETURN_NOT_OK(serializer.SerializeList(commit_messages, &out));
Comment on lines +64 to +76
Result<std::shared_ptr<CommitMessage>> CommitMessage::Deserialize(
    int32_t version, const char* buffer, int32_t length, const std::shared_ptr<MemoryPool>& pool) {
    if (buffer == nullptr) {
        return Status::Invalid("buffer is null pointer");
    }
    if (length <= 0) {
        return Status::Invalid("length is equal or less than zero");
    }
    CommitMessageSerializer serializer(pool);
    auto input_stream = std::make_shared<ByteArrayInputStream>(buffer, length);
    DataInputStream in(input_stream);
    return serializer.Deserialize(version, &in);
}
Comment on lines +78 to +90
Result<std::vector<std::shared_ptr<CommitMessage>>> CommitMessage::DeserializeList(
    int32_t version, const char* buffer, int32_t length, const std::shared_ptr<MemoryPool>& pool) {
    if (buffer == nullptr) {
        return Status::Invalid("buffer is null pointer");
    }
    if (length <= 0) {
        return Status::Invalid("length is equal or less than zero");
    }
    CommitMessageSerializer serializer(pool);
    auto input_stream = std::make_shared<ByteArrayInputStream>(buffer, length);
    DataInputStream in(input_stream);
    return serializer.DeserializeList(version, &in);
}
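Deserialize and DeserializeList above repeat the same buffer and length checks before handing off to the versioned serializer. The sketch below shows one way such checks could be factored into a single helper, together with a version range gate. Everything here is hypothetical: the `ArgCheck` enum, the `CheckDeserializeArgs` function, and the constant names are invented for illustration. The supported range of v3 to v11 is taken from the review summary, and reporting an out-of-range version as "not implemented" is an assumption, not the PR's confirmed behavior.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical outcome type standing in for the project's Status codes.
enum class ArgCheck { kOk, kInvalid, kNotImplemented };

// Assumed supported range, per the review summary (v3 to v11).
constexpr int32_t kMinSupportedVersion = 3;
constexpr int32_t kMaxSupportedVersion = 11;

// Hypothetical helper (not the PR's API): validates the input buffer first,
// then checks that the requested version falls in the supported range.
ArgCheck CheckDeserializeArgs(int32_t version, const char* buffer, int32_t length) {
    if (buffer == nullptr || length <= 0) {
        return ArgCheck::kInvalid;
    }
    if (version < kMinSupportedVersion || version > kMaxSupportedVersion) {
        return ArgCheck::kNotImplemented;
    }
    return ArgCheck::kOk;
}

// Usage example with a dummy one-byte buffer.
void CheckDeserializeArgsExample() {
    const char data[] = {0x01};
    assert(CheckDeserializeArgs(3, data, 1) == ArgCheck::kOk);
    assert(CheckDeserializeArgs(2, data, 1) == ArgCheck::kNotImplemented);
    assert(CheckDeserializeArgs(3, nullptr, 1) == ArgCheck::kInvalid);
}
```

Sharing one helper between both entry points would keep the two error paths from drifting apart.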
@lucasfang

Contributor Author

Thank you @suxiaogang223 for the contributions to the manifests and files system tables, migrated as part of this batch.
