feat: add commit message and system table utilities #119
Open
lucasfang wants to merge 1 commit into
Pull request overview
This PR introduces two new core capabilities in the Paimon C++ codebase: (1) a public CommitMessage API with serialization/deserialization compatible with multiple historical Java commit-message versions, and (2) foundational “system table” infrastructure plus an initial options system table implementation.
Changes:
- Added CommitMessage public API (include/paimon/commit_message.h) and core implementation/serializer with extensive backward-compat tests.
- Added core system-table abstractions (SystemTable, loader, schema wrapper, scan split/plan) to enable reading special $... tables.
- Implemented the options system table to expose table options as a readable system table.
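To make the "special $... tables" naming concrete, here is a minimal sketch of how a system-table name could be split into a base table name and a system table name. It assumes `$` is the separator, as in `orders$options`; the helper name `SplitSystemTableName` is hypothetical and is not taken from this PR.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of the PR): splits "orders$options" into
// base = "orders" and system = "options". Assumes '$' is the separator.
// Returns false when the name has no separator or an empty part on either side.
bool SplitSystemTableName(const std::string& name, std::string* base, std::string* system) {
    const std::string::size_type pos = name.rfind('$');
    if (pos == std::string::npos || pos == 0 || pos + 1 == name.size()) {
        return false;
    }
    *base = name.substr(0, pos);
    *system = name.substr(pos + 1);
    return true;
}

// Small usage demonstration.
void DemoSplitSystemTableName() {
    std::string base;
    std::string system;
    assert(SplitSystemTableName("orders$options", &base, &system));
    assert(base == "orders");
    assert(system == "options");
    assert(!SplitSystemTableName("orders", &base, &system));
}
```

A loader could use a check like this to decide whether a requested table is a regular table or a system table before dispatching to an implementation such as the options table.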
Reviewed changes
Copilot reviewed 16 out of 16 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| src/paimon/core/table/system/system_table.h | Adds SystemTable interface and loader APIs. |
| src/paimon/core/table/system/system_table.cpp | Implements system table support checks, path parsing, and loading from a system-table path. |
| src/paimon/core/table/system/system_table_schema.h | Adds SystemTableSchema wrapper implementing SystemSchema. |
| src/paimon/core/table/system/system_table_schema.cpp | Implements Arrow schema export and field introspection for system tables. |
| src/paimon/core/table/system/system_table_scan.h | Adds system-table scan + split types. |
| src/paimon/core/table/system/system_table_scan.cpp | Implements a trivial plan (single split) for system tables. |
| src/paimon/core/table/system/options_system_table.h | Declares OptionsSystemTable implementation. |
| src/paimon/core/table/system/options_system_table.cpp | Implements options system table schema + read path returning key/value rows. |
| include/paimon/commit_message.h | Adds public CommitMessage API (serialize/deserialize/debug string). |
| src/paimon/core/table/sink/commit_message.cpp | Implements CommitMessage API functions using serializer + memory segments. |
| src/paimon/core/table/sink/commit_message_serializer.h | Declares versioned commit-message serializer. |
| src/paimon/core/table/sink/commit_message_serializer.cpp | Implements serialization/deserialization across versions (v3–v11). |
| src/paimon/core/table/sink/commit_message_impl.h | Adds internal CommitMessageImpl representation. |
| src/paimon/core/table/sink/commit_message_impl.cpp | Implements CommitMessageImpl stringification and equality helpers. |
| src/paimon/core/table/sink/commit_message_test.cpp | Adds extensive compatibility + roundtrip tests. |
| src/paimon/core/table/sink/commit_message_impl_test.cpp | Adds a ToString() regression/format test. |
Comment on lines +151 to +154

    Result<std::unique_ptr<TableRead>> OptionsSystemTable::NewRead(
        const std::shared_ptr<ReadContext>& context) const {
        return std::make_unique<OptionsTableRead>(table_schema_->Options(), context->GetMemoryPool());
    }
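The read path above hands the table options to a reader that returns key/value rows. As a rough illustration of that idea only, the sketch below turns an options map into a list of (key, value) rows. The names `OptionRow` and `OptionsToRows` are hypothetical, and the real implementation presumably produces Arrow batches rather than a `std::vector`.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical row type: one (key, value) pair per table option.
using OptionRow = std::pair<std::string, std::string>;

// Hypothetical helper: flattens an options map into rows.
// std::map iterates in key order, so the rows come out sorted by key.
std::vector<OptionRow> OptionsToRows(const std::map<std::string, std::string>& options) {
    std::vector<OptionRow> rows;
    rows.reserve(options.size());
    for (const auto& entry : options) {
        rows.emplace_back(entry.first, entry.second);
    }
    return rows;
}

// Small usage demonstration with made-up option values.
void DemoOptionsToRows() {
    const std::map<std::string, std::string> options = {{"file.format", "orc"}, {"bucket", "4"}};
    const std::vector<OptionRow> rows = OptionsToRows(options);
    assert(rows.size() == 2);
    assert(rows[0].first == "bucket" && rows[0].second == "4");
    assert(rows[1].first == "file.format" && rows[1].second == "orc");
}
```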
Comment on lines +246 to +248

    } else if (version <= 2) {
        return Status::NotImplemented("deserialize 08 not implemented");
    } else {
Comment on lines +43 to +47

    Result<std::string> CommitMessage::Serialize(const std::shared_ptr<CommitMessage>& commit_message,
                                                 const std::shared_ptr<MemoryPool>& pool) {
        CommitMessageSerializer serializer(pool);
        MemorySegmentOutputStream out(MemorySegmentOutputStream::DEFAULT_SEGMENT_SIZE, pool);
        PAIMON_RETURN_NOT_OK(serializer.Serialize(commit_message, &out));
Comment on lines +53 to +58

    Result<std::string> CommitMessage::SerializeList(
        const std::vector<std::shared_ptr<CommitMessage>>& commit_messages,
        const std::shared_ptr<MemoryPool>& pool) {
        CommitMessageSerializer serializer(pool);
        MemorySegmentOutputStream out(MemorySegmentOutputStream::DEFAULT_SEGMENT_SIZE, pool);
        PAIMON_RETURN_NOT_OK(serializer.SerializeList(commit_messages, &out));
Comment on lines +64 to +76

    Result<std::shared_ptr<CommitMessage>> CommitMessage::Deserialize(
        int32_t version, const char* buffer, int32_t length, const std::shared_ptr<MemoryPool>& pool) {
        if (buffer == nullptr) {
            return Status::Invalid("buffer is null pointer");
        }
        if (length <= 0) {
            return Status::Invalid("length is equal or less than zero");
        }
        CommitMessageSerializer serializer(pool);
        auto input_stream = std::make_shared<ByteArrayInputStream>(buffer, length);
        DataInputStream in(input_stream);
        return serializer.Deserialize(version, &in);
    }
Comment on lines +78 to +90

    Result<std::vector<std::shared_ptr<CommitMessage>>> CommitMessage::DeserializeList(
        int32_t version, const char* buffer, int32_t length, const std::shared_ptr<MemoryPool>& pool) {
        if (buffer == nullptr) {
            return Status::Invalid("buffer is null pointer");
        }
        if (length <= 0) {
            return Status::Invalid("length is equal or less than zero");
        }
        CommitMessageSerializer serializer(pool);
        auto input_stream = std::make_shared<ByteArrayInputStream>(buffer, length);
        DataInputStream in(input_stream);
        return serializer.DeserializeList(version, &in);
    }
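Deserialize and DeserializeList repeat the same null-pointer and length checks. A possible way to share them is a small validation helper, sketched below. `ValidateInputBuffer` is a hypothetical name, and it returns an error string (empty on success) only to keep the example self-contained; in the PR the natural return type would be `Status`.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical shared guard for the two deserialization entry points.
// Returns an empty string when the input is usable, otherwise an error message
// matching the messages used in the quoted code.
std::string ValidateInputBuffer(const char* buffer, int32_t length) {
    if (buffer == nullptr) {
        return "buffer is null pointer";
    }
    if (length <= 0) {
        return "length is equal or less than zero";
    }
    return "";
}

// Small usage demonstration.
void DemoValidateInputBuffer() {
    const char data[] = {0x01, 0x02};
    assert(ValidateInputBuffer(data, 2).empty());
    assert(ValidateInputBuffer(nullptr, 2) == "buffer is null pointer");
    assert(ValidateInputBuffer(data, 0) == "length is equal or less than zero");
}
```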
Contributor
Author
Thank you @suxiaogang223 for the contributions to the manifests and files system tables, migrated as part of this batch.
Purpose
Linked issue: No linked issue
This change adds commit message handling and system table utilities.
Included changes:
- Commit Message Module (include/paimon/, src/paimon/core/table/sink/):
  - commit_message.h for commit message definitions.
  - commit_message.cpp and commit_message_impl.h/.cpp for core implementation.
  - commit_message_serializer.h/.cpp for serialization/deserialization logic.
  - commit_message_test.cpp and commit_message_impl_test.cpp.
- System Table Module (src/paimon/core/table/system/):
  - system_table.h/.cpp.
  - system_table_schema.h/.cpp.
  - system_table_scan.h/.cpp.
  - options_system_table.h/.cpp.
Tests
Not run. Local compile, CMake, and gtest environment checks are not part of this PR description.
Test coverage included in this change:
- commit_message_test.cpp
- commit_message_impl_test.cpp
API and Format
This change adds public API in include/paimon/commit_message.h.
No storage format or protocol changes.
Documentation
No documentation changes required.
Generative AI tooling
Migrate-by: Aone Copilot (Qwen3.7-Max)