feat: Migrate Lucene global index for fts #117

Open
lxy-9602 wants to merge 1 commit into apache:main from lxy-9602:add-lucene-fts

@lxy-9602

Contributor

Purpose

No linked issue.

Migrate Lucene-based global index support under src/paimon/global_index/lucene/:

Lucene adapter utilities:

  • LuceneDefs — defines common Lucene constants and aliases used by the Lucene global index (lucene_defs.h)
  • LuceneUtils — provides helper conversions between Paimon values and Lucene query/index primitives (lucene_utils.h/cpp)
  • LuceneDirectory — adapts Paimon file IO to Lucene directory operations (lucene_directory.h/cpp)
  • LuceneInput — implements Lucene index input over Paimon file reads (lucene_input.h)
  • LuceneFilter — builds Lucene filters from Paimon predicates (lucene_filter.h)
  • LuceneCollector — collects Lucene search hits for global-index lookup (lucene_collector.h)
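As a rough illustration of what the directory/input adapters do, the sketch below layers a cursor-style input (seek plus sequential reads) over a file handle that only offers positional reads. It is not the code in this PR: `PositionalFile`, `InMemoryFile`, and `CursorInput` are hypothetical names invented for the example, and the real classes use the Paimon file IO and Lucene interfaces instead.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for a storage-layer file handle with positional reads.
class PositionalFile {
 public:
  virtual ~PositionalFile() = default;
  virtual std::size_t Size() const = 0;
  // Reads up to `len` bytes starting at `offset`; returns the bytes actually read.
  virtual std::size_t ReadAt(std::size_t offset, char* out, std::size_t len) const = 0;
};

// In-memory implementation, present only to make the sketch runnable.
class InMemoryFile : public PositionalFile {
 public:
  explicit InMemoryFile(std::string data) : data_(std::move(data)) {}
  std::size_t Size() const override { return data_.size(); }
  std::size_t ReadAt(std::size_t offset, char* out, std::size_t len) const override {
    if (offset >= data_.size()) return 0;
    const std::size_t n = std::min(len, data_.size() - offset);
    std::memcpy(out, data_.data() + offset, n);
    return n;
  }

 private:
  std::string data_;
};

// Hypothetical cursor-style input: keeps a position and turns sequential
// reads into positional reads on the underlying file.
class CursorInput {
 public:
  explicit CursorInput(std::shared_ptr<const PositionalFile> file)
      : file_(std::move(file)) {}

  std::size_t Length() const { return file_->Size(); }
  std::size_t Position() const { return pos_; }

  void Seek(std::size_t pos) {
    if (pos > Length()) throw std::out_of_range("seek past end of file");
    pos_ = pos;
  }

  // Reads exactly `len` bytes, or throws if the file ends first.
  void ReadBytes(char* out, std::size_t len) {
    std::size_t done = 0;
    while (done < len) {
      const std::size_t n = file_->ReadAt(pos_ + done, out + done, len - done);
      if (n == 0) throw std::runtime_error("read past end of file");
      done += n;
    }
    pos_ += len;  // the cursor advances only after a complete read
  }

  std::uint8_t ReadByte() {
    char c;
    ReadBytes(&c, 1);
    return static_cast<std::uint8_t>(c);
  }

 private:
  std::shared_ptr<const PositionalFile> file_;
  std::size_t pos_ = 0;
};
```

Keeping the cursor in the adapter rather than in the file handle lets several inputs share one underlying file without interfering with each other's positions.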

Jieba analyzer:

  • JiebaAnalyzer — integrates cppjieba tokenization for Lucene text analysis (jieba_analyzer.h/cpp)
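To show the general shape of an analyzer adapter, here is a minimal sketch that turns a segmenter's word list into tokens carrying byte offsets into the original text. Everything in it is an assumption made for illustration: `Token`, `Segmenter`, `Analyze`, and `SplitOnSpaces` are hypothetical names, and the whitespace splitter merely stands in for cppjieba so the example runs without dictionary files.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical token record: the term plus its byte range [start, end) in the input.
struct Token {
  std::string term;
  std::size_t start;
  std::size_t end;
};

// Hypothetical segmenter signature; a real analyzer would call cppjieba here.
using Segmenter = std::function<std::vector<std::string>(const std::string&)>;

// Stand-in segmenter that splits on whitespace.
std::vector<std::string> SplitOnSpaces(const std::string& text) {
  std::istringstream in(text);
  std::vector<std::string> words;
  std::string word;
  while (in >> word) words.push_back(word);
  return words;
}

// Maps each segmented word back to its position in `text`, scanning forward
// so that repeated words receive distinct, increasing offsets.
std::vector<Token> Analyze(const std::string& text, const Segmenter& segment) {
  std::vector<Token> tokens;
  std::size_t cursor = 0;
  for (const std::string& word : segment(text)) {
    if (word.empty()) continue;
    const std::size_t at = text.find(word, cursor);
    if (at == std::string::npos) continue;  // word does not occur in the input
    cursor = at + word.size();
    tokens.push_back({word, at, cursor});
  }
  return tokens;
}
```

For example, `Analyze("full text search", SplitOnSpaces)` yields three tokens, with `text` covering bytes 5 to 9.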

Lucene global index:

  • LuceneGlobalIndex — exposes Lucene as a Paimon global index implementation (lucene_global_index.h/cpp)
  • LuceneGlobalIndexFactory — creates Lucene global index instances from table options (lucene_global_index_factory.h/cpp)
  • LuceneGlobalIndexReader — reads indexed rows and performs Lucene-backed lookup/search (lucene_global_index_reader.h/cpp)
  • LuceneGlobalIndexWriter — writes row changes into Lucene index files (lucene_global_index_writer.h/cpp)
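Conceptually, the writer records which rows contain which terms and the reader answers a text query with matching row ids. The toy inverted index below sketches that contract under heavy simplification: `TinyTextIndex`, `AddRow`, and `MatchAll` are hypothetical names, tokenization is plain whitespace splitting, and nothing is persisted, whereas the PR delegates all of this to Lucene index files.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>
#include <map>
#include <set>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical in-memory inverted index: term -> sorted set of row ids.
class TinyTextIndex {
 public:
  // Write path: tokenize the row's text and add the row id to each term's postings.
  void AddRow(std::int64_t row_id, const std::string& text) {
    std::istringstream in(text);
    std::string term;
    while (in >> term) postings_[term].insert(row_id);
  }

  // Read path: return the sorted row ids that contain every given term.
  // An empty term list matches nothing.
  std::vector<std::int64_t> MatchAll(const std::vector<std::string>& terms) const {
    std::vector<std::int64_t> result;
    bool first = true;
    for (const std::string& term : terms) {
      const auto it = postings_.find(term);
      if (it == postings_.end()) return {};  // an unknown term rules out all rows
      if (first) {
        result.assign(it->second.begin(), it->second.end());
        first = false;
        continue;
      }
      // Both ranges are sorted, so a linear intersection is sufficient.
      std::vector<std::int64_t> next;
      std::set_intersection(result.begin(), result.end(), it->second.begin(),
                            it->second.end(), std::back_inserter(next));
      result = std::move(next);
    }
    return result;
  }

 private:
  std::map<std::string, std::set<std::int64_t>> postings_;
};
```

Returning row ids rather than row contents is what makes such a structure usable as an index: the caller can fetch only the matching rows from the table afterwards.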

Tests

Migrated Lucene/Jieba tests:

  • lucene_directory_test.cpp — validates Lucene directory behavior over Paimon file IO
  • lucene_filter_test.cpp — validates predicate-to-Lucene filter conversion
  • jieba_analyzer_test.cpp — validates Jieba analyzer tokenization behavior
  • jieba_api_test.cpp — validates direct cppjieba API usage
  • lucene_global_index_test.cpp — validates Lucene global index read/write behavior
  • lucene_api_test.cpp — validates direct Lucene API behavior used by the global index

API and Format

Documentation

Generative AI tooling

Migrate-by: Codex
