diff --git a/.gitignore b/.gitignore index c3e1d78..a700cb5 100644 --- a/.gitignore +++ b/.gitignore @@ -48,6 +48,7 @@ FlameGraph # Images *.svg +!docs/source/_static/prefetch.svg # Third party dependencies archives third_party/*.tar.gz diff --git a/docs/.gitignore b/docs/.gitignore new file mode 100644 index 0000000..a5909a1 --- /dev/null +++ b/docs/.gitignore @@ -0,0 +1,18 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +_build diff --git a/docs/Makefile b/docs/Makefile new file mode 100644 index 0000000..779537a --- /dev/null +++ b/docs/Makefile @@ -0,0 +1,46 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. 
You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +# Adapted from Apache Arrow +# https://github.com/apache/arrow/blob/main/docs/Makefile + +# +# Makefile for Sphinx documentation + +# You can set these variables from the command line, and also +# from the environment for the first two. +SPHINXOPTS = -j8 +SPHINXBUILD = sphinx-build +SOURCEDIR = source +BUILDDIR = _build + +# Put it first so that "make" without argument is like "make help". +help: + @$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O) + +.PHONY: help Makefile + +# Catch-all target: route all unknown targets to Sphinx using the new +# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS). +%: Makefile + @$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O) + +.PHONY: html +html: + $(SPHINXBUILD) -b html $(SPHINXOPTS) source $(BUILDDIR)/html + @echo + @echo "Build finished. The HTML pages are in $(BUILDDIR)/html." diff --git a/docs/code-style.md b/docs/code-style.md index b413cb4..7cf1d65 100644 --- a/docs/code-style.md +++ b/docs/code-style.md @@ -31,6 +31,21 @@ This document defines the coding conventions for the paimon-cpp project. All pul --- +## Integer Types + +Use **fixed-width integer types** from `<cstdint>`. Do **not** use plain `int`, `long`, `short`, or `unsigned`. + +| Use | Instead of | +|-----|-----------| +| `int8_t` / `uint8_t` | `char` (for numeric data) | +| `int16_t` / `uint16_t` | `short` | +| `int32_t` / `uint32_t` | `int` / `unsigned int` | +| `int64_t` / `uint64_t` | `long` / `long long` | + +**Exception**: Loop variables iterating over small, bounded ranges (e.g. 
`for (int32_t i = 0; i < n; ++i)`) must still use `int32_t`, not `int`. + +--- + ## Formatting Formatting is based on **Google C++ Style** with the following overrides (defined in `.clang-format`): diff --git a/docs/make.bat b/docs/make.bat new file mode 100644 index 0000000..ec328bf --- /dev/null +++ b/docs/make.bat @@ -0,0 +1,55 @@ +@rem Licensed to the Apache Software Foundation (ASF) under one +@rem or more contributor license agreements. See the NOTICE file +@rem distributed with this work for additional information +@rem regarding copyright ownership. The ASF licenses this file +@rem to you under the Apache License, Version 2.0 (the +@rem "License"); you may not use this file except in compliance +@rem with the License. You may obtain a copy of the License at +@rem +@rem http://www.apache.org/licenses/LICENSE-2.0 +@rem +@rem Unless required by applicable law or agreed to in writing, +@rem software distributed under the License is distributed on an +@rem "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +@rem KIND, either express or implied. See the License for the +@rem specific language governing permissions and limitations +@rem under the License. + +@rem Adapted from Apache Arrow +@rem https://github.com/apache/arrow/blob/main/docs/make.bat + +@ECHO OFF + +pushd %~dp0 + +REM Command file for Sphinx documentation + +if "%SPHINXBUILD%" == "" ( + set SPHINXBUILD=sphinx-build +) +set SOURCEDIR=. +set BUILDDIR=_build + +%SPHINXBUILD% >NUL 2>NUL +if errorlevel 9009 ( + echo. + echo.The 'sphinx-build' command was not found. Make sure you have Sphinx + echo.installed, then set the SPHINXBUILD environment variable to point + echo.to the full path of the 'sphinx-build' executable. Alternatively you + echo.may add the Sphinx directory to PATH. + echo. 
+ echo.If you don't have Sphinx installed, grab it from + echo.https://www.sphinx-doc.org/ + exit /b 1 +) + +if "%1" == "" goto help + +%SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O% +goto end + +:help +%SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O% + +:end +popd diff --git a/docs/requirements.txt b/docs/requirements.txt new file mode 100644 index 0000000..818836b --- /dev/null +++ b/docs/requirements.txt @@ -0,0 +1,13 @@ +# +# Note: keep this file in sync with conda_env_sphinx.txt ! +# + +breathe +myst-parser[linkify] +pydata-sphinx-theme~=0.16 +sphinx-autobuild +sphinx-copybutton +sphinx-design +sphinx-lint +sphinxcontrib-mermaid +sphinx diff --git a/docs/source/_static/file-layout.png b/docs/source/_static/file-layout.png new file mode 100644 index 0000000..0fe5f43 Binary files /dev/null and b/docs/source/_static/file-layout.png differ diff --git a/docs/source/_static/prefetch.svg b/docs/source/_static/prefetch.svg new file mode 100644 index 0000000..1db39ee --- /dev/null +++ b/docs/source/_static/prefetch.svg @@ -0,0 +1,4 @@ + + +[SVG markup not shown; text labels in the figure: Reader 1, Reader 2, Reader 3; Queue 1, Queue 2, Queue 3; {0, 100}, {300, 400}, ...; {100, 200}, {400, 500}, ...; {200, 300}, {500, 600}, ...; Consumer; PrefetchFileBatchReader] diff --git a/docs/source/_static/sorted-runs.png b/docs/source/_static/sorted-runs.png new file mode 100644 index 0000000..bbc061b Binary files /dev/null and b/docs/source/_static/sorted-runs.png differ diff --git a/docs/source/_static/theme_overrides.css b/docs/source/_static/theme_overrides.css new file mode 100644 index 0000000..7b52706 --- /dev/null +++ b/docs/source/_static/theme_overrides.css @@ -0,0 +1,86 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. 
The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +/* Adapted from Apache Arrow */ +/* https://github.com/apache/arrow/blob/main/docs/source/_static/theme_overrides.css */ + +/* Customizing with theme CSS variables */ + +:root { + /* Make headings more bold */ + --pst-font-weight-heading: 600; +} + +/* Contributing landing page overview cards */ + +.contrib-card { + border-radius: 0; + padding: 30px 10px 20px 10px; + margin: 10px 0px; +} + +.contrib-card p.card-text { + margin: 0px; +} + +.contrib-card .sd-card-img-top { + margin: 2px; + height: 75px; + background: none !important; +} + +.contrib-card .sd-card-title { + color: var(--pst-color-primary); + font-size: var(--pst-font-size-h3); + padding: 1rem 0rem 0.5rem 0rem; +} + +.contrib-card .sd-card-footer { + border: none; +} + +/* This is the bootstrap CSS style for "table-striped". 
Since the theme does +not yet provide an easy way to configure this globally, it is easier to simply +include this snippet here than to update each table in all rst files to +add ":class: table-striped" */ + +.table tbody tr:nth-of-type(odd) { + background-color: rgba(0, 0, 0, 0.05); +} + +/* Improve the vertical spacing in the C++ API docs +(ideally this should be upstreamed to the pydata-sphinx-theme) */ + +dl.cpp dd p { + margin-bottom: .4rem; +} + +dl.cpp.enumerator { + margin-bottom: 0.2rem; +} + +p.breathe-sectiondef-title { + margin-top: 1rem; +} + +/* Keep social icons arranged horizontally in sidebar */ + +.sidebar-header-items__end { + flex-wrap: wrap; +} diff --git a/docs/source/_static/versions.json b/docs/source/_static/versions.json new file mode 100644 index 0000000..9980db1 --- /dev/null +++ b/docs/source/_static/versions.json @@ -0,0 +1,7 @@ +[ + { + "name": "0.10.0 (dev)", + "version": "dev/", + "url": "#" + } +] diff --git a/docs/source/api.rst b/docs/source/api.rst new file mode 100644 index 0000000..2f6538a --- /dev/null +++ b/docs/source/api.rst @@ -0,0 +1,40 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. See the NOTICE file +.. distributed with this work for additional information +.. regarding copyright ownership. The ASF licenses this file +.. to you under the Apache License, Version 2.0 (the +.. "License"); you may not use this file except in compliance +.. with the License. You may obtain a copy of the License at + +.. http://www.apache.org/licenses/LICENSE-2.0 + +.. Unless required by applicable law or agreed to in writing, +.. software distributed under the License is distributed on an +.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +.. KIND, either express or implied. See the License for the +.. specific language governing permissions and limitations +.. under the License. + +************* +API Reference +************* + +.. 
toctree:: + :maxdepth: 3 + + api/catalog + api/write + api/commit + api/scan + api/read + api/predicate + api/file_format + api/file_system + api/io + api/data_types + api/file_index + api/global_index + api/clean + api/defs + api/executor + api/memory diff --git a/docs/source/api/catalog.rst b/docs/source/api/catalog.rst new file mode 100644 index 0000000..c3956d3 --- /dev/null +++ b/docs/source/api/catalog.rst @@ -0,0 +1,32 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. See the NOTICE file +.. distributed with this work for additional information +.. regarding copyright ownership. The ASF licenses this file +.. to you under the Apache License, Version 2.0 (the +.. "License"); you may not use this file except in compliance +.. with the License. You may obtain a copy of the License at + +.. http://www.apache.org/licenses/LICENSE-2.0 + +.. Unless required by applicable law or agreed to in writing, +.. software distributed under the License is distributed on an +.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +.. KIND, either express or implied. See the License for the +.. specific language governing permissions and limitations +.. under the License. + +=========== +Catalog +=========== + +.. _cpp-api-catalog: + +Interface +========= + +.. doxygenclass:: paimon::Catalog + :members: + +.. doxygenclass:: paimon::Identifier + :members: + :undoc-members: diff --git a/docs/source/api/clean.rst b/docs/source/api/clean.rst new file mode 100644 index 0000000..b1611cd --- /dev/null +++ b/docs/source/api/clean.rst @@ -0,0 +1,37 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. See the NOTICE file +.. distributed with this work for additional information +.. regarding copyright ownership. The ASF licenses this file +.. to you under the Apache License, Version 2.0 (the +.. "License"); you may not use this file except in compliance +.. 
with the License. You may obtain a copy of the License at + +.. http://www.apache.org/licenses/LICENSE-2.0 + +.. Unless required by applicable law or agreed to in writing, +.. software distributed under the License is distributed on an +.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +.. KIND, either express or implied. See the License for the +.. specific language governing permissions and limitations +.. under the License. + +================== +Orphan Files Clean +================== + +.. _cpp-api-clean: + +Interface +========= + +.. doxygenclass:: paimon::OrphanFilesCleaner + :members: + :undoc-members: + +.. doxygenclass:: paimon::CleanContextBuilder + :members: + :undoc-members: + +.. doxygenclass:: paimon::CleanContext + :members: + :undoc-members: diff --git a/docs/source/api/commit.rst b/docs/source/api/commit.rst new file mode 100644 index 0000000..3f03dde --- /dev/null +++ b/docs/source/api/commit.rst @@ -0,0 +1,41 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. See the NOTICE file +.. distributed with this work for additional information +.. regarding copyright ownership. The ASF licenses this file +.. to you under the Apache License, Version 2.0 (the +.. "License"); you may not use this file except in compliance +.. with the License. You may obtain a copy of the License at + +.. http://www.apache.org/licenses/LICENSE-2.0 + +.. Unless required by applicable law or agreed to in writing, +.. software distributed under the License is distributed on an +.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +.. KIND, either express or implied. See the License for the +.. specific language governing permissions and limitations +.. under the License. + +=========== +Commit +=========== + +.. _cpp-api-commit: + +Interface +========= + +.. doxygenclass:: paimon::FileStoreCommit + :members: + :undoc-members: + +.. 
doxygenclass:: paimon::CommitContextBuilder + :members: + :undoc-members: + +.. doxygenclass:: paimon::CommitContext + :members: + :undoc-members: + +.. doxygenclass:: paimon::CommitMessage + :members: + :undoc-members: diff --git a/docs/source/api/data_types.rst b/docs/source/api/data_types.rst new file mode 100644 index 0000000..1876d78 --- /dev/null +++ b/docs/source/api/data_types.rst @@ -0,0 +1,37 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. See the NOTICE file +.. distributed with this work for additional information +.. regarding copyright ownership. The ASF licenses this file +.. to you under the Apache License, Version 2.0 (the +.. "License"); you may not use this file except in compliance +.. with the License. You may obtain a copy of the License at + +.. http://www.apache.org/licenses/LICENSE-2.0 + +.. Unless required by applicable law or agreed to in writing, +.. software distributed under the License is distributed on an +.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +.. KIND, either express or implied. See the License for the +.. specific language governing permissions and limitations +.. under the License. + +=========== +Data Types +=========== + +.. _cpp-api-data-types: + +Interface +========= + +.. doxygenclass:: paimon::Blob + :members: + :undoc-members: + +.. doxygenclass:: paimon::Decimal + :members: + :undoc-members: + +.. doxygenclass:: paimon::Timestamp + :members: + :undoc-members: diff --git a/docs/source/api/defs.rst b/docs/source/api/defs.rst new file mode 100644 index 0000000..ee28685 --- /dev/null +++ b/docs/source/api/defs.rst @@ -0,0 +1,31 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. See the NOTICE file +.. distributed with this work for additional information +.. regarding copyright ownership. The ASF licenses this file +.. to you under the Apache License, Version 2.0 (the +.. 
"License"); you may not use this file except in compliance +.. with the License. You may obtain a copy of the License at + +.. http://www.apache.org/licenses/LICENSE-2.0 + +.. Unless required by applicable law or agreed to in writing, +.. software distributed under the License is distributed on an +.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +.. KIND, either express or implied. See the License for the +.. specific language governing permissions and limitations +.. under the License. + +=========== +Options +=========== + +.. _cpp-api-options: + +Interface +========= + +.. doxygenenum:: paimon::FieldType + +.. doxygenstruct:: paimon::Options + :members: + :undoc-members: diff --git a/docs/source/api/executor.rst b/docs/source/api/executor.rst new file mode 100644 index 0000000..d5daed5 --- /dev/null +++ b/docs/source/api/executor.rst @@ -0,0 +1,37 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. See the NOTICE file +.. distributed with this work for additional information +.. regarding copyright ownership. The ASF licenses this file +.. to you under the Apache License, Version 2.0 (the +.. "License"); you may not use this file except in compliance +.. with the License. You may obtain a copy of the License at + +.. http://www.apache.org/licenses/LICENSE-2.0 + +.. Unless required by applicable law or agreed to in writing, +.. software distributed under the License is distributed on an +.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +.. KIND, either express or implied. See the License for the +.. specific language governing permissions and limitations +.. under the License. + +=========== +Executor +=========== + +.. _cpp-api-executor: + +Interface +========= + +.. doxygenvariable:: paimon::DEFAULT_EXECUTOR_THREAD_COUNT + +.. doxygenfunction:: paimon::GetGlobalDefaultExecutor() + +.. doxygenfunction:: paimon::CreateDefaultExecutor() + +.. 
doxygenfunction:: paimon::CreateDefaultExecutor(uint32_t) + +.. doxygenclass:: paimon::Executor + :members: + :undoc-members: diff --git a/docs/source/api/file_format.rst b/docs/source/api/file_format.rst new file mode 100644 index 0000000..57b9eee --- /dev/null +++ b/docs/source/api/file_format.rst @@ -0,0 +1,57 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. See the NOTICE file +.. distributed with this work for additional information +.. regarding copyright ownership. The ASF licenses this file +.. to you under the Apache License, Version 2.0 (the +.. "License"); you may not use this file except in compliance +.. with the License. You may obtain a copy of the License at + +.. http://www.apache.org/licenses/LICENSE-2.0 + +.. Unless required by applicable law or agreed to in writing, +.. software distributed under the License is distributed on an +.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +.. KIND, either express or implied. See the License for the +.. specific language governing permissions and limitations +.. under the License. + +=========== +File Format +=========== + +.. _cpp-api-file-format: + +Interface +========= + +.. doxygenclass:: paimon::FileFormat + :members: + :undoc-members: + +.. doxygenclass:: paimon::FormatWriter + :members: + :undoc-members: + +.. doxygenclass:: paimon::WriterBuilder + :members: + :undoc-members: + +.. doxygenclass:: paimon::BatchReader + :members: + :undoc-members: + +.. doxygenclass:: paimon::FileBatchReader + :members: + :undoc-members: + +.. doxygenclass:: paimon::ReaderBuilder + :members: + :undoc-members: + +.. doxygenclass:: paimon::FileFormatFactory + :members: + :undoc-members: + +.. 
doxygenclass:: paimon::FormatStatsExtractor + :members: + :undoc-members: diff --git a/docs/source/api/file_index.rst b/docs/source/api/file_index.rst new file mode 100644 index 0000000..e95cd26 --- /dev/null +++ b/docs/source/api/file_index.rst @@ -0,0 +1,53 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. See the NOTICE file +.. distributed with this work for additional information +.. regarding copyright ownership. The ASF licenses this file +.. to you under the Apache License, Version 2.0 (the +.. "License"); you may not use this file except in compliance +.. with the License. You may obtain a copy of the License at + +.. http://www.apache.org/licenses/LICENSE-2.0 + +.. Unless required by applicable law or agreed to in writing, +.. software distributed under the License is distributed on an +.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +.. KIND, either express or implied. See the License for the +.. specific language governing permissions and limitations +.. under the License. + +=========== +File Index +=========== + +.. _cpp-api-file-index: + +Interface +========= + +.. doxygenclass:: paimon::FileIndexer + :members: + :undoc-members: + +.. doxygenclass:: paimon::FileIndexerFactory + :members: + :undoc-members: + +.. doxygenclass:: paimon::FileIndexWriter + :members: + :undoc-members: + +.. doxygenclass:: paimon::FileIndexReader + :members: + :undoc-members: + +.. doxygenclass:: paimon::FileIndexFormat + :members: + :undoc-members: + +.. doxygenclass:: paimon::FileIndexResult + :members: + :undoc-members: + +.. doxygenclass:: paimon::BitmapIndexResult + :members: + :undoc-members: diff --git a/docs/source/api/file_system.rst b/docs/source/api/file_system.rst new file mode 100644 index 0000000..0313d0b --- /dev/null +++ b/docs/source/api/file_system.rst @@ -0,0 +1,51 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. 
See the NOTICE file +.. distributed with this work for additional information +.. regarding copyright ownership. The ASF licenses this file +.. to you under the Apache License, Version 2.0 (the +.. "License"); you may not use this file except in compliance +.. with the License. You may obtain a copy of the License at + +.. http://www.apache.org/licenses/LICENSE-2.0 + +.. Unless required by applicable law or agreed to in writing, +.. software distributed under the License is distributed on an +.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +.. KIND, either express or implied. See the License for the +.. specific language governing permissions and limitations +.. under the License. + +=========== +File System +=========== + +.. _cpp-api-file-system-factory: + +Interface +========= + +.. doxygenclass:: paimon::FileSystem + :members: + :undoc-members: + +.. doxygenclass:: paimon::FileSystemFactory + :members: + :undoc-members: + +.. doxygenclass:: paimon::InputStream + :members: + :undoc-members: + +.. doxygenclass:: paimon::OutputStream + :members: + :undoc-members: + +.. doxygenenum:: paimon::SeekOrigin + +.. doxygenclass:: paimon::FileStatus + :members: + :undoc-members: + +.. doxygenclass:: paimon::BasicFileStatus + :members: + :undoc-members: diff --git a/docs/source/api/global_index.rst b/docs/source/api/global_index.rst new file mode 100644 index 0000000..8be63a7 --- /dev/null +++ b/docs/source/api/global_index.rst @@ -0,0 +1,77 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. See the NOTICE file +.. distributed with this work for additional information +.. regarding copyright ownership. The ASF licenses this file +.. to you under the Apache License, Version 2.0 (the +.. "License"); you may not use this file except in compliance +.. with the License. You may obtain a copy of the License at + +.. http://www.apache.org/licenses/LICENSE-2.0 + +.. 
Unless required by applicable law or agreed to in writing, +.. software distributed under the License is distributed on an +.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +.. KIND, either express or implied. See the License for the +.. specific language governing permissions and limitations +.. under the License. + +============ +Global Index +============ + +.. _cpp-api-global-index: + +Interface +========= + +.. doxygenclass:: paimon::GlobalIndexerFactory + :members: + :undoc-members: + +.. doxygenclass:: paimon::GlobalIndexer + :members: + :undoc-members: + +.. doxygenclass:: paimon::GlobalIndexReader + :members: + :undoc-members: + +.. doxygenclass:: paimon::GlobalIndexFileReader + :members: + :undoc-members: + +.. doxygenstruct:: paimon::GlobalIndexIOMeta + :members: + :undoc-members: + +.. doxygenclass:: paimon::GlobalIndexResult + :members: + :undoc-members: + +.. doxygenclass:: paimon::BitmapGlobalIndexResult + :members: + :undoc-members: + +.. doxygenclass:: paimon::BitmapScoredGlobalIndexResult + :members: + :undoc-members: + +.. doxygenclass:: paimon::GlobalIndexScan + :members: + :undoc-members: + +.. doxygenclass:: paimon::IndexedSplit + :members: + :undoc-members: + +.. doxygenclass:: paimon::GlobalIndexWriteTask + :members: + :undoc-members: + +.. doxygenclass:: paimon::GlobalIndexWriter + :members: + :undoc-members: + +.. doxygenclass:: paimon::GlobalIndexFileWriter + :members: + :undoc-members: diff --git a/docs/source/api/io.rst b/docs/source/api/io.rst new file mode 100644 index 0000000..9687fc1 --- /dev/null +++ b/docs/source/api/io.rst @@ -0,0 +1,41 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. See the NOTICE file +.. distributed with this work for additional information +.. regarding copyright ownership. The ASF licenses this file +.. to you under the Apache License, Version 2.0 (the +.. "License"); you may not use this file except in compliance +.. with the License. 
You may obtain a copy of the License at + +.. http://www.apache.org/licenses/LICENSE-2.0 + +.. Unless required by applicable law or agreed to in writing, +.. software distributed under the License is distributed on an +.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +.. KIND, either express or implied. See the License for the +.. specific language governing permissions and limitations +.. under the License. + +=== +IO +=== + +.. _cpp-api-io: + +Interface +========= + +.. doxygenenum:: paimon::ByteOrder + +.. doxygenfunction:: paimon::SystemByteOrder() + +.. doxygenclass:: paimon::BufferedInputStream + :members: + :undoc-members: + +.. doxygenclass:: paimon::ByteArrayInputStream + :members: + :undoc-members: + +.. doxygenclass:: paimon::DataInputStream + :members: + :undoc-members: diff --git a/docs/source/api/memory.rst b/docs/source/api/memory.rst new file mode 100644 index 0000000..e498dfb --- /dev/null +++ b/docs/source/api/memory.rst @@ -0,0 +1,37 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. See the NOTICE file +.. distributed with this work for additional information +.. regarding copyright ownership. The ASF licenses this file +.. to you under the Apache License, Version 2.0 (the +.. "License"); you may not use this file except in compliance +.. with the License. You may obtain a copy of the License at + +.. http://www.apache.org/licenses/LICENSE-2.0 + +.. Unless required by applicable law or agreed to in writing, +.. software distributed under the License is distributed on an +.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +.. KIND, either express or implied. See the License for the +.. specific language governing permissions and limitations +.. under the License. + +======== +Memory +======== + +.. _cpp-api-memory: + +Interface +========= + +.. doxygenfunction:: paimon::GetMemoryPool() + +.. doxygenfunction:: paimon::GetDefaultPool() + +.. 
doxygenclass:: paimon::MemoryPool + :members: + :undoc-members: + +.. doxygenclass:: paimon::Bytes + :members: + :undoc-members: diff --git a/docs/source/api/predicate.rst b/docs/source/api/predicate.rst new file mode 100644 index 0000000..e79f6ca --- /dev/null +++ b/docs/source/api/predicate.rst @@ -0,0 +1,52 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. See the NOTICE file +.. distributed with this work for additional information +.. regarding copyright ownership. The ASF licenses this file +.. to you under the Apache License, Version 2.0 (the +.. "License"); you may not use this file except in compliance +.. with the License. You may obtain a copy of the License at + +.. http://www.apache.org/licenses/LICENSE-2.0 + +.. Unless required by applicable law or agreed to in writing, +.. software distributed under the License is distributed on an +.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +.. KIND, either express or implied. See the License for the +.. specific language governing permissions and limitations +.. under the License. + +Predicate +=================================== + +.. _cpp-api-predicate: + +Interface +========= + +.. doxygenclass:: paimon::Predicate + :members: + :undoc-members: + +.. doxygenclass:: paimon::LeafPredicate + :members: + :undoc-members: + +.. doxygenclass:: paimon::CompoundPredicate + :members: + :undoc-members: + +.. doxygenclass:: paimon::Function + :members: + :undoc-members: + +.. doxygenclass:: paimon::FunctionVisitor + :members: + :undoc-members: + +.. doxygenclass:: paimon::Literal + :members: + :undoc-members: + +.. doxygenclass:: paimon::PredicateBuilder + :members: + :undoc-members: diff --git a/docs/source/api/read.rst b/docs/source/api/read.rst new file mode 100644 index 0000000..3c437d7 --- /dev/null +++ b/docs/source/api/read.rst @@ -0,0 +1,41 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. 
or more contributor license agreements. See the NOTICE file +.. distributed with this work for additional information +.. regarding copyright ownership. The ASF licenses this file +.. to you under the Apache License, Version 2.0 (the +.. "License"); you may not use this file except in compliance +.. with the License. You may obtain a copy of the License at + +.. http://www.apache.org/licenses/LICENSE-2.0 + +.. Unless required by applicable law or agreed to in writing, +.. software distributed under the License is distributed on an +.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +.. KIND, either express or implied. See the License for the +.. specific language governing permissions and limitations +.. under the License. + +=========== +Read +=========== + +.. _cpp-api-read: + +Interface +========= + +.. doxygenclass:: paimon::TableRead + :members: + :undoc-members: + +.. doxygenclass:: paimon::ReadContextBuilder + :members: + :undoc-members: + +.. doxygenclass:: paimon::ReadContext + :members: + :undoc-members: + +.. doxygenclass:: paimon::BatchReader + :members: + :undoc-members: diff --git a/docs/source/api/scan.rst b/docs/source/api/scan.rst new file mode 100644 index 0000000..142a516 --- /dev/null +++ b/docs/source/api/scan.rst @@ -0,0 +1,53 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. See the NOTICE file +.. distributed with this work for additional information +.. regarding copyright ownership. The ASF licenses this file +.. to you under the Apache License, Version 2.0 (the +.. "License"); you may not use this file except in compliance +.. with the License. You may obtain a copy of the License at + +.. http://www.apache.org/licenses/LICENSE-2.0 + +.. Unless required by applicable law or agreed to in writing, +.. software distributed under the License is distributed on an +.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +.. KIND, either express or implied. 
See the License for the +.. specific language governing permissions and limitations +.. under the License. + +=========== +Scan +=========== + +.. _cpp-api-scan: + +Interface +========= + +.. doxygenclass:: paimon::TableScan + :members: + :undoc-members: + +.. doxygenclass:: paimon::ScanContextBuilder + :members: + :undoc-members: + +.. doxygenclass:: paimon::ScanContext + :members: + :undoc-members: + +.. doxygenclass:: paimon::Plan + :members: + :undoc-members: + +.. doxygenclass:: paimon::Split + :members: + :undoc-members: + +.. doxygenclass:: paimon::DataSplit + :members: + :undoc-members: + +.. doxygenclass:: paimon::ScanFilter + :members: + :undoc-members: diff --git a/docs/source/api/write.rst b/docs/source/api/write.rst new file mode 100644 index 0000000..7abeb5b --- /dev/null +++ b/docs/source/api/write.rst @@ -0,0 +1,45 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. See the NOTICE file +.. distributed with this work for additional information +.. regarding copyright ownership. The ASF licenses this file +.. to you under the Apache License, Version 2.0 (the +.. "License"); you may not use this file except in compliance +.. with the License. You may obtain a copy of the License at + +.. http://www.apache.org/licenses/LICENSE-2.0 + +.. Unless required by applicable law or agreed to in writing, +.. software distributed under the License is distributed on an +.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +.. KIND, either express or implied. See the License for the +.. specific language governing permissions and limitations +.. under the License. + +=========== +Write +=========== + +.. _cpp-api-write: + +Interface +========= + +.. doxygenclass:: paimon::FileStoreWrite + :members: + :undoc-members: + +.. doxygenclass:: paimon::WriteContextBuilder + :members: + :undoc-members: + +.. doxygenclass:: paimon::WriteContext + :members: + :undoc-members: + +.. 
doxygenclass:: paimon::RecordBatch + :members: + :undoc-members: + +.. doxygenclass:: paimon::RecordBatchBuilder + :members: + :undoc-members: diff --git a/docs/source/basic_concepts.rst b/docs/source/basic_concepts.rst new file mode 100644 index 0000000..10143e9 --- /dev/null +++ b/docs/source/basic_concepts.rst @@ -0,0 +1,89 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. See the NOTICE file +.. distributed with this work for additional information +.. regarding copyright ownership. The ASF licenses this file +.. to you under the Apache License, Version 2.0 (the +.. "License"); you may not use this file except in compliance +.. with the License. You may obtain a copy of the License at + +.. http://www.apache.org/licenses/LICENSE-2.0 + +.. Unless required by applicable law or agreed to in writing, +.. software distributed under the License is distributed on an +.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +.. KIND, either express or implied. See the License for the +.. specific language governing permissions and limitations +.. under the License. + +.. Borrowed the file from Apache Paimon: +.. https://github.com/apache/paimon/blob/master/docs/content/concepts/basic-concepts.md + +Basic Concepts +======================== + +File Layouts +------------------------ +All files of a table are stored under one base directory. Paimon files are +organized in a layered style. The following image illustrates the file layout. +Starting from a snapshot file, Paimon readers can recursively access all records +from the table. + +.. image:: _static/file-layout.png + :alt: File Layout + :align: center + :width: 100% + +Snapshot +------------------- +All snapshot files are stored in the snapshot directory. + +A snapshot file is a JSON file containing information about this snapshot, +including the schema file in use and the manifest list containing all changes of +this snapshot.
A snapshot captures the state of a table at some point in time. +Users can access the latest data of a table through the latest snapshot. +By time traveling, users can also access the previous state of a table through +an earlier snapshot. + +Manifest Files +------------------- +All manifest lists and manifest files are stored in the manifest directory. A +manifest list is a list of manifest file names. +A manifest file records changes to LSM data files and changelog +files, for example which LSM data files were created and which were deleted in +the corresponding snapshot. + +Data Files +--------------------------- +Data files are grouped by partitions. Currently, Paimon supports Parquet +(default), ORC, and Lance as data file formats. + +.. note:: + Writing Avro as a data file format is not supported yet. + +Partition +--------------------------- +Paimon adopts the same partitioning concept as Apache Hive to separate data. + +Partitioning is an optional way of dividing a table into related parts based on +the values of particular columns like date, city, and department. Each table can +have one or more partition keys to identify a particular partition. + +By partitioning, users can efficiently operate on a slice of records in the +table. + +Consistency Guarantees +--------------------------- +Paimon writers use a two-phase commit protocol to atomically commit a batch of +records to the table. Each commit produces at most two snapshots, +depending on the incremental write and compaction strategy. If only incremental +writes are performed without triggering a compaction operation, only an +incremental snapshot will be created. If a compaction operation is triggered, an +incremental snapshot and a compacted snapshot will be created. + +For any two writers modifying a table at the same time, as long as they do not +modify the same partition, their commits can occur in parallel.
If they modify +the same partition, only snapshot isolation is guaranteed. That is, the final +table state may be a mix of the two commits, but no changes are lost. + +.. note:: + Paimon C++ currently does not support compaction. diff --git a/docs/source/build_system.rst b/docs/source/build_system.rst new file mode 100644 index 0000000..1f3f325 --- /dev/null +++ b/docs/source/build_system.rst @@ -0,0 +1,99 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. See the NOTICE file +.. distributed with this work for additional information +.. regarding copyright ownership. The ASF licenses this file +.. to you under the Apache License, Version 2.0 (the +.. "License"); you may not use this file except in compliance +.. with the License. You may obtain a copy of the License at + +.. http://www.apache.org/licenses/LICENSE-2.0 + +.. Unless required by applicable law or agreed to in writing, +.. software distributed under the License is distributed on an +.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +.. KIND, either express or implied. See the License for the +.. specific language governing permissions and limitations +.. under the License. + +.. default-domain:: cpp +.. highlight:: cpp + +====================== +Integrating Paimon C++ +====================== + +This section assumes that you have already built and installed the Paimon C++ +libraries on your system after :ref:`building them yourself `. +Additionally, you will need `Apache Arrow for C++ `_ +as the in-memory data format interface. Please ensure that Arrow C++ is installed +and available to your build system. + +The recommended way to integrate the Paimon C++ libraries into your C++ project +is to use CMake’s `find_package `_ +function to locate and integrate dependencies.
+ +CMake +===== + +Quick Start +----------- + +This ``CMakeLists.txt`` compiles the ``my_example.cc`` source file into +an executable and links it with the Paimon C++ shared library and its plugins +for data format and file system. + +.. code-block:: cmake + + cmake_minimum_required(VERSION 3.16) + + project(MyExample) + + find_package(Arrow REQUIRED) + find_package(Paimon REQUIRED) + + add_executable(my_example my_example.cc) + target_link_libraries(my_example PRIVATE arrow_shared + paimon_shared + paimon_parquet_file_format_shared + paimon_local_file_system_shared) + +Available variables and targets +------------------------------- + +The directive ``find_package(Paimon REQUIRED)`` instructs CMake to locate a +Paimon C++ installation on your system. If successful, it sets ``Paimon_FOUND`` +to true. + +It also defines the following linkable targets (plain strings, not variables): + +* ``paimon_shared`` links to the Paimon shared libraries +* ``paimon_static`` links to the Paimon static libraries + +In most cases, it is recommended to use the Paimon shared libraries.
+ +Optional plugins (built-in file formats, file systems, and index) +----------------------------------------------------------------- + +Paimon provides a set of built-in optional plugins that you can link to as needed: + +- File format plugins: + + - ``paimon_parquet_file_format_shared`` / ``paimon_parquet_file_format_static`` + - ``paimon_orc_file_format_shared`` / ``paimon_orc_file_format_static`` + - ``paimon_avro_file_format_shared`` / ``paimon_avro_file_format_static`` + - ``paimon_blob_file_format_shared`` / ``paimon_blob_file_format_static`` + - ``paimon_lance_file_format_shared`` / ``paimon_lance_file_format_static`` + +- File system plugins: + + - ``paimon_local_file_system_shared`` / ``paimon_local_file_system_static`` + - ``paimon_jindo_file_system_shared`` / ``paimon_jindo_file_system_static`` + +- Index plugins: + + - ``paimon_file_index_shared`` / ``paimon_file_index_static`` + - ``paimon_lumina_index_shared`` / ``paimon_lumina_index_static`` + +.. note:: + + In most cases, it is recommended to use the shared variants of these plugins. diff --git a/docs/source/building.rst b/docs/source/building.rst new file mode 100644 index 0000000..1b79175 --- /dev/null +++ b/docs/source/building.rst @@ -0,0 +1,291 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. See the NOTICE file +.. distributed with this work for additional information +.. regarding copyright ownership. The ASF licenses this file +.. to you under the Apache License, Version 2.0 (the +.. "License"); you may not use this file except in compliance +.. with the License. You may obtain a copy of the License at + +.. http://www.apache.org/licenses/LICENSE-2.0 + +.. Unless required by applicable law or agreed to in writing, +.. software distributed under the License is distributed on an +.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +.. KIND, either express or implied. See the License for the +.. 
specific language governing permissions and limitations +.. under the License. + +.. highlight:: console + +.. _building-paimon-cpp: + +=================== +Building Paimon C++ +=================== + +System setup +============ + +Paimon uses CMake as a build configuration system. We recommend building +out-of-source. For example, you could create ``paimon-cpp/build-release`` +and invoke ``cmake $CMAKE_ARGS ..`` from this directory. + +Building requires: + +* A C++17-enabled compiler. On Linux, GCC 8 or higher should be + sufficient. Windows and macOS are not supported for now. +* At least 2GB of RAM for a minimal build, 8GB for a minimal + debug build with tests and 16GB for a full build. + +On Ubuntu/Debian you can install the requirements with: + +.. code-block:: shell + + sudo apt-get install \ + build-essential \ + cmake + +We also provide a Docker template to help you get started quickly. See the +``.devcontainer`` folder for more details. + +.. _cpp-building-building: + +Building +======== + +All the instructions below assume that you have cloned the paimon-cpp git +repository: + +.. code-block:: + + $ git clone https://github.com/apache/paimon-cpp.git + $ cd paimon-cpp + $ git lfs pull + +Manual configuration +-------------------- + +The build system uses ``CMAKE_BUILD_TYPE=Release`` by default, so if this +argument is omitted then a release build will be produced. + +Two build types are possible: + +* ``Debug``: doesn't apply any compiler optimizations and adds debugging + information in the binary. +* ``Release``: applies compiler optimizations and removes debug information + from the binary. + +.. note:: + + You can also run the default build with the flag ``-DPAIMON_EXTRA_ERROR_CONTEXT=ON`` + for more error message context. + +Minimal release build (2GB of RAM for building or more recommended): + +.. code-block:: + + $ mkdir build-release + $ cd build-release + $ cmake ..
+ $ make -j8 # if you have 8 CPU cores, otherwise adjust + $ make install + +Minimal debug build with unit tests (4GB of RAM for building or more recommended): + +.. code-block:: + + $ mkdir build-debug + $ cd build-debug + $ cmake -DCMAKE_BUILD_TYPE=Debug -DPAIMON_BUILD_TESTS=ON .. + $ make -j8 # if you have 8 CPU cores, otherwise adjust + $ make unittest # to run the tests + $ make install + +The unit tests are not built by default. After building, one can also invoke +the unit tests using the ``ctest`` tool provided by CMake. + +Faster builds with Ninja +~~~~~~~~~~~~~~~~~~~~~~~~ + +Many contributors use the `Ninja build system `_ to +get faster builds. It especially speeds up incremental builds. To use +``ninja``, pass ``-GNinja`` when calling ``cmake`` and then use the ``ninja`` +command instead of ``make``. + +.. _cpp_build_optional_components: + +Optional Components +~~~~~~~~~~~~~~~~~~~ + +By default, the C++ build system creates a fairly minimal build. We have +several optional system components which you can opt into building by passing +boolean flags to ``cmake``. + +* ``-DPAIMON_ENABLE_ORC=ON``: Paimon integration with Apache ORC +* ``-DPAIMON_ENABLE_LANCE=ON``: Paimon integration with Lance +* ``-DPAIMON_ENABLE_AVRO=ON``: Apache Avro libraries and Paimon integration +* ``-DPAIMON_ENABLE_JINDO=ON``: Support for Alibaba Jindo filesystems +* ``-DPAIMON_ENABLE_LUMINA=ON``: Support for the Lumina vector index. Lumina is only supported with GCC 9 or higher. + +Third-party dependency source +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Paimon C++ can either build selected third-party dependencies from bundled +sources or use libraries already installed on the system. The default mode is +``AUTO``, which tries system packages first and falls back to bundled sources +when they are not found. + +..
code-block:: shell + + cmake -B build -DPAIMON_DEPENDENCY_SOURCE=AUTO + +The supported dependency source values are: + +* ``AUTO``: use a system package when available, otherwise build bundled sources. +* ``BUNDLED``: always build bundled sources. +* ``SYSTEM``: require system packages and fail if they are not found. + +You can override individual dependencies with ``_SOURCE``. The +supported dependency set includes Arrow/Parquet, ORC, Protobuf, Avro, RE2, fmt, +RapidJSON, TBB, glog, GoogleTest, and compression libraries. Arrow and ORC +require project-specific patches, so their supported source values are +``AUTO`` and ``BUNDLED``; ``AUTO`` resolves to bundled sources for them. + +.. code-block:: shell + + cmake -B build \ + -DPAIMON_DEPENDENCY_SOURCE=AUTO \ + -Dfmt_SOURCE=SYSTEM \ + -Dfmt_ROOT=/opt/fmt \ + -Dzstd_SOURCE=BUNDLED + +Use ``PAIMON_PACKAGE_PREFIX`` to provide one common prefix for dependencies +whose own ``_ROOT`` variable is not set. + +.. code-block:: shell + + cmake -B build \ + -DPAIMON_DEPENDENCY_SOURCE=SYSTEM \ + -DPAIMON_PACKAGE_PREFIX=/opt/paimon-deps + +Package-manager-specific modes are intentionally out of scope for this first +dependency source interface. They can still be used through standard CMake +mechanisms such as ``CMAKE_PREFIX_PATH`` or ``CMAKE_TOOLCHAIN_FILE``, while +Paimon keeps the dependency source values limited to ``AUTO``, ``BUNDLED``, and +``SYSTEM``. + +When ``Arrow_SOURCE`` is explicitly set to ``BUNDLED`` or left as ``AUTO``, the +compression dependencies default to bundled sources unless individually +overridden. When ``ORC_SOURCE`` is explicitly set to ``BUNDLED`` or left as +``AUTO``, ``Protobuf_SOURCE`` defaults to bundled sources unless individually +overridden. + +During configuration, CMake prints a dependency resolution summary showing the +requested source, actual source, compatibility target, and search root for each +resolved dependency. 
+ +Optional Targets +~~~~~~~~~~~~~~~~ + +For development builds, you will often want to enable additional targets in +order to exercise your changes, using the following ``cmake`` options. + +* ``-DPAIMON_BUILD_TESTS=ON``: Build executable unit tests. + +Optional Checks +~~~~~~~~~~~~~~~ + +The following special checks are available as well. They instrument the +generated code in various ways so as to detect select classes of problems +at runtime (for example when executing unit tests). + +* ``-DPAIMON_USE_ASAN=ON``: Enable Address Sanitizer to check for memory leaks, + buffer overflows or other kinds of memory management issues. +* ``-DPAIMON_USE_UBSAN=ON``: Enable Undefined Behavior Sanitizer to check for + situations which trigger C++ undefined behavior. + +Some of those options are mutually incompatible, so you may have to build +several times with different options if you want to exercise all of them. + +CMake version requirements +~~~~~~~~~~~~~~~~~~~~~~~~~~ + +We support CMake 3.16 and higher. + +LLVM and Clang Tools +~~~~~~~~~~~~~~~~~~~~ + +We currently use LLVM for library builds and for developer tools such as code +formatting with clang-format. LLVM can be installed via most modern package +managers (apt, yum, etc.). + +Environment variables +~~~~~~~~~~~~~~~~~~~~~ + +The build system and helper scripts accept several environment variables that +can alter fetch and build behaviour without changing CMake flags. These are +especially useful when you want to use a local or corporate mirror for +third-party archives, or to override a specific dependency's download URL. + +Common environment variables +---------------------------- + +* ``PAIMON_THIRDPARTY_MIRROR_URL`` + + When set, this string is used as a prefix for the default third-party + download URLs.
For example, if a dependency would normally be downloaded + from + + ``https://github.com/fmtlib/fmt/archive/refs/tags/${PAIMON_FMT_BUILD_VERSION}.tar.gz`` + + and ``PAIMON_THIRDPARTY_MIRROR_URL`` is set to + + ``https://mirror.example.com/paimon/thirdparty/``, the build system will + attempt to download from + + ``https://mirror.example.com/paimon/thirdparty/https://github.com/fmtlib/fmt/archive/refs/tags/${PAIMON_FMT_BUILD_VERSION}.tar.gz`` + + (the exact concatenation semantics follow the third-party fetch helpers + defined in ``cmake_modules/ThirdpartyToolchain.cmake``). If you set a + mirror URL, prefer including a trailing slash to avoid accidental URL + concatenation issues. + +* Per-dependency override variables (examples) + + Many dependencies support overriding their download URL via a dedicated + environment variable. Examples implemented in the CMake helper include: + + - ``PAIMON_FMT_URL`` to override the fmt archive URL + - ``PAIMON_RAPIDJSON_URL`` to override RapidJSON download URL + - ``PAIMON_ZLIB_URL``, ``PAIMON_ZSTD_URL``, ``PAIMON_LZ4_URL`` etc. + + If one of these per-dependency environment variables is defined, it will + take precedence over the mirror prefix. Use these variables to precisely + control where a given dependency is fetched from. + +Usage examples +-------------- + +Use a mirror for all third-party downloads: + +.. code-block:: shell + + export PAIMON_THIRDPARTY_MIRROR_URL="https://mirror.example.com/paimon/thirdparty/" + mkdir build + cd build + cmake -DPAIMON_BUILD_TESTS=ON .. + +Override only a single dependency (fmt): + +.. code-block:: shell + + export PAIMON_FMT_URL="https://internal.example.com/archives/fmt-8.1.1.tar.gz" + mkdir build + cd build + cmake .. + +.. note:: + + The exact fetch behaviour (how the mirror prefix is concatenated, or whether the helper expects a full URL vs. a prefix) + is implemented in ``cmake_modules/ThirdpartyToolchain.cmake``. Consult that file when you need a custom setup. 
+ Unset an environment variable to revert to the default upstream download locations: ``unset PAIMON_THIRDPARTY_MIRROR_URL`` diff --git a/docs/source/conf.py b/docs/source/conf.py new file mode 100644 index 0000000..052c2c2 --- /dev/null +++ b/docs/source/conf.py @@ -0,0 +1,147 @@ +# -*- coding: utf-8 -*- +# +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. +# + +# Adapted from Apache Arrow +# https://github.com/apache/arrow/blob/main/docs/source/conf.py + +# Configuration file for the Sphinx documentation builder. 
+# +# For the full list of built-in configuration values, see the documentation: +# https://www.sphinx-doc.org/en/master/usage/configuration.html + +# -- Project information ----------------------------------------------------- +# https://www.sphinx-doc.org/en/master/usage/configuration.html#project-information + +import os + +project = "C++ Paimon" +copyright = "2024-present The Apache Software Foundation" +author = "The Apache Software Foundation" + +# -- General configuration --------------------------------------------------- +# https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration + +extensions = [ + "breathe", + "myst_parser", + "sphinx_design", + "sphinx_copybutton", + "sphinx.ext.autodoc", + "sphinx.ext.autosummary", + "sphinx.ext.doctest", + "sphinx.ext.ifconfig", + "sphinx.ext.intersphinx", + "sphinx.ext.mathjax", + "sphinx.ext.viewcode", + "sphinxcontrib.mermaid", +] + +# Show members for classes in .. autosummary +autodoc_default_options = { + "members": None, + "special-members": "__dataframe__", + "undoc-members": None, + "show-inheritance": None, + "inherited-members": None, +} + +# Breathe configuration +breathe_projects = { + "paimon_cpp": os.environ.get("PAIMON_CPP_DOXYGEN_XML", "../../apidoc/xml"), +} +breathe_default_project = "paimon_cpp" + +# Overridden conditionally below +autodoc_mock_imports = [] + +# copybutton configuration +copybutton_prompt_text = r">>> |\.\.\. 
|\$ |In \[\d*\]: | {2,5}\.\.\.: " +copybutton_prompt_is_regexp = True +copybutton_line_continuation_character = "\\" + +# MyST-Parser configuration +myst_enable_extensions = [ + "amsmath", + "attrs_inline", + "deflist", + "dollarmath", + "fieldlist", + "html_admonition", + "html_image", + "linkify", + "strikethrough", + "substitution", + "tasklist", +] + +templates_path = ["_templates"] +exclude_patterns = ["_build", "Thumbs.db", ".DS_Store"] + + +# -- Options for HTML output ------------------------------------------------- +# https://www.sphinx-doc.org/en/master/usage/configuration.html#options-for-html-output + +html_theme = "pydata_sphinx_theme" +html_static_path = ["_static"] + +# Custom fixes to the RTD theme +html_css_files = ["theme_overrides.css"] + +# Hide the primary sidebar (section navigation) for these pages +html_sidebars = { + "implementations": [], + "status": [], +} + +# The master toctree document. +master_doc = "index" + +version = "0.2.0" + +html_theme_options = { + "show_toc_level": 2, + "show_nav_level": 2, + "use_edit_page_button": True, + "header_links_before_dropdown": 4, + "navbar_align": "left", + "navbar_end": ["theme-switcher", "navbar-icon-links"], + "icon_links": [ + { + "name": "GitHub", + "url": "https://github.com/apache/paimon-cpp", + "icon": "fa-brands fa-square-github", + }, + ], + "logo": { + "text": "Paimon C++", + }, + "show_version_warning_banner": True, +} + +html_context = { + "github_user": "apache", + "github_repo": "paimon-cpp", + "github_version": "main", + "doc_path": "docs/source", +} + +html_title = f"C++ Paimon" + +html_show_sourcelink = False diff --git a/docs/source/documentations.rst b/docs/source/documentations.rst new file mode 100644 index 0000000..5dcc368 --- /dev/null +++ b/docs/source/documentations.rst @@ -0,0 +1,68 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. See the NOTICE file +.. distributed with this work for additional information +.. 
regarding copyright ownership. The ASF licenses this file +.. to you under the Apache License, Version 2.0 (the +.. "License"); you may not use this file except in compliance +.. with the License. You may obtain a copy of the License at + +.. http://www.apache.org/licenses/LICENSE-2.0 + +.. Unless required by applicable law or agreed to in writing, +.. software distributed under the License is distributed on an +.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +.. KIND, either express or implied. See the License for the +.. specific language governing permissions and limitations +.. under the License. + +.. _building-docs: + +Building the Documentation +========================== + +Prerequisites +------------- + +The documentation build relies on `Doxygen `_ and +`Sphinx `_ along with a few extensions. + +First, install `Doxygen `_ +yourself (for example from your distribution's official repositories, if +using Linux). Then install the Python-based requirements with the +following command: + +.. code-block:: shell + + pip install -r paimon-cpp/docs/requirements.txt + +Building +-------- + +.. note:: + + If you are building the documentation on Windows, not all sections + may build properly. + +These two steps are mandatory and must be executed in order. + +#. Process the C++ API using Doxygen + + .. code-block:: shell + + cd paimon-cpp/apidoc + doxygen + cd - + +#. Build the complete documentation using Sphinx. + + .. code-block:: shell + + cd paimon-cpp/docs + make html + cd - + + +After these steps are completed, the documentation is rendered in HTML +format in ``paimon-cpp/docs/_build/html``. In particular, you can point your +browser at ``paimon-cpp/docs/_build/html/index.html`` to read the docs and +review any changes you made. diff --git a/docs/source/examples/clean.rst b/docs/source/examples/clean.rst new file mode 100644 index 0000000..89de5ea --- /dev/null +++ b/docs/source/examples/clean.rst @@ -0,0 +1,26 @@ +.. 
Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. See the NOTICE file +.. distributed with this work for additional information +.. regarding copyright ownership. The ASF licenses this file +.. to you under the Apache License, Version 2.0 (the +.. "License"); you may not use this file except in compliance +.. with the License. You may obtain a copy of the License at + +.. http://www.apache.org/licenses/LICENSE-2.0 + +.. Unless required by applicable law or agreed to in writing, +.. software distributed under the License is distributed on an +.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +.. KIND, either express or implied. See the License for the +.. specific language governing permissions and limitations +.. under the License. + +.. default-domain:: cpp +.. highlight:: cpp + +============== +Clean Example +============== + +The file ``examples/clean_demo.cpp``, located inside the source tree, is an +example of building and using Paimon from a third-party project. diff --git a/docs/source/examples/index.rst b/docs/source/examples/index.rst new file mode 100644 index 0000000..c26b749 --- /dev/null +++ b/docs/source/examples/index.rst @@ -0,0 +1,25 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. See the NOTICE file +.. distributed with this work for additional information +.. regarding copyright ownership. The ASF licenses this file +.. to you under the Apache License, Version 2.0 (the +.. "License"); you may not use this file except in compliance +.. with the License. You may obtain a copy of the License at + +.. http://www.apache.org/licenses/LICENSE-2.0 + +.. Unless required by applicable law or agreed to in writing, +.. software distributed under the License is distributed on an +.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +.. KIND, either express or implied. See the License for the +..
specific language governing permissions and limitations +.. under the License. + +Examples +======== + +.. toctree:: + :maxdepth: 1 + + write_commit_scan_read + clean diff --git a/docs/source/examples/write_commit_scan_read.rst b/docs/source/examples/write_commit_scan_read.rst new file mode 100644 index 0000000..1549bb1 --- /dev/null +++ b/docs/source/examples/write_commit_scan_read.rst @@ -0,0 +1,26 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. See the NOTICE file +.. distributed with this work for additional information +.. regarding copyright ownership. The ASF licenses this file +.. to you under the Apache License, Version 2.0 (the +.. "License"); you may not use this file except in compliance +.. with the License. You may obtain a copy of the License at + +.. http://www.apache.org/licenses/LICENSE-2.0 + +.. Unless required by applicable law or agreed to in writing, +.. software distributed under the License is distributed on an +.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +.. KIND, either express or implied. See the License for the +.. specific language governing permissions and limitations +.. under the License. + +.. default-domain:: cpp +.. highlight:: cpp + +============================== +Write Commit Scan Read Example +============================== + +The file ``examples/read_write_demo.cpp``, located inside the source tree, is +an example of building and using Paimon from a third-party project. diff --git a/docs/source/getting_started.rst b/docs/source/getting_started.rst new file mode 100644 index 0000000..4c5d1a3 --- /dev/null +++ b/docs/source/getting_started.rst @@ -0,0 +1,37 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. See the NOTICE file +.. distributed with this work for additional information +.. regarding copyright ownership. The ASF licenses this file +..
to you under the Apache License, Version 2.0 (the +.. "License"); you may not use this file except in compliance +.. with the License. You may obtain a copy of the License at + +.. http://www.apache.org/licenses/LICENSE-2.0 + +.. Unless required by applicable law or agreed to in writing, +.. software distributed under the License is distributed on an +.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +.. KIND, either express or implied. See the License for the +.. specific language governing permissions and limitations +.. under the License. + +.. default-domain:: cpp +.. highlight:: cpp + +Getting Started +=============== + +The following articles demonstrate installation, usage, and a basic +understanding of C++ Paimon. These articles will get you set up quickly with C++ +Paimon and give you a taste of what the library is capable of. + +Start here to gain a basic understanding of Paimon, and move on to the +:doc:`/user_guide` to explore more specific topics and +underlying concepts, or the :doc:`/api` to explore Paimon's API. + +.. toctree:: + + building + build_system + basic_concepts + documentations diff --git a/docs/source/index.rst b/docs/source/index.rst new file mode 100644 index 0000000..c595d45 --- /dev/null +++ b/docs/source/index.rst @@ -0,0 +1,108 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. See the NOTICE file +.. distributed with this work for additional information +.. regarding copyright ownership. The ASF licenses this file +.. to you under the Apache License, Version 2.0 (the +.. "License"); you may not use this file except in compliance +.. with the License. You may obtain a copy of the License at + +.. http://www.apache.org/licenses/LICENSE-2.0 + +.. Unless required by applicable law or agreed to in writing, +.. software distributed under the License is distributed on an +.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +.. KIND, either express or implied. 
See the License for the +.. specific language governing permissions and limitations +.. under the License. + +.. _implementations: + +C++ Paimon Documentation +======================== + +Paimon C++ is a high-performance C++ implementation of Apache Paimon. We aim to +provide a native, high-performance and extensible implementation that allows +native engines to access the Paimon datalake format with maximum efficiency. + +.. grid:: 1 2 2 2 + :gutter: 4 + :padding: 2 2 0 0 + :class-container: sd-text-center + + .. grid-item-card:: Getting started + :class-card: contrib-card + :shadow: none + + Start here to gain a basic understanding of Paimon with + an installation and linking guide, basic concepts, and more. + + +++ + + .. button-link:: getting_started.html + :click-parent: + :color: primary + :expand: + + To Getting started + + .. grid-item-card:: User Guide + :class-card: contrib-card + :shadow: none + + Explore more specific topics and underlying concepts + of Paimon C++ + + +++ + + .. button-link:: user_guide.html + :click-parent: + :color: primary + :expand: + + To the User Guide + +.. grid:: 1 2 2 2 + :gutter: 4 + :padding: 2 2 0 0 + :class-container: sd-text-center + + .. grid-item-card:: Examples + :class-card: contrib-card + :shadow: none + + Find the description and location of the examples + using the Paimon C++ library + + +++ + + .. button-link:: examples/index.html + :click-parent: + :color: primary + :expand: + + To the Examples + + .. grid-item-card:: API Reference + :class-card: contrib-card + :shadow: none + + Explore Paimon's API reference documentation + + +++ + + .. button-link:: api.html + :click-parent: + :color: primary + :expand: + + To the API Reference + + +.. toctree:: + :maxdepth: 2 + :hidden: + + getting_started + user_guide + Examples + api diff --git a/docs/source/user_guide.rst b/docs/source/user_guide.rst new file mode 100644 index 0000000..5b14faa --- /dev/null +++ b/docs/source/user_guide.rst @@ -0,0 +1,40 @@ +..
Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. See the NOTICE file +.. distributed with this work for additional information +.. regarding copyright ownership. The ASF licenses this file +.. to you under the Apache License, Version 2.0 (the +.. "License"); you may not use this file except in compliance +.. with the License. You may obtain a copy of the License at + +.. http://www.apache.org/licenses/LICENSE-2.0 + +.. Unless required by applicable law or agreed to in writing, +.. software distributed under the License is distributed on an +.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +.. KIND, either express or implied. See the License for the +.. specific language governing permissions and limitations +.. under the License. + +.. default-domain:: cpp +.. highlight:: cpp + +User Guide +========== + +.. toctree:: + + user_guide/catalog + user_guide/schema + user_guide/snapshot + user_guide/manifest + user_guide/data_types + user_guide/primary_key_table + user_guide/append_only_table + user_guide/write + user_guide/commit + user_guide/compaction + user_guide/read + user_guide/clean + user_guide/prefetch + user_guide/arrow + user_guide/global_index diff --git a/docs/source/user_guide/append_only_table.rst b/docs/source/user_guide/append_only_table.rst new file mode 100644 index 0000000..128053f --- /dev/null +++ b/docs/source/user_guide/append_only_table.rst @@ -0,0 +1,26 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. See the NOTICE file +.. distributed with this work for additional information +.. regarding copyright ownership. The ASF licenses this file +.. to you under the Apache License, Version 2.0 (the +.. "License"); you may not use this file except in compliance +.. with the License. You may obtain a copy of the License at + +.. http://www.apache.org/licenses/LICENSE-2.0 + +.. 
Unless required by applicable law or agreed to in writing, +.. software distributed under the License is distributed on an +.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +.. KIND, either express or implied. See the License for the +.. specific language governing permissions and limitations +.. under the License. + +.. Borrowed the file from Apache Paimon: +.. https://github.com/apache/paimon/blob/master/docs/content/append-table/overview.md + +Append Only Table +================= +If a table does not have a primary key defined, it is an append table. Compared +to a primary key table, it cannot directly receive changelogs, +and it cannot be updated directly through upsert. It can only receive +appended data. diff --git a/docs/source/user_guide/arrow.rst b/docs/source/user_guide/arrow.rst new file mode 100644 index 0000000..084b156 --- /dev/null +++ b/docs/source/user_guide/arrow.rst @@ -0,0 +1,125 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. See the NOTICE file +.. distributed with this work for additional information +.. regarding copyright ownership. The ASF licenses this file +.. to you under the Apache License, Version 2.0 (the +.. "License"); you may not use this file except in compliance +.. with the License. You may obtain a copy of the License at + +.. http://www.apache.org/licenses/LICENSE-2.0 + +.. Unless required by applicable law or agreed to in writing, +.. software distributed under the License is distributed on an +.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +.. KIND, either express or implied. See the License for the +.. specific language governing permissions and limitations +.. under the License. + +.. _memory-format: + +Memory Format +============= + +`Paimon Java `_ uses a row-level +abstraction, ``InternalRow``, as the default data format interface.
+Row-level interfaces are generally more intuitive for programming and can be +easily integrated into data processing pipelines such as filtering and merge sorting. +However, this approach introduces conversion and access overhead, +making it difficult to fully leverage the additional performance benefits provided +by modern CPU SIMD vectorization. + +Considering that the C++ implementation focuses more on end-to-end performance, +and that underlying data file formats (e.g., ORC, Parquet) are predominantly +columnar, providing a columnar-centric data abstraction can minimize overhead +during data flow, integrate more naturally with vectorized execution engines, +and ultimately deliver superior overall performance. + +Why Apache Arrow +---------------- + +Apache Arrow is currently the most widely adopted in-memory columnar format and +has strong native support for Parquet and ORC. It is well-integrated across +open-source engines including Spark, Pandas, Drill, Impala, and Velox. + +Overall, using Apache Arrow as the in-memory format for Paimon C++ allows us to: + +- Maximize columnar performance and efficient data access patterns. +- Seamlessly integrate with the Apache Arrow ecosystem and tooling. +- Benefit from mature interoperability with popular data systems and formats. + +Versioning and Dependency Concerns +---------------------------------- + +One important consideration is that Arrow is an active open-source project with +a broad and evolving C++ surface area that spans data formats, compute kernels, +and file I/O. Due to frequent releases and a large API surface, different Arrow +C++ SDK versions can introduce API incompatibilities. + +If Paimon C++ directly depends on the full Arrow C++ SDK, it may conflict with +existing Arrow C++ dependencies in other compute engines, raising integration +costs and increasing long-term maintenance complexity. 
+ +Adopting the Arrow C Data Interface +----------------------------------- + +To leverage Arrow’s performance and ecosystem benefits while avoiding tight +coupling to specific Arrow C++ SDK versions, we use the Arrow C Data +Interface as the default in-memory format for Paimon C++. + +Key advantages: + +- Version-neutral: The C Data Interface is designed to be stable and forward-compatible + across Arrow versions. +- Compiler-neutral: C language interfaces avoid ABI friction commonly seen with + C++ compilers and standard libraries. +- Broad interoperability: The C Data Interface is supported by Arrow-based + systems and enables zero-copy or minimal-copy interchange of columnar data. + +Design Principles +----------------- + +- Columnar-first abstraction: + Paimon C++ will represent in-memory data using columnar buffers and schemas + compatible with the Arrow C Data Interface, minimizing transformation overhead. + +- Minimal dependency footprint: + Prefer stable C interfaces and lightweight utility layers; avoid linking + against the full Arrow C++ SDK unless strictly necessary and well-isolated. + +- Vectorization-aware execution: + Structure data layouts to align with SIMD processing (e.g., contiguous + buffers, clear null bitmaps and type-specific arrays), enabling efficient + filtering, projection, and aggregation. + +- Interoperability: + Ensure that data produced and consumed by Paimon C++ can be handed off to + Arrow-compatible engines and libraries without expensive conversions. + +- Compatibility with columnar storage: + Maintain efficient paths from columnar file formats (Parquet, ORC) to in-memory + columnar representations, minimizing decoding and marshaling overhead. + +Implementation Outline +---------------------- + +Schema and buffers +~~~~~~~~~~~~~~~~~~~ +- Represent schemas and arrays using Arrow C Data Interface types (e.g., ``ArrowSchema``, ``ArrowArray``) with clear ownership and lifecycle. 
+- Support nested types (structs, lists, maps) and common primitives (integers, floats, decimals, timestamps). + +Memory management +~~~~~~~~~~~~~~~~~~~~ +- Define consistent ownership semantics for buffers and child arrays. +- Employ reference counting or explicit release callbacks aligned with the Arrow C conventions. + +Nullability and validity +~~~~~~~~~~~~~~~~~~~~~~~~ +- Use standard validity bitmaps for nullability and adhere to Arrow’s canonical buffer organization (validity, offsets, data, etc.). + +Conversion boundaries +~~~~~~~~~~~~~~~~~~~~~~ +- Provide adapters to + * Read from columnar file formats (Parquet/ORC) into Arrow-compatible ``ArrowArray`` structures. + * Export/import data to other Arrow-compatible engines with zero or minimal copies. + +- Keep these adapters independent from heavy Arrow C++ SDK dependencies. diff --git a/docs/source/user_guide/catalog.rst b/docs/source/user_guide/catalog.rst new file mode 100644 index 0000000..8c435c1 --- /dev/null +++ b/docs/source/user_guide/catalog.rst @@ -0,0 +1,37 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. See the NOTICE file +.. distributed with this work for additional information +.. regarding copyright ownership. The ASF licenses this file +.. to you under the Apache License, Version 2.0 (the +.. "License"); you may not use this file except in compliance +.. with the License. You may obtain a copy of the License at + +.. http://www.apache.org/licenses/LICENSE-2.0 + +.. Unless required by applicable law or agreed to in writing, +.. software distributed under the License is distributed on an +.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +.. KIND, either express or implied. See the License for the +.. specific language governing permissions and limitations +.. under the License. + +.. 
_catalog: + +Catalog +========================== +C++ Paimon provides a :ref:`Catalog abstraction ` to manage the table of contents and metadata. The Catalog +abstraction provides a series of ways to help you better integrate with computing engines. We always +recommend that you use Catalog to access the Paimon table. + +Filesystem Catalog +~~~~~~~~~~~~~~~~~~ +The C++ Paimon catalog currently supports one type of metastore: the filesystem metastore (default), +which stores both metadata and table files in filesystems. + +.. note:: + + C++ Paimon currently supports only the filesystem catalog. In the future, we will + support the REST catalog. + With the Paimon REST catalog, changes to the catalog are stored directly + in a remote catalog server which is exposed through a REST API. See `Java Paimon REST + Catalog `_. diff --git a/docs/source/user_guide/clean.rst b/docs/source/user_guide/clean.rst new file mode 100644 index 0000000..05e2645 --- /dev/null +++ b/docs/source/user_guide/clean.rst @@ -0,0 +1,162 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. See the NOTICE file +.. distributed with this work for additional information +.. regarding copyright ownership. The ASF licenses this file +.. to you under the Apache License, Version 2.0 (the +.. "License"); you may not use this file except in compliance +.. with the License. You may obtain a copy of the License at + +.. http://www.apache.org/licenses/LICENSE-2.0 + +.. Unless required by applicable law or agreed to in writing, +.. software distributed under the License is distributed on an +.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +.. KIND, either express or implied. See the License for the +.. specific language governing permissions and limitations +.. under the License. + +Data Cleanup +=================================== +..
note:: + + The data cleanup feature supports only append tables; cleanup of index manifests, changelogs, and statistics is not supported. Do not use this feature with primary key tables. + +This document describes three cleanup capabilities: + +- :ref:`orphan-file-cleanup` +- :ref:`expiring-partitions` +- :ref:`expiring-snapshots` + +.. _orphan-file-cleanup: + +Orphan File Cleanup +------------------- + +Description +~~~~~~~~~~~~~~~~~~~ + +- Orphan file cleanup currently supports only append tables. +- It runs as an independent task: construct a ``CleanContext`` and + launch an ``OrphanFilesCleaner``. + +Detailed Steps +~~~~~~~~~~~~~~ + +1. List all Paimon-specific subdirectories under the table directory + (e.g., ``manifest/``, ``snapshot/``, ``f1=10/bucket-0``, ...). +2. Based on the subdirectories from step 1, enumerate all Paimon files + in the table directory. +3. Using snapshot information, determine all in-use ``manifest`` files + and data files. +4. Compute the set of files that appear in step 2 but not in step 3 + (i.e., orphan files). Among those, delete files whose modification + time is earlier than ``older_than_ms``. + +Performance Considerations +~~~~~~~~~~~~~~~~~~~~~~~~~~ + +- Steps 1, 2, and 4 use an executor to parallelize I/O operations. +- Orphan file cleanup may take a long time; you can pass an executor + with more threads to accelerate the process. + +.. admonition:: TODO + :class: tip + + - Support cleanup of index manifests, changelog, statistics, etc. + +.. _expiring-partitions: + +Expiring Partitions +------------------- + +Description +~~~~~~~~~~~~~~~~~~~ + +- Executed within the Commit task. First construct a ``FileStoreCommit``. +- ``DropPartition`` uses a mark-delete strategy. Calling ``DropPartition`` + will send an ``Overwrite`` message, marking all data files in the + specified partition as ``DELETE``. +- A new snapshot is committed afterward. Actual deletion of data files + occurs as snapshots expire.
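The mark-delete strategy described above can be sketched as follows. This is a minimal illustration, not the paimon-cpp implementation: the ``FileKind`` enum, the ``ManifestEntry`` struct with its fields, and the ``MarkPartitionDeleted`` function are hypothetical names introduced here, and a partition is simplified to a plain string.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified types for illustration only.
enum class FileKind : int8_t { kAdd, kDelete };

struct ManifestEntry {
    FileKind kind;
    std::string partition;  // simplified, e.g. "f1=10"
    std::string file_name;
};

// Builds one DELETE entry for every live file in `partition`.
// Nothing is removed from storage here; the returned entries would be
// committed as a new snapshot, and physical deletion is left to
// snapshot expiration.
std::vector<ManifestEntry> MarkPartitionDeleted(const std::vector<ManifestEntry>& live_entries,
                                                const std::string& partition) {
    std::vector<ManifestEntry> deletes;
    for (const ManifestEntry& entry : live_entries) {
        if (entry.partition == partition) {
            deletes.push_back({FileKind::kDelete, entry.partition, entry.file_name});
        }
    }
    return deletes;
}
```

Files in other partitions are left out of the result, so committing the returned entries only affects the dropped partition.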
+ +Detailed Steps +~~~~~~~~~~~~~~ + +1. Build a ``ScanFilter`` for the partition and use the latest snapshot + to scan the partition. +2. Iterate over the scanned data file list (``ManifestEntries``) and + rewrite each entry’s type to ``DELETE``. +3. Commit using the rewritten ``ManifestEntries``. If the commit fails, + retry a limited number of times. + +.. _expiring-snapshots: + +Expiring Snapshots +------------------ + +Description +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +- Executed within the Commit task. First construct a ``FileStoreCommit``. +- The following optional configuration parameters control snapshot + expiration: + + - ``snapshot.num-retained.min``: The minimum number of completed + snapshots to retain (>= 1). Default: 10. + - ``snapshot.num-retained.max``: The maximum number of completed + snapshots to retain (>= ``snapshot.num-retained.min``). Default: + ``int32`` max value. + - ``snapshot.time-retained``: The maximum age of completed snapshots + to retain. Default: 1 hour. + - ``snapshot.expire.limit``: The maximum number of snapshots allowed + to expire at a time. Default: 10. + +- The snapshot expiration interface deletes data files according to the + expiration policy and returns the number of snapshots deleted. + +Detailed Steps +~~~~~~~~~~~~~~ + +1. Use ``snapshot_manager`` to find ``earliest_snapshot_id`` and + ``latest_snapshot_id``. +2. Based on ``earliest/latest_snapshot_id`` and the expire config, + determine the range of snapshots to clean. + +.. note:: + + Consumer subscription management (consumer manager) is not + currently supported. Users must ensure that snapshots to be expired + are not in use. + +3. Verify that the snapshot range is continuous. Normally, it is + continuous. If a snapshot is missing, assume earlier snapshots were + already cleaned and the missing files are orphaned remnants due to + I/O exceptions; they are out of scope for this cleanup. +4. Clean data files for the updated expiration range. 
+ + - To decide whether a file from a snapshot should be deleted, check if it was marked ``DELETE`` in the delta of the subsequent snapshot. + - For an expiration range ``[begin, end)``, iterate over ``(begin,end]`` and delete data files whose type is ``DELETE`` in each ``snapshot.DeltaManifestList()``. + - If a file underwent multiple ``ADD`` and ``DELETE`` operations, deletion follows the operation order: + + * ``ADD`` then ``DELETE`` → the file is deleted. + * ``DELETE`` then ``ADD`` → the initial ``DELETE`` does not apply (file did not exist yet); the subsequent ``ADD`` ensures the file remains. + +5. Clean meta files: + - Preserve manifests used by the last snapshot in the cleanup range (``end_exclusive_id``). + - Delete manifest files used by snapshots from ``begin_inclusive_id`` to ``end_exclusive_id`` (exclusive) and delete the snapshot files themselves. +6. Rewrite ``EarliestHint`` to ``end_exclusive_id``. +7. Return the number of snapshots deleted, i.e., ``end_exclusive_id - begin_inclusive_id``. + +Performance Considerations +~~~~~~~~~~~~~~~~~~~~~~~~~~ + +- Step 4 uses an executor to parallelize file deletions. +- If deletion is slow, pass an executor with more threads to accelerate + the process. + +.. admonition:: TODO + :class: tip + + - Preserve tag (savepoint) data via ``tagManager``. + - Delete changelog files. + - Remove empty directories. diff --git a/docs/source/user_guide/commit.rst b/docs/source/user_guide/commit.rst new file mode 100644 index 0000000..e833991 --- /dev/null +++ b/docs/source/user_guide/commit.rst @@ -0,0 +1,126 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. See the NOTICE file +.. distributed with this work for additional information +.. regarding copyright ownership. The ASF licenses this file +.. to you under the Apache License, Version 2.0 (the +.. "License"); you may not use this file except in compliance +.. with the License. 
You may obtain a copy of the License at + +.. http://www.apache.org/licenses/LICENSE-2.0 + +.. Unless required by applicable law or agreed to in writing, +.. software distributed under the License is distributed on an +.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +.. KIND, either express or implied. See the License for the +.. specific language governing permissions and limitations +.. under the License. + +Commit +========================================== + +Commit is a critical stage in Paimon’s write path. It is responsible for generating +Snapshot files that describe the current state of a Paimon table. This document +provides a detailed analysis of the Paimon Commit process. + +Commit Process Overview +-------------------------- + +The input to Commit is a ``CommitMessage``, which is produced by the write +operation through ``PrepareCommit``. It records all data files generated during +the write phase. + +The Commit process consists of the following steps: + +1. Collect file changes +2. Compact (merge) Manifest files +3. Generate the Base Manifest List +4. Generate new Manifest files and the Delta Manifest List +5. Generate the Snapshot and HINT file + +Detailed Process +------------------- + +Collect File Changes +~~~~~~~~~~~~~~~~~~~~~~~~ + +During Commit, the system extracts key information from ``CommitMessages``—such as +file name, operation type (``ADD`` or ``DELETE``), the file’s Partition and Bucket— +and converts them into ``ManifestEntry`` records. + +A ``ManifestEntry`` represents a single operation record in a manifest file and +corresponds to a change to one file. + +Paimon snapshots track two manifest list files: + +- Base Manifest List: Describes the data that existed prior to the current Snapshot. + Because there may be multiple manifest files, the base manifest list records + metadata for all original manifest files. +- Delta Manifest List: Records the changes (adds/deletes) produced by the current Commit. 
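As a rough sketch of the "collect file changes" step, the conversion from commit messages to manifest entries might look like the code below. The ``CommitMessage`` and ``ManifestEntry`` layouts shown here are hypothetical simplifications (a string partition, plain file-name lists), not the actual paimon-cpp types.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified types for illustration only.
enum class FileKind : int8_t { kAdd, kDelete };

struct CommitMessage {
    std::string partition;
    int32_t bucket;
    std::vector<std::string> new_files;      // files written by this commit
    std::vector<std::string> deleted_files;  // files this commit removes
};

struct ManifestEntry {
    FileKind kind;
    std::string partition;
    int32_t bucket;
    std::string file_name;
};

// Flattens every file change in `messages` into one ManifestEntry per file.
std::vector<ManifestEntry> CollectChanges(const std::vector<CommitMessage>& messages) {
    std::vector<ManifestEntry> entries;
    for (const CommitMessage& message : messages) {
        for (const std::string& file : message.new_files) {
            entries.push_back({FileKind::kAdd, message.partition, message.bucket, file});
        }
        for (const std::string& file : message.deleted_files) {
            entries.push_back({FileKind::kDelete, message.partition, message.bucket, file});
        }
    }
    return entries;
}
```

In this picture, the resulting entries are what a commit would later write into new manifest files and reference from the delta manifest list.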
+ +Compact (Merge) Manifest Files +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +To control the number and size of manifest files, the system determines whether +existing manifest files should be compacted prior to generating a new Snapshot. + +Compaction starts by fetching the latest Snapshot and using its base and delta +manifest lists to locate all relevant manifest files. + +Two compaction strategies are used: Full Compaction and Minor Compaction. + +Full Compaction +^^^^^^^^^^^^^^^ + +Full Compaction is attempted first. The system iterates over candidate files and +classifies them as follows: + +- Base files: If a file has no ``DELETE`` operations and its size exceeds the + target file size (default 8 MB), the file is categorized as base. +- Delta files: Remaining files are categorized as delta. The system computes the + total size of delta files; if the total exceeds the Full Compaction threshold + (default 16 MB), the delta files are merged. + +Minor Compaction +^^^^^^^^^^^^^^^^ + +If Full Compaction’s conditions are not met, Minor Compaction is attempted: + +- The system iterates over all files, skipping any file larger than the target file size. +- Whenever the accumulated size of selected files exceeds the target file size, those files are merged. +- If there are still unmerged files and their count exceeds the minimum compaction trigger + threshold (default 30 files), a merge is triggered. + +Compaction Rules +^^^^^^^^^^^^^^^^ + +1. If duplicate ``ADD`` operations for the same file are discovered, an error is raised. +2. ``ADD`` and ``DELETE`` for the same file neutralize each other. + +Generate the Base Manifest List +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +After compaction (which may or may not be triggered every time), the system obtains +a consolidated set of manifest file metadata. This metadata is written into a new +manifest list file, forming the Snapshot’s base manifest list. 
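The two compaction rules listed earlier (a duplicate ``ADD`` is an error; ``ADD`` and ``DELETE`` for the same file cancel out) can be sketched as a single pass over the entries. The ``Entry`` struct and ``MergeEntries`` function are hypothetical names, a file is identified by name only, and errors are reported with an exception purely to keep the sketch short.

```cpp
#include <cstdint>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical, simplified types for illustration only.
enum class FileKind : int8_t { kAdd, kDelete };

struct Entry {
    FileKind kind;
    std::string file_name;
};

// Merges entries in order and returns the surviving ones, sorted by file name.
std::vector<Entry> MergeEntries(const std::vector<Entry>& entries) {
    std::map<std::string, Entry> merged;
    for (const Entry& entry : entries) {
        auto it = merged.find(entry.file_name);
        if (entry.kind == FileKind::kAdd) {
            // Simplifying assumption: any ADD for an already tracked file is rejected.
            if (it != merged.end()) {
                throw std::invalid_argument("file already added: " + entry.file_name);
            }
            merged.emplace(entry.file_name, entry);
        } else if (it != merged.end() && it->second.kind == FileKind::kAdd) {
            // ADD followed by DELETE: the two operations neutralize each other.
            merged.erase(it);
        } else if (it == merged.end()) {
            // DELETE of a file added by an earlier manifest: keep the record.
            merged.emplace(entry.file_name, entry);
        }
    }
    std::vector<Entry> result;
    for (const auto& name_and_entry : merged) {
        result.push_back(name_and_entry.second);
    }
    return result;
}
```

A ``DELETE`` whose matching ``ADD`` is not among the inputs is kept, since the file it refers to may live in a manifest that was not part of this merge.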
+ +Generate New Manifest Files and the Delta Manifest List +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The initially collected file change information is written into new manifest files. +Metadata for these newly created manifest files is then written into the delta +manifest list. + +Generate the Snapshot and HINT File +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +After the above steps are completed, the system generates a new Snapshot and performs +the following operations: + +1. Determine the new Snapshot ID based on the latest ``SnapshotId + 1``. +2. Record metadata such as ``schema id``, ``commit time``, and ``total record count``. +3. Atomicity guarantee: Generating a Snapshot is an atomic operation. If an exception + (e.g., an I/O error) occurs during the process, the manifest files and manifest list files + generated in Steps 2–4 are cleaned up and removed. +4. The Snapshot is written via a rename operation to ensure atomicity. +5. After the Snapshot is successfully written, the system writes the ``LATEST`` hint file + to reduce list operations when fetching the latest Snapshot. diff --git a/docs/source/user_guide/compaction.rst b/docs/source/user_guide/compaction.rst new file mode 100644 index 0000000..c2404cd --- /dev/null +++ b/docs/source/user_guide/compaction.rst @@ -0,0 +1,216 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. See the NOTICE file +.. distributed with this work for additional information +.. regarding copyright ownership. The ASF licenses this file +.. to you under the Apache License, Version 2.0 (the +.. "License"); you may not use this file except in compliance +.. with the License. You may obtain a copy of the License at + +.. http://www.apache.org/licenses/LICENSE-2.0 + +.. Unless required by applicable law or agreed to in writing, +.. software distributed under the License is distributed on an +.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +.. 
KIND, either express or implied. See the License for the +.. specific language governing permissions and limitations +.. under the License. + +Compaction +========== +Compaction is the process of merging multiple small data files into fewer, larger +files. It is a resource-intensive procedure which consumes CPU time and disk IO, +so overly frequent compaction may result in slower writes. However, without +compaction, the accumulation of small files degrades query performance. Tuning +compaction is therefore a trade-off between write throughput and read efficiency. + +.. note:: + - Only one job may compact a given partition at a time; + otherwise conflicts will occur. + - C++ Paimon does not currently support producing changelogs. + - Compaction is disabled when ``write-only`` is set to ``true``, or when the + table uses dynamic bucketing (``bucket = -1``) for append-only tables. + - For a complete list of compaction-related configurations, see the + :ref:`Options API Reference `. + +Append-Only Table Compaction +---------------------------- +In an append-only table, data files are simply appended in sequence order. +Over time, many small files accumulate, which degrades read performance due to the +overhead of opening and scanning numerous files. + +Append-only table compaction merges multiple small files into fewer, larger files +to improve read efficiency. The compaction is performed asynchronously and does +not block writes. + +.. note:: + Append-only table compaction is only available for fixed-bucket mode + (``bucket > 0``). Dynamic bucketing (``bucket = -1``) does not support + compaction. Tables with blob columns also skip compaction. + +Auto Compaction +~~~~~~~~~~~~~~~ +During each flush, the writer triggers a best-effort auto compaction. The +compaction picker scans the file queue ordered by sequence number and selects a +contiguous window of files for merging when the number of candidate files reaches +the ``compaction.min.file-num`` threshold.
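One way to picture the auto compaction picker is a sliding window over the sequence-ordered queue. The sketch below is an assumption-laden illustration rather than the actual picker: ``DataFile``, ``PickCompactionWindow``, and the ``target_size`` parameter are hypothetical, and the rule for sliding the window (drop the oldest file once the window is already large enough in bytes) is a guess at a reasonable policy. Only the ``min_file_num`` trigger corresponds to the documented ``compaction.min.file-num`` option.

```cpp
#include <cstdint>
#include <deque>
#include <optional>
#include <vector>

// Hypothetical, simplified file metadata for illustration only.
struct DataFile {
    int64_t sequence_number;
    int64_t size_bytes;
};

// Scans `queue` (ordered by sequence number) and returns the first contiguous
// window holding `min_file_num` files, or std::nullopt if no window qualifies.
std::optional<std::vector<DataFile>> PickCompactionWindow(const std::deque<DataFile>& queue,
                                                          int64_t target_size,
                                                          int32_t min_file_num) {
    std::deque<DataFile> window;
    int64_t total_size = 0;
    for (const DataFile& file : queue) {
        window.push_back(file);
        total_size += file.size_bytes;
        if (static_cast<int32_t>(window.size()) >= min_file_num) {
            return std::vector<DataFile>(window.begin(), window.end());
        }
        if (total_size >= target_size) {
            // Assumed policy: the window is big enough in bytes but too short
            // in file count, so slide it forward by dropping the oldest file.
            total_size -= window.front().size_bytes;
            window.pop_front();
        }
    }
    return std::nullopt;
}
```

With this policy a single large file at the head of the queue is skipped rather than rewritten, and the window stays contiguous in sequence order.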
+ +Full Compaction +~~~~~~~~~~~~~~~ +Full compaction rewrites all eligible files in the bucket. During full +compaction: + +- Files whose size is already at or above ``compaction.file-size`` (and have no + associated deletion vectors) are skipped to avoid unnecessary rewrites. +- When deletion vectors are enabled, all files are always eligible for + compaction regardless of size, because deletion vectors must be applied. +- When ``compaction.force-rewrite-all-files`` is ``true``, all files are + rewritten unconditionally. +- Without deletion vectors, full compaction only proceeds when the number of + small files exceeds the number of large files and the total file count is at + least 3. + +After compaction, if the last output file is still smaller than +``compaction.file-size``, it is placed back into the compaction queue for future +merging. + +Append-Only Table Compaction Options +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +.. list-table:: + :header-rows: 1 + :widths: 30 10 10 10 40 + + * - Option + - Required + - Default + - Type + - Description + * - ``compaction.min.file-num`` + - No + - 5 + - Integer + - The minimum number of files to trigger an auto compaction for + append-only tables. + + +Primary Key Table Compaction +---------------------------- +Primary key tables use an LSM tree (log-structured merge-tree) for file storage. +When more and more records are written, the number of sorted runs increases. +Because querying an LSM tree requires all sorted runs to be combined, too many +sorted runs will result in poor query performance, or even out of memory. + +To limit the number of sorted runs, several sorted runs are merged into one big +sorted run once in a while. Paimon currently adopts a compaction strategy similar +to RocksDB's `universal compaction +`_. + +Primary key table compaction solves: + +- Reduce Level 0 files to avoid poor query performance. +- Produce deletion vectors for MOW mode. 
 + +Full Compaction +~~~~~~~~~~~~~~~ +Paimon uses Universal Compaction. By default, when there is too much incremental +data, Full Compaction will be automatically performed. You don't usually have to +worry about it. + +Paimon also provides configurations that allow for regular execution of Full +Compaction: + +- ``compaction.optimization-interval``: How often to perform an + optimization full compaction. This configuration is used to ensure the query + timeliness of the read-optimized system table. +- ``compaction.total-size-threshold``: Full compaction is always triggered + when the total size is smaller than this threshold. +- ``compaction.incremental-size-threshold``: Full compaction is always + triggered when the incremental size is bigger than this threshold. + +Lookup Compaction +~~~~~~~~~~~~~~~~~ +When a primary key table is configured with ``lookup`` changelog producer or +``first-row`` merge engine or has enabled deletion vectors for MOW mode, Paimon +will use a radical compaction strategy to force compacting level 0 files to +higher levels for every compaction trigger. + +Paimon also provides configurations to optimize the frequency of this +compaction: + +- ``lookup-compact``: compact mode used for lookup compaction. Possible values: + + * ``radical``: will use ``ForceUpLevel0Compaction`` strategy to radically + compact new files. + * ``gentle``: will use ``UniversalCompaction`` strategy to gently compact new + files. + +- ``lookup-compact.max-interval``: The max interval for a forced L0 lookup + compaction to be triggered in ``gentle`` mode. This option is only valid when + ``lookup-compact`` mode is ``gentle``. + +By configuring ``lookup-compact`` as ``gentle``, new files in L0 will not be +compacted immediately. This may greatly reduce the overall resource usage at the +expense of worse data freshness in certain cases.
 + +Primary Key Table Compaction Options +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Number of Sorted Runs to Pause Writing +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +When the number of sorted runs is small, Paimon writers will perform compaction +asynchronously in separate threads, so records can be continuously written into +the table. However, to avoid unbounded growth of sorted runs, writers will pause +writing when the number of sorted runs hits the threshold. + +.. list-table:: + :header-rows: 1 + :widths: 30 10 10 10 40 + + * - Option + - Required + - Default + - Type + - Description + * - ``num-sorted-run.stop-trigger`` + - No + - (none) + - Integer + - The number of sorted runs that trigger the stopping of writes. The + default value is ``num-sorted-run.compaction-trigger + 3``. + +Write stalls will become less frequent when ``num-sorted-run.stop-trigger`` +becomes larger, thus improving writing performance. However, if this value +becomes too large, more memory and CPU time will be needed when querying the +table. + +Number of Sorted Runs to Trigger Compaction +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +Paimon uses an LSM tree, which supports a large number of updates. An LSM tree organizes +files in several sorted runs. When querying records from an LSM tree, all sorted +runs must be combined to produce a complete view of all records. + +One can easily see that too many sorted runs will result in poor query +performance. To keep the number of sorted runs in a reasonable range, Paimon +writers will automatically perform compactions. The following table property +determines the minimum number of sorted runs to trigger a compaction. + +.. list-table:: + :header-rows: 1 + :widths: 30 10 10 10 40 + + * - Option + - Required + - Default + - Type + - Description + * - ``num-sorted-run.compaction-trigger`` + - No + - 5 + - Integer + - The sorted run number to trigger compaction.
Includes level 0 files (one + file one sorted run) and high-level runs (one level one sorted run). + +Compaction will become less frequent when ``num-sorted-run.compaction-trigger`` +becomes larger, thus improving writing performance. However, if this value +becomes too large, more memory and CPU time will be needed when querying the +table. This is a trade-off between writing and query performance. diff --git a/docs/source/user_guide/data_types.rst b/docs/source/user_guide/data_types.rst new file mode 100644 index 0000000..5759119 --- /dev/null +++ b/docs/source/user_guide/data_types.rst @@ -0,0 +1,212 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. See the NOTICE file +.. distributed with this work for additional information +.. regarding copyright ownership. The ASF licenses this file +.. to you under the Apache License, Version 2.0 (the +.. "License"); you may not use this file except in compliance +.. with the License. You may obtain a copy of the License at + +.. http://www.apache.org/licenses/LICENSE-2.0 + +.. Unless required by applicable law or agreed to in writing, +.. software distributed under the License is distributed on an +.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +.. KIND, either express or implied. See the License for the +.. specific language governing permissions and limitations +.. under the License. + +Data Types +========== + +A data type describes the logical type of a value in the table ecosystem. It can +be used to declare input and/or output types of operations. + +The data types supported by Java Paimon are listed below. C++ Paimon uses Apache Arrow as +its schema representation. The following table shows the mapping between `Java +Paimon DataTypes `_ +and `Arrow DataTypes `_: + +.. 
list-table:: + :header-rows: 1 + :widths: 10 10 30 + + * - Java Paimon DataType + - Arrow DataType + - Description + + * - ``BOOLEAN`` + - Boolean + - Data type of a boolean with a (possibly) three-valued logic of TRUE, FALSE, and UNKNOWN. + + * - ``CHAR`` + + ``CHAR(n)`` + - Not Supported + - Data type of a fixed-length character string. + + The type can be declared using ``CHAR(n)`` where n is the number of code + points. n must have a value between 1 and 2,147,483,647 (both inclusive). + If no length is specified, n is equal to 1. + + * - ``VARCHAR`` + + ``VARCHAR(n)`` + + - Not Supported + - Data type of a variable-length character string. + + The type can be declared using ``VARCHAR(n)`` where n is the maximum + number of code points. n must have a value between 1 and 2,147,483,647 + (both inclusive). If no length is specified, n is equal to 1. + + * - ``STRING`` + - Utf8 + - Data type of a variable-length character string. ``STRING`` is a synonym for ``VARCHAR(2147483647)``. + + * - ``BINARY`` + + ``BINARY(n)`` + - Not Supported + - Data type of a fixed-length binary string (=a sequence of bytes). + + The type can be declared using ``BINARY(n)`` where n is the number of + bytes. n must have a value between 1 and 2,147,483,647 (both inclusive). + If no length is specified, n is equal to 1. + + * - ``VARBINARY`` + + ``VARBINARY(n)`` + - Not Supported + - Data type of a variable-length binary string (=a sequence of bytes). + + The type can be declared using ``VARBINARY(n)`` where n is the maximum + number of bytes. n must have a value between 1 and 2,147,483,647 + (both inclusive). If no length is specified, n is equal to 1. + + * - ``BYTES`` + - Binary + - ``BYTES`` is a synonym for ``VARBINARY(2147483647)``. + + * - ``DECIMAL`` + + ``DECIMAL(p)`` + + ``DECIMAL(p, s)`` + - Decimal128 + - Data type of a decimal number with fixed precision and scale. 
+ + The type can be declared using ``DECIMAL(p, s)`` where p is the number of + digits in a number (precision) and s is the number of digits to the right + of the decimal point in a number (scale). p must have a value between 1 + and 38 (both inclusive). s must have a value between 0 and p + (both inclusive). The default value for p is 10. The default value for s is 0. + + * - ``TINYINT`` + - Int8 + - Data type of a 1-byte signed integer with values from -128 to 127. + + * - ``SMALLINT`` + - Int16 + - Data type of a 2-byte signed integer with values from -32,768 to 32,767. + + * - ``INT`` + - Int32 + - Data type of a 4-byte signed integer with values from -2,147,483,648 to 2,147,483,647. + + * - ``BIGINT`` + - Int64 + - Data type of an 8-byte signed integer with values from -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807. + + * - ``FLOAT`` + - Float + - Data type of a 4-byte single precision floating point number. + + Compared to the SQL standard, the type does not take parameters. + + * - ``DOUBLE`` + - Double + - Data type of an 8-byte double precision floating point number. + + * - ``DATE`` + - Date32 + - Data type of a date consisting of year-month-day with values ranging from 0000-01-01 to 9999-12-31. + + Compared to the SQL standard, the range starts at year 0000. + + * - ``TIME`` + + ``TIME(p)`` + - Not Supported + - Data type of a time without time zone consisting of hour:minute:second[.fractional] with up to nanosecond precision and values ranging from 00:00:00.000000000 to 23:59:59.999999999. + + The type can be declared using ``TIME(p)`` where p is the number of digits of + fractional seconds (precision). p must have a value between 0 and 9 + (both inclusive). If no precision is specified, p is equal to 0. 
+ + * - ``TIMESTAMP`` + + ``TIMESTAMP(p)`` + - Timestamp + - Data type of a timestamp without time zone consisting of year-month-day hour:minute:second[.fractional] with up to nanosecond precision and values ranging from 0000-01-01 00:00:00.000000000 to 9999-12-31 23:59:59.999999999. + + The type can be declared using ``TIMESTAMP(p)`` where p is the number of + digits of fractional seconds (precision). p must have a value between 0 + and 9 (both inclusive). If no precision is specified, p is equal to 6. + + * - ``TIMESTAMP WITH LOCAL TIME ZONE`` + + ``TIMESTAMP(p) WITH LOCAL TIME ZONE`` + - Timestamp + - Data type of a timestamp with local time zone consisting of year-month-day hour:minute:second[.fractional] zone with up to nanosecond precision and values ranging from 0000-01-01 00:00:00.000000000 +14:59 to 9999-12-31 23:59:59.999999999 -14:59. + + This type fills the gap between time zone free and time zone mandatory + timestamp types by allowing the interpretation of UTC timestamps according + to the configured session time zone. A conversion from and to int describes + the number of seconds since epoch. A conversion from and to long describes the number of milliseconds since epoch. + + * - ``ARRAY<t>`` + - List + - Data type of an array of elements with the same subtype. + + Compared to the SQL standard, the maximum cardinality of an array cannot be specified but is fixed at 2,147,483,647. Also, any valid type is supported as a subtype. + + The type can be declared using ``ARRAY<t>`` where t is the data type of the contained elements. + + * - ``MAP<kt, vt>`` + - Map + - Data type of an associative array that maps keys to values (including NULL). A map cannot contain duplicate keys; each key can map to at most one value. + + There is no restriction of element types; it is the responsibility of the user to ensure uniqueness. + + The type can be declared using ``MAP<kt, vt>`` where kt is the data type of the key elements and vt is the data type of the value elements.
+ + **Note:** In C++ Paimon, map keys must be explicitly marked as ``NOT NULL``. + Apache Arrow does not support nullable map keys. If the key type is not + marked as ``NOT NULL`` in the schema, parsing will fail with an error. + + * - ``MULTISET<t>`` + - Not Supported + - Data type of a multiset (=bag). Unlike a set, it allows for multiple instances for each of its elements with a common subtype. Each unique value (including NULL) is mapped to some multiplicity. + + There is no restriction of element types; it is the responsibility of the user to ensure uniqueness. + + The type can be declared using ``MULTISET<t>`` where t is the data type of the contained elements. + + * - ``ROW<n0 t0, n1 t1, ...>`` + + ``ROW<n0 t0 'd0', n1 t1 'd1', ...>`` + - Struct + - Data type of a sequence of fields. + + A field consists of a field name, field type, and an optional description. + The most specific type of a row of a table is a row type. In this case, + each column of the row corresponds to the field of the row type that has + the same ordinal position as the column. + + Compared to the SQL standard, an optional field description simplifies + the handling of complex structures. + + A row type is similar to the ``STRUCT`` type known from other non-standard-compliant frameworks. + + The type can be declared using ``ROW<n0 t0 'd0', n1 t1 'd1', ...>`` where n + is the unique name of a field, t is the logical type of a field, d is the description of a field. diff --git a/docs/source/user_guide/global_index.rst b/docs/source/user_guide/global_index.rst new file mode 100644 index 0000000..5cb057a --- /dev/null +++ b/docs/source/user_guide/global_index.rst @@ -0,0 +1,95 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. See the NOTICE file +.. distributed with this work for additional information +.. regarding copyright ownership. The ASF licenses this file +.. to you under the Apache License, Version 2.0 (the +.. "License"); you may not use this file except in compliance +.. with the License.
You may obtain a copy of the License at + +.. http://www.apache.org/licenses/LICENSE-2.0 + +.. Unless required by applicable law or agreed to in writing, +.. software distributed under the License is distributed on an +.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +.. KIND, either express or implied. See the License for the +.. specific language governing permissions and limitations +.. under the License. + +Global Index +============ + +Global Index is a powerful indexing mechanism for append-only tables. +It enables efficient row-level lookups and filtering without full-table scans. +Paimon supports multiple global index types: + +- **Bitmap Index**: A bitmap-based index. Each distinct value is mapped to a compressed bitmap (RoaringBitmap) that records which rows contain that value, enabling extremely fast set membership tests. +- **BTree Index**: An efficient index based on multi-level SST files for scalar column lookups. +- **Range Bitmap Index**: A range bitmap index optimized for range predicates on ordered scalar columns. Extends the bitmap approach by encoding value ordering, enabling efficient less-than, greater-than, and range conditions. +- **Lucene Index**: A full-text search index powered by Lucene++. Supports tokenized text search with multiple modes including match-all, match-any, phrase, prefix, and wildcard queries. +- **Vector Index (Lumina)**: An approximate nearest neighbor (ANN) index powered by Lumina for vector similarity search with configurable distance metrics. + +Global indexes work on top of Data Evolution tables. To use global indexes, your table must have: + +- ``'bucket' = '-1'`` (unaware-bucket mode) +- ``'row-tracking.enabled' = 'true'`` +- ``'data-evolution.enabled' = 'true'`` + +Bitmap Index +------------ + +A bitmap-based index for Equal and In predicates. Each distinct value in the indexed column +is mapped to a compressed bitmap (RoaringBitmap) that records which rows contain that value. 
+This allows extremely fast set membership tests. + +BTree Index +----------- + +BTree is an efficient index based on multi-level SST files, supporting rich predicate pushdown, block cache, file-level min/max key pruning, lazy loading, and block compression. + +**Special Configuration:** + +- **Option**: ``btree-index.read-buffer-size`` + + - **Description**: Optional. Specifies the read buffer size for the B-tree index. This setting can be tuned based on query patterns: + + - For **range queries** (e.g., ``VisitLessThan``, ``VisitGreaterOrEqual``), increasing the buffer size (e.g., to 1MB) may improve I/O bandwidth and sequential read performance. + - For **point queries** (e.g., ``VisitEqual``), buffering can introduce negative effects due to read amplification; it is recommended to leave this option unset. + +Range Bitmap Index +------------------ + +A range bitmap index optimized for range predicates on ordered scalar columns. It extends the +bitmap approach by encoding value ordering information, enabling efficient evaluation of +less-than, greater-than, and range conditions without scanning all bitmaps. + + +Lucene Index +------------ + +A full-text search index powered by Lucene++. It supports tokenized text search with multiple +search modes including match-all, match-any, phrase, prefix, and wildcard queries. + +**Supported search types:** + +- ``MATCH_ALL``: All terms in the query must be present (AND semantics). +- ``MATCH_ANY``: Any term in the query can match (OR semantics). +- ``PHRASE``: Matches the exact sequence of words (with proximity). +- ``PREFIX``: Matches terms starting with the given string (e.g., "run*" → running, runner). +- ``WILDCARD``: Supports wildcards ``*`` and ``?`` (e.g., "ap*e", "app?e" → "apple"). + +**Special Configuration:** + +- **Option**: ``lucene-fts.write.tmp.directory`` + + - **Description**: Specifies the temporary directory used during Lucene index writing. No default value; must be explicitly set. 
+ +- **Environment Variable**: ``PAIMON_JIEBA_DICT_DIR`` + + - **Description**: Specifies the directory containing Jieba dictionary files for Chinese text tokenization. At runtime, the system first checks this environment variable; if not set, it falls back to the compile-time ``JIEBA_TEST_DICT_DIR`` macro (only available in test builds). If neither is available, the system fails with an error. + +Vector Index (Lumina) +--------------------- + +An approximate nearest neighbor (ANN) index powered by Lumina for vector similarity search. +It supports high-dimensional vector search with configurable distance metrics and encoding strategies. +For more configuration options, refer to the ``third_party/lumina/reference`` directory. diff --git a/docs/source/user_guide/manifest.rst b/docs/source/user_guide/manifest.rst new file mode 100644 index 0000000..9b69ff5 --- /dev/null +++ b/docs/source/user_guide/manifest.rst @@ -0,0 +1,132 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. See the NOTICE file +.. distributed with this work for additional information +.. regarding copyright ownership. The ASF licenses this file +.. to you under the Apache License, Version 2.0 (the +.. "License"); you may not use this file except in compliance +.. with the License. You may obtain a copy of the License at + +.. http://www.apache.org/licenses/LICENSE-2.0 + +.. Unless required by applicable law or agreed to in writing, +.. software distributed under the License is distributed on an +.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +.. KIND, either express or implied. See the License for the +.. specific language governing permissions and limitations +.. under the License. + +.. Borrowed the file from Apache Paimon: +.. https://github.com/apache/paimon/blob/master/docs/content/concepts/spec/manifest.md + +Manifest +======== + +Manifest List +------------- + +.. 
code-block:: shell + + ├── manifest + └── manifest-list-51c16f7b-421c-4bc0-80a0-17677f343358-1 + +Manifest List includes metadata of several manifest files. Its name contains a UUID. It is an Avro file with the following schema: + +1. ``_FILE_NAME``: STRING — manifest file name. +2. ``_FILE_SIZE``: BIGINT — manifest file size. +3. ``_NUM_ADDED_FILES``: BIGINT — number of added files in the manifest. +4. ``_NUM_DELETED_FILES``: BIGINT — number of deleted files in the manifest. +5. ``_PARTITION_STATS``: SimpleStats — partition stats. The minimum and maximum values of partition fields in this manifest are beneficial for skipping certain manifest files during queries. +6. ``_SCHEMA_ID``: BIGINT — schema id used when writing this manifest file. + +Manifest +-------- + +Manifest includes metadata of several data files, changelog files, or table-index files. Its name contains a UUID, and it is an Avro file. + +The changes of the file are saved in the manifest, and a file can be added or deleted. Manifests should be in an orderly manner, and the same file may be added or deleted multiple times. The last version should be read. This design makes commit lighter to support file deletion generated by compaction. + +Data Manifest +------------- + +Data Manifest includes metadata of several data files or changelog files. + +.. code-block:: shell + + ├── manifest + └── manifest-6758823b-2010-4d06-aef0-3b1b597723d6-0 + +Schema: + +1. ``_KIND``: TINYINT — ``ADD`` or ``DELETE``. +2. ``_PARTITION``: BYTES — partition spec, a BinaryRow. +3. ``_BUCKET``: INT — bucket of this file. +4. ``_TOTAL_BUCKETS``: INT — total buckets when writing this file; used for verification after bucket changes. +5. ``_FILE``: data file metadata. + +Data file metadata: + +1. ``_FILE_NAME``: STRING — file name. +2. ``_FILE_SIZE``: BIGINT — file size. +3. ``_ROW_COUNT``: BIGINT — total number of rows (including add & delete) in this file. +4. ``_MIN_KEY``: STRING — minimum key of this file. +5. 
``_MAX_KEY``: STRING — maximum key of this file. +6. ``_KEY_STATS``: SimpleStats — statistics of the key. +7. ``_VALUE_STATS``: SimpleStats — statistics of the value. +8. ``_MIN_SEQUENCE_NUMBER``: BIGINT — minimum sequence number. +9. ``_MAX_SEQUENCE_NUMBER``: BIGINT — maximum sequence number. +10. ``_SCHEMA_ID``: BIGINT — schema id when writing this file. +11. ``_LEVEL``: INT — level of this file in LSM. +12. ``_EXTRA_FILES``: ARRAY — extra files for this file (e.g., data file index file). +13. ``_CREATION_TIME``: TIMESTAMP_MILLIS — creation time of this file. +14. ``_DELETE_ROW_COUNT``: BIGINT — rowCount = addRowCount + deleteRowCount. +15. ``_EMBEDDED_FILE_INDEX``: BYTES — if the data file index is small, store the index in the manifest. +16. ``_FILE_SOURCE``: TINYINT — indicates whether this file is generated as an ``APPEND`` or ``COMPACT`` file. +17. ``_VALUE_STATS_COLS``: ARRAY — statistical columns in metadata. +18. ``_EXTERNAL_PATH``: STRING — external path of this file; ``null`` if it is in the warehouse. + +Index Manifest +-------------- + +Index Manifest includes metadata of several table-index files. + +.. code-block:: shell + + ├── manifest + └── index-manifest-5d670043-da25-4265-9a26-e31affc98039-0 + +Schema: + +1. ``_KIND``: TINYINT — ``ADD`` or ``DELETE``. +2. ``_PARTITION``: BYTES — partition spec, a BinaryRow. +3. ``_BUCKET``: INT — bucket of this file. +4. ``_INDEX_TYPE``: STRING — ``HASH`` or ``DELETION_VECTORS``. +5. ``_FILE_NAME``: STRING — file name. +6. ``_FILE_SIZE``: BIGINT — file size. +7. ``_ROW_COUNT``: BIGINT — total number of rows. +8. ``_DELETIONS_VECTORS_RANGES``: Metadata only used by ``DELETION_VECTORS``; an array of deletion vector metadata. Each deletion vector metadata has: + - ``f0``: the data file name corresponding to this deletion vector. + - ``f1``: the starting offset of this deletion vector in the index file. + - ``f2``: the length of this deletion vector in the index file. 
+ - ``_CARDINALITY``: the number of deleted rows. + +Appendix +-------- + +SimpleStats +~~~~~~~~~~~ + +SimpleStats is a nested row with the following schema: + +1. ``_MIN_VALUES``: BYTES — BinaryRow; the minimum values of the columns. +2. ``_MAX_VALUES``: BYTES — BinaryRow; the maximum values of the columns. +3. ``_NULL_COUNTS``: ARRAY — the number of nulls in the columns. + +BinaryRow +~~~~~~~~~ + +BinaryRow is backed by bytes instead of ``Object``. It can significantly reduce the serialization/deserialization of Java objects. + +A row has two parts: fixed-length part and variable-length part. + +- Fixed-length part contains a 1-byte header, null bit set, and field values. The null bit set is used for null tracking and is aligned to 8-byte word boundaries. +- Field values hold fixed-length primitive types and variable-length values that can be stored in 8 bytes. If the variable-length field does not fit in 8 bytes, then the fixed-length part stores the length and the offset of the variable-length part. diff --git a/docs/source/user_guide/prefetch.rst b/docs/source/user_guide/prefetch.rst new file mode 100644 index 0000000..15ae9d7 --- /dev/null +++ b/docs/source/user_guide/prefetch.rst @@ -0,0 +1,48 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. See the NOTICE file +.. distributed with this work for additional information +.. regarding copyright ownership. The ASF licenses this file +.. to you under the Apache License, Version 2.0 (the +.. "License"); you may not use this file except in compliance +.. with the License. You may obtain a copy of the License at + +.. http://www.apache.org/licenses/LICENSE-2.0 + +.. Unless required by applicable law or agreed to in writing, +.. software distributed under the License is distributed on an +.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +.. KIND, either express or implied. See the License for the +.. 
specific language governing permissions and limitations +.. under the License. + +Prefetch +======== + +.. image:: ../_static/prefetch.svg + :alt: File Layout + :align: center + :width: 100% + +In C++ Paimon, we use a multi-producer, single-consumer model to optimize file reading. The core idea is to split a file into row-based ReadRanges and assign them to multiple reader threads (producers). Each reader thread owns an independent result queue that holds its processed RecordBatches. In the main reader thread (the consumer), we sort the heads of all queues by the ReadRange start offset in ascending order and select the RecordBatch with the smallest start offset to ensure globally ordered results. + +Read Range Splitting Strategy +============================= + +Designing an efficient ReadRange splitting strategy requires balancing two key objectives: + +- Minimize read amplification: Ensure the data fetched from storage is used effectively, avoiding unnecessary I/O overhead. +- Reduce ReadRange span: Ideally, the size of a ReadRange should match a single read batch size to enable fine-grained parallel control. + +Below we detail how these strategies are applied to the Parquet format. + +Parquet +======== + +Parquet files are organized into RowGroups and Pages. Since C++ Parquet does not support row-level seeking, prefetching can only be done at the RowGroup level. This naturally avoids read amplification, but introduces a new challenge: if a file contains only a small number of RowGroups, parallelism is severely limited. Therefore, we recommend that users reduce the RowGroup size when writing Parquet files to increase opportunities for parallel processing. + +Another critical difference is the read behavior compared to Orc. Orc strictly returns RecordBatches aligned to Stripe boundaries, whereas C++ Parquet may return a RecordBatch containing data from multiple RowGroups. This can lead to output order confusion during parallel reads.
We modified C++ Parquet internals to return results strictly aligned to RowGroup boundaries, matching Orc’s behavior. With this change, parallel reading no longer requires complex seek operations, improving overall read efficiency. + +.. admonition:: TODO + :class: tip + + Support prefetch for Orc. diff --git a/docs/source/user_guide/primary_key_table.rst b/docs/source/user_guide/primary_key_table.rst new file mode 100644 index 0000000..7a7e25a --- /dev/null +++ b/docs/source/user_guide/primary_key_table.rst @@ -0,0 +1,82 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. See the NOTICE file +.. distributed with this work for additional information +.. regarding copyright ownership. The ASF licenses this file +.. to you under the Apache License, Version 2.0 (the +.. "License"); you may not use this file except in compliance +.. with the License. You may obtain a copy of the License at + +.. http://www.apache.org/licenses/LICENSE-2.0 + +.. Unless required by applicable law or agreed to in writing, +.. software distributed under the License is distributed on an +.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +.. KIND, either express or implied. See the License for the +.. specific language governing permissions and limitations +.. under the License. + +.. Borrowed the file from Apache Paimon: +.. https://github.com/apache/paimon/blob/master/docs/content/primary-key-table/overview.md + +Primary Key Table +================= +If you define a table with primary key, you can insert, update or delete records +in the table. + +Primary keys consist of a set of columns that contain unique values for each +record. Paimon enforces data ordering by sorting the primary key within each +bucket, allowing users to achieve high performance by applying filtering +conditions on the primary key. 
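To illustrate why sorted primary keys make key filters cheap, here is a minimal sketch that assumes the keys of one bucket are held in a sorted in-memory vector. This is an illustration of the general idea only, not how Paimon stores or reads data, and ``ContainsKey`` is a hypothetical helper introduced for this example.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical helper: binary search over keys that are already sorted,
// so a point filter inspects O(log n) entries instead of scanning all of them.
bool ContainsKey(const std::vector<int64_t>& sorted_keys, int64_t key) {
    auto it = std::lower_bound(sorted_keys.begin(), sorted_keys.end(), key);
    return it != sorted_keys.end() && *it == key;
}
```

The same ordering property is what lets a reader skip whole key ranges that cannot match a filter, rather than inspecting every record.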
+ + +Bucket +------- +Unpartitioned tables, or partitions in partitioned tables, are sub-divided into +buckets, to provide extra structure to the data that may be used for more +efficient querying. + +Each bucket directory contains an LSM tree and its changelog files. + +.. note:: + Changelog is not supported yet for C++ Paimon primary key table write. + +The range for a bucket is determined by the hash value of one or more columns in +the records. Users can specify bucketing columns by providing the bucket-key option. +If no bucket-key option is specified, the primary key (if defined) or the complete +record will be used as the bucket key. + +A bucket is the smallest storage unit for reads and writes, so the number of +buckets limits the maximum processing parallelism. This number should not be too +big, though, as it will result in lots of small files and low read performance. +In general, the recommended data size in each bucket is about 200MB - 1GB. + +Also, see rescale bucket if you want to adjust the number of buckets after a +table is created. + + +LSM Trees +------------- +Paimon adopts the LSM tree (log-structured merge-tree) as the data structure for +file storage. This documentation briefly introduces the concepts about LSM trees. + +Sorted Runs +~~~~~~~~~~~~~~ +LSM tree organizes files into several sorted runs. A sorted run consists of one +or multiple data files and each data file belongs to exactly one sorted run. + +Records within a data file are sorted by their primary keys. Within a sorted run, +ranges of primary keys of data files never overlap. + +.. image:: ../_static/sorted-runs.png + :alt: Sorted Runs + :align: center + :width: 100% + +As you can see, different sorted runs may have overlapped primary key ranges, +and may even contain the same primary key. 
When querying the LSM tree, all +sorted runs must be combined and all records with the same primary key must be +merged according to the user-specified merge engine and the timestamp of each record. + +New records written into the LSM tree will be first buffered in memory. When the +memory buffer is full, all records in memory will be sorted and flushed to disk. +A new sorted run is now created. diff --git a/docs/source/user_guide/read.rst b/docs/source/user_guide/read.rst new file mode 100644 index 0000000..9440773 --- /dev/null +++ b/docs/source/user_guide/read.rst @@ -0,0 +1,302 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. See the NOTICE file +.. distributed with this work for additional information +.. regarding copyright ownership. The ASF licenses this file +.. to you under the Apache License, Version 2.0 (the +.. "License"); you may not use this file except in compliance +.. with the License. You may obtain a copy of the License at + +.. http://www.apache.org/licenses/LICENSE-2.0 + +.. Unless required by applicable law or agreed to in writing, +.. software distributed under the License is distributed on an +.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +.. KIND, either express or implied. See the License for the +.. specific language governing permissions and limitations +.. under the License. + +Read +==== +Paimon by functionality can be divided into two layers: + +- Control Plane: Responsible for accessing and managing Meta (snapshot, manifest, etc.), including: + - Catalog / Database access + - Table retrieval + - Collection and resolution of data files + +- Data Plane: Responsible for accessing actual data files, including: + - Readers for various file formats + - Coordinated reading of file collections + +The control plane and data plane interact primarily via DataSplit (the query plan). 
C++ Paimon currently supports a standard +DataSplit protocol which includes the necessary meta information to access data files. With DataSplit, a high-performance +data access path can be integrated. + +At compute time, the execution engine (reader) does not need to be aware of the concrete table type or its metadata details. +It only needs to follow the instructions within the DataSplit (query plan) to perform data reading operations. + +With the layered abstraction of the control plane and data plane, and the use of DataSplit as a stable protocol interface, +the two layers can evolve their functionality and optimize code relatively independently. This design also enables +cross-language task scheduling and interaction (e.g., Java and C++), substantially reducing engineering maintenance costs +across the two language ecosystems. + + +Schema Evolution +----------------------- +Scope and Compatibility +~~~~~~~~~~~~~~~~~~~~~~~~ + +C++ Paimon supports all evolution kinds available in Java Paimon for non-nested types: + +- Add column +- Drop column +- Reorder columns +- Rename column +- Change column type + +.. note:: + + - Only non-nested type evolution is supported. Nested columns (struct, array, map) are not supported. + - Partition keys: Only column reordering is supported; other operations are not supported (consistent with Java Paimon). + - Primary key: + + - Adding or dropping columns is not supported. + - Other operations are supported (consistent with Java Paimon). + +Per-File Schema via Field IDs +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +In DataSplit, each file may have a completely different data schema. Paimon uses field IDs to uniquely identify fields. + +Overflow Behavior Disclaimer +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Overflow behavior is undefined for C++ and Java Paimon. Results in overflow scenarios may: + +- Be incorrect values, +- Return an error status, +- Or be null. + +C++ Paimon does not guarantee identical results to Java Paimon in overflow scenarios. 
Users should not rely on identical +return values between implementations. + +Type Change Support Matrix +~~~~~~~~~~~~~~~~~~~~~~~~~~ +The table below indicates support for changing a column type from ``source`` to ``target``. Refer to the numbered notes below the table +for caveats. + +.. list-table:: + :header-rows: 1 + :widths: 12 10 10 10 10 10 10 8 12 10 8 18 10 + + * - src \\ target + - tinyint + - smallint + - int + - bigint + - float + - double + - bool + - string + - binary + - date + - timestamp (without tz) + - decimal + * - tinyint + - ✅ + - ✅ + - ✅ + - ✅ + - ✅ + - ✅ + - ✅ + - ✅ + - ❌ + - ❌ + - ❌ + - ✅ + * - smallint + - ✅ 1️⃣ + - ✅ + - ✅ + - ✅ + - ✅ + - ✅ + - ✅ + - ✅ + - ❌ + - ❌ + - ❌ + - ✅ + * - int + - ✅ 1️⃣ + - ✅ 1️⃣ + - ✅ + - ✅ + - ✅ + - ✅ + - ✅ + - ✅ + - ❌ + - ❌ + - ✅ 1️⃣ + - ✅ + * - bigint + - ✅ 1️⃣ + - ✅ 1️⃣ + - ✅ 1️⃣ + - ✅ + - ✅ + - ✅ + - ✅ + - ✅ + - ❌ + - ❌ + - ✅ 6️⃣ + - ✅ + * - float + - ✅ 2️⃣ + - ✅ 2️⃣ + - ✅ 2️⃣ + - ✅ 2️⃣ + - ✅ + - ✅ + - ✅ + - ✅ 3️⃣ 4️⃣ + - ❌ + - ❌ + - ❌ + - ✅ + * - double + - ✅ 2️⃣ + - ✅ 2️⃣ + - ✅ 2️⃣ + - ✅ 2️⃣ + - ✅ 2️⃣ + - ✅ + - ✅ + - ✅ 3️⃣ 4️⃣ + - ❌ + - ❌ + - ❌ + - ✅ + * - bool + - ✅ + - ✅ + - ✅ + - ✅ + - ✅ + - ✅ + - ✅ + - ✅ + - ❌ + - ❌ + - ❌ + - ✅ + * - string + - ✅ + - ✅ + - ✅ + - ✅ + - ✅ 3️⃣ + - ✅ 3️⃣ + - ✅ + - ✅ + - ✅ + - ✅ + - ✅ 5️⃣ + - ✅ 7️⃣ + * - binary + - ❌ + - ❌ + - ❌ + - ❌ + - ❌ + - ❌ + - ❌ + - ✅ + - ✅ + - ❌ + - ❌ + - ❌ + * - date + - ❌ + - ❌ + - ❌ + - ❌ + - ❌ + - ❌ + - ❌ + - ✅ + - ❌ + - ✅ + - ✅ 5️⃣ + - ❌ + * - timestamp (without tz) + - ❌ + - ❌ + - ✅ 1️⃣ + - ✅ + - ❌ + - ❌ + - ❌ + - ✅ + - ❌ + - ✅ + - ✅ + - ❌ + * - decimal + - ✅ 1️⃣ + - ✅ 1️⃣ + - ✅ 1️⃣ + - ✅ 1️⃣ + - ✅ + - ✅ + - ❌ + - ✅ + - ❌ + - ❌ + - ❌ + - ✅ + +.. admonition:: Overflow Behavior Notes + :class: note + + 1️⃣ Integer downcast overflow behavior matches Java in specific cases. + Example: smallint -> tinyint, 32767 becomes -1; int -> smallint, -2147483648 becomes 0. 
+ + 2️⃣ Floating-point overflow behavior is partially consistent with Java and partially different. + Example: float -> tinyint + - Java: MAX_FLOAT -> -1, INFINITY -> -1 + - C++: MAX_FLOAT -> 0, INFINITY -> 0 + + 3️⃣ Keyword differences for special float/double values: + - Java: Infinity, -Infinity, NaN + - C++: inf, -inf, nan + + 4️⃣ Printing difference: + - C++ Paimon prints 1.0 as ``1`` + - Java Paimon prints 1.0 as ``1.0`` + + 5️⃣ Timestamp precision and range differences: + - Java Paimon: 0000-01-01 00:00:00.000000000 to 9999-12-31 23:59:59.999999999 + - C++ Paimon: 1677-09-21 00:12:43.145224192 to 2262-04-11 23:47:16.854775807 + - C++ only supports nanosecond precision; range is smaller. + + 6️⃣ bigint -> timestamp range differences: + - Java Paimon (ms): ``[MIN_INT64/1000, MAX_INT64/1000]`` seconds + - C++ Paimon (ns): ``[MIN_INT64/1e9, MAX_INT64/1e9]`` seconds + + 7️⃣ string -> decimal with precision > 38: + - C++ returns ``null`` if parsing would overflow 128-bit arithmetic. + - Java may rescale and return a value based on the rescaled precision. + - Example input: ``1111111111111111111111111111111111111.15``, Java returns: ``1111111111111111111111111111111111111.2``, C++ returns: ``null`` + +Implementation Guidance +~~~~~~~~~~~~~~~~~~~~~~~ + +- Use DataSplit as the sole interface between control and data planes. Treat it as the canonical query plan contract. +- Resolve field types and IDs per file; prefer inline data file metadata, fallback to table schema files when necessary. +- Expect per-file schema variability; design readers to align by field IDs rather than positional indices. +- Do not assume identical overflow semantics across C++ and Java; tests should validate acceptable ranges and nullability. +- For timestamp handling, consider precision/range constraints in C++ when interoperating with Java-produced data splits. 
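The guidance above on aligning readers by field ID rather than by position can be illustrated with a minimal sketch. The ``Field`` struct and ``AlignByFieldId`` function are hypothetical names introduced for illustration and are not part of the paimon-cpp API. Treating a missing ID as "column absent from this file" is also an assumption about how a reader might handle added columns.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical, simplified field descriptor (not a paimon-cpp type).
struct Field {
    int32_t id;        // stable field ID, unchanged by rename or reorder
    std::string name;
    std::string type;
};

// For every field of the schema being read, return the column index of the
// field with the same ID in the data file's schema, or std::nullopt when the
// file does not contain that field (for example, a column added later).
std::vector<std::optional<std::size_t>> AlignByFieldId(
    const std::vector<Field>& read_schema, const std::vector<Field>& file_schema) {
    std::unordered_map<int32_t, std::size_t> file_index_by_id;
    for (std::size_t i = 0; i < file_schema.size(); ++i) {
        const bool inserted = file_index_by_id.emplace(file_schema[i].id, i).second;
        assert(inserted && "field IDs must be unique within one file schema");
        (void)inserted;  // avoid an unused-variable warning when NDEBUG is set
    }

    std::vector<std::optional<std::size_t>> mapping;
    mapping.reserve(read_schema.size());
    for (const Field& field : read_schema) {
        const auto it = file_index_by_id.find(field.id);
        if (it == file_index_by_id.end()) {
            mapping.push_back(std::nullopt);
        } else {
            mapping.push_back(it->second);
        }
    }
    return mapping;
}
```

Because the lookup is keyed on ``id`` only, renamed and reordered columns resolve to the right file column without special handling. Type changes would still need a separate cast step, subject to the caveats in the support matrix.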
diff --git a/docs/source/user_guide/schema.rst b/docs/source/user_guide/schema.rst new file mode 100644 index 0000000..6c716af --- /dev/null +++ b/docs/source/user_guide/schema.rst @@ -0,0 +1,135 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. See the NOTICE file +.. distributed with this work for additional information +.. regarding copyright ownership. The ASF licenses this file +.. to you under the Apache License, Version 2.0 (the +.. "License"); you may not use this file except in compliance +.. with the License. You may obtain a copy of the License at + +.. http://www.apache.org/licenses/LICENSE-2.0 + +.. Unless required by applicable law or agreed to in writing, +.. software distributed under the License is distributed on an +.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +.. KIND, either express or implied. See the License for the +.. specific language governing permissions and limitations +.. under the License. + +.. Borrowed the file from Apache Paimon: +.. https://github.com/apache/paimon/blob/master/docs/content/concepts/spec/schema.md + +Schema +====== + +Schema file versions start from 0, and all versions of the schema are currently retained. +Old data files may still rely on an old schema version, so delete schema files with caution. + +The schema file is JSON and includes: + +1. ``fields``: data field list. Each data field contains ``id``, ``name``, ``type``. Field ``id`` is used to support schema evolution. +2. ``partitionKeys``: field name list. Partition definition of the table; it cannot be modified. +3. ``primaryKeys``: field name list. Primary key definition of the table; it cannot be modified. +4. ``options``: ``map<string, string>``, unordered. Options of the table, which control many capabilities and optimizations. + +Example +------- + +.. 
code-block:: json + + { + "version" : 3, + "id" : 0, + "fields" : [ { + "id" : 0, + "name" : "order_id", + "type" : "BIGINT NOT NULL" + }, { + "id" : 1, + "name" : "order_name", + "type" : "STRING" + }, { + "id" : 2, + "name" : "order_user_id", + "type" : "BIGINT" + }, { + "id" : 3, + "name" : "order_shop_id", + "type" : "BIGINT" + } ], + "highestFieldId" : 3, + "partitionKeys" : [ ], + "primaryKeys" : [ "order_id" ], + "options" : { + "bucket" : "5" + }, + "comment" : "", + "timeMillis" : 1720496663041 + } + +Compatibility +------------- + +For old versions: + +- Version 1: should put ``bucket -> 1`` to ``options`` if there is no ``bucket`` key. +- Versions 1 & 2: should put ``file.format -> orc`` to ``options`` if there is no ``file.format`` key. + +DataField +--------- + +DataField represents a column of the table. + +1. ``id``: int, column id, automatic increment; it is used for schema evolution. +2. ``name``: string, column name. +3. ``type``: data type, very similar to SQL type string. +4. ``description``: string. + +Limitations +----------- + +MAP Key Must Be NOT NULL +^^^^^^^^^^^^^^^^^^^^^^^^ + +Apache Arrow does not support nullable map keys. When defining a ``MAP`` type in the schema, +the key must be explicitly marked as ``NOT NULL``. If the key is not marked as ``NOT NULL``, +schema parsing will fail with an error. + +For example, the following is **valid**: + +.. code-block:: json + + { + "type": "MAP", + "key": "TINYINT NOT NULL", + "value": "SMALLINT" + } + +The following is **invalid** and will be rejected: + +.. code-block:: json + + { + "type": "MAP", + "key": "TINYINT", + "value": "SMALLINT" + } + +Update Schema +------------- + +Updating the schema should generate a new schema file. + +.. code-block:: text + + warehouse + └── default.db + └── my_table + ├── schema + ├── schema-0 + ├── schema-1 + └── schema-2 + +There is a reference to schema in the snapshot. The schema file with the highest numerical value is usually the latest schema file. 
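The defaults listed under Compatibility for old schema versions can be expressed as a small normalization step applied after parsing a schema file. The following is a minimal sketch under stated assumptions: ``ApplyLegacyOptionDefaults`` is a hypothetical helper, not a paimon-cpp function, and it applies only the two rules stated in this document.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical helper: fills in the option defaults that this document lists
// for old schema file versions. Existing keys are never overwritten.
void ApplyLegacyOptionDefaults(int32_t version,
                               std::map<std::string, std::string>* options) {
    assert(options != nullptr);
    if (version == 1) {
        // Version 1: add "bucket" -> "1" if the key is missing.
        options->emplace("bucket", "1");
    }
    if (version == 1 || version == 2) {
        // Versions 1 and 2: add "file.format" -> "orc" if the key is missing.
        options->emplace("file.format", "orc");
    }
}
```

``std::map::emplace`` leaves an existing entry untouched, which matches the "if there is no key" wording of both rules.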
+ +Old schema files cannot be directly deleted because there may be old data files that reference old schema files. When +reading the table, it is necessary to rely on them for schema evolution reading. diff --git a/docs/source/user_guide/snapshot.rst b/docs/source/user_guide/snapshot.rst new file mode 100644 index 0000000..a2aa1e7 --- /dev/null +++ b/docs/source/user_guide/snapshot.rst @@ -0,0 +1,67 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. See the NOTICE file +.. distributed with this work for additional information +.. regarding copyright ownership. The ASF licenses this file +.. to you under the Apache License, Version 2.0 (the +.. "License"); you may not use this file except in compliance +.. with the License. You may obtain a copy of the License at + +.. http://www.apache.org/licenses/LICENSE-2.0 + +.. Unless required by applicable law or agreed to in writing, +.. software distributed under the License is distributed on an +.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +.. KIND, either express or implied. See the License for the +.. specific language governing permissions and limitations +.. under the License. + +.. Borrowed the file from Apache Paimon: +.. https://github.com/apache/paimon/blob/master/docs/content/concepts/spec/snapshot.md + +Snapshot +======== + +Each commit generates a snapshot file, and the version of the snapshot file starts from 1 and must be continuous. +``EARLIEST`` and ``LATEST`` are hint files at the beginning and end of the snapshot list, and they can be inaccurate. +When hint files are inaccurate, the reader will scan all snapshot files to determine the beginning and end. + +Directory Layout +---------------- + +.. 
code-block:: shell + + warehouse + └── default.db + └── my_table + ├── snapshot + ├── EARLIEST + ├── LATEST + ├── snapshot-1 + ├── snapshot-2 + └── snapshot-3 + +A writing commit preempts the next snapshot id; once the snapshot file is successfully written, the commit +becomes visible. + +Snapshot File +------------- + +The snapshot file is JSON and includes: + +1. ``version``: Snapshot file version; the current version is 3. +2. ``id``: Snapshot id, same as the file name. +3. ``schemaId``: The corresponding schema version for this commit. +4. ``baseManifestList``: A manifest list recording all changes from the previous snapshots. +5. ``deltaManifestList``: A manifest list recording all new changes that occurred in this snapshot. +6. ``changelogManifestList``: A manifest list recording all changelogs produced in this snapshot; ``null`` if no changelog is produced. +7. ``indexManifest``: A manifest recording all index files of this table; ``null`` if no table index file exists. +8. ``commitUser``: Usually generated from a UUID; used for recovery of streaming writes, with one user per streaming write job. +9. ``commitIdentifier``: Transaction id corresponding to streaming write; each transaction may result in multiple commits for different ``commitKind`` values. +10. ``commitKind``: Type of changes in this snapshot, including ``append``, ``compact``, ``overwrite`` and ``analyze``. +11. ``timeMillis``: Commit time in milliseconds. +12. ``logOffsets``: Commit log offsets. +13. ``totalRecordCount``: Record count of all changes that occurred in this snapshot. +14. ``deltaRecordCount``: Record count of all new changes that occurred in this snapshot. +15. ``changelogRecordCount``: Record count of all changelogs produced in this snapshot. +16. ``watermark``: Watermark for input records, from the Flink watermark mechanism; ``Long.MIN_VALUE`` if there is no watermark. +17. ``statistics``: Stats file name for statistics of this table. 
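As noted at the top of this page, a reader falls back to scanning the snapshot files when the ``EARLIEST`` and ``LATEST`` hints are inaccurate. The sketch below shows one way that fallback could work on a plain list of file names. ``ParseSnapshotId`` and ``FindSnapshotRange`` are hypothetical helpers, and obtaining the directory listing from the file system is assumed to happen elsewhere.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: extracts N from a file name of the form "snapshot-N".
// Returns std::nullopt for hint files and anything else that does not match.
std::optional<int64_t> ParseSnapshotId(const std::string& file_name) {
    static const std::string kPrefix = "snapshot-";
    if (file_name.size() <= kPrefix.size() ||
        file_name.compare(0, kPrefix.size(), kPrefix) != 0) {
        return std::nullopt;
    }
    // Cap the digit count so the accumulation below cannot overflow int64_t.
    if (file_name.size() - kPrefix.size() > 18) {
        return std::nullopt;
    }
    int64_t id = 0;
    for (std::size_t i = kPrefix.size(); i < file_name.size(); ++i) {
        const char c = file_name[i];
        if (c < '0' || c > '9') {
            return std::nullopt;
        }
        id = id * 10 + (c - '0');
    }
    return id;
}

// Scans all names and returns (earliest id, latest id), or std::nullopt when
// the directory contains no snapshot files.
std::optional<std::pair<int64_t, int64_t>> FindSnapshotRange(
    const std::vector<std::string>& file_names) {
    std::optional<std::pair<int64_t, int64_t>> range;
    for (const std::string& name : file_names) {
        const std::optional<int64_t> id = ParseSnapshotId(name);
        if (!id) {
            continue;
        }
        if (!range) {
            range = std::make_pair(*id, *id);
        } else {
            range->first = std::min(range->first, *id);
            range->second = std::max(range->second, *id);
        }
    }
    return range;
}
```

Ids are compared numerically rather than as strings, so ``snapshot-10`` is ordered after ``snapshot-9``.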
diff --git a/docs/source/user_guide/write.rst b/docs/source/user_guide/write.rst new file mode 100644 index 0000000..da2120b --- /dev/null +++ b/docs/source/user_guide/write.rst @@ -0,0 +1,166 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. See the NOTICE file +.. distributed with this work for additional information +.. regarding copyright ownership. The ASF licenses this file +.. to you under the Apache License, Version 2.0 (the +.. "License"); you may not use this file except in compliance +.. with the License. You may obtain a copy of the License at + +.. http://www.apache.org/licenses/LICENSE-2.0 + +.. Unless required by applicable law or agreed to in writing, +.. software distributed under the License is distributed on an +.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +.. KIND, either express or implied. See the License for the +.. specific language governing permissions and limitations +.. under the License. + +Write +===== +Batch writing requires the compute engine to pre-bucket data using the +same bucketing strategy as Paimon, to ensure correct ``Scan`` behavior, and to +specify the target ``partition``. Data should be accumulated into ``RecordBatch`` +objects and written to Paimon. + +Paimon C++ uses Apache Arrow as the :ref:`in-memory columnar format` +to support writing to on-disk columnar formats such as ORC and +Parquet more efficiently, thereby improving write throughput. + +.. note:: + Currently supported table types: + - Append table + - Primary Key table + + Not supported in the current scope: + - Changelog + - Indexes + +Bucketing Modes +--------------- + +- Append tables: + + * Support ``bucket = -1`` (dynamic bucket mode) + * Support ``bucket > 0`` (fixed bucket mode) + +- PK tables: + + * Support ``bucket = -2`` (postpone bucket mode) + * Support ``bucket > 0`` (fixed bucket mode) + +.. note:: + PK tables do not support dynamic bucketing (``bucket = -1``). 
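The bucketing modes listed above amount to a small validity check on the ``bucket`` option per table type. A minimal sketch follows. ``TableKind``, the two constants, and ``IsBucketModeSupported`` are hypothetical names introduced for illustration, and the check covers only the combinations this page lists.

```cpp
#include <cstdint>

// Hypothetical names for illustration; not part of the paimon-cpp API.
enum class TableKind { kAppend, kPrimaryKey };

constexpr int32_t kDynamicBucket = -1;   // dynamic bucket mode
constexpr int32_t kPostponeBucket = -2;  // postpone bucket mode

// Returns true when the bucket setting is one of the modes this page lists
// as supported for the given table kind.
bool IsBucketModeSupported(TableKind kind, int32_t bucket) {
    if (bucket > 0) {
        return true;  // fixed bucket mode is listed for both table kinds
    }
    switch (kind) {
        case TableKind::kAppend:
            return bucket == kDynamicBucket;
        case TableKind::kPrimaryKey:
            return bucket == kPostponeBucket;
    }
    return false;
}
```

A writer could run such a check once at setup time and reject unsupported combinations, for example a PK table with ``bucket = -1``, before any data is buffered.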
+ +RecordBatch Construction +------------------------ + +- The compute engine must: + + - Apply the Paimon-consistent bucketing function to each row prior to batching. + - Assign the correct ``partition`` for each row. + - Group rows into Arrow ``RecordBatch`` per partition-bucket combination to minimize writer state changes and I/O overhead. + +- Recommended practices: + + - Use schema-aligned Arrow arrays with explicit validity bitmaps and offsets. + - Prefer batch sizes tuned for I/O throughput (e.g., tens to hundreds of MB per flush, depending on filesystem and cluster configuration). + - Maintain stable sort orders within a batch only if required by downstream merge or compaction logic; otherwise avoid unnecessary ordering costs. + +Prepare Commit +---------------- + +The compute engine is responsible for triggering the writer nodes' ``PrepareCommit``. +Triggering conditions depend on the engine’s business needs and can follow either: + +- Streaming mode: time-based or periodic triggers (e.g., every N seconds). +- Batch mode: trigger after all data in the batch has been written. + +Once the compute engine collects ``CommitMessages`` from all writer nodes, it +can issue a ``Commit`` request to the control plane (management path) to create +a new ``Snapshot``. + +Compatibility Goals +~~~~~~~~~~~~~~~~~~~ + +To ensure interoperability, the ``PrepareCommit`` result produced by Paimon C++ +must be consumable by Paimon Java. Therefore: + +- The structure and semantics of ``CommitMessage`` must remain consistent with + Java Paimon. +- Any evolution of the Java-side ``CommitMessage`` schema must be tracked and + validated on the C++ side to maintain cross-language compatibility. + +Interface Design in Paimon C++ +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Unlike Java Paimon, Paimon C++ does not expose ``BinaryRow``-like types in its +public interfaces. 
To preserve compatibility without leaking internal row +representations, Paimon C++ provides ``CommitMessage`` only through: + +- Serialization: convert the internal commit state into a well-defined binary + representation that matches Java Paimon’s expectations. +- Deserialization: parse the Java-compatible binary representation back into + C++ commit structures for validation, replay, or tooling needs. + +This design ensures that: + +- Public APIs are independent of Java-specific row abstractions. +- Cross-language commit payloads remain stable and versionable. +- Internal data layouts can evolve without breaking external consumers. + +CommitMessage Contract +~~~~~~~~~~~~~~~~~~~~~~ + +The ``CommitMessage`` must encode all information required by the coordinator to +produce a correct ``Snapshot``, which commonly includes (but is not limited to): + +- Partition and bucket identifiers associated with written data. +- New data files, delete files (as applicable to the table type). +- File-level metadata required for manifest and index updates (e.g., row counts, min/max statistics where applicable). +- Transactional markers and sequence numbers as required by table semantics. +- Any per-writer state necessary for deduplication or idempotent commits. + +.. note:: + + Current C++ scope supports Append and PK tables. Changelog is out of + scope and should not be emitted in ``CommitMessage`` until + explicitly supported. + +Serialization and Deserialization +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +- Binary Format: + - The binary payload must strictly conform to Java Paimon’s ``CommitMessage`` encoding. + - Version tags or schema identifiers should be included to enable forwards/backwards compatibility and safe upgrades. + +- Serialization API: + - Provide a function to serialize the writer’s commit state into a byte buffer (or stream) consumable by Java Paimon. 
+ +- Deserialization API: + - Provide a function to parse a Java-produced ``CommitMessage`` binary payload back into C++ commit structures for verification, replay, and testing. + +- Validation: + - Include conformance tests to assert that C++ serialized payloads are accepted by Java Paimon. + - Include round-trip tests to ensure C++ can parse Java-produced payloads and vice versa for supported message versions. + +Operational Flow +~~~~~~~~~~~~~~~~~~~~~~~ + +1. Writer nodes perform data ingestion and produce Arrow ``RecordBatch`` + organized by partition and bucket. + +2. Writers flush batches into ORC/Parquet files via registered ``file.format`` + and ``file-system`` backends, producing file-level metadata and per-batch + commit state. + +3. Each writer invokes ``PrepareCommit``, which: + - Aggregates per-writer state into a ``CommitMessage``. + - Serializes the message into a Java-compatible binary payload. + +4. The compute engine gathers ``CommitMessages`` from all writers. + +5. The compute engine issues a ``Commit`` request to the control plane with the + collected messages, resulting in a new ``Snapshot``. + +6. The coordinator validates the messages, updates manifests/metadata, and + finalizes the snapshot atomically.
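Step 1 of the flow above, together with the RecordBatch Construction section, asks the compute engine to organize rows per partition-bucket combination before building each ``RecordBatch``. The sketch below shows only that grouping step, on plain row indices. ``RowTarget`` and ``GroupRowsByPartitionBucket`` are hypothetical names. The bucket of each row is taken as an input because it must come from the Paimon-consistent bucketing function, which is not reproduced here.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical per-row routing result: the partition the row belongs to and
// the bucket already computed by the Paimon-consistent bucketing function.
struct RowTarget {
    std::string partition;
    int32_t bucket;
};

using PartitionBucket = std::pair<std::string, int32_t>;

// Groups row indices by (partition, bucket), preserving the original row
// order inside each group. Each group can then be materialized as one batch.
std::map<PartitionBucket, std::vector<std::size_t>> GroupRowsByPartitionBucket(
    const std::vector<RowTarget>& targets) {
    std::map<PartitionBucket, std::vector<std::size_t>> groups;
    for (std::size_t i = 0; i < targets.size(); ++i) {
        groups[{targets[i].partition, targets[i].bucket}].push_back(i);
    }
    return groups;
}
```

The resulting index lists could then be used to gather rows into one Arrow ``RecordBatch`` per group, so that each write call targets a single partition and bucket.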