Antalya 26.5: Parallelize reads from a single Parquet file in StorageFile by zvonand · Pull Request #1970 · Altinity/ClickHouse

zvonand · 2026-06-26T00:09:55Z

Changelog category (leave one):

Performance Improvement

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Reading a single large local Parquet file via file() / File engine is now parallelised across multiple sources, each handling a subset of row groups. This eliminates a Resize 1 → N bottleneck in the pipeline and brings single-file ClickBench performance close to the partitioned variant — Q23 goes from ~1.4s to ~0.55s, Q22 from ~0.9s to ~0.48s, Q27 from ~1.6s to ~0.54s on 96 vCPUs (ClickHouse#104251 by @alexey-milovidov).

Cherry-picked from ClickHouse#104251.

On ClickBench, single-file Parquet runs are 3–9× slower than the 100-file partitioned runs on the same data (e.g. on c7a.metal-48xl, Q23 is 8.90s vs 0.99s, Q22 1.82s vs 0.41s, Q27 1.21s vs 0.45s). The cause is in StorageFile: when reading a single splittable file it creates exactly one ParquetV3BlockInputFormat source, so the pipeline becomes File 0 → 1 followed by Resize 1 → 96. That fan-out is a serialization point — every chunk has to leave the single source through one read before any of the 96 aggregators can touch it, so most cores sit idle.

The bucket-splitting machinery (ParquetBucketSplitter, setBucketsToRead, FileBucketInfo) already existed for cluster mode but was never wired into StorageFile. This PR wires it in:

New IBucketSplitter::splitToBucketsByCount returning roughly N contiguous row-group ranges; Parquet implements it.
New FormatFactory::checkFormatHasSplitter so callers can probe without throwing.
StorageFile::ReadFromFile::initializePipeline, when reading exactly one local splittable file, asks the splitter for max_num_streams buckets and creates one StorageFileSource per bucket. Each source carries fixed_file_path + file_bucket_info and skips the shared FilesIterator.
ParquetV3BlockInputFormat::read honours buckets_to_read in the trivial-count path so each bucket only reports its own row count.
The count cache (keyed by file path) is bypassed for bucketed reads — otherwise every bucket would report the file's total and counts would be multiplied by the number of buckets.
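
The splitting and per-bucket counting described in the list above can be sketched as follows. This is a minimal illustration, not the code from the PR: `RowGroupRange`, `splitByCount` and `countRowsInBucket` are hypothetical names, and the sketch assumes buckets are balanced purely by row-group count (the real `splitToBucketsByCount` may balance differently).

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical stand-in for a bucket: a half-open range [begin, end) of row-group indices.
struct RowGroupRange
{
    size_t begin;
    size_t end;
};

// Split `num_row_groups` into at most `max_buckets` contiguous, non-empty ranges
// whose sizes differ by at most one. Illustrative only; not the ClickHouse implementation.
std::vector<RowGroupRange> splitByCount(size_t num_row_groups, size_t max_buckets)
{
    std::vector<RowGroupRange> buckets;
    if (num_row_groups == 0 || max_buckets == 0)
        return buckets;

    const size_t n = std::min(num_row_groups, max_buckets);
    const size_t base = num_row_groups / n;
    const size_t extra = num_row_groups % n; // the first `extra` buckets get one more row group

    size_t begin = 0;
    for (size_t i = 0; i < n; ++i)
    {
        const size_t size = base + (i < extra ? 1 : 0);
        buckets.push_back({begin, begin + size});
        begin += size;
    }
    return buckets;
}

// Row count reported by one bucket: only its own row groups are summed,
// so the per-bucket counts add up to the file total exactly once.
uint64_t countRowsInBucket(const std::vector<uint64_t> & rows_per_group, const RowGroupRange & bucket)
{
    return std::accumulate(
        rows_per_group.begin() + bucket.begin,
        rows_per_group.begin() + bucket.end,
        uint64_t{0});
}
```

With 226 row groups and 96 requested buckets, this sketch yields 34 buckets of 3 row groups and 62 buckets of 2. It also shows why a cache keyed only by file path has to be bypassed: if each of the 96 buckets returned the cached whole-file total instead of its own sum, the aggregated count would be 96 times too large.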

Pipeline becomes File × N 0 → 1 straight into the aggregators, matching the partitioned variant (#1806 by @zvonand).

Cherry-picked from #1806.

Results

96-vCPU box, hits.parquet (14 GiB, 226 row groups):

Query	Single (master)	Single (this PR)	Partitioned
Q21	0.40–0.66s	0.44s	0.34s
Q22	0.93–1.36s	0.48s	0.41s
Q23	1.33–1.45s	0.55s	0.42s
Q26	0.50s	0.35s	0.19s
Q27	1.6s	0.54s	0.45s

CPU utilisation on Q23 rose from roughly 6 to roughly 18 cores' worth of the 96 available. Aggregate results (count, sum(UserID), sum(length(URL)), Q21, Q23) match the partitioned variant exactly. The remaining ~1.3× gap to partitioned is per-source initialization overhead: each bucket source still reads the footer of the 14 GiB file separately. Sharing parsed metadata for local files is the obvious next step, but it is a much bigger change.
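
For reference, the gap to the partitioned variant can be recomputed from the table. The snippet below is only a sketch of that arithmetic: the `Timing` struct and function names are hypothetical, and the numbers are copied from the "Single (this PR)" and "Partitioned" columns.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// One row of the results table (times in seconds).
struct Timing
{
    std::string query;
    double this_pr_sec;
    double partitioned_sec;
};

// Ratio of single-file (this PR) time to partitioned time; 1.0 would mean parity.
double gapToPartitioned(const Timing & t)
{
    return t.this_pr_sec / t.partitioned_sec;
}

// Values copied from the results table in this PR description.
std::vector<Timing> resultsTable()
{
    return {
        {"Q21", 0.44, 0.34},
        {"Q22", 0.48, 0.41},
        {"Q23", 0.55, 0.42},
        {"Q26", 0.35, 0.19},
        {"Q27", 0.54, 0.45},
    };
}

// Print the per-query gap so the rows can be compared side by side.
void printGaps()
{
    for (const auto & row : resultsTable())
        std::printf("%s: %.2fx\n", row.query.c_str(), gapToPartitioned(row));
}
```

For Q23 this gives 0.55 / 0.42 ≈ 1.31, which is where the ~1.3× figure comes from; the ratio differs for the other queries.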

Documentation entry for user-facing changes

Documentation is written (mandatory for new features)

…next commit)
---
Original cherry-pick message follows:

Merge pull request #1806 from Altinity/feature/antalya-26.3/ClickHouse-ClickHouse-pr-104251

Antalya 26.3: Parallelize reads from a single Parquet file in StorageFile

# Conflicts:
#	src/Processors/Formats/Impl/ParquetBlockInputFormat.cpp
#	src/Processors/Formats/Impl/ParquetBlockInputFormat.h

Both ParquetBlockInputFormat.cpp and ParquetBlockInputFormat.h were new to antalya-26.5 (they existed on antalya-26.3 but had not been ported yet). Git reported a conflict because the cherry-pick diff tried to modify an existing file at line 1418, but the file was absent on the target branch. The resolution takes the full file content from the merge commit (antalya-26.3 base + PR #1806 additions), which is the correct outcome:
- splitToBucketsByCount is added to both files (bucket-1: in the source PR's first-parent diff).
- filterByMatchingRowGroups is preserved. The first-parent diff does NOT remove it; only the feature-branch-vs-base diff does. Keeping it is required because IBucketSplitter::filterByMatchingRowGroups is pure virtual on antalya-26.5.

github-actions · 2026-06-26T00:10:57Z

Workflow [PR], commit [4cfc7cf]

zvonand added 2 commits June 26, 2026 01:37

zvonand added the releasy (Created/managed by RelEasy), antalya-26.5, and ai-resolved (Port conflict auto-resolved by Claude) labels Jun 26, 2026

zvonand wants to merge 2 commits into antalya-26.5 from feature/antalya-26.5/pr-1806