When given input containing multiple concatenated zstd frames, Zstd.decompress returns only the decompressed output of the first frame and silently drops the rest.
The zstd format explicitly supports frame concatenation (e.g. cat a.zst b.zst | zstd -d yields the concatenation of both). In the underlying library, ZSTD_decompress decodes every frame in its input, and ZSTD_decompressStream continues with the next frame when called again after it returns 0.
Environment
- OS: Linux 7.1.1-2
- Ruby: ruby 4.0.5
- zstd-ruby: 2.0.6
Reproduction
require 'zstd-ruby'
a = Zstd.compress("Hello, ")
b = Zstd.compress("World!")
concatenated = a + b
p Zstd.decompress(concatenated)
# => "Hello, World!" (expected)
# => "Hello, " (actual)
Three or more frames behave the same way: everything after the first frame is lost.
Affected code
ext/zstdruby/zstdruby.c
decode_one_frame stops as soon as one frame finishes (ZSTD_decompressStream returns 0):
if (ret == 0) {
break; // end of ONE frame
}
and rb_decompress returns immediately after decoding the first data frame, instead of looping back to process any remaining frames:
if (magic == ZSTD_MAGIC) {
ZSTD_DCtx *dctx = ZSTD_createDCtx();
VALUE out = decode_one_frame(dctx, in + off, in_size - off, kwargs);
ZSTD_freeDCtx(dctx);
RB_GC_GUARD(input_value);
return out; // <- returns; trailing frames are never read
}
The skippable-frame branch already advances off and continues the loop; the data-frame branch does not.
Suggested fix (direction)
Keep scanning until the whole input is consumed, accumulating output across frames:
- Have decode_one_frame report how many input bytes it consumed (the final ZSTD_inBuffer.pos) so the caller can advance off.
- In rb_decompress, append each frame's output to a single result buffer and advance off by the consumed length instead of returning after the first frame.
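To make the intended control flow concrete, here is a minimal, dependency-free sketch of the loop. It is not the extension's code. toy_decode_one_frame is a hypothetical stand-in for decode_one_frame that parses a made-up frame format (one length byte followed by the payload) instead of calling ZSTD_decompressStream, and out_buf and decode_all_frames are illustrative names as well. The only point is the shape of the fix: the per-frame decoder reports the bytes it consumed, and the caller accumulates output and advances off until the input is exhausted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Growable output buffer shared by all frames (illustrative type). */
typedef struct {
    unsigned char *data;
    size_t len;
    size_t cap;
} out_buf;

/* Append n bytes to the buffer. Returns 0 on success, -1 if realloc fails. */
int out_buf_append(out_buf *b, const unsigned char *src, size_t n)
{
    if (n == 0) return 0;
    if (n > b->cap - b->len) {
        size_t cap = b->cap ? b->cap : 64;
        while (cap - b->len < n) cap *= 2;
        unsigned char *p = realloc(b->data, cap);
        if (p == NULL) return -1;
        b->data = p;
        b->cap = cap;
    }
    memcpy(b->data + b->len, src, n);
    b->len += n;
    return 0;
}

/*
 * HYPOTHETICAL stand-in for decode_one_frame. The toy frame format is
 * one length byte followed by that many payload bytes; a real version
 * would drive ZSTD_decompressStream and report ZSTD_inBuffer.pos.
 * Appends the payload to out and stores the bytes consumed in *consumed.
 * Returns 0 on success, -1 on a truncated frame or allocation failure.
 */
int toy_decode_one_frame(const unsigned char *in, size_t in_size,
                         out_buf *out, size_t *consumed)
{
    if (in_size < 1) return -1;
    size_t payload = in[0];
    if (in_size - 1 < payload) return -1;
    if (out_buf_append(out, in + 1, payload) != 0) return -1;
    *consumed = 1 + payload;
    return 0;
}

/*
 * Caller-side loop: decode frame after frame into one buffer and advance
 * off by the consumed length, instead of returning after the first frame.
 */
int decode_all_frames(const unsigned char *in, size_t in_size, out_buf *out)
{
    size_t off = 0;
    while (off < in_size) {
        size_t consumed = 0;
        if (toy_decode_one_frame(in + off, in_size - off, out, &consumed) != 0)
            return -1;
        /* Every frame consumes at least one byte, so the loop terminates. */
        assert(consumed > 0);
        off += consumed;
    }
    return 0;
}
```

In the real extension the accumulation would presumably happen in a Ruby string rather than a malloc'd buffer, and the existing skippable-frame branch would stay inside the same loop.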