Add C-level buffered array writer with direct file descriptor writes by Copilot · Pull Request #210 · static-frame/arraykit

Copilot · 2026-06-29T15:51:34Z

NPYConvert.to_py currently writes array payloads via Python np.nditer chunks and chunk.tobytes('C'), creating many short-lived Python objects. This change adds a C-extension path that streams buffered array bytes directly to file objects using OS-level writes when possible, while preserving C/Fortran write-order semantics used by NPY serialization.

New C API for buffered file writes
- Added write_array_to_file(array, file, *, fortran_order=False, buffersize=8192) to arraykit.
- Implementation uses NumPy NpyIter (EXTERNAL_LOOP | BUFFERED | ZEROSIZE_OK).
- For real file objects: Uses PyObject_AsFileDescriptor() to get the file descriptor and writes directly via the C write() system call (_write() on Windows), eliminating per-chunk Python method calls and object creation on the I/O path.
- For file-like objects without a usable file descriptor (BytesIO, custom classes): Falls back to calling file.write() with a PyMemoryView (zero-copy for contiguous data). Text-mode objects such as StringIO accept only str and would reject this bytes payload.
- For non-unit inner strides, bytes are packed into a reusable contiguous buffer in C before writing.

Module + typing surface
- Registered and exported the new function through _arraykit and arraykit.__init__.
- Added stub signature in __init__.pyi with matching defaults.

Contract and edge handling
- Preserves prior order behavior: Fortran iteration is used only when requested and the array is not C-contiguous.
- Validates buffersize >= 1 with explicit error messaging.
- Handles partial writes with explicit error reporting.
- Properly checks PyErr_Occurred() after iteration (standard NumPy pattern).
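
As a rough Python analogue of the descriptor path and the fallback described above (not the PR's C code), the sketch below tries file.fileno(), writes with os.write() in a loop that tolerates short writes, and otherwise calls file.write() with a memoryview. The helper names write_all_fd and write_bytes are hypothetical, and the flush() call is this sketch's own precaution rather than something the PR states. A short write is retried here and only a write that makes no progress raises; the description does not say whether the C code retries or raises immediately.

```python
import io
import os
import tempfile


def write_all_fd(fd: int, data: memoryview) -> None:
    """Write every byte of `data` to `fd`, retrying after short writes."""
    while len(data):
        written = os.write(fd, data)
        if written <= 0:
            # Assumption: treat "no progress" as an error instead of spinning.
            raise OSError("write to file descriptor made no progress")
        data = data[written:]  # slicing a memoryview does not copy


def write_bytes(file, data) -> str:
    """Hypothetical helper: use the descriptor path or fall back to file.write()."""
    view = memoryview(data)
    try:
        fd = file.fileno()
    except (AttributeError, OSError):
        # BytesIO raises io.UnsupportedOperation (an OSError subclass);
        # custom objects may have no fileno() at all.
        file.write(view)
        return "fallback"
    # Assumption of this sketch: flush Python-level buffers first so bytes
    # written earlier through `file` stay ahead of the raw descriptor write.
    file.flush()
    write_all_fd(fd, view)
    return "fd"


if __name__ == "__main__":
    with tempfile.TemporaryFile() as real_file:
        print(write_bytes(real_file, b"payload"))
    print(write_bytes(io.BytesIO(), b"payload"))
```

The return value only reports which path was taken, which makes the dispatch easy to observe in a test.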

Performance characteristics:

- Best case (contiguous data + real file): Pure C path with no per-chunk Python objects or method calls
- Good case (non-contiguous data + real file): Single reusable buffer + direct file descriptor writes
- Fallback (file-like objects): PyMemoryView (zero-copy) + Python method call
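
To illustrate the "single reusable buffer" case, here is a minimal Python sketch of the packing idea, assuming NumPy is available. It is not the C implementation: write_packed is a hypothetical name, and the scratch array stands in for the reusable contiguous buffer the PR describes. Contiguous chunks are written as-is through a memoryview; strided chunks are first copied into the scratch buffer, which is allocated once.

```python
import io

import numpy as np


def write_packed(array: np.ndarray, file, buffersize: int = 8192) -> None:
    """Stream `array` in C order, packing strided chunks into one scratch buffer."""
    if buffersize < 1:
        raise ValueError("buffersize must be >= 1")
    scratch = np.empty(buffersize, dtype=array.dtype)  # allocated once, reused
    flags = ("external_loop", "buffered", "zerosize_ok")
    for chunk in np.nditer(array, flags=flags, buffersize=buffersize, order="C"):
        if chunk.flags.c_contiguous:
            # Unit stride: hand the chunk's memory to write() without copying.
            file.write(memoryview(chunk))
            continue
        # Non-unit stride: pack at most `buffersize` elements at a time.
        for start in range(0, chunk.size, buffersize):
            piece = chunk[start:start + buffersize]
            n = piece.size
            np.copyto(scratch[:n], piece)
            file.write(memoryview(scratch[:n]))


if __name__ == "__main__":
    strided = np.arange(24, dtype=np.int64).reshape(4, 6)[:, ::2]
    out = io.BytesIO()
    write_packed(strided, out, buffersize=4)
    print(out.getvalue() == strided.tobytes("C"))
```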

# before (Python-level per-chunk allocation pattern)
for chunk in np.nditer(array, flags=('external_loop', 'buffered', 'zerosize_ok'),
                       buffersize=buffersize, order='C'):
    file.write(chunk.tobytes('C'))

# after (C-level buffered streaming with direct fd writes in arraykit)
arraykit.write_array_to_file(array, file, fortran_order=False, buffersize=buffersize)

Copilot

Pull request overview

This PR adds a new C-extension API (write_array_to_file) to stream NumPy array bytes to a file-like object using NumPy’s iterator API, reducing Python-level per-chunk tobytes() allocations while preserving C/Fortran iteration semantics used by NPY-style serialization.

Changes:

- Implemented write_array_to_file(array, file, *, fortran_order=False, buffersize=8192) in the C extension using NpyIter with EXTERNAL_LOOP | BUFFERED | ZEROSIZE_OK.
- Exported the new function via _arraykit, arraykit.__init__, and added a matching stub in __init__.pyi.
- Added unit tests validating C-order and Fortran-order output against a Python np.nditer(...).tobytes('C') reference.
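
A reference comparison of the kind mentioned above might look like the following sketch. It is not the PR's test code: reference_bytes and check_writer are hypothetical names, and the order rule is taken from the PR description (Fortran iteration only when requested and the array is not C-contiguous). Any callable with the signature writer(array, file, *, fortran_order, buffersize) can be checked, which would include arraykit.write_array_to_file on a build that contains this PR.

```python
import io

import numpy as np


def reference_bytes(array, fortran_order=False, buffersize=8192) -> bytes:
    """Pure-Python reference: join np.nditer chunks via tobytes('C')."""
    # Rule from the PR description: iterate in Fortran order only when
    # requested AND the array is not C-contiguous.
    order = "F" if (fortran_order and not array.flags.c_contiguous) else "C"
    flags = ("external_loop", "buffered", "zerosize_ok")
    out = io.BytesIO()
    for chunk in np.nditer(array, flags=flags, buffersize=buffersize, order=order):
        out.write(chunk.tobytes("C"))
    return out.getvalue()


def check_writer(writer) -> None:
    """Assert that `writer` produces the same bytes as the reference."""
    cases = [
        np.arange(12, dtype=np.int64).reshape(3, 4),                       # C-contiguous
        np.asfortranarray(np.arange(12, dtype=np.float64).reshape(3, 4)),  # F-contiguous
        np.arange(20)[::3],                                                # strided 1-D
    ]
    for array in cases:
        for fortran_order in (False, True):
            out = io.BytesIO()
            writer(array, out, fortran_order=fortran_order, buffersize=5)
            assert out.getvalue() == reference_bytes(array, fortran_order, 5)


if __name__ == "__main__":
    # Stand-in writer so the sketch runs without arraykit installed.
    def python_writer(array, file, *, fortran_order=False, buffersize=8192):
        file.write(reference_bytes(array, fortran_order, buffersize))

    check_writer(python_writer)
```

A small buffersize such as 5 forces multiple chunks per array, so chunk-boundary handling is exercised as well as ordering.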

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.

File	Description
test/test_util.py	Adds tests for `write_array_to_file` output and basic argument validation.
src/methods.h	Declares the new C-extension entry point.
src/methods.c	Implements buffered iteration + streaming writes via `file.write(...)`.
src/_arraykit.c	Registers `write_array_to_file` in the module method table.
src/__init__.pyi	Adds typed stub signature for the new API.
src/__init__.py	Re-exports the new API from the top-level package.

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

flexatone · 2026-06-29T18:50:07Z

@copilot: In this implementation we are still calling Python methods on the Python file object and passing it Python bytes objects. Is there a more direct way to write the bytes to the file, without creating Python bytes objects or calling Python file methods?

Copilot · 2026-06-29T18:56:53Z

@flexatone Unfortunately I hit an unexpected error while processing your comment. I've automatically reported this to GitHub.

You can ask me to try again later by mentioning me in a new comment.