Add the cchar_t (complex character) API to the curses module #152233

Description

@serhiy-storchaka

Feature or enhancement

gh-151757 added wide and combining character support to the curses character-cell write methods (they now accept a str cell), but the module still has no representation for a styled wide cell — a curses cchar_t: a spacing character plus combining marks together with its attributes and color pair. As a result none of the wide-character (*_wch) functions, which take or return a cchar_t or a cchar_t array, are exposed.

The gap is most visible on the read side: window.inch() and window.getbkgd() return a packed chtype integer (an 8-bit character, the A_* attributes, and a color pair limited to what COLOR_PAIR() can encode). That value cannot represent a wide or combining character, and it cannot carry a color pair beyond that limit. There is currently no way to read back a styled wide cell at all. This is the long-standing request in gh-83395 ("Add curses.window.in_wch", with the now-stale PR #17825).
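To make the limitation concrete, here is a small pure-Python model of chtype packing. It is only a sketch: the masks and the A_BOLD value assume the layout commonly used by ncurses (low 8 bits for the character, next 8 bits for the color pair), the clamp to 255 follows the description above rather than any particular library build, and pack_chtype/unpack_chtype are hypothetical helper names, not curses functions.

```python
# Illustrative model of a packed chtype. The bit layout is an assumption
# (typical ncurses build), not something the curses module guarantees.
A_CHARTEXT = 0x000000FF   # low 8 bits: the character
A_COLOR = 0x0000FF00      # next 8 bits: the color pair
A_BOLD = 1 << 21          # one attribute bit above the color field
MAX_PACKED_PAIR = 255     # largest pair that fits in the 8-bit field


def pack_chtype(ch, attr=0, pair=0):
    """Pack one character, its attributes and a color pair into an int."""
    clamped = min(pair, MAX_PACKED_PAIR)          # the pair is clamped
    return (ord(ch) & A_CHARTEXT) | (clamped << 8) | attr


def unpack_chtype(value):
    """Split a packed value back into (character, attributes, pair)."""
    ch = chr(value & A_CHARTEXT)
    attr = value & ~(A_CHARTEXT | A_COLOR)
    pair = (value & A_COLOR) >> 8
    return ch, attr, pair


# An ASCII cell with a small pair survives the round trip.
ascii_cell = unpack_chtype(pack_chtype("A", A_BOLD, 3))

# A wide character and a large pair do not: the code point is truncated
# to 8 bits and the pair is clamped.
wide_cell = unpack_chtype(pack_chtype("\u4e16", 0, 300))
```

In this model the ASCII cell round-trips unchanged, while the wide cell comes back with a different character and pair, which is the loss that in_wch() and getbkgrnd() are meant to avoid.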

This issue tracks adding the full cchar_t API to the curses module. All of it requires building against a wide-character version of the curses library (ncursesw). The work will land step by step.

1. The cchar_t wrapper and the functions that take/return a single cell

A new immutable curses.complexchar(text, attr=0, pair=0) type. str(cc) is the cell's text; cc.attr and cc.pair are its rendition (read-only). The color pair is stored separately, not packed via COLOR_PAIR(), so it is not limited to the value that fits in a chtype.
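The intended behavior of the type can be sketched in pure Python. This is a stand-in for illustration only: ComplexCharModel is a hypothetical name, the real type would wrap a C cchar_t, and the attribute value used below is an arbitrary integer rather than a real A_* constant.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class ComplexCharModel:
    """Pure-Python model of the proposed curses.complexchar (hypothetical)."""
    text: str        # spacing character plus any combining marks
    attr: int = 0    # A_* attribute bits
    pair: int = 0    # color pair, stored separately from attr

    def __str__(self):
        # str(cc) is the cell's text, as described above.
        return self.text


# "e" followed by COMBINING ACUTE ACCENT, with a pair above 255.
cc = ComplexCharModel("e\u0301", attr=1 << 21, pair=300)

# The rendition is read-only: assignment raises on a frozen dataclass.
try:
    cc.pair = 1
    mutated = True
except FrozenInstanceError:
    mutated = False
```

Because the pair is an ordinary field rather than bits inside attr, the value 300 is kept as-is. A frozen dataclass is also hashable, which matches the immutability described above.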

Methods that take or return one cell:

  • read (return a complexchar): window.in_wch([y, x]) (wide inch), window.getbkgrnd() (wide getbkgd). These are the only genuinely new entry points needed -- the existing inch/getbkgd return a packed chtype int that cannot represent a wide/combining cell or an unclamped color pair. (This mirrors the existing getch vs get_wch split, justified by the different return type.)
  • write: no new methods. Every existing single-cell method simply also accepts a complexchar (its rendition then comes from the cell, and the method's own attr argument, if any, is ignored): addch, insch, echochar, bkgd, bkgdset, border, box, hline, vline. A complexchar is built explicitly with complexchar(text, attr, pair). This is also how the wide cchar_t form of the line-drawing functions (border_set/box_set/hline_set/vline_set) is reached, without adding *_set methods. Dedicated add_wch/ins_wch/echo_wchar/bkgrnd/bkgrndset are deliberately not added: once the chtype method accepts a complexchar, a parallel wide writer carries no extra capability.
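The write-side rule (a complexchar brings its own rendition, and the method's attr argument is then ignored) can be modeled as a small dispatch function. This is a sketch under stated assumptions: Cell is a hypothetical stand-in for curses.complexchar, resolve_rendition is not a curses function, and the plain-str branch leaves pair at 0 because that path carries any color inside attr.

```python
from typing import NamedTuple


class Cell(NamedTuple):
    """Hypothetical stand-in for the proposed curses.complexchar."""
    text: str
    attr: int = 0
    pair: int = 0


def resolve_rendition(ch, attr=0):
    """Return the (text, attr, pair) a single-cell writer would use."""
    if isinstance(ch, Cell):
        # The cell carries its own rendition; the method's attr is ignored.
        return ch
    # Plain str: the method's attr applies, as it does today.
    return Cell(ch, attr, 0)


BOLD = 1 << 21       # arbitrary attribute bit for the example
UNDERLINE = 1 << 17  # arbitrary attribute bit for the example

from_str = resolve_rendition("x", BOLD)
from_cell = resolve_rendition(Cell("\u4e16", UNDERLINE, 300), BOLD)
```

Here from_cell keeps the cell's own attribute and pair even though a different attr was passed, while from_str picks up the attr argument.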

(setcchar/getcchar need not be exposed separately: the complexchar object already packs/unpacks a cell.)

2. Arrays of cells (cchar_t array functions)

The same split applies one level up:

  • read (new entry point): window.in_wchstr([y, x,] [n]) (a single method with an optional count, like instr/in_wstr), returning the run of styled cells. There is no existing method that returns an array of styled cells (instr returns bytes, in_wstr returns str, both stripping rendition), so it is needed. It returns an immutable curses.complexstr -- the "complex character string" of the X/Open spec, the string counterpart of complexchar (as str is to a single character). It is a dedicated packed type owning the contiguous cchar_t buffer that win_wchnstr() fills directly (no per-cell object allocation on read): it decodes a complexchar lazily on indexing (arr[i]), len(arr) is the cell count, str(arr) joins the cells' text, and slicing/concatenation produce new complexstr instances. Immutable like str, so it is hashable and its raw buffer can be handed straight back to add_wchnstr() -- an array read and re-written is a zero-copy round-trip.
  • write: no new methods. Extend the existing addstr/addnstr/insstr/insnstr to also accept a sequence of cells, in addition to a plain str. A plain str keeps its current meaning; a sequence of cells is written via add_wchnstr -- a complexstr via its raw buffer (zero-copy), or any generic sequence (each item a complexchar or a str) packed into a temporary cchar_t array. So add_wchstr/add_wchnstr are not added, for the same reason add_wch was not. (To build or edit a run, use an ordinary list of complexchar; the packed complexstr is the immutable form you get back from a read.)
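The sequence behavior described for complexstr can be sketched with a pure-Python model. It only mirrors the Python-level semantics (indexing, len, str, slicing, concatenation, hashing); the packed cchar_t buffer and the zero-copy path are not modeled. Cell and ComplexStrModel are hypothetical names used for illustration.

```python
from typing import NamedTuple


class Cell(NamedTuple):
    """Hypothetical stand-in for the proposed curses.complexchar."""
    text: str
    attr: int = 0
    pair: int = 0


class ComplexStrModel:
    """Pure-Python model of the proposed curses.complexstr (hypothetical)."""
    __slots__ = ("_cells",)

    def __init__(self, cells=()):
        # Accept a generic sequence whose items are cells or plain str.
        self._cells = tuple(
            c if isinstance(c, Cell) else Cell(c) for c in cells
        )

    def __len__(self):
        return len(self._cells)            # the cell count

    def __getitem__(self, index):
        if isinstance(index, slice):
            return ComplexStrModel(self._cells[index])  # new instance
        return self._cells[index]          # a single cell

    def __add__(self, other):
        if not isinstance(other, ComplexStrModel):
            return NotImplemented
        return ComplexStrModel(self._cells + other._cells)

    def __str__(self):
        return "".join(c.text for c in self._cells)

    def __eq__(self, other):
        return (isinstance(other, ComplexStrModel)
                and self._cells == other._cells)

    def __hash__(self):
        return hash(self._cells)


run = ComplexStrModel([Cell("a", 1, 2), "e\u0301", Cell("\u4e16", 0, 300)])
rejoined = run[:2] + run[2:]
```

Splitting the run and joining it again yields an equal, equally hashed object, which is the str-like behavior described above. Building the run from a mixed list of cells and plain strings mirrors the generic-sequence form accepted on the write side.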
