fix(encoding/json): type-drive Unmarshal into Map/Slice/Pointer containers#145
Conversation
…iners
Unmarshal only special-cased struct fields/targets whose runtime value was
already an instance of the right shape; anything else fell through to a
generic "objects become Maps" decode with no awareness of Go element
types. Three related gaps, all traced to the same missing piece — the
decoder needs to walk a field/element's declared TypeInfo, not just the
JSON shape it sees:
1. map[string]Struct / map[string][]Struct fields (and further nesting,
e.g. map[string][]Struct) left plain JS objects/arrays in place instead
of real struct instances, so generated struct-field accessors read back
undefined:
type Ref struct{ Name string `json:"name"` }
type Container struct {
Contexts map[string][]Ref `json:"contexts"`
}
json.Unmarshal([]byte(`{"contexts":{"a":[{"name":"x"}]}}`), &c)
// c.Contexts["a"][0].Name read back "" (plain object, not a Ref)
2. An interface{}-typed array of objects left its element objects as raw
plain JS objects instead of this override's Map representation, so a
`case map[string]any:` type switch on an element panicked calling
Map-only methods like .entries():
var v any
json.Unmarshal([]byte(`[{"x":1},{"y":2}]`), &v)
// v.([]any)[0] was a plain object, not the Map every other
// interface{}-typed object value decodes to
3. json.RawMessage wasn't recognized as a map/slice ELEMENT type (only as
a direct field type), and there was no Pointer branch in the
type-driven path at all, so pointer elements/fields
(map[string]*Property, Items *ItemsSpec) leaked plain objects with
undefined Go field reads.
Fix: a recursive decodeValueForType(decoded, typeInfo, opts) that decodes
a raw JSON value per its Go TypeInfo — Struct constructs a real instance
via the registered ctor, Slice/Map recurse per elemType, Pointer decodes
as its pointee, RawMessage captures the raw fragment as bytes, and
anything else (including interface{}) falls back to a new
decodeInterfaceValue that replaces objectToMap and additionally recurses
into array elements (fixing case 2 for nested and top-level arrays alike).
Wired in at both call sites: assignDecodedFieldValue (struct fields) and
assignDecodedValue's new top-level branch, which resolves a var's own
type via the __goType spelling the runtime boxes onto Unmarshal's `any`
target — needed for e.g. `var m map[string]json.RawMessage` where there
is no struct field to hang type info off of.
Extends gs/encoding/json/index.test.ts with one case per bug (map[string]
[]Struct field, interface{} array-of-objects, pointer-to-struct map value
+ field, and RawMessage as both a map and slice element). All four fail
on master and pass with this fix; verified by temporarily reverting just
the source change and confirming the new tests fail with the same errors
described above.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Signed-off-by: Marco Christian Krenn <marco.krenn@gmail.com>
There was a problem hiding this comment.
Pull request overview
This PR improves the gs/encoding/json override’s Unmarshal decode path by decoding JSON values according to Go TypeInfo (including nested map/slice element types), instead of relying purely on the runtime JSON shape. It targets correctness for nested map[string]T / []T containers, pointer-to-struct elements/fields, interface{} arrays-of-objects, and json.RawMessage as an element type.
Changes:
- Add a recursive, type-driven decoder (
decodeValueForType) and use it for struct fields and top-level map/slice targets when Go type spellings are available via__goType. - Replace the previous object-only recursion with
decodeInterfaceValue, which also recurses into arrays so objects inside[]anydecode into Maps consistently. - Extend
index.test.tswith new fixtures and test cases covering the reported nested map/slice, interface-array, pointer, and RawMessage scenarios.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| gs/encoding/json/index.ts | Adds type-driven decoding helpers and wires them into Unmarshal’s assignment paths. |
| gs/encoding/json/index.test.ts | Adds regression tests for nested typed decode (map/slice), interface arrays, pointers, and RawMessage elements. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if ($.isPointerTypeInfo(info) && isPlainObject(decoded)) { | ||
| const pointee = resolveTypeInfo(info.elemType) | ||
| if (pointee !== null && $.isStructTypeInfo(pointee)) { | ||
| target.value = decodeValueForType(decoded, info.elemType, opts) | ||
| return | ||
| } |
There was a problem hiding this comment.
Good catch, this was a real bug. Fixed in 1cc7a96: the field-routing path now passes the Pointer TypeInfo itself to decodeValueForType (not elemType), so it goes through the Pointer branch's now-correct unmarked construction instead of bypassing it.
| if ($.isPointerTypeInfo(info)) { | ||
| return decodeValueForType(decoded, info.elemType, opts) | ||
| } |
There was a problem hiding this comment.
Fixed in 1cc7a96: the Pointer branch now constructs struct pointees directly instead of recursing into the Struct branch, so the result stays unmarked. Added a test verifying decoded pointer values match *Struct (not Struct) via $.is().
| let s = spelling.trim() | ||
| while (s.startsWith('*')) { | ||
| s = s.slice(1).trim() | ||
| } |
There was a problem hiding this comment.
Good catch, fixed in 1cc7a96: goTypeDescriptor now returns a Pointer descriptor for a leading '*' instead of stripping it, so nested pointer element types (e.g. the *test.Property in map[string]*test.Property) are preserved. Added a regression test for this exact spelling.
Addresses review feedback on s4wave#145: pointer-to-struct fields/elements decoded to instances marked via markAsStructValue, but the runtime's pointer type matching (matchesPointerType in gs/builtin/type.ts) requires a struct pointee to stay UNMARKED, since marking is reserved for genuine non-pointer struct values. A decoded *Struct value therefore matched a Struct-by-value type assertion and failed a *Struct assertion, the inverse of correct Go semantics, which would break type switches and pointer-receiver method-set checks on JSON-decoded values. Three related spots, all part of the same gap: - decodeValueForType's Pointer branch recursed into its elemType and let the generic Struct branch mark the result. Now it constructs the pointee directly (mirroring the Struct branch's construction, minus the mark) for struct pointees, and only recurses for non-struct pointees (which carry no marking). - assignDecodedFieldValue's pointer-to-struct field routing called decodeValueForType with the field's elemType, bypassing the Pointer branch (and its now-correct handling) entirely. Now it passes the Pointer TypeInfo itself, so both call sites share one code path. - goTypeDescriptor unconditionally stripped every leading '*' while recursively parsing a __goType spelling, so a nested pointer element type (e.g. the *test.Property in map[string]*test.Property) lost its pointer-ness before a TypeInfo was ever built for it. Now a leading '*' produces a Pointer descriptor instead of being discarded; only the outermost address-of star (the Unmarshal target's own &var) is stripped, by unmarshalTargetGoType, before this function ever runs. Extends gs/encoding/json/index.test.ts with two cases verifying the decoded value matches *Struct and not Struct via the public $.is() API (one for inline struct-field TypeInfo, one for a __goType spelling with a nested pointer element). Both fail on the prior commit and pass with this fix, verified by temporarily reverting just the source change. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Signed-off-by: Marco Christian Krenn <marco.krenn@gmail.com>
paralin
left a comment
There was a problem hiding this comment.
Requesting changes: the new pointer-to-struct decode path bypasses UnmarshalJSON for pointer fields.
At gs/encoding/json/index.ts:1116, assignDecodedFieldValue returns through the new pointer-to-struct branch before it can call assignDecodedValue. That skips the existing unmarshalJSONTarget(target.value) hook for a non-nil *T field whose pointee implements UnmarshalJSON. The result is field population into a fresh T instead of invoking the custom decoder that Go’s encoding/json would prioritize.
Focused repro on this head:
expected hook:{"value":"decoded"}
received decoded
Fix direction: check unmarshalJSONTarget(target.value) before the pointer-to-struct early return. For nil pointer fields where the pointee type implements UnmarshalJSON, allocate the pointee and invoke the hook on that new instance instead of directly field-populating it.
paralin's review on s4wave#145 flagged that the pointer-to-struct decode path added in this branch bypasses UnmarshalJSON: assignDecodedFieldValue routed straight into decodeValueForType and returned, so a *T field whose T implements UnmarshalJSON got field-by-field populated into a fresh instance instead of decoded through the custom hook Go's encoding/json would always prefer. Two spots needed the hook check, matching the fix direction from the review: - assignDecodedFieldValue now checks unmarshalJSONTarget(target.value) before routing into the type-driven constructor. A field that already holds a non-nil pointer whose pointee implements UnmarshalJSON falls through to assignDecodedValue, which invokes the hook on that existing instance in place (matching Go, which decodes into the pointee rather than allocating a replacement). - decodeValueForType's Pointer branch now checks the freshly allocated pointee for UnmarshalJSON before falling back to assignStructFields. This covers the nil-field case (nothing exists yet to check the hook against until after allocation) and also map/slice pointer-to-struct elements, which go through this same branch. Adds two regression tests to gs/encoding/json/index.test.ts: a nil pointer field decoding through the hook, and a non-nil pointer field decoding into the existing pointee in place (checked via object identity, not just field content). Both fail on the prior commit and pass with this fix, verified by temporarily reverting just the source change. Verify: bunx vitest run gs/encoding/json/index.test.ts Signed-off-by: Marco Christian Krenn <marco.krenn@gmail.com>
|
Thanks for catching that. Fixed in c18d36a. The pointer-to-struct branch had two spots that needed the hook check, matching your fix direction. assignDecodedFieldValue now checks unmarshalJSONTarget(target.value) before routing into the type-driven constructor: a field that already holds a non-nil pointer whose pointee implements UnmarshalJSON falls through to assignDecodedValue, which invokes the hook on that existing instance in place, the same way Go decodes into the pointee rather than allocating a replacement. decodeValueForType's Pointer branch also checks the freshly allocated pointee for UnmarshalJSON before falling back to assignStructFields, which covers the nil-field case (nothing exists to check the hook against until after allocation) and map/slice pointer-to-struct elements, which share this same branch. Added two regression tests: a nil pointer field decoding through the hook, and a non-nil pointer field decoding into the existing pointee in place, checked via object identity so a fix that just re-hooks a fresh replacement instance would not pass it. Both fail on the parent commit and pass here, verified by temporarily reverting just the source change. Verify: bunx vitest run gs/encoding/json/index.test.ts |
…gets Part of s4wave#142 (item 5). Stacked on s4wave#145 (fix/json-unmarshal-typed-containers): both need the __goType-spelling parser (goTypeDescriptor) and the type-driven decoder (decodeValueForType) that PR introduces, so this PR extends them rather than duplicating them. Please merge s4wave#145 first. Unmarshal only knew how to populate a target carrying struct-field runtime metadata (a registered, named struct type). An anonymous (inline, unnamed) struct target carries no such metadata: var loose struct { Name string `json:"name"` Age int `json:"age"` } json.Unmarshal([]byte(`{"name":"Ada","age":30}`), &loose) // loose.Name and loose.Age read back "" and 0 -- every field empty so it fell through to the untyped interface{} decode, which replaces the plain-object struct value with this override's Map representation entirely, dropping every field. Fix: an anonymous struct target still carries its Go type spelling (including field names, types, and json tags) via __goType, e.g. `*struct{Name string "json:\"name\""; Age int "json:\"age\""}`. Parses it into the same fieldMetadata shape a registered struct's fields use (parseAnonymousStructFields, splitting on the tag's Go-quoted string literal so a ';' inside a tag can't split a field early), then decodes each field through the existing per-field path (assignDecodedFieldValue/assignAnonymousStructFields) -- tags, nested slices/maps, and RawMessage fields included. Wired into both assignDecodedValue (a top-level anonymous-struct var) and decodeValueForType (an anonymous struct used as a map/slice element type, e.g. `map[string]struct{...}`). Extends gs/encoding/json/index.test.ts with two cases: a top-level anonymous struct target, and one used as a map value type. Both fail on the parent commit (empty fields / a Map instead of the plain object) and pass with this fix. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Signed-off-by: Marco Christian Krenn <marco.krenn@gmail.com>
paralin
left a comment
There was a problem hiding this comment.
No findings on the current head.
I re-checked the prior pointer-to-struct UnmarshalJSON issue. The current code checks the existing field value's hook before routing into the type-driven pointer constructor, and the Pointer branch checks the freshly allocated pointee before default struct population. That covers both non-nil and nil pointer field cases from the prior review.
Local focused checks on current head:
bun run vitest run gs/encoding/json/index.test.ts
# 26 passed
bun run typecheck
# passed
CI metadata also shows CodeQL, DCO, dependency review, and Tests green; browser-smoke is skipped.
|
Merged through to The final
Verified locally after applying the changes: |
Part of #142 (items 3, 4 and 6).
Touches the same file as #144 but a different area (
Unmarshal's decode path vsMarshal's encode path).gs/encoding/json/index.tsitself merges cleanly between the two branches; only the shared test file (index.test.ts) has a textual conflict, since both PRs happen to insert new test fixture classes right after the existingFieldAliasclass, so whichever merges second needs a trivial resolution there (keep both class blocks). No logic overlap. #146 (item 5, anonymous struct json tags) is stacked directly on this branch, since it extends the samegoTypeDescriptor/decodeValueForTypethis PR introduces, so please merge this one first.Unmarshalonly recognized a struct field/target as needing special handling when its runtime value was already an instance of the right shape (an existing struct instance, aUint8Array, etc). Everything else fell through to a generic "objects become Maps" decode with no awareness of the field's declared Go element type. Three related gaps, all fixed by the same missing piece: decoding needs to walk the field/element's TypeInfo, not just the JSON shape it happens to see.Gap 1:
map[string]Structandmap[string][]Structfields (and further nesting) left plain JS objects/arrays in place instead of real struct instances, so generated struct-field accessors read backundefined.Gap 2: an
interface{}-typed array of objects left its element objects as raw plain JS objects instead of this override's Map representation, so acase map[string]any:type switch on an element panicked calling Map-only methods like.entries().Gap 3:
json.RawMessagewasn't recognized as a map/slice element type (only as a direct field type), and there was noPointerbranch in the type-driven path at all, so pointer elements and fields (map[string]*Property,Items *ItemsSpec) leaked plain objects with undefined Go field reads.The fix is a recursive
decodeValueForType(decoded, typeInfo, opts)that decodes a raw JSON value per its GoTypeInfo:Structconstructs a real instance via the registered ctor,Slice/Maprecurse perelemType,Pointerdecodes as its pointee,json.RawMessagecaptures the raw fragment as bytes, and anything else (includinginterface{}) falls back to a newdecodeInterfaceValue, which replacesobjectToMapand additionally recurses into array elements, fixing gap 2 for both nested and top-level arrays. It is wired in at both call sites:assignDecodedFieldValue(struct fields) andassignDecodedValue's new top-level branch, which resolves a var's own type via the__goTypespelling the runtime boxes ontoUnmarshal'sanytarget, needed for e.g.var m map[string]json.RawMessagewhere there is no struct field to hang type info off of.A pointer-to-struct pointee is constructed directly in the Pointer branch (not by recursing into the Struct branch, which always marks its result via
$.markAsStructValue): the runtime's pointer type matching (matchesPointerTypeings/builtin/type.ts) requires struct pointees to stay unmarked, so bothassignDecodedFieldValue's field routing anddecodeValueForType's own Pointer branch share this construction path.goTypeDescriptoralso preserves a nested pointer element type (e.g. the*test.Propertyinmap[string]*test.Property) instead of stripping every leading*while parsing; only the outermost address-of star (theUnmarshaltarget's own&var) is stripped, byunmarshalTargetGoType, before this function ever runs.Tests extend
gs/encoding/json/index.test.tswith one case per gap:map[string][]Structfield,interface{}array-of-objects, pointer-to-struct map value and field,RawMessageas both a map and slice element, and two cases verifying decoded pointer values match*Struct(notStructby value) via the public$.is()API. All new tests fail onmaster(or the prior commit, for the pointer-marking cases) and pass with this fix, verified by temporarily reverting just the source change and confirming they fail with the errors described above.Verify:
24 passed. Full suite:
bun run typecheckclean,bun run test:js1277 passed / 4 skipped / 0 failed,go test ./...all packages green,bun run lint:jsclean.Found while transpiling and running a real ~100k-LOC Go interpreter/parser package through goscript.