Skip to content

MDEV-39690: UBSAN: signed integer overflow in my_strntoll_8bit, my_strntoll_mb2_or_mb4 during BLOB-to-integer conversion#5295

Open
raghunandanbhat wants to merge 1 commit into
10.11from
10.11-mdev-39690
Open

MDEV-39690: UBSAN: signed integer overflow in my_strntoll_8bit, my_strntoll_mb2_or_mb4 during BLOB-to-integer conversion#5295
raghunandanbhat wants to merge 1 commit into
10.11from
10.11-mdev-39690

Conversation

@raghunandanbhat

Copy link
Copy Markdown
Contributor

fixes MDEV-39690

Problem:

When converting a string like '-9223372036854775808' to an integer, the parsed magnitude (2^63) equals (ulonglong) LONGLONG_MIN and is accepted as valid. The return expression then cast it to signed (LONGLONG_MIN) and negated it. Negating LONGLONG_MIN is signed integer overflow, i.e. undefined behaviour.

Fix:

Negate in unsigned arithmetic, where wrap-around is well defined, and convert to signed once: (longlong) (0ULL - i). 0ULL - 2^63 wraps to 0x8000000000000000, which as signed longlong is LONGLONG_MIN.

…_strntoll_mb2_or_mb4` during BLOB-to-integer conversion

Problem:
  When converting a string like '-9223372036854775808' to an integer,
  the parsed magnitude (2^63) equals `(ulonglong) LONGLONG_MIN` and is
  accepted as valid. The return expression then cast it to signed
  (LONGLONG_MIN) and negated it. Negating LONGLONG_MIN is signed integer
  overflow, i.e. undefined behaviour.

Fix:
  Negate in unsigned arithmetic, where wrap-around is well defined, and
  convert to signed once: (longlong) (0ULL - i). 0ULL - 2^63 wraps to
  `0x8000000000000000`, which as signed longlong is LONGLONG_MIN.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request addresses MDEV-39690 by fixing a signed integer overflow in my_strntoll_8bit and my_strntoll_mb2_or_mb4 during BLOB-to-integer conversions, and adds corresponding test cases. While the proposed fix uses unsigned arithmetic to prevent signed overflow, casting the resulting out-of-range unsigned value to a signed longlong is implementation-defined in C. It is recommended to explicitly check for the LONGLONG_MIN boundary case to ensure fully well-defined behavior.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread strings/ctype-simple.c
}

return (negative ? -((longlong) i) : (longlong) i);
return (negative ? (longlong) (0ULL - i) : (longlong) i);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

While negating in unsigned arithmetic (0ULL - i) avoids signed integer overflow, casting the resulting out-of-range unsigned value (0x8000000000000000ULL when i is LONGLONG_MIN's magnitude) to a signed longlong is technically implementation-defined behavior in C (C99/C11 §6.3.1.3p3).

We can completely avoid both signed integer overflow and implementation-defined behavior by explicitly checking for the LONGLONG_MIN boundary case. Since i is already validated to be within range, if i is equal to (ulonglong) LONGLONG_MIN, we can return LONGLONG_MIN directly. Otherwise, i is guaranteed to fit within a signed longlong, making the cast and negation fully well-defined.

  if (negative)
    return i == (ulonglong) LONGLONG_MIN ? LONGLONG_MIN : -((longlong) i);
  return (longlong) i;

Comment thread strings/ctype-ucs2.c
}

return (negative ? -((longlong)res) : (longlong)res);
return (negative ? (longlong) (0ULL - res) : (longlong)res);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

While negating in unsigned arithmetic (0ULL - res) avoids signed integer overflow, casting the resulting out-of-range unsigned value (0x8000000000000000ULL when res is LONGLONG_MIN's magnitude) to a signed longlong is technically implementation-defined behavior in C (C99/C11 §6.3.1.3p3).

We can completely avoid both signed integer overflow and implementation-defined behavior by explicitly checking for the LONGLONG_MIN boundary case. Since res is already validated to be within range, if res is equal to (ulonglong) LONGLONG_MIN, we can return LONGLONG_MIN directly. Otherwise, res is guaranteed to fit within a signed longlong, making the cast and negation fully well-defined.

  if (negative)
    return res == (ulonglong) LONGLONG_MIN ? LONGLONG_MIN : -((longlong) res);
  return (longlong) res;

@raghunandanbhat raghunandanbhat requested a review from abarkov June 29, 2026 14:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

1 participant