Skip to content

Optimize RedisValue / Equals#3116

Open
pairbit wants to merge 2 commits into
StackExchange:mainfrom
pairbit:EqualsAndGetHashCode
Open

Optimize RedisValue / Equals#3116
pairbit wants to merge 2 commits into
StackExchange:mainfrom
pairbit:EqualsAndGetHashCode

Conversation

@pairbit

@pairbit pairbit commented Jun 26, 2026

Copy link
Copy Markdown
Contributor

I suggest converting the string to utf8 bytes and comparing the blobs. The tests work.
My position is that if a person uses strings, let their productivity suffer, not the one who chose the blobs.

I understand that the hash of a string and the hash of a string inside a RedisValue are different, but I don't think this is necessary.

var str = "my string";
RedisValue orig = str;
var hashCode = orig.GetHashCode();
Assert.That(hashCode, Is.Not.EqualTo(str.GetHashCode()));

I also found a bug in the StartsWith(ReadOnlySpan<byte> value) method for non-ASCII characters.
The test below is failing.

var str = "моя строка";
RedisValue orig = str;

Assert.That(orig.StartsWith(str), Is.EqualTo(orig.StartsWith(Encoding.UTF8.GetBytes(str))));

@mgravell

mgravell commented Jun 26, 2026

Copy link
Copy Markdown
Collaborator

One word: Marvin

You may already know, but "Marvin" is the randomising voodoo inside most recent string hashing / equality implementations that prevents dictionary attacks using precalculated predictable known values to force collisions (hash-flooding). Marvin introduces entropy per-process, preventing that. It is hugely desirable to retain Marvin semantics by default.


Longer version: if you're using BLOBs of any non-trivial value, you're unlikely to be using them as keys on dictionaries, or otherwise performing equality checks, so honestly: IMO: don't worry about it, it doesn't adversely affect you. You're probably just slinging BLOBs around, (de)serializing them, etc.

Perhaps what we really want here is a couple of pre-rolled IEqualityComparer<RedisValue> implementations: one optimized for string-like (the default), one optimized for BLOB-like (which you'd use, in your hypothetical use-case). I think giving people that option is a much safer and more versatile proposition.

Note: it would actually not be a bad idea to try to retain some semblance of Marvin in a BLOB-based implementation, but that's a stretch goal.

@pairbit

pairbit commented Jun 26, 2026

Copy link
Copy Markdown
Contributor Author

I suggest separating the tasks. Hash calculation depends on the algorithm, and UTF-8 hash differs from Unicode hash. However, this doesn't affect the Equals and StartsWith methods.

@pairbit

pairbit commented Jun 26, 2026

Copy link
Copy Markdown
Contributor Author

Another question is, why don't you use Marvin in RedisKey?

@pairbit pairbit changed the title Optimize RedisValue / GetHashCode and Equals Optimize RedisValue / Equals Jun 26, 2026
@mgravell

mgravell commented Jun 26, 2026

Copy link
Copy Markdown
Collaborator

Fair question. I suspect there's a lot of validity in the suggestion here. We need to be careful, but: string -> byte is at least well defined, where-as byte -> string isn't (in the general case).

I might see if I can drag the private Marvin core over - I did the same when over at aspnet for netfx, so I'll see if I can find it.

@pairbit

pairbit commented Jun 26, 2026

Copy link
Copy Markdown
Contributor Author

The method int AddHashCode(ReadOnlySpan<byte> span, int acc) did not work correctly with Sequence, so I rolled back the changes and left only Equals

@pairbit

pairbit commented Jun 26, 2026

Copy link
Copy Markdown
Contributor Author

The GetHashCode question turned out to be complicated, so I suggest not considering it in this PR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants