Skip to content

Conversation

@juliannguyen4
Copy link
Collaborator

@juliannguyen4 juliannguyen4 commented Jan 21, 2026

This also fixes a bug where passing in a digest with embedded NULL characters will raise a ValueError exception.

Extra changes

Add missing test cases for aerospike.get_partition_id()

TODO

  • Review https://docs.python.org/3/c-api/buffer.html
  • Code coverage looks ok
  • Build artifacts passes
  • Valgrind shows no memory errors or leaks from these changes
  • Massif usage looks ok
  • Add to breaking changes docs page for incoming client release

Docs

https://aerospike-python-client--928.org.readthedocs.build/en/928/aerospike.html#aerospike.get_partition_id

@codecov-commenter
Copy link

codecov-commenter commented Jan 21, 2026

Codecov Report

❌ Patch coverage is 89.47368% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 83.38%. Comparing base (144f859) to head (53aa749).
⚠️ Report is 8 commits behind head on dev.

Files with missing lines Patch % Lines
src/main/calc_digest.c 89.47% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##              dev     #928      +/-   ##
==========================================
+ Coverage   83.31%   83.38%   +0.06%     
==========================================
  Files          99       99              
  Lines       14422    14438      +16     
==========================================
+ Hits        12016    12039      +23     
+ Misses       2406     2399       -7     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@juliannguyen4 juliannguyen4 changed the title [CLIENT-4086] Have aerospike.get_partition_id() take in a bytearray as argument instead of str [CLIENT-4086] Have aerospike.get_partition_id() take in a bytearray as argument instead of a tuple containing a str Jan 21, 2026
@juliannguyen4 juliannguyen4 changed the title [CLIENT-4086] Have aerospike.get_partition_id() take in a bytearray as argument instead of a tuple containing a str [CLIENT-4086] (Breaking change) Add missing documentation for aerospike.get_partition_id(). Have aerospike.get_partition_id() take in a bytearray as argument instead of a tuple containing a str. Jan 22, 2026
@juliannguyen4 juliannguyen4 changed the title [CLIENT-4086] (Breaking change) Add missing documentation for aerospike.get_partition_id(). Have aerospike.get_partition_id() take in a bytearray as argument instead of a tuple containing a str. [CLIENT-4086] (Breaking change) Have aerospike.get_partition_id() take in a bytearray as argument instead of a tuple containing a str. Add missing documentation for aerospike.get_partition_id(). Jan 27, 2026
@juliannguyen4 juliannguyen4 marked this pull request as ready for review January 27, 2026 23:23
Copy link
Collaborator

@DomPeliniAerospike DomPeliniAerospike left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code looked good! Left a few suggestions/comments.

// Python Function Argument Parsing
if (PyArg_Parse(args, "(s)", &digest) == false) {
return NULL;
if (PyArg_Parse(arg, "y*", &py_buffer) == false) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To optimize, you should change to "y#" in order to avoid creating an entire py_buffer object:

Since you don't need to write to the buffer, using read only mode should be faster.
Snippet:

`
const uint8_t *buf;
Py_ssize_t len;

if (!PyArg_ParseTuple(args, "y#", &buf, &len)) {
    return NULL;  // PyArg_ParseTuple already sets an exception
}

`

Also requires changing METH_O back to METH_VARARGS.

Copy link
Collaborator Author

@juliannguyen4 juliannguyen4 Jan 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think switching to "y#" would make get_partition_id() and calc_digest() harder to use. The user can no longer pass a digest from aerospike.calc_digest() directly into aerospike.get_partition_id(). calc_digest() returns a bytearray which is mutable, and "y#" only accepts a "read-only bytes-like object"

y# documentation: https://docs.python.org/3/c-api/arg.html#:~:text=accept%20binary%20data.-,y#,-(read%2Donly%20bytes

aerospike.calc_digest() has returned a bytearray for a while now: https://aerospike-python-client.readthedocs.io/en/8.0.0/aerospike.html#aerospike.calc_digest

To give more context, this PR is for aerospike.get_partition_id() debugging QE's bank test. (the Jira ticket explains more about what aerospike.get_partition_id() is used for) I don't believe customers are using this in production.

We could make a breaking change to aerospike.calc_digest() to return a bytes object instead of bytearray; bytes is immutable whereas the latter is mutable. But I feel like that's outside the scope of this PR


.. py:function:: get_partition_id(digest) -> int

Calculate the partition id for a record using its digest.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't the partition_id associated with key's digest? It doesn't have to be a record right?

Also Nit: Capital ID looks better in documentation than id.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point, will fix

import pytest

# This isn't a correctness test. It's only for code coverage purposes
# and to make sure the API is aligned with the documentation
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will QE be testing the correctness of this? Should be right, but we should confirm with testing. Should be easy to verify with AS_POLICY_KEY_DIGEST.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll just manually check using gdb. I don't think it's necessary to write a correctness test since customers shouldn't be using this (it's only meant for internal testing)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm starting to see that maybe this API call isn't necessary.. we can just use gdb to print the partition id...


Calculate the partition id for a record using its digest.

:param bytes-like object digest: the record's digest calculated by :py:meth:`aerospike.calc_digest`.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This description suggest calc_digest is mandatory, but if you have a digest from a record result, you don't need calc_digest.
Might be better to say:
The records digest. To calculate the digest, use :py:meth...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants