What version of protobuf and what language are you using?
Version: 7.34.1 (pip package), also verified against current main branch source at python/ directory
Language: Python (upb C backend)
What supported operating system version are you using?
Linux (Ubuntu, x86_64)
What supported runtime / compiler version are you using?
Python 3.14.3+ (also affects 3.10-3.13)
What did you do?
We performed a systematic code review of the 10 C source files in python/ (the upb-based Python backend) using static analysis and manual review. We found multiple confirmed bugs, 7 of which we reproduced from pure Python scripts. Full report with analysis details: https://gist.github.com/devdanzin/b0eca1ded22efac92de4efe928916d46
The reproducible bugs are:
1. Segfault via strcmp(NULL) in numpy type detection (convert.c:213-222)
from google.protobuf import descriptor_pb2, descriptor_pool, message_factory

pool = descriptor_pool.DescriptorPool()
pool.Add(descriptor_pb2.FileDescriptorProto(
    name="test.proto", syntax="proto3",
    message_type=[descriptor_pb2.DescriptorProto(
        name="TestMsg",
        # type=2 is TYPE_FLOAT, label=1 is LABEL_OPTIONAL
        field=[descriptor_pb2.FieldDescriptorProto(
            name="val", number=1, type=2, label=1)],
    )],
))
TestMsg = message_factory.GetMessageClass(pool.FindMessageTypeByName("TestMsg"))

# Metaclass that makes the lookup of __name__ on the class fail.
class NamelessMeta(type):
    def __getattribute__(cls, name):
        if name == "__name__":
            raise AttributeError("no __name__")
        return super().__getattribute__(name)

class Nameless(metaclass=NamelessMeta):
    def __float__(self):
        return 1.0

msg = TestMsg()
msg.val = Nameless()  # Segmentation fault (core dumped)
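Because this repro kills the interpreter, it can be convenient to run it in a child process and inspect the exit status. The helper below is only a sketch and is not part of protobuf; `run_isolated` and `describe` are hypothetical names. It assumes a POSIX system, where `subprocess` reports a child killed by a signal as a negative return code, while a shell reports the same SIGSEGV as exit code 139 (128 + 11).

```python
import signal
import subprocess
import sys


def run_isolated(code: str, timeout: float = 60.0) -> int:
    """Run `code` in a fresh interpreter and return its return code."""
    proc = subprocess.run(
        [sys.executable, "-c", code],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        timeout=timeout,
    )
    return proc.returncode


def describe(returncode: int) -> str:
    """Translate a subprocess return code into a readable status."""
    if returncode < 0:
        # POSIX: a negative value means the child was killed by that signal.
        return f"killed by {signal.Signals(-returncode).name}"
    return f"exited with status {returncode}"
```

Passing the repro script above as `code` should, on an affected build, produce "killed by SIGSEGV" without taking down the calling process.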
2. MessageMeta_GetAttr swallows KeyboardInterrupt/MemoryError/SystemExit (message.c:2019)
# (same setup as above to get TestMsg)
class BombDescriptor:
    def __get__(self, obj, objtype=None):
        raise KeyboardInterrupt("should not be swallowed")

TestMsg.bomb = BombDescriptor()
try:
    TestMsg.bomb
except AttributeError:
    print("BUG: KeyboardInterrupt swallowed, replaced with AttributeError")
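The difference between the observed and the expected behavior can be modeled in pure Python. This is a sketch of the semantics only, not a transcription of message.c: it assumes the C code discards whatever error is pending before raising its own `AttributeError`. All names below (`lookup_swallowing`, `lookup_narrow`, `Bomb`, `raised`) are hypothetical.

```python
def lookup_swallowing(obj, name):
    """Models the reported behavior: any failure becomes AttributeError."""
    try:
        return getattr(obj, name)
    except BaseException:
        raise AttributeError(name) from None


def lookup_narrow(obj, name):
    """Models the expected behavior: only AttributeError is handled."""
    try:
        return getattr(obj, name)
    except AttributeError:
        raise AttributeError(name) from None


class Bomb:
    @property
    def bomb(self):
        raise KeyboardInterrupt("should not be swallowed")


def raised(fn, *args):
    """Return the type of the exception raised by fn(*args), or None."""
    try:
        fn(*args)
    except BaseException as exc:
        return type(exc)
    return None
```

`KeyboardInterrupt`, `SystemExit` and `MemoryError` should all pass through the narrow variant untouched, since none of them is a subclass of `AttributeError`.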
3. extension_ranges leaks memory on every access (descriptor.c:340-342)
import tracemalloc, gc
from google.protobuf import descriptor_pb2, descriptor_pool

pool = descriptor_pool.DescriptorPool()
pool.Add(descriptor_pb2.FileDescriptorProto(
    name="test.proto", syntax="proto3",
    message_type=[descriptor_pb2.DescriptorProto(
        name="TestMsg",
        field=[descriptor_pb2.FieldDescriptorProto(name="x", number=1, type=5, label=1)],
        extension_range=[
            descriptor_pb2.DescriptorProto.ExtensionRange(start=100, end=200),
            descriptor_pb2.DescriptorProto.ExtensionRange(start=200, end=300),
        ],
    )],
))
desc = pool.FindMessageTypeByName("TestMsg")

tracemalloc.start()
gc.collect()
before = tracemalloc.get_traced_memory()[0]
for _ in range(10000):
    _ = desc.extension_ranges
gc.collect()
after = tracemalloc.get_traced_memory()[0]
print(f"Leaked {after - before} bytes over 10000 accesses")
# Leaked 321184 bytes over 10000 accesses
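To compare leak rates across attributes or builds, the same measurement can be wrapped in a small helper that reports average growth per call. This is a sketch using only the standard library; `bytes_per_call` is a hypothetical name, and it only sees allocations that go through allocators `tracemalloc` tracks.

```python
import gc
import tracemalloc


def bytes_per_call(fn, calls: int = 10_000) -> float:
    """Average traced-memory growth per call of `fn`, in bytes."""
    fn()  # warm-up so one-time caches are not counted as growth
    gc.collect()
    started_here = not tracemalloc.is_tracing()
    if started_here:
        tracemalloc.start()
    try:
        before = tracemalloc.get_traced_memory()[0]
        for _ in range(calls):
            fn()
        gc.collect()
        after = tracemalloc.get_traced_memory()[0]
    finally:
        # Only stop tracing if this helper was the one that started it.
        if started_here:
            tracemalloc.stop()
    return (after - before) / calls
```

With the setup above, `bytes_per_call(lambda: desc.extension_ranges)` would be the call to make; the figure printed by the repro corresponds to 321184 / 10000, or about 32 bytes per access.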
4. Descriptor container RichCompare swallows exceptions (descriptor_containers.c:249,527,752)
# (same setup to get desc)
fields = desc.fields_by_name

class BadEq:
    def __eq__(self, other):
        raise RuntimeError("comparison error")
    def __iter__(self):
        return iter([])
    def __len__(self):
        return 0

print(fields == BadEq())
# Prints False -- RuntimeError silently swallowed
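For reference, built-in containers let such an exception propagate. `dict.__eq__` and `list.__eq__` return `NotImplemented` for an operand of another type, Python then tries the reflected `BadEq.__eq__`, and its `RuntimeError` reaches the caller. The sketch below shows only that built-in behavior and makes no claim about how descriptor_containers.c performs the comparison; `compare` is a hypothetical helper.

```python
class BadEq:
    def __eq__(self, other):
        raise RuntimeError("comparison error")

    def __iter__(self):
        return iter([])

    def __len__(self):
        return 0


def compare(left, right):
    """Return the result of left == right, or the exception type it raises."""
    try:
        return left == right
    except Exception as exc:
        return type(exc)
```

Here `compare({}, BadEq())` and `compare([], BadEq())` both report `RuntimeError`, which is the behavior we would expect from the descriptor containers as well.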
5. pop() clamps out-of-range index instead of raising IndexError (repeated.c:423-430)
# (same setup to get TestMsg with repeated int32 'values' field)
msg = TestMsg()
msg.values.extend([10, 20, 30])
print(msg.values.pop(999999))
# Prints 30 -- should raise IndexError
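The bounds check that built-in `list.pop` applies can be written out as below. This is a pure-Python sketch of the expected semantics, not a proposed patch for repeated.c; `checked_pop` is a hypothetical name.

```python
def checked_pop(items: list, index: int = -1):
    """Pop with the bounds check that built-in list.pop performs."""
    size = len(items)
    if size == 0:
        raise IndexError("pop from empty list")
    # Negative indices count from the end, as with built-in lists.
    normalized = index + size if index < 0 else index
    if not 0 <= normalized < size:
        raise IndexError("pop index out of range")
    return items.pop(normalized)
```

With this check, `checked_pop([10, 20, 30], 999999)` raises `IndexError` instead of returning 30, matching `[10, 20, 30].pop(999999)`.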
6-7. C/Python parity gaps: extended slice assignment works on C but raises ValueError on pure Python fallback; RepeatedComposite.__eq__ with lists returns True on C but raises TypeError on pure Python.
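Parity gaps like these can be checked by running one snippet under each backend in a separate interpreter. The sketch below relies on the `PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION` environment variable to select the backend; `backend_env` and `run_under` are hypothetical helper names, and the snippet passed as `code` is left to the reader.

```python
import os
import subprocess
import sys

ENV_VAR = "PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"


def backend_env(backend: str) -> dict:
    """Copy of the current environment with the protobuf backend selected."""
    if backend not in ("upb", "python"):
        raise ValueError(f"unexpected backend: {backend!r}")
    env = dict(os.environ)
    env[ENV_VAR] = backend
    return env


def run_under(backend: str, code: str) -> subprocess.CompletedProcess:
    """Run `code` in a fresh interpreter using the given backend."""
    return subprocess.run(
        [sys.executable, "-c", code],
        env=backend_env(backend),
        capture_output=True,
        text=True,
    )
```

Comparing `run_under("upb", code)` with `run_under("python", code)` for the same `code` shows whether both backends agree. Having the snippet also print `api_implementation.Type()` from `google.protobuf.internal` is one way to confirm which backend was actually loaded.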
What did you expect to see
1. TypeError or ValueError instead of segfault
2. KeyboardInterrupt propagated, not replaced with AttributeError
3. No memory growth from repeated extension_ranges access
4. RuntimeError propagated from comparison, not silently swallowed
5. IndexError for out-of-range pop() index
6-7. Consistent behavior between C and Python backends
What did you see instead?
1. Segmentation fault (core dumped) (exit code 139)
2. AttributeError raised instead of KeyboardInterrupt
3. ~32 bytes leaked per access (321KB over 10K calls with 2 extension ranges)
4. False returned silently
5. Last element silently removed and returned
6-7. C backend accepts operations that Python backend rejects
Anything else we should know about your project / environment
The full analysis report covers approximately 30 unique bugs found across all 10 C files, including additional issues that require OOM to trigger (unchecked malloc/realloc, unchecked PyType_GenericAlloc at 15+ sites, double-free in descriptor container Items error paths) and a Py_DECREF on a stack-allocated C array (map.c:549) that is undefined behavior on every module import. The complete report with detailed analysis per bug is at the gist linked above.
The code review also identified 9 unguarded PyErr_Clear() calls, missing m_traverse/m_clear for module state with ~26 PyObject* members, and C/Python parity gaps in MergeFrom input handling and ExtensionDict.__eq__ semantics.
The code review was done with https://github.com/devdanzin/cext-review-toolkit.