Description
Feature or enhancement
Proposal:
Proposal: Add strict=True Parameter to int() for Stricter String Parsing
Summary
This proposal suggests adding a strict=True parameter to Python's built-in int() function. With it, int() would accept a string only if it consists entirely of valid numeric characters, making input validation more robust and preventing accidental conversions from malformed strings.
Motivation
Currently, int() parses numeric strings but provides no built-in way to strictly enforce that a string contains only valid digits. This can lead to unexpected behavior when handling user input or processing data.
Current Behavior
print(int("42")) # ✅ 42 (valid)
print(int("42abc")) # ❌ Raises ValueError
print(int(" 42 ")) # ✅ 42 (leading/trailing spaces are ignored)
- Problem: The current behavior allows int(" 42 ") to succeed by ignoring spaces, which may lead to unexpected results in applications that require strictly formatted numbers.
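Whitespace is not the only leniency in today's parser. The short check below lists a few other string forms that int() already accepts in current Python 3; the dictionary name is illustrative only. Whether strict=True should reject each of these is something the proposal would need to spell out.

```python
# Strings that int() accepts today, beyond plain ASCII digits.
accepted_today = {
    " 42 ": 42,     # surrounding whitespace is stripped
    "42\n": 42,     # a trailing newline counts as whitespace
    "+42": 42,      # an explicit sign is allowed
    "-42": -42,
    "1_000": 1000,  # single underscores between digits (PEP 515)
    "٤٢": 42,       # non-ASCII decimal digits (Arabic-Indic here)
}

for text, expected in accepted_today.items():
    assert int(text) == expected, text
```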
Proposed Behavior with strict=True
print(int("42", strict=True)) # ✅ 42 (valid)
print(int("42abc", strict=True)) # ❌ Raises ValueError
print(int(" 42 ", strict=True)) # ❌ Raises ValueError (spaces not allowed)
print(int("", strict=True)) # ❌ Raises ValueError (empty string)
- By default (strict=False), int() behaves as it does today.
- With strict=True, the function ensures the input is a fully numeric string, rejecting anything with non-digit characters or spaces.
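Until such a parameter exists, the proposed behavior can be approximated in pure Python. The helper below is a minimal sketch, not the proposed implementation: strict_int is a hypothetical name, it handles base 10 only, and it assumes "fully numeric" means one or more ASCII digits, so signs and underscores are rejected too.

```python
def strict_int(text):
    """Hypothetical stand-in for int(text, strict=True), base 10 only."""
    if not isinstance(text, str):
        raise TypeError("strict_int() expects a str")
    # Assumption: "fully numeric" means one or more ASCII digits, nothing else.
    # "".isdigit() is False, so the empty string is rejected as well.
    if not (text.isascii() and text.isdigit()):
        raise ValueError(f"invalid literal for strict int: {text!r}")
    return int(text)
```

Calling strict_int("42") returns 42, while strict_int(" 42 "), strict_int("42abc"), and strict_int("") each raise ValueError.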
Use Cases
- User Input Validation:
  - When accepting numeric input from users, strict=True ensures the input is correctly formatted.

    user_input = " 42 "
    num = int(user_input, strict=True)  # Raises ValueError instead of silently converting
- Data Processing:
  - When working with data files, logs, or APIs, strict=True prevents accidental conversion of malformed strings.

    data = ["100", "200 ", "30a"]
    numbers = [int(d, strict=True) for d in data]  # Raises ValueError at "200 " instead of converting it
- Security & Error Prevention:
  - Prevents cases where unexpected whitespace or special characters get silently ignored.
  - Useful in financial applications where every input must be strictly validated.
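For the data-processing case, stopping at the first malformed entry is not always what a caller wants. The sketch below separates clean values from rejected ones instead. The function names are hypothetical, and the strictness rule (ASCII digits only) is an assumption, since the proposal does not define it precisely.

```python
def is_strict_digits(text):
    """Assumed strictness rule: a non-empty str of ASCII digits only."""
    return isinstance(text, str) and text.isascii() and text.isdigit()


def split_clean(values):
    """Return (parsed numbers, rejected raw values) without raising."""
    numbers, rejected = [], []
    for value in values:
        if is_strict_digits(value):
            numbers.append(int(value))
        else:
            rejected.append(value)
    return numbers, rejected


numbers, rejected = split_clean(["100", "200 ", "30a"])
```

With the sample list from the use case, only "100" is parsed; "200 " and "30a" end up in the rejected list so they can be logged or reported.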
Proposed Implementation
The int() function's implementation is in Objects/longobject.c. The change would involve:
- Adding a strict keyword argument that defaults to False.
- Modifying PyLong_FromString() to enforce strict parsing.
Modified Code (Objects/longobject.c)
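static PyObject *
long_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
{
    static char *kwlist[] = {"x", "base", "strict", NULL};
    PyObject *x = NULL;
    int base = 10;
    int strict = 0;  /* Default: strict=False */

    /* "p" converts strict by truth value, so True/False are accepted. */
    if (!PyArg_ParseTupleAndKeywords(args, kwds, "|Oip", kwlist, &x, &base, &strict))
        return NULL;
    if (x == NULL)
        return PyLong_FromLong(0);  /* int() with no arguments */
    if (strict) {
        /* PyUnicode_IsNumeric() is not a C API function; check each code point. */
        Py_ssize_t n = PyUnicode_Check(x) ? PyUnicode_GET_LENGTH(x) : 0;
        int ok = n > 0;  /* rejects non-strings and the empty string */
        for (Py_ssize_t i = 0; ok && i < n; i++)
            ok = Py_UNICODE_ISDECIMAL(PyUnicode_READ_CHAR(x, i));
        if (!ok) {
            PyErr_SetString(PyExc_ValueError, "strict=True requires a valid numeric string.");
            return NULL;
        }
    }
    /* Sketch only: non-string x still needs the existing conversion path. */
    return PyLong_FromString(PyUnicode_AsUTF8(x), NULL, base);
}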
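Which Unicode predicate the strict check relies on matters, because "numeric" in the Unicode sense is broader than what int() can parse. The comparison below uses the existing str methods isdecimal(), isdigit(), and isnumeric() on a few sample strings; int_accepts is a hypothetical helper added only for this illustration.

```python
def int_accepts(text):
    """Return True if int() parses the string today."""
    try:
        int(text)
    except ValueError:
        return False
    return True


samples = ["42", "٤٢", "²", "½", " 42 "]

# Each entry: (isdecimal, isdigit, isnumeric, accepted by int())
report = {
    s: (s.isdecimal(), s.isdigit(), s.isnumeric(), int_accepts(s))
    for s in samples
}
```

In this comparison "²" and "½" both count as numeric yet are rejected by int(), which suggests a decimal-digit check is a closer match for the strict path than a numeric one.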
Backward Compatibility
This proposal maintains full backward compatibility by making strict=False the default.
- Existing code that relies on int() will work without modification.
- Developers can opt in to strict=True only when needed.
Testing Plan
New test cases will be added in Lib/test/test_int.py:
import unittest


class IntStrictTests(unittest.TestCase):
    def test_valid_numbers(self):
        self.assertEqual(int("42", strict=True), 42)
        self.assertEqual(int("100", strict=True), 100)

    def test_invalid_numbers(self):
        with self.assertRaises(ValueError):
            int("42abc", strict=True)
        with self.assertRaises(ValueError):
            int(" 42 ", strict=True)  # Spaces should be invalid

    def test_default_behavior(self):
        self.assertEqual(int(" 42 "), 42)  # Default remains unchanged
These tests ensure:
- strict=True enforces clean numeric input.
- strict=False maintains existing behavior.
Documentation Updates
The int() function documentation in Doc/library/functions.rst will be updated:
int(x=0, base=10, strict=False)
--------------------------------
...
If `strict=True`, `x` must be a fully numeric string with no extra characters.
This ensures input validation without affecting the default behavior.
Next Steps
- Modify CPython source code and submit a Pull Request.
- Discuss with core developers for feedback.
- Merge changes into an upcoming Python version.
Conclusion
Adding strict=True to int() enhances input validation, improves security, and provides a simple yet effective way to enforce stricter parsing, all without breaking backward compatibility.
Has this already been discussed elsewhere?
This is a minor feature, which does not need previous discussion elsewhere
Links to previous discussion of this feature:
No response