
BUG: inconsistent comparison results when using PyArrow backend. #62342


Description

@lenardkoomen-fin

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import numpy as np
import pandas as pd

# df_comparison: the input frame described in the Issue Description below
# (attached in df.zip as .parquet/.feather exports)

for do_parq_roundtrip in [False, True]:
    for do_fillna in [False, True]:
        df = df_comparison.copy().reset_index(drop=True)

        print("=" * 100)
        parq_str = "with" if do_parq_roundtrip else "without"
        fillna_str = "with" if do_fillna else "without"
        print(f"{parq_str} parquet roundtrip, {fillna_str} fillna")
        print("=" * 100)

        # Do parquet roundtrip
        if do_parq_roundtrip:
            df.to_parquet("df.parquet", index=True)
            df = pd.read_parquet("df.parquet")

        print("Column datatypes:")
        print(df.dtypes)
        print()

        # Detect mismatch
        if do_fillna:
            is_mismatch = df.col_a.fillna(2) != df.col_b.fillna(2)
        else:
            is_mismatch = df.col_a != df.col_b

        print("Mismatch detected @:")
        print(np.argwhere(is_mismatch.fillna(False)))
        print()

        df["is_mismatch"] = is_mismatch

        # Print rows where there is a detected mismatch
        print("Detected mismatch rows:")
        print(df[df.is_mismatch])
        print("=" * 100)
        print()

Issue Description

I get inconsistent results when comparing two int64[pyarrow] columns, depending on whether I apply fillna before comparing and on whether I first write the dataframe to parquet and read it back.

The input data is the result of merging two dataframes, each obtained from df = pd.read_sql(f"select * from some.Table", some_connection, dtype_backend="pyarrow"). I've redacted the data by selecting only certain columns and renaming them. The dataframe contains two columns (col_a and col_b) that are compared; both are of dtype int64[pyarrow] and contain either 0, 1, or NA. There is also a third column, ManualIndex, which I added to make sure nothing cheeky is happening with the index, but which is probably irrelevant.
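
For readers without the attachment, a minimal frame with the same column dtypes might look like the sketch below. The values are made up and this synthetic frame is not expected to reproduce the bug on its own; it only illustrates the dtype setup.

import pandas as pd

# Synthetic stand-in for df_comparison: two int64[pyarrow] columns
# holding 0, 1 or NA, plus the ManualIndex helper column (values are made up).
df_comparison = pd.DataFrame(
    {
        "ManualIndex": range(4),
        "col_a": pd.array([0, 1, 1, 1], dtype="int64[pyarrow]"),
        "col_b": pd.array([0, 1, None, None], dtype="int64[pyarrow]"),
    }
)
print(df_comparison.dtypes)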

I've attached .parquet and .feather exports of the data. Keep in mind that storing and then re-loading the data apparently affects the output of the comparison, so loading this data and running the script probably gives a different output than the one I've posted below.

df.zip

The output of the script gives me the following results:

====================================================================================================
without parquet roundtrip, without fillna
====================================================================================================
Column datatypes:
ManualIndex             int64
col_a          int64[pyarrow]
col_b          int64[pyarrow]
dtype: object

Mismatch detected @:
[[252518]
 [252519]]

Detected mismatch rows:
        ManualIndex  col_a  col_b is_mismatch
252518       252518      1   <NA>        <NA>
252519       252519      1   <NA>        <NA>
====================================================================================================

====================================================================================================
without parquet roundtrip, with fillna
====================================================================================================
Column datatypes:
ManualIndex             int64
col_a          int64[pyarrow]
col_b          int64[pyarrow]
dtype: object

Mismatch detected @:
[[252512]
 [252513]
 [252518]
 [252519]]

Detected mismatch rows:
        ManualIndex  col_a  col_b is_mismatch
252512       252512      1      1        True
252513       252513      1      1        True
252518       252518      1   <NA>        True
252519       252519      1   <NA>        True
====================================================================================================

====================================================================================================
with parquet roundtrip, without fillna
====================================================================================================
Column datatypes:
ManualIndex             int64
col_a          int64[pyarrow]
col_b          int64[pyarrow]
dtype: object

Mismatch detected @:
[]

Detected mismatch rows:
Empty DataFrame
Columns: [ManualIndex, col_a, col_b, is_mismatch]
Index: []
====================================================================================================

====================================================================================================
with parquet roundtrip, with fillna
====================================================================================================
Column datatypes:
ManualIndex             int64
col_a          int64[pyarrow]
col_b          int64[pyarrow]
dtype: object

Mismatch detected @:
[[252505]
 [252506]]

Detected mismatch rows:
        ManualIndex  col_a  col_b is_mismatch
252505       252505      1      1        True
252506       252506      1      1        True
====================================================================================================

As you can see, the results are different for each combination of settings.

Expected Behavior

My expectations:

  • I expect the roundtrip to a parquet file and back to have no bearing on the comparison at all.
  • I expect is_mismatch to be True only at ManualIndex 252518 and 252519 when using fillna(2) before comparing.

  • I expect is_mismatch to be NA at ManualIndex 252518 and 252519 when not using fillna before comparing (see the sketch after this list).
  • I expect is_mismatch to be False everywhere else.
  • I expect that when is_mismatch is NA that it does not get found by np.argwhere or result in returned rows when using it to index into a dataframe.
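
As a minimal sketch of the comparison semantics I expect (small made-up values, not the attached data):

import pandas as pd

a = pd.Series([1, 1, 0], dtype="int64[pyarrow]")
b = pd.Series([1, None, 1], dtype="int64[pyarrow]")

# Without fillna, the NA should propagate into the comparison result:
print(a != b)                       # expected: False, <NA>, True
# With a sentinel filled in first, the result should be plain booleans:
print(a.fillna(2) != b.fillna(2))   # expected: False, True, True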

I don't understand how the rows where col_a and col_b are both 1 can ever be reported as unequal by the comparison. When I filter down to those rows and do the comparison manually, it suddenly evaluates to them being equal.
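
For reference, that manual re-check looks roughly like this (a sketch, assuming df is the frame from the reproducible example above and using the row labels flagged in the output):

# Spot check of rows that were flagged despite both columns being 1:
sub = df.loc[[252512, 252513], ["col_a", "col_b"]]
print(sub["col_a"] != sub["col_b"])   # comparing just these rows evaluates to False (i.e. equal)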

In short, I am very confused. Am I doing something wrong here?

Installed Versions

INSTALLED VERSIONS

commit : 4665c10
python : 3.13.4
python-bits : 64
OS : Windows
OS-release : 11
Version : 10.0.26100
machine : AMD64
processor : Intel64 Family 6 Model 106 Stepping 6, GenuineIntel
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : English_Netherlands.1252

pandas : 2.3.2
numpy : 2.2.6
pytz : 2025.2
dateutil : 2.9.0.post0
pip : None
Cython : None
sphinx : None
IPython : 9.2.0
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : 4.13.4
blosc : None
bottleneck : None
dataframe-api-compat : None
fastparquet : None
fsspec : None
html5lib : 1.1
hypothesis : None
gcsfs : None
jinja2 : 3.1.6
lxml.etree : 5.4.0
matplotlib : 3.10.3
numba : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
psycopg2 : None
pymysql : None
pyarrow : 21.0.0
pyreadstat : None
pytest : 8.3.5
python-calamine : None
pyxlsb : None
s3fs : None
scipy : None
sqlalchemy : 2.0.42
tables : None
tabulate : None
xarray : None
xlrd : None
xlsxwriter : None
zstandard : None
tzdata : 2025.2
qtpy : None
pyqt5 : None
