
Memory leak in Inngest long-running workers #347

@vaibhav-46

Description

PydanticSerializer creates new TypeAdapter on every serialize/deserialize call, causing memory leak in long-running workers

Summary

PydanticSerializer in serializer_lib.py creates a new pydantic.TypeAdapter instance on every call to serialize() and deserialize(). Each TypeAdapter instantiation triggers Pydantic's core schema generation, which registers type metadata in Pydantic's internal global registry that is never freed. In long-running connect workers processing many functions, this causes a steady memory leak (~22 MB/hour in our production environment).

Environment

  • inngest version: 0.5.18 (with inngest[connect])
  • pydantic version: 2.11.7
  • pydantic-core version: 2.33.2
  • Python version: 3.11.15
  • Deployment: 5 connect workers running on Kubernetes, processing background jobs continuously

The problem

In inngest/_internal/serializer_lib.py:

class PydanticSerializer(Serializer):
    def serialize(self, obj: object, typ: object) -> object:
        adapter = pydantic.TypeAdapter(object)        # new instance every call
        return adapter.dump_python(obj, mode="json")

    def deserialize(self, obj: object, typ: object) -> object:
        adapter = pydantic.TypeAdapter[object](typ)   # new instance every call
        return adapter.validate_python(obj)

TypeAdapter.__init__ is not a lightweight operation. It triggers Pydantic's full core schema generation pipeline, which:

  1. Builds a core schema representation for the type
  2. Creates SchemaValidator and SchemaSerializer objects (Rust-backed via pydantic-core)
  3. Registers type metadata (FieldInfo, ModelMetaclass, MockValSer, etc.) in Pydantic's internal type registry
  4. Creates associated Python objects: type metaclasses, dicts, functions, set/frozensets, ReferenceType weakrefs

The key issue is that Pydantic's internal type registry holds strong references to the generated schema objects. Even though the adapter local variable goes out of scope, the validators, serializers, and type metadata created during schema generation are retained for the lifetime of the process. They are never garbage collected.

This is called from the step execution path — every step.run(), step.invoke(), and function return triggers serialize/deserialize, so a single function with N steps creates ~2N TypeAdapter instances.
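To put a rough number on the per-call cost, here is a self-contained tracemalloc sketch. Exact figures vary by Pydantic version and platform, and tracemalloc only sees Python-side allocations (pydantic-core's Rust-backed SchemaValidator/SchemaSerializer memory is invisible to it), so treat the result as a lower bound:

```python
import tracemalloc

import pydantic

tracemalloc.start()
before = tracemalloc.take_snapshot()

# Build 100 adapters for the same type, as the serializer does once per call
adapters = [pydantic.TypeAdapter(object) for _ in range(100)]

after = tracemalloc.take_snapshot()
tracemalloc.stop()

# Sum the Python-heap growth attributable to the loop above
total = sum(stat.size_diff for stat in after.compare_to(before, "lineno"))
print(f"~{total / 100 / 1024:.1f} KiB of Python-side allocations per TypeAdapter")
```

Keeping the `adapters` list alive mirrors what Pydantic's registry retention does in the real code: the memory is reachable and therefore never reclaimed by gc.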

Reproduction

Minimal script demonstrating the leak:

import gc
import os
import pydantic

def get_rss_mb():
    """Get current RSS in MB from /proc (Linux) or resource module."""
    try:
        with open(f"/proc/{os.getpid()}/status") as f:
            for line in f:
                if line.startswith("VmRSS:"):
                    return int(line.split()[1]) / 1024
    except FileNotFoundError:
        # macOS fallback: no /proc. Note ru_maxrss is peak (not current) RSS,
        # reported in bytes on macOS (KiB on Linux), so convert from bytes.
        import resource
        return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / (1024 * 1024)

gc.collect()
baseline_rss = get_rss_mb()
baseline_objects = len(gc.get_objects())

# Simulate what PydanticSerializer does on every call
for i in range(10_000):
    adapter = pydantic.TypeAdapter(object)
    adapter.dump_python({"key": "value"}, mode="json")

gc.collect()
final_rss = get_rss_mb()
final_objects = len(gc.get_objects())

print(f"RSS:        {baseline_rss:.1f} MB -> {final_rss:.1f} MB (+ {final_rss - baseline_rss:.1f} MB)")
print(f"GC objects: {baseline_objects:,} -> {final_objects:,} (+ {final_objects - baseline_objects:,})")

Production data

We profiled a live production connect worker (5,529 execution requests in 47 minutes on one pod) by injecting GC audit scripts via gdb and comparing two snapshots:

GC object growth in 47 minutes (single pod):

Object type      Delta    What it is
type             +156     New validator/serializer type objects
ModelMetaclass   +82      Pydantic model metaclasses
FieldInfo        +270     Pydantic field descriptors
MockValSer       +158     Pydantic mock validators/serializers
ReferenceType    +3,860   Weakrefs to all of the above
dict             +1,094   Module/class __dict__s
function         +906     Methods on the new classes

Total GC-tracked growth: 1.89 MB in 47 minutes (2.4 MB/hour)
Total RSS growth: ~22 MB/hour (the gap is CPython arena fragmentation from the constant alloc/free churn)

Over 72 hours without a deploy, individual pods climb from ~700 MB to ~1.5 GB RSS. Pods have a 2 GiB memory limit. Currently relying on periodic deploys as an accidental memory pressure relief valve.

Suggested fix

Cache TypeAdapter instances since they are deterministic and reusable for a given type:

from functools import lru_cache

import pydantic


@lru_cache(maxsize=256)
def _get_type_adapter(typ: type) -> pydantic.TypeAdapter:  # type: ignore[type-arg]
    return pydantic.TypeAdapter(typ)


class PydanticSerializer(Serializer):
    def serialize(self, obj: object, typ: object) -> object:
        adapter = _get_type_adapter(object)
        return adapter.dump_python(obj, mode="json")

    def deserialize(self, obj: object, typ: object) -> object:
        adapter = _get_type_adapter(typ)
        return adapter.validate_python(obj)

This is safe because:

  • TypeAdapter is stateless after construction — dump_python() and validate_python() are pure functions of their input
  • Python types are hashable, so lru_cache works directly
  • serialize() always passes object as the type, so there's exactly 1 cached adapter for it
  • deserialize() passes function output types, which are a small fixed set per application
  • maxsize=256 is more than sufficient and bounds memory usage

An alternative approach would be to instantiate the adapters once in __init__ for the common case (object type), and lazily cache others.
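A sketch of that alternative, with a stand-in `Serializer` base class so the snippet is self-contained (the real base class lives in serializer_lib.py). The hot-path `object` adapter is built once in `__init__`, and `deserialize()` lazily caches one adapter per target type on the instance:

```python
import pydantic


class Serializer:  # stand-in for inngest's Serializer base class
    pass


class PydanticSerializer(Serializer):
    def __init__(self) -> None:
        # serialize() always uses the generic `object` adapter, so build it once
        self._object_adapter = pydantic.TypeAdapter(object)
        # Per-instance lazy cache for deserialize() target types
        self._adapters: dict[object, pydantic.TypeAdapter] = {}  # type: ignore[type-arg]

    def serialize(self, obj: object, typ: object) -> object:
        return self._object_adapter.dump_python(obj, mode="json")

    def deserialize(self, obj: object, typ: object) -> object:
        adapter = self._adapters.get(typ)
        if adapter is None:
            adapter = pydantic.TypeAdapter[object](typ)
            self._adapters[typ] = adapter
        return adapter.validate_python(obj)


s = PydanticSerializer()
s.serialize({"count": 3}, object)
s.deserialize("5", int)
s.deserialize("6", int)  # reuses the cached int adapter
```

Unlike a module-level `lru_cache`, this ties adapter lifetime to the serializer instance, though for a worker that lives as long as the process the two approaches are equivalent in practice.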
