Releases: pyg-team/pytorch-frame
0.3.0: Broader Compatibility & Usability Enhancements
What's Changed
- Add `Amphibians` to `torch_frame/datasets/__init__.py` by @akihironitta in #504
- Update docs by @akihironitta in #506
- Fix minor code formatting in docs by @akihironitta in #507
- Add `return_stype` argument in `TensorFrame.get_col_feat` by @rusty1s in #509
- Add simple TabPFN example by @zechengz in #510
- Add `num_bytes` utility by @rusty1s in #516
- Update .gitignore by @akihironitta in #518
- Support 0-dim tensors in tensor slicing by @rusty1s in #519
- Add default `dim` to `cat` (similar to `torch.cat`) by @rusty1s in #521
- Fix NumPy and PyTorch incompatibility error in CI by @akihironitta in #525
- Support CatBoost in Python 3.13 by @akihironitta in #523
- Only trigger automerge workflow on opening a PR by @akihironitta in #527
- Support PyTorch 2.7 by @akihironitta in #528
- Enable `flake8-bugbear` by @akihironitta in #530
- Migrate to modern logger interface by @emmanuel-ferdman in #537
- Fix auto-merge workflow by @akihironitta in #539
- Fix device mismatch in computing `num_rows` on empty tensor frames by @rusty1s in #546
- Support PyTorch 2.8 by @akihironitta in #551
- Update dependabot.yml by @akihironitta in #552
- Drop support for Python 3.9 by @akihironitta in #558
- Add license information by @jamesmyatt in #534
- Support PyTorch 2.9 by @akihironitta in #574
New Contributors
- @emmanuel-ferdman made their first contribution in #537
- @jamesmyatt made their first contribution in #534
Full Changelog: 0.2.5...0.3.0
0.2.5: Python 3.13 and PyTorch 2.6 support
What's Changed
- Add support for PyTorch 2.6 by @akihironitta in #494
- Support Python 3.12 and Python 3.13 by @akihironitta in #496
- Add copy button for code in docs by @akihironitta in #489
- CI: Consolidate unit test CI workflows by @akihironitta in #493
- CI: Add concurrency to workflows triggered on PRs by @akihironitta in #495
- Let pre-commit fix formatting issue in master by @akihironitta in #498
- Automate package build and release by @akihironitta in #497
- Fix auto-merging bot PRs by @akihironitta in #501
- lint: switch `pyupgrade` to Ruff's rule UP by @Borda in #499
- Prepare `0.2.5` release by @akihironitta in #502
Full Changelog: 0.2.4...0.2.5
PyTorch Frame 0.2.4
What's Changed
- fix multicategorical stype inference and add test case by @yiweny in #420
- correctly infer boolean stypes by @yiweny in #421
- support xgboost early stopping by @yiweny in #424
- Update testing torch version by @zechengz in #428
- Update Excelformer benchmark results on small binary and regression tasks by @zechengz in #427
- update xgboost numbers by @yiweny in #425
- Update excelformer benchmark results by @zechengz in #431
- Remove CUDA synchronizations by slicing input tensor with `int` instead of CUDA tensors in `nn.LinearEmbeddingEncoder` by @akihironitta in #432
- Don't put assertions on N/A imputation correctness by @akihironitta in #433
- Don't create the same tensor every iteration in N/A handling by @akihironitta in #434
- chore: Update pre-commit by @akihironitta in #435
- Add benchmark results for large-scale multiclass classification task by @akihironitta in #436
- Fixed warning and added safe globals by @NeelKondapalli in #423
- fix error in xgboost by @puririshi98 in #443
- Add `is_floating_point()` to multi tensors by @akihironitta in #445
- Fix size mismatch error when `CatToNumTransform` sees only a subset of labels at test time by @akihironitta in #446
- add pytorch tabular benchmark by @yiweny in #398
- Compare more models across frame and tabular by @wsad1 in #444
- Add benchmark result from `ExcelFormer` on a large-scale multi-class classification task by @akihironitta in #447
- Fail `torch.load(weights_only=True)` gracefully by @akihironitta in #448
- Fix offset in `LinearEmbeddingEncoder` by @toenshoff in #455
- Fix docs build in CI by @akihironitta in #456
- Removing the deprecated `categorical_feature` parameter from `lightgbm.train(...)` function calls by @drivanov in #454
- Tighten assert condition in graph break tests by @akihironitta in #458
- Update pytorch_tabular_benchmark.py by @wsad1 in #457
- Drop support for Python 3.8 by @akihironitta in #462
- [pre-commit.ci] pre-commit suggestions by @pre-commit-ci in #461
- Update benchmark numbers by @yiweny in #411
- Add support for PyTorch 2.5 by @akihironitta in #464
- Allow empty `TensorFrame` with non-zero number of rows by @rusty1s in #466
- Support index select for empty `TensorFrame` by @rusty1s in #467
- Consistent PyPI name `pytorch-frame` by @akihironitta in #468
- Raise a friendly message when a `str` is provided to `TensorFrame(col_names_dict)` instead of a `list[str]` by @akihironitta in #469
- Update README.md by @akihironitta in #471
- Materialize train test by @HoustonJ2013 in #472
- Add an example of training a tabular model on multiple GPUs by @akihironitta in #474
- Support `pin_memory()` in `Multi{Embedding,Nested}Tensor` and `TensorFrame` by @akihironitta in #437
- Run `MultiNestedTensor` tests on both CPU and GPU by @akihironitta in #476
- Optimize the `Trompt` example to reduce training time by ~30% by @akihironitta in #477
- Add dependabot and auto-merge PRs by dependabot once CI passes by @akihironitta in #478
- Bump tj-actions/changed-files from 41 to 45 by @dependabot in #479
- Bump codecov/codecov-action from 2 to 5 by @dependabot in #481
- Bump dangoslen/changelog-enforcer from 2 to 3 by @dependabot in #480
- Bump actions/labeler from 4 to 5 by @dependabot in #482
- [pre-commit.ci] pre-commit suggestions by @pre-commit-ci in #483
- Update `.pre-commit-config.yaml` weekly by @akihironitta in #484
- Fix documentation build by @akihironitta in #486
- Label bot PRs `skip-changelog` by @akihironitta in #487
- [pre-commit.ci] pre-commit suggestions by @pre-commit-ci in #485
- update version to `0.2.4` by @weihua916 in #488
New Contributors
- @NeelKondapalli made their first contribution in #423
- @puririshi98 made their first contribution in #443
- @wsad1 made their first contribution in #444
- @HoustonJ2013 made their first contribution in #472
Full Changelog: 0.2.3...0.2.4
PyTorch Frame 0.2.3
What's Changed
- Fix `test_trompt.py` by @weihua916 in #373
- Add `torchmetrics` to `pyproject.toml` full dependencies by @zechengz in #374
- Add light-weight MLP by @weihua916 in #372
- Handle label imbalance in binary classification tasks on text benchmark by @vid-koci in #376
- Fix `MLP` normalization argument by @weihua916 in #377
- Add retry to get OpenAI embeddings by @zechengz in #378
- Make `DataFrameTextBenchmark` script `pos_weight` optional by @zechengz in #379
- Fix text dataset stats and benchmark materialize return by @zechengz in #380
- Add citation by @weihua916 in #383
- [pre-commit.ci] pre-commit suggestions by @pre-commit-ci in #382
- Update README by @zechengz in #384
- Fix README image size by @zechengz in #385
- Add PyTorch Frame paper link to readme by @zechengz in #386
- Make sure binary classification `FakeDataset` has both pos/neg labels by @weihua916 in #392
- Update the key implementation and corresponding compatibility for ExcelFormer by @jyansir in #391
- Better error message for `CatToNumTransform` by @weihua916 in #394
- Fix `split_by_sep` in `multicategorical` stype by @weihua916 in #395
- add support for autoinfer bool type by @yiweny in #399
- Add R2 metric by @rishabh-ranjan in #403
- [FutureWarn] Fix FutureWarning in `CategoricalTensorMapper` by @drivanov in #401
- fix readme link by @yiweny in #407
- update benchmark by @yiweny in #400
- Add `MovieLens 1M` dataset by @xnuohz in #397
- Fixing Bug in Version Handling by @drivanov in #410
- update benchmark numbers by @yiweny in #408
- [UserWarning] Fixing UserWarnings in two tests. by @drivanov in #409
- fix embedding script by @yiweny in #412
- Allow column indexing with custom stypes by @rusty1s in #413
- [pre-commit.ci] pre-commit suggestions by @pre-commit-ci in #414
- Fix ExcelFormer Example Link by @crunai in #415
- Towards supporting `MultiCategorical` encoder for target in torchframe by @XinweiHe in #417
- Update to version `0.2.3` by @weihua916 in #418
New Contributors
- @jyansir made their first contribution in #391
- @rishabh-ranjan made their first contribution in #403
- @drivanov made their first contribution in #401
- @crunai made their first contribution in #415
Full Changelog: 0.2.2...0.2.3
PyTorch Frame 0.2.2
This release introduces the `image_embedded` stype to handle image columns, fixes bugs in `MultiNestedTensor` indexing, and improves the efficiency of missing-value imputation and categorical column encoders.
Added
- Avoided for-loop in `EmbeddingEncoder` (#366)
- Added `image_embedded` and one tabular image dataset (#344)
- Added benchmarking suite for encoders (#360)
- Added dataframe text benchmark script (#354, #367)
- Added `DataFrameTextBenchmark` dataset (#349)
- Added support for empty `TensorFrame` (#339)
Changed
- Changed the workflow of Encoder's `na_forward` method, resulting in a performance boost (#364)
- Removed ReLU applied in `FCResidualBlock` (#368)
PyTorch Frame 0.2.1
This release makes the following fixes and extensions to 0.2.0.
Added
- Support more stypes in `LinearModelEncoder` (#325)
- Added `stype_encoder_dict` to some models (#319)
- Added `HuggingFaceDatasetDict` (#287)
Changed
- Supported decoder embedding model in `examples/transformers_text.py` (#333)
- Removed implicit clones in `StypeEncoder` (#286)
PyTorch Frame 0.2.0
We are excited to announce the second release of PyTorch Frame πΆ
PyTorch Frame 0.2.0 is the culmination of work from many contributors inside and outside Kumo, totaling over 120 commits since `torch-frame==0.1.0`.
PyTorch Frame is featured in the Relational Deep Learning paper and used as the encoding layer for PyG.
Kumo is also hiring interns working on cool deep learning projects. If you are interested, feel free to apply through this link.
If you have any questions or would like to contribute to PyTorch Frame, feel free to ask in our Slack channel.
Highlights
Support for `multicategorical`, `timestamp`, `text_tokenized`, and `embedding` stypes
We have added support for four more semantic types, allowing more flexibility in encoding raw data. To learn how to specify different semantic types for your data, take a look at the tutorial. We also added many new `StypeEncoder` classes for the new semantic types.
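To make the new stypes concrete, here is a minimal, library-free sketch of the kind of raw data they describe: a `multicategorical` cell holds several separator-joined labels, and a `timestamp` cell holds a date that decomposes into calendar parts. The column names and parsing helpers below are hypothetical illustrations, not PyTorch Frame APIs.

```python
from datetime import datetime

# Hypothetical raw rows: a multicategorical column ("genres") holds
# separator-joined labels; a timestamp column ("release") holds dates.
rows = [
    {"genres": "action|comedy", "release": "2021-05-01"},
    {"genres": "drama", "release": "2019-11-20"},
]

def split_multicategorical(value, sep="|"):
    # A multicategorical cell expands to a list of category labels.
    return value.split(sep)

def timestamp_parts(value):
    # A timestamp cell decomposes into calendar components that an
    # encoder can embed (year/month/day here; real encoders use more).
    dt = datetime.strptime(value, "%Y-%m-%d")
    return (dt.year, dt.month, dt.day)

parsed = [
    (split_multicategorical(r["genres"]), timestamp_parts(r["release"]))
    for r in rows
]
```

In PyTorch Frame itself, you would instead tag such columns with the corresponding stype when constructing a dataset and let the matching `StypeEncoder` handle the encoding.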
Integration with Large Language Models
We now support two types of integration with LLMs: embedding and fine-tuning.
You can use any embeddings generated by LLMs with PyTorch Frame, either by directly feeding the embeddings as raw data of the `embedding` stype, or by using text as raw data of the `text_embedded` stype and specifying a `text_embedder` for each column. Here is an example of how to use PyTorch Frame with text embeddings generated by OpenAI, Cohere, VoyageAI, and HuggingFace transformers.
`text_tokenized` enables users to fine-tune large language models on text columns, alongside other types of raw tabular data, for any downstream task. In this example, we fine-tune the full `distilbert-base-uncased` model as well as a LoRA variant.
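The per-column `text_embedder` is essentially a callable that maps a batch of strings to fixed-size vectors. The toy embedder below is a hypothetical, deterministic stand-in (hash-based, no model calls) just to show the interface shape; a real setup would call an embedding model here instead.

```python
import hashlib

def toy_text_embedder(texts, dim=8):
    # Hypothetical stand-in for a per-column text embedder: maps a
    # batch of strings to fixed-size float vectors deterministically.
    # A real setup would call an embedding model (OpenAI, Cohere,
    # a sentence-transformer, ...) here instead of hashing.
    vectors = []
    for text in texts:
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        vectors.append([b / 255.0 for b in digest[:dim]])
    return vectors

embeddings = toy_text_embedder(["great product", "arrived broken"])
```

Any callable with this batch-of-strings-in, matrix-out shape can be plugged in per text column, which is what makes swapping embedding providers cheap.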
More Benchmarks
We added more benchmark results in the benchmark section. LightGBM is included in the list of GBDTs that we compare with the deep learning models. We did initial experiments on various LLMs as well.
Breaking Changes
- `text_tokenized_cfg` and `text_embedder_cfg` are renamed to `col_to_text_tokenized_cfg` and `col_to_text_embedder_cfg`, respectively (#257). This allows users to specify different embedders and tokenizers for different text columns.
- `Trompt` now outputs 2-dim embeddings in `forward`.
Features
- We now support the following new encoders: `LinearEmbeddingEncoder` for the `embedding` stype, `TimestampEncoder` for the `timestamp` stype, and `MultiCategoricalEmbeddingEncoder` for the `multicategorical` stype.
- `LightGBM` is added to the GBDTs module.
- Auto-inference of stypes from raw DataFrame columns is supported through the `infer_df_stype` function. However, the correctness of the inference is not guaranteed, and we suggest that you double-check the results.
Bugfixes
We fixed the `in_channels` calculation of `ResNet` (#220) and improved the overall user experience on handling dirty data (#171, #234, #264).
Full Changelog
Full Changelog: 0.1.0...0.2.0
PyTorch Frame 0.1.0
We are excited to announce the initial release of PyTorch Frame πππ
PyTorch Frame is a deep learning extension for PyTorch, designed for heterogeneous tabular data with different column types, including numerical, categorical, time, text, and images.
To get started, please refer to:
- our README.md for the overview of PyTorch Frame,
- "Introduction by Example" tutorial and its code at `examples/tutorial.py` to get started with using PyTorch Frame, and
- "Modular Design of Deep Tabular Models" tutorial in our documentation, along with the existing implementations in the `torch_frame/nn/models/` directory, to create your own PyTorch Frame model for tabular data.
Highlights
Models, datasets and examples
In our initial release, we introduce 6 models, 9 feature encoders, 5 table convolution layers, 3 decoders, and 14 datasets.
- Models
  - `Trompt`: "Trompt: Towards a Better Deep Neural Network for Tabular Data" (`examples/trompt.py`)
  - `FTTransformer`: "Revisiting Deep Learning Models for Tabular Data" (`examples/ft_transformer_text.py`)
  - `ExcelFormer`: "ExcelFormer: A Neural Network Surpassing GBDTs on Tabular Data" (`examples/excelformer.py`)
  - `TabNet`: "TabNet: Attentive Interpretable Tabular Learning" (`examples/tabnet.py`)
  - `ResNet`: "Revisiting Deep Learning Models for Tabular Data" (`examples/revisiting.py`)
  - `TabTransformer`: "TabTransformer: Tabular Data Modeling Using Contextual Embeddings" (`examples/tab_transformer.py`)
- Encoders
  - `FeatureEncoder`
  - `StypeWiseFeatureEncoder`
  - `StypeEncoder`
  - `EmbeddingEncoder`
  - `LinearEncoder`
  - `LinearBucketEncoder`: "On Embeddings for Numerical Features in Tabular Deep Learning"
  - `LinearPeriodicEncoder`: "On Embeddings for Numerical Features in Tabular Deep Learning"
  - `ExcelFormerEncoder`: "ExcelFormer: A Neural Network Surpassing GBDTs on Tabular Data"
  - `StackEncoder`
- Table Convolution Layers
  - `TableConv`
  - `FTTransformerConvs`: "Revisiting Deep Learning Models for Tabular Data"
  - `TromptConv`: "Trompt: Towards a Better Deep Neural Network for Tabular Data"
  - `ExcelFormerConv`: "ExcelFormer: A Neural Network Surpassing GBDTs on Tabular Data"
  - `TabTransformerConv`: "TabTransformer: Tabular Data Modeling Using Contextual Embeddings"
- Decoders
- Datasets: `AdultCensusIncome`, `BankMarketing`, `DataFrameBenchmark`, `Dota2`, `FakeDataset`, `ForestCoverType`, `KDDCensusIncome`, `Mercari`, `MultimodalTextBenchmark`, `Mushroom`, `PokerHand`, `TabularBenchmark`, `Titanic`, `Yandex`
Benchmarks
With our initial set of models and datasets under `torch_frame.nn` and `torch_frame.datasets`, we benchmarked their performance on binary classification and regression tasks. Rows denote model names and columns denote dataset indices. Each cell reports the mean and standard deviation of model performance, as well as the total time spent, including Optuna-based hyper-parameter search and final model training.
Note
- For the latest benchmark scripts and results, see [`benchmark/`](https://github.com/pyg-team/pytorch-frame/tree/master/benchmark#leaderbo...