All notable changes to this project will be documented in this file. The format is based on Keep a Changelog.
- Added support for PyTorch 2.10 and Python 3.14 (#590)
- Added support for PyTorch 2.9 (#574)
- Added support for PyTorch 2.8 (#551)
- Added support for PyTorch 2.7 (#528)
- Added a classification example script for TabPFN (#510)
- Removed support for Python 3.9 (#558)
- Added an example for training `Trompt` on multiple GPUs (#474)
- Added support for materializing dataset for train and test dataframe separately (#470)
- Added support for PyTorch 2.5 (#464)
- Added a benchmark script to compare PyTorch Frame with PyTorch Tabular (#398, #444)
- Added `is_floating_point` method to `MultiNestedTensor` and `MultiEmbeddingTensor` (#445)
- Added support for inferring `stype.categorical` from boolean columns in `utils.infer_series_stype` (#421)
- Added `pin_memory()` to `TensorFrame`, `MultiEmbeddingTensor`, and `MultiNestedTensor` (#437)
- Set `weights_only=True` in `torch_frame.load` from PyTorch 2.4 (#423)
- Dropped support for Python 3.8 (#462)
- Fixed size mismatch `RuntimeError` in `transforms.CatToNumTransform` (#446)
- Removed CUDA synchronizations from `nn.LinearEmbeddingEncoder` (#432)
- Removed CUDA synchronizations from N/A imputation logic in `nn.StypeEncoder` (#433, #434)
- Updated `ExcelFormer` implementation and related scripts (#391)
- Avoided for-loop in `EmbeddingEncoder` (#366)
- Added `image_embedded` and one tabular image dataset (#344)
- Added benchmarking suite for encoders (#360)
- Added dataframe text benchmark script (#354, #367)
- Added `DataFrameTextBenchmark` dataset (#349)
- Added support for empty `TensorFrame` (#339)
- Changed the workflow of the encoder's `na_forward` method, resulting in a performance boost (#364)
- Removed ReLU applied in `FCResidualBlock` (#368)
- Fixed bug in empty `MultiNestedTensor` handling (#369)
- Fixed the split of `DataFrameTextBenchmark` (#358)
- Fixed empty `MultiNestedTensor` col indexing (#355)
- Support more stypes in `LinearModelEncoder` (#325)
- Added `stype_encoder_dict` to some models (#319)
- Added `HuggingFaceDatasetDict` (#287)
- Supported decoder embedding model in `examples/transformers_text.py` (#333)
- Removed implicit clones in `StypeEncoder` (#286)
- Fixed `TimestampEncoder` not applying `CyclicEncoder` to cyclic features (#311)
- Fixed NaN masking in `multicategorical` stype (#307)
- Added support for Boolean masks in `index_select` of `_MultiTensor` (#334)
- Added more text documentation (#291)
- Added `col_to_model_cfg` (#270)
- Support saving/loading of GBDT models (#269)
- Added documentation on handling different stypes (#271)
- Added `TimestampEncoder` (#225)
- Added `LightGBM` (#248)
- Added time columns to the `MultimodalTextBenchmark` (#253)
- Added `CyclicEncoding` (#251)
- Added `PositionalEncoding` (#249)
- Added optional `col_names` argument in `StypeEncoder` (#247)
- Added `col_to_text_embedder_cfg` and use `MultiEmbeddingTensor` for `text_embedded` (#246)
- Added `col_encoder_dict` in `StypeWiseFeatureEncoder` (#244)
- Added `LinearEmbeddingEncoder` for `embedding` stype (#243)
- Added support for `torch_frame.text_embedded` in `GBDT` (#239)
- Support `Metric` in `GBDT` (#236)
- Added auto-inference of `stype` (#221)
- Enabled `list` input in `multicategorical` stype (#224)
- Added `Timestamp` stype (#212)
- Added `multicategorical` to `MultimodalTextBenchmark` (#208)
- Added support for saving and loading of `TensorFrame` with complex `stype`s. (#197)
- Added `stype.embedding` (#194)
- Added `TensorFrame` concatenation of complex stypes. (#190)
- Added `text_tokenized` example (#174)
- Added Cohere embedding example (#186)
- Added `AmazonFineFoodReviews` dataset and OpenAI embedding example (#182)
- Added save and load logic for `FittableBaseTransform` (#178)
- Added `MultiEmbeddingTensor` (#181, #193, #198, #199, #217)
- Added `to_dense()` for `MultiNestedTensor` (#170)
- Added example for `multicategorical` stype (#162)
- Added `sequence_numerical` stype (#159)
- Added `MultiCategoricalEmbeddingEncoder` (#155)
- Added advanced indexing for `MultiNestedTensor` (#150, #161, #163, #165)
- Added `multicategorical` stype (#128, #151)
- Added `MultiNestedTensor` (#149)
- Set `stype.embedding` as the parent of `stype.text_embedded` and unified `stype.text_embedded` with its parent in `tensor_frame` (#277)
- Renamed `torch_frame.stype` module to `torch_frame._stype` (#275)
- Renamed `text_tokenized_cfg` into `col_to_text_tokenized_cfg` (#257)
- Made `Trompt` output 2-dim embeddings in `forward`
- Renamed `text_embedder_cfg` into `col_to_text_embedder_cfg`
- No manual passing of `in_channels` to `LinearEmbeddingEncoder` for `stype.text_embedded` (#222)
- Added basic `text_tokenized` (#157)
- Added `Mercari` dataset (#123)
- Added the model performance benchmark script (#114)
- Added `DataFrameBenchmark` (#107)
- Added concat and equal ops for `TensorFrame` (#100)
- Use ROC-AUC for binary classification in GBDT (#98)
- Infer `task_type` in dataset (#97)
- Added `text_embedded` example (#95)
- Added `MultimodalTextBenchmark` (#92, #117)
- Renamed `x_dict` to `feat_dict` in `TensorFrame` (#86)
- Added `TabTransformer` example (#82)
- Added `TabNet` example (#85)
- Added dataset `tensorframe` and `col_stats` caching (#84)
- Added `TabTransformer` (#74)
- Added `TabNet` (#35)
- Added text embedded stype, mapper and encoder. (#78)
- Added `ExcelFormer` example (#46)
- Added support for inductive `DataFrame` to `TensorFrame` transformation (#75)
- Added `CatBoost` baseline and tuned `CatBoost` example. (#73)
- Added `na_strategy` as argument in `StypeEncoder`. (#69)
- Added `NAStrategy` class and impute NaN values in `MutualInformationSort`. (#68)
- Added `XGBoost` baseline and updated tuned `XGBoost` example. (#57)
- Added `CategoricalCatBoostEncoder` and `MutualInformationSort` transforms needed by ExcelFormer (#52)
- Added tutorial example script (#54)
- Added `ResNet` (#48)
- Added `ExcelFormerEncoder` (#42)
- Made `FTTransformer` take `TensorFrame` as input (#45)
- Added `Trompt` example (#39)
- Added `post_module` in `StypeEncoder` (#43)
- Added `FTTransformer` (#40, #41)
- Added `ExcelFormer` (#26)
- Added `Yandex` collections (#37)
- Added `TabularBenchmark` collections (#33)
- Added the `Bank Marketing` dataset (#34)
- Added the `Mushroom`, `Forest Cover Type`, and `Poker Hand` datasets (#32)
- Added `PeriodicEncoder` (#31)
- Added `NaN` handling in `StypeEncoder` (#28)
- Added `LinearBucketEncoder` (#22)
- Added `Trompt` (#25)
- Added `TromptDecoder` (#24)
- Added `TromptConv` (#23)
- Added `StypeWiseFeatureEncoder` (#16)
- Added indexing/shuffling and column select functionality in `Dataset` (#18, #19)
- Added `Adult Census Income` dataset (#17)
- Added column-level statistics and dataset materialization (#15)
- Added `FTTransformerConvs` (#12)
- Added `DataLoader` capabilities (#11)
- Added `TensorFrame.index_select` (#10)
- Added `Dataset.to_tensor_frame` (#9)
- Added base classes `TensorEncoder`, `FeatureEncoder`, `TableConv`, `Decoder` (#5)
- Added `TensorFrame` (#4)
- Added `Titanic` dataset (#3)
- Added `Dataset` base class (#3)