v0.7.0

@benfred released this 24 Sep 03:45
b55c57c

NVTabular v0.7.0

Improvements

  • Add column tagging API #943
  • Export dataset schema when writing out datasets #948
  • Make dataloaders aware of schema #947
  • Standardize a Workflow's representation of its output columns #372
  • Add multi-gpu training example using PyTorch Distributed #775
  • Speed up reading Parquet files from remote storage like GCS or S3 #1119
  • Add utility to convert TFRecord datasets to Parquet #1085
  • Add multihot support for PyTorch inference #719
  • Add options to reserve categorical indices in the Categorify() op #1074
  • Update notebooks to work with CPU only systems #960
  • Save output from Categorify op in a single table for HugeCTR #946
  • Add a keyset file for HugeCTR integration #1049

Bug Fixes

  • Fix category counts written out by the Categorify op #1128
  • Fix HugeCTR inference example #1130
  • Fix make_feature_column_workflow bug in Categorify when features have vocabularies of varying sizes #1062
  • Fix TargetEncoding op on CPU only systems #976
  • Fix writing empty partitions to Parquet files #1097