Releases: Watts-Lab/team_comm_tools
All Caps Fix and Timestamp Robustness - 0.1.7
Release Notes
BUG FIX: Errors in num_all_caps
The function for counting all-caps words previously had an error in which the logic counted single-letter words. Single-letter words are now excluded from the count, so the resulting output should more accurately reflect words that are deliberately written in "ALL CAPS," rather than portions of emoticons (e.g., :D) or single-letter capitals (e.g., I, A).
Updated handling of timestamp formats
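As a rough illustration of the corrected counting rule (not the package's actual implementation), the sketch below counts only runs of two or more capital letters that stand alone as a word. The function name count_all_caps and the regular expression are assumptions made for this example.

```python
import re

# Hypothetical helper for illustration; the real num_all_caps logic may differ.
# A "word" here is a run of two or more capital letters bounded by non-word
# characters, so single capitals such as "I", "A", or the "D" in ":D" are skipped.
ALL_CAPS_WORD = re.compile(r"\b[A-Z]{2,}\b")


def count_all_caps(text: str) -> int:
    """Count multi-letter words written entirely in capital letters."""
    return len(ALL_CAPS_WORD.findall(text))


if __name__ == "__main__":
    print(count_all_caps("I am SO happy :D"))
    print(count_all_caps("THIS IS GREAT!"))
```

Under this rule, "I am SO happy :D" contributes one all-caps word ("SO"), while the "I" and the "D" in the emoticon are ignored.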
We were previously fairly lenient in how we handled timestamps; we allowed the timestamp column to contain null or unparseable values, which sometimes led to the TCT throwing uncaught errors. We now validate the formatting of timestamps up front, when the FeatureBuilder is instantiated, and we do not allow users to proceed without correctly formatted timestamps. The change in logic also removes some code redundancy: previously, we validated timestamps separately each time the timestamp column was used, rather than once at the beginning.
The following is the revised expected behavior in this version:
- If the user does not pass in a timestamp column, timestamp-related features ['Time Difference', 'Team Burstiness'] are removed from the output but execution proceeds normally.
- If the timestamp column is provided, we verify that there are no null values and that all values are parsable by pd.to_datetime(). Otherwise, we halt execution and raise an error so that the user can correct the issues with the timestamp column.
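The two cases above can be sketched roughly as follows. This is a minimal illustration that assumes pandas is installed; it is not the FeatureBuilder's actual code, and the helper name check_timestamps, its return value, and the error messages are hypothetical.

```python
import pandas as pd


def check_timestamps(df: pd.DataFrame, timestamp_col=None) -> bool:
    """Return True if timestamp features can be computed, False if they should be skipped.

    Hypothetical helper: raises ValueError when a timestamp column is given
    but contains null or unparseable values.
    """
    if timestamp_col is None:
        # No timestamp column: skip timestamp-related features, keep going.
        return False

    column = df[timestamp_col]
    if column.isnull().any():
        raise ValueError(f"Column '{timestamp_col}' contains null values.")

    # errors="coerce" turns unparseable entries into NaT so they can be detected.
    parsed = pd.to_datetime(column, errors="coerce")
    if parsed.isnull().any():
        raise ValueError(f"Column '{timestamp_col}' contains unparseable values.")

    return True


if __name__ == "__main__":
    chats = pd.DataFrame({"timestamp": ["2024-01-01 10:00:00", "2024-01-01 10:05:00"]})
    print(check_timestamps(chats, "timestamp"))
    print(check_timestamps(chats, None))
```

Checking once, before any features are computed, mirrors the design choice described above: a single validation step replaces repeated checks wherever the timestamp column is used.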
v0.1.6 - Hot Fix
v0.1.6 - 2025-03-10
What's Changed
- Yuxuan/turn id hot fix by @sundy1994 in #347
- Release 0.1.6 hotfix by @xehu in #348
Full Changelog: v0.1.5...v0.1.6
v0.1.5 - Patch Release
Patch Release - v0.1.4.post2
Defaults in FeatureBuilder were incorrectly specified, leading to undesirable behavior when relying on the default values of the vector_directory. The format is now correct.
Patch Release - v0.1.4.post1
Post-Release: Update README with new function call example for output_file_base.
Patch Release - v0.1.4
- Progress Bars: Loading bar during feature generation helps users better understand the status of the features/estimated completion time.
- Vector Updates: Vectors are now batched for faster generation.
- Vectors are now generated by default WITH punctuation, allowing for more accurate gauging of sentiment.
- Denormalizing LIWC: LIWC is no longer normalized as a rate (i.e., per 100 words), and is instead returned as raw count per utterance. This ensures that the aggregated values of LIWC are more sensible/interpretable. (#306)
- Labeling Feature Columns with the Source: Politeness and Receptiveness features are labeled with the source (e.g., “politeness_convokit” and “receptiveness_yeomans”) (#300).
- Easy Access to List of Generated Features: There is now an easier way to access the names of the features generated (by using my_feature_builder.feature_names), as well as the names of the columns generated (my_feature_builder.chat_features, my_feature_builder.conv_features_base, my_feature_builder.conv_features_all). (#304)
- More Defaults for Input Params: Input parameters have defaults, so it’s easier than ever to use the FeatureBuilder (all you need is the input dataframe). (#286)
- One File Path to Rule Them All: You can generate files at all three levels automatically using the “output_file_base” parameter, rather than separately specifying the output paths for all three files. This also creates a more streamlined workaround than the current way of specifying paths, which is a bit counterintuitive. (However, we maintain full backwards-compatibility; this patch release does not change the behavior in which outputs are saved in the output/chat/... path.) (#299)
- Website Auto-Updating: The project website will now auto-update alongside changes in dev, and documentation has been updated alongside these changes.
- Other Bug Fixes: Bug fixes for NLTK punkt: #302
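To illustrate why the LIWC change in the list above makes aggregated values easier to interpret, the sketch below compares a per-100-words rate with a raw count for two utterances of very different lengths. The toy lexicon and the function names are made up for this example; they are not LIWC categories or package code.

```python
# Hypothetical toy lexicon; real LIWC lexicons are proprietary and not shown here.
TOY_POSITIVE = {"happy", "great"}


def lexicon_count(utterance: str, lexicon: set) -> int:
    """Raw number of words in the utterance that appear in the lexicon."""
    return sum(1 for word in utterance.lower().split() if word in lexicon)


def lexicon_rate_per_100(utterance: str, lexicon: set) -> float:
    """Lexicon matches expressed as a rate per 100 words (the old-style output)."""
    n_words = len(utterance.split())
    if n_words == 0:
        return 0.0
    return 100.0 * lexicon_count(utterance, lexicon) / n_words


utterances = ["happy", "this is a great and happy day for all of us"]
counts = [lexicon_count(u, TOY_POSITIVE) for u in utterances]
rates = [lexicon_rate_per_100(u, TOY_POSITIVE) for u in utterances]

if __name__ == "__main__":
    print("raw counts:", counts, "sum:", sum(counts))
    print("rates per 100 words:", rates)
```

The one-word utterance has a rate of 100 even though it contains a single match, so summing or averaging rates across utterances overweights short messages; summing raw counts simply gives the total number of matches in the conversation.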
v0.1.3 - Patch Release - Dependency and Documentation Updates
v0.1.3 - 2024-09-16
Added
- Documentation: Updated documentation and our requirement files
Fixed
- Dependency: Our current required version of torch (2.4.0) has a known issue that causes an error for Windows users; we have updated the requirement to 2.4.1 to resolve this issue.
What's Changed
Full Changelog: v0.1.2...v0.1.3
Patch Release - v0.1.2
Security patch ensuring privacy of LIWC lexicons.
v0.1.1 - Patch Release: Dependency and Documentation Updates
v0.1.1 - 2024-08-09
Added
- Documentation: Updated the user guide to properly install the package from pip and dependencies.
- Performance: Updated a CLI command, download_resources, that downloads spaCy's en_core_web_sm model and NLTK data.
Fixed
- Dependency: Resolved the issue in which en-core-web-sm and NLTK resources couldn't be downloaded upon installation. They are now added automatically the first time the user runs the FeatureBuilder if they are missing from the environment.
What's Changed
- Release v0.1.1 by @xehu and @sundy1994 in #274
- Yuxuan/dependency issue by @xehu and @sundy1994 in #275
Full Changelog: v0.1.0...v0.1.1