@Sadiyahafroz

This pull request updates the get_df_parts_ function to handle duplicate entries in the COL_ENTRY column. A new parameter, handle_duplicates, lets the user choose how duplicates are handled: either append serial numbers or leave the entries unchanged. If the parameter is not provided, serial numbers are added by default so that the index stays unique. The tutorial also received minor updates for clarity.

…get_df_parts

Updated the get_df_parts function in utils_feature.py to handle duplicate entries in the COL_ENTRY column. If duplicates are found, a serial number (S.No) is added to the COL_ENTRY values to make them unique before indexing. This prevents indexing errors on duplicate entries and avoids data loss when sequence parts are processed. Behavior is unchanged when no duplicates are present.
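For reviewers, here is a minimal sketch of the duplicate handling described above. It is not the code from this PR. The helper name make_entries_unique, the "serial"/"keep" option values, the "_1", "_2" suffix format, and the "entry" column name are all assumptions made for illustration.

```python
import pandas as pd

COL_ENTRY = "entry"  # assumed column name; the real constant is defined in the library


def make_entries_unique(df, col_entry=COL_ENTRY, handle_duplicates="serial"):
    """Hypothetical helper: return a copy of df with duplicates in col_entry handled."""
    if handle_duplicates not in ("serial", "keep"):
        raise ValueError("handle_duplicates must be 'serial' or 'keep'")
    df = df.copy()
    # Flag every row whose entry occurs more than once (first occurrence included)
    is_dup = df[col_entry].duplicated(keep=False)
    if handle_duplicates == "serial" and is_dup.any():
        # 1-based running count within each group of identical entries
        serial = df.groupby(col_entry).cumcount() + 1
        df.loc[is_dup, col_entry] = (
            df.loc[is_dup, col_entry].astype(str) + "_" + serial[is_dup].astype(str)
        )
    return df


df = pd.DataFrame({COL_ENTRY: ["P1", "P2", "P1"], "sequence": ["ACD", "EFG", "HIK"]})
df_unique = make_entries_unique(df)
print(df_unique[COL_ENTRY].tolist())  # ['P1_1', 'P2', 'P1_2']
```

Entries that occur only once are left untouched, which matches the "original behavior if no duplicates are present" described above. A suffix scheme like this can still collide with an entry that already carries the same suffix, so a real implementation may want to check uniqueness after renaming.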