Feature/event keyword filtering #21

Open
DominikaStefaniak wants to merge 1 commit into dev from feature/events_keyword_filtering

@DominikaStefaniak
Member

Creates three functions that add new columns to the events DataFrame indicating whether an event takes place on the PWr campus. The determination is based on the event's location, summary, and a combined check.


df_events = df_events.copy()
df_events["location_filtered"] = df_events["location"].str.contains(
    location_regex, case=False, regex=True, na=False
)

Check whether regex=False would work (it is faster). If not, stay with the current approach.

Member Author

But location_regex is a regex, not a plain string, so as far as I can tell regex=True is mandatory here.
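To make the difference concrete, here is a minimal sketch comparing both settings. It assumes location_regex is an alternation joined with "|" (the exact pattern is not shown in this thread), and the sample locations are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data; the real events DataFrame is not shown in the thread.
locations = pd.Series(
    ["Geocentrum, room 1", "Strefa Kultury Studenckiej", "City centre", None]
)

# Assumed shape of the combined pattern: alternatives joined with "|".
location_regex = r"strefa kultury studenckiej|geocentrum"

# regex=True interprets "|" as alternation, so either keyword matches.
as_regex = locations.str.contains(location_regex, case=False, regex=True, na=False)

# regex=False searches for the whole string literally, including the "|",
# so none of the locations match.
as_literal = locations.str.contains(location_regex, case=False, regex=False, na=False)

print(as_regex.tolist())
print(as_literal.tolist())
```

Under these assumptions, regex=False would only be an option if each keyword were checked separately as a plain substring.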

"""Creates a boolean column "location_filtered" in the input dataframe df_events,
indicating whether the "location" column matches any of the specified regex patterns."""

df_events = df_events.copy()

@GregW04 GregW04 Feb 13, 2026

Since making a copy is costly in memory, let's make it optional throughout the functions: add a parameter copy: bool = False.
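One possible shape for this, as a sketch only: the function name and the first two arguments follow the call visible in the diff, while the type hints, the return value, and the sample data are assumptions.

```python
import pandas as pd


def filter_event_locations(
    df_events: pd.DataFrame, location_regex: str, copy: bool = False
) -> pd.DataFrame:
    """Add a boolean "location_filtered" column; copy the input only on request."""
    if copy:
        # Opt-in copy: the caller's DataFrame stays untouched.
        df_events = df_events.copy()
    df_events["location_filtered"] = df_events["location"].str.contains(
        location_regex, case=False, regex=True, na=False
    )
    return df_events


# Hypothetical sample data for illustration.
events = pd.DataFrame({"location": ["Geocentrum", "Rynek", None]})

safe = filter_event_locations(events, r"geocentrum", copy=True)
copy_left_input_untouched = "location_filtered" not in events.columns

in_place = filter_event_locations(events, r"geocentrum")
```

Note that with copy=False as the default, the function mutates the caller's DataFrame, so existing call sites that rely on the old copying behavior would need copy=True.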

r"strefa kultury studenckiej",
r"geocentrum"
]

For repeated use, try compiling the patterns once, e.g. location_pattern = re.compile(location_combined_regex, re.IGNORECASE)
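A rough sketch of what that could look like with the standard library only. The two keywords are the ones visible in the diff excerpt (the full list is longer and not shown here); location_patterns and is_on_campus are hypothetical names. Whether Series.str.contains accepts a precompiled pattern may depend on the pandas version and string backend, so this sketch applies the pattern in plain Python instead.

```python
import re

# Only the two entries visible in the diff; the real list has more patterns.
location_patterns = [r"strefa kultury studenckiej", r"geocentrum"]

# Join the alternatives once and compile once at module level.
location_combined_regex = "|".join(f"(?:{p})" for p in location_patterns)
location_pattern = re.compile(location_combined_regex, re.IGNORECASE)


def is_on_campus(location) -> bool:
    """Hypothetical helper: True if the location matches any campus keyword."""
    # Missing values (None, NaN) are treated as "no match".
    return isinstance(location, str) and location_pattern.search(location) is not None


flags = [is_on_campus(loc) for loc in ["GEOCENTRUM, hall A", "Rynek", None]]
print(flags)
```

Python's re module also caches recently compiled patterns internally, so the gain from compiling explicitly may be small; it is worth measuring before committing to this.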

df_events = filter_event_locations(df_events, location_regex)
df_events = filter_event_summaries(df_events, summary_regex)

df_events["filter_events"] = (df_events["location_filtered"] | df_events["summary_filtered"])

Please rename the new column to something like is_event_filtered.

GregW04 commented Feb 13, 2026

@DominikaStefaniak PR review ready.
