Pesty russian detection #76

Merged
JonPurvis merged 8 commits into pestphp:master from faissaloux:russian-detection
Jun 28, 2025

Conversation

@faissaloux
Contributor

@faissaloux faissaloux commented Jun 26, 2025

In this PR I made Russian detection more Pesty.

Before

  • We had to call RussianNormalizer directly.
  • The .php extension had to be appended to the file name.
  • It did not use the pest-plugin-profanity syntax and did not feel like Pest.
  • On failure, it threw the global \Exception.
  • It did not display the profanity the way the package does (see image).
RussianNormalizer::assertNoRussianProfanity('Tests\Fixtures\HasExplicitRussianProfanity.php');

image

After

  • More Pesty and clean.
expect('Tests\Fixtures\HasExplicitRussianProfanity')->toHaveNoProfanity(language: 'ru');

image

PS: I am not sure whether the masked profanities in HasMaskedRussianProfanity should be detected. They are not detected at the moment. If they are not meant to be, I think I should remove that file.

@JonPurvis
Collaborator

Russian ideally should work the same way all other languages work IMO. Technically, couldn't any of the languages have a masked version or is this something specific to Russian?

@faissaloux
Contributor Author

> Russian ideally should work the same way all other languages work IMO. Technically, couldn't any of the languages have a masked version or is this something specific to Russian?

Yeah I agree!

I don't think it's specific to Russian; this could be done in any other language. So if we support it for Russian, we should support it for every other language. I therefore agree with not supporting masked profanities (at least for now).

Should I work on this?

@faissaloux
Contributor Author

One more thing I am not sure about is whether we need to normalize the Russian text, so I am testing a bit to be sure before removing or keeping the normalization.

@JonPurvis
Collaborator

> > Russian ideally should work the same way all other languages work IMO. Technically, couldn't any of the languages have a masked version or is this something specific to Russian?
>
> Yeah I agree!
>
> I don't think it's specific to Russian; this could be done in any other language. So if we support it for Russian, we should support it for every other language. I therefore agree with not supporting masked profanities (at least for now).
>
> Should I work on this?

Yes please!

@faissaloux
Contributor Author

I kept the normalisation, as I found it is necessary in some cases: Russian uses the Cyrillic script, which has characters that resemble Latin letters and digits and can be swapped for them.

Example

  • б => 6
  • д => A

ебло can be written as е6ло
говно can be written as roBHo

For the normalisation I relied only on visual similarities between letters, but I am not sure whether Russians actually write this way, so I'll keep it like this for now, until someone with the right knowledge can correct it.
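
To make the lookalike idea concrete, here is a minimal sketch of such a normalisation step. It is not the plugin's actual `RussianNormalizer`: the function name `normalizeLookalikes` and the map are assumptions, built only from the pairs listed above (6 for б, A for д) and the letters implied by the `roBHo` example.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper, not part of pest-plugin-profanity.
 * Replaces Latin letters and digits that resemble Cyrillic letters
 * with the Cyrillic letters they imitate.
 */
function normalizeLookalikes(string $text): string
{
    // Assumed map, derived only from the examples in this thread.
    $lookalikes = [
        '6' => 'б',
        'A' => 'д',
        'r' => 'г',
        'o' => 'о', // Latin "o" to Cyrillic "о"
        'B' => 'в',
        'H' => 'н',
    ];

    // strtr() with an array replaces every key in a single pass, so
    // characters that were already substituted are not replaced again.
    return strtr($text, $lookalikes);
}

var_dump(normalizeLookalikes('roBHo') === 'говно'); // bool(true)
var_dump(normalizeLookalikes('е6ло') === 'ебло');   // bool(true)
```

Because the map is applied blindly, it would also rewrite ordinary Latin text (for example, `Hello` becomes `нellо`), so a step like this only makes sense when the text is being checked against the Russian word list.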

@JonPurvis
Collaborator

Sounds good to me, is this ready to be merged @faissaloux?

@faissaloux
Contributor Author

Yes 🚀

@JonPurvis JonPurvis merged commit a2dc6d2 into pestphp:master Jun 28, 2025
20 checks passed
@JonPurvis
Collaborator

Thanks, new release tagged! 🏷️

@faissaloux faissaloux deleted the russian-detection branch June 28, 2025 21:08
JonPurvis pushed a commit that referenced this pull request Jun 29, 2025