@KristijanArmeni
Collaborator

Pull Request type

Please check the type of change your PR introduces:

  • Bugfix
  • Feature
  • Code style update (formatting, renaming)
  • Refactoring (no functional changes, no API changes)
  • Build-related changes
  • Documentation content changes
  • Other (please describe):

What is the current behavior?

Issue Number: N/A

What is the new behavior?

Does this introduce a breaking change?

  • Yes
  • No

Other information

KristijanArmeni and others added 30 commits October 15, 2025 10:57
…and edge case fixes (civictechdc#248)

* add search features as a test
* Add search functionality to ngram web app
* Fix ngram web app issues
* chore: format code with black and isort
* refactor: cleanup and merging search functionalities
* fix: move click information to a separate routine
* feat: update the data viewer panel to work with filtering
* feat: add optional df arg to get_top_n_stats
* feat: handle reset button in data viewer
* ux: formatting in data_info

---------

Co-authored-by: jaehoon <[email protected]>
…test suite refactoring (civictechdc#237)

Fix Korean script classification to use space-separated tokenization. Korean Hangul uses spaces to separate words (like Latin and Arabic), not scriptio continua (character-level) like Chinese/Japanese/Thai.

Co-authored-by: Kristijan Armeni <[email protected]>
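
The Korean tokenization fix above can be illustrated with a minimal sketch. It is not the project's actual tokenizer: the function names `is_scriptio_continua` and `tokenize` and the Unicode ranges are assumptions chosen for illustration. The idea is that Hangul falls through to whitespace splitting, while Chinese, Japanese kana, and Thai characters are emitted one at a time.

```python
def is_scriptio_continua(ch: str) -> bool:
    """Return True for characters treated as character-level tokens.

    Hypothetical helper: the ranges below are a simplified assumption,
    not the project's real script tables.
    """
    cp = ord(ch)
    return (
        0x4E00 <= cp <= 0x9FFF      # CJK Unified Ideographs
        or 0x3040 <= cp <= 0x309F   # Hiragana
        or 0x30A0 <= cp <= 0x30FF   # Katakana
        or 0x0E00 <= cp <= 0x0E7F   # Thai
    )


def tokenize(text: str) -> list[str]:
    """Split on whitespace, then break scriptio continua runs into characters.

    Hangul syllables (U+AC00 to U+D7A3) are deliberately absent from the
    ranges above, so Korean words stay whole, like Latin or Arabic words.
    """
    tokens: list[str] = []
    for chunk in text.split():
        buffer: list[str] = []
        for ch in chunk:
            if is_scriptio_continua(ch):
                # Flush any pending space-separated word before the character token.
                if buffer:
                    tokens.append("".join(buffer))
                    buffer = []
                tokens.append(ch)
            else:
                buffer.append(ch)
        if buffer:
            tokens.append("".join(buffer))
    return tokens
```

Under these assumptions, `tokenize("안녕하세요 세계")` keeps the two Korean words intact, whereas `tokenize("你好 world")` splits the Chinese characters individually and leaves `world` whole.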
* dev: add github workflow to build apps for preview

* fix: try remove reference to .sha variables

* fix: try remove reference to .sha variables
* Add time flexible case

* Remove print statements

* Change naming convention

* Adjust spacing

* Fix formatting

* Update naming convention to military time

* Remove unnecessary test case.

* test file properly formatted
Relates to civictechdc#200 module rename from mangotango to cibmangotree
* enh: drop within message ngram counting

* enh: update ngram analyzer interface version

* feat: remove within message summation in ngram_stats

* chore: update ngram_stats interface

* initial commit: add helper script to generate ngrams_test_input.csv

* test: update ngrams_base test data with a smaller, human readable synthetic test dataset

* test: update the input test file

* test: use params to test for 3 and 4-grams only

* chore: cleanup

* enh: add configurable parameters in the sidebar

* chore: rename ngram_stats and *web folder to match ngrams_base

* chore: update imports after module renaming

* docs: add README.md to ngrams_base root with some data tables

* docs: add input data frame description

* fix: use pl.len instead of deprecated pl.count

* fix: remove deprecated macOS 13 runner in build_exe.yml

* test: add _preprocess_messages wrapper

* refactor: add _extract_ngrams_from_messages() wrapper

* refactor: add _create_ngram_definitions() wrapper

* test: add pytest fixtures for loading data

* test: add remaining unit tests for ngrams_base/main.py

* refactor: add _compute_ngram_statistics wrapper

* refactor: add _create_summary_table

* refactor: add _create_full_report_slice()

* test: add unit test for _compute_ngram_statistics

* chore: cleanup unused fixtures
Created an ownership file, as prompted by the application to the Digital Public Goods Alliance.
The file states that no one owns Mango Tree and that it was created as a public good.

* Rename ownership.md to OWNERSHIP.md
---------

Co-authored-by: Kristijan Armeni <[email protected]>
)

doc: update README link, remove AI sections, not needed

6 participants