SimpleKeywordCheck: Case insensitive checks in Unicode. #24

Closed
opened 2022-10-18 17:09:10 +00:00 by CSDUMMI · 2 comments
CSDUMMI commented 2022-10-18 17:09:10 +00:00 (Migrated from gitlab.com)

SimpleKeywordCheck has two problems:

  1. It should make a case-insensitive comparison of two strings by lowercasing both. But `string.lower` only lowercases ASCII characters (e.g. Ä is not translated to ä).
  2. It should check every word in a string, but the `"%a+"` pattern only matches ASCII letters and thus cuts off at the first non-ASCII character (e.g. `utils.str_words("bÄnmich")` -> `{ "b", "mich" }`).
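
One possible direction for both problems, as a minimal sketch rather than the project's actual code: it assumes Lua 5.3+ (for the `utf8` library), UTF-8 encoded input and source file, and the default C locale. The names `LOWER`, `utf8_lower`, `utf8_words` and `contains_keyword` are hypothetical, and the `LOWER` table is deliberately partial; complete case folding would need real Unicode case-mapping data.

```lua
-- Hypothetical sketch, not the project's code. Requires Lua 5.3+ (utf8 library).

-- Partial lowercase map for non-ASCII letters (assumption: extend as needed).
local LOWER = {
  ["Ä"] = "ä", ["Ö"] = "ö", ["Ü"] = "ü",
}

-- Lowercase a UTF-8 string: ASCII via string.lower, other characters via LOWER.
local function utf8_lower(s)
  return (s:gsub(utf8.charpattern, function(ch)
    if #ch == 1 then
      return ch:lower()      -- single byte: plain ASCII
    end
    return LOWER[ch] or ch   -- multi-byte: mapped if known, else unchanged
  end))
end

-- Split into words, treating every byte >= 0x80 as part of a word so that
-- multi-byte UTF-8 characters no longer break a word apart.
-- Limitation: non-ASCII punctuation and spaces also count as word characters.
local function utf8_words(s)
  local words = {}
  for w in s:gmatch("[%a\128-\255]+") do
    words[#words + 1] = w
  end
  return words
end

-- Case-insensitive whole-word check built from the two helpers above.
local function contains_keyword(text, keyword)
  local needle = utf8_lower(keyword)
  for _, w in ipairs(utf8_words(text)) do
    if utf8_lower(w) == needle then
      return true
    end
  end
  return false
end
```

With these helpers, `utf8_words("bÄnmich")` keeps the word in one piece, and `contains_keyword("Hallo BÄR", "bär")` matches because both sides are lowercased through the same map.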
CSDUMMI commented 2022-10-18 17:09:10 +00:00 (Migrated from gitlab.com)

assigned to @CSDUMMI

CSDUMMI commented 2023-03-01 12:15:05 +00:00 (Migrated from gitlab.com)

If we ever implement a keyword check via Subscriptions, this issue will have to be revisited. But as of v0.7.0, the WholeWordCheck (a.k.a. SimpleKeywordCheck) has been removed.

CSDUMMI (Migrated from gitlab.com) closed this issue 2023-03-01 12:15:05 +00:00
Reference
babka/activitycolander#24