Add SubscriptionsWordCheck #63

Merged
CSDUMMI merged 3 commits from SubscriptionsWordCheck into main 2023-03-15 17:11:31 +00:00
CSDUMMI commented 2023-03-06 20:06:55 +00:00 (Migrated from gitlab.com)

This MR adds a new SubscriptionsWordCheck. It follows the same pattern as the SubscriptionsDomainCheck:

  1. Fetch a CSV file from a URI.
  2. Parse the CSV file.
  3. Match activities against the contents of the CSV file.

The API differs only in that the SubscriptionsWordCheck replaces the field #domain with #pattern (see examples/word_blocklist.csv).
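
The three steps above can be sketched roughly as follows. This is an illustration only, not the code in this MR: the function names, the `#public_comment` column, and the plain substring match are assumptions. Only the `#pattern` field name comes from the description. Fetching from a URI (step 1) is replaced by an inline string so the sketch runs offline.

```python
import csv
import io


def parse_blocklist(csv_text):
    """Step 2: parse CSV text into a list of row dicts keyed by header name."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def match_content(content, rows):
    """Step 3: return the first row whose '#pattern' occurs in content, else None.

    Hypothetical helper: a plain, case-sensitive substring match.
    """
    for row in rows:
        pattern = row.get("#pattern", "")
        if pattern and pattern in content:
            return row
    return None


# Step 1 stand-in: in practice the text would be fetched from the configured URI.
# The '#public_comment' column is an assumed extra column, not confirmed by the MR.
SAMPLE_CSV = '#pattern,#public_comment\nnazi,""\n'

rows = parse_blocklist(SAMPLE_CSV)
hit = match_content("<p>@alice nazi</p>", rows)
score = 1 if hit else 0  # a matching activity gets score 1, as in the report below
```

Keeping the parse and match steps as separate functions mirrors the three-step structure and lets the matching rule be changed without touching the CSV handling.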

Below you can see a sample EmailReport created by this Check.

This is an automated report by ActivityColander. The system has detected a message it marks as 'spam'

Sender:     @csdummi@babka.social
ID:     https://babka.social/users/csdummi/statuses/109978107600014847
Activity Type:  Create
Result: spam: [1/2]
Pipeline:   
1. Subscriptions Domain Check:
    Score:  0
    Weight: 1
    Note:   Domain babka.social domain does not match any patterns
2. Subscriptions Word Check:
    Score:  1
    Weight: 1
    Note:   <p><span class="h-card"><a href="https://1.test.jorisgutjahr.eu/@alice" class="u-url mention">@<span>alice</span></a></span> nazi</p> matched nazi from https://gitlab.com/babka_net/activitycolander/-/raw/SubscriptionsWordCheck/examples/word_blocklist.csv because of ""
Received at:    Mon Mar  6 19:59:14 2023
Goodbye,
ActivityColander


CSDUMMI commented 2023-03-06 20:06:55 +00:00 (Migrated from gitlab.com)

requested review from @emacsen

CSDUMMI commented 2023-03-06 20:06:55 +00:00 (Migrated from gitlab.com)

assigned to @CSDUMMI

CSDUMMI commented 2023-03-06 20:07:48 +00:00 (Migrated from gitlab.com)

added 1 commit

  • e8be7c61 - Add SubscriptionsWordCheck to the default pipeline

CSDUMMI commented 2023-03-15 16:49:03 +00:00 (Migrated from gitlab.com)

TODO: case-insensitive matching
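
One possible way to address this, sketched under the assumption that patterns are matched as plain substrings (the helper name is hypothetical and not part of this MR), is to case-fold both sides before comparing:

```python
def matches_ignoring_case(content, pattern):
    """Return True if pattern occurs in content, ignoring case.

    Hypothetical helper: str.casefold() is a more aggressive form of
    lower() intended for caseless comparison of Unicode text.
    An empty pattern never matches.
    """
    return bool(pattern) and pattern.casefold() in content.casefold()
```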

CSDUMMI commented 2023-03-15 17:11:31 +00:00 (Migrated from gitlab.com)

mentioned in commit af9b6c7bfa44f6b8cb50230a18bd8f1468cbc8dc
CSDUMMI (Migrated from gitlab.com) merged commit af9b6c7bfa into main 2023-03-15 17:11:32 +00:00