aboutsummaryrefslogtreecommitdiffstats
path: root/tests/test_helpers.py
diff options
context:
space:
mode:
authorGravatar Shirayuki Nekomata <[email protected]>2019-10-18 10:39:27 +0700
committerGravatar Shirayuki Nekomata <[email protected]>2019-10-18 10:39:27 +0700
commit8c2871f89846f2c34f52061323dc7503855266ea (patch)
tree34ef3457570340c1dcc3d0aad7855ae029bcba79 /tests/test_helpers.py
parentFix rule alias. (#537) (diff)
Make it easier for user to search for tags
#### Closes #231 Applying the algorithm for `Needles and Haystack` to find and match tag in tags, for example: ![Example](https://cdn.discordapp.com/attachments/634243438459486219/634592981915140107/unknown.png) This only applies to searching tag_name with more than 3 in length, and at least 80% of its letters are found, from left to right. There are 3 levels of checking, stop at first found: - Check if exact name ( case insensitive ) O(1) getting from a dictionary Dict[str, Tag] - Check for all tags that has 100% matching via algorithm - Check for all tags that has >= 80% matching If there are more than one hit, it will be shown as suggestions: ![Suggestions](https://cdn.discordapp.com/attachments/634243438459486219/634595369531211778/unknown.png) In order to avoid api being called multiple times, I've implemented a cache to only refresh itself when the is a gap of more than 5 minutes from the last api call to get all tags. Editing / Adding / Deleting tags will also modify the cache directly. ##### What about other solution like fuzzywuzzy? fuzzywuzzy was considered for using, but from testing, it was giving much lower scores than expected: Code used to test: ```py from fuzzywuzzy import fuzz def _fuzzy_search(search: str, target: str) -> bool: found = 0 index = 0 _search = search.lower().replace(' ', '') _target = target.lower().replace(' ', '') for letter in _search: index = _target.find(letter, index) if index == -1: break found += index > 0 # return found / len(_search) * 100 return ( found / len(_search) * 100, fuzz.ratio(search, target), fuzz.partial_ratio(search, target) ) tests = ( 'this-is-gonna-be-fun', 'this-too-will-be-fun' ) for test in tests: print(test, '->', _fuzzy_search('this too fun', test)) ``` Result from test: ```py this-is-gonna-be-fun -> (30.0, 50, 50) this-too-will-be-fun -> (90.0, 62, 58) ```
Diffstat (limited to 'tests/test_helpers.py')
0 files changed, 0 insertions, 0 deletions