【Amazon面经】Top K Frequently Mentioned Keywords Amazon OA 开发岗

兔精精 2020-6-30 1894


Given a list of reviews, a list of keywords and an integer k. Find the most popular k keywords in order of most to least frequently mentioned.

The comparison of strings is case-insensitive. Multiple occurances of a keyword in a review should be considred as a single mention. If keywords are mentioned an equal number of times in reviews, sort alphabetically.

Example 1:

Input:

k = 2

keywords = ["anacell", "cetracular", "betacellular"]

reviews = [  

  "Anacell provides the best services in the city",  

  "betacellular has awesome services",  

  "Best services provided by anacell, everyone should use anacell",

]

Output:

["anacell", "betacellular"]

Explanation:

"anacell" is occuring in 2 different reviews and "betacellular" is only occuring in 1 review.

Example 2:

Input:

k = 2

keywords = ["anacell", "betacellular", "cetracular", "deltacellular", "eurocell"]

reviews = [  

  "I love anacell Best services; Best services provided by anacell",  

  "betacellular has great services",  

  "deltacellular provides much better services than betacellular",  

  "cetracular is worse than anacell",  

  "Betacellular is better than deltacellular.",

]

Output:

["betacellular", "anacell"]

Explanation:

"betacellular" is occuring in 3 different reviews. "anacell" and "deltacellular" are occuring in 2 reviews, but "anacell" is lexicographically smaller.

最新回复 (1)
  • Crusader 2021-3-2
    0 引用 2
    
    from collections import defaultdict
    
    def top_k(keywords, reviews, k_val):
       counts = defaultdict(int)
       for review in reviews:
           words_set = set()
           words = review.split()
           for word in words:
               lower_case_word = word.strip(',.').lower()
               words_set.add(lower_case_word)
           for keyword in keywords:
               if keyword in words_set:
                   counts[keyword] += 1
       return sorted(counts.keys(), key=lambda k: (-counts[k], k))[0:k_val]
    
返回