TY - JOUR
T1 - App Miscategorization Detection
T2 - A Case Study on Google Play
AU - Surian, Didi
AU - Seneviratne, Suranga
AU - Seneviratne, Aruna
AU - Chawla, Sanjay
N1 - Publisher Copyright: © 1989-2012 IEEE.
PY - 2017/8
Y1 - 2017/8
N2 - An ongoing challenge in the rapidly evolving app market ecosystem is to maintain the integrity of app categories. At the time of registration, app developers have to select what they believe is the most appropriate category for their apps. Besides the inherent ambiguity of selecting the right category, the approach leaves open the possibility of misuse and potential gaming by the registrant. Periodically, the app store will refine the list of categories available and potentially reassign the apps. However, it has been observed that the mismatch between the description of an app and the category it belongs to persists. Although some common mechanisms exist (e.g., complaint-driven or manual checking), they are slow to detect miscategorized apps and leave the categorization challenge open. We introduce FRAC+: (FR)amework for (A)pp (C)ategorization. FRAC+ has the following salient features: (i) it is based on a data-driven topic model and automatically suggests the categories appropriate for the app store, and (ii) it can detect miscategorized apps. Extensive experiments attest to the performance of FRAC+. Experiments on Google Play show that FRAC+'s topics are more aligned with Google's new categories and that 0.35-1.10 percent of game apps are detected as miscategorized.
KW - App categorization
KW - app market
KW - miscategorization detection
KW - mixture model
KW - von Mises-Fisher distribution
UR - http://www.scopus.com/inward/record.url?scp=85029113458&partnerID=8YFLogxK
U2 - 10.1109/TKDE.2017.2686851
DO - 10.1109/TKDE.2017.2686851
M3 - Article
AN - SCOPUS:85029113458
SN - 1041-4347
VL - 29
SP - 1591
EP - 1604
JO - IEEE Transactions on Knowledge and Data Engineering
JF - IEEE Transactions on Knowledge and Data Engineering
IS - 8
M1 - 7885558
ER -