Knowledge Driven Search Intent Mining

Jadhav, Ashutosh

2016, Doctor of Philosophy (PhD), Wright State University, Computer Science and Engineering PhD.
Understanding the latent intents behind users’ search queries is essential for satisfying their search needs. Search intent mining can help search engines improve their ranking of search results and enable features such as instant answers, personalization, search result diversification, and the recommendation of more relevant ads. Hence, there has been increasing attention on how to mine search intents effectively by analyzing search engine query logs. While state-of-the-art techniques can identify the domain of a query (e.g., sports, movies, health), identifying domain-specific intent is still an open problem. Among all the topics available on the Internet, health is one of the most important in terms of impact on the user and one of the most frequently searched. This dissertation presents a knowledge-driven approach for domain-specific search intent mining with a focus on health-related search queries. First, we identified 14 consumer-oriented health search intent classes based on input from focus group studies, analyses of popular health websites, literature surveys, and an empirical study of search queries. We defined the problem of classifying millions of health search queries into zero or more intent classes as a multi-label classification problem. Popular machine learning approaches for multi-label classification (namely, problem transformation and algorithm adaptation methods) were not feasible because of the difficulty of creating labeled data and because of health domain constraints. Another challenge in solving the search intent identification problem was mapping terms used by laypeople to medical terms. To address these challenges, we developed a semantics-driven, rule-based search intent mining approach that leverages rich background knowledge encoded in the Unified Medical Language System (UMLS) and a crowd-sourced encyclopedia (Wikipedia).
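The multi-label, rule-based framing described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the three intent labels, the trigger terms, and the small layperson-to-medical term map are hypothetical stand-ins, not the dissertation's 14 intent classes or its UMLS- and Wikipedia-derived rules, which this abstract does not list.

```python
from typing import Dict, List, Set

# Hypothetical intent classes and trigger terms, invented for illustration.
# The dissertation's actual classes and rules are not given in the abstract.
INTENT_RULES: Dict[str, Set[str]] = {
    "symptoms": {"symptom", "symptoms", "signs"},
    "treatment": {"treatment", "cure", "therapy"},
    "diet": {"diet", "food", "foods"},
}

# Hypothetical layperson-to-medical term map, standing in for the kind of
# background knowledge a resource such as UMLS could provide.
LAY_TO_MEDICAL: Dict[str, str] = {
    "heart attack": "myocardial infarction",
    "high blood pressure": "hypertension",
}


def normalize(query: str) -> str:
    """Lowercase the query and replace lay phrases with medical terms."""
    q = query.lower()
    for lay, medical in LAY_TO_MEDICAL.items():
        q = q.replace(lay, medical)
    return q


def classify(query: str) -> List[str]:
    """Return zero or more intent labels whose trigger terms occur in the query."""
    tokens = set(normalize(query).split())
    return sorted(label for label, terms in INTENT_RULES.items() if tokens & terms)
```

Because each rule fires independently, a query can receive several labels or none, which is what distinguishes this multi-label setup from single-label classification. Calling `classify("heart attack symptoms and treatment")` would match both the hypothetical `symptoms` and `treatment` rules, while a query with no trigger terms gets an empty list.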
The approach can identify search intent in a disease-agnostic manner and has been evaluated on three major diseases. While users often turn to search engines to learn about health conditions, a surprising amount of health information is also shared and consumed via social media, including public platforms such as Twitter. Although Twitter is an excellent information source, identifying informative tweets in the deluge of tweets is a major challenge. We used a hybrid approach consisting of supervised machine learning, rule-based classifiers, and biomedical domain knowledge to facilitate the retrieval of relevant and reliable health information shared on Twitter in real time. Furthermore, we extended our search intent mining algorithm to classify health-related tweets into health categories. Finally, we performed a large-scale study comparing health search intents, and the features that contribute to the expression of search intent, across more than 100 million search queries from smart devices (smartphones or tablets) and personal computers (desktops or laptops).
Amit Sheth, Ph.D. (Advisor)
Krishnaprasad Thirunarayan, Ph.D. (Committee Member)
Michael Raymer, Ph.D. (Committee Member)
Jyotishman Pathak, Ph.D. (Committee Member)
180 p.

Recommended Citations

  • Jadhav, A. (2016). Knowledge Driven Search Intent Mining [Doctoral dissertation, Wright State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=wright1464464707

    APA Style (7th edition)

  • Jadhav, Ashutosh. Knowledge Driven Search Intent Mining. 2016. Wright State University, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=wright1464464707.

    MLA Style (8th edition)

  • Jadhav, Ashutosh. "Knowledge Driven Search Intent Mining." Doctoral dissertation, Wright State University, 2016. http://rave.ohiolink.edu/etdc/view?acc_num=wright1464464707

    Chicago Manual of Style (17th edition)