Using Apache Spark's MLlib to Predict Closed Questions on Stack Overflow

2016, Master of Computing and Information Systems, Youngstown State University, Department of Computer Science and Information Systems.
Monitoring post quality on Stack Overflow is critical to keeping the site useful for its users. The site strongly discourages unproductive discussion and unrelated questions. Questions can be closed for several reasons, ranging from being unrelated to programming to being unlikely to lead to a productive answer. Manual moderation of the site's content is tedious, as approximately seventeen thousand new questions are posted every day. Leveraging machine learning algorithms to identify bad questions would therefore save the community considerable time. The goal of this thesis is to build a machine learning classifier that predicts whether a question will be closed, given various textual and post-related features. A model was trained using Apache Spark's machine learning library (MLlib). The model not only predicts closed questions with good accuracy but also computes its results in a very short time.
Alina Lazar, PhD (Advisor)
Bonita Sharif, PhD (Committee Member)
Yong Zhang, PhD (Committee Member)
37 p.

Recommended Citations

  • Madeti, P. (2016). Using Apache Spark's MLlib to Predict Closed Questions on Stack Overflow [Master's thesis, Youngstown State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=ysu1463790062

    APA Style (7th edition)

  • Madeti, Preetham. Using Apache Spark's MLlib to Predict Closed Questions on Stack Overflow. 2016. Youngstown State University, Master's thesis. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=ysu1463790062.

    MLA Style (8th edition)

  • Madeti, Preetham. "Using Apache Spark's MLlib to Predict Closed Questions on Stack Overflow." Master's thesis, Youngstown State University, 2016. http://rave.ohiolink.edu/etdc/view?acc_num=ysu1463790062

    Chicago Manual of Style (17th edition)