Files
Sinha Vinayak PDF A APPROVED.pdf (1.37 MB)
Sentiment Analysis On Java Source Code In Large Software Repositories
Author Info
Sinha, Vinayak
ORCID® Identifier: http://orcid.org/0000-0003-4792-254X
Permalink: http://rave.ohiolink.edu/etdc/view?acc_num=ysu1464880227
Abstract Details
Year and Degree
2016, Master of Computing and Information Systems, Youngstown State University, Department of Computer Science and Information Systems.
Abstract
While developers write code to accomplish their assigned tasks, their sentiments play a vital role and have a massive impact on quality and productivity. Sentiments can affect the tasks developers perform either positively or negatively. This thesis presents an analysis of developer commit logs for GitHub projects. In particular, developer sentiment in commits is analyzed across 28,466 projects within a seven-year time frame. We use the Boa infrastructure’s online query system to generate commit logs as well as the files that were changed during each commit. Two existing sentiment analysis frameworks (SentiStrength and NLTK) are used for sentiment extraction. Using these tools, we analyze the commits in three categories (large, medium, and small) based on the number of commits. In addition, we group the data by the day of the week on which the commit was made and map the sentiment to the file change history to determine whether there is any correlation. Although a majority of the sentiment was neutral, the negative sentiment was about 10% more than the positive sentiment overall. Tuesdays seem to have the most negative sentiment overall. In addition, we find a strong correlation between the number of files changed and the sentiment expressed by the commits the files were part of. It was also observed that SentiStrength and NLTK show consistent results and similar trends. Future work and implications of these results are discussed.
Committee
Bonita Sharif, PhD (Advisor)
Alina Lazar, PhD (Committee Member)
John Sullins, PhD (Committee Member)
Pages
71 p.
Subject Headings
Computer Science; Information Technology; Organizational Behavior
Keywords
Sentiment Analysis; Emotions; Commit logs; Java projects; Large Software Repositories
Recommended Citations
Sinha, V. (2016). Sentiment Analysis On Java Source Code In Large Software Repositories [Master's thesis, Youngstown State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=ysu1464880227
APA Style (7th edition)
Sinha, Vinayak. Sentiment Analysis On Java Source Code In Large Software Repositories. 2016. Youngstown State University, Master's thesis. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=ysu1464880227.
MLA Style (8th edition)
Sinha, Vinayak. "Sentiment Analysis On Java Source Code In Large Software Repositories." Master's thesis, Youngstown State University, 2016. http://rave.ohiolink.edu/etdc/view?acc_num=ysu1464880227
Chicago Manual of Style (17th edition)
Document number: ysu1464880227
Download Count: 16,362
Copyright Info
© 2016, all rights reserved.
This open access ETD is published by Youngstown State University and OhioLINK.