Intelligent Caching to Mitigate the Impact of Web Robots on Web Servers

Rude, Howard Nathan

2016, Master of Science (MS), Wright State University, Computer Science.
With an ever-increasing amount of data shared and posted on the Web, the desire and necessity to automatically glean this information has led to an increase in the sophistication and volume of software agents called web robots or crawlers. Recent measurements, including our own across the entire logs of Wright State University Web servers over the past two years, suggest that at least 60% of all requests originate from robots rather than humans. Web robots display different statistical and behavioral patterns in their traffic compared to humans, yet present Web server optimizations presume that traffic exhibits predominantly human-like characteristics. Robots may thus be silently degrading the performance and scalability of our web systems. This thesis investigates a new take on a classic performance tool, namely web caches, to mitigate the impact of robot traffic on web server operations. It proposes a cache system architecture that: (i) services robot and human traffic in separate physical memory stores, with separate policies; (ii) uses an adaptable policy for admitting robot-related resources; (iii) combines a deep neural network with Bayesian models to improve request prediction. Experiments with real data demonstrate (i) a significant reduction in bandwidth usage for prefetching and (ii) improvements in hit rate for human-driven traffic compared to a number of baselines, especially in configurations where web caches have limited size.
Derek Doran, Ph.D. (Committee Chair)
Tanvi Banerjee, Ph.D. (Committee Member)
John Gallagher, Ph.D. (Committee Member)
60 p.
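
The first two architectural points in the abstract (separate stores for robot and human traffic, and an admission policy for robot-related resources) can be illustrated with a minimal sketch. The abstract does not specify the thesis's actual eviction or admission policies, so everything here is an assumption made for illustration: the class names `LRUStore` and `PartitionedCache`, the use of LRU eviction in both stores, and the size-threshold admission rule controlled by the hypothetical parameter `robot_max_bytes`. The request-prediction component (deep neural network with Bayesian models) is not covered.

```python
from collections import OrderedDict


class LRUStore:
    """Fixed-capacity store that evicts the least recently used key.

    LRU is an assumed policy, not one named in the abstract.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the oldest entry


class PartitionedCache:
    """Keeps robot and human traffic in separate stores (hypothetical design)."""

    def __init__(self, human_capacity, robot_capacity, robot_max_bytes):
        self.stores = {
            "human": LRUStore(human_capacity),
            "robot": LRUStore(robot_capacity),
        }
        self.robot_max_bytes = robot_max_bytes
        self.hits = {"human": 0, "robot": 0}
        self.misses = {"human": 0, "robot": 0}

    def admit(self, kind, body):
        # Assumed admission rule: robot-requested resources are cached only
        # when they are small; human-requested resources are always admitted.
        if kind == "robot":
            return len(body) <= self.robot_max_bytes
        return True

    def fetch(self, kind, url, origin):
        """Return the resource for url, calling origin(url) on a miss."""
        store = self.stores[kind]
        body = store.get(url)
        if body is not None:
            self.hits[kind] += 1
            return body
        self.misses[kind] += 1
        body = origin(url)
        if self.admit(kind, body):
            store.put(url, body)
        return body
```

Because each traffic class has its own store, a burst of robot requests can only evict entries from the robot store, which is one plausible reading of why separating the stores could protect the hit rate of human-driven traffic. How a request is classified as robot or human is outside this sketch; the caller passes `kind` explicitly.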

Recommended Citations


  • Rude, H. N. (2016). Intelligent Caching to Mitigate the Impact of Web Robots on Web Servers [Master's thesis, Wright State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=wright1482416834896541

    APA Style (7th edition)

  • Rude, Howard. Intelligent Caching to Mitigate the Impact of Web Robots on Web Servers. 2016. Wright State University, Master's thesis. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=wright1482416834896541.

    MLA Style (8th edition)

  • Rude, Howard. "Intelligent Caching to Mitigate the Impact of Web Robots on Web Servers." Master's thesis, Wright State University, 2016. http://rave.ohiolink.edu/etdc/view?acc_num=wright1482416834896541

    Chicago Manual of Style (17th edition)