WEBEVO: TAMING WEB APPLICATION EVOLUTION VIA SEMANTIC CHANGE DETECTION

2020, Master of Sciences, Case Western Reserve University, EECS - Computer and Information Sciences.
With the development of web technology and the arrival of the Big Data era, websites have become increasingly rich in content. A website typically consists of a set of linked web pages, and each web page is a collection of information written in HTML that can be displayed to a user in a web browser such as Chrome. Browsing websites has become a vital part of daily life, helping people with information, entertainment, education, and business. This has led to the development of various technologies for extracting data from websites. To attract users, web pages continuously evolve with more elaborate UIs, enabled by increasingly advanced techniques. However, these techniques also make web pages harder to develop and test, compromising their quality and reliability. Meanwhile, changes in web page structure make it difficult to keep information retrieval (IR) tools and automated web test scripts functioning properly. To resolve these problems, it is important to build web monitoring tools that track changes to web elements (e.g., adding or updating a text box) and report those changes to developers in time to help maintain the quality of web pages.
To detect changes between two versions of a web page, existing approaches either analyze the DOM tree of the page or apply visual analysis techniques. An HTML web page is essentially an XML document, in which each web element is an XML element and thousands of web elements form a Document Object Model (DOM) tree. Comparing the DOM-tree structures of two web pages can reveal how a node changed. However, such analysis is quite limited due to complex website structures. Alternatively, existing work adopts visual analysis, using image processing techniques to detect changes by computing the similarity of screenshots of the two web pages. But such analysis is easily affected by background colors and the images of other web elements, and can produce incorrect results.
To address the challenges faced by existing approaches, this thesis proposes a novel framework, WebEvo, that synergistically combines DOM-tree based comparison with a novel non-content change detection module that leverages semantic and visual information to identify relevant structural changes. In other words, WebEvo not only considers basic DOM-tree structural changes in the new web page, but also analyzes changes in the semantics and appearance of each element and identifies the mappings for the changed elements in the new web page. We implemented three modules to process a web page: the DOM-Tree Based Changed Element Detection module outputs changes to web elements by analyzing the HTML DOM tree; the History-based Detection module filters out dynamic web elements; and the Semantics-based module analyzes the similarity of both the text content and the screenshots of web elements. We evaluate WebEvo on datasets constructed from 10 real-world web applications. The results show that WebEvo achieves improvements of 16.6%, 5.4%, and 12.6% over state-of-the-art tools in precision, recall, and F1 score, respectively. WebEvo is also 35.2% faster in performing the analysis than these tools.
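The DOM-tree comparison that the abstract describes as the baseline can be illustrated with a minimal sketch. This is not WebEvo's implementation: the names `ElementCollector`, `collect`, and `diff_pages`, the positional-path scheme, and the two sample pages are all assumptions made for illustration, using only Python's standard `html.parser` module.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag (a small, illustrative subset).
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class ElementCollector(HTMLParser):
    """Record each element under a positional path such as 'html[0]/body[0]/input[1]'."""

    def __init__(self):
        super().__init__()
        self.stack = []        # open elements as (path, child_counts, tag)
        self.root_counts = {}  # per-tag counters for top-level elements
        self.elements = {}     # path -> {"attrs": dict, "text": str}

    def handle_starttag(self, tag, attrs):
        counts = self.stack[-1][1] if self.stack else self.root_counts
        index = counts.get(tag, 0)
        counts[tag] = index + 1
        parent = self.stack[-1][0] + "/" if self.stack else ""
        path = f"{parent}{tag}[{index}]"
        self.elements[path] = {"attrs": dict(attrs), "text": ""}
        if tag not in VOID_TAGS:
            self.stack.append((path, {}, tag))

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        # Close the nearest open element with this tag, tolerating unclosed children.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][2] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        if self.stack and data.strip():
            self.elements[self.stack[-1][0]]["text"] += data.strip()


def collect(html):
    """Parse an HTML string and return a mapping of element path to its contents."""
    parser = ElementCollector()
    parser.feed(html)
    parser.close()
    return parser.elements


def diff_pages(old_html, new_html):
    """Report element paths that were added, removed, or updated between two versions."""
    old, new = collect(old_html), collect(new_html)
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "updated": sorted(p for p in old.keys() & new.keys() if old[p] != new[p]),
    }


# Hypothetical page versions: a text box is renamed and a submit button is added.
OLD = '<html><body><form><input type="text" name="q"></form></body></html>'
NEW = ('<html><body><form><input type="text" name="query">'
       '<input type="submit"></form></body></html>')

changes = diff_pages(OLD, NEW)
print(changes)
```

Because this sketch matches elements purely by position, inserting a single element near the top of a page shifts the paths of its later siblings and makes them look changed. That brittleness is one way to read the abstract's remark that DOM-only analysis is limited on complex pages.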
Xusheng Xiao (Advisor)
Xusheng Xiao (Committee Chair)
Andy Podgurski (Committee Member)
An Wang (Committee Member)
Yanfang (Fanny) Ye (Committee Member)
73 p.

Recommended Citations

  • Xu, R. (2020). WEBEVO: TAMING WEB APPLICATION EVOLUTION VIA SEMANTIC CHANGE DETECTION [Master's thesis, Case Western Reserve University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=case1595242401982817

    APA Style (7th edition)

  • Xu, Rui. WEBEVO: TAMING WEB APPLICATION EVOLUTION VIA SEMANTIC CHANGE DETECTION. 2020. Case Western Reserve University, Master's thesis. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=case1595242401982817.

    MLA Style (8th edition)

  • Xu, Rui. "WEBEVO: TAMING WEB APPLICATION EVOLUTION VIA SEMANTIC CHANGE DETECTION." Master's thesis, Case Western Reserve University, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=case1595242401982817

    Chicago Manual of Style (17th edition)