DHT-based Collaborative Web Translation

2016, PhD, University of Cincinnati, Engineering and Applied Science: Computer Science and Engineering.
Linguistic diversity on the web has stimulated demand for web translation. Online translation services backed by machine translation can translate a web page in place within seconds, giving a web user a general idea of what the page is about. Peer-to-peer applications are increasingly adopting the Distributed Hash Table (DHT), a serverless approach that distributes load among all interconnected devices. Kademlia and its variant Mainline DHT have been widely adopted. A DHT runs over an overlay network in which interconnected devices and keys alike are assigned IDs output by a hash function. The value of a key is stored on the node (or on several nodes if needed) whose node ID is closest to the ID of the key. Web applications are now in widespread use thanks to the merits of cloud computing, such as cross-platform compatibility and server-centric software maintenance. Node.js unifies server-side and client-side coding and has thrived in recent years. Browserify enables web browsers to use existing Node.js modules from the colossal npm repository. WebRTC makes it possible for disparate web browsers to communicate in real time, paving the way for a browser-based peer-to-peer network in which crowdsourced translation can be achieved. Technologies such as bookmarklets, userscripts and browser extensions empower Internet users to personalize their favorite websites on the fly. The success of Google Translate can be attributed to its responsiveness, acceptable accuracy, and integration with web browsers. However, Google Translate suffers from drawbacks in data privacy, service availability, idiomatic translation, context sensitivity, personalization, minority languages, transliterated text, etc. A DHT-based system for collaborative web translation is proposed to address those drawbacks.
The modules of the proposed system include an embedded graphical user interface for translation, a peer that exchanges translations with its counterparts in an overlay network through WebRTC, translations held in volatile memory, non-volatile storage areas for data persistence, and optional servers that assist peer discovery and connection establishment. The proposed system is intended to function in both single-tab and multi-tab scenarios and is capable of synchronizing between web browser tabs and peers. It adopts a three-level matching scheme for translation download, taking hyperlinks into account. A caching mechanism is devised to boost performance and lessen network traffic during synchronization. Human users are able to download existing translations from the overlay network, apply them in their web browsers, and upload each piece of translation (whether or not it is their original work) to the overlay network on a voluntary basis. The voluntary sharing mechanism, along with the inherent hashing of content in a DHT, helps protect user privacy. A signaling server has been devised to cluster users with similar language backgrounds and to get around Network Address Translation (NAT) and proxies, particularly in a multi-tab scenario. Security measures such as Transport Layer Security (TLS) have been taken to guard against common network attacks. A prototype of the proposed system has been implemented. Experimental results showed that the prototype remained responsive in heavy-duty tasks and scaled to hundreds of peers.
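
As a rough illustration of the key placement rule described in the abstract (a value is stored on the node or nodes whose ID is closest to the ID of the key), the following Python sketch uses the XOR distance that Kademlia defines over 160-bit IDs. It is not taken from the dissertation: the function names (make_id, xor_distance, closest_nodes), the use of SHA-1 to derive IDs, the peer names, and the example key are all assumptions made for illustration.

```python
import hashlib

ID_BITS = 160  # Kademlia and Mainline DHT use 160-bit identifiers


def make_id(name: str) -> int:
    """Map a string to a 160-bit ID with SHA-1 (an illustrative choice)."""
    return int.from_bytes(hashlib.sha1(name.encode("utf-8")).digest(), "big")


def xor_distance(a: int, b: int) -> int:
    """Kademlia's distance metric: bitwise XOR of two IDs, read as an integer."""
    return a ^ b


def closest_nodes(key_id: int, node_ids: list[int], k: int = 1) -> list[int]:
    """Return the k node IDs nearest to key_id under the XOR metric."""
    return sorted(node_ids, key=lambda n: xor_distance(n, key_id))[:k]


# Hypothetical peers and a hypothetical key for one piece of translated text.
peers = {make_id(f"peer-{i}"): f"peer-{i}" for i in range(8)}
key = make_id("https://example.org/page#paragraph-1")

# Pick the two peers that would hold the value for this key.
replicas = closest_nodes(key, list(peers), k=2)
print([peers[n] for n in replicas])
```

Because only the hash of the key is used for placement, a peer holding a value does not need to know the original key string, which is in line with the abstract's remark that hashing of content helps protect user privacy. A real DHT would locate these nodes through iterative routing-table lookups rather than by sorting a complete list of peers.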
Chia Han, Ph.D. (Committee Chair)
Fred Annexstein, Ph.D. (Committee Member)
Anca Ralescu, Ph.D. (Committee Member)
William Wee, Ph.D. (Committee Member)
Xuefu Zhou, Ph.D. (Committee Member)
133 p.

Recommended Citations

  • Tu, Z. (2016). DHT-based Collaborative Web Translation [Doctoral dissertation, University of Cincinnati]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1479821556144121

    APA Style (7th edition)

  • Tu, Zongjie. DHT-based Collaborative Web Translation. 2016. University of Cincinnati, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=ucin1479821556144121.

    MLA Style (8th edition)

  • Tu, Zongjie. "DHT-based Collaborative Web Translation." Doctoral dissertation, University of Cincinnati, 2016. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1479821556144121

    Chicago Manual of Style (17th edition)