Files
22346.pdf (3.75 MB)
DHT-based Collaborative Web Translation
Author Info
Tu, Zongjie
Permalink:
http://rave.ohiolink.edu/etdc/view?acc_num=ucin1479821556144121
Year and Degree
2016, PhD, University of Cincinnati, Engineering and Applied Science: Computer Science and Engineering.
Abstract
Linguistic diversity on the web has stimulated demand for web translation. Online translation services backed by machine translation can translate a web page in place within seconds, giving the user a general idea of what the page is about. Peer-to-peer applications are increasingly adopting the Distributed Hash Table (DHT), a serverless approach that distributes load among all interconnected devices. Kademlia and its variant Mainline DHT have been widely adopted. A DHT runs over an overlay network in which interconnected devices and keys alike are assigned IDs output by a hash function. The value of a key is stored on the node (or on several nodes, if needed) whose node ID is closest to the ID of the key. Web applications are now in widespread use owing to the merits of cloud computing, such as cross-platform compatibility and server-centric software maintenance. Node.js unifies server-side and client-side coding and has thrived in recent years. Browserify enables web browsers to use existing Node.js modules from the colossal npm repository. WebRTC makes it possible for disparate web browsers to communicate in real time, paving the way for a browser-based peer-to-peer network in which crowdsourced translation can be achieved. Technologies such as bookmarklets, userscripts, and browser extensions empower Internet users to personalize their favorite websites on the fly. The success of Google Translate can be attributed to its responsiveness, acceptable accuracy, and integration with the web browser. However, Google Translate has drawbacks in areas such as data privacy, service availability, idiomatic translation, context sensitivity, personalization, minority languages, and transliterated text. A DHT-based system for collaborative web translation is proposed to address those drawbacks.
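The "closest node ID" rule described in the abstract can be illustrated with a minimal sketch. Kademlia measures closeness as the bitwise XOR of two IDs, read as an unsigned integer. Everything else here is an assumption made for brevity and is not taken from the dissertation: the function names, the 32-bit ID width (Kademlia uses 160-bit IDs), and FNV-1a as a stand-in hash function.

```typescript
// Illustrative sketch only. The 32-bit ID width and the FNV-1a hash are
// stand-ins (assumptions); Kademlia itself uses 160-bit IDs.

// Hash a string to an unsigned 32-bit ID with FNV-1a, using the low byte
// of each UTF-16 code unit (adequate for ASCII keys in this sketch).
function fnv1a32(text: string): number {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i) & 0xff;
    hash = Math.imul(hash, 0x01000193); // multiply by the FNV prime, mod 2^32
  }
  return hash >>> 0; // force an unsigned result
}

// Kademlia-style distance: XOR of the two IDs, read as an unsigned integer.
function xorDistance(a: number, b: number): number {
  return (a ^ b) >>> 0;
}

// Return the k node IDs closest to keyId, nearest first.
// The input array is copied so the caller's list is left untouched.
function closestNodes(keyId: number, nodeIds: number[], k: number): number[] {
  return nodeIds
    .slice()
    .sort((x, y) => xorDistance(keyId, x) - xorDistance(keyId, y))
    .slice(0, k);
}
```

Under these assumptions, a peer that wants to store or look up an entry for some key would compute `closestNodes(fnv1a32(key), knownNodeIds, k)` and contact the returned nodes; choosing `k` greater than 1 corresponds to the "or on several nodes" case.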
The modules of the proposed system include an embedded graphical user interface for translation, a peer component that exchanges translations with its counterparts in an overlay network through WebRTC, storage of translations in volatile memory, non-volatile storage areas for data persistence, and optional servers that assist with peer discovery and connection establishment. The proposed system is intended to function in both single-tab and multi-tab scenarios and can synchronize between web browser tabs and between peers. It adopts a three-level matching scheme for translation download that takes hyperlinks into account. A caching mechanism is devised to boost performance and reduce network traffic during synchronization. Users can download existing translations from the overlay network, apply them in their web browsers, and upload each piece of translation (whether or not it is their own original work) to the overlay network on a voluntary basis. The voluntary sharing mechanism, along with the inherent hashing of content in a DHT, helps protect user privacy. A signaling server has been devised to cluster users with similar language backgrounds and to get around Network Address Translation (NAT) and proxies, particularly in a multi-tab scenario. Security measures such as Transport Layer Security (TLS) have been taken to guard against common network attacks. A prototype of the proposed system has been implemented. Experimental results showed that the prototype remained responsive under heavy-duty tasks and scaled to hundreds of peers.
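The abstract does not say how the signaling server clusters users, so the following is only one plausible sketch of the idea: peers that share a language pair are placed in the same in-memory "room" and are told about each other when they join. The class name, the `JoinRequest` shape, and the `"source->target"` room key are all hypothetical, and no actual WebRTC signaling or networking is shown.

```typescript
// Hypothetical sketch: none of these names come from the dissertation.
type PeerId = string;

interface JoinRequest {
  peerId: PeerId;
  sourceLang: string; // language being translated from, e.g. "en"
  targetLang: string; // language being translated into, e.g. "zh"
}

class LanguageRooms {
  // Maps a room key such as "en->zh" to the peers currently in that room.
  private rooms = new Map<string, Set<PeerId>>();

  // Build a case-insensitive room key from a language pair (an assumption).
  private static roomKey(sourceLang: string, targetLang: string): string {
    return `${sourceLang.toLowerCase()}->${targetLang.toLowerCase()}`;
  }

  // Register the peer and return the peers already in the same room,
  // i.e. the candidates it could try to connect to.
  join(req: JoinRequest): PeerId[] {
    const key = LanguageRooms.roomKey(req.sourceLang, req.targetLang);
    const room = this.rooms.get(key) ?? new Set<PeerId>();
    const existing = Array.from(room).filter((id) => id !== req.peerId);
    room.add(req.peerId);
    this.rooms.set(key, room);
    return existing;
  }

  // Remove the peer from every room and drop rooms that become empty.
  leave(peerId: PeerId): void {
    this.rooms.forEach((room, key) => {
      room.delete(peerId);
      if (room.size === 0) {
        this.rooms.delete(key);
      }
    });
  }
}
```

In a real deployment the list returned by `join` would be the starting point for exchanging WebRTC offers and answers between the new peer and the existing ones; that exchange is outside the scope of this sketch.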
Committee
Chia Han, Ph.D. (Committee Chair)
Fred Annexstein, Ph.D. (Committee Member)
Anca Ralescu, Ph.D. (Committee Member)
William Wee, Ph.D. (Committee Member)
Xuefu Zhou, Ph.D. (Committee Member)
Pages
133 p.
Subject Headings
Computer Science
Keywords
DHT; Distributed Hash Table; Kademlia; Collaborative Translation; WebRTC; Crowdsourcing
Recommended Citations
Tu, Z. (2016). DHT-based Collaborative Web Translation [Doctoral dissertation, University of Cincinnati]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1479821556144121
APA Style (7th edition)
Tu, Zongjie. DHT-based Collaborative Web Translation. 2016. University of Cincinnati, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=ucin1479821556144121.
MLA Style (8th edition)
Tu, Zongjie. "DHT-based Collaborative Web Translation." Doctoral dissertation, University of Cincinnati, 2016. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1479821556144121
Chicago Manual of Style (17th edition)
Document number:
ucin1479821556144121
Download Count:
1,628
Copyright Info
© 2016, all rights reserved.
This open access ETD is published by University of Cincinnati and OhioLINK.