WWW'24: “Characterizing Ethereum Upgradable Smart Contracts and Their Security Implications”, AR=20.2%, Xiaofan Li, Jin Yang, Jiaqi Chen, Yuzhe Tang, Xing Gao. [preprint], [slides]
Middleware'21 (Industrial track): “Authenticated Key-Value Stores with Hardware Enclaves”, Yuzhe Tang, Kai Li, Q. Zhang, J. Xu, J. Chen. [pdf], [extended version], [slides]
Novelty: Presents a key-value store on SGX that keeps only a small amount of trusted code inside the enclave. The key insight is to achieve performance efficiency by collocating the Merkle hash tree (MHT) with the native log-structured merge (LSM) tree inside the KV store.
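One way to picture the collocation idea is to compute a Merkle root for each sorted run at the moment the LSM tree writes or compacts it, so the digest is maintained alongside the native structure. The following is a minimal sketch under that assumption, not the paper's implementation; the function names `merkle_root` and `compact` and the leaf encoding are hypothetical.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(run):
    """Merkle root over a sorted run of (key, value) string pairs.

    Leaves and internal nodes use different prefixes so that a leaf
    hash cannot be confused with an internal node hash.
    """
    level = [_h(b"\x00" + k.encode() + b"\x00" + v.encode()) for k, v in run]
    if not level:
        return _h(b"")
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = [_h(b"\x01" + level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]


def compact(newer, older):
    """Merge two sorted runs (newer values win) and digest the result."""
    merged = dict(older)
    merged.update(dict(newer))
    run = sorted(merged.items())
    return run, merkle_root(run)
```

In this sketch the digest is recomputed only when a run is rewritten, which is the point at which an LSM tree already touches every entry in the run.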
Middleware'20: “Cost-Effective Data Feeds to Blockchains via Workload-Adaptive Data Replication”, AR=25.2%, Kai Li, Yuzhe Tang, Jiaqi Chen, Zhehu Yuan, C. Xu, J. Xu. [pdf], [extended version], [slides], [talk@Middleware’20], [code]
Novelty: Presents a dynamic cost-optimization scheme for authenticated data feeds to smart contracts and blockchains. The key observation from our performance studies is that data pulling (i.e., blockchains pulling data on demand from off-chain data feeds) is better suited for write-intensive workloads, whereas data pushing (i.e., data feeds constantly pushing data to the blockchain) is better suited for read-intensive workloads. We propose GRuB, a workload-aware data replication scheme that adapts between data pulling and data pushing by monitoring the read-write ratio in recent transaction workloads.
Summary: This work presents a dynamic cost-optimization scheme for Ethereum's data feeds. Data feeding, that is, providing physical-world data to smart contracts running on blockchains, has become a popular paradigm for designing decentralized finance (DeFi) applications and is a multi-million business today. This work presents GRuB, a workload-aware data replication scheme that dynamically switches the data-feed flow between data pulling (i.e., smart contracts pull data to the blockchain on demand) and data pushing (i.e., off-chain data sources push data to the blockchain). The key idea is that static data pulling incurs an expensive transaction per read, while static data pushing incurs an expensive on-chain storage update per write. A workload-adaptive scheme can therefore avoid both inefficiencies when real-world DeFi workloads switch dynamically between read-intensive and write-intensive patterns. Technically, we propose new online decision-making algorithms with proven asymptotic efficiency (i.e., bounded competitiveness). We build a system prototype of GRuB that works with Ethereum and off-chain Google LevelDB. We evaluate GRuB's concrete performance under YCSB workloads and real DeFi workloads (e.g., stablecoins and pegged tokens). The results show that GRuB saves up to 55% in Gas compared with the baseline designs of statically pushed or pulled data feeds.
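The push-versus-pull trade-off can be illustrated with a toy controller that watches a sliding window of recent operations and picks whichever mode looks cheaper. This is a simplified sketch and not GRuB's online algorithm; the class name `AdaptiveFeed`, the window size, and the per-operation cost constants are all hypothetical.

```python
from collections import deque


class AdaptiveFeed:
    """Toy controller that switches between 'pull' and 'push' modes."""

    def __init__(self, pull_cost=3, push_cost=10, window=8):
        self.pull_cost = pull_cost   # assumed cost of one on-demand read
        self.push_cost = push_cost   # assumed cost of one on-chain update
        self.recent = deque(maxlen=window)
        self.mode = "pull"

    def observe(self, op):
        """Record a 'read' or 'write' and return the mode chosen next."""
        if op not in ("read", "write"):
            raise ValueError(f"unknown operation: {op!r}")
        self.recent.append(op)
        reads = sum(1 for o in self.recent if o == "read")
        writes = len(self.recent) - reads
        # Pulling pays per read; pushing pays per write.
        est_pull = reads * self.pull_cost
        est_push = writes * self.push_cost
        self.mode = "push" if est_push < est_pull else "pull"
        return self.mode
```

With these assumed costs, a window dominated by reads selects pushing and a window dominated by writes falls back to pulling, which matches the observation stated above.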
ICDE'19: “GEM^2-Tree: A Gas-Efficient Structure for Authenticated Range Queries in Blockchain”, Full Paper, AR=26.8%, C. Zhang, C. Xu, J. Xu, Yuzhe Tang, B. Choi. [pdf]
Novelty: Size matters in the design of authenticated data structures (ADS)! The key observation is that, in an ADS setting, an LSM tree is better suited than a B-tree for serving smaller datasets. This paper presents an adaptive ADS design that adjusts between an LSM tree and a B-tree based on the current dataset size.
Summary: This work presents data outsourcing schemes built on top of untrusted clouds and a trusted blockchain for authenticated range queries. The key design challenge tackled is "Gas" efficiency, that is, minimizing the amount of computation done on the expensive Ethereum blockchain. To do so, we propose a novel data authentication scheme that adaptively shape-shifts between a log-structured merge (LSM) tree and a B-tree. The key observation is that the most cost-effective data structure (i.e., LSM tree or B-tree) depends on the data size. We implement a system prototype on Ethereum, conduct extensive cost analysis, and evaluate the concrete monetary cost under YCSB, which shows that the proposed scheme reduces costs by more than 30% compared with state-of-the-art approaches.
BioInformatics’16: “HEALER: Homomorphic computation of ExAct Logistic rEgRession for secure rare disease variants analysis in GWAS”. Shuang Wang, Yuchen Zhang, Wenrui Dai, Kristin Lauter, Miran Kim, Yuzhe Tang, Hongkai Xiong and Xiaoqian Jiang. [pdf]
Summary: This work presents an application of homomorphic encryption to a classic biomedical data analysis, GWAS (genome-wide association studies). With homomorphically encrypted GWAS, an untrusted third-party cloud can extract useful insights from sensitive human genome data without compromising privacy. To reduce the storage and computation cost, we propose new algorithms that evaluate exact logistic regression using a circuit whose depth is logarithmic in the data size. We conduct a case study applying our algorithm to the rare Kawasaki disease.
TKDE’15: “Privacy-Preserving Multi-Keyword Search in Information Networks”. Yuzhe Tang, Ling Liu. [pdf]
Summary: In this work, we build a federated keyword search service over multiple independent document sites. In this distributed system, a site-discovery service is crucial for connecting a search to a potential target site. To prevent privacy leakage, the traffic going into and coming out of the discovery service is obfuscated; in other words, the keyword-to-site indices maintained by the discovery service are injected with a proper amount of noise that balances search/discovery utility against the degree of privacy preservation.
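The noise-injection idea can be sketched as adding decoy sites to each keyword's posting list, so that a published entry no longer reveals which sites truly hold the keyword. This is only an illustration under simple assumptions (independent, uniform decoy sampling), not the paper's mechanism; `obfuscate_index`, `noise_rate`, and `seed` are hypothetical names.

```python
import random


def obfuscate_index(index, all_sites, noise_rate, seed=None):
    """Return a noisy copy of a keyword-to-sites index.

    index:      dict mapping keyword -> set of sites that hold it
    all_sites:  every site known to the discovery service
    noise_rate: probability of adding each non-matching site as a decoy
    """
    rng = random.Random(seed)
    noisy = {}
    for keyword, true_sites in index.items():
        decoys = {
            site
            for site in sorted(all_sites)  # sorted for reproducible sampling
            if site not in true_sites and rng.random() < noise_rate
        }
        # True sites are always kept, so searches never miss a real match.
        noisy[keyword] = set(true_sites) | decoys
    return noisy
```

A higher `noise_rate` hides the true sites among more decoys but sends each search to more irrelevant sites, which is the utility/privacy balance described above.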
ACSAC’14: “Lightweight Authentication of Freshness in Outsourced Key-Value Stores”. AR=19.9%, Yuzhe Tang, Ting Wang, Ling Liu, Xin Hu, Jiyong Jang. [pdf], [slides]
Summary: This work tackles efficient authentication of outsourced data on untrusted cloud storage (e.g., Amazon S3 services). The key technique is a query authentication data structure that is efficient (i.e., yields short proofs) for both update and read operations. The idea is to use Bloom filters to encode data membership hierarchically in a Merkle tree, which results in incremental tree updates and efficient logarithmic query proofs. By building a prototype on HBase, we evaluate the system throughput under standard YCSB workloads. We confirm that the proposed storage system offers an order of magnitude higher throughput for data stream authentication than existing work.
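A rough way to see how Bloom filters can sit inside a Merkle tree: each node stores a filter covering the keys below it, and the node hash binds that filter to its children's hashes, so a lookup can stop at the first subtree whose filter rules the key out. The sketch below assumes a non-empty list of leaves and uses hypothetical names and parameters (`M`, `K`, `build`, `lookup`); it is not the paper's data structure.

```python
import hashlib

M = 64  # bits per Bloom filter (hypothetical size)
K = 3   # hash functions per key (hypothetical)


def bloom_bits(key: str) -> int:
    """Bitmask with the K filter positions of `key` set."""
    mask = 0
    for i in range(K):
        digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
        mask |= 1 << (int.from_bytes(digest[:4], "big") % M)
    return mask


def _node(bloom, payload, keys, children):
    digest = hashlib.sha256(bloom.to_bytes(M // 8, "big") + payload).digest()
    return {"bloom": bloom, "hash": digest, "keys": keys, "children": children}


def build(leaves):
    """Build the tree from a non-empty list of key lists; return the root."""
    level = []
    for keys in leaves:
        bloom = 0
        for k in keys:
            bloom |= bloom_bits(k)
        level.append(_node(bloom, "|".join(sorted(keys)).encode(), set(keys), ()))
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            pair = tuple(level[i:i + 2])
            bloom = 0
            for c in pair:
                bloom |= c["bloom"]  # parent filter covers both subtrees
            nxt.append(_node(bloom, b"".join(c["hash"] for c in pair), None, pair))
        level = nxt
    return level[0]


def lookup(node, key):
    """Return True if `key` is stored under `node`, pruning by filter."""
    bits = bloom_bits(key)
    if node["bloom"] & bits != bits:
        return False  # the filter rules out the whole subtree
    if not node["children"]:
        return key in node["keys"]
    return any(lookup(c, key) for c in node["children"])
```

Because each hash covers the node's filter, changing any stored key changes the root hash, which a verifier holding the root could detect.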