Yanwei Zhang
Graduate Research Assistant (865) 696-8290 (mobile)
College of Computing
Georgia Institute of Technology 329957 Georgia Tech Station, Atlanta, GA 30332-1300
RESEARCH INTERESTS
I have broad interests in computer systems. Specifically, I have been working on QoS-aware
scheduling and resource/performance management techniques for high-performance and cloud
computing platforms.
Seeking a Summer 2016 systems research/development internship.
EDUCATION
Georgia Institute of Technology at Atlanta, GA Aug. 2012 - Present
* Ph.D. in Computer Science GPA: 3.86/4.0
* Advisor: Karsten Schwan
University of Tennessee at Knoxville, TN Jan. 2010 - Jul. 2012
* M.S. in Computer Science GPA: 3.94/4.0
* Advisor: Xiaorui Wang
Shandong University, Jinan, China Sep. 2004 - Jun. 2008
* B.S. in Computer Science Major GPA: 90.4/100
RESEARCH & WORK EXPERIENCE
Georgia Institute of Technology at Atlanta, GA Aug. 2012 - Present
* Graduate Research Assistant
* Researching and developing advanced QoS and resource management techniques for
building more effective systems, to achieve better resource usage and application
performance.
LinkedIn Corporation, Mountain View, CA May - Aug. 2015
* Summer Intern
* Researched and designed more efficient caching policies for LinkedIn's graph system
to address the long-tail latency observed for query requests. The prototype was
evaluated on LinkedIn's experimental platform.
IBM T.J. Watson Research Center, Yorktown Heights, NY May - Aug. 2014
* Summer Intern
* Researched and designed resource adaptation policies for energy efficiency
management across multi-tenant virtualized Hadoop clusters. Preliminary results
were obtained in an OpenStack cloud environment.
Oak Ridge National Laboratory May - Aug. 2013
* Summer Intern
* Researched designs for active workflow systems managing near-real-time scientific
data analytics via adaptive stream processing techniques.
University of Tennessee at Knoxville, TN Jan. 2010 - Jul. 2012
* Graduate Research Assistant
* Developed various optimization-based systems for efficient resource and power
management of cloud-scale data centers.
Institute of Computing Technology, CAS, Beijing, China Sep. 2008 - Nov. 2009
* Research Assistant
* Proposed an analytic model for VM-based data centers to help administrators gain
insight into the upper bound on the number of consolidated physical servers needed
to guarantee QoS with the same request loss probability as in dedicated servers.
RESEARCH PROJECTS
Towards Better Sharing With Latency Critical Workloads (GaTech, Ongoing Project)
* Exploring new opportunities to better share resources with tail-latency critical
applications to promote more effective resource usage, without sacrificing user-
perceived SLA experience.
PartialGraph: Faster Query on Very Large-Scale LinkedIn Social Networks (LinkedIn)
* Developed more effective caching policies for LinkedIn's graph system to address
the long-tail latency observed for query requests.
* Leveraged an incremental partial-update strategy to reduce the massive network
traffic generated by users' 2nd-degree network connections. The prototype was
evaluated on LinkedIn's experimental platform.
Dynamic Energy-Efficient Resource Adaption Across Multi-tenant Virtualized Hadoop
Clusters (IBM T.J. Watson Research Center)
* Designed resource adaptation policies to achieve best-effort energy efficiency
across multi-tenant Hadoop clusters.
* Used the First-Fit Decreasing (FFD) bin-packing algorithm to determine the
destination Vhadoop cluster that services each job.
Active Workflow System for Near Real-Time Extreme-Scale Science (GaTech and ORNL)
* Developed an active workflow system for near-real-time scientific knowledge
discovery in an era of increasingly unpredictable system runtime behavior.
* Designed a two-tier scheme in which decisions are adaptively refined online
according to the system runtime status, instead of determining what and where to
run scientific workflows entirely beforehand or only partially dynamically. This is
enabled by embedding the workflow in the data streams.
GreenWare: Greening Cloud-Scale Data Centers to Maximize the Use of Renewable
Energy (University of Tennessee at Knoxville)
* Devised an effective optimization-based model to maximize the use of renewable
energy within allowed operation budgets.
* Modeled the intermittent generation of renewable energy based on massive local
weather data, and formulated the core objective as a constrained optimization
problem. A linear-fractional-programming-based algorithm was proposed to solve it
optimally.
Electricity Bill Capping for Cloud-Scale Data Centers that Impact the Power Markets
* Proposed a two-tier optimization algorithm to minimize the electricity cost, and also
enforce a cost budget on the monthly bill for cloud-scale data centers that impact
power markets.
* Modeled the impact of power demands from cloud computing systems on their local
power prices by analyzing electricity price behavior in real-world power markets
with respect to power demands. Mixed-integer-programming-based techniques were
proposed.
Capital One Data Analysis Challenge
* Advised Capital One on which factors significantly influence decisions about new
bank branch locations.
* Applied Principal Component Analysis (PCA) techniques to analyze the massive
dataset provided by Capital One.
Utility Analysis for Internet-Oriented Server Consolidation in VM-Based Data Centers
(Institute of Computing Technology, Chinese Academy of Sciences, China)
* Proposed an analytic model for VM-based data centers to help data center
administrators gain insight into the upper bound on the number of consolidated
physical servers needed to guarantee QoS with the same request loss probability as
in dedicated servers. The proposed model can also evaluate server consolidation in
terms of the power and utility of physical servers.
* Leveraged queuing theory to model the interaction between user requests with QoS
requirements and capability flowing.
PEER-REVIEWED PUBLICATIONS
* Yanwei Zhang, Matthew Wolf, Karsten Schwan, Qing Liu, Greg Eisenhauer,
Scott Klasky, "Co-Sites: The Autonomous Distributed Dataflows in Collaborative
Scientific Discovery", the 10th Workshop on Workflows in Support of Large-Scale
Science (WORKS '15), in conjunction with SC 2015, November 15-20, 2015, Austin,
Texas, USA.
* Yanwei Zhang, Qing Liu, Scott Klasky, Matthew Wolf, Karsten Schwan, Greg
Eisenhauer, Jong Choi, Norbert Podhorszki, "Active Workflow System for Near Real-
Time Extreme-Scale Science", the 1st Workshop on Parallel Programming for Analytics
Applications (PPAA 2014).
* Yanwei Zhang, Yefu Wang, and Xiaorui Wang, "Electricity Bill Capping for Cloud-
Scale Data Centers that Impact the Power Markets", the 41st International Conference
on Parallel Processing (ICPP 2012). (Acceptance rate: 28%)
* Yanwei Zhang, Yefu Wang, and Xiaorui Wang, "TEStore: Exploiting Thermal and
Energy Storage to Cut the Electricity Bill for Datacenter Cooling", the 8th International
Conference on Network and Service Management (CNSM 2012). (Acceptance rate:
15.5%)
* Yanwei Zhang, Yefu Wang, and Xiaorui Wang, "GreenWare: Greening Cloud-Scale
Data Centers to Maximize the Use of Renewable Energy", the 12th ACM/IFIP/USENIX
International Middleware Conference (Middleware 2011). (Acceptance rate: 19%)
* Yanwei Zhang, Yefu Wang, and Xiaorui Wang, "Capping the Electricity Cost of Cloud-
Scale Data Centers with Impacts on Power Markets", the 20th ACM International
Symposium on High-Performance Parallel and Distributed Computing (HPDC 2011). (2-
page poster; Acceptance rate: 20%)
* Yefu Wang, Xiaorui Wang, and Yanwei Zhang, "Leveraging Thermal Storage to Cut the
Electricity Bill for Datacenter Cooling", the 4th Workshop on Power-Aware Computing
and Systems (HotPower 2011), in conjunction with the 23rd ACM Symposium on
Operating Systems Principles (SOSP). (Acceptance rate: 26%)
* Ying Song, Yanwei Zhang, Yuzhong Sun, and Weisong Shi, "Utility Analysis for
Internet-Oriented Server Consolidation in VM-Based Data Centers", the 11th IEEE
International Conference on Cluster Computing (Cluster 2009).
SKILLS AND SELECTED GRADUATE COURSES
* Spoken languages: Chinese (native), English (proficient)
* Programming languages: C, Python, Java (basic)
* Data tools: R
* Platforms: Linux, Windows
* Courses: Distributed Systems, Computer Systems Architecture, Advanced Operating
Systems, Advanced Algorithms, Software Systems, Real-Time System, Data Mining,
Markov Chains/Computer Science, Applied Linear Algebra
HONORS AND AWARDS
* 2011 HPDC Conference Student Travel Award
* 2010-2011 University of Tennessee EECS Department Excellence Fellowship
* 2008 National Scholarship (China)
* 2008 Excellent Undergraduate of Shandong Province
* 2004-2005 Provincial-level Outstanding Student
* 2004-2007 First-class Scholarship for 3 Consecutive Years as an Outstanding Student
* 2006 President Scholarship of Shandong University