CS557 Term Paper
Build Distributed System by Raspberry Pi
Rong Zheng
Louisiana Tech University
Ruston, LA 71270
Abstract
Cloud computing is a popular way to obtain high-quality computing and storage, and distributed computing is its fundamental mechanism. Gaining realistic, hands-on experience with distributed computing on campus or in an apartment is difficult, however: building a distributed system normally means buying many computers and paying substantial maintenance costs. By deploying Raspberry Pi boards, a real distributed system can be designed and tested at home at low cost (less than $100 for a basic system).
- Introduction
Definition of a distributed system: a system in which hardware or software components located at networked computers communicate and coordinate their actions only by passing messages.
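The definition above can be illustrated with a minimal sketch. It is only a stand-in: two threads on one machine play the role of networked computers, and standard-library queues play the role of the network. The names `worker` and `run_demo` and the use of `None` as a shutdown message are assumptions made for this illustration.

```python
import queue
import threading


def worker(inbox: queue.Queue, outbox: queue.Queue) -> None:
    """Hypothetical node: receives numbers as messages and replies with their squares."""
    while True:
        message = inbox.get()          # block until a message arrives
        if message is None:            # None is the agreed shutdown message (assumption)
            break
        outbox.put(message * message)  # the only way to answer is to send a message


def run_demo(values):
    """Send each value to the worker node and collect the replies in order."""
    to_worker = queue.Queue()
    from_worker = queue.Queue()
    node = threading.Thread(target=worker, args=(to_worker, from_worker))
    node.start()
    results = []
    for value in values:
        to_worker.put(value)               # request message
        results.append(from_worker.get())  # wait for the reply message
    to_worker.put(None)                    # ask the node to stop
    node.join()
    return results


if __name__ == "__main__":
    print(run_demo([1, 2, 3]))
```

The two sides share no variables; all coordination, including shutdown, happens through messages, which is the property the definition requires.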
The Raspberry Pi is a credit-card-sized single-board computer developed in the UK by the Raspberry Pi Foundation with the intention of promoting the teaching of basic computer science in schools. For only $35 you get a real computer that can do many of the things a laptop can, including distributed computing. Its Ethernet port is its network interface, and it behaves on the network like a regular PC. By combining many of these boards into a cluster, the capacity for handling network traffic can be greatly improved, and the maintenance cost is lower than that of conventional servers.
A cluster is simply a collection of identical systems built from commodity hardware, networked together and running the same parallel processing software, which allows each node in the cluster to share data and computation. Typically, the parallel programming software is the Message Passing Interface (MPI), which uses TCP/IP along with some basic libraries to let the user create parallel programs that split a task into parts suitable for running on multiple computers simultaneously. MPI provides an API that enables both asynchronous and synchronous process interaction.
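The idea of splitting a task into parts that run simultaneously can be sketched as follows. This is not MPI: a standard-library thread pool stands in for the cluster nodes, and the helper names `split_into_chunks` and `parallel_sum` are hypothetical. It only shows the scatter, compute, and combine pattern that an MPI program would follow.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, n_nodes):
    """Divide the items into n_nodes contiguous chunks of nearly equal size."""
    items = list(items)
    size, extra = divmod(len(items), n_nodes)
    chunks, start = [], 0
    for rank in range(n_nodes):
        # The first `extra` nodes take one additional item each.
        end = start + size + (1 if rank < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def parallel_sum(items, n_nodes):
    """Sum the items by giving each simulated node one chunk, then combining."""
    chunks = split_into_chunks(items, n_nodes)
    with ThreadPoolExecutor(max_workers=n_nodes) as pool:
        partial_sums = list(pool.map(sum, chunks))  # each node sums its own part
    return sum(partial_sums)                        # combine the partial results


if __name__ == "__main__":
    print(split_into_chunks(range(10), 3))
    print(parallel_sum(range(1, 101), 4))
```

On a real cluster each chunk would be sent to a different Raspberry Pi over the network rather than to a thread, but the division of work is the same.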
- Building a Cluster with Raspberry Pi
I chose to build a cluster from Raspberry Pis for two reasons:
1) Convenience
Many projects require parallel processing, but when students want to run experiments and learn how a distributed system works, this is hard to do on campus because a real cluster is often unavailable. A small Raspberry Pi cluster takes little time to build and can run many parallel projects, especially web servers. It occupies little space, and projects are easy to debug on it.
2) Low cost
Although the CPU in the Raspberry Pi is not very powerful, running at 700 MHz, a Raspberry Pi costs only $35 and has an Ethernet port that is adequate for exchanging messages between boards in most non-commercial projects. The rated power of a Raspberry Pi is 3.5 W, far less than a typical computer, and a 5 V power supply is enough to run it. A cluster of 10 Raspberry Pis costs less than $500 in hardware, and less than $1 in electricity even if the cluster runs 24 hours a day.
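The cost figures above can be checked with a short calculation. The board price and rated power come from the text; the electricity price of $0.12 per kWh is an assumption chosen only for illustration, and the function names are hypothetical. Extras such as SD cards, power supplies, and cables are not counted.

```python
PI_PRICE_USD = 35.0         # price per board, from the text
PI_POWER_WATTS = 3.5        # rated power per board, from the text
ASSUMED_USD_PER_KWH = 0.12  # assumption: illustrative electricity price


def hardware_cost(n_boards):
    """Cost of the boards alone, in US dollars."""
    return n_boards * PI_PRICE_USD


def daily_energy_kwh(n_boards, hours=24.0):
    """Energy used by the boards over the given number of hours, in kWh."""
    return n_boards * PI_POWER_WATTS * hours / 1000.0


def daily_electricity_cost(n_boards, usd_per_kwh=ASSUMED_USD_PER_KWH):
    """Electricity cost of running the boards for 24 hours, in US dollars."""
    return daily_energy_kwh(n_boards) * usd_per_kwh


if __name__ == "__main__":
    print(hardware_cost(10))           # boards only, 10 nodes
    print(daily_energy_kwh(10))        # kWh per day, 10 nodes
    print(daily_electricity_cost(10))  # USD per day under the assumed price
```

Under these assumptions, ten boards cost $350 and draw 0.84 kWh per day, which is roughly ten cents of electricity, consistent with the bounds stated above.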
Only four major components are needed to run a Raspberry Pi cluster: the Raspberry Pis, a Linux operating system (e.g., Raspbian), an MPI library, and a load balancer.
For learning purposes, I bought just two Raspberry Pis to serve as the worker nodes, and I use my laptop as the load balancer.
There are many different software implementations of load balancing. I chose Apache's load balancer module because it is easy to set up and supports a well-performing web server. I use Tomcat to serve dynamic content written in Java (JSP and servlets). My website is built with PHP, so I also installed PHP5 and MySQL as the database.
Install and configure webserver load balancer
The following Linux commands install Apache2, PHP5, MySQL, and Tomcat on each Raspberry Pi.
sudo apt-get install apache2
sudo apt-get install php5 mysql-server
sudo apt-get install libapache2-mod-auth-mysql php5-mysql phpmyadmin
sudo apt-get install tomcat7
I want to run Java applications on my server, so I also installed the Java Development Kit.
sudo apt-get install default-jdk
sudo apt-get install ant git
I learned how to configure a Raspberry Pi and connect several of them from a number of websites.
Here's the complete /etc/apache2/sites-available/default for the load balancer:
<VirtualHost *:80>
ServerAdmin webmaster@localhost
DocumentRoot /var/www
<Directory />
Options FollowSymLinks
AllowOverride All
</Directory>
<Directory /var/www/>
Options Indexes FollowSymLinks MultiViews
AllowOverride All
Order allow,deny
allow from all
</Directory>
ScriptAlias /cgi-bin/ /usr/lib/cgi-bin/
<Directory "/usr/lib/cgi-bin">
AllowOverride None
Options +ExecCGI -MultiViews +SymLinksIfOwnerMatch
AddHandler cgi-script .py
Order allow,deny
Allow from all
</Directory>
ProxyRequests Off
<Proxy balancer://rpicluster>
BalancerMember
BalancerMember
AllowOverride None
Order allow,deny
allow from all
ProxySet lbmethod=byrequests
</Proxy>
<Location /balancer-manager>
SetHandler balancer-manager
Order allow,deny
allow from 192.168.0
</Location>
ProxyPass /balancer-manager !
ProxyPass / balancer://rpicluster/
ErrorLog ${APACHE_LOG_DIR}/error.log
# Possible values include: debug, info, notice, warn, error, crit,
# alert, emerg.
LogLevel warn
CustomLog ${APACHE_LOG_DIR}/access.log combined
</VirtualHost>
With this configuration in place, the web server is built and can perform basic load balancing.
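The request-counting policy selected by `lbmethod=byrequests` in the configuration above can be illustrated with a simplified sketch. It follows the general scheme described in the Apache documentation (each member has a work quota, and the member furthest behind its quota is chosen next), but it is not Apache's code. The member names `pi-node-1` and `pi-node-2` are hypothetical, since the `BalancerMember` addresses are not given in the listing.

```python
from dataclasses import dataclass


@dataclass
class Member:
    """Hypothetical balancer member with a work quota and a running status."""
    name: str
    lbfactor: int = 1  # share of the work this member should receive
    lbstatus: int = 0  # how far behind its quota the member currently is


def pick_member(members):
    """Choose the member that is furthest behind its quota for the next request."""
    total_factor = 0
    candidate = None
    for member in members:
        member.lbstatus += member.lbfactor
        total_factor += member.lbfactor
        if candidate is None or member.lbstatus > candidate.lbstatus:
            candidate = member
    candidate.lbstatus -= total_factor  # the chosen member has now done one unit of work
    return candidate


def simulate(members, n_requests):
    """Return the names of the members chosen for n_requests consecutive requests."""
    return [pick_member(members).name for _ in range(n_requests)]


if __name__ == "__main__":
    cluster = [Member("pi-node-1"), Member("pi-node-2")]
    print(simulate(cluster, 4))
```

With two members of equal weight, as in a two-board cluster, the requests simply alternate between the boards; unequal `lbfactor` values shift the share of requests in proportion.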
Implementation of MPI
For distributed computing, an MPI implementation must be installed on the Raspberry Pi cluster. I used MPICH.
sudo apt-get install mpich2
sudo pacman -Syy openmpi
Compiling MPICH is a heavy job for a 700 MHz machine, especially when there are many of them. QEMU helps here. QEMU is a CPU architecture emulator and virtualization tool that can emulate an ARM architecture very similar to that of the Raspberry Pi, which allows a Raspberry Pi image to be booted directly on my laptop. First, we download the Linux kernel source, apply a patch to support the Raspberry Pi, add the kernel features and drivers needed, and compile.
I learned how to do this from two websites.
After configuring the Raspberry Pi image with QEMU, I was able to write the compiled result to the Raspberry Pi's SD card.
- Result
The results of the web server load balancer and of distributed computing are shown here.
This is the test page of the load balancer manager.
I have also reviewed reference papers on the performance obtained when many Raspberry Pis are used to build a real cluster.
This is a price comparison across different types of computers.
Performance without using parallel processing
Performance using parallel processing
With a 32-node cluster, execution is much faster than on the most expensive computer in the comparison.