Research Institute Finds Unified Software Stack Well-Suited to Scientific Computing
Overview
Country or Region: Italy
Industry: Life sciences
Customer Profile
The Microsoft Research–University of Trento Centre for Computational and Systems Biology (COSBI) is a nonprofit research consortium focused on algorithmic systems biology.
Business Situation
A common focus of the research at COSBI is how to speed up computer simulations so that highly complex biological processes can be simulated in a reasonable amount of time.
Solution
COSBI has applied several Microsoft technologies to develop massively parallel applications, including Windows HPC Server 2008, the .NET Framework 4, the Visual Studio 2010 development system, and the Windows Azure platform.
Benefits
Unified software stack for scientific computing
Ease of development and debugging
Performance gains of up to 200 times

“By helping researchers easily harness the power of parallel computing, we’re enabling them to further their research in the areas of medical science, biology, bio-information, and complex systems in general.”
Professor Corrado Priami, President and Chief Executive Officer, the Microsoft Research–University of Trento Centre for Computational and Systems Biology (COSBI)
The Microsoft Research–University of Trento Centre for Computational and Systems Biology (COSBI) is applying computer science to the development of modeling, simulation, and analysis tools for understanding the behavior of biological systems. Researchers at COSBI have used parallel computing technologies from Microsoft for several projects, including stochastic simulations on a 64-node cluster, in silico simulations on modern graphics processing units (GPUs), development of a componentized framework for combining different research tools, and the movement of computing workloads to the Windows Azure platform. With Microsoft software, COSBI has a unified platform and toolset for building parallel applications that target desktop PCs, high-performance computing (HPC) clusters, and the cloud—capabilities that help researchers answer new types of questions about complex biological systems.

Situation

Located in the Italian Alps, the Microsoft Research–University of Trento Centre for Computational and Systems Biology (COSBI) is advancing the convergence of the life sciences and computer science. Researchers at COSBI work at the leading edge of both disciplines, seeking to apply innovations in computer hardware, software, and programming languages to the development of new modeling, simulation, and analysis tools for understanding the behavior of biological systems—an approach that COSBI calls algorithmic systems biology. COSBI is a nonprofit consortium jointly funded by the University of Trento and Microsoft Research Cambridge.

As of April 2011, the research team at COSBI consists of 27 people: 13 researchers, five PhD students, and nine software developers. One developer is Lorenzo Dematté, PhD, who collaborated with COSBI while he was a PhD candidate focused on modeling languages, parallel simulation, and visualization for systems biology. Dematté joined COSBI as Development Manager in February 2009, and assumed his current role as Software Architect and Technical Lead in September 2010.

In keeping with the institute’s overall mission, a common focus of Dematté’s work at COSBI has been speeding up computer simulations so that highly complex biological processes can be simulated in a reasonable amount of time. More and more, COSBI is turning to parallel computing on commodity hardware as a means of addressing this challenge. “Biological systems are some of the most massively parallel systems ever studied,” Dematté says. “Just as many businesses are turning to parallelism to speed up massive computational workloads, we at COSBI are actively adopting parallel computing to achieve the necessary computing performance.”

For Dematté, whose role includes providing guidance to others at COSBI on software approaches and architectures, part of the challenge is finding the best parallel computing tools for each research project. “A common theme across the work that we do is that it’s computationally intensive,” he says. “Within that domain, however, the solution requirements are different. Some workloads are data-intensive and require massive data manipulation, whereas others have little input/output but require intensive computation. The problems can vary greatly, and it’s our mission to study and evaluate the various technologies that are available.”

Solution

In his time at COSBI, Dematté has applied parallel computing technologies from Microsoft to a broad range of research projects, including:

Use of the Windows HPC Server 2008 operating system to run stochastic simulations on the center’s 64-node, high-performance computing (HPC) cluster.

Use of the Microsoft Visual Studio 2010 development system to write parallel code for simulating biological processes on graphics processing units (GPUs) that are based on the NVIDIA CUDA architecture.

Adoption of new parallel libraries in the Microsoft .NET Framework 4 to build a parallelized component framework that enables software tools to be combined in new ways.

Movement of computational workloads to the Windows Azure platform.

Running Stochastic Simulations on an HPC Cluster

One of Dematté’s first applications of parallel computing at COSBI came during his work as a PhD candidate in 2007, when he needed to speed up the simulation of biological models. Such simulations are stochastic in nature: they model processes with random characteristics, which are represented in the simulations by random variables. Through such simulations, researchers can study the behavior of a cell over time, with respect to external stimuli, or simulate the behavior of a population of organisms over several generations. Either way, this approach requires a great number of simulation runs.

“We define a set of rules that are followed by an organism and then let it ‘live’ within the computer simulation—growing, consuming, interacting with other organisms, and so on,” explains Dematté. “Because the simulations are stochastic and can employ a high number of random variables, we need to run hundreds or thousands of simulations and then use statistics to collect useful data. One simulation run on a desktop PC can take minutes to hours, which is why we need to run many of them in parallel.”
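In code, this pattern is embarrassingly parallel: independent replicates followed by ensemble statistics. The following C# sketch illustrates the idea using the Task Parallel Library in the .NET Framework 4, which COSBI adopted as described later in this story; the birth-death model, run count, and parameters are hypothetical stand-ins, not COSBI's actual simulator.

using System;
using System.Linq;
using System.Threading.Tasks;

class StochasticBatch
{
    // Hypothetical single run: returns one observable (e.g., a final
    // population count) from a stochastic trajectory.
    static double RunOneSimulation(int seed)
    {
        var rng = new Random(seed);
        double x = 100.0;                             // initial population (stand-in)
        for (int step = 0; step < 10000; step++)
            x += rng.NextDouble() < 0.5 ? 1 : -1;     // random birth/death step
        return x;
    }

    static void Main()
    {
        const int runs = 1000;
        var results = new double[runs];

        // Run the independent replicates in parallel, one per available core.
        Parallel.For(0, runs, i => results[i] = RunOneSimulation(i));

        // Collect statistics across the ensemble of runs.
        double mean = results.Average();
        double stdDev = Math.Sqrt(results.Sum(r => (r - mean) * (r - mean)) / runs);
        Console.WriteLine("mean = {0:F2}, std dev = {1:F2}", mean, stdDev);
    }
}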

To speed up the stochastic simulations, Dematté wrote a simple program that offloads simulations from researchers’ desktop PCs to the center’s HPC cluster. Based on the Windows HPC Server 2008 operating system, the cluster consists of 64 compute nodes, each with four processor cores. “With our HPC cluster, we can run up to 256 simulations at a time—one on each processor core,” says Dematté. “Implementation was easy, using the APIs provided.”

Dematté says that several new and improved features in Windows HPC Server 2008 contributed to that ease of development. “The application programming interfaces are much improved over those in Windows Compute Cluster Server 2003, as they can be called from managed .NET languages,” he says. “Support for Windows PowerShell was also useful, as it enabled us to use Windows PowerShell cmdlets instead of writing our own code.”
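As a rough illustration of those managed APIs, a job-submission client built on the HPC Pack scheduler interfaces might look like the sketch below; the head-node name, simulator executable, and task count are hypothetical placeholders.

using System;
using Microsoft.Hpc.Scheduler;   // managed scheduler API from the HPC Pack client SDK

class SubmitSimulations
{
    static void Main()
    {
        // Connect to the cluster head node (name is a placeholder).
        IScheduler scheduler = new Scheduler();
        scheduler.Connect("cosbi-headnode");

        ISchedulerJob job = scheduler.CreateJob();
        job.Name = "Stochastic simulation batch";

        // One task per simulation run, each invoking the simulator
        // executable with a different random seed.
        for (int i = 0; i < 256; i++)
        {
            ISchedulerTask task = job.CreateTask();
            task.CommandLine = "simulate.exe --seed " + i;
            job.AddTask(task);
        }

        // Submit under the caller's credentials and report the job ID.
        scheduler.SubmitJob(job, null, null);
        Console.WriteLine("Submitted job " + job.Id);
    }
}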

The ability to easily offload simulations to the cluster enabled COSBI to win an international biological-modeling competition, in which researchers ran 5,000 simulations to determine how the division of cells is related to the circadian rhythm of a single cell. The effort involved using the HPC cluster, a mix of scripts, and a custom application that used the Windows HPC Server 2008 APIs to submit and monitor the jobs.

“It still took seven days to complete the required 5,000 simulations on the HPC cluster—just in time to produce the results we needed for the competition,” says Dematté. “There’s no way we could have done this on a desktop workstation; it would have taken more than a year.”

Harnessing the Processing Power of Graphics Processing Units

In another project, part of his PhD thesis on how to speed up biological simulations, Dematté had to find a way to improve the performance of a single in silico simulation. (The term in silico means “performed on computer or via computer simulation”—an analogy to the Latin phrases in vivo and in vitro, which are used in systems biology to refer to experiments done inside and outside living organisms to test different experimental conditions and aid in the discovery of the dynamics that regulate biological systems.)

As with the previous project, the computational power required by these simulations exceeded that provided by common desktop PCs. In this case, the workload was a prime candidate for the massively parallel processing architecture of today’s common graphics processing units.

“The goal of this project was to speed up a single simulation, in which we had to track many molecules inside a cell,” says Dematté. “The user of such a tool is a biologist who needs to experiment with a biological model in which the spatial position of individual molecules is important. Such experimentation can be very time-consuming, with a single simulation taking hours on a common desktop PC. Furthermore, the biologist often needs to use a trial-and-error process to refine the model, running several simulations on that model.”

Dematté realized that GPUs would be a good fit for this workload because the simulations involve many molecules that act more or less the same. “We need to perform each simulation step on many molecules in parallel, in much the same way that a GPU is designed to perform operations on a large number of polygons in parallel,” he explains.
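The actual implementation ran as CUDA C++ kernels on the GPU; purely to illustrate the data-parallel shape of the work, here is a CPU-side C# analogue in which one step updates every molecule independently. The random-walk displacement is a hypothetical stand-in for the real dynamics.

using System;
using System.Threading.Tasks;

struct Molecule { public float X, Y, Z; }

class DiffusionStep
{
    // One simulation step applied independently to every molecule; this is
    // the same shape of work a CUDA kernel performs, with one GPU thread
    // per molecule instead of one loop iteration.
    static void Step(Molecule[] molecules, Random[] rngs, float dt)
    {
        Parallel.For(0, molecules.Length, i =>
        {
            Random rng = rngs[i];  // one pre-seeded generator per molecule
            molecules[i].X += dt * (float)(rng.NextDouble() - 0.5);
            molecules[i].Y += dt * (float)(rng.NextDouble() - 0.5);
            molecules[i].Z += dt * (float)(rng.NextDouble() - 0.5);
        });
    }
}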

Programming for this project was more time-consuming than for the stochastic simulations on the HPC cluster because Dematté had to use the NVIDIA CUDA library for the C++ programming language instead of a managed .NET Framework language—thereby requiring manual manipulation of memory, pointers, and so on. However, he was still able to benefit from the many productivity aids in the Microsoft Visual Studio 2010 development system, including its advanced debugging tools, and to execute his solution on a desktop PC running the Windows 7 operating system.

“Debugging was difficult, in that the simulation program runs partly on the CPU and partly on the GPU,” he says. “That said, we were still able to develop the software using Visual Studio 2010 and run it on the latest version of Windows, which made the work easier. I wish we could have used a managed language such as Microsoft Visual C# for the project, but thankfully Visual Studio 2010 is very good for writing and debugging C++ code, too.”

Dematté tested his code on several GPUs, with resulting performance improvements varying broadly depending on the workload and the power of the GPU. “We tested on GPUs ranging from a $50 low-end graphics card to an NVIDIA Tesla C1060 Computing Processor, which costs about $1,300 today,” says Dematté. “We achieved performance gains of up to 15 times on low-end graphics cards, and more than 100 times on high-end cards.”

Developing a Parallelized Component Framework

In a third project, Dematté had to design a parallel-enabled framework for coordinating the different software tools and components that researchers use—a framework that will be used in COSBI LAB, the next generation of COSBI applications for algorithmic systems biology. “We are building a framework to coordinate different software components, each of which has its own processes and communicates with input-output messages,” explains Dematté. “Components can include those for visualization, simulation, or collecting statistics. We tie them together with the framework and then let them run—the challenge lying in keeping everything coordinated, even with some components running on a desktop PC and others offloaded to run on the HPC cluster.”

Dematté originally parallel-enabled a set of COSBI tools by using the Concurrency and Coordination Runtime and Decentralized Software Services components of the Microsoft Robotics Developer Studio. Those components provided a set of parallel-enabled libraries that could be used separately from the rest of the Robotics Developer Studio toolkit. Since then, the development team at COSBI has modified the code to use the new parallel programming libraries provided in the .NET Framework 4, which are supported by new features in Visual Studio 2010.

The parallel programming libraries provided in the .NET Framework 4 include:

Task Parallel Library (TPL), which includes parallel implementations of for and foreach loops (For and For Each in the Visual Basic language) as well as lower-level types for task-based parallelism. Implemented as a set of public types and APIs in the System.Threading.Tasks namespace, the TPL relies on a task scheduler that is integrated with the .NET ThreadPool and that scales the degree of concurrency dynamically so that all available processors and processing cores are used most efficiently.

Parallel Language-Integrated Query (PLINQ), a parallel implementation of LINQ to Objects that combines the simplicity and readability of LINQ syntax with the power of parallel programming. PLINQ implements the full set of LINQ standard query operators as extension methods in the System.Linq namespace, along with additional operators to control the execution of parallel operations. As with code that targets the Task Parallel Library, PLINQ queries scale in the degree of concurrency according to the capabilities of the host computer.

Data Structures for Parallel Programming, which introduces several new types that are useful in parallel programming—including a set of concurrent collection classes that are scalable and thread-safe, lightweight synchronization primitives, and types for lazy initialization. Developers can use these new types with any multithreaded application code, including code that uses the Task Parallel Library and PLINQ. (A short sketch showing these libraries in use follows this list.)
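The following C# sketch shows the three libraries together: a TPL parallel loop writing into a thread-safe concurrent collection, and the same ensemble expressed as a PLINQ query. The Simulate method is a hypothetical stand-in for a real simulation run.

using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

class ParallelLibrariesDemo
{
    // Hypothetical stand-in for a single simulation run.
    static double Simulate(int seed)
    {
        return new Random(seed).NextDouble();
    }

    static void Main()
    {
        // Task Parallel Library: a parallel loop over independent runs,
        // collecting results in a thread-safe concurrent collection.
        var results = new ConcurrentBag<double>();
        Parallel.For(0, 1000, i => results.Add(Simulate(i)));

        // PLINQ: the same ensemble expressed as a parallel query that
        // filters the runs and averages what remains.
        double filteredMean = Enumerable.Range(0, 1000)
            .AsParallel()
            .Select(Simulate)
            .Where(r => r > 0.5)
            .Average();

        Console.WriteLine("{0} results, filtered mean = {1:F3}",
                          results.Count, filteredMean);
    }
}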

Although the parallel libraries can be used from any managed .NET language, Dematté finds them especially useful in parallelizing applications written in C#, which enables him to avoid dealing with issues such as pointers and memory management. “The parallel libraries in the .NET Framework 4 are not only great for parallelizing code, but also very useful for combining tasks in nonsequential ways,” says Dematté. “With our parallel-enabled framework, we can freely compose processes by connecting the various tools that we provide—without having to deal with threads, locks, or other low-level complexities.”
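A minimal sketch of that composition pattern, assuming components that communicate through message queues (an illustration of the idea, not COSBI LAB's actual framework): each component runs as a task, consuming one queue and producing to the next, with no explicit threads or locks in the composing code.

using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

class ComponentPipeline
{
    static void Main()
    {
        var rawResults = new BlockingCollection<double>();
        var statistics = new BlockingCollection<string>();

        // Simulation component: produces results (hypothetical values here).
        var simulator = Task.Factory.StartNew(() =>
        {
            var rng = new Random(42);
            for (int i = 0; i < 100; i++)
                rawResults.Add(rng.NextDouble());
            rawResults.CompleteAdding();
        });

        // Statistics component: consumes results, produces summaries.
        var collector = Task.Factory.StartNew(() =>
        {
            double sum = 0; int n = 0;
            foreach (double r in rawResults.GetConsumingEnumerable())
            {
                sum += r; n++;
            }
            statistics.Add(string.Format("mean over {0} runs: {1:F3}", n, sum / n));
            statistics.CompleteAdding();
        });

        // Visualization component: here, simply prints the summaries.
        foreach (string line in statistics.GetConsumingEnumerable())
            Console.WriteLine(line);

        Task.WaitAll(simulator, collector);
    }
}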

The development team at COSBI is also taking advantage of new parallel diagnostic tools in Visual Studio 2010, which includes new Parallel Stacks and Parallel Tasks windows for debugging code. Visual Studio 2010 Premium and Ultimate also have a new Concurrency Visualizer, which is integrated with the profiler. The visualizations provide graphical, tabular, and numerical data about how a multithreaded application interacts with itself and other programs, enabling developers to quickly identify areas of concern and navigate through call stacks and to relevant call sites in the source code. “The Concurrency Visualizer in Visual Studio 2010 has been especially useful, as it helps us see which parts of the code are bottlenecks and should be parallelized first,” says Dematté.

Moving Parallel Workloads to the Cloud

Today, Dematté and others at COSBI are focused on moving simulation workloads from on-premises systems to the cloud—or, more specifically, to the Windows Azure platform. Along the same lines, Dematté is evaluating new features introduced in Windows HPC Server 2008 R2 Service Pack 1, such as the ability to augment the institute’s on-premises HPC cluster with Windows Azure worker nodes.

“We use computing resources such as our HPC cluster in an unpredictable manner; at times it’s overloaded, and at times it’s dormant for days or weeks,” says Dematté. “Cloud computing provides great elasticity, enabling us to always have enough computing resources at hand. We can ask for several virtual machines, quickly provision them, and give them back when they’re no longer needed, only paying for the computing time we use. Other advantages of offloading computing workloads include the elimination of system maintenance and an improved ability to share the software we develop with others outside of COSBI.”

COSBI has migrated the most computationally demanding workloads to Windows Azure and is actively working on transitioning additional ones to the cloud. “We’ve found the Windows Azure tools for Visual Studio 2010 to be quite strong—including the ability to test applications locally before deploying them to the cloud,” says Dematté. “It’s all very, very promising.”
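A worker that pulls simulation requests from a queue is one common shape for such a cloud workload. The sketch below, written against the Windows Azure SDK of that era, uses development storage so it can be tested locally before deployment; the queue name, message format, and RunSimulation method are hypothetical.

using System;
using System.Threading;
using Microsoft.WindowsAzure;
using Microsoft.WindowsAzure.ServiceRuntime;
using Microsoft.WindowsAzure.StorageClient;

// A minimal sketch of a Windows Azure worker role that processes
// simulation requests posted to a storage queue.
public class SimulationWorkerRole : RoleEntryPoint
{
    public override void Run()
    {
        // Development storage allows local testing before cloud deployment.
        CloudStorageAccount account = CloudStorageAccount.DevelopmentStorageAccount;
        CloudQueue queue = account.CreateCloudQueueClient()
                                  .GetQueueReference("simulation-requests");
        queue.CreateIfNotExist();

        while (true)
        {
            CloudQueueMessage message = queue.GetMessage();
            if (message == null)
            {
                Thread.Sleep(TimeSpan.FromSeconds(5));  // queue empty; wait
                continue;
            }

            int seed = int.Parse(message.AsString);  // hypothetical payload
            RunSimulation(seed);                     // the actual work
            queue.DeleteMessage(message);            // acknowledge completion
        }
    }

    private void RunSimulation(int seed)
    {
        // Placeholder for the real simulation kernel.
    }
}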

Benefits

With Microsoft software, COSBI has a unified platform and toolset for building parallel solutions that run on desktop PCs, on HPC clusters, and in the cloud. Through its use of that platform, COSBI is achieving the high levels of performance needed to model, simulate, and analyze complex biological processes. The result is the design and implementation of more scalable simulation tools—firmly grounded in new, massively parallel paradigms. “By helping researchers easily harness the power of parallel computing, we’re enabling them to further their research in the areas of medical science, biology, bio-information, and complex systems in general,” says Professor Corrado Priami, President and Chief Executive Officer at COSBI.

Unified Software Stack for Scientific Computing

Microsoft software gives COSBI a comprehensive software stack for building and running parallel applications. Components of that stack include operating systems, runtimes, and cloud services that enable efficient and scalable parallel execution; libraries and language features that enable developers to easily express parallelism in their applications; and tools that simplify the process of developing, debugging, optimizing, deploying, and managing parallel applications.

“We’ve found Microsoft software to be very well-suited to scientific computing,” says Dematté. “It’s great to be able to develop, debug, and run applications on a single, integrated platform.”

Ease of Development, Debugging, and Deployment

With Microsoft software, COSBI has the tools required to develop parallel applications with ease. The parallel libraries in the .NET Framework 4 are proving to be especially useful for developers like Dematté, who can now easily write parallel applications. “In the past, parallel programming required an extensive focus on low-level concepts such as thread management and race conditions,” says Dematté. “Today, with proven libraries for parallel computing a core part of the .NET Framework, we’re able to focus on the science of our work instead of how to parallelize it for better performance. New features in Visual Studio 2010 make my job even easier, providing the profiling and other tools necessary to identify bottlenecks and debug parallel code.”