Immunocomputing: a survey

I.Antoniou, S.Gutnikov, V.Ivanov, Yu.Melnikov, A.Tarakanov

International Solvay Institutes for Physics and Chemistry, Campus Plaine ULB, CP231, Bd.du Triomphe, Brussels 1050, Belgium

The University of Oxford, Department of Biochemistry, South Parks Road, Oxford OX1 3QU, United Kingdom

Amtel Systems Overseas Ltd., PO Box 307, Circular Road 19/21, Douglas IM99 2BE, United Kingdom

Abstract

The recently introduced notion of immunocomputing is currently being implemented within the framework of the EU project IMCOMP. The aim of this project is to create a new computational paradigm based on principles of information processing by proteins and immune networks in living nature. This paradigm will be used for solving specific complex problems and for protection against computer viruses, intruder attacks, noise, and random errors. Implementing the immunocomputing paradigm should lead to the development of a new kind of computer, which we propose to call the immunocomputer, by analogy with the widespread neurocomputers based on models of neurons and neural networks. The objective of this review is to compare IMCOMP with existing approaches in computer science and to highlight its novelty and advantages.

1. Introduction

Biological systems, even at the level of cells and biomolecules, can be regarded as sophisticated information processing systems and can provide inspiration for ideas in engineering and technology. However, only two systems in animals possess extraordinary information processing capabilities such as learning and memory, the ability to recognize patterns, and the ability to decide how to behave in an unfamiliar environment. These two are (1) the nervous system and (2) the immune system.

The animal nervous system has already been used intensively in computer science as a biological prototype for the mathematical algorithms of artificial neural networks (ANN). Software based on ANN has been created and has found hardware implementation in neural computers [20,27].

However, the extraordinary information processing capabilities of the natural immune system have been appreciated only recently. The aim of the IMCOMP project is to create mathematical algorithms based on the principles by which natural immune systems function, and to develop software and hardware implementations of these algorithms. In this paper we present an overview of the IMCOMP project.

An introduction to the main principles of functioning of the natural immune systems is given in Section 2. Section 3 contains a brief review of Artificial Immune Systems (AIS) and their applications, as this is the field of computer science closest to IMCOMP. The basic elements and main functional principles of immunocomputing are described in Section 4. These include mathematical background (4.1), principles of information processing and their application for the solution of some data processing problems (4.2), and a brief description of a prototype of the immuno-chip: the basic element of the future immunocomputer (4.3). Finally, in Section 5 the main innovations and objectives of the IMCOMP project are discussed.

2. Overview of the natural immune system

The word immunity (from Latin immunitas) means "freedom from". The main purpose of the immune system is to keep the organism free from unfriendly foreign organisms, cells, or molecules (collectively called pathogens). The organism’s defense against intrusion is multilayered. First there are mechanical barriers: skin, mucus of the respiratory, digestive and urogenital tracts, tears, etc. The second barrier is environmental: excreted body fluids (sweat, saliva, tears, etc.) have physical and chemical characteristics that provide inappropriate living conditions for many pathogens. Pathogens that manage to pass these first lines of defense and enter the body are handled by the immune system.

Non-specific innate defence mechanisms exist; this innate immunity is primarily maintained by circulating scavenger cells such as macrophages, which ingest extracellular molecules and materials, clearing the system of both pathogens and debris. For more efficient protection, specific acquired immunity has evolved, based on recognition and selective targeting of "non-self" patterns. Acquired immunity relies on a sophisticated physiological mechanism that involves many different types of cells and molecules. It is also called adaptive because it is responsible for immunity that is adaptively acquired during the lifetime of the organism. An important part of adaptive immunity is the ability of the immune system to "memorise" encountered pathogens and to produce an enhanced response to a repeated intrusion of the same or a similar pathogen.

Parts of the pathogen that are recognised by the immune system are called antigens. A single pathogen, e.g. a bacterium, may carry a large number of different antigens. The adaptive immune system can be viewed as a distributed detection system, which consists primarily of white blood cells called lymphocytes that circulate through the body in the blood and lymph. Detection, or recognition, occurs when molecular bonds are formed between antigens and the receptors that cover the surface of the lymphocyte. When an antigen is detected, a mechanism is triggered that causes proliferation of cells producing antibodies capable of binding selectively to that particular antigen. When an antigen is bound by an antibody, its carrier pathogen becomes a target for destruction by macrophages.

Both antigens and cell receptors are protein molecules. The immune system's pattern recognition mechanism must be highly effective: it can distinguish about 10^5 "self" proteins from more than 10^16 "non-self" ones. This powerful recognition mechanism is a property of the immune system as a whole, not that of a single lymphocyte. Each lymphocyte has on its surface receptors of only one type and hence it can recognise only one antigen.

The ability to detect most pathogens requires a huge diversity of lymphocyte receptors, which is achieved by generating the receptors through a genetic process that introduces a large amount of randomness. When this random process creates a lymphocyte with receptors to a "self" protein, that lymphocyte is eliminated before it matures. Thus, only lymphocytes with receptors to "non-self" are released into circulation. In this respect, lymphocytes can be viewed as negative detectors, because they detect only "non-self" patterns and ignore "self" patterns.

Even though receptors are randomly generated, there are not enough lymphocytes in the body to provide complete coverage of the space of all possible antigens: one estimate is that there are some 10^8 different lymphocyte receptors in the body at any given time, while the potential number of antigens is on the order of 10^16. Immune protection is therefore a probabilistic process. First, pathogens usually have several different antigens, so there is a chance that at least some of them will be recognised, and that is sufficient to trigger an immune response. Second, protection is made dynamic by the continual circulation of lymphocytes through the body and by the continual turnover of the lymphocyte population. Lymphocytes are typically short-lived (several days) and are continually replaced by new lymphocytes that have new randomly generated receptors. Finally, if by misfortune the immune system of a single organism fails to recognise and resist an infection, there is a sufficient probability that other organisms in the population will have appropriate detectors at the time of infection, and this is sufficient for the survival of the species.

Immune learning and memory provide more efficient protection against a specific pathogen. If the immune system detects an antigen it has not encountered before, it undergoes a primary response, during which it "learns" to recognise that specific antigen more effectively, i.e. it produces a large number of lymphocytes with high affinity for that antigen through a process called affinity maturation. These so-called memory cells remain in circulation and provide faster detection and elimination of the pathogen at the next encounter.

Summary. The natural immune system has many features that are desirable from a computer science standpoint. The system is massively parallel and its functioning is truly distributed. Individual components are disposable and unreliable, yet the system as a whole is robust. Previously encountered infections are detected and eliminated quickly, while novel intrusions are detected on a slower time scale, using a variety of adaptive mechanisms. The system is autonomous, controlling its own behaviour at both the detector and effector levels. The immune systems of individual organisms detect infections in slightly different ways, so pathogens that are able to evade the defences of one organism cannot necessarily evade those of every other organism in the population.

3. Artificial immune systems

The field closest to IMCOMP is that of Artificial Immune Systems (AIS). The formation of this field can be regarded as complete by 1999, when the first book on the subject was published [2].

AIS represent a new and rapidly growing field of computer science. AIS are expected to provide powerful and robust information processing capabilities for solving complex problems. Like ANN, AIS can learn new information, recall previously learned information and perform pattern recognition in a highly decentralized fashion.

AIS have already been applied in:

–detection of faults in manufacturing

–security of information

–design of vaccines

–control of autonomous mobile robots

–mining of commercial data

–monitoring of plague foci in Central Asia.

3.1. Immune Network Model

Of special interest is the widespread theory of immune networks, which are formed by interactions among antibodies as well as between antibodies and immune cells. Niels Jerne, who worked at the Pasteur Institute in Paris, proposed in 1973 the general theory of idiotypic networks, also called immune networks [18]. This theory is based on the concept that immune cells (lymphocytes) are not isolated: different species of lymphocytes communicate with each other through interactions among antibodies. Accordingly, an antigen is identified not by a single recognizing set but by system-level recognition, in which the sets are connected into a network by antigen-antibody reactions.

Nowadays the existence of immune networks is established beyond doubt. Their fragments and interactions have been detected experimentally. It is worth noting that similar networks, under the name of molecular circuits, have even been proposed as a possible molecular basis of neuronal memory in the human brain.

Jerne's immune network theory has received a lot of attention from researchers over the last two decades, and many computational aspects of this model have been derived for practical use.

From the mathematical viewpoint, it was N. Jerne who initiated the development of a rigorous framework for modelling the immune system. His theory is modelled with differential equations that simulate the dynamics of lymphocytes.

Building on Jerne's work, Perelson [22] presented a probabilistic approach to idiotypic networks. His approach is highly mathematical and concentrates on phase transitions in idiotypic networks.
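
To make the idea of a differential-equation model of lymphocyte dynamics more concrete, the following is a minimal sketch of a toy network of interacting clones. It is not Jerne's formulation nor Perelson's [22]: the equation form, the bell-shaped activation function, the function names and all parameter values (source, growth, decay, theta1, theta2) are assumptions chosen only for illustration. Each clone i has a concentration x_i that grows when the stimulation it receives from the other clones is intermediate and decays otherwise.

```python
def activation(field, theta1=1.0, theta2=100.0):
    """Hypothetical bell-shaped response: small for weak and for very strong stimulation."""
    return (field / (theta1 + field)) * (theta2 / (theta2 + field))


def simulate(x0, affinity, source=0.1, growth=1.0, decay=0.5, dt=0.01, steps=5000):
    """Forward-Euler integration of the toy model

        dx_i/dt = source + x_i * (growth * f(h_i) - decay),
        h_i     = sum_j affinity[i][j] * x_j,

    where f is the activation function above. All symbols are illustrative
    assumptions, not quantities taken from the surveyed papers.
    """
    x = list(x0)
    n = len(x)
    for _ in range(steps):
        # Stimulation ("field") that each clone receives from the rest of the network.
        fields = [sum(affinity[i][j] * x[j] for j in range(n)) for i in range(n)]
        # One explicit Euler step for every clone concentration.
        x = [xi + dt * (source + xi * (growth * activation(h) - decay))
             for xi, h in zip(x, fields)]
    return x


if __name__ == "__main__":
    # Two clones that recognise each other (symmetric affinity matrix).
    print(simulate([1.0, 0.5], [[0.0, 1.0], [1.0, 0.0]]))
    # An isolated clone relaxes towards source / decay.
    print(simulate([1.0], [[0.0]]))
```

In this sketch an isolated clone, which receives no stimulation, simply relaxes to the balance between influx and decay, whereas connected clones raise or lower each other's concentrations through the affinity matrix.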

3.2. Negative Selection Algorithm

Forrest et al. [12] developed a negative-selection algorithm for change detection based on the principles of self-nonself discrimination in the immune system.

This approach can be summarised as follows:

  1. Define self as a collection S of strings of length l over a finite alphabet, a collection that needs to be protected or monitored. For example, S may be a normal pattern of activity (a program or data file), segmented into equal-sized substrings.
  2. Generate a set R of detectors, each of which fails to match any string in S. Instead of exact or perfect matching, the method uses a partial matching rule, in which two strings match if and only if they are identical in at least r contiguous positions, where r is a suitably chosen parameter.
  3. Monitor S for changes by continually matching the detectors in R against S. If any detector ever matches, then a change is known to have occurred, because the detectors are designed not to match any of the original strings in S.

The algorithm seems to have many potential applications in change detection.
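
The three steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation of [12]: detectors are drawn by simple random sampling with a fixed seed, the alphabet is binary, and the function names (r_contiguous_match, generate_detectors, detects_change) and the max_tries cut-off are hypothetical.

```python
import random


def r_contiguous_match(a, b, r):
    """Partial matching rule: True if a and b agree in at least r contiguous positions."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    run = 0
    for x, y in zip(a, b):
        run = run + 1 if x == y else 0
        if run >= r:
            return True
    return False


def generate_detectors(self_set, length, r, n_detectors,
                       alphabet="01", seed=0, max_tries=100_000):
    """Step 2: keep random candidates that fail to match every string in self_set."""
    rng = random.Random(seed)
    detectors = set()
    tries = 0
    while len(detectors) < n_detectors and tries < max_tries:
        tries += 1
        candidate = "".join(rng.choice(alphabet) for _ in range(length))
        if not any(r_contiguous_match(candidate, s, r) for s in self_set):
            detectors.add(candidate)
    return sorted(detectors)


def detects_change(strings, detectors, r):
    """Step 3: a change is reported as soon as any detector matches any monitored string."""
    return any(r_contiguous_match(d, s, r) for d in detectors for s in strings)


if __name__ == "__main__":
    self_set = ["00101100", "11010011"]          # step 1: the protected strings
    detectors = generate_detectors(self_set, length=8, r=4, n_detectors=10)
    print(detects_change(self_set, detectors, r=4))                  # unchanged data
    print(detects_change(["00101111", "11010011"], detectors, r=4))  # altered data
```

By construction the unchanged self set never triggers a detector. Whether a particular alteration is caught depends on the detectors that happened to be generated, so detection is probabilistic and improves as the number of detectors grows.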

3.3. Other Models

There exist other computational models [10,11] that emulate different aspects of the immune system, for example its ability to detect common patterns in a noisy environment, its ability to discover and maintain coverage of diverse pattern classes, and its ability to learn effectively even when not all antibodies are expressed and not all antigens are presented. Hoffman has compared the immune system and the nervous system, and has found many similarities at the level of system behaviour. Farmer et al. [10], and Bersini and Varela [3] have compared the immune system with learning classifier systems. Gilbert and Routen [14] experimented with an immune network model to create a content-addressable auto-associative memory, specifically for image recognition.

3.4. Some Applications

The models based on immune system principles are finding increasing applications in the fields of science and engineering.

3.4.1. Computer Security

S. Forrest and her group at the University of New Mexico are working on a research project whose long-term goal is to build an artificial immune system for computers. Their computer immune system is intended to protect a computer against unauthorized use of computer facilities, maintain the integrity of data files, and prevent the spread of computer viruses. Their research program is based on the negative-selection algorithm.

3.4.2. Anomaly Detection in Time Series Data

Dasgupta and Forrest [6] experimented with several time series data sets (both real and model) to investigate the performance of the negative-selection algorithm in detecting anomalies in the data series. The objective of this work is to develop an efficient algorithm for noticing changes in the steady-state characteristics of a system or process. In this case, self is taken to be the normal behaviour patterns of the monitored system. Any deviation that exceeds an allowable variation in the observed data is considered an anomaly in the behaviour pattern.

The results have shown that this approach can be used as a tool for automated monitoring of safety-critical operations.
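
As a rough illustration of how self can be derived from normal time series data, the sketch below quantizes the signal into a few levels, encodes sliding windows as symbol tuples, and flags any window that was never seen in the normal data. It is not the procedure of [6]: the encoding, the exhaustive enumeration of detectors, and the use of exact matching instead of the partial matching rule are simplifying assumptions, and all function names and parameters (quantize, encode_windows, build_detectors, find_anomalies, levels, width) are hypothetical. The number of quantization levels plays the role of the allowable variation: values falling in the same level are treated as equal.

```python
from itertools import product


def quantize(value, lo, hi, levels):
    """Map a real value to a bin index 0..levels-1; values outside [lo, hi] are clipped."""
    if value <= lo:
        return 0
    if value >= hi:
        return levels - 1
    return min(levels - 1, int((value - lo) / (hi - lo) * levels))


def encode_windows(series, lo, hi, levels, width):
    """Encode every sliding window of `width` samples as a tuple of bin indices."""
    bins = [quantize(v, lo, hi, levels) for v in series]
    return [tuple(bins[i:i + width]) for i in range(len(bins) - width + 1)]


def build_detectors(normal_series, lo, hi, levels, width):
    """Detectors are all possible windows that never occur in the normal (self) data."""
    self_set = set(encode_windows(normal_series, lo, hi, levels, width))
    return {w for w in product(range(levels), repeat=width) if w not in self_set}


def find_anomalies(series, detectors, lo, hi, levels, width):
    """Return the start indices of windows that are matched by some detector."""
    encoded = encode_windows(series, lo, hi, levels, width)
    return [i for i, w in enumerate(encoded) if w in detectors]


if __name__ == "__main__":
    normal = [0, 1, 2, 1] * 10                       # steady-state behaviour
    observed = [0, 1, 2, 1, 0, 1, 3, 1, 0, 1, 2, 1]  # one sample deviates
    detectors = build_detectors(normal, lo=0.0, hi=4.0, levels=4, width=3)
    print(find_anomalies(observed, detectors, lo=0.0, hi=4.0, levels=4, width=3))
    # prints [4, 5, 6]: the three windows that contain the deviating sample
```

Exhaustive enumeration is only feasible for short windows and few levels; for realistic sizes a randomly generated, incomplete detector set with partial matching, as in the negative-selection algorithm, would take its place.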

3.4.3. Fault Diagnosis

Ishida [16] studied the mutual recognition feature of the immune network model for fault diagnosis. In his implementation, fault tolerance was attained by mutual recognition of interconnected units in the studied plant. That is, system level recognition was achieved by unit level recognition. The results are very promising and worth further investigation.

Ishiguro et al. [17] applied the immune network model to on-line fault diagnosis of plant systems. This work attempts to develop an integrated fault diagnosis method, which can be used in industrial plants.

3.4.4. AIS for Pattern Recognition

Hunt and Cooke [1996] investigated an AIS based on the theory of immune networks within the context of machine learning. Such a system combines the advantages of learning classifier systems with some of the advantages of neural networks, machine induction and case-based retrieval. They have shown the potential of AIS on a pattern recognition problem, namely the recognition of promoters in DNA sequences.

3.5. Summary

AIS are a subject of great research interest because of their powerful information processing capabilities. In particular, they perform many complex computations in a completely parallel and distributed fashion. Like ANN, AIS can learn new information, recall previously learned information and perform pattern recognition tasks in a highly decentralized fashion. In addition, learning takes place by evolutionary processes similar to those of evolutionary computation.

There are many potential application areas in which immunity-based computational models appear to be very useful.

However, a comparison with ANN shows that the field of AIS does not yet have:

  1. A clear and sound mathematical basis;
  2. A hardware implementation analogous to the existing neurocomputers based on ANN.

Nowadays AIS is represented by software tools based on heuristic algorithms that use ideas from genetic algorithms, cellular automata, ANN, etc. Thus, solving the above problems could raise AIS, as well as their principal applications (e.g. to information security), to a new level of reliability, flexibility and operating speed.

4. Immunocomputing

The natural immune system is based on the interaction of proteins. The main goal of IMCOMP is to implement the principles of information processing by proteins and immune networks in a new computational paradigm, in order to solve specific complex problems while remaining protected from viruses, noise, errors and intrusions. We shall demonstrate that immunocomputing leads to a new kind of computer, which we propose to call the immunocomputer by analogy with the widespread neurocomputers, which are based on models of neurons and neural networks.

Three main innovations are expected to emerge from the IMCOMP project:

1. Appropriate mathematical framework (formal immune networks);

2. New approach to information processing (immunocomputing);

3. New hardware (immuno-chips).

These are discussed in detail below.

4.1. Appropriate mathematical framework

According to the biological prototypes and their mathematical models [23-26], the principal difference between IMCOMP and other types of computation should be determined by the functions of their basic elements. For example, an artificial neuron, the basic element of ANN and neural computing, is regarded as a summation unit with a threshold and fixed connections to other neurons [27], whereas a protein, the basic element of IMCOMP, satisfies quite different conditions [4]:

  • The spatial conformation of a protein is determined by the linear sequence (word) of its amino acid code;
  • This conformation determines the functions of the protein.

In fact, no existing mathematical model even approaches these requirements. Thus we need to develop a new concept of the formal protein (FP) as a mathematical abstraction of the key biophysical mechanisms of natural protein behavior. The FP has the same importance for IMCOMP as the well-known concept of the artificial (or formal) neuron has for neural computing.
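
For comparison, the formal neuron referred to above, a summation with a threshold, can be written down in a few lines. This is a generic textbook-style sketch rather than the specific model of [27]; the function name threshold_neuron and the example weights are assumptions chosen for illustration.

```python
def threshold_neuron(inputs, weights, threshold):
    """Formal neuron: output 1 if the weighted sum of the inputs reaches the threshold, else 0."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0


if __name__ == "__main__":
    # With fixed weights 1, 1 and threshold 1.5 the neuron behaves as a logical AND.
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, threshold_neuron([a, b], [1.0, 1.0], 1.5))
```

The behaviour of such an element is fixed entirely by its weights and threshold. A formal protein, by contrast, is meant to derive its function from the conformation implied by its sequence, which this kind of element does not capture.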

It is within the framework of interactions between formal proteins that we intend to develop the new concept of Formal Immune Networks (FIN) and to demonstrate rigorously that such networks are able to learn, recognize and solve problems, as artificial intelligence systems do.

The closest to FIN are arguably the mathematical models based on N. Jerne's theory of idiotypic networks. This theory can also be modelled with differential equations that simulate the dynamics of lymphocytes: the increase or decrease of the concentrations of a set of lymphocyte clones and the corresponding immunoglobulins.