
GEA is a system for analyzing the results of micro-array experiments. It includes modules for background correction and normalization (NM stage), principal component analysis (PCA stage), clusterization (CL stage), quality control (QC stage), and main vector determination (MV stage). The procedures use standard techniques as well as new methods and approaches developed mainly by L.I. Brodsky, A.M. Leontovich, and other members of the GeneBee group.

The input for each GEA project consists of a set of probes in standard Affymetrix format (one probe per file), and the output of each stage consists of a set of tables (one table per file). For many tables, the information can be visualized through the web interface.

The first stage of each GEA project (NM stage) is background correction and normalization. This stage includes procedures for artifact reduction and signal rescaling.

To improve the quality of background correction and normalization, the procedures are first applied to overlapping regions of the probes, and then the resulting information is combined using smoothing interpolation schemes.
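As a rough illustration of the overlapping-region idea, the following Python sketch estimates a local background in overlapping windows of a one-dimensional signal and blends the estimates with triangular weights. It is not GEA's procedure: the function names, the use of the lower quartile as a background estimate, the triangular weighting, and the reduction to one dimension are all assumptions made for the example.

```python
def window_starts(n, width, step):
    """Start indices of overlapping windows that cover positions 0..n-1."""
    starts = list(range(0, max(n - width, 0) + 1, step))
    if starts[-1] + width < n:
        starts.append(n - width)  # extra window so the tail is covered
    return starts


def smooth_background(signal, width=8, step=4):
    """Blend per-window background estimates into one smooth curve.

    Assumes a non-empty signal and step <= width, so that every
    position falls inside at least one window.
    """
    n = len(signal)
    weighted_sum = [0.0] * n
    weight_total = [0.0] * n
    for start in window_starts(n, width, step):
        window = signal[start:start + width]
        # Lower quartile of the window as a crude background level
        # (an illustrative choice, not taken from GEA).
        local_bg = sorted(window)[len(window) // 4]
        centre = start + (len(window) - 1) / 2
        half = len(window) / 2
        for i in range(start, start + len(window)):
            # Triangular weight: largest at the window centre, still
            # strictly positive at the window edges.
            w = 1.0 - abs(i - centre) / (half + 0.5)
            weighted_sum[i] += w * local_bg
            weight_total[i] += w
    return [s / t for s, t in zip(weighted_sum, weight_total)]


def correct_background(signal, width=8, step=4):
    """Subtract the blended background and clip negative values to zero."""
    background = smooth_background(signal, width, step)
    return [max(x - b, 0.0) for x, b in zip(signal, background)]
```

Because neighbouring windows overlap, the blended background changes gradually between regions instead of jumping at window borders, which is the purpose of the interpolation step.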

Principal component analysis (PCA stage) is a standard step of the analysis of micro-array experiment results. It is based on the evaluation of eigenvalues and eigenvectors in the gene space and the analysis of eigenvectors that correspond to the greatest eigenvalues.
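The eigenvalue step can be sketched in a few lines of Python. The example below builds a covariance matrix between genes and finds its largest eigenvalue and the matching eigenvector by power iteration. It is a minimal illustration of standard PCA, not GEA's implementation; the function names and the choice of power iteration are assumptions.

```python
import math


def covariance_matrix(rows):
    """Covariance between columns (genes); each row is one probe."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centred = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [[sum(c[a] * c[b] for c in centred) / (n - 1) for b in range(d)]
            for a in range(d)]


def leading_eigenpair(matrix, iterations=500):
    """Largest eigenvalue and unit eigenvector of a symmetric PSD matrix.

    Uses plain power iteration. The start vector is deliberately uneven;
    the method can still fail in the rare case where the start vector is
    exactly orthogonal to the leading eigenvector.
    """
    d = len(matrix)
    start = [float(j + 1) for j in range(d)]
    norm = math.sqrt(sum(x * x for x in start))
    vec = [x / norm for x in start]
    value = 0.0
    for _ in range(iterations):
        product = [sum(matrix[a][b] * vec[b] for b in range(d))
                   for a in range(d)]
        norm = math.sqrt(sum(x * x for x in product))
        if norm == 0.0:
            break  # zero matrix or start vector in the null space
        vec = [x / norm for x in product]
        value = norm  # converges to the largest eigenvalue
    return value, vec
```

For example, calling leading_eigenpair(covariance_matrix(rows)) on a table with probes as rows and genes as columns returns the variance along the first principal direction and the direction itself.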

In GEA, PCA is combined with an initial (rough) clustering to make the analysis more meaningful.

The clusterization stage (CL stage) is designed to form a nested structure of gene clusters. Clusters can be treated as sets of genes with similar patterns, and the nested structure enables multi-level analysis (from global analysis down to the analysis of small sets of genes with very similar behavior).

The use of complex mathematical models (different metrics, different clusterization schemes) makes it possible to obtain very good results and to adjust the stage to each individual task.
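To make the notion of a nested cluster structure concrete, here is a small Python sketch of agglomerative clustering that records the partition at every level. It is a generic textbook scheme, not the clusterization method used in GEA; the function names, the correlation-based distance, and average linkage are assumptions, and the distance argument stands in for the choice of metric mentioned above.

```python
import math


def correlation_distance(x, y):
    """1 - Pearson correlation; small when two profiles have a similar shape."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0.0 or sy == 0.0:
        return 1.0  # flat profile: treat as uncorrelated
    return 1.0 - cov / (sx * sy)


def nested_partitions(profiles, distance=correlation_distance):
    """Map each cluster count k (from n down to 1) to a list of clusters.

    Each cluster is a frozenset of gene indices. Every partition is
    obtained from the previous one by merging two clusters, so the
    partitions are nested by construction.
    """
    clusters = [frozenset([i]) for i in range(len(profiles))]
    partitions = {len(clusters): list(clusters)}
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Average linkage: mean distance over all cross pairs.
                total = sum(distance(profiles[a], profiles[b])
                            for a in clusters[i] for b in clusters[j])
                d = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
        partitions[len(clusters)] = list(clusters)
    return partitions
```

Reading the result at a small cluster count gives the global picture, while a large cluster count isolates small groups of genes with very similar profiles. The pairwise search is quadratic per merge, so the sketch is suitable only for small inputs.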

The quality control stage (QC stage) makes it possible to correct for complicated artificial influences on the experiment results and to identify gene clusters that should be excluded from the biological analysis because they are artificial in nature.

The idea behind this stage is the analysis of geometrical properties of gene clusters, particularly, the cluster localization on each probe.
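One simple way to quantify cluster localization is to compare how tightly the genes of a cluster sit on the chip with how widely all genes are spread. The Python sketch below does this for a single probe. It is only an illustration of the geometric idea; the function names, the score, and the positions mapping (gene identifier to chip coordinates) are hypothetical and are not GEA's actual criteria.

```python
import math


def mean_spread(points):
    """Mean Euclidean distance of the points to their centroid."""
    n = len(points)
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    return sum(math.hypot(p[0] - cx, p[1] - cy) for p in points) / n


def localization_score(cluster_genes, positions):
    """Spread of a cluster relative to the spread of all genes on the chip.

    positions maps a gene identifier to its (row, column) on the chip.
    Values well below 1 mean the cluster is concentrated in one area,
    which hints at a spatial artifact rather than a biological signal.
    Assumes the genes do not all share a single position.
    """
    cluster_points = [positions[g] for g in cluster_genes]
    all_points = list(positions.values())
    return mean_spread(cluster_points) / mean_spread(all_points)
```

A cluster whose score is low on many probes would be a candidate for exclusion; the threshold for "low" would have to be chosen for the data at hand.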

The main vector determination stage (MV stage) selects a relatively small subset of genes that are expected to have the maximum connection with the studied biological conditions.

The advantage of this stage is that no ordering of probes with respect to the studied biological condition is required: the analysis is performed in the space of all probe pairs. It uses ideas from sparse representation theory and modern methods of approximation theory, e.g., greedy expansion with correction of the expanding elements.
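For readers unfamiliar with greedy expansions, the following Python sketch shows plain matching pursuit, the simplest member of that family: at each step it picks the dictionary element most correlated with the current residual and subtracts its contribution. It does not include the correction of the expanding elements mentioned above and is not GEA's algorithm; the function names and the toy dictionary are assumptions.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def matching_pursuit(target, dictionary, steps=3, tolerance=1e-12):
    """Greedy sparse approximation of target by dictionary elements.

    Returns (coefficients, residual), where coefficients maps the index
    of each selected element to its weight with respect to the
    unit-normalized element. Assumes no dictionary element is all zeros.
    """
    atoms = []
    for atom in dictionary:
        norm = math.sqrt(dot(atom, atom))
        atoms.append([a / norm for a in atom])
    residual = list(target)
    coefficients = {}
    for _ in range(steps):
        scores = [dot(residual, atom) for atom in atoms]
        best = max(range(len(atoms)), key=lambda k: abs(scores[k]))
        if abs(scores[best]) <= tolerance:
            break  # nothing left that the dictionary can explain
        coefficients[best] = coefficients.get(best, 0.0) + scores[best]
        residual = [r - scores[best] * a
                    for r, a in zip(residual, atoms[best])]
    return coefficients, residual
```

With the dictionary [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0]] and the target [3, 0, 0.5], the expansion selects only the first and third elements, which shows how a greedy scheme yields a sparse representation.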