NEUROINFORMATICS - Abstracts - Alexander N. Gorban



Selected Publications and Preprints

2007

[EN] A.N. Gorban, N.R. Sumner, and A.Y. Zinovyev Topological grammars for data approximation Applied Mathematics Letters, Volume 20, Issue 4 (2007), 382-386 A method of topological grammars is proposed for multidimensional data approximation. For data with complex topology we define a principal cubic complex of low dimension and given complexity that gives the best approximation for the dataset. This complex is a generalization of linear and non-linear principal manifolds and includes them as particular cases. The problem of optimal principal complex construction is transformed into a series of minimization problems for quadratic functionals. These quadratic functionals have a physically transparent interpretation in terms of elastic energy. For the energy computation, the whole complex is represented as a system of nodes and springs. Topologically, the principal complex is a product of one-dimensional continuums (represented by graphs), and the grammars describe how these continuums transform during the process of optimal complex construction. This factorization of the whole process onto one-dimensional transformations using minimization of quadratic energy functionals allows us to construct efficient algorithms.


2005

[EN] A. Gorban, A. Zinovyev Elastic Principal Graphs and Manifolds and their Practical Applications Computing 75, 359–379 (2005), (DOI) 10.1007/s00607-005-0122-6 Principal manifolds serve as a useful tool for many practical applications. These manifolds are defined as lines or surfaces passing through “the middle” of the data distribution. We propose an algorithm for fast construction of grid approximations of principal manifolds with given topology. It is based on an analogy between a principal manifold and an elastic membrane. The first advantage of this method is the form of the functional to be minimized, which becomes quadratic at the step of refining the vertex positions. This makes the algorithm very effective, especially for parallel implementations. Another advantage is that the same algorithmic kernel is applied to construct principal manifolds of different dimensions and topologies. We demonstrate how the flexibility of the approach allows numerous adaptive strategies, such as principal graph construction. The algorithm is implemented as a C++ package elmap and as a part of the stand-alone data visualization tool VidaExpert, available on the web. We describe the approach and provide several examples of its application with speed performance characteristics.
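
The elastic-membrane analogy in the abstract above can be illustrated with a minimal sketch. This is not the elmap implementation. It assumes a one-dimensional chain of nodes and a three-term energy (approximation, stretching, bending). The function name `elastic_energy`, the weights `stretch` and `bend`, and the exact form of each term are assumptions made for illustration.

```python
import numpy as np

def elastic_energy(data, nodes, stretch=0.1, bend=0.1):
    """Illustrative energy of a 1D chain of nodes fitted to data points."""
    # Approximation term: mean squared distance from each point to its nearest node.
    d2 = ((data[:, None, :] - nodes[None, :, :]) ** 2).sum(axis=2)
    approximation = d2.min(axis=1).mean()
    # Stretching term: squared lengths of the springs joining neighbouring nodes.
    stretching = ((nodes[1:] - nodes[:-1]) ** 2).sum()
    # Bending term: squared second differences at the inner nodes of the chain.
    bending = ((nodes[:-2] - 2 * nodes[1:-1] + nodes[2:]) ** 2).sum()
    return approximation + stretch * stretching + bend * bending
```

Once the assignment of data points to nearest nodes is fixed, every term in this sketch is quadratic in the node positions, which is the property the abstract highlights for the refinement step.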


2004

[EN] A.N. Gorban, D.A. Rossiyev, M.G. Dorrer MultiNeuron - Neural Networks Simulator For Medical, Physiological, and Psychological Applications The talk for the 1995 World Congress on Neural Networks, E-print: http://arxiv.org/abs/q-bio.QM/0411034 This work describes neural software applied in medicine and physiology to: investigate and diagnose immune deficiencies; diagnose and study allergic and pseudoallergic reactions; forecast emergence or aggravation of stagnant cardiac insufficiency in patients with cardiac rhythm disorders; forecast development of cardiac arrhythmia after myocardial infarction; reveal relationships between the accumulated radiation dose and a set of immunological, hormonal, and biochemical parameters of human blood and find a method for estimating the dose value from these parameters; propose a technique for early diagnosis of choroid melanomas. Neural networks also help to predict human relations within a group.


[EN] Gorban A.N., Zinovyev A.Y. Elastic principal manifolds and their practical applications E-print http://arxiv.org/abs/cond-mat/0405648 Principal manifolds defined as lines or surfaces passing through "the middle" of the data distribution serve as useful objects for many practical applications. We propose a new algorithm for fast construction of grid approximations of principal manifolds with given topology. One advantage of the method is a new form of the functional to be minimized, which becomes quadratic at the step of refining the vertex positions. This makes the algorithm very effective, especially for parallel implementations. Another advantage is that the same algorithmic kernel is applied to construct principal manifolds of different dimensions and topologies. We demonstrate how the flexibility of the approach easily allows numerous adaptive strategies, such as principal graph construction. The algorithm is implemented as a C++ package elmap and as a part of the stand-alone data visualization tool VidaExpert, available on the web. We describe the approach and provide several examples of its applications with speed performance characteristics.


2003

[EN] A. N. Gorban, A. Y. Zinovyev, D.C. Wunsch Application of The Method of Elastic Maps In Analysis of Genetic Texts Proceedings of IJCNN2003 The method of elastic maps makes it possible to efficiently construct 1D, 2D and 3D non-linear approximations to principal manifolds with different topologies (piece of plane, sphere, torus, etc.) and to project data onto them. We describe the idea of the method and demonstrate its applications in the analysis of genetic sequences.


[EN] A. N. Gorban Neuroinformatics: What are us, where are we going, how to measure our way? The lecture was given at the USA-NIS Neurocomputing opportunities workshop, Washington DC, July 1999 (Associated with IJCNN'99) E-print: http://arxiv.org/abs/cond-mat/0307346 What is neuroinformatics? We can define it as a direction of science and information technology dealing with the development and study of methods for solving problems by means of neural networks. A field of science cannot be determined only by fixing what it is "dealing with". The main component, actually constituting a scientific direction, is "THE GREAT PROBLEM", around which the efforts are concentrated. One may even state categorically: if there is no great problem, there is no field of science, but only a more or less skilful imitation. What is "THE GREAT PROBLEM" for neuroinformatics? The problem of effective parallelism, the study of the brain (solving the mysteries of thinking), etc. are discussed. Neuroinformatics is considered not only as a science, but as a services sector too. The main ideas of a generalized technology for extracting explicit knowledge from data are presented. The mathematical achievements generated by neuroinformatics, the problem of provability of neurocomputations, and the benefits of a neural network realization of the solution of a problem are discussed.


[EN] Alexander N. Gorban, Andrei Yu. Zinovyev, Tatyana G. Popova Detecting simple cluster structure of triplet distributions in genetic texts Bioinformatics, Submitted, 2003 Motivation: In several recent papers new algorithms were proposed for detecting coding regions without requiring a learning dataset of already known genes. In this paper we interpret some of these results and propose a simpler method. Results: Several complete genomic sequences were analyzed, using visualization of tables of triplet counts in a sliding window. The distribution of 64-dimensional vectors of triplet frequencies displays a well-detectable cluster structure. The structure was found to consist of seven clusters, corresponding to protein-coding information in three possible phases in one of the two complementary strands and in the non-coding regions. Awareness of the existence of this structure allows development of methods for the segmentation of sequences into regions with the same coding phase and non-coding regions. This method may be completely unsupervised or use some external information. Since the method does not need extraction of ORFs, it can be applied even to unassembled genomes. Accuracy calculated on the base-pair level (both sensitivity and specificity) exceeds 90%. This is no worse than methods such as HMM, and the method has the advantage of being much simpler. Availability: The software and datasets are available at http://www.ihes.fr/~zinovyev/bullet.
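
The sliding-window triplet counts mentioned in this abstract can be sketched as follows. This is an illustrative reconstruction, not the authors' software. It assumes non-overlapping triplets read from the start of each window, and the window `width`, the `step`, and all function names are hypothetical.

```python
from itertools import product

# All 64 triplets over the DNA alphabet, in a fixed order.
TRIPLETS = ["".join(p) for p in product("ACGT", repeat=3)]
INDEX = {t: i for i, t in enumerate(TRIPLETS)}

def triplet_frequencies(window):
    """Return a 64-dimensional vector of non-overlapping triplet frequencies."""
    counts = [0] * 64
    total = 0
    for i in range(0, len(window) - 2, 3):
        idx = INDEX.get(window[i:i + 3])
        if idx is not None:  # skip triplets containing ambiguous letters such as N
            counts[idx] += 1
            total += 1
    return [c / total for c in counts] if total else counts

def sliding_vectors(sequence, width=300, step=12):
    """Yield (start, frequency_vector) for each window position (assumed sizes)."""
    for start in range(0, len(sequence) - width + 1, step):
        yield start, triplet_frequencies(sequence[start:start + width])
```

Because triplets are read from the window start, shifting the window by one or two letters inside a coding region changes which triplets are counted, which is one way the three coding phases could end up in separate clusters.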


[EN] A. Gorban, A. Rossiev, N. Makarenko, Y. Kuandykov, V. Dergachev Recovering data gaps through neural network methods International Journal of Geomagnetism and Aeronomy vol. 3, no. 2, pages 191-197, December 2002 A new method is presented to recover lost data in geophysical time series. Gaps in data are a substantial obstacle to drawing correct conclusions about a phenomenon from time series processing. Moreover, using data with irregular coarse steps results in the loss of essential information during analysis. We suggest an approach to solving these problems that is based on the idea of modeling the data with the help of small-dimension manifolds and is implemented with the help of a neural network. We apply this approach to real data and show that it is suitable for analyzing time series of cosmogenic isotopes. In addition, multifractal analysis was applied to the recovered 14C concentration in the Earth’s atmosphere.


[EN] Gorban A.N., Zinovyev A.Yu. Visualization of Data by Method of Elastic Maps and its Applications in Genomics, Economics and Sociology Institut des Hautes Etudes Scientifiques Preprint. IHES M/01/36. Online-version: http://www.ihes.fr/PREPRINTS/M01/Resu/resu-M01-36.html A technology of data visualization and data modeling is suggested. The basis of the technology is the original idea of the elastic net and methods for its construction and application. A short review of relevant methods is given. The proposed methods are illustrated by applying them to real biological, economic, and sociological datasets and to some model data distributions.


[EN] Alexander N. Gorban, Katya O. Gorbunova Liquid Brain: Kinetic Model of Structureless Parallelism A new formal model of parallel computations – the Kirdin kinetic machine – is suggested. It is expected that this model will play a role for parallel computations similar to that of Markov normal algorithms, Kolmogorov and Turing machines, or Post schemes for sequential computations. The basic ways in which computations are realized are described; correctness of the elementary programs for the Kirdin kinetic machine is investigated. It is proved that the deterministic Kirdin kinetic machine is an effective calculator. A simple application of the Kirdin kinetic machine – heap encoding – is suggested. The Kirdin kinetic machine is extended with subprograms similar to those of conventional programming.


[EN] A.N. Gorban, A.A. Rossiev, D. C. Wunsch II Neural Network Modeling of Data with Gaps: Method of Principal Curves, Carleman's Formula, and Other A method of modeling data with gaps by a sequence of curves has been developed. The new method is a generalization of the iterative construction of the singular expansion of matrices with gaps. Under discussion are three versions of the method featuring clear physical interpretation: 1) linear – modeling the data by a sequence of linear manifolds of small dimension; 2) quasilinear – constructing “principal curves” (or “principal surfaces”), univalently projected on the linear principal components; 3) essentially non-linear – based on constructing “principal curves” (principal strings and beams) employing the variation principle; the iterative implementation of this method is close to Kohonen self-organizing maps. The derived dependencies are extrapolated by Carleman’s formulas. The method is interpreted as the construction of a neural network conveyor designed to solve the following problems: 1) to fill gaps in data; 2) to repair data – to correct initial data values in such a way as to make the constructed models work best; 3) to construct a calculator to fill gaps in the data line fed to the input.
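
The "singular expansion of matrices with gaps" mentioned above can be illustrated by a simplified sketch. It assumes a single rank-one term fitted by alternating least squares over the known entries only, with gaps marked as NaN and no centering term. The function name and the iteration count are hypothetical, and this is not presented as the authors' exact procedure.

```python
import numpy as np

def rank_one_with_gaps(a, iterations=100):
    """Fit a[i, j] ~ x[i] * y[j] using only the known (non-NaN) entries."""
    mask = (~np.isnan(a)).astype(float)   # 1 where the entry is known, 0 at gaps
    filled = np.where(mask > 0, a, 0.0)   # gaps contribute nothing to the sums
    x = np.ones(a.shape[0])
    y = np.ones(a.shape[1])
    for _ in range(iterations):
        # Best y for fixed x: an independent least-squares fit for each column.
        y = (filled.T @ x) / np.maximum(mask.T @ (x ** 2), 1e-12)
        # Best x for fixed y: an independent least-squares fit for each row.
        x = (filled @ y) / np.maximum(mask @ (y ** 2), 1e-12)
    return x, y
```

After fitting, `x[i] * y[j]` gives a model value for a missing entry. Further terms of an expansion could be obtained by repeating the fit on the residuals of the known entries.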


[EN] Gorban A. N. Neuroinformatics: What are us, where are we going, how to measure our way? What is neuroinformatics? For me, here and now, neuroinformatics is a direction of science and information technology dealing with the development and study of methods for solving problems by means of neural networks. A basic example of an artificial neural network, which will be referred to below, is a feed-forward network of standard neurons.


[EN] M.Yu.Senashova, A.N.Gorban Back-Propagation of Accuracy Our problem is to determine maximal allowable errors, possible for signals and parameters of each element of a network proceeding from the condition that the vector of output signals of the network should be calculated with given accuracy. Back-propagation of accuracy is developed to solve this problem.


[EN] Alexander N. Gorban, Eugeniy M. Mirkes and Victor G. Tsaregorodtsev Generation of Explicit Knowledge from Empirical Data through Pruning of Trainable Neural Networks International Joint Conference on Neural Networks, Washington, DC July 10-16, 1999 This paper presents a generalized technology of extraction of explicit knowledge from data. The main ideas are 1) maximal reduction of network complexity (not only removal of neurons or synapses, but removal of all the unnecessary elements and signals and reduction of the complexity of elements), 2) use of an adjustable and flexible pruning process (the pruning sequence should not be predetermined: the user should be able to prune the network in their own way in order to achieve a desired network structure for the purpose of extracting rules of the desired type and form), and 3) extraction of rules not in a predetermined form but in any desired form. Some considerations and notes about network architecture, the training process, and the applicability of currently developed pruning techniques and rule extraction algorithms are discussed. This technology, which we have been developing for more than 10 years, has allowed us to create dozens of knowledge-based expert systems.


[EN] Gorban A. N. Approximation of Continuous Functions of Several Variables by an Arbitrary Nonlinear Continuous Function of One Variable, Linear Functions, and Their Superpositions Appl. Math. Lett., Vol. 11, No. 3, pp 45-49, 1998


[RU] Gorban A. N. Function of Multiple Variables and Neural Networks Soros Educational Journal, No. 12, 1998 (in Russian) The following problems on the representation of functions of multiple variables by means of simpler functions are discussed: are there functions of multiple variables that cannot be exactly represented by means of functions of one variable, and when is an approximate representation of functions using simpler functions possible? An introduction to the theory of artificial neural networks is given. It is shown that with the aid of neural networks one can calculate any continuous function to arbitrary precision.


[EN] Alexander N. Gorban, Yeugenii M. Mirkes, Donald C. Wunsch II High Order Orthogonal Tensor Networks: Information Capacity and Reliability Neural networks based on the construction of orthogonal projectors in the tensor power of the space of signals are described. A sharp estimate of their ultimate information capacity is obtained. The number of stored prototype patterns (prototypes) can exceed the number of neurons many times over. A comparison with error control codes is made.


[EN] Masha Yu. Senashova, Alexander N. Gorban, Donald C. Wunsch II Back-propagation of Accuracy In this paper we solve the problem: how to determine maximal allowable errors, possible for signals and parameters of each element of a network proceeding from the condition that the vector of output signals of the network should be calculated with given accuracy? “Back-propagation of accuracy” is developed to solve this problem.


[RU] Gorban A.N. Generalized Approximation Theorem and Computational Capabilities of Neural Networks (in Russian) The computational capabilities of artificial neural networks are investigated. This leads back to the classical question of representing functions of several variables by superpositions and sums of functions of one variable, and to a new version of this question (restriction to a single, arbitrarily chosen nonlinear function of one variable). It is shown that any continuous function of several variables can be approximated to arbitrary accuracy using the operations of addition and multiplication by a number, superposition of functions, linear functions, and one arbitrary continuous nonlinear function of one variable. For polynomials an algebraic version of the theorem is obtained: any polynomial in several variables can be obtained (exactly) in a finite number of steps using the operations of addition, multiplication by a number, and an arbitrary (single) polynomial in one variable of degree higher than 1. The Stone approximation theorem is carried over from rings of functions to any algebras of functions that are closed under an arbitrary nonlinear operation, as well as under addition and multiplication by a number. For neural networks the results mean that the only requirement on the neuron activation function is nonlinearity, and nothing more. Whatever that function is, one can build the network of connections and choose the coefficients of the linear connections between neurons so that the neural network computes any continuous function of its inputs to arbitrary accuracy.


[EN] Serge Ye. Gilev, Alexander N. Gorban, Donald C. Wunsch II Quasinewtonian Acceleration of Training: a Simple Test Three supervised methods of training neural networks have been compared in a test. The essence of all the methods compared is step-by-step minimization of the estimation function H; they differ from one another in the direction along which each minimization step is made: 1) to optimize in a random direction; 2) to optimize in the direction of the antigradient of H (steepest descent); 3) to correct this direction by the quasinewtonian single-step method (BFGS formula). The comparison was made on two standard problems: 1) to recognize the direction of cyclic shift of a 0-1 sequence; 2) to recognize the symmetry of a 0-1 sequence.


[EN] Dmitri N. Golub, Alexander N. Gorban Multi-particle Networks for Associative Memory New light is thrown on Associative Memory Networks. The multi-particle or high-rank tensor approach, which is a generalization of the Hopfield model, is considered for Pattern Recognition. The approach makes it possible to significantly enhance the informational capacity of Associative Memory Networks. A new efficient method is offered for calculating high order or multi-particle nets. The method is confirmed by two experiments, the results of which are presented and discussed in the paper. Keywords: Associative Memory, Neural Network, k-particle Net.
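
To make the idea of a higher-order generalization of the Hopfield model concrete, here is a minimal sketch. It is a textbook-style high-order recall rule and not necessarily the method of the paper. It assumes patterns with entries +1 and -1 and a field in which each stored pattern is weighted by a power of its overlap with the current state. The function name `recall` and the parameters `order` and `steps` are hypothetical.

```python
import numpy as np

def recall(patterns, probe, order=3, steps=10):
    """Iterate a high-order associative-memory update from a noisy probe."""
    state = probe.copy()
    for _ in range(steps):
        overlaps = patterns @ state                    # scalar product with each stored pattern
        field = patterns.T @ overlaps ** (order - 1)   # order=2 gives the classic Hopfield field
        new_state = np.where(field >= 0, 1, -1)
        if np.array_equal(new_state, state):           # fixed point reached
            break
        state = new_state
    return state
```

Raising the overlaps to a higher power makes the best-matching pattern dominate the field, which is the intuition behind the increased capacity that the abstract reports.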
