If S is of tree structure, it has been shown before [4, 5] that for all such functional networks there are low-dimensional manifolds M\subset {\Re}^{n} such that it is sufficient to measure data in a {U}_{\epsilon}-environment of M in order to identify the model properly. Such manifolds M are called data bases. The same authors have proven that the minimal dimension of data bases equals the maximum number of input edges of any black-box node in the network. Moreover, almost all differentiable, monotonic submanifolds M\subset \Omega with dim(M) equal to the maximum number of input variables of a black-box node have (at least locally) the properties of a data base. Additionally, direct as well as indirect identification procedures have been analysed and implemented in software [13].
This result is based on the structure of S, which guarantees that, although all nodes in S may be black-box models, the overall functional network model cannot represent an arbitrary smooth function y=y(\underline{x}) of the n input variables. We now show that this intrinsic property of hierarchical functional networks is a specific property of the topology of S and allows, if sufficiently large data sets are available, a direct reconstruction of the topology of S from data.
In all functional models where S has a tree structure, there is a unique path {P}_{i} connecting each input variable {x}_{i} to the output node. As the paths from inputs i and j to the output node may join in a node k, {P}_{i} and {P}_{j} are not necessarily disjoint. Suppose all node functions are strictly monotonic in all variables with bounded second derivatives. Then the partial derivative of the output function y=y(\underline{x}) with respect to {x}_{i} is the product of the partial derivatives of all node functions {u}_{k} along the path {P}_{i}, starting at the input node of {x}_{i} and ending with the output node of the entire model:
{y}_{{x}_{i}}=\left(\prod _{k=2}^{\mathit{length}({P}_{i})}{\partial}_{{u}_{k-1}}{u}_{k}\right)\phantom{\rule{0.2em}{0ex}}{\partial}_{{x}_{i}}{u}_{l}=:{\partial}_{i}{P}_{i}\phantom{\rule{0.2em}{0ex}}{\partial}_{{x}_{i}}{u}_{l},
where {u}_{l} is the input node of {x}_{i}. The term {\partial}_{i}{P}_{i} represents the product of the partial derivatives of the functional nodes along the path {P}_{i} with respect to {x}_{i}. Let {P}_{ij} be the common part of the paths {P}_{i} and {P}_{j}; then it holds that {\partial}_{i}{P}_{ij}={\partial}_{j}{P}_{ij}.
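As a quick illustration of this path-product rule, consider a minimal two-level tree. The node functions u and v below are hypothetical choices made for the sketch, not taken from the model class discussed here; the derivative of y with respect to x1 computed as the product of node derivatives along the path x1 -> u -> v is compared against a direct finite-difference derivative of y:

```python
import math

# Toy tree: inputs x1, x2 feed node u; u and x3 feed the output node v.
# The node functions below are illustrative assumptions, not from the text.
def u(x1, x2):
    return x1 * x2

def y(x1, x2, x3):
    return math.exp(u(x1, x2)) + x3 ** 2   # output node v(u, x3)

# Path P_1 for input x1 is x1 -> u -> v, so the chain rule gives
#   y_{x1} = (dv/du) * (du/dx1)
def y_x1_via_path(x1, x2, x3):
    dv_du = math.exp(u(x1, x2))   # derivative of the output node w.r.t. u
    du_dx1 = x2                   # derivative of the input node w.r.t. x1
    return dv_du * du_dx1

# Check against a central finite difference of y itself.
x1, x2, x3 = 0.4, 0.9, 1.1
h = 1e-6
direct = (y(x1 + h, x2, x3) - y(x1 - h, x2, x3)) / (2 * h)
assert abs(y_x1_via_path(x1, x2, x3) - direct) < 1e-6
```

The same product structure holds for any path length; only the number of factors in {\partial}_{i}{P}_{i} changes.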
Let the input variables {x}_{i} and {x}_{j} be inputs to the same input node l, whose input-output relation is represented by the function {u}_{l}={u}_{l}(\dots ,{x}_{i},\dots ,{x}_{j},\dots )=:{u}_{l}({\underline{x}}^{l}), and let {x}_{k} be an input variable to any other node. Then application of the chain rule for derivatives with respect to {x}_{i}, {x}_{j} and {x}_{k} leads to the following set of partial differential equations (PDEs) for the output function y=y(\underline{x}):
\begin{array}{c}{y}_{{x}_{i}}={\partial}_{i}{P}_{i}\phantom{\rule{0.2em}{0ex}}{\partial}_{{x}_{i}}{u}^{l},\hfill \\ {y}_{{x}_{j}}={\partial}_{j}{P}_{j}\phantom{\rule{0.2em}{0ex}}{\partial}_{{x}_{j}}{u}^{l}.\hfill \end{array}
(1)
Since {x}_{i} and {x}_{j} are inputs to the same node l, {P}_{i} and {P}_{j} are identical. The respective products of the partial derivatives along both pathways are therefore the same for i and j, leading to the relation:
\frac{{y}_{{x}_{i}}}{{y}_{{x}_{j}}}=\frac{{\partial}_{i}{P}_{i}\phantom{\rule{0.2em}{0ex}}{u}_{{x}_{i}}^{l}}{{\partial}_{j}{P}_{j}\phantom{\rule{0.2em}{0ex}}{u}_{{x}_{j}}^{l}}=\frac{{u}_{{x}_{i}}^{l}}{{u}_{{x}_{j}}^{l}}\left({\underline{x}}^{l}\right).
(2)
The partial derivative of (2) with respect to any variable {x}_{k} which is not part of {\underline{x}}^{l} vanishes everywhere:
{\partial}_{{x}_{k}}\left(\frac{{y}_{{x}_{i}}}{{y}_{{x}_{j}}}\right)=0,\phantom{\rule{1em}{0ex}}\forall {x}_{k}\notin {\underline{x}}^{l}.
(3)
Therefore, all functions y=y(\underline{x}) which can be represented by the functional network have to satisfy the set of PDEs:
{y}_{{x}_{j}}\phantom{\rule{0.2em}{0ex}}{\partial}_{{x}_{k}}{y}_{{x}_{i}}-{y}_{{x}_{i}}\phantom{\rule{0.2em}{0ex}}{\partial}_{{x}_{k}}{y}_{{x}_{j}}=0
(3a)
for all triplets i,j,k\in \{1,\dots ,n\} where {x}_{i} and {x}_{j} are inputs to the same node, whereas {x}_{k} is an input to another node.
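A numerical version of this test can be sketched with nested central finite differences. The toy model below, a hypothetical two-node tree y = v(u(x1, x2), x3) with u = x1*x2 and v(u, x3) = u^2 + u*x3, is an assumption for illustration; the constraint (3a) should vanish for the pair (x1, x2) sharing a node and not for a pair split across nodes:

```python
def y(x):
    # Toy tree: u(x1, x2) = x1*x2 feeds the output node v(u, x3) = u^2 + u*x3.
    u = x[0] * x[1]
    return u * u + u * x[2]

def d(f, i, h=1e-4):
    # Central finite difference of f along coordinate i.
    def g(x):
        xp, xm = x[:], x[:]
        xp[i] += h
        xm[i] -= h
        return (f(xp) - f(xm)) / (2 * h)
    return g

def constraint(f, i, j, k, x):
    # y_{x_j} * d_{x_k} y_{x_i}  -  y_{x_i} * d_{x_k} y_{x_j},  cf. (3a)
    yi, yj = d(f, i), d(f, j)
    return yj(x) * d(yi, k)(x) - yi(x) * d(yj, k)(x)

x0 = [0.7, 1.3, 0.5]
print(constraint(y, 0, 1, 2, x0))  # ~0: x1 and x2 enter the same node
print(constraint(y, 0, 2, 1, x0))  # clearly nonzero: x3 enters elsewhere
```

On real (noisy) data such second-derivative estimates become delicate; this is the ill-posedness noted at the end of the section.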
Generalizing this argument, we show that S is associated with an even larger set of structural PDEs that y(\underline{x}) has to satisfy. Now let the root and rank be defined as follows:
Definition 1 Node k shall be the root {T}_{ij} of the input variables {x}_{i} and {x}_{j} if the pathways from {x}_{i} to the output y of the entire system and from {x}_{j} to y join for the first time in node k. As the pathways from each input variable to the output are unique in tree structures, every pair of input variables has a unique root.
The rank \mathit{Rg}(k) of a node k shall be given by the length of the path from k to the output y of the entire system. In tree structures each node has a unique rank.
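For a known tree both notions are straightforward to compute. A minimal sketch, assuming the tree is encoded as a hypothetical parent map with the output node marked by parent None:

```python
def path_to_output(parent, node):
    # Nodes on the unique path from `node` up to the output of the system.
    path = [node]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path

def root(parent, i, j):
    # T_ij: first node where the paths from inputs i and j join.
    on_pj = set(path_to_output(parent, j))
    return next(n for n in path_to_output(parent, i) if n in on_pj)

def rank(parent, k):
    # Rg(k): length of the path from node k to the output.
    return len(path_to_output(parent, k)) - 1

# Hypothetical tree: x1, x2 -> node u1; u1, x3 -> output node u2.
parent = {"x1": "u1", "x2": "u1", "x3": "u2", "u1": "u2", "u2": None}
assert root(parent, "x1", "x2") == "u1" and rank(parent, "u1") == 1
assert root(parent, "x1", "x3") == "u2" and rank(parent, "u2") == 0
```

The reconstruction problem below runs in the opposite direction: the ranks of the roots are inferred from data, and the parent map is the unknown.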
Then, in tree structures with n input variables {x}_{i} and one output variable y the following theorem holds:
Theorem 1 (Structure-Constraint Theorem)
For each triplet of input variables \{{x}_{i},{x}_{j},{x}_{k}\}, i,j,k=1,\dots ,n, the conditions:

(i)
{y}_{{x}_{i}}\phantom{\rule{0.2em}{0ex}}{\partial}_{{x}_{k}}{y}_{{x}_{j}}-{y}_{{x}_{j}}\phantom{\rule{0.2em}{0ex}}{\partial}_{{x}_{k}}{y}_{{x}_{i}}=0
and:

(ii)
\{\mathit{Rg}({T}_{ij})>\mathit{Rg}({T}_{ik})\}\wedge \{\mathit{Rg}({T}_{ij})>\mathit{Rg}({T}_{jk})\}
are equivalent.
Remark Eq. (3a) is a special case of the Structure-Constraint Theorem, where \mathit{Rg}({T}_{ij}) is maximal.
Proof For all triplets i, j, k satisfying (ii), the pathways {P}_{i}, {P}_{j} and {P}_{k} must be at least partially disjoint. As (ii) is satisfied, each of the pathways can be decomposed into three components {P}^{0}, {P}^{1}, {P}^{2} with specific overlaps:
{P}_{i}^{1}={P}_{j}^{1},\phantom{\rule{1em}{0ex}}{P}_{i}^{2}={P}_{j}^{2}={P}_{k}^{2}
(4a)
with
\begin{array}{rl}{\partial}_{j}{P}_{i}^{0}& ={\partial}_{k}{P}_{i}^{0}={\partial}_{i}{P}_{j}^{0}={\partial}_{k}{P}_{j}^{0}={\partial}_{i}{P}_{k}^{0}={\partial}_{j}{P}_{k}^{0}=0,\\ {\partial}_{k}{P}_{i}^{1}& ={\partial}_{k}{P}_{j}^{1}={\partial}_{i}{P}_{k}^{1}={\partial}_{j}{P}_{k}^{1}=0\end{array}
(4b)
and, because of the partial coincidence of the pathways: {P}_{i}^{1}={P}_{j}^{1},{P}_{i}^{2}={P}_{j}^{2}={P}_{k}^{2}, it holds:
\begin{array}{rl}{\partial}_{i}{P}_{i}^{1}& ={\partial}_{j}{P}_{j}^{1},\\ {\partial}_{i}{P}_{i}^{2}& ={\partial}_{j}{P}_{j}^{2}.\end{array}
Equation (2) leads to
\frac{{y}_{{x}_{i}}}{{y}_{{x}_{j}}}=\frac{{\partial}_{i}{P}_{i}{u}_{{x}_{i}}}{{\partial}_{j}{P}_{j}{u}_{{x}_{j}}}=\frac{{\partial}_{i}{P}_{i}^{0}\times {\partial}_{i}{P}_{i}^{1}\times {\partial}_{i}{P}_{i}^{2}\times {u}_{{x}_{i}}^{{l}_{i}}}{{\partial}_{j}{P}_{j}^{0}\times {\partial}_{j}{P}_{j}^{1}\times {\partial}_{j}{P}_{j}^{2}\times {u}_{{x}_{j}}^{{l}_{j}}}=\frac{{\partial}_{i}{P}_{i}^{0}\times {u}_{{x}_{i}}^{{l}_{i}}}{{\partial}_{j}{P}_{j}^{0}\times {u}_{{x}_{j}}^{{l}_{j}}}.
Because of (4b) the last term does not depend on {x}_{k}, and it holds:
{\partial}_{k}\frac{{y}_{{x}_{i}}}{{y}_{{x}_{j}}}={\partial}_{k}\frac{{\partial}_{i}{P}_{i}^{0}\times {u}_{{x}_{i}}^{{l}_{i}}}{{\partial}_{j}{P}_{j}^{0}\times {u}_{{x}_{j}}^{{l}_{j}}}=0\Rightarrow {y}_{{x}_{i}}\phantom{\rule{0.2em}{0ex}}{\partial}_{{x}_{k}}{y}_{{x}_{j}}-{y}_{{x}_{j}}\phantom{\rule{0.2em}{0ex}}{\partial}_{{x}_{k}}{y}_{{x}_{i}}=0.
Conversely, if (i) holds, then we can find a decomposition of the respective pathways {P}_{i}, {P}_{j} and {P}_{k} according to eqs. (4a) and (4b), resulting in (ii). □
Based on the Structure-Constraint Theorem, the structure S of the functional network can be unravelled from the data as follows:
Algorithm 1 Direct hierarchical functional network reconstruction:

i.
Test, for each triplet of input variables i, j, k, whether condition (i) of the Structure-Constraint Theorem is globally satisfied, leading to a full set of satisfied rank-root conditions for the structure S.

ii.
Pick all pairs i, j for which no k=1,\dots ,n satisfies the condition (ii):
\{\mathit{Rg}({T}_{ik})>\mathit{Rg}({T}_{ij})\}\vee \{\mathit{Rg}({T}_{jk})>\mathit{Rg}({T}_{ij})\}.
Then i and j are inputs to the same input node. Use this combinatorial information to distribute all input variables onto their respective input nodes.

iii.
Join the outputs of each input node l to one ‘child’ variable {x}_{l}^{\prime}. The roots of a ‘child’ variable {x}_{l}^{\prime} are equal to those roots of the respective ‘parent’ variables which have not yet been identified as input nodes. The respective ranks for the roots of the ‘child’ variables are the ranks of the respective roots of the parent variables minus 1. We thus arrive at a new, smaller structure {S}^{\prime} which consists of all nodes that have not been identified in step (ii) as input nodes. Therefore, {S}^{\prime} is identical to the respective part of S, and the input variables of {S}^{\prime} are the ‘child’ variables of the input nodes. The respective roots and ranks can be determined from the roots and ranks of S.

iv.
Distribute the ‘child’ variables, as input variables of {S}^{\prime}, onto their input nodes in {S}^{\prime}. This can be performed as described in step (ii) and leads to new ‘grandchild’ variables; hence, return to step (ii).

v.
For each tree structure there exists m, m<\infty, such that m loops of steps (ii)-(iv) described above lead to a structure {S}^{m\prime} in which all new input variables have the same root node. This common root is then the output node of the entire system structure S, and the algorithm stops.
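The grouping of step (ii) can be sketched as follows. The table `rg` of root ranks is assumed to have been estimated from data via Theorem 1 (here it is simply supplied for a hypothetical four-input tree), and the sibling test used is that no third variable k joins i or j more deeply than i and j join each other, i.e. Rg(T_ij) is maximal among all pairs containing i or j:

```python
from itertools import combinations

def siblings(variables, rg):
    # Group variables that are inputs to the same node.
    # rg maps frozenset({i, j}) -> rank of the root T_ij.
    def same_node(i, j):
        r = rg[frozenset((i, j))]
        # No k may pair more deeply with i or with j than i pairs with j.
        return not any(rg[frozenset((i, k))] > r or rg[frozenset((j, k))] > r
                       for k in variables if k not in (i, j))
    groups = []
    for i in variables:
        for g in groups:
            if same_node(i, g[0]):
                g.append(i)
                break
        else:
            groups.append([i])
    return groups

# Hypothetical tree: x1, x2 -> node A; x3, x4 -> node B; A, B -> output.
# Rg(A) = Rg(B) = 1, Rg(output) = 0, so rg is:
rg = {frozenset(p): 0 for p in combinations("1234", 2)}
rg[frozenset("12")] = 1
rg[frozenset("34")] = 1
print(siblings(list("1234"), rg))  # [['1', '2'], ['3', '4']]
```

Steps (iii)-(v) then replace each group by a ‘child’ variable, decrement the relevant root ranks by 1, and repeat the grouping on the smaller table until a single root remains.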
Notes

a.
If the rank-root relations are known for all triplets of input variables \{{x}_{i},{x}_{j},{x}_{k}\}, then the adjoint tree structure of S can be directly re-engineered from this set of relations. Therefore, if very large sets of data are given (for example, from high-throughput experimentation), such that a reliable test of the truth of conditions (i) and (ii) can be performed for all triplets, then the structure of the underlying functional network can be reconstructed directly. This direct approach is much more effective than quantitatively identifying a model for every possible structure S and then selecting the structure with the smallest residuals.

b.
The results described above can be transferred to models with discrete, for example binary, outputs. They then allow the direct identification of the structure of the functional mechanisms behind the measured data in various scientific applications, for example in the identification of pharmacological mechanisms from high-throughput screening data [16].
The direct network identification algorithm provides a very efficient approach to hierarchical network re-engineering. It is superior to one-step re-engineering approaches, which require the minimization of an error functional of the residuals and thus lead to a highly nonlinear, combinatorial optimization problem. As the algorithm can be generalized to discrete variables, it may be an efficient method for the analysis of next-generation sequencing data when large data sets become available. However, its drawbacks are its present limitation to tree structures as well as the required estimates for condition (i), which is an ill-posed problem. Further research will be necessary to develop stable routines that can be applied by non-experts in a standardized workflow.