 Research
 Open access
 Published:
Approximate kernel reconstruction for timevarying networks
BioData Mining volume 12, Article number: 5 (2019)
Abstract
Background
Most existing algorithms for modeling and analyzing molecular networks assume a static or timeinvariant network topology. Such view, however, does not render the temporal evolution of the underlying biological process as molecular networks are typically “rewired” over time in response to cellular development and environmental changes. In our previous work, we formulated the inference of timevarying or dynamic networks as a tracking problem, where the target state is the ensemble of edges in the network. We used the Kalman filter to track the network topology over time. Unfortunately, the output of the Kalman filter does not reflect known properties of molecular networks, such as sparsity.
Results
To address the problem of inferring sparse timevarying networks from a set of undersampled measurements, we propose the Approximate Kernel RecONstruction (AKRON) Kalman filter. AKRON supersedes the Lasso regularization by starting from the LassoKalman inferred network and judiciously searching the space for a sparser solution. We derive theoretical bounds for the optimality of AKRON. We evaluate our approach against the LassoKalman filter on synthetic data. The results show that not only does AKRONKalman provide better reconstruction errors, but it is also better at identifying if edges exist within a network. Furthermore, we perform a realworld benchmark on the lifecycle (embryonic, larval, pupal, and adult stages) of the Drosophila Melanogaster.
Conclusions
We show that the networks inferred by the AKRONKalman filter are sparse and can detect more known genetogene interactions for the Drosophila melanogaster than the LassoKalman filter. Finally, all of the code reported in this contribution will be publicly available.
Background
Understanding the dynamical behavior of living cells from their complex genomic regulatory networks is a challenge posed in systems biology; yet it is one of critical importance (i.e., morphogenesis). Gene expression data can be used to infer, or reverseengineer, the underlying genomic network to analyze the interactions between the molecules. Unfortunately, most of the existing work on reverseengineering genomic regulatory networks estimates one single static network from all available data, which is often collected during different cellular functions or developmental epochs. The idea that molecular networks are remodeled as a function of time and stage is well understood; this conclusion is supported by the developmental networks of sea urchin embryos [1]. Throughout a cellular process, such as cancer progression or anticancer therapy, there may exist multiple underlying “themes” that determine the functionalities of each molecule and their relationships to others, and such themes are dynamic. In signal processing terms, summarizing gene expression data, that comes from different cellular stages, into one network would be similar to characterizing a nonstationary signal by its Fourier spectrum. Biologically, static networks cannot reveal regimespecific or key transient interactions that lead to biological changes.
One of the challenges of inferring a timevarying network is that there are only a few observations available at each time point. This small sample size is amplified by the high dimension of every sample, leading to a small n large p problem (i.e., more variables than observations). In particular, the system is underdetermined. However, exploiting the fact that molecular networks are sparse, one can use compressive sensing to find a solution. Compressive sensing is concerned with the optimal reconstruction of a sparse signal from an underdetermined linear system [2, 3]. Underdetermined systems are quite common in computational biology/ecology, and the application of compressive sensing to solve these underdetermined systems in nature has been a popular solution [4–6]. Compressive sensing theory states that, under the restricted isometry property (RIP), the optimal sparsest solution of a linear system is equivalent to the minimum l_{1}norm solution [2, 3]. Unfortunately, it is almost impossible to check whether a linear system satisfies the RIP condition. In general, the minimum l_{1}norm solution can be far from the optimal sparse solution.
In our previous work [4], we addressed the problem of undersampled sparse systems by proposing a new energyweighted likelihood function that ensures the convergence of the likelihood function for underdetermined systems with unknown covariance. The approach was coined Small sample MUltivariate Regression with Covariance estimation (SMURC) and was applied to infer the wingmuscle gene regulatory networks of the Drosophila Melanogaster during the four phases of its development [4]. However, the estimated networks at every epoch used only the data in the corresponding epoch. In particular, the larval network ignored all the measurements in the previous embryonic phase, and so was the case for the subsequent stages. Other research efforts have been proposed to address the problem of recovering timevarying gene regulatory networks by using dynamic Bayesian models [7], nonparametric Bayesian regression [8], and random graph models [9].
In this contribution, we introduce a new approach to modeling sparse timevarying networks and their applications to gene regulatory networks that are based on our recent work [10, 11]. We start by projecting the Kalman solution onto an “approximately sparse” space by using l_{1}regularization. We further expand upon our previous work by using a Kalman smoother. We then explore the l_{1}neighborhood for sparser solutions by leveraging our recent compressive sensing technique known as Kernel RecONstruction (KRON) [12]. KRON recovers the optimal sparsest solution whether the RIP condition is satisfied or not. However, KRON’s computational complexity is still exponential in the number of parameters p. We, therefore, advance Approximate KRON (AKRON) [11], which builds growing neighborhoods of the l_{1} solution that moves towards the optimal sparsest solution and eventually reaches it. The size of the neighborhood is tunable depending on the computational resources available. We derive theoretical bounds of optimality. The AKRON Kalman filter is validated on synthetic and realworld data sets.
The statespace model and Kalman filter
Following the works in [4, 13], we model the network dynamics using a state space model. The system equation is given by a random walk model, which reflects a lack of prior knowledge of the network topological changes. The observation equation is given by a firstorder differential equation, whose parameters reflect the strength and sign of interactions (positive and negative are activating and repressing, respectively) [14]. The state space model of the incoming edges, \(\mathbf {a}_{i} \in \mathbb {R}^{p}\), for gene i can be shown to be [13]
where i=1,⋯,p and p is the number of genes. \(\boldsymbol {X}(k) \in \mathbb {R}^{p \times n}\) is the gene expression matrix at time k. y_{i}(k) is the rate of change of expression of gene i at time k. w_{i}(k) and v_{i}(k) are the process and observation noise, respectively. These noise processes are assumed to be zero mean Gaussian noise processes with the known covariances Q_{k} and R_{k}, respectively, and uncorrelated to the state vector a_{i}(k). The full connectivity matrix, A(k), can be recovered by simultaneous parallel recovery of its rows \(\boldsymbol {a}_{i}^{t}(k)\) at every time instant k. Thus, we can process each gene in parallel. The Kalman filter can be used to track a(k) [13, 15]; however, this is only if the system is observable. The problem with using a Kalman filter in our setting is that the system is underdetermined (i.e., more variables than equations, p>n). This problem, however, can be circumvented by taking into account the sparsity of the vector a_{i}(k). Since each gene in the genomic regulatory network is governed by only a small number of other genes, these networks are known to be sparse. Furthermore, we have also experimented with a Kalman Smoother that is applied after the Kalman filter. Note that the Kalman Smoother is optional. A Kalman smoother can reduce the covariance of the optimal estimate. We implemented Rauch et al.’s Kalman smoothing algorithm [16] and we compared AKRONKalman both with and without smoothing.
The pseudo code for the proposed Kalman filtering approach is shown in Fig. 1.
Constrained Kalman filtering
It is known that the connectivity of the edges in gene regulatory networks [13] is sparse. Unfortunately, the output of the Kalman filter is likely not going to be sparse. We first start by projecting the Kalman solution at time k, a_{kk}, onto the set of “approximately sparse” vectors by solving the following Lasso problem [17]:
where α∈[0,1] controls the tradeoff between the Kalman estimate and sparsity. An α close to zero will result in a solution that is close to the Kalman estimate, but that may not be sparse. The opposite happens when α is close 1, which will produce a sparser solution, but may be far from the Kalman estimate. This is achieved by minimizing the reconstruction error (i.e., the first term in (3)).
AKRON: a search for a sparser solution
Consider the following l_{0}optimization problem, which finds the optimal sparsest solution in a linear underdetermined system.
where ∥·∥_{0} is the l_{0}norm, which is defined as the support of the vector, \(\boldsymbol {x} \in \mathbb {R}^{p}\), \(\boldsymbol {y} \in \mathbb {R}^{n}\), and \(\boldsymbol {\Phi } \in \mathbb {R}^{n \times p}\). We consider the scenario where p≫n and denote in the sequel s=p−n. Without loss of generality, Φ is assumed to be fullrank. Compressive sensing theory [3] shows that, under the Restricted Isometry Property (RIP) condition on the matrix Φ, the l_{1}norm solution is equivalent to the l_{0}norm solution. Unfortunately, it is impossible to check if the RIP condition is satisfied for a given matrix. Despite this strict condition, l_{1} has been routinely used to find a sparse solution in systems of the form (4).
The proposed Approximate Kernel RecONstruction (AKRON) is an approximation to computationally complex Kernel RecONstruction (KRON) problems [12]. KRON is able to achieve an exact solution to (4), but the algorithm becomes computationally expensive for typically p>15. AKRON, detailed below, is introduced to balance the tradeoff between the computational resources that are available and the accuracy of the reconstruction.
The Kalman filter estimate is first sparsified by incorporating l_{1} regularization in (3) (line 5 of Fig. 1). However, the l_{1} projection is not guaranteed to be the optimal sparsest solution. AKRONKalman filter (AKRONKF) starts off from the l_{1}regularized Kalman estimate in (3). Then, the s=p−n smallest elements of the l_{1} projection in (3) are set to zero. The logic behind this strategy is to use the l_{1}projection to guess the position of the zeros in the optimal solution. Given that the kernel of the system matrix Φ in (3) has dimension s, we know that if s zero locations are correctly set, then the optimal sparsest solution can be exactly found by solving the linear system in (4) [12].
Following this reasoning, AKRON finds a sparser solution by exploring δneighborhoods of the l_{1}projection. The central idea behind AKRON’s δneighborhoods is as follows: (i) find the indices with the (s+δ) smallest magnitudes of the l_{1} solution, (ii) set exactly s of these indices to zero, (iii) resolve the system Φx=y. All the possible \(s+\delta \choose s\) combinations of the smallest elements in the solution of (3) are evaluated. This idea can also be viewed as a “perturbation” of the l_{1} approximation to make it closer to the l_{0}norm. The size of the neighborhood δ is tunable depending on the computational power available, and vary from 0 (l_{1}approximation) to n (KRON, i.e., perfect reconstruction).
Example: To understand the AKRON algorithm and illustrate the importance of the δneighborhoods, we present a simple numerical example. Consider the following randomly generated noiseless system as in (4):
The optimal ℓ_{0}norm sparsest solution is given by
The ℓ_{1} solution, which solves (4), is given by
Clearly, the l_{1}solution is not as sparse as the optimal solution and has incorrect zero locations. We have n=3,p=5 and thus s=2. If we choose δ=1 the AKRON considers the s+δ=3smallest magnitudes of \(\widehat {\boldsymbol {x}}_{1}\), which are located at indices 1, 2 and 4. We set s=2 locations to zero among these 3 indices. We consider all \({s+\delta \choose s} = {3 \choose 2} = 3\) combinations of two zeros in indices 1, 2 and 4 of \(\widehat {\boldsymbol {x}}_{1}\). The combination of indices 1 and 2 set to zero leads to the sparsest optimal solution x^{∗} in (5). Thus, in this case, the ℓ_{1}norm solution is suboptimal; but by considering a δ=1neighborhood of this approximation, AKRON is able to exactly recover the sparsest optimal l_{0}solution.
In the noisy case, where the constraint in (4) is replaced by ∥Φx−y∥≤ε, with ε being a given noise threshold level, the neighborhood δ is chosen adaptively as follows: set the s smallest magnitudes of the l_{1} solution to zero; compute the observation error ∥Φx−y∥. If this error is smaller than the energy of the noise, we adopt this solution. Otherwise, the next smallest element is set to zero and the error is recalculated.
In the following propositions, we investigate under which assumptions on the entries of the l_{1} solution and its closeness to the l_{0} solution, will AKRON yield the optimal l_{0} solution.
Proposition 1
Consider the system in (4) with the optimal l_{0}solution x^{∗} having k>s zeros. Consider the l_{1}solution, x_{1}, and assume that ∥x_{1}−x^{∗}∥_{2}≤ε. Let J denote the number of indices j such that
Then, by choosing δ≤J−s, AKRON yields the optimal l_{0}solution.
Proof
Let Θ and \(\overline {\Theta }\) be the index sets of zero and nonzero entries in the l_{0}solution x^{∗}, respectively. We have Θ=k. We need to show that at least s indices in Θ are where (6) holds. To prove this fact, assume the opposite, i.e., at least (k−s+1) indices in Θ are such that
However, in this case, we have
which contradicts the assumption ∥x_{1}−x^{∗}∥_{2}≤ε. □
The following proposition derives an upper bound for \(\delta \in \mathbb {N}_{+}\) when the nonzero elements of the optimal l_{0}solution are bounded from below. We first need the following Lemma.
Lemma 1
Consider the system in (4) with the optimal l_{0}solution x^{∗} and approximate l_{1}solution x_{1}. Denote by Θ and \(\overline {\Theta }\) the index sets of zero and nonzero entries in x^{∗}, respectively. Assume that ∥x_{1}−x^{∗}∥_{2}≤ε. Let R be the number of indices \(j \in \overline {\Theta }\) such that (x_{1})_{j}≤ε. If δ=R, then AKRON yields the optimal solution.
Proof
To obtain the sparsest l_{0}solution, it is sufficient to choose s zeros in “correct places”, i.e., with indices in Θ. Recall that AKRON sets s out of the smallestmagnitude (s+δ) entries in x_{1} to zero. Therefore, AKRON will yield the optimal solution if out of these (s+δ) smallestmagnitude entries, there are at least s entries from Θ. But all entries from Θ have (x_{1})_{j}≤ε, for otherwise the assumption ∥x_{1}−x^{∗}∥_{2}≤ε would be violated. This means that only those entries from \(\overline {\Theta }\), which also satisfy (x_{1})_{j}≤ε, could be chosen by AKRON, and there are only R of them. Hence δ=R will yield the optimal l_{0}solution.
Lemma 1 provides a sufficient condition on δ for optimality of AKRON, namely, if δ=R, then we are guaranteed the optimal l_{0}solution. Since this condition is not necessary, we could reach optimality with δ≤R. □
Proposition 2
Consider the system in (4) with the optimal l_{0}solution x^{∗} and approximate l_{1}solution x_{1}. Let \(\overline {\Theta }\) be the set of indices of nonzero elements in x^{∗}. Assume that the nonzero entries of x^{∗} are bounded from below, i.e.,
Assume further that ∥x_{1}−x^{∗}∥_{2}≤ε. If ε<η, then \(\delta \leq \left (\frac {\epsilon }{\eta  \epsilon }\right)^{2}\) yields the optimal l_{0}solution. In particular, if \(\epsilon < \frac {\eta }{2}\) then δ=0 suffices.
Proof
Since (x^{∗})_{j}≥η for \(j \in \overline {\Theta }\), we have
where we used the fact that δ≤R from Lemma 1. Thus, \(\epsilon \geq \left \ \boldsymbol {x}_{1}\boldsymbol {x}^{*} \right \_{2} \geq (\eta \epsilon)\sqrt {\delta }\), which completes the proof. □
Although Propositions 1 and 2 derive theoretical bounds for the choice of the neighborhood radius δ to recover the optimal sparsest solution, we found in our experiments below that relatively small values of δ are sufficient to achieve a balance between desired accuracy and computational complexity.
Results
In this section, we present an empirical analysis of the AKRONKF and its smoother, including comparisons to other approaches proposed for detecting the relationships between different genes in a molecular network. The experiments include a number of carefully designed synthetic data sets, as well as a realworld data set, namely the fruit fly.
Overview of experimental protocols
Our experiments are conducted on realworld and synthetic data. The advantage of the synthetic data are that the ground truth networks are known; therefore, we can calculate different statistics about the reconstruction error of the network. Unfortunately, we do not have a clear view of the “ground truth” for realworld data. Therefore, we use findings from the life sciences that have studied these networks and were able to infer genetogene relationships that are well established [18].
Our experiments make use of the following algorithms for a sparse reconstruction of a timevarying network:

l_{1}KF(S): This algorithm is the output of the Kalman filter with the l_{1} projection applied to the state vector. The (S) indicates whether the smoother was implemented.

AKRONKF(S): This is the proposed approach using the output of l_{1}KF(S) to seed AKRON. It is also implemented with and without the smoother.
Both of the above algorithms can reconstruct a network that represents the interactions between genes. We compute the true positive (TP), true negative (TN), false positive (FP) and false negative (FN) rates. These rates are summarized through accuracy (acc), sensitivity (sen), specificity (spe), and Matthew’s correlation coefficient (mcc), which are defined below.
Matthew’s correlation coefficient provides a more balanced statistic for examining the overall tradeoffs between the different rates (i.e., TP, TN, FP, and FN).
Results on synthetic data
Synthetic timevarying networks are simulated to evaluate the efficacy of the proposed AKRONKF(S) on data that we have complete control over. All results in this section are presented as the average over 25 monte carlo simulations. Averaging is performed because there could be a large degree of variation in the timevarying networks that are randomly generated.
First, we evaluate the impact of α in (3) on the estimation of a_{kk} and the reconstruction errors of l_{1}KF and AKRONKF. The experiment is configured as follows: a 25gene network evolves over fourtime points by simulating a random walk; all networks being 85% sparse. At each time step, there are nine observations that are available. Figure 2 shows the effect of α on the reconstruction errors of AKRONKF and l_{1}KF. Clearly, AKRONKF is the better performer across the statistics that we collected. Thus, AKRON significantly improves the previous implementation of sparse Kalman filters for timevarying networks. Furthermore, AKRONKF detects the location of the edges in the simulated networks (see Fig. 2b). Simply using the solution from l_{1}KF for a small α is not enough to find the location of the zeros. In fact, α needs to be close to one to achieve a high accuracy at edge detection (i.e., (3) will place a large weight on the l_{1} penalty and a small weigh on the error). Given these results, we choose α=0.2 for the remainder of the experiments since this value provides a reasonable tradeoff between the different statistics that were assessed. Figure 2e shows that AKRONKF is also superior to LassoKF as assessed by Matthew’s correlation coefficient, which provides a balanced measure.
Second, we expand the synthetic experiments to evaluate the impact of δ in AKRON and the dimensionality of the kernel of X, and we also evaluate the impact that the smoothing has on the reconstruction of the networks. We simulated three different kernel dimensions and values for δ. Table 1 shows the outcome of these experiments. The entries in the table are presented as A/B, where A and B are the results from AKRONKF and l_{1}KF, respectively. The table is also divided in half to separate the results for the Kalman filter and Kalman smoother. Similar to the first experiment, we observe that AKRONKF typically outperforms l_{1}KF in nearly every statistic and the results can be quite significant. Furthermore, systems with a large kernel dimension (i.e., large p small n) benefit significantly from a larger value of δ. For example, consider a network that is 9×50×4. δ=1 provides a little reduction in error; however, increasing δ to 3, significantly reduces the error and increases the other statistics. Finally, we observe that smoothing improves the error of the system; however, we should note that this improvement comes at a computational cost.
Results on Flybase
The application of interest is the inference of the timevarying wingmuscle genomic network of the Drosophila Melanogaster (fruit fly). The Drosophila’s microarray dataset originally consists of 4028 genes taken over 66 different time points [18]. The data includes 4 stages of the Drosophila’s life: embryonic (samples 1 through 30), larval (samples 31 through 40), pupal (samples 41 through 58), and adulthood (samples 59 through 66). Flybase hosts a list of undirected gene interactions [19]. We set α=0.2 based on the experiments in the previous section for l_{1}KF and AKRONKF.
In this application, we considered a list of 11 genes that are responsible for the wing muscle development, which has been considered by many researchers before [7–9, 20]. The embryonic, pupal, and larval stages are undersampled to 9 observations in each stage that were used in the reconstruction of the 11gene network in each developmental epoch. All 8 time points were used in the adulthood period. To summarize, the reconstruction of the connectivity matrix uses 9 samples in the embryonic, pupal, and larval developmental stages and 8 samples in the adulthood developmental stage. The 11 gene network was reconstructed throughout each of the four developmental stages using AKRONKF and AKRONKFS.
The networks reconstruction using the l_{1}KF and AKRONKF are shown in Figs. 3ad and 4ad, respectively. For clarity, the displayed networks do not show the strength of the interaction, only that there is an interaction detected by one of the algorithms. The AKRONKF tracker results in clearly sparser networks than the l_{1}KF. AKRONKF was able to find all the connections reported in Flybase: (Actn,prm) appears in the embryonic, larval, and pupal stages, (Actn,up) appears in all four stages, (up,mhc) appears in the embryonic, larval, and pupal stages, (up,sls) appears in the embryonic, larval, and pupal stages and (sls,mhc) appears in the embryonic and larval stages. The other two connections appear through one medium gene (Actn,sls) appears in the embryonic, pupal and adulthood phases through one additional gene and (twin, eve) appears through one or more additional genes only in the embryonic and larval phases.
The networks reconstruction using the l_{1}KF and AKRONKF with smoothers are shown in Figs. 5ad and 6ad, respectively. The smoothing provides very similar network topologies to the ones without the smoother; however, we did observe that the networks were sparser in the larval stage (see Figs. 4b and 6b). Note that while Fig. 7 is considered the ground truth, the could exist relationships that have not yet been discovered. These statistics are shown in Table 2, which again shows the benefit of using the AKRONKF over the l_{1}KF.
Table 3 lists all previous algorithms that were applied to this genetic network. Only the AKRONKF, l_{1}KF [13], SMURC [4] and Dynamic Bayesian networks [7] considered timevarying networks; and, hence, were able to distinguish the different phases in the network. The other algorithms (minimum description length [20], random graph model [9], and nonparametric Bayesian regression [8]) assumed a stationary network, and hence it is not clear at which stage the detected connections develop. The AKRONKF along with the l_{1}KF are the only algorithms able to recover all known interactions and specify the developmental stage where these interactions occur. Although the l_{1}KF also finds all reported interactions, the networks are denser (less sparse) than the AKRONKF. With regard to the MDL results in Table 2, we have not reported the interations. The MDL authors in [20] inferred a single network, using all 66 time points, that characterizes the entire Drosophila’s life cycle. In particular, the MDL approach is stationary and does not differentiate between the phases or timevarying epochs of the data. We wanted to report the MDL findings as they are in the literature. Notice that AKRON used less points and found more correct interactions.
Conclusion
In this work, we addressed the problem of inferring timevarying molecular networks as a tracking problem that can be solved using the Kalman filter. The major difficulty, however, is that there is not a sufficient number of observations at each time point, which makes the statespace model unobservable and the tracking senseless. Fortunately, molecular networks are known to be sparse because the dynamics of every gene are governed by only a small number of genes. By incorporating the sparsity condition, we show that the tracking problem becomes feasible.
We presented the AKRON Kalman filter, which builds on our previous work on the LassoKalman filter (l_{1}KF). Our proposed approach leverages the AKRON algorithm to find a sparser solution that is more representative of the ground truth. The proposed tracker/smoother first computes the output of l_{1}KF; then explores growing neighborhoods of the l_{1}projection to look for sparser solutions, eventually reaching the optimal sparsest estimate. The size of these neighborhoods is a tunable parameter that depends on the computational power available. AKRONKF was benchmarked on synthetic and realworld data against l_{1}KF. The results demonstrate that the proposed approach is better at recovering sparse timevarying networks than l_{1}KF. Not only was the reconstruction error of the proposed approach lower than l_{1}KF, but it was also better at detecting whether an edge exists in a network. AKRONKF tracker was applied to infer the wing muscle gene regulatory network of the Drosophila Melanogaster during four developmental phases of its life cycle, and successfully identified all seven known interactions reported in Flybase. We should also note that our proposed approach will work for timeseries networks that have more than four times steps and sparsity levels
Our future work includes applying the AKRONKF to other types of data, particularly data related to different types of cancers to create a predictive network biomarker for clinical outcome. These ideas have applicability in translational clinical cancer research, basic cancer research, and in networkbased drug discovery.
References
Sethi AJ, Wikramanayake RM, Angerer RC, Range RC, Angerer LM. Crosstalk regulates endomesoderm segregation in sea urchin embryos. Science. 2012; 335(6068):590–3.
Foucart S, Rauhut H. A mathematical introduction to compressive sensing. Birkhäuser. 2013.
Fornasier M, Rauhut H. Compressive sensing. In: Handbook of Mathematical Methods in Imaging: 2011.
Bayar B, Bouaynaya N, Shterenberg R. SMURC: Highdimension smallsample multivariate regression with covariance estimation. IEEE J Biomed Health Inform. 2016; 21(2):573–81.
Koslicki D, Foucart S, Rosen G. WGSQuikr: fast wholegenome shotgun metagenomic classification. PLoS ONE. 2014; 9(3):91784.
Koslicki D, Rosen G, Foucart S. Quikr: a method for rapid reconstruction of bacterial communities via compressive sensing. Bioinformatics. 2013; 29(17):2096–102.
Robinson JW, Hartemink AJ. Learning nonstationary dynamic bayesian networks. J Mach Learn Res. 2010; 11:3647–80.
Miyashita H, Nakamura T, Y Ida TM, Kaburagi T. Nonparametic Bayesbased Heterogenious Drosophila Melanogaster Gene Regulatory Network Inference: Tprocess Regression. In: International Conference on Artificial Intelligence and Applications: 2013. p. 51–8.
Go F, Hanneke S, Fu W, Xing EP. Recovering Temporally Rewiring Networks: A Model Based Approach. In: International Conference on Machine Learning: 2007. p. 321–8.
Carluccio V, Bouaynaya N, Ditzler G, Fathallah Shaykh HM. The AKRONKalman Filter for Tracking TimeVarying Networks. In: IEEE International Conference on Biomedical and Health Informatics: 2017.
Ditzler G, Bouaynaya N, Shterenberg R. AKRON: an Algorithm for Approximating Sparse Kernel Reconstruction. Signal Processing. 2018; 144:265–270.
Bayar B, Bouaynaya N, Shterenberg R. Kernel Reconstruction: an Exact Greedy Algorithm for Compressive Sensing. In: International Workshop on Genomic Signal Processing and Statistics: 2014.
Khan J, Bouaynaya N, FathallahShaykh H. Tracking of timevarying genomic regulatory networks with a lassokalman smoother. EURASIP J Bioinforma Syst Biol. 2014; 1. https://link.springer.com/article/10.1186/1687415320143.
FathallahShaykh HM, Bona JL, Kadener S. Mathematical model of the drosophila circadian clock: Loop regulation and transcriptional integration. Biophys J. 2009; 97(9):2399–40.
Xiong J, Zhou T. A kalmanfilter based approach to identification of timevarying gene regulatory networks. PLoS ONE. 2013; 8(10). https://www.ncbi.nlm.nih.gov/pubmed/24116005.
Rauch HE, Tung F, Striebel CT. Maximum likelihood estimates of linear dynamic systems. AIAA J. 1965; 3(8):1445–50.
Tibshirani R. Regression shrinkage and selection via the lasso. J R Stat Soc. 1996; 58(1):267–88.
Arbeitman M, Furlong E, Imam F, Johnson E, Null B, Baker B, Krasnow M, Scott M, Davis R, White K. Gene expression during the life cycle of drosophila melanogaster. Science. 2002; 297(5590):2270–5.
Marygold S, Leyland PC, Seal RL, Goodman JL, Thurmond J, Strelets VB, Wilson RJ. Flybase: improvements to the bibliography. Nucleic Acids Res. 2013; 41:751–7.
Zhao W, Serpedin E, Dougherty E. Inferring gene regulatory networks form time series data using minimum description length principle. Bioinformatics. 2006; 22(17):2129–35.
Acknowledgements
This work was supported by the National Science Foundation under Award Numbers NSF CCF1527822 and NSF DUE1610911.
Funding
This work was supported by the National Science Foundation (NSF) under CCF1527822 and DUE1610911. Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect those of the funding agencies.
Availability of data and materials
Code and reproducibles will be made available on Github if accepted for publication.
Author information
Authors and Affiliations
Contributions
GCD, NCB and RS developed the AKRON Kalman filter and the theory of AKRON. GCD, NCB, RS and HMFS participated in the design of the study. GCD and NCB implemented the study. GCD, NCB, RS and HMFS analyzed and interpreted the data. GCD and NCB prepared the manuscript and revised it. All authors read and approved the final manuscript.
Corresponding author
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
About this article
Cite this article
Ditzler, G., Bouaynaya, N., Shterenberg, R. et al. Approximate kernel reconstruction for timevarying networks. BioData Mining 12, 5 (2019). https://doi.org/10.1186/s1304001901921
Received:
Accepted:
Published:
DOI: https://doi.org/10.1186/s1304001901921