An automated tool for predicting TF-DNA binding is needed to replace time-consuming and expensive lab experiments. Such a tool would help in understanding gene-regulatory networks and in tackling TF-related diseases. The classical way to build a TF-DNA predictive model relies on motif-finding approaches that represent motifs as consensus sequences or position weight matrices (PWMs). The inherent weakness of these models is their underlying assumption that nucleotides contribute independently to TF-DNA binding.
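
To make the independence assumption concrete, here is a minimal sketch of PWM scoring. The matrix values, the uniform background, and the function names are illustrative assumptions and are not taken from any real TF. The score of a site is just a sum of per-position log-odds, so changing one base shifts the score by the same amount whatever the neighbouring bases are.

```python
import math

# Hypothetical base probabilities for a 4-bp motif (illustrative values only).
# Each entry is one motif position; the preferred site here is "ACGT".
PFM = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
]
BACKGROUND = 0.25  # assumed uniform background base frequency


def pwm_score(site):
    """Log-odds score of one candidate site: a sum of independent per-position terms."""
    if len(site) != len(PFM):
        raise ValueError("site length must match motif width")
    return sum(math.log2(PFM[i][base] / BACKGROUND) for i, base in enumerate(site))


def best_site(sequence):
    """Scan every window of the sequence and return (score, start) of the best one."""
    width = len(PFM)
    windows = range(len(sequence) - width + 1)
    return max((pwm_score(sequence[i:i + width]), i) for i in windows)


score, start = best_site("TTACGTAA")
print(f"best window starts at {start} with score {score:.2f}")
```

Because the terms are additive, the penalty for replacing A with T at the first position is identical in "ACGT" and "ACGA". A PWM cannot express the case where the preferred base at one position depends on the base at another.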

Traditional machine learning approaches to TF-DNA prediction outperformed classical methods by representing more general and flexible motifs as k-mers, and paved the way for deep learning based methods. The state-of-the-art TF-DNA predictive methods are based on convolutional neural networks (CNNs), which use image-like patterns in sequences as motifs. However, designing a CNN architecture is challenging, and it is difficult to lay out the rules that the network uses for its predictions.
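
The two representations mentioned above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the architecture of any published model: the function names are hypothetical, and the filter weights are set by hand to respond to "TGA", whereas a trained CNN would learn them from data. The sequence is one-hot encoded into a 4-column "image", one filter is slid along it, and global max pooling keeps the strongest match.

```python
from collections import Counter

BASES = "ACGT"


def kmer_counts(sequence, k):
    """Count every overlapping k-mer (a bag-of-k-mers feature vector)."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def one_hot(sequence):
    """Encode a DNA string as rows of 4 values, one row per base, columns A, C, G, T."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in sequence]


def conv1d_relu(encoded, kernel):
    """Slide one filter (width x 4 weights) along the sequence, then apply ReLU."""
    width = len(kernel)
    activations = []
    for start in range(len(encoded) - width + 1):
        total = sum(
            encoded[start + i][j] * kernel[i][j]
            for i in range(width)
            for j in range(4)
        )
        activations.append(max(0.0, total))
    return activations


# Hypothetical hand-set filter that responds to "TGA" (columns are A, C, G, T).
TGA_FILTER = [
    [-1.0, -1.0, -1.0, 1.0],  # first position prefers T
    [-1.0, -1.0, 1.0, -1.0],  # second position prefers G
    [1.0, -1.0, -1.0, -1.0],  # third position prefers A
]

sequence = "ACTGACC"
activations = conv1d_relu(one_hot(sequence), TGA_FILTER)
pooled = max(activations)  # global max pooling: strongest match anywhere
print(kmer_counts(sequence, 2))
print(activations, pooled)
```

In this view a convolutional filter plays the role of a learned motif detector. Choices such as the number of filters, the filter width, and the pooling scheme are the kind of architecture decisions that make CNN design difficult.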

In this talk, we will go through the findings of a recently published article, ‘Convolutional neural network architectures for predicting DNA-protein binding’ [Zeng et al., 2016], which highlights what is possibly the best CNN architecture for a TF-DNA predictive model.