**About ViralPhos**

**System Flow of ViralPhos**

**Two-layered SVMs of ViralPhos**

In this work, the public SVM library LIBSVM was employed to generate a predictive model for each MDDLogo-clustered subgroup. Following the encoding method of SulfoSite, a positional weighted matrix (PWM), which specifies the relative frequency of amino acids surrounding the substrate sites, was used to encode the fragment sequences. A matrix of m × w elements was used to represent each residue of a training dataset, where m is the window size and w comprises 21 elements: the 20 amino acid types plus one terminal signal. Each MDDLogo-identified substrate motif had a corresponding PWM with m × w elements, as illustrated in the figure, and an SVM classifier was trained from each PWM. The radial basis function (RBF) was used as the kernel function of the SVMs. LIBSVM can output a probability estimate ranging from 0 to 1 for each prediction. The probability estimates from the SVM classifiers, each trained with the PWM of a specific motif, were therefore combined to form the input vector for the second-layer SVM.
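The PWM encoding described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the ViralPhos implementation: the function names `build_pwm` and `encode`, the `-` symbol for the terminal signal, and the toy fragments are all hypothetical. The rule of replacing each residue by its relative frequency at that position is likewise one plausible reading of the SulfoSite-style encoding.

```python
from collections import Counter

# 20 standard amino acids plus '-' as the terminal signal (assumed symbol).
ALPHABET = "ACDEFGHIKLMNPQRSTVWY-"


def build_pwm(fragments):
    """Return an m x w matrix of relative frequencies.

    m is the window size (fragment length) and w is 21, one column per
    symbol in ALPHABET.
    """
    m = len(fragments[0])
    if any(len(f) != m for f in fragments):
        raise ValueError("all fragments must share the same window size")
    pwm = []
    for pos in range(m):
        # Count how often each symbol occurs at this window position.
        counts = Counter(f[pos] for f in fragments)
        pwm.append([counts[a] / len(fragments) for a in ALPHABET])
    return pwm


def encode(fragment, pwm):
    """Encode a fragment as the PWM frequency of its residue at each position.

    Assumed encoding rule; the resulting vector would be the input to the
    first-layer SVM trained for the motif that produced this PWM.
    """
    return [pwm[pos][ALPHABET.index(res)] for pos, res in enumerate(fragment)]


# Hypothetical fragments of one motif subgroup, window size m = 5,
# centred on the phosphorylated serine.
toy_fragments = ["-ASPK", "RKSPE", "RASPD", "GRSLK"]
toy_pwm = build_pwm(toy_fragments)
toy_vector = encode("RASPD", toy_pwm)
print(toy_vector)
```

In a two-layer setup of the kind described, one such PWM would be built per MDDLogo-identified motif, and the probability estimates returned by the per-motif classifiers would then be concatenated into the feature vector for the second-layer SVM.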