WELCOME TO EnILs

Introduction

Interleukins (ILs) are a group of cytokines produced by many kinds of cells. They play important roles in transmitting information, activating and regulating immune cells, mediating the activation, proliferation and differentiation of T and B cells, and in inflammatory responses. A number of machine learning methods have been proposed to predict IL-inducing peptides, but their predictive performance leaves room for improvement, and the inducing peptides of each interleukin are predicted by separate models rather than by one general approach.

In this work, we combine statistical features with word embeddings of the peptide sequence to build a general ensemble model, EnILs, that predicts inducing peptides of IL-6, IL-10 and IL-17. The predicted probabilities of a random forest, eXtreme Gradient Boosting (XGBoost) and a neural network are integrated by averaging.
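
The averaging step can be illustrated with a minimal sketch. It assumes each of the three models has already produced a positive-class probability for every peptide. The function name `average_ensemble`, the 0.5 decision threshold and the example values are hypothetical and are not taken from EnILs itself.

```python
from statistics import fmean


def average_ensemble(prob_rf, prob_xgb, prob_nn, threshold=0.5):
    """Average per-peptide probabilities from three models (soft voting).

    The 0.5 threshold is an assumption for illustration; the threshold
    used by the service is not stated on this page.
    """
    if not (len(prob_rf) == len(prob_xgb) == len(prob_nn)):
        raise ValueError("all models must score the same peptides")
    # Unweighted mean of the three probabilities for each peptide
    mean_prob = [fmean(triple) for triple in zip(prob_rf, prob_xgb, prob_nn)]
    # A peptide is called "inducing" when the mean reaches the threshold
    labels = [int(p >= threshold) for p in mean_prob]
    return mean_prob, labels


# Made-up probabilities for four peptides (illustrative values only)
p_rf = [0.9, 0.2, 0.6, 0.4]
p_xgb = [0.8, 0.1, 0.7, 0.5]
p_nn = [0.7, 0.3, 0.5, 0.3]
mean_prob, labels = average_ensemble(p_rf, p_xgb, p_nn)
print(labels)  # [1, 0, 1, 0]
```

An unweighted mean gives each model the same influence on the final score, which matches the description of integrating the probabilities "in an average way".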

Service

Submit your data and contact information, and we will return the results to you by email.

  • Instructions for use
     1. Your email address is the only way we can reach you once your submitted task is complete, so please enter it carefully.
     2. Select the model that matches the inducing peptide you want to predict (IL-6, IL-10 or IL-17). Sample files for each model are provided here.
     3. Results are returned with the descriptor as the identifier, so please use a unique descriptor for each sequence where possible.
  • The data you submit should meet the following conditions.
     1. Word2vec features: submit the peptide sequences directly as a FASTA file. For the IL-6, IL-10 and IL-17 inducing peptide models, sequences should be no longer than 25, 42 and 30 amino acids, respectively.
     2. Statistical features: extract them with the Pfeature web server. For all three inducing peptide datasets we use 14 types of statistical features: AAC, DPC, ABC, RRI, DDOR, SE, SER, SEP, CTD, CeTD, PAAC, APAAC, QSO and SOCN. Concatenate the 14 feature types in this order to form a single CSV file.
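
Before submitting a FASTA file, it may help to check it locally against the conditions above. The following is a minimal sketch, not part of the service: the names `parse_fasta` and `check_submission` are hypothetical, and it assumes the length limits 25, 42 and 30 are inclusive upper bounds.

```python
# Maximum sequence lengths per model, taken from the conditions above.
# Treating them as inclusive upper bounds is an assumption.
MAX_LENGTH = {"IL-6": 25, "IL-10": 42, "IL-17": 30}


def parse_fasta(text):
    """Return a list of (descriptor, sequence) pairs from FASTA text."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is not None:
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def check_submission(text, model):
    """Return a list of problems; an empty list means no problem was found."""
    problems, seen = [], set()
    limit = MAX_LENGTH[model]
    for descriptor, seq in parse_fasta(text):
        # Results are keyed by descriptor, so duplicates would be ambiguous
        if descriptor in seen:
            problems.append(f"duplicate descriptor: {descriptor}")
        seen.add(descriptor)
        if len(seq) > limit:
            problems.append(f"{descriptor}: length {len(seq)} exceeds {limit}")
    return problems


# Made-up example: a repeated descriptor and one over-long sequence
example = ">pep1\nACDEFGHIK\n>pep1\nLMNPQRSTVWYACDEFGHIKLMNPQRST\n"
problems = check_submission(example, "IL-6")
```

Here `problems` reports both the repeated descriptor `pep1` and the second sequence, which is longer than the IL-6 limit.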

Dataset

All data used in this work were extracted from the Immune Epitope Database (IEDB). For the IL6 Dhall, IL10 Nagpal and IL17 Gupta datasets, sequences containing unnatural amino acids were removed and peptide lengths were restricted to 8-25, 8-42 and 5-30 residues, respectively. In addition, for each of the three datasets we extracted the data it did not use from the Immune Epitope Database to serve as an independent test set.
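
The filtering rules described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' preprocessing code: "natural amino acids" is taken to mean the 20 standard one-letter codes, the length ranges are treated as inclusive, and the names `keep_peptide` and `filter_peptides` are hypothetical.

```python
# Length ranges per dataset, as stated above (assumed inclusive)
LENGTH_RANGE = {"IL-6": (8, 25), "IL-10": (8, 42), "IL-17": (5, 30)}

# Assumption: "natural" means the 20 standard amino acids
NATURAL_AA = frozenset("ACDEFGHIKLMNPQRSTVWY")


def keep_peptide(seq, target):
    """True if seq has only standard residues and an allowed length."""
    lo, hi = LENGTH_RANGE[target]
    seq = seq.upper()
    return lo <= len(seq) <= hi and set(seq) <= NATURAL_AA


def filter_peptides(seqs, target):
    """Keep only the peptides that pass keep_peptide for the given target."""
    return [s for s in seqs if keep_peptide(s, target)]


# Made-up candidates: lengths 8, 5, 9 (contains X) and 26
candidates = ["ACDEFGHI", "ACDEF", "ACDEFGHIX", "ACDEFGHIKLMNPQRSTVWYACDEFG"]
il6_kept = filter_peptides(candidates, "IL-6")
il17_kept = filter_peptides(candidates, "IL-17")
```

With these candidates, only the 8-residue peptide passes the IL-6 rules, while the IL-17 rules also admit the 5-residue and 26-residue peptides. The peptide containing `X` is rejected in both cases.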

Contact Us

If you have any questions, please feel free to contact us.

Name:

Rui Su

Location:

No. 1, Linghai Road, Dalian City, Liaoning Province, China