The University of Texas Medical Branch
Department of Biochemistry and Molecular Biology | Sealy Center for Structural Biology | Computational Biology

SDAP - Structural Database of Allergenic Proteins
Last Updated: February 25, 2013  
Access to SDAP is available free of charge for Academic and non-profit use.
Licenses for commercial use can be obtained by contacting W. Braun (webraun@utmb.edu).
Secure access to SDAP is available from https://fermi.utmb.edu/SDAP

 

AllerML - Markup Language for Allergens

 

The Allergen Markup Language, AllerML, that we describe here is a first step in developing automated tools to access data on allergens in multiple databases. AllerML is based on IUIS nomenclature and consists of a hierarchical set of tags that describe the most important information normally contained in allergen databases, including common names, sources, sequence, structure, IgE and T-cell epitopes, and cross-reactivity. AllerML also includes tags for attributes, such as Pfam classifications, that link allergen-specific databases to other general purpose biological data sets. In its current form, AllerML can be used to automate the dynamic exchange of information on allergens, to incorporate data on new allergenic proteins as they are identified, and to support computational and bioinformatics studies of allergenicity and clinically significant cross-reactivity. Wide implementation of AllerML will simplify automatic exchange of data between allergen databases, and improve data access for integrated computational and bioinformatics analysis.

 

AllerML Tags

 

The AllerML tags proposed here encode all molecular information on allergens and IgE epitopes, as present in the major allergen databases. For each allergen in SDAP, the AllerML record can be obtained from the link “Translate to AllerML”, located immediately below the title line with the allergen name on the SDAP page for that allergen.

 

Table 1. AllerML Tags (only the start tag is shown for each section)

Tag                                   Description
------------------------------------  ------------------------------------------------------------
<AllerML_Allergen>                    Allergen section
<AllerML_Allergen_Name>               Allergen name
<AllerML_Allergen_SDAP_ID>            Allergen SDAP ID
<AllerML_Isoallergen>                 Isoallergen section
<AllerML_Isoallergen_Name>            Isoallergen name
<AllerML_Isoallergen_SDAP_ID>         Isoallergen SDAP ID
<AllerML_Allergen_Type>               Allergen type: IUIS or non-IUIS
<AllerML_Organism>                    Allergen organism
<AllerML_Systematic_Name>             Systematic name
<AllerML_Taxonomy_ID>                 Taxonomy ID
<AllerML_Common_Name>                 Common name
<AllerML_Taxonomy>                    Taxonomy
<AllerML_Comment>                     Comment
<AllerML_Protein>                     Protein section
<AllerML_Protein_Source>              Protein source: UniProt, GenBank, PubMed, or DOI
<AllerML_UniProt>                     UniProt section
<AllerML_UniProt_ID>                  UniProt ID
<AllerML_UniProt_Accession>           UniProt accession number
<AllerML_GenBank>                     GenBank section
<AllerML_GenBank_Locus>               GenBank locus
<AllerML_GenBank_Accession>           GenBank accession number
<AllerML_GenBank_Version>             GenBank version
<AllerML_GenBank_GI>                  GenBank GI
<AllerML_Protein_Length>              Protein length
<AllerML_Protein_Sequence>            Protein sequence
<AllerML_PDBML>                       PDBML section
<AllerML_Epitopes>                    Epitopes section
<AllerML_Epitope_Set>                 Section for an epitope set, normally from a single publication
<AllerML_Epitope_Type>                Epitope type: IgE, T-cell
<AllerML_Epitope_Position>            Epitope position
<AllerML_Epitope_Sequence>            Epitope sequence
<AllerML_Epitope_Comment>             Epitope comment
<AllerML_MotifMate_Motifs>            Section for a collection of MotifMate motifs
<AllerML_MotifMate_Motif>             MotifMate motif
<AllerML_MotifMate_Motif_Position>    MotifMate motif position
<AllerML_MotifMate_Motif_Sequence>    MotifMate motif sequence
<AllerML_Cross-references>            Cross-references to other databases
<AllerML_EMBL>                        EMBL (http://www.ebi.ac.uk/)
<AllerML_GenBank>                     GenBank (http://www.ncbi.nlm.nih.gov)
<AllerML_DDBJ>                        DDBJ (DNA Data Bank of Japan, http://www.ddbj.nig.ac.jp/)
<AllerML_HSSP>                        HSSP (http://srs.ebi.ac.uk/srsbin/cgi-bin/wgetz?-page+LibInfo+-lib+HSSP)
<AllerML_PDB>                         PDB (http://www.pdb.org)
<AllerML_IntAct>                      IntAct (http://www.ebi.ac.uk/intact/site/index.jsf)
<AllerML_GO>                          GO (http://www.geneontology.org/)
<AllerML_InterPro>                    InterPro (http://www.ebi.ac.uk/interpro/)
<AllerML_Gene3D>                      Gene3D (http://gene3d.biochem.ucl.ac.uk/Gene3D/)
<AllerML_Pfam>                        Pfam (http://pfam.sanger.ac.uk/)
<AllerML_PRINTS>                      PRINTS (http://www.bioinf.manchester.ac.uk/dbbrowser/PRINTS)
<AllerML_PROSITE>                     PROSITE (http://www.expasy.org/prosite/)
<AllerML_References>                  Set of references
<AllerML_Reference>                   Reference
<AllerML_Reference_PubMed>            PubMed ID
<AllerML_Reference_Authors>           Set of authors
<AllerML_Author>                      Author
<AllerML_Reference_Journal>           Journal
<AllerML_Reference_Title>             Title
<AllerML_Reference_Year>              Year
<AllerML_Reference_Volume>            Volume
<AllerML_Reference_Pages>             Pages

 

 

Allergen Name and Taxonomy

 

An example of the implementation of AllerML for Ara h 3 is shown in Scheme 1. The first part of the record indicates the IUIS allergen name and type (IUIS or non-IUIS). Other unique identifiers from each allergen database can be included (if available) using appropriate tags. The SDAP ID is given here as an example. The second part of the record contains source organism information including the accession number from the NCBI taxonomy database (http://www.ncbi.nlm.nih.gov/Taxonomy/). Comments may be included in any AllerML section under the tag <AllerML_Comment>, and may contain HTML tags used to format the text for display or to include other HTML elements such as tables or hyperlinks.

 

Scheme 1. Core information of Ara h 3 as an AllerML document

<AllerML_Allergen>

 <AllerML_Allergen_Name>Ara h 3</AllerML_Allergen_Name>

 <AllerML_Allergen_SDAP_ID>326</AllerML_Allergen_SDAP_ID>

 <AllerML_Allergen_Type>IUIS</AllerML_Allergen_Type>

 <AllerML_Organism>

  <AllerML_Systematic_Name>Arachis hypogaea</AllerML_Systematic_Name>

  <AllerML_Taxonomy_ID>3818</AllerML_Taxonomy_ID>

  <AllerML_Common_Name>peanut</AllerML_Common_Name>

  <AllerML_Taxonomy>Eukaryota; Viridiplantae; Streptophyta; Embryophyta;

  Tracheophyta; Spermatophyta; Magnoliophyta; eudicotyledons;

  core eudicotyledons; rosids; eurosids I; Fabales; Fabaceae;

  Papilionoideae; Aeschynomeneae; Arachis

  </AllerML_Taxonomy>

 </AllerML_Organism>

 <AllerML_Comment>glycinin</AllerML_Comment>

 <AllerML_References>

  <AllerML_Reference>

   <AllerML_Reference_PubMed>10021462</AllerML_Reference_PubMed>

  </AllerML_Reference>

  <AllerML_Reference>

   <AllerML_Reference_PubMed>11146387</AllerML_Reference_PubMed>

  </AllerML_Reference>

 </AllerML_References>

...

</AllerML_Allergen>
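
As an illustration of how a record like the one in Scheme 1 might be consumed by a program, the following is a minimal sketch using Python's standard xml.etree.ElementTree module. The function name read_core and the dictionary keys are hypothetical choices made for this example, not part of AllerML, and the embedded record is an abbreviated copy of Scheme 1 (the taxonomy lineage is omitted).

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the Scheme 1 record (taxonomy lineage omitted).
RECORD = """
<AllerML_Allergen>
 <AllerML_Allergen_Name>Ara h 3</AllerML_Allergen_Name>
 <AllerML_Allergen_SDAP_ID>326</AllerML_Allergen_SDAP_ID>
 <AllerML_Allergen_Type>IUIS</AllerML_Allergen_Type>
 <AllerML_Organism>
  <AllerML_Systematic_Name>Arachis hypogaea</AllerML_Systematic_Name>
  <AllerML_Taxonomy_ID>3818</AllerML_Taxonomy_ID>
  <AllerML_Common_Name>peanut</AllerML_Common_Name>
 </AllerML_Organism>
 <AllerML_Comment>glycinin</AllerML_Comment>
 <AllerML_References>
  <AllerML_Reference>
   <AllerML_Reference_PubMed>10021462</AllerML_Reference_PubMed>
  </AllerML_Reference>
  <AllerML_Reference>
   <AllerML_Reference_PubMed>11146387</AllerML_Reference_PubMed>
  </AllerML_Reference>
 </AllerML_References>
</AllerML_Allergen>
"""


def read_core(xml_text):
    """Return the core fields of one <AllerML_Allergen> record as a dict.

    Hypothetical helper: the key names are chosen for this example only.
    """
    root = ET.fromstring(xml_text)
    organism = root.find("AllerML_Organism")
    return {
        "name": root.findtext("AllerML_Allergen_Name"),
        "sdap_id": root.findtext("AllerML_Allergen_SDAP_ID"),
        "type": root.findtext("AllerML_Allergen_Type"),
        "systematic_name": organism.findtext("AllerML_Systematic_Name"),
        "taxonomy_id": organism.findtext("AllerML_Taxonomy_ID"),
        # Collect every PubMed ID, wherever it is nested in the record.
        "pubmed_ids": [e.text for e in root.iter("AllerML_Reference_PubMed")],
    }


core = read_core(RECORD)
print(core["name"], core["taxonomy_id"], core["pubmed_ids"])
```

All values are kept as strings here; a consumer that needs numeric identifiers would convert them explicitly.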

 

Cross-references to Other Web Databases

 

Other protein databases also contain relevant information about allergenic proteins, and we propose the following set of AllerML tags to link to the most relevant of these services:

  • EMBL (http://www.ebi.ac.uk/), tag <AllerML_EMBL>
  • GenBank (http://www.ncbi.nlm.nih.gov), tag <AllerML_GenBank>
  • DDBJ (DNA Data Bank of Japan, http://www.ddbj.nig.ac.jp/), tag <AllerML_DDBJ>
  • HSSP (http://srs.ebi.ac.uk/srsbin/cgi-bin/wgetz?-page+LibInfo+-lib+HSSP), tag <AllerML_HSSP>
  • PDB (http://www.pdb.org), tag <AllerML_PDB>
  • IntAct (http://www.ebi.ac.uk/intact/site/index.jsf), tag <AllerML_IntAct>
  • GO (http://www.geneontology.org/), tag <AllerML_GO>
  • InterPro (http://www.ebi.ac.uk/interpro/), tag <AllerML_InterPro>
  • Gene3D (http://gene3d.biochem.ucl.ac.uk/Gene3D/), tag <AllerML_Gene3D>
  • Pfam (http://pfam.sanger.ac.uk/), tag <AllerML_Pfam>
  • PRINTS (http://www.bioinf.manchester.ac.uk/dbbrowser/PRINTS), tag <AllerML_PRINTS>
  • PROSITE (http://www.expasy.org/prosite/), tag <AllerML_PROSITE>

 

 

Scheme 2. Cross-references section of Ara h 3

<AllerML_Cross-references>

 <AllerML_Protein_Source>UniProt</AllerML_Protein_Source>

 <AllerML_UniProt>

  <AllerML_UniProt_ID>O82580_ARAHY</AllerML_UniProt_ID>

  <AllerML_UniProt_Accession>O82580</AllerML_UniProt_Accession>

 </AllerML_UniProt>

 <AllerML_EMBL>AF093541</AllerML_EMBL>

 <AllerML_EMBL>AAC63045.1</AllerML_EMBL>

 <AllerML_GenBank>AF093541</AllerML_GenBank>

 <AllerML_DDBJ>AF093541</AllerML_DDBJ>

 <AllerML_HSSP>P04776</AllerML_HSSP>

 <AllerML_PDB>1UD1</AllerML_PDB>

 <AllerML_IntAct>O82580</AllerML_IntAct>

 <AllerML_GO>0045735</AllerML_GO>

 <AllerML_InterPro>IPR006045</AllerML_InterPro>

 <AllerML_InterPro>IPR014710</AllerML_InterPro>

 <AllerML_InterPro>IPR006044</AllerML_InterPro>

 <AllerML_Gene3D>G3DSA:2.60.120.10</AllerML_Gene3D>

 <AllerML_Pfam>PF00190</AllerML_Pfam>

 <AllerML_PRINTS>PR00439</AllerML_PRINTS>

 <AllerML_PROSITE>PS00305</AllerML_PROSITE>

</AllerML_Cross-references>
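
Because a database tag such as <AllerML_EMBL> or <AllerML_InterPro> can occur more than once, a consumer will probably want to group the identifiers by database. The sketch below shows one way to do that with Python's standard library; the function name collect_xrefs and the decision to skip <AllerML_Protein_Source> and container elements are assumptions made for this example. The embedded record is a shortened copy of Scheme 2.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Shortened copy of the Scheme 2 record.
XREFS = """
<AllerML_Cross-references>
 <AllerML_Protein_Source>UniProt</AllerML_Protein_Source>
 <AllerML_UniProt>
  <AllerML_UniProt_ID>O82580_ARAHY</AllerML_UniProt_ID>
  <AllerML_UniProt_Accession>O82580</AllerML_UniProt_Accession>
 </AllerML_UniProt>
 <AllerML_EMBL>AF093541</AllerML_EMBL>
 <AllerML_EMBL>AAC63045.1</AllerML_EMBL>
 <AllerML_PDB>1UD1</AllerML_PDB>
 <AllerML_InterPro>IPR006045</AllerML_InterPro>
 <AllerML_InterPro>IPR014710</AllerML_InterPro>
 <AllerML_Pfam>PF00190</AllerML_Pfam>
</AllerML_Cross-references>
"""

PREFIX = "AllerML_"
# Assumption for this sketch: the source marker is not itself a cross-reference.
SKIPPED_TAGS = {"AllerML_Protein_Source"}


def collect_xrefs(xml_text):
    """Group leaf cross-reference identifiers by database name."""
    root = ET.fromstring(xml_text)
    xrefs = defaultdict(list)
    for child in root:
        if child.tag in SKIPPED_TAGS:
            continue
        # Container elements such as <AllerML_UniProt> have children; skip them.
        if len(child) > 0 or not (child.text and child.text.strip()):
            continue
        name = child.tag[len(PREFIX):] if child.tag.startswith(PREFIX) else child.tag
        xrefs[name].append(child.text.strip())
    return dict(xrefs)


xrefs = collect_xrefs(XREFS)
for database, identifiers in xrefs.items():
    print(database, identifiers)
```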

 

Protein Sequence

 

Scheme 3. Protein section of Ara h 3

<AllerML_Protein>

 <AllerML_Protein_Source>UniProt</AllerML_Protein_Source>

 <AllerML_UniProt>

  <AllerML_UniProt_ID>O82580_ARAHY</AllerML_UniProt_ID>

  <AllerML_UniProt_Accession>O82580</AllerML_UniProt_Accession>

 </AllerML_UniProt>

 <AllerML_Protein_Source>GenBank</AllerML_Protein_Source>

 <AllerML_GenBank>

  <AllerML_GenBank_Locus>AAC63045</AllerML_GenBank_Locus>

  <AllerML_GenBank_Accession>AAC63045</AllerML_GenBank_Accession>

  <AllerML_GenBank_Version>AAC63045.1</AllerML_GenBank_Version>

  <AllerML_GenBank_GI>3703107</AllerML_GenBank_GI>

  <AllerML_Protein_Length>510</AllerML_Protein_Length>

  <AllerML_Protein_Sequence>

  ISFRQQPEENACQFQRLNAQRPDNRIESEGGYIETWNPNNQEFECAGVAL

  SRLVLRRNALRRPFYSNAPQEIFIQQGRGYFGLIFPGCPRHYEEPHTQGR

  RSQSQRPPRRLQGEDQSQQQRDSHQKVHRFDEGDLIAVPTGVAFWLYNDH

  DTDVVAVSLTDTNNNDNQLDQFPRRFNLAGNTEQEFLRYQQQSRQSRRRS

  LPYSPYSPQSQPRQEEREFSPRGQHSRRERAGQEEENEGGNIFSGFTPEF

  LEQAFQVDDRQIVQNLRGETESEEEGAIVTVRGGLRILSPDRKRRADEEE

  EYDEDEYEYDEEDRRRGRGSRGRGNGIEETICTASAKKNIGRNRSPDIYN

  PQAGSLKTANDLNLLILRWLGPSAEYGNLYRNALFVAHYNTNAHSIIYRL

  RGRAHVQVVDSNGNRVYDEELQEGHVLVVPQNFAVAGKSQSENFEYVAFK

  TDSRPSIANLAGENSVIDNLPEEVVANSYGLQREQARQLKNNNPFKFFVP

  PSQQSPRAVA

  </AllerML_Protein_Sequence>

 </AllerML_GenBank>

</AllerML_Protein>
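
The sequence in Scheme 3 is wrapped over several lines, so a program reading it would need to strip the whitespace before use; it can then compare the result with <AllerML_Protein_Length> as a simple consistency check. The following is a minimal sketch of that idea. The function name sequence_and_length is hypothetical, and the sample record is a deliberately truncated, made-up fragment (the first 20 residues with a matching length of 20), not the real Ara h 3 entry.

```python
import xml.etree.ElementTree as ET

# Truncated, illustrative fragment: NOT the full Ara h 3 record.
SAMPLE = """
<AllerML_Protein>
 <AllerML_GenBank>
  <AllerML_Protein_Length>20</AllerML_Protein_Length>
  <AllerML_Protein_Sequence>
  ISFRQQPEEN
  ACQFQRLNAQ
  </AllerML_Protein_Sequence>
 </AllerML_GenBank>
</AllerML_Protein>
"""


def sequence_and_length(xml_text):
    """Return (sequence without whitespace, declared length) for a protein section."""
    root = ET.fromstring(xml_text)
    declared = int(root.findtext(".//AllerML_Protein_Length"))
    raw = root.findtext(".//AllerML_Protein_Sequence")
    # split() with no argument drops spaces and line breaks alike.
    sequence = "".join(raw.split())
    return sequence, declared


sequence, declared = sequence_and_length(SAMPLE)
print(sequence, len(sequence) == declared)
```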

 

Epitopes Section

 

Scheme 4. Epitope section of Ara h 3

<AllerML_Epitopes>

 <AllerML_Epitope_Set>

  <AllerML_Epitope_Type>IgE</AllerML_Epitope_Type>

  <AllerML_Epitope_Position>33-47</AllerML_Epitope_Position>

  <AllerML_Epitope_Sequence>IETWNPNNQEFECAG</AllerML_Epitope_Sequence>

  <AllerML_Epitope_Comment>

   When altered, amino acids shown in red lead to a decrease of IgE binding: IETWN<B><FONT COLOR="RED">PN</FONT></B>NQEFECAG

  </AllerML_Epitope_Comment>

  <AllerML_Epitope_Position>237-251</AllerML_Epitope_Position>

  <AllerML_Epitope_Sequence>GNIFSGFTPEFLEQA</AllerML_Epitope_Sequence>

  <AllerML_Epitope_Comment>

   When altered, amino acids shown in red lead to a decrease of IgE binding: GNI<font color="RED">F</font>SG<font color="RED">F</font>TPE<font color="RED">FL</font>EQA

  </AllerML_Epitope_Comment>

  <AllerML_Epitope_Position>276-290</AllerML_Epitope_Position>

  <AllerML_Epitope_Sequence>VTVRGGLRILSPDRK</AllerML_Epitope_Sequence>

  <AllerML_Epitope_Comment>

   When altered, amino acids shown in red lead to a decrease of IgE binding: VTVRGG<font color="RED">L</font>R<font color="RED">IL</font>S<font color="RED">P</font>DRK

  </AllerML_Epitope_Comment>

  <AllerML_Epitope_Position>303-317</AllerML_Epitope_Position>

  <AllerML_Epitope_Sequence>DEDEYEYDEEDRRRG</AllerML_Epitope_Sequence>

  <AllerML_Epitope_Comment>

   When altered, amino acids shown in red lead to a decrease of IgE binding: DEDEY<font color="RED">EYDE</font>E<font color="RED">DR</font>RRG

  </AllerML_Epitope_Comment>

  <AllerML_Epitope_Reference>

   <AllerML_Reference_PubMed>10021462</AllerML_Reference_PubMed>

  </AllerML_Epitope_Reference>

 </AllerML_Epitope_Set>

 

</AllerML_Epitopes>
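
In Scheme 4 the position, sequence, and comment of each epitope appear as consecutive sibling elements rather than inside a per-epitope wrapper, so a reader has to regroup them by order. The sketch below assumes that every epitope starts with an <AllerML_Epitope_Position> element of the form start-end; that convention is inferred from the scheme, not stated as a rule. The function name read_epitopes is hypothetical, and the comment text in the sample is shortened for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened sample modeled on Scheme 4; the comment text is abbreviated.
EPITOPES = """
<AllerML_Epitopes>
 <AllerML_Epitope_Set>
  <AllerML_Epitope_Type>IgE</AllerML_Epitope_Type>
  <AllerML_Epitope_Position>33-47</AllerML_Epitope_Position>
  <AllerML_Epitope_Sequence>IETWNPNNQEFECAG</AllerML_Epitope_Sequence>
  <AllerML_Epitope_Comment>Key residues: IETWN<B>PN</B>NQEFECAG</AllerML_Epitope_Comment>
  <AllerML_Epitope_Position>237-251</AllerML_Epitope_Position>
  <AllerML_Epitope_Sequence>GNIFSGFTPEFLEQA</AllerML_Epitope_Sequence>
 </AllerML_Epitope_Set>
</AllerML_Epitopes>
"""


def read_epitopes(xml_text):
    """Regroup the flat Position/Sequence/Comment siblings into one dict per epitope."""
    root = ET.fromstring(xml_text)
    epitopes = []
    for epitope_set in root.iter("AllerML_Epitope_Set"):
        epitope_type = epitope_set.findtext("AllerML_Epitope_Type")
        current = None
        for child in epitope_set:
            if child.tag == "AllerML_Epitope_Position":
                # Assumed convention: a new epitope begins at each Position tag.
                start, end = (int(n) for n in child.text.split("-"))
                current = {"type": epitope_type, "start": start, "end": end,
                           "sequence": None, "comment": None}
                epitopes.append(current)
            elif current is not None and child.tag == "AllerML_Epitope_Sequence":
                current["sequence"] = child.text.strip()
            elif current is not None and child.tag == "AllerML_Epitope_Comment":
                # itertext() flattens embedded HTML formatting tags to plain text.
                current["comment"] = "".join(child.itertext()).strip()
    return epitopes


epitopes = read_epitopes(EPITOPES)
for epitope in epitopes:
    print(epitope["start"], epitope["end"], epitope["sequence"])
```

Embedded HTML in a comment is parsed as nested XML here, which only works when that markup is itself well-formed.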

 

 

Allergen INSCH Motifs

 

 

Scheme 5. AllerML encoding for the INSCH motif section of Ara h 3.

<AllerML_INSCH_Motif>

 <AllerML_INSCH_Motif_ID>INSCH011</AllerML_INSCH_Motif_ID>

 <AllerML_INSCH_Motif_Sequence>

  VNSLKLPQLQDLDLSAEYVVLYEGALLLPHYNSNAHSIVYVTRGEGRVQV

 </AllerML_INSCH_Motif_Sequence>

</AllerML_INSCH_Motif>

 

 

Allergen MotifMate Motifs

 

Scheme 6. AllerML encoding for the MotifMate motifs section of Ara h 3.

<AllerML_MotifMate_Motifs>

 <AllerML_MotifMate_Motif>

  <AllerML_MotifMate_Motif_Position>127-138</AllerML_MotifMate_Motif_Position>

  <AllerML_MotifMate_Motif_Sequence>FdeGDlIavPtG</AllerML_MotifMate_Motif_Sequence>

 </AllerML_MotifMate_Motif>

 <AllerML_MotifMate_Motif>

  <AllerML_MotifMate_Motif_Position>65-76</AllerML_MotifMate_Motif_Position>

  <AllerML_MotifMate_Motif_Sequence>ApqEiFIqqGrG</AllerML_MotifMate_Motif_Sequence>

 </AllerML_MotifMate_Motif>

 <AllerML_MotifMate_Motif>

  <AllerML_MotifMate_Motif_Position>173-178</AllerML_MotifMate_Motif_Position>

  <AllerML_MotifMate_Motif_Sequence>FnLaGN</AllerML_MotifMate_Motif_Sequence>

 </AllerML_MotifMate_Motif>

</AllerML_MotifMate_Motifs>

 

 

 

IgE Cross-reactive Peptides

 

Scheme 7. AllerML encoding of quantitative data regarding allergen cross-reactivity with peptides for Jun a 1

<AllerML_Epitope_Reference>

 <AllerML_Epitope_Sequence>MPRARYGL</AllerML_Epitope_Sequence>

 <AllerML_Epitope_PD>0</AllerML_Epitope_PD>

 <AllerML_Epitope_RWT>1</AllerML_Epitope_RWT>

</AllerML_Epitope_Reference>

<AllerML_Peptide>

 <AllerML_Peptide_Sequence>MPRARYGF</AllerML_Peptide_Sequence>

 <AllerML_Peptide_PD>1.19</AllerML_Peptide_PD>

 <AllerML_Peptide_RWT>0.74</AllerML_Peptide_RWT>

</AllerML_Peptide>

<AllerML_Peptide>

 <AllerML_Peptide_Sequence>MPRARYGM</AllerML_Peptide_Sequence>

 <AllerML_Peptide_PD>1.43</AllerML_Peptide_PD>

 <AllerML_Peptide_RWT>0.65</AllerML_Peptide_RWT>

</AllerML_Peptide>

<AllerML_Peptide>

 <AllerML_Peptide_Sequence>MPRARFGL</AllerML_Peptide_Sequence>

 <AllerML_Peptide_PD>1.59</AllerML_Peptide_PD>

 <AllerML_Peptide_RWT>1.08</AllerML_Peptide_RWT>

</AllerML_Peptide>
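
The fragment in Scheme 7 has several top-level elements and no single root, so a standard XML parser cannot read it directly. A minimal sketch of one workaround is shown below: the fragment is wrapped in a temporary root element (named wrapper here, an arbitrary choice), and each peptide is compared with the reference epitope to list the substituted positions. The function names are hypothetical, and the PD and RWT values are read as plain numbers without any interpretation.

```python
import xml.etree.ElementTree as ET

# Copy of the Scheme 7 fragment (note: no enclosing root element).
FRAGMENT = """
<AllerML_Epitope_Reference>
 <AllerML_Epitope_Sequence>MPRARYGL</AllerML_Epitope_Sequence>
 <AllerML_Epitope_PD>0</AllerML_Epitope_PD>
 <AllerML_Epitope_RWT>1</AllerML_Epitope_RWT>
</AllerML_Epitope_Reference>
<AllerML_Peptide>
 <AllerML_Peptide_Sequence>MPRARYGF</AllerML_Peptide_Sequence>
 <AllerML_Peptide_PD>1.19</AllerML_Peptide_PD>
 <AllerML_Peptide_RWT>0.74</AllerML_Peptide_RWT>
</AllerML_Peptide>
<AllerML_Peptide>
 <AllerML_Peptide_Sequence>MPRARYGM</AllerML_Peptide_Sequence>
 <AllerML_Peptide_PD>1.43</AllerML_Peptide_PD>
 <AllerML_Peptide_RWT>0.65</AllerML_Peptide_RWT>
</AllerML_Peptide>
<AllerML_Peptide>
 <AllerML_Peptide_Sequence>MPRARFGL</AllerML_Peptide_Sequence>
 <AllerML_Peptide_PD>1.59</AllerML_Peptide_PD>
 <AllerML_Peptide_RWT>1.08</AllerML_Peptide_RWT>
</AllerML_Peptide>
"""


def read_peptides(fragment):
    """Return (reference sequence, [(peptide sequence, PD, RWT), ...])."""
    # Wrap the rootless fragment so that it parses as one XML document.
    root = ET.fromstring("<wrapper>" + fragment + "</wrapper>")
    reference = root.findtext("AllerML_Epitope_Reference/AllerML_Epitope_Sequence")
    peptides = [
        (p.findtext("AllerML_Peptide_Sequence"),
         float(p.findtext("AllerML_Peptide_PD")),
         float(p.findtext("AllerML_Peptide_RWT")))
        for p in root.iter("AllerML_Peptide")
    ]
    return reference, peptides


def substitutions(reference, peptide):
    """List (1-based position, reference residue, peptide residue) where the two differ."""
    return [(i + 1, a, b)
            for i, (a, b) in enumerate(zip(reference, peptide)) if a != b]


reference, peptides = read_peptides(FRAGMENT)
for sequence, pd_value, rwt_value in peptides:
    print(sequence, pd_value, rwt_value, substitutions(reference, sequence))
```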

 

 

 

 


This site published by Ovidiu Ivanciuc
Copyright © 2001-2013 The University of Texas Medical Branch. Please review our privacy policy and Internet guidelines.