Class HgvsProtein


  • public class HgvsProtein
    extends Hgvs
    Coding change in HGVS notation (amino acid changes). References: http://www.hgvs.org/mutnomen/recs.html
    • Method Summary

      Modifier and Type Method Description
      protected java.lang.String aaCode​(char aa1Letter)  
      protected java.lang.String aaCode​(java.lang.String aa1Letter)
      Use one-letter / three-letter AA codes. Most of the time we want to convert to the three-letter code. HGVS: the three-letter amino acid code is preferred (see Discussion), with "*" designating a translation termination codon; for clarity this page describes changes using the three-letter amino acid code.
      protected java.lang.String del()
      Deletions remove one or more amino acid residues from the protein and are described using "del" after an indication of the first and last amino acid(s) deleted separated by a "_" (underscore).
      protected java.lang.String delins()
      Mixed variants. Deletion/insertions (indels) replace one or more amino acid residues with one or more other amino acid residues.
      protected java.lang.String dup()
      Duplications
      protected java.lang.String fs()
      Frame shifts are a special type of amino acid deletion/insertion affecting an amino acid between the first (initiation, ATG) and last codon (termination, stop), replacing the normal C-terminal sequence with one encoded by another reading frame (specified 2013-10-11).
      protected java.lang.String ins()
      Insertions. Insertions add one or more amino acid residues between two existing amino acids, where the insertion is not a copy of a sequence immediately 5'-flanking (see Duplication).
      protected boolean isDuplication()
      Is this variant a duplication? Reference: http://www.hgvs.org/mutnomen/disc.html#dupins "...the description "dup" (see Standards) may by definition only be used when the additional copy is directly 3'-flanking of the original copy (tandem duplication)".
      protected java.lang.String pos​(int codonNum)
      Protein position
      protected java.lang.String pos​(int start, int end)  
      protected java.lang.String pos​(Transcript tr, int codonNum)
      Protein position
      protected java.lang.String pos​(Transcript tr, int start, int end)
      Position string given two coordinates
      protected java.lang.String posDel()
      Position for deletions
      protected java.lang.String posDelIns()
      Position for 'delins'
      protected java.lang.String posDup()
      Position for 'duplications' (a special kind of insertion)
      protected java.lang.String posFs()
      Frame shifts ... are described using ...
      protected java.lang.String posIns()
      Position for insertions
      protected java.lang.String posSnpOrMnp()
      Position: SNP or MNP
      protected java.lang.String snpOrMnp()
      SNP or MNP changes
      java.lang.String toString()  
      protected java.lang.String translocation()
      Translocation nomenclature.
      protected java.lang.String typeOfReference()
      Return "p." string with/without transcript ID, according to user command line options.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
    • Field Detail

      • debug

        public static boolean debug
    • Constructor Detail

      • HgvsProtein

        public HgvsProtein​(VariantEffect variantEffect)
    • Method Detail

      • aaCode

        protected java.lang.String aaCode​(char aa1Letter)
      • aaCode

        protected java.lang.String aaCode​(java.lang.String aa1Letter)
        Use one-letter / three-letter AA codes. Most of the time we want to convert to the three-letter code. HGVS: the three-letter amino acid code is preferred (see Discussion), with "*" designating a translation termination codon; for clarity this page describes changes using the three-letter amino acid code.
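        The conversion described above can be sketched as a simple lookup table. This is a minimal illustration, not the SnpEff implementation: the class name AaCodeSketch and its methods are hypothetical, and mapping "*" to "Ter" is an assumption based on the p.Trp26Ter / p.Trp26* equivalence mentioned under del().

```java
import java.util.Map;

/** Hypothetical helper (not part of SnpEff): one-letter to three-letter amino acid codes. */
final class AaCodeSketch {
    // Standard 20 amino acids; '*' (translation termination) is mapped to "Ter" by assumption.
    private static final Map<Character, String> THREE_LETTER = Map.ofEntries(
        Map.entry('A', "Ala"), Map.entry('R', "Arg"), Map.entry('N', "Asn"), Map.entry('D', "Asp"),
        Map.entry('C', "Cys"), Map.entry('Q', "Gln"), Map.entry('E', "Glu"), Map.entry('G', "Gly"),
        Map.entry('H', "His"), Map.entry('I', "Ile"), Map.entry('L', "Leu"), Map.entry('K', "Lys"),
        Map.entry('M', "Met"), Map.entry('F', "Phe"), Map.entry('P', "Pro"), Map.entry('S', "Ser"),
        Map.entry('T', "Thr"), Map.entry('W', "Trp"), Map.entry('Y', "Tyr"), Map.entry('V', "Val"),
        Map.entry('*', "Ter"));

    /** Convert a single one-letter code (case-insensitive) to its three-letter code. */
    static String toThreeLetter(char aa) {
        String code = THREE_LETTER.get(Character.toUpperCase(aa));
        if (code == null) throw new IllegalArgumentException("Unknown amino acid: " + aa);
        return code;
    }

    /** Convert a string of one-letter codes, residue by residue. */
    static String toThreeLetter(String aas) {
        StringBuilder sb = new StringBuilder();
        for (char c : aas.toCharArray()) sb.append(toThreeLetter(c));
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toThreeLetter("QSK")); // prints GlnSerLys
        System.out.println(toThreeLetter('*'));   // prints Ter
    }
}
```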
      • del

        protected java.lang.String del()
        Deletions remove one or more amino acid residues from the protein and are described using "del" after an indication of the first and last amino acid(s) deleted, separated by a "_" (underscore). Deletions remove either a small internal segment of the protein (in-frame deletion), part of the N-terminus of the protein (initiation codon change) or the entire C-terminal part of the protein (nonsense change). A nonsense change is a special type of deletion removing the entire C-terminal part of a protein starting at the site of the variant (specified 2013-03-16).
        1) In-frame deletions are described using "del" after an indication of the first and last amino acid(s) deleted, separated by a "_" (underscore).
           p.Gln8del in the sequence MKMGHQQQCC denotes a Glutamine-8 (Gln, Q) deletion to MKMGHQQCC
           p.(Cys28_Met30del) denotes that neither RNA nor protein was analysed, but the predicted change is a deletion of three amino acids, from Cysteine-28 to Methionine-30
        2) Initiating methionine change (Met1) causing an N-terminal deletion (see Discussion, see Examples). NOTE: changes extending the N-terminal protein sequence are described as an extension.
           p.0 - no protein is produced (experimental data should be available). NOTE: this change is not described as p.Met1_Leu833del, i.e. as a deletion removing the entire protein coding sequence
           p.Met1? - denotes that amino acid Methionine-1 (translation initiation site) is changed and that it is unclear what the consequence of this change is
           p.Met1_Lys45del - a new translation initiation site is activated (at Met46)
        3) Nonsense variants are a special type of amino acid deletion removing the entire C-terminal part of a protein starting at the site of the variant. A nonsense change is described using the format p.Trp26Ter (alternatively p.Trp26*). The description does not include the deletion at protein level from the site of the change to the C-terminal end of the protein (stop codon), like p.Trp26_Leu833del (the deletion of amino acid residue Trp26 to the last amino acid of the protein, Leu833).
           p.(Trp26Ter) indicates that neither RNA nor protein was analysed, but amino acid Tryptophan-26 (Trp, W) is predicted to change to a stop codon (Ter) (alternatively p.(W26*) or p.(Trp26*))
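        The deletion formats quoted above can be reproduced with a few lines of string handling. The sketch below is only an illustration under stated assumptions: DeletionSketch and its methods are hypothetical names, positions are 1-based as in HGVS, and callers pass three-letter codes directly. It does not reflect how del() is implemented in this class.

```java
/** Hypothetical helper (not part of SnpEff): builds HGVS-style protein deletion strings. */
final class DeletionSketch {
    /** "p.Gln8del" for a single residue, "p.Cys28_Met30del" for a range (1-based, inclusive). */
    static String del(String firstAa3, int first, String lastAa3, int last) {
        if (first < 1 || last < first) throw new IllegalArgumentException("Invalid range");
        String pos = (first == last) ? firstAa3 + first : firstAa3 + first + "_" + lastAa3 + last;
        return "p." + pos + "del";
    }

    /** Wrap a description in parentheses to mark it as predicted, e.g. "p.(Cys28_Met30del)". */
    static String predicted(String hgvs) {
        if (!hgvs.startsWith("p.")) throw new IllegalArgumentException("Expected a p. description");
        return "p.(" + hgvs.substring(2) + ")";
    }

    /** Nonsense change at a position, e.g. "p.Trp26Ter". */
    static String nonsense(String aa3, int pos) {
        return "p." + aa3 + pos + "Ter";
    }

    /** Remove residues first..last (1-based, inclusive) from a one-letter protein sequence. */
    static String applyDeletion(String protein, int first, int last) {
        return protein.substring(0, first - 1) + protein.substring(last);
    }

    public static void main(String[] args) {
        System.out.println(del("Gln", 8, "Gln", 8));            // prints p.Gln8del
        System.out.println(applyDeletion("MKMGHQQQCC", 8, 8));  // prints MKMGHQQCC
        System.out.println(predicted(del("Cys", 28, "Met", 30))); // prints p.(Cys28_Met30del)
        System.out.println(nonsense("Trp", 26));                // prints p.Trp26Ter
    }
}
```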
      • delins

        protected java.lang.String delins()
        Mixed variants. Deletion/insertions (indels) replace one or more amino acid residues with one or more other amino acid residues. Deletion/insertions are described using "delins" as a deletion followed by an insertion after an indication of the amino acid(s) flanking the site of the deletion/insertion, separated by a "_" (underscore, see Discussion).
        Frame shifts are a special type of amino acid deletion/insertion affecting an amino acid between the first (initiation, ATG) and last codon (termination, stop), replacing the normal C-terminal sequence with one encoded by another reading frame (specified 2013-10-11). A frame shift is described using "fs" after the first amino acid affected by the change. Descriptions either use a short ("fs") or long ("fsTer#") description. The description of frame shifts does not include the deletion at protein level from the site of the frame shift to the natural end of the protein (stop codon). The inserted amino acid residues are not described; only the total length of the new shifted frame is given (i.e. including the first amino acid changed).
      • dup

        protected java.lang.String dup()
        Duplications
      • fs

        protected java.lang.String fs()
        Frame shifts are a special type of amino acid deletion/insertion affecting an amino acid between the first (initiation, ATG) and last codon (termination, stop), replacing the normal C-terminal sequence with one encoded by another reading frame (specified 2013-10-11). A frame shift is described using "fs" after the first amino acid affected by the change. Descriptions either use a short ("fs") or long ("fsTer#") description.
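        As a rough illustration of the short and long forms, the sketch below formats both. FrameShiftSketch is a hypothetical name, and the long form here assumes the HGVS convention of naming the new first residue before "fsTer#" (as in the HGVS example p.Arg97ProfsTer23); whether fs() in this class emits the short form, the long form, or either depending on options is not stated on this page.

```java
/** Hypothetical helper (not part of SnpEff): formats frame shift descriptions. */
final class FrameShiftSketch {
    /** Short form: "fs" after the first affected amino acid, e.g. "p.Arg97fs". */
    static String fsShort(String aa3, int pos) {
        return "p." + aa3 + pos + "fs";
    }

    /**
     * Long form (assumed convention): first affected residue, the new residue at that
     * position, then "fsTer" and the length of the shifted frame counted from the
     * first changed amino acid, e.g. "p.Arg97ProfsTer23".
     */
    static String fsLong(String aa3, int pos, String newAa3, int shiftedFrameLength) {
        if (shiftedFrameLength < 1) throw new IllegalArgumentException("Length must be >= 1");
        return "p." + aa3 + pos + newAa3 + "fsTer" + shiftedFrameLength;
    }

    public static void main(String[] args) {
        System.out.println(fsShort("Arg", 97));           // prints p.Arg97fs
        System.out.println(fsLong("Arg", 97, "Pro", 23)); // prints p.Arg97ProfsTer23
    }
}
```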
      • ins

        protected java.lang.String ins()
        Insertions. Insertions add one or more amino acid residues between two existing amino acids, where the insertion is not a copy of a sequence immediately 5'-flanking (see Duplication). Insertions are described using "ins" after an indication of the amino acids flanking the insertion site, separated by a "_" (underscore) and followed by a description of the amino acid(s) inserted. Since for large insertions the amino acids can be derived from the DNA and/or RNA descriptions, they need not be described exactly; the total number may be given instead (like "ins17").
        Examples:
        1) p.Lys2_Met3insGlnSerLys denotes that the sequence GlnSerLys (QSK) was inserted between amino acids Lysine-2 (Lys, K) and Methionine-3 (Met, M), changing MKMGHQQQCC to MKQSKMGHQQQCC
        2) p.Trp182_Gln183ins17 describes a variant that inserts 17 amino acids between amino acids Trp182 and Gln183. NOTE: it must be possible to deduce the 17 inserted amino acids from the description given at DNA or RNA level
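        The two insertion examples can be reproduced with the sketch below. It is illustrative only: InsertionSketch and its method names are hypothetical, the flanking residues are passed in as three-letter codes, and the right-hand flank is assumed to be at leftPos + 1.

```java
/** Hypothetical helper (not part of SnpEff): formats HGVS-style protein insertions. */
final class InsertionSketch {
    /** Explicit insertion, e.g. "p.Lys2_Met3insGlnSerLys". */
    static String ins(String leftAa3, int leftPos, String rightAa3, String inserted3) {
        return flank(leftAa3, leftPos, rightAa3) + "ins" + inserted3;
    }

    /** Large insertion given only by residue count, e.g. "p.Trp182_Gln183ins17". */
    static String insCount(String leftAa3, int leftPos, String rightAa3, int count) {
        if (count < 1) throw new IllegalArgumentException("Count must be >= 1");
        return flank(leftAa3, leftPos, rightAa3) + "ins" + count;
    }

    /** Flanking positions: the right-hand residue directly follows the left-hand one. */
    private static String flank(String leftAa3, int leftPos, String rightAa3) {
        return "p." + leftAa3 + leftPos + "_" + rightAa3 + (leftPos + 1);
    }

    /** Insert one-letter residues after 1-based position leftPos of a protein sequence. */
    static String applyInsertion(String protein, int leftPos, String inserted) {
        return protein.substring(0, leftPos) + inserted + protein.substring(leftPos);
    }

    public static void main(String[] args) {
        System.out.println(ins("Lys", 2, "Met", "GlnSerLys"));     // prints p.Lys2_Met3insGlnSerLys
        System.out.println(applyInsertion("MKMGHQQQCC", 2, "QSK")); // prints MKQSKMGHQQQCC
        System.out.println(insCount("Trp", 182, "Gln", 17));       // prints p.Trp182_Gln183ins17
    }
}
```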
      • isDuplication

        protected boolean isDuplication()
        Is this variant a duplication? Reference: http://www.hgvs.org/mutnomen/disc.html#dupins "...the description "dup" (see Standards) may by definition only be used when the additional copy is directly 3'-flanking of the original copy (tandem duplication)".
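        One way to read the tandem-duplication rule is as a comparison between the inserted residues and the residues immediately preceding the insertion point. The sketch below shows that check on plain strings; DuplicationSketch is a hypothetical name, and the actual isDuplication() works on the variant effect rather than raw sequences and may apply additional normalization.

```java
/** Hypothetical helper (not part of SnpEff): tandem duplication check on a protein sequence. */
final class DuplicationSketch {
    /**
     * True if 'inserted', placed after 1-based position leftPos, is an exact copy of the
     * residues ending at leftPos, i.e. the new copy directly follows the original copy.
     */
    static boolean isTandemDuplication(String protein, int leftPos, String inserted) {
        int len = inserted.length();
        if (len == 0 || leftPos < len || leftPos > protein.length()) return false;
        // Compare protein[leftPos - len, leftPos) against the inserted residues.
        return protein.regionMatches(leftPos - len, inserted, 0, len);
    }

    public static void main(String[] args) {
        // "GH" inserted after His-5 of MKMGHQQQCC copies residues 4-5: a duplication.
        System.out.println(isTandemDuplication("MKMGHQQQCC", 5, "GH"));  // prints true
        // "QSK" inserted after Lys-2 is not a copy of the preceding residues: an insertion.
        System.out.println(isTandemDuplication("MKMGHQQQCC", 2, "QSK")); // prints false
    }
}
```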
      • pos

        protected java.lang.String pos​(int codonNum)
        Protein position
      • pos

        protected java.lang.String pos​(int start,
                                       int end)
      • pos

        protected java.lang.String pos​(Transcript tr,
                                       int codonNum)
        Protein position
      • pos

        protected java.lang.String pos​(Transcript tr,
                                       int start,
                                       int end)
        Position string given two coordinates
      • posDel

        protected java.lang.String posDel()
        Position for deletions
      • posDelIns

        protected java.lang.String posDelIns()
        Position for 'delins'
      • posDup

        protected java.lang.String posDup()
        Position for 'duplications' (a special kind of insertion)
      • posFs

        protected java.lang.String posFs()
        Frame shifts ... are described using ... the change of the first amino acid affected ... the description does not include a description of the deletion from the site of the change
      • posIns

        protected java.lang.String posIns()
        Position for insertions
      • posSnpOrMnp

        protected java.lang.String posSnpOrMnp()
        Position: SNP or MNP
      • snpOrMnp

        protected java.lang.String snpOrMnp()
        SNP or MNP changes
      • toString

        public java.lang.String toString()
        Overrides:
        toString in class java.lang.Object
      • translocation

        protected java.lang.String translocation()
        Translocation nomenclature. From HGVS: Translocations at protein level occur when a translocation at DNA level leads to the production of a fusion protein, joining the N-terminal end of the protein on one chromosome to the C-terminal end of the protein on the other chromosome (and vice versa). No recommendations have been made so far to describe protein translocations.
        Example: t(X;17)(DMD:p.Met1_Val1506; SGCA:p.Val250_*387) describes a fusion protein resulting from a translocation between the chromosomes X and 17; the fusion protein contains an N-terminal segment of DMD (dystrophin, amino acids Methionine-1 to Valine-1506), and a C-terminal segment of SGCA (alpha-sarcoglycan, amino acids Valine-250 to the stop codon at 387).
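        The example string above follows a regular pattern that can be assembled from its parts. The sketch below only mirrors that one example; TranslocationSketch and its parameters are hypothetical, and since HGVS has no formal recommendation here, the layout should not be read as a standard or as the exact output of translocation().

```java
/** Hypothetical helper (not part of SnpEff): assembles a fusion-protein description. */
final class TranslocationSketch {
    /**
     * Build "t(chrA;chrB)(geneN:p.nRange; geneC:p.cRange)", where geneN contributes the
     * N-terminal segment and geneC the C-terminal segment of the fusion protein.
     */
    static String fusion(String chrA, String chrB,
                         String geneN, String nRange,
                         String geneC, String cRange) {
        return "t(" + chrA + ";" + chrB + ")("
            + geneN + ":p." + nRange + "; "
            + geneC + ":p." + cRange + ")";
    }

    public static void main(String[] args) {
        // prints t(X;17)(DMD:p.Met1_Val1506; SGCA:p.Val250_*387)
        System.out.println(fusion("X", "17", "DMD", "Met1_Val1506", "SGCA", "Val250_*387"));
    }
}
```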
      • typeOfReference

        protected java.lang.String typeOfReference()
        Return "p." string with/without transcript ID, according to user command line options.