Appendix F - Appendix F to Subpart G of Part 1—List of Feature Keys Related to Protein Sequences

Source: World Intellectual Property Organization (WIPO) Handbook on Industrial Property Information and Documentation, Standard ST.25: Standard for the Presentation of Nucleotide and Amino Acid Sequence Listings in Patent Applications (2009).

Key: Description
CONFLICT: different papers report differing sequences.
VARIANT: authors report that sequence variants exist.
VARSPLIC: description of sequence variants produced by alternative splicing.
MUTAGEN: site which has been experimentally altered.
MOD_RES: post-translational modification of a residue.
ACETYLATION: N-terminal or other.
AMIDATION: generally at the C-terminal of a mature active peptide.
BLOCKED: undetermined N- or C-terminal blocking group.
FORMYLATION: of the N-terminal methionine.
GAMMA-CARBOXYGLUTAMIC ACID HYDROXYLATION: of asparagine, aspartic acid, proline, or lysine.
METHYLATION: generally of lysine or arginine.
PHOSPHORYLATION: of serine, threonine, tyrosine, aspartic acid or histidine.
PYRROLIDONE CARBOXYLIC ACID: N-terminal glutamate which has formed an internal cyclic lactam.
SULFATATION: generally of tyrosine.
LIPID: covalent binding of a lipidic moiety.
MYRISTATE: myristate group attached through an amide bond to the N-terminal glycine residue of the mature form of a protein or to an internal lysine residue.
PALMITATE: palmitate group attached through a thioether bond to a cysteine residue or through an ester bond to a serine or threonine residue.
FARNESYL: farnesyl group attached through a thioether bond to a cysteine residue.
GERANYL-GERANYL: geranyl-geranyl group attached through a thioether bond to a cysteine residue.
GPI-ANCHOR: glycosyl-phosphatidylinositol (GPI) group linked to the alpha-carboxyl group of the C-terminal residue of the mature form of a protein.
N-ACYL DIGLYCERIDE: N-terminal cysteine of the mature form of a prokaryotic lipoprotein with an amide-linked fatty acid and a glyceryl group to which two fatty acids are linked by ester linkages.
DISULFID: disulfide bond; the 'FROM' and 'TO' endpoints represent the two residues which are linked by an intra-chain disulfide bond; if the 'FROM' and 'TO' endpoints are identical, the disulfide bond is an interchain one and the description field indicates the nature of the cross-link.
THIOLEST: thiolester bond; the 'FROM' and 'TO' endpoints represent the two residues which are linked by the thiolester bond.
THIOETH: thioether bond; the 'FROM' and 'TO' endpoints represent the two residues which are linked by the thioether bond.
CARBOHYD: glycosylation site; the nature of the carbohydrate (if known) is given in the description field.
METAL: binding site for a metal ion; the description field indicates the nature of the metal.
BINDING: binding site for any chemical group (co-enzyme, prosthetic group, etc.); the chemical nature of the group is given in the description field.
SIGNAL: extent of a signal sequence (prepeptide).
TRANSIT: extent of a transit peptide (mitochondrial, chloroplastic, or for a microbody).
PROPEP: extent of a propeptide.
CHAIN: extent of a polypeptide chain in the mature protein.
PEPTIDE: extent of a released active peptide.
DOMAIN: extent of a domain of interest on the sequence; the nature of that domain is given in the description field.
CA_BIND: extent of a calcium-binding region.
DNA_BIND: extent of a DNA-binding region.
NP_BIND: extent of a nucleotide phosphate binding region; the nature of the nucleotide phosphate is indicated in the description field.
TRANSMEM: extent of a transmembrane region.
ZN_FING: extent of a zinc finger region.
SIMILAR: extent of a similarity with another protein sequence; precise information, relative to that sequence, is given in the description field.
REPEAT: extent of an internal sequence repetition.
HELIX: secondary structure: Helices, for example, Alpha-helix, 3(10) helix, or Pi-helix.
STRAND: secondary structure: Beta-strand, for example, Hydrogen bonded beta-strand, or Residue in an isolated beta-bridge.
TURN: secondary structure: Turns, for example, H-bonded turn (3-turn, 4-turn, or 5-turn).
ACT_SITE: amino acid(s) involved in the activity of an enzyme.
SITE: any other interesting site on the sequence.
INIT_MET: the sequence is known to start with an initiator methionine.
NON_TER: the residue at an extremity of the sequence is not the terminal residue; if applied to position 1, this signifies that the first position is not the N-terminus of the complete molecule; if applied to the last position, it signifies that this position is not the C-terminus of the complete molecule; there is no description field for this key.
NON_CONS: non consecutive residues; indicates that two residues in a sequence are not consecutive and that there are a number of unsequenced residues between them.
UNSURE: uncertainties in the sequence; used to describe region(s) of a sequence for which the authors are unsure about the sequence assignment.
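
Illustrative note (not part of the regulation): two of the keys above carry positional rules that may be easier to follow as code. The Python sketch below is only an assumption-laden illustration. The `Feature` class and the functions `disulfide_kind` and `non_ter_meaning` are hypothetical names introduced here, endpoints are assumed to be 1-based residue positions, and the keys are written with a single underscore (for example NON_TER).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """Hypothetical container for one feature entry (not defined by the standard)."""
    key: str
    start: int            # 'FROM' endpoint, assumed 1-based
    end: int              # 'TO' endpoint, assumed 1-based
    description: str = ""


def disulfide_kind(feature: Feature) -> str:
    """Apply the DISULFID rule: identical endpoints mean an interchain bond."""
    if feature.key != "DISULFID":
        raise ValueError("expected a DISULFID feature")
    return "interchain" if feature.start == feature.end else "intra-chain"


def non_ter_meaning(feature: Feature, sequence_length: int) -> str:
    """Apply the NON_TER rule for position 1 and for the last position."""
    if feature.key != "NON_TER":
        raise ValueError("expected a NON_TER feature")
    if feature.start == 1:
        return "first position is not the N-terminus of the complete molecule"
    if feature.start == sequence_length:
        return "last position is not the C-terminus of the complete molecule"
    raise ValueError("NON_TER applies only to an extremity of the sequence")


if __name__ == "__main__":
    print(disulfide_kind(Feature("DISULFID", 26, 84)))
    print(disulfide_kind(Feature("DISULFID", 30, 30, "interchain cross-link")))
    print(non_ter_meaning(Feature("NON_TER", 1, 1), sequence_length=120))
```

The residue positions and the sequence length of 120 in the usage lines are made-up values chosen only to exercise each branch.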
[86 FR 57052, Oct. 14, 2021]