The FASTA format for sequences.

by Jeng-Sheng Yeh, 5/14/2003
  1. A DNA sequence in FASTA format looks like this example:
    >gi|29826276|gb|AY274119.1| SARS coronavirus TOR2, complete genome
    CTACCCAGGAAAAGCCAACCAACCTCGATCTCTTGTAGATCTGTTCTCTAAACGAACTTTAAAATCTGTG
    TAGCTGTCGCTCGGCTGCATGCCTAGTGCACCTACGCAGTATAAACAATAATAAATTTTACTGTCGTTGA
    CAAGAAACGAGTAACTCGTCCCTCTTCTGCAGACTGCTTACGGTTTCGTCCGTGTTGCAGTCGATCATCA
    GCATACCTAGGTTTCGTCCGGGTGTGACCGAAAGGTAAGATGGAGAGCCTTGTTCTTGGTGTCAACGAGA
    AAACACACGTCCAACTCAGTTTGCCTGTCCTTCAGGTTAGAGACGTGCTAGTGCGTGGCTTCGGGGACTC
    TGTGGAAGAGGCCCTATCGGAGGCACGTGAACACCTCAAAAATGGCACTTGTGGTCTAGTAGAGCTGGAA
    AAAGGCGTACTGCCCCAGCTTGAACAGCCCTATGTGTTCATTAAACGTTCTGATGCCTTAAGCACCAATC
    
    
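    The first line starts with a '>' (greater-than sign) and holds the description of the sequence.
    The lines after it hold the sequence itself, written with the letters A, T, G, C, or U (U is treated as T):
    A: Adenine, T: Thymine, G: Guanine, C: Cytosine, U: Uracil
    These are the nucleotide bases that make up DNA and RNA sequences.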
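
    As a rough illustration of the layout described above, here is a minimal Python sketch that splits FASTA text into (header, sequence) pairs and treats U as T. The names read_fasta and normalize_dna are hypothetical helpers made up for this example; they do not come from any library.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA text lines."""
    header = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # blank lines carry no data
        if line.startswith(">"):
            # A '>' line starts a new record; emit the previous one first.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:].strip()
            chunks = []
        elif header is not None:
            # Sequence lines are joined into one string per record.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def normalize_dna(sequence):
    """Upper-case the sequence and replace U with T."""
    return sequence.upper().replace("U", "T")


# The first two sequence lines of the example above.
example = """>gi|29826276|gb|AY274119.1| SARS coronavirus TOR2, complete genome
CTACCCAGGAAAAGCCAACCAACCTCGATCTCTTGTAGATCTGTTCTCTAAACGAACTTTAAAATCTGTG
TAGCTGTCGCTCGGCTGCATGCCTAGTGCACCTACGCAGTATAAACAATAATAAATTTTACTGTCGTTGA
"""

header, sequence = next(read_fasta(example.splitlines()))
print(header)
print(normalize_dna(sequence)[:20])
```

    The sketch assumes the whole input fits in memory and that any text before the first '>' line can be ignored; a real file may need stricter checks.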

  2. An amino acid sequence in FASTA format looks like the example below:
    > pdb1l6z
    EVTIEAVPPQVAEDNNVLLLVHNLPLALGAFAWYKGNTTAIDKEIARFVP
    NSNMNFTGQAYSGREIIYSNGSLLFQMITMKDMGVYTLDMTDENYRRTQA
    TVRFHVHQPVTQPFLQVTNTTVKELDSVTLTCLSNDIGANIQWLFNSQSL
    QLTERMTLSQNNSILRIDPIKREDAGEYQCEISNPVSVRRSNSIKLDIIF
    DPS
    
    
    The first line also starts with a '>' (greater-than sign).
    The letters from the second line onward are one-letter amino acid codes, for example:
    A (Alanine, ala), C (Cysteine, cys), D (Aspartic acid, asp), etc.

    You can also refer to my Yes_Amino_HowTo page.

    Amino acid      3-letter  1-letter  Mol. weight (g/mol)  Side chain
    Alanine         ala       A          89.1                Non-polar
    Arginine        arg       R         174.2                Positive polar
    Asparagine      asn       N         132.1                Uncharged polar
    Aspartic acid   asp       D         133.1                Negative polar
    Cysteine        cys       C         121.2                Uncharged polar
    Glutamic acid   glu       E         147.1                Negative polar
    Glutamine       gln       Q         146.2                Uncharged polar
    Glycine         gly       G          75.1                Uncharged polar
    Histidine       his       H         155.2                Positive polar
    Isoleucine      ile       I         131.2                Non-polar
    Leucine         leu       L         131.2                Non-polar
    Lysine          lys       K         146.2                Positive polar
    Methionine      met       M         149.2                Non-polar
    Phenylalanine   phe       F         165.2                Non-polar
    Proline         pro       P         115.1                Non-polar
    Serine          ser       S         105.1                Uncharged polar
    Threonine       thr       T         119.1                Uncharged polar
    Tryptophan      trp       W         204.2                Non-polar
    Tyrosine        tyr       Y         181.2                Uncharged polar
    Valine          val       V         117.1                Non-polar
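
    To show one way the table above might be used, here is a small Python sketch that stores it as a dictionary keyed by one-letter code, then spells out a protein sequence in three-letter codes and counts residues per side-chain class. The names AMINO_ACIDS, three_letter and class_counts are hypothetical and chosen only for this example.

```python
from collections import Counter

# One-letter code -> (name, three-letter code, molecular weight, side-chain class),
# copied from the table above.
AMINO_ACIDS = {
    "A": ("Alanine", "ala", 89.1, "Non-polar"),
    "R": ("Arginine", "arg", 174.2, "Positive polar"),
    "N": ("Asparagine", "asn", 132.1, "Uncharged polar"),
    "D": ("Aspartic acid", "asp", 133.1, "Negative polar"),
    "C": ("Cysteine", "cys", 121.2, "Uncharged polar"),
    "E": ("Glutamic acid", "glu", 147.1, "Negative polar"),
    "Q": ("Glutamine", "gln", 146.2, "Uncharged polar"),
    "G": ("Glycine", "gly", 75.1, "Uncharged polar"),
    "H": ("Histidine", "his", 155.2, "Positive polar"),
    "I": ("Isoleucine", "ile", 131.2, "Non-polar"),
    "L": ("Leucine", "leu", 131.2, "Non-polar"),
    "K": ("Lysine", "lys", 146.2, "Positive polar"),
    "M": ("Methionine", "met", 149.2, "Non-polar"),
    "F": ("Phenylalanine", "phe", 165.2, "Non-polar"),
    "P": ("Proline", "pro", 115.1, "Non-polar"),
    "S": ("Serine", "ser", 105.1, "Uncharged polar"),
    "T": ("Threonine", "thr", 119.1, "Uncharged polar"),
    "W": ("Tryptophan", "trp", 204.2, "Non-polar"),
    "Y": ("Tyrosine", "tyr", 181.2, "Uncharged polar"),
    "V": ("Valine", "val", 117.1, "Non-polar"),
}


def three_letter(sequence):
    """Spell out a one-letter protein sequence using three-letter codes."""
    # Letters outside the 20 codes in the table raise KeyError.
    return "-".join(AMINO_ACIDS[aa][1] for aa in sequence.upper())


def class_counts(sequence):
    """Count how many residues fall into each side-chain class."""
    return Counter(AMINO_ACIDS[aa][3] for aa in sequence.upper())


# The last line of the pdb1l6z example above.
print(three_letter("DPS"))
print(class_counts("DPS"))
```

    Only the 20 amino acids listed in the table are handled; any other letter in a sequence is deliberately left as an error in this sketch.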