DNA and Protein Sequence Database Search
Version 1.8.2


SALSA is a program for sequence database searches based on sequence similarities. The scores calculated by SALSA are based on alignments with gaps, and are generally equal to or very near the scores that would result from an optimal local alignment procedure, as computed by the Smith-Waterman algorithm (1). SALSA is currently about as fast as FASTA (ktup=2) (28). Further details about SALSA can be found in our paper (26).
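To illustrate what an optimal local alignment score means, the following is a minimal sketch of the Smith-Waterman recurrence (1). It is not SALSA's algorithm or code. The function name and the scoring parameters (match, mismatch, and a linear gap penalty) are assumptions chosen for illustration only; a real protein search would normally use a substitution matrix and separate gap open and extension penalties.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the optimal local alignment score of sequences a and b.

    Illustrative scoring only: fixed match/mismatch values and a
    linear gap penalty (assumed, not taken from SALSA).
    """
    prev = [0] * (len(b) + 1)   # previous row of the score matrix
    best = 0                    # best score seen in any cell
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                  # start a new local alignment here
                prev[j - 1] + sub,  # align a[i-1] with b[j-1]
                prev[j] + gap,      # gap in b
                curr[j - 1] + gap,  # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("HEAGAWGHEE", "PAWHEAE"))
```

This exhaustive dynamic programming takes time proportional to the product of the two sequence lengths, which is why heuristic programs that approximate its scores are attractive for searching large databases.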

This is an experimental service, free for academic use. Please be patient: some searches may take a long time to complete. At present, all results are returned on the web; e-mailing of results is planned.

The software has been developed by Torbjørn Rognes at the Department of Molecular Biology (Head: Prof. Erling Seeberg), Institute of Medical Microbiology, The National Hospital, University of Oslo, Norway.

Send your bug reports, suggestions, comments, etc to

The sequence data that is available for searching using this service may be incomplete and may contain errors. Please read the disclaimers from TIGR and Genome Therapeutics Corporation.

Thanks to all the genome project organisations for making their sequence data publicly available at an early stage.

This project is supported by the Research Council of Norway.


  1. Smith and Waterman (1981) Identification of Common Molecular Subsequences. J. Mol. Biol. 147, 195-197.
  2. Altschul et al. (1990) Basic Local Alignment Search Tool. J. Mol. Biol. 215, 403-410.
  3. Fleischmann et al. (1995) Whole-Genome Random Sequencing and Assembly of Haemophilus influenzae Rd. Science, 269, 496-512.
  4. Fraser et al. (1995) The Minimal Gene Complement of Mycoplasma genitalium. Science, 270, 397-403.
  5. Bult et al. (1996) Complete Genome Sequence of the Methanogenic Archaeon, Methanococcus jannaschii. Science, 273, 1058-1073.
  6. Kaneko et al. (1996) Sequence analysis of the genome of the unicellular Cyanobacterium Synechocystis sp. strain PCC 6803. II. Sequence determination of the entire genome and assignment of potential protein-coding regions. DNA Res., 3, 109-136.
  7. Himmelreich et al. (1996) Complete sequence analysis of the genome of the bacterium Mycoplasma pneumoniae. Nucleic Acids Res., 24, 4420-4449.
  8. Goffeau et al. (1996) Life with 6000 Genes. Science, 274, 546-567.
  9. Blattner et al. (1997) The Complete Genome Sequence of Escherichia coli K-12. Science, 277, 1453-1462.
  10. Kunst et al. (1997) The complete genome sequence of the Gram-positive bacterium Bacillus subtilis. Nature, 390, 249-256.
  11. B.A. Roe, S. Clifton and D.W. Dyer, preliminary data from The Gonococcal Genome Sequencing Project
  12. B.A. Roe, S. Clifton, Mike McShan and Joseph Ferretti, preliminary data from The Streptococcal Genome Sequencing Project
  13. B.A. Roe, D. Kupfer, S. Clifton, and Rolf Prade, preliminary data from The Aspergillus nidulans cDNA Sequencing Project
  14. Preliminary data from The Institute for Genomic Research (TIGR)
  15. Tomb et al. (1997) The complete genome sequence of the gastric pathogen Helicobacter pylori. Nature, 388, 539-547.
  16. Preliminary data from the Pseudomonas Genome project
  17. The Magpie genome sequencing project list
  18. Smith et al. (1997) Complete genome sequence of Methanobacterium thermoautotrophicum delta H: functional analysis and comparative genomics. J. Bacteriol., 179, 7135-7155.
  19. Preliminary data from Genome Therapeutics Corporation
  20. Klenk et al. (1997) The complete genome sequence of the hyperthermophilic, sulphate-reducing archaeon Archaeoglobus fulgidus. Nature, 390, 364-370.
  21. Fraser et al. (1997) Genomic sequence of a Lyme disease spirochaete, Borrelia burgdorferi. Nature, 390, 580-586.
  22. Deckert et al. (1998) The complete genome of the hyperthermophilic bacterium Aquifex aeolicus. Nature, 392, 353-358.
  23. Cole et al. (1998) Deciphering the biology of Mycobacterium tuberculosis from the complete genome sequence. Nature, 393, 537-544.
  24. Genome data from the National Center for Biotechnology Information (NCBI)
  25. The SWISS-PROT database
  26. Rognes T. and Seeberg E.C. (1998) SALSA: improved protein database searching by a new algorithm for assembly of sequence fragments into gapped alignments. Bioinformatics, 14 (10), 839-845.
  27. Kawarabayasi et al. (1998) Complete Sequence and Gene Organization of the Genome of a Hyper-thermophilic Archaebacterium, Pyrococcus horikoshii OT3. DNA Res., 5, 55-76.
  28. Pearson (1990) Rapid and sensitive sequence comparison with FASTP and FASTA. Methods in Enzymology, 183, 63-98.
  29. Fraser et al. (1998) Complete genome sequence of Treponema pallidum, the syphilis spirochete. Science, 281, 375-388.
  30. Stephens et al. (1998) Genome Sequence of an Obligate Intracellular Pathogen of Humans: Chlamydia trachomatis. Science, 282, 754-759.
  31. Andersson et al. (1998) The genome sequence of Rickettsia prowazekii and the origin of mitochondria. Nature, 396, 133-143.

Copyright © 1999 Torbjørn Rognes. Last updated 1999-02-11.