


BLAST Parameters for short query sequences

BLAST may not be the best tool for finding sequence similarities within very short fragments. If you want to try it anyway, reduce the word size to the minimum and raise the expectation value threshold. The minimal word sizes are -W 7 for blastn and -W 2 for blastx; for blastx, also lower the neighborhood word threshold score to -f 8 or below (this is only necessary for blastx). The expectation value should be -e 100 (in legacy blastall, lowercase -e is the expectation value, while uppercase -E is the gap extension cost). Yes, that’s no joke. When searching large databases like NT or NR, that many expected random hits have to be accepted. A lower E-value threshold can be used when only nearly exact matches are wanted.
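As a rough illustration of the settings above, here is a minimal Python sketch that assembles a legacy `blastall` command line for a short query. The function name `short_query_args` and the file and database names are hypothetical; the flags `-p`, `-i`, `-d`, `-W`, `-f` and `-e` are the legacy `blastall` options (note that in `blastall` the expectation value is the lowercase `-e`). The sketch only builds the argument list and does not run BLAST.

```python
def short_query_args(program, query_file, database):
    """Build a legacy blastall argument list tuned for very short queries.

    Only blastn and blastx are covered, because those are the two
    programs discussed in the text.
    """
    # Common part: program, query, database, and a permissive E-value.
    args = ["blastall", "-p", program, "-i", query_file, "-d", database,
            "-e", "100"]
    if program == "blastn":
        # Minimal nucleotide word size.
        args += ["-W", "7"]
    elif program == "blastx":
        # Minimal protein word size plus a lowered neighborhood
        # word threshold score.
        args += ["-W", "2", "-f", "8"]
    else:
        raise ValueError("only blastn and blastx are handled in this sketch")
    return args


if __name__ == "__main__":
    # query.fa and nt are placeholder names for illustration only.
    print(" ".join(short_query_args("blastn", "query.fa", "nt")))
```

The resulting list could be passed to `subprocess.run` on a machine where legacy BLAST is installed; newer BLAST+ installations use different option names, so the flags would need to be adapted there.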

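In practice, the main thing is to set -W to 7. I tried it: with a single 8 bp sequence, even one that occurs exactly in the database, the default parameters find nothing at all (the default blastn word size is 11, which is longer than the query itself). With -W 7 added, the hit is found.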
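The effect of the word size on an 8 bp query can be sketched with a toy example. This is a simplification that only models exact word seeding, not BLAST's actual implementation; the helper names and the sequences are made up for illustration.

```python
def seed_words(query, w):
    """Return every length-w word in the query (empty if the query is shorter than w)."""
    return [query[i:i + w] for i in range(len(query) - w + 1)]


def has_seed(query, subject, w):
    """True if at least one length-w word of the query occurs exactly in the subject."""
    return any(word in subject for word in seed_words(query, w))


if __name__ == "__main__":
    query = "ACGTTGCA"          # hypothetical 8 bp query
    subject = "GGGACGTTGCATTT"  # hypothetical database sequence containing the query
    print(has_seed(query, subject, 11))  # word size 11: the query yields no words
    print(has_seed(query, subject, 7))   # word size 7: two words are available
```

With a word size of 11 the 8 bp query produces no words at all, so nothing can seed a hit even though the query is present in the subject. With a word size of 7 the query yields two words, and either one can start a match.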


Last updated on December 31, 2020 by springwood

