What is a simple command-line tool for doing Needleman-Wunsch pairwise alignment?
Score: 3
I have two DNA strings:
GGAGGAGGGAGAAGGAGGGAGGGAAGAGGAGGGAGAAGGAGGGAGGC
and
AGAAGGAGGGAGGGAAGAGGAGGGAGAAGGAGGGAGGGAAGAGGAGG
I want a tool that allows me to do something like this on the command line:
$ aligner GGAGGAGGGAGAAGGAGGGAGGGAAGAGGAGGGAGAAGGAGGGAGGC AGAAGGAGGGAGGGAAGAGGAGGGAGAAGGAGGGAGGGAAGAGGAGG
and receive an ASCII visualization of a pairwise alignment. Something like this would work:
GGAGGAGGGAGAAGGAGGGAGGGAAGAGGAGGGAGAAGGAGGGAGGC---------
---------AGAAGGAGGGAGGGAAGAGGAGGGAGAAGGAGGGAGGGAAGAGGAGG
I would prefer a tool that is packaged on bioconda. A tool that also handles protein and RNA sequences would be even better.
sequence-alignment
edited 2 days ago
terdon♦
asked 2 days ago
winni2k
Is this a toy example? If so, maybe update it to another one. These sequences are low complexity, and the correct alignment output is unclear to me.
– Chris_Rands
2 days ago
The sequences are not important. Edits are welcome.
– winni2k
2 days ago
1
To get the alignment in your example, you have to use zero (or very small) gap cost at the ends of sequences. That is not the standard NW.
– user172818♦
2 days ago
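The point in the comment above about end gaps can be illustrated with a minimal sketch. It is a plain dynamic-programming implementation with a linear gap penalty, written for illustration only; the function name needleman_wunsch, the scoring values, and the free_end_gaps switch are assumptions, not the behaviour of any tool mentioned on this page. With free_end_gaps=False it scores a standard global alignment; with free_end_gaps=True, overhangs at either end cost nothing, which is what the alignment in the question requires.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2, free_end_gaps=False):
    """Return (score, aligned_a, aligned_b) for sequences a and b.

    free_end_gaps=False: standard global alignment with a linear gap penalty.
    free_end_gaps=True: gaps before the start or after the end of either
    sequence are free (an overlap, or semi-global, alignment).
    """
    n, m = len(a), len(b)
    end_gap = 0 if free_end_gaps else gap

    def sub(i, j):
        # Substitution score for aligning a[i-1] with b[j-1]
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * end_gap
    for j in range(1, m + 1):
        score[0][j] = j * end_gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # match or mismatch
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Choose where the traceback starts: the bottom-right corner for a
    # global alignment, or the best cell of the last row/column otherwise.
    i, j = n, m
    if free_end_gaps:
        ends = [(score[n][k], n, k) for k in range(m + 1)]
        ends += [(score[k][m], k, m) for k in range(n + 1)]
        _, i, j = max(ends)
    best = score[i][j]

    # Unaligned overhang after the chosen end cell (empty for global mode)
    tail_a = a[i:] + "-" * (m - j)
    tail_b = "-" * (n - i) + b[j:]

    # Trace back to the origin, collecting columns in reverse order
    rev_a, rev_b = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            rev_a.append(a[i - 1])
            rev_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + (end_gap if j == 0 else gap):
            rev_a.append(a[i - 1])
            rev_b.append("-")
            i -= 1
        else:
            rev_a.append("-")
            rev_b.append(b[j - 1])
            j -= 1

    aligned_a = "".join(reversed(rev_a)) + tail_a
    aligned_b = "".join(reversed(rev_b)) + tail_b
    return best, aligned_a, aligned_b
```

For example, needleman_wunsch("GATTACA", "TTAC", free_end_gaps=True) aligns TTAC inside GATTACA without charging for the overhangs, whereas the default global mode pays for every gap column.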
4 Answers
Score: 7 (accepted answer)
You are looking for the needle program from the EMBOSS suite, which is available in bioconda:
http://emboss.sourceforge.net/apps/release/6.6/emboss/apps/needle.html
To read sequences from the command line, you need to specify the format as asis. To get the output on the screen, you'll need -stdout, and to use the default alignment parameters (gap open penalty 10, gap extend penalty 0.5) without being prompted for them, you'll need -auto. Thus your query above would be:
$ needle -asequence asis:GGAGGAGGGAGAAGGAGGGAGGGAAGAGGAGGGAGAAGGAGGGAGGC -bsequence asis:AGAAGGAGGGAGGGAAGAGGAGGGAGAAGGAGGGAGGGAAGAGGAGG -auto -stdout
########################################
# Program: needle
# Rundate: Tue 27 Nov 2018 10:48:50
# Commandline: needle
# -asequence asis:GGAGGAGGGAGAAGGAGGGAGGGAAGAGGAGGGAGAAGGAGGGAGGC
# -bsequence asis:AGAAGGAGGGAGGGAAGAGGAGGGAGAAGGAGGGAGGGAAGAGGAGG
# -auto
# -stdout
# Align_format: srspair
# Report_file: stdout
########################################
#=======================================
#
# Aligned_sequences: 2
# 1: asis
# 2: asis
# Matrix: EDNAFULL
# Gap_penalty: 10.0
# Extend_penalty: 0.5
#
# Length: 56
# Identity: 37/56 (66.1%)
# Similarity: 37/56 (66.1%)
# Gaps: 18/56 (32.1%)
# Score: 181.0
#
#
#=======================================
asis 1 GGAGGAGGGAGAAGGAGGGAGGGAAGAGGAGGGAGAAGGAGGGAGGC--- 47
                |||||||||||||||||||||||||||||||||||||.
asis 1 ---------AGAAGGAGGGAGGGAAGAGGAGGGAGAAGGAGGGAGGGAAG 41

asis 47 ------ 47
asis 42 AGGAGG 47
#---------------------------------------
#---------------------------------------
edited yesterday
answered 2 days ago
Ian Sudbery
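To get closer to the aligner SEQ1 SEQ2 interface asked for in the question, the needle call from the answer above could be wrapped in a few lines of Python. This is only a sketch: the function names needle_argv and run_needle are hypothetical, and the only needle options used are the ones already shown in the answer.

```python
import shutil
import subprocess


def needle_argv(seq_a, seq_b):
    """Build the needle command line from the answer for two raw sequences."""
    return [
        "needle",
        "-asequence", f"asis:{seq_a}",
        "-bsequence", f"asis:{seq_b}",
        "-auto",
        "-stdout",
    ]


def run_needle(seq_a, seq_b):
    """Run needle and return its report as text.

    Raises FileNotFoundError if needle is not on PATH and
    subprocess.CalledProcessError if needle exits with an error.
    """
    if shutil.which("needle") is None:
        raise FileNotFoundError("needle not found on PATH (is EMBOSS installed?)")
    result = subprocess.run(
        needle_argv(seq_a, seq_b),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Calling print(run_needle(sys.argv[1], sys.argv[2])) from a small script (after importing sys) would then behave like the hypothetical aligner command, provided EMBOSS is installed.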
Score: 6
Install biopython with conda and use the following script:
#!/usr/bin/env python
import sys
from Bio import Align
aligner = Align.PairwiseAligner()
aligner.mode = "local"
alignments = aligner.align(sys.argv[1], sys.argv[2])
print(alignments[0])
The output is then:
.GGA-GGAGGGAG--AAGGAGGGAGGGAAGA-GGAGGGAG--AAGGAGGGAGGC
.|-|-||||||||--|||-|||-|||||-||-||||||||--|||-|||-|||.
AG-AAGGAGGGAGGGAAG-AGG-AGGGA-GAAGGAGGGAGGGAAG-AGG-AGG.
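There are a LOT of alignments with that same score, and you can tweak the details by changing aligner.mode. Note that "local" requests a local alignment; set it to "global" for a Needleman-Wunsch-style global alignment.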
answered 2 days ago
Devon Ryan♦
Yes, this should be pretty fast too, since the alignment work is done in C (rather than Python).
– Chris_Rands
2 days ago
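If a global (Needleman-Wunsch-style) alignment is wanted from the Biopython approach above, one option is to set the aligner's mode to "global" and spell out the scores. The following is a minimal sketch: the function name global_alignment and the scoring values are illustrative assumptions, not part of the answer, and Biopython must be installed for the function to run.

```python
def global_alignment(seq_a, seq_b):
    """Return the first optimal global alignment reported by Biopython."""
    # Imported inside the function so that merely defining this sketch
    # does not require Biopython; calling it does.
    from Bio import Align

    aligner = Align.PairwiseAligner()
    aligner.mode = "global"  # the script in the answer uses "local"
    # Illustrative scoring values (assumptions, not taken from the answer)
    aligner.match_score = 1
    aligner.mismatch_score = -1
    aligner.open_gap_score = -2
    aligner.extend_gap_score = -0.5
    alignments = aligner.align(seq_a, seq_b)
    return alignments[0]
```

Printing the returned object shows the alignment, and its score attribute holds the alignment score. As with the original script, many alignments may share the optimal score, and only the first one is returned here.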
Score: 2
I'd just use something like t_coffee and write a little bash function to run it on the strings you give:
aligner(){
    # Write each argument to a temporary FASTA file, named by its index
    local tmp
    tmp=$(mktemp)
    local args=("$@")
    for ((i=0; i<$#; i++)); do
        printf '>%s\n%s\n' "$i" "${args[i]}" >> "$tmp"
    done
    # Align, then drop the CLUSTAL header line and any blank lines
    t_coffee "$tmp" 2>/dev/null | /bin/grep -Pv '^CLUSTAL|^\s*$'
    rm "$tmp"
}
If you add that to your shell's initialization file (e.g. ~/.bashrc if using bash on Linux, ~/.profile if using bash on macOS), you can run aligner with as many sequences as you want to align:
$ aligner 'GGAGGAGGGAGAAGGAGGGAGGGAAGAGGAGGGAGAAGGAGGGAGGC' 'AGAAGGAGGGAGGGAAGAGGAGGGAGAAGGAGGGAGGGAAGAGGAGG'
0 GG--AGGAGGGAGAAGGAGGGAGGGAAGAGGAGGGAGAAGGAGGGAGGC
1 nAGAAGGAGGGAGGGAAGAGGAGGGAGAAGGAGGGAGG-GAAGAGGAGG
********* ******* ********* * ** * *
$ aligner 'GAGGGAGAAGGAGGGAGGGAAGAGGAGGGAGAAGGAGGGAGGC' 'GGAGGGAGGGAAGAGGAGGGAGAAGGAGGGAGGGAAGAGGAGG' GAGGGAGGGAAGAGGAGGGAGAAGGATCCGAGGAAGAGGAGG AGAAGGAGGGAGGGAAGAGGCCCGAGATATATAAGGATCCGAGGAA
0 GA-GG----GAGAAGGAGGGAGGGAAGAGGAGGGAGAAGG--AGGGAGGC
1 GGAGG----GAGGGAAGAGGAGGGAGAAGGAGGGAG-GGA--AGAGGAGG
2 GA-GG----GAGGGAAGAGGAGGGAGAAGGATCCGA-GGA--AGAGGAGG
3 AGAAGGAGGGAGGGAAGAGGCCCGAGATATAT--AA-GGATCCGAG-GAA
* *** ** ** * * * *
answered 2 days ago
terdon♦
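For readers who prefer Python to a shell function, the same idea as the bash function above can be sketched as follows: write one FASTA record per sequence, named by its index, run t_coffee on the file, and filter the CLUSTAL header and blank lines from its standard output. The function names are hypothetical, and the sketch assumes, as the bash function does, that t_coffee prints the alignment to standard output.

```python
import os
import subprocess
import tempfile


def to_fasta(sequences):
    """Format sequences as FASTA records named by their index (0, 1, ...)."""
    return "".join(f">{i}\n{seq}\n" for i, seq in enumerate(sequences))


def strip_clustal_noise(text):
    """Drop lines starting with CLUSTAL and whitespace-only lines."""
    return "\n".join(
        line
        for line in text.splitlines()
        if line.strip() and not line.startswith("CLUSTAL")
    )


def t_coffee_align(sequences):
    """Run t_coffee on the sequences and return the filtered alignment text."""
    with tempfile.NamedTemporaryFile("w", suffix=".fasta", delete=False) as handle:
        handle.write(to_fasta(sequences))
        path = handle.name
    try:
        result = subprocess.run(
            ["t_coffee", path],
            capture_output=True,
            text=True,
            check=True,
        )
    finally:
        os.remove(path)  # clean up the temporary FASTA file
    return strip_clustal_noise(result.stdout)
```

Like the bash version, t_coffee_align accepts any number of sequences, so it covers multiple alignment as well as the pairwise case.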
Score: 0
Here's another option: https://github.com/noporpoise/seq-align. I've used it in the past, as it is small and has no dependencies (just run make and you're set).
answered 58 mins ago
Kirill G