hi

I have an alignment file containing 3 species, generated with ClustalX:

AAAACGT Alpha
AAA-CGT Beta
AAAAGGT Gamma

I already sliced the alignment using Biopython's built-in indexing, align[:,:4]

but now when i print the result i get:

AAAA Alpha
AAA- Beta
AAAA Gamma

The question is: how can I get only the sub-alignment without the species names (like a multiple alignment object without species names), e.g. something like this:

AAAA
AAA-
AAAA

I tried align[:,:4].seq but it did not work.

any help would be appreciated

thank you

All 6 Replies

thank you.

I already found this, but for example how can I take the columns between 4 and 10 without getting the species names?

Doesn't the

print(align[:,3:11])

work?

Mmmm, no, it prints the needed columns but with the species names ...
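The names appear because printing an alignment prints every row together with its record id. A minimal sketch of one way around that, assuming align is a Biopython MultipleSeqAlignment (rebuilt by hand here from the three rows in the question, so the construction is illustrative and not the original ClustalX file): the sliced alignment has no .seq of its own, but each row in it is a record that does.

```python
from Bio.Align import MultipleSeqAlignment
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord

# Illustrative stand-in for the alignment read from the ClustalX file.
align = MultipleSeqAlignment([
    SeqRecord(Seq("AAAACGT"), id="Alpha"),
    SeqRecord(Seq("AAA-CGT"), id="Beta"),
    SeqRecord(Seq("AAAAGGT"), id="Gamma"),
])

# Slice the columns first, then take the bare sequence of each row.
sub = align[:, :4]
seqs = [str(record.seq) for record in sub]

print("\n".join(seqs))
```

The same loop should work for any column range, since the slice is still an alignment object whose rows carry both .id and .seq.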

maybe you can use line.split(None, 1)[0] to get rid of the last word in each line, but there must be a proper way too.

data = """
AAAACGT Alpha
AAA-CGT Beta
AAAAGGT Gamma""".splitlines()

print('\n'.join(line.split(None, 1)[0] for line in data if ' ' in line))

You cannot do this, because the result of align[:,3:11] contains something like this:

SingleLetterAlphabet() alignment with 3 rows and 7 columns
AAAACGT Alpha
AAA-CGT Beta
AAAAGGT Gamma

so the split will not work ..
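The summary line is the only part that trips up the split, so it can be filtered out. A rough sketch, assuming the text is the printed form of the alignment (as quoted above) and that every real row has exactly two whitespace-separated fields, a sequence and a name with no spaces in it; the helper name strip_names is made up for this example.

```python
# Printed form of the alignment, copied from the post above.
printed = """SingleLetterAlphabet() alignment with 3 rows and 7 columns
AAAACGT Alpha
AAA-CGT Beta
AAAAGGT Gamma"""


def strip_names(text):
    """Return the sequence column of each row, skipping non-row lines."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Real rows look like "<sequence> <name>"; the summary line
        # has many more fields and blank lines have none.
        if len(fields) == 2:
            rows.append(fields[0])
    return rows


sequences = strip_names(printed)
print("\n".join(sequences))
```

This only works on the printed text, which as far as I know is abbreviated for long alignments, so it is probably fine for short examples like this one; for anything longer, reading record.seq from each row of the alignment object is likely the safer route.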
