Bioinformatics Codelets: 22) Perl version of AVANA: BENANA!!!!!

The life-changing program is here: Ben's version of AVANA!!!

Past AVANA users will know that when a window (e.g. a nonamer window) contains one or more real gaps, AVANA shifts its sequence to cover them. AVANA also excludes any nonamers with padded gaps from its computations. BENANA addresses both issues: it does not shift amino acids to cover gaps, and it includes padded gaps in its computations.

Things to take note:

1) When you run BENANA, it will prompt you for your input file. Please note that BENANA only accepts files in .taln format as input.

2) Next, define the window size that you are interested in. If you enter anything other than a number, the program will terminate. If you enter nothing, the program will use "9" by default.

3) Please ignore _output1.csv and _output2.csv as they are used for processing.

4) It is important to note that full gaps are not considered variants and are excluded from the variant list.

5) Output files: _intraserotype_position.csv, _intraserotype_variant_list_sorted.csv, BENANA_intraserotype_diversity_results.csv
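
Notes 1 and 2 can be sketched as a small input check. This is a Python illustration, not the Perl code inside BENANA; the function names, the error messages, and the rejection of a window size of 0 are my own assumptions.

```python
import re
import sys

DEFAULT_WINDOW = 9  # default window size stated in note 2


def parse_window_size(raw):
    """Turn the user's answer into a window size.

    Empty input falls back to 9; anything that is not a plain
    number ends the program, as note 2 describes.
    """
    text = raw.strip()
    if text == "":
        return DEFAULT_WINDOW
    if not re.fullmatch(r"[0-9]+", text):
        sys.exit("Window size must be a number.")
    size = int(text)
    if size == 0:
        # Assumption: a zero-width window is treated as invalid input.
        sys.exit("Window size must be at least 1.")
    return size


def check_input_name(path):
    """Accept only .taln file names, as note 1 requires."""
    if not path.endswith(".taln"):
        sys.exit("BENANA only accepts .taln input files.")
    return path
```

For example, parse_window_size("") gives 9 and parse_window_size("12") gives 12, while parse_window_size("abc") stops the program.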

What BENANA does:

For example, if your taln input file contains:

AAAACCCCDDDD
AAAANNNNMMMM
CCCCNNNNMMMM

It first generates two files, _output1.csv and _output2.csv, which are used for processing (you may ignore them).

It then generates a peptide list for each window position, using the window size that the user has defined (default: nonamer window). In the above example, 4 files are automatically generated:

1-9_intraserotype_position.csv:

1,AAAACCCCD,9
1,AAAANNNNM,9
1,CCCCNNNNM,9

2-10_intraserotype_position.csv:

2,AAACCCCDD,10
2,AAANNNNMM,10
2,CCCNNNNMM,10

3-11_intraserotype_position.csv:

3,AACCCCDDD,11
3,AANNNNMMM,11
3,CCNNNNMMM,11

4-12_intraserotype_position.csv:

4,ACCCCDDDD,12
4,ANNNNMMMM,12
4,CNNNNMMMM,12
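
The sliding window behind these peptide lists can be sketched in a few lines. This is a Python illustration of the idea only, not BENANA's Perl source; the function name peptide_windows is hypothetical, and the sketch prints the rows instead of writing the .csv files.

```python
def peptide_windows(sequences, window=9):
    """Yield (start, end, peptides) for every window position.

    Positions are 1-based and inclusive, matching the file names
    above (1-9, 2-10, ...). Characters are taken exactly as they
    appear in the alignment; nothing is shifted.
    """
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("aligned sequences must have equal length")
    for start in range(length - window + 1):
        peptides = [seq[start:start + window] for seq in sequences]
        yield start + 1, start + window, peptides


alignment = ["AAAACCCCDDDD", "AAAANNNNMMMM", "CCCCNNNNMMMM"]
for start, end, peptides in peptide_windows(alignment):
    print(f"{start}-{end}_intraserotype_position.csv:")
    for pep in peptides:
        print(f"{start},{pep},{end}")
```

With 12 columns and a window of 9, the loop runs for 12 - 9 + 1 = 4 positions, which is why 4 files appear in the example.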

From each peptide list at each position, BENANA also generates a variant list just like AVANA, with the predominant peptide displayed at the top and the count of each peptide in the alignment. In the case where there is no predominant peptide, the first peptide in the alignment will be displayed at the top:

1-9_intraserotype_variant_list_sorted.csv:

AAAACCCCD,1
AAAANNNNM,1
CCCCNNNNM,1

2-10_intraserotype_variant_list_sorted.csv:

AAACCCCDD,1
AAANNNNMM,1
CCCNNNNMM,1

3-11_intraserotype_variant_list_sorted.csv:

AACCCCDDD,1
AANNNNMMM,1
CCNNNNMMM,1

4-12_intraserotype_variant_list_sorted.csv:

ACCCCDDDD,1
ANNNNMMMM,1
CNNNNMMMM,1
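
A rough Python sketch of how such a variant list could be built from one peptide list is shown here. It is not BENANA's actual code: the function name variant_list is hypothetical, and I am assuming that gaps are written as "-" and that a "full gap" means a peptide made up only of gap characters.

```python
def variant_list(peptides, gap="-"):
    """Return [(peptide, count), ...] with the most frequent first.

    Peptides made up entirely of gap characters are skipped, since
    full gaps are not counted as variants. Ties keep alignment
    order, so when there is no predominant peptide the first
    peptide in the alignment stays on top.
    """
    counts = {}
    for pep in peptides:
        if pep and set(pep) == {gap}:
            continue  # full gap: not a variant
        counts[pep] = counts.get(pep, 0) + 1
    # sorted() is stable, so equal counts keep first-seen order.
    return sorted(counts.items(), key=lambda item: -item[1])


for pep, count in variant_list(["AAAACCCCD", "AAAANNNNM", "CCCCNNNNM"]):
    print(f"{pep},{count}")
```

Relying on a stable sort is what keeps the first peptide in the alignment at the top when every count is equal, as in the example files.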

Lastly, BENANA generates the diversity file just as AVANA does. However, it does not compute entropy values at each position.

Diversity File:

StartPos,Predominant Peptide,EndPos,% Representation,No. of Variants,Total no. of Sequences with Valid Variants,Support %
1,CCCCNNNNM,9,33.33%,2,3,100.00%
2,AAACCCCDD,10,33.33%,2,3,100.00%
3,CCNNNNMMM,11,33.33%,2,3,100.00%
4,ANNNNMMMM,12,33.33%,2,3,100.00%
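
The numeric columns can be reproduced with a short Python sketch. The formulas are my reading of the sample rows, not taken from BENANA's source: % Representation is the predominant peptide's count over the valid peptides, No. of Variants is the number of distinct peptides other than the predominant one, and Support % is the valid peptides over all sequences. The function name diversity_row and the "-" gap character are assumptions. On ties the sketch picks the first peptide in the alignment, which may not be the peptide the script itself reports in such cases.

```python
def diversity_row(start, end, peptides, gap="-"):
    """Build one diversity-file row for a window position."""
    counts = {}
    for pep in peptides:
        if pep and set(pep) == {gap}:
            continue  # full gaps are not valid variants
        counts[pep] = counts.get(pep, 0) + 1
    if not counts:
        return None  # no valid peptide at this position
    valid = sum(counts.values())
    # max() returns the first peptide seen among equal counts.
    predominant = max(counts, key=counts.get)
    representation = 100 * counts[predominant] / valid
    variants = len(counts) - 1  # distinct peptides besides the predominant
    support = 100 * valid / len(peptides)
    return (start, predominant, end, f"{representation:.2f}%",
            variants, valid, f"{support:.2f}%")


row = diversity_row(2, 10, ["AAACCCCDD", "AAANNNNMM", "CCCNNNNMM"])
print(",".join(str(field) for field in row))
```

For the 2-10 window this gives 1 of 3 peptides (33.33%), 2 variants, 3 valid sequences, and 100.00% support, which agrees with the second row of the sample diversity file.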

BENANA_open_source_v1.0.pl is downloadable from this link.

© 2009 ^BeNBeN^. All rights reserved.

Tuesday, September 22, 2009
