Lab Week 2

Analysis of six homologous Galectin-3 proteins

Galectin-3, a member of the galectin family of animal carbohydrate-binding proteins, plays an important role in inflammation and cancer (Leffler 2011). Its induction in inflammatory conditions, its effects on immune cells (especially neutrophils), and its modification of inflammatory responses suggest a proinflammatory function. Galectin-3 has also been reported to promote tumor growth through anti-apoptotic activity and to promote metastasis by altering cell adhesion (Leffler 2011). The gene that encodes Galectin-3 is LGALS3.

Six homologs of Galectin-3 (five from mammals and one from a bird) were aligned using Clustal W, and two pairs of parameter settings were compared. First, alignments were performed with gap opening penalties of 6 and 12. Surprisingly, the two alignments were identical. This contradicts the expectation that doubling the gap opening penalty favors fewer gaps in the final alignment. A probable explanation is that the bird (Gallus gallus) sequence diverges so strongly from the mammalian sequences that the insertions and deletions it contains cannot be accommodated with fewer gaps. Second, alignments were performed with gap extension penalties of 0.1 and 2.0. One of the nine gaps in the alignment made with a gap extension penalty of 0.1 was shorter in the alignment made with a penalty of 2.0. This agrees with the hypothesis that increasing the gap extension penalty leads to shorter gaps in the final alignment. However, whether this single reduction in gap length is statistically significant is unknown.
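To make the roles of the two penalties concrete, the following minimal sketch scores gaps under a plain affine model, in which a gap of n columns costs (opening penalty) + n × (extension penalty). This is an assumption made for illustration only: Clustal W adjusts its penalties by position, so the helper names `affine_gap_cost` and `total_gap_cost` and the two toy gap layouts are hypothetical and do not reproduce the scoring of the Galectin-3 alignments.

```python
def affine_gap_cost(length, gap_open, gap_extend):
    """Cost of one gap of `length` columns under a simple affine model."""
    if length <= 0:
        return 0.0
    return gap_open + gap_extend * length


def total_gap_cost(gap_lengths, gap_open, gap_extend):
    """Sum the affine cost over every gap in an alignment."""
    return sum(affine_gap_cost(n, gap_open, gap_extend) for n in gap_lengths)


# Two hypothetical layouts that both contain 12 gap columns in total.
many_short = [2, 2, 2, 2, 2, 2]  # six gaps of two columns each
few_long = [6, 6]                # two gaps of six columns each

# The penalty values are the ones tested in this lab.
for gap_open, gap_extend in [(6, 0.1), (12, 0.1), (6, 2.0)]:
    short_cost = total_gap_cost(many_short, gap_open, gap_extend)
    long_cost = total_gap_cost(few_long, gap_open, gap_extend)
    print(f"open={gap_open:>2} extend={gap_extend}: "
          f"many short gaps={short_cost:.1f}, few long gaps={long_cost:.1f}")
```

Under this simplified model the opening penalty is charged once per gap, so doubling it widens the cost difference between a layout with many gaps and one with few. The extension penalty is charged per gap column, so raising it only separates alignments that differ in total gap length.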

The two most closely related amino acid sequences are those of Galectin-3 from Sus scrofa and Bos taurus. Both species are hoofed animals belonging to the order Artiodactyla. The pairwise alignment of these two sequences produced by ALIGN differs from the one produced by Clustal: Clustal introduced a total of 8 gaps into the alignment, whereas ALIGN introduced only two. This result is not too surprising. In the initial Clustal alignment, five of the six Galectin-3 homologs are from mammals and are quite conserved, while the sixth, from Gallus gallus, diverges significantly from the mammalian sequences. The Clustal alignment therefore incorporates more gaps to accommodate the Gallus gallus sequence and achieve the optimal total alignment score. Taken alone, the Sus scrofa and Bos taurus sequences are more similar in both length and composition, and the scarcity of insertions and deletions between them means that fewer gaps are required. The two sequences were also aligned using LALIGN, which compares local similarity between two proteins and reports the 10 best local alignments. The highest-scoring and most statistically significant alignment agrees with the one obtained using ALIGN. Since LALIGN detects local similarity and works better with shorter sequences, its results are expected to differ from those of ALIGN only when the sequences of interest are long.
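The effect of a divergent third sequence on gap counts can be illustrated with a short sketch. It counts gaps as runs of consecutive gap characters and shows how two rows taken from a multiple alignment can carry gap columns that exist only to accommodate another sequence. The helper names `project_pair` and `count_gap_runs` and the toy rows are hypothetical; they are not the Galectin-3 data, and the report does not state how its gap counts were obtained.

```python
import re


def project_pair(row_a, row_b, gap="-"):
    """Take two rows of a multiple alignment and drop columns where both are gaps."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    kept = [(a, b) for a, b in zip(row_a, row_b) if not (a == gap and b == gap)]
    return "".join(a for a, _ in kept), "".join(b for _, b in kept)


def count_gap_runs(row, gap="-"):
    """Count gaps as runs of consecutive gap characters, not as single columns."""
    return len(re.findall(re.escape(gap) + "+", row))


# Toy multiple alignment (hypothetical sequences, for illustration only).
msa = {
    "mammal_1": "MAD--GFSLN--DALAG",
    "mammal_2": "MAD--SFSLN--DALSG",
    "bird":     "MSDQPGFNLNEPDA-AG",
}

# Gaps in the raw rows, as they appear in the multiple alignment.
raw_gaps = {name: count_gap_runs(row) for name, row in msa.items()}
print("gap runs per raw row:", raw_gaps)

# Gaps that remain once the two mammal rows are considered on their own.
pair_a, pair_b = project_pair(msa["mammal_1"], msa["mammal_2"])
print("mammal pair after projection:", pair_a, pair_b)
print("gap runs after projection:", count_gap_runs(pair_a), count_gap_runs(pair_b))
```

In this toy case both mammal rows carry two gap runs in the multiple alignment, and none remain once the columns shared only with the bird row are removed.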

The DNA sequences of LGALS3 from Sus scrofa and Bos taurus were aligned using ALIGN. The alignment differs from the amino acid alignments obtained using ALIGN or Clustal. The DNA sequences are 77.6% identical between the two species, compared with 88.6% identity for the amino acid sequences. The higher identity at the amino acid level is expected to result from synonymous base changes in the third position of a codon, which do not change the encoded amino acid.
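A small worked example shows how synonymous third-position changes lower DNA identity while leaving protein identity untouched. The sketch below assumes that percent identity is the number of identical, non-gap columns divided by the total number of alignment columns, which is one of several conventions and may not match the one ALIGN reports. The two nine-base sequences are hypothetical, and the codon table is only an excerpt of the standard genetic code.

```python
# Excerpt of the standard genetic code: only the codons used below.
CODON_TABLE = {
    "GCT": "A", "GCC": "A",  # alanine
    "CTG": "L", "CTA": "L",  # leucine
    "AAA": "K",              # lysine
}


def translate(dna):
    """Translate an in-frame DNA string using the partial table above."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def percent_identity(aligned_a, aligned_b, gap="-"):
    """Identical, non-gap columns as a percentage of all alignment columns."""
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("aligned sequences must be non-empty and of equal length")
    matches = sum(a == b and a != gap for a, b in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)


# Hypothetical sequences that differ only at the third position of two codons.
dna_1 = "GCTCTGAAA"
dna_2 = "GCCCTAAAA"

protein_1 = translate(dna_1)
protein_2 = translate(dna_2)

print(f"DNA identity:     {percent_identity(dna_1, dna_2):.1f}%")
print(f"Protein identity: {percent_identity(protein_1, protein_2):.1f}%")
```

Here two of the nine bases differ, yet both sequences translate to the same three residues, so the protein identity is higher than the DNA identity.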

Leffler H. Galectins: roles and uses in inflammation and cancer. Section of Microbiology, Immunology and Glycobiology, Lund University. Feb. 17, 2011.