Beginner Bioinformatics in Python — Part 2

# Version 1: count matches with filter and a lambda.
def count_occurence(text, pattern):
    return len(list(filter(lambda i: text[i:i + len(pattern)] == pattern, range(len(text) - len(pattern) + 1))))

# Version 2: the same count written as a generator expression.
def count_occurence(text, pattern):
    return sum(1 for i in range(len(text) - len(pattern) + 1) if text[i:i + len(pattern)] == pattern)
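One detail worth spelling out: a sliding-window count like this one includes overlapping matches, which Python's built-in `str.count` does not. The sketch below illustrates the difference. The helper name `count_overlapping` and the sample text are my own choices for illustration, not part of the original solutions.

```python
def count_overlapping(text, pattern):
    """Count occurrences of pattern in text, including overlapping ones."""
    k = len(pattern)
    # Slide a window of width k across text and compare each slice.
    return sum(1 for i in range(len(text) - k + 1) if text[i:i + k] == pattern)


# "ATA" starts at indices 2, 4 and 10 of this text; the first two overlap.
text = "CGATATATCCATAG"
print(count_overlapping(text, "ATA"))  # 3
print(text.count("ATA"))               # 2, because str.count skips overlapping matches
```

For k-mer counting the overlapping behaviour is usually the one you want, which is why a plain `text.count(pattern)` is not a drop-in replacement here.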
Input
Text: ACGTTGCATGTCGCATGATGCATGAGAGCT
Length: 4
Output
CATG GCAT
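def FrequencyMap(Text, k):
    # Map every k-mer in Text to the number of times it occurs.
    freq = {}
    n = len(Text)
    # First pass: register each k-mer with a count of zero.
    for i in range(n - k + 1):
        Pattern = Text[i:i + k]
        freq[Pattern] = 0
    # Second pass: increment the count once per window.
    for i in range(n - k + 1):
        Pattern = Text[i:i + k]
        freq[Pattern] += 1
    return freq

def FrequentWords(Text, k):
    # Return every k-mer whose count equals the maximum count.
    words = []
    freq = FrequencyMap(Text, k)
    m = max(freq.values())
    for key in freq:
        if freq[key] == m:
            words.append(key)
    return words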

from functools import reduce  # reduce is not a builtin in Python 3

def combine_dictionary_counts(a, b):
    # Merge two count dictionaries, summing the counts of shared keys.
    return dict(list(a.items()) + list(b.items()) + [(k, a[k] + b[k]) for k in set(b) & set(a)])

def frequency_map_for_kmers(text, length):
    # Build one single-entry dictionary per window, then fold them together.
    return reduce(combine_dictionary_counts, [{text[i:i + length]: 1} for i in range(len(text) - length + 1)])

def most_frequent_substrings(text, length):
    frequency_map = frequency_map_for_kmers(text, length)
    max_frequency = max(frequency for kmer, frequency in frequency_map.items())
    return [kmer for kmer, frequency in frequency_map.items() if frequency == max_frequency]
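For comparison, the same frequent-words computation can be sketched with `collections.Counter` from the standard library. This is only one possible alternative. The function name `most_frequent_kmers`, the sorted result and the empty-list behaviour for short inputs are my own assumptions, not part of the solutions above.

```python
from collections import Counter


def most_frequent_kmers(text, k):
    """Return all k-mers that occur most often in text (hypothetical helper)."""
    # Count every window of width k in a single pass.
    counts = Counter(text[i:i + k] for i in range(len(text) - k + 1))
    if not counts:
        # text is shorter than k, so there are no k-mers to report.
        return []
    max_count = max(counts.values())
    # Sorting is an added convenience so the result order is deterministic.
    return sorted(kmer for kmer, count in counts.items() if count == max_count)


# Sample input from the problem statement.
print(most_frequent_kmers("ACGTTGCATGTCGCATGATGCATGAGAGCT", 4))  # ['CATG', 'GCAT']
```

On the sample input, both CATG and GCAT occur three times, which matches the expected output.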
def ReverseComplement(Pattern):
    # Complement every base, then reverse the result.
    return reverse(dna_complement(Pattern))

def dna_complement(pattern):
    return ''.join(map(character_dna_complement, pattern))

def reverse(s):
    # Recursive reversal; very long inputs can hit Python's recursion limit.
    return s if len(s) == 0 else reverse(s[1:]) + s[0]

def character_dna_complement(character):
    return {"A": "T", "T": "A", "G": "C", "C": "G"}[character]
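A non-recursive variant can be sketched with `str.maketrans`, `str.translate` and slice reversal. The names `COMPLEMENT_TABLE` and `reverse_complement` are hypothetical. One behavioural difference to keep in mind: the dictionary lookup above raises a `KeyError` for any character outside A, C, G and T, while `translate` leaves such characters unchanged.

```python
# Translation table mapping each base to its complement (uppercase DNA only).
COMPLEMENT_TABLE = str.maketrans("ACGT", "TGCA")


def reverse_complement(pattern):
    """Return the reverse complement of a DNA string (hypothetical helper)."""
    # translate() swaps each base; [::-1] reverses the string without recursion.
    return pattern.translate(COMPLEMENT_TABLE)[::-1]


print(reverse_complement("AAAACCCGGT"))  # ACCGGGTTTT
```

Because nothing here recurses, this version should also cope with sequences far longer than Python's default recursion limit allows for the recursive `reverse`.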
  1. Biology is a lot more fun when it requires the learner to think about it, form opinions, and solve problems.
  2. It is nice to see data structures and algorithms problems come up so regularly in biology. It shows that there are hard programming problems to solve everywhere; you just have to look for them.
  3. Seeing such concise code in Python was a pleasant change from Java, where it is difficult to reduce the amount of code beyond a certain point.
  4. A test-first approach helps, even if one is not doing TDD. This held even for the tiny problems solved here: the tests let me play around with different paradigms and solutions.
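To make the last point concrete, here is a rough sketch of what test-first looked like for the pattern-count problem. The test names and cases are illustrative assumptions rather than the tests I actually wrote. The function is repeated so the snippet runs on its own, and the plain `assert` style means a runner such as pytest could also collect these tests.

```python
def count_occurence(text, pattern):
    # Sliding-window count, repeated here so the tests run standalone.
    k = len(pattern)
    return sum(1 for i in range(len(text) - k + 1) if text[i:i + k] == pattern)


def test_counts_overlapping_matches():
    # "AA" starts at indices 0, 1 and 2 of "AAAA".
    assert count_occurence("AAAA", "AA") == 3


def test_pattern_absent():
    assert count_occurence("ACGT", "TT") == 0


def test_pattern_longer_than_text():
    # The range is empty, so nothing is counted.
    assert count_occurence("AC", "ACGT") == 0


if __name__ == "__main__":
    test_counts_overlapping_matches()
    test_pattern_absent()
    test_pattern_longer_than_text()
    print("all tests passed")
```

With tests like these in place, swapping the body of `count_occurence` between the `filter` version and the generator version becomes a safe experiment.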
