Since fewer than 4000 people have solved this problem, I thought I would write about it.
It took me a couple of hours to spot the trick, and then literally 5 minutes to write the code in one go without any errors.
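The problem can be rephrased simply:
For a given string and GC-content, what is the log (base 10) of the probability that a random string generated with that GC-content matches the given string character by character? Each G or C is drawn with probability GC-content/2, and each A or T with probability (1 - GC-content)/2.
So, for the string ACGTA and a GC-content of 0.129, the solution is the log of the product of the five per-character probabilities.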
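Written out for this example, assuming the usual model in which each G or C has probability 0.129/2 = 0.0645 and each A or T has probability (1 - 0.129)/2 = 0.4355, the calculation would look like this:

```latex
\begin{aligned}
\log_{10} P(\text{ACGTA})
  &= \log_{10}\bigl( P(A)\,P(C)\,P(G)\,P(T)\,P(A) \bigr) \\
  &= \log_{10}\bigl( 0.4355 \cdot 0.0645 \cdot 0.0645 \cdot 0.4355 \cdot 0.4355 \bigr) \\
  &= 3\log_{10}(0.4355) + 2\log_{10}(0.0645) \\
  &\approx -3.464
\end{aligned}
```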
That’s it!
Further Optimization
Considering the formula:
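In general form, for a string s with characters s_1, ..., s_n and GC-content x (this notation is my own, and it assumes each character is drawn independently), it can be sketched as:

```latex
\log_{10} P(s)
  = \log_{10} \prod_{i=1}^{n} P(s_i)
  = \sum_{i=1}^{n} \log_{10} P(s_i),
\qquad
P(s_i) =
\begin{cases}
  x/2       & \text{if } s_i \in \{G, C\} \\
  (1 - x)/2 & \text{if } s_i \in \{A, T\}
\end{cases}
```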
Since the terms inside the log are multiplied together, and the log of a product equals the sum of the logs, you can simply keep a running sum of the per-character logs as you iterate through the string, as shown in the code below.
Code
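Here is a minimal sketch of the running-sum approach in Python. The function name log_prob and its parameter names are my own choices for illustration, and it assumes the GC-content is strictly between 0 and 1 so that both logs are defined.

```python
import math


def log_prob(dna: str, gc_content: float) -> float:
    """Return log10 of the probability that a random string with the
    given GC-content equals `dna` exactly."""
    # Per-character log probabilities, computed once up front.
    log_gc = math.log10(gc_content / 2)        # each G or C
    log_at = math.log10((1 - gc_content) / 2)  # each A or T

    total = 0.0  # running sum of the per-character logs
    for base in dna.upper():
        if base in "GC":
            total += log_gc
        elif base in "AT":
            total += log_at
        else:
            raise ValueError(f"unexpected character: {base!r}")
    return total


print(round(log_prob("ACGTA", 0.129), 3))  # -3.464
```

Summing logs instead of multiplying probabilities also avoids floating-point underflow on long strings, where the raw product would quickly round to zero.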
Warning
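Python's math.log function, when called with a single argument, returns the natural logarithm (base e), so you have to use math.log10, which returns the common logarithm (base 10). Using the wrong base scales every answer by a factor of ln(10), roughly 2.3026.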
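A quick way to see the difference is to compare the two functions on the same value. The variable names below are only for illustration:

```python
import math

p = 0.0645  # probability of a single G or C when the GC-content is 0.129

natural = math.log(p)    # natural logarithm, base e
common = math.log10(p)   # common logarithm, base 10

# The two results differ by a constant factor of ln(10).
assert math.isclose(common, natural / math.log(10))

print(math.log10(100))  # 2.0
```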