Wednesday, December 2, 2020

Idle Rambling On DeepMind -- December 2, 2020

So, a bit more on DeepMind. See this post.

This is how I understand it.

1. The problem: posed by Christian Anfinsen in his acceptance speech for the 1972 Nobel Prize in Chemistry. Anfinsen famously postulated that, in theory, a protein's amino acid sequence should fully determine its structure.

That's really cool because in 1972 I was majoring in chemistry and biology in college, and somewhere along the line that postulate became well accepted.

The problem was that the number of ways a protein could theoretically fold before settling into its final 3D structure is astronomical. 
In 1969 -- the year I graduated from high school --  Cyrus Levinthal noted that it would take longer than the age of the known universe to enumerate all possible configurations of a typical protein by brute force calculation – Levinthal estimated 10^300 possible conformations for a typical protein. Yet in nature, proteins fold spontaneously, some within milliseconds – a dichotomy sometimes referred to as Levinthal’s paradox.
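
For what it's worth, here is a back-of-the-envelope check of Levinthal's point, written in Python. Only the 10^300 figure comes from the estimate quoted above; the sampling rate (10^13 conformations per second) and the age of the universe (about 13.8 billion years) are my own assumptions for illustration, and the function name is made up for this sketch.

```python
# Rough check of Levinthal's argument: how long would brute-force
# enumeration of every conformation take?

CONFORMATIONS = 10**300          # Levinthal's estimate, as quoted in the post
SAMPLES_PER_SECOND = 10**13      # ASSUMPTION: one conformation tried every 0.1 picosecond
SECONDS_PER_YEAR = 365.25 * 24 * 3600
AGE_OF_UNIVERSE_YEARS = 1.38e10  # ASSUMPTION: roughly 13.8 billion years


def brute_force_years(conformations, samples_per_second):
    """Years needed to try every conformation once at the given rate."""
    seconds = conformations / samples_per_second
    return seconds / SECONDS_PER_YEAR


years = brute_force_years(CONFORMATIONS, SAMPLES_PER_SECOND)
universe_lifetimes = years / AGE_OF_UNIVERSE_YEARS

print(f"Brute force would take about {years:.1e} years")
print(f"That is about {universe_lifetimes:.1e} times the age of the universe")
```

Under those assumptions the answer comes out on the order of 10^279 years. Changing the assumed sampling rate by a few orders of magnitude barely dents that number, which is the whole point of the paradox.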

2. So, in 1994, some twenty years later, CASP (Critical Assessment of protein Structure Prediction) was founded as an international competition to see if anyone could predict the folded structure of a protein based only on its amino acid sequence. CASP is held every two years.

3. DeepMind, an artificial intelligence company founded in 2010 and acquired by Google in 2014, is now part of Google's parent -- Alphabet -- and took on the "Anfinsen problem." The company has been around for ten years.

4. DeepMind came up with a computer application called AlphaFold which led the pack in the competition two years ago, 2018.

5. This year, 2020, AlphaFold 2 was widely reported to have largely solved the "Anfinsen problem." At CASP, AlphaFold 2 showed that it could predict the folding structure -- the 3D structure -- of most of the target proteins with near-experimental accuracy, knowing only the amino acid sequence of the protein.

6. The competition between (among?) AlphaFold 2 and the other competitors was not even close. AlphaFold 2 was "head and shoulders" above any of the other competitors. 

7. In addition to the medical benefits of this, this is what really caught my attention: this is being done by the same folks that are working on algorithms for internet searches. Go back to Cyrus Levinthal:

Cyrus Levinthal noted that it would take longer than the age of the known universe to enumerate all possible configurations of a typical protein by brute force calculation – Levinthal estimated 10^300 possible conformations for a typical protein. 

AlphaFold 2 can now solve that problem in "a couple of days." 

Imagine the computing power and the software that is being written. And it has taken ten years to do this. Imagine the strides that this same company has made in internet searches and tracking folks like me who do a lot of surfing on the net. 

Good, bad, or indifferent, it's absolutely incredible.

Oh, by the way, I can't remember if I have posted this before. I think everyone knows what an IP address is: it's a numerical address that identifies an electronic device connected to the internet (including devices that connect over wi-fi). I have been told that Google can "identify" a user from the search history of any given computer. In other words, Google does not need to know my actual IP address to be able to identify "me" as the one doing a specific search. Based on the history of my searches over time, Google can build its own identifier to track me. With each search, the ability to "identify" me improves. If that is true, it would mean that our NSA could track hackers who think they can spoof or hide their IP addresses.

Again, this is how I see it.

Five links:
