Meet the scientific ‘power couple’ who turned their weekend pet project into a COG UK-approved mega-database
Fourth up in our Pandemic Pioneers series is husband and wife duo Agnel and Sony, who together have created a mega-database that’s already proving vital in the global fight against Covid-19.
“CoVal is basically a bridge. We’re building a bridge.”
So says 34-year-old computational biologist Sony Malhotra, half of a husband and wife team based at the CCP-EM Scientific Computing Department, Rutherford Appleton Laboratory on the Harwell Science and Innovation Campus outside Oxford. Beside her sits 37-year-old partner Agnel Praveen Joseph, Senior Computational Scientist and Coronavirus Structure Task Force member.
Together this scientific ‘power couple’, both valued members of STFC’s Scientific Computing Department, have turned their weekend pet project into a COG UK-approved mega-database that’s already proving vital in the global fight against Covid-19.
“I think the idea came over lunch” says Agnel modestly.
“It was actually when we were splitting childcare” adds Sony, “the nurseries were shut down so one of us was working in the morning, another in the afternoon. I have experience with genomics data and developing web servers, Agnel knows about Cryo-EM structures. We were stuck in the same house, so we just thought of putting our expertise together.”
“Cryogenic electron microscopy is just a way of looking at molecules inside cells” explains Agnel. “I’ve been studying these 3D structures for a long time. And these models often come with errors and ambiguities.”
“So we were like, ‘maybe it’s better to look at it more systematically than just one structure at a time’” adds Sony.
Thus CoVal (‘Validation of Covid Structures’) was born.
“CoVal is basically a database of all the mutations in the SARS-CoV-2 genomes” she continues. “It gives you a list of mutations from the variants, and the geographical location of those mutations, and maps them onto the three-dimensional structures and how good quality those structures are.”
“Normally the two data are looked at independently — the genome sequencing is a different set of experiments from the data for structural determination. There’s no standard resource that connects the two so if we build a bridge between these two data and try to map the genome information variant onto the structures it helps to look at the structures in the context of the mutation, and the impact of the variants on the structures.”
Why did this database not already exist?
“These sorts of data are kept in different archives. It’s very difficult for a non-expert or even a clinician to link the two, the numbers don’t match, the IDs don’t match, that sort of thing. So, I guess Covid forced our hand.”
CoVal, which already hosts about half a million mutations, is available as a tool on the COG-UK (COVID-19 Genomics UK Consortium) website for public data analysis. “Anyone with an interest can access it” says Sony proudly. “You don’t have to prove you’re a clinician.”
The couple, who met at Bangalore’s National Centre for Biological Sciences in 2012, admit the course of true love did not always run smooth.
“It’s a bit filmy” laughs Sony. “We couldn’t stand each other at the beginning.”
“I was a post-doctoral fellow in the same group” clarifies Agnel. “It took a good few months for us to get talking and know each other.”
The couple married in 2013 and now have a four-year-old daughter. But does their professional rivalry persist?
“I don’t know if you can call it rivalry” says Agnel. “It’s like two different brains thinking about the same thing, so there can be clashes at times. But I respect that, because if we agree on everything it’s not interesting is it?”
The creation of CoVal did involve other sacrifices. “I knew how to program and I had the prototype. I just had to move everything to Coronavirus” says Sony. She admits to spending at least six weeks of her free time so far getting CoVal up and running, whilst Agnel helped with the data processing. “I wasn’t paid for it. I was essentially doing it either at night or on the weekends. But it’s paid off.”
Like many who work at Harwell, the pair appreciate the collaborative opportunities such an environment offers. “We have everything on site” says Sony. “Even aside from CoVal the project I’m doing is very collaborative, I work with the Electron Bio-Imaging Centre, the Collaborative Computational Project for electron cryo-microscopy and Diamond Light Source so you have all this expertise on site, plus as a data scientist you have access to all this data so that’s very handy.”
So what is the future of CoVal?
“Well Corona isn’t going away is it?” says Sony.
“We’ll need to track the new variants for a few years down the line and understand their impact” adds Agnel. “For example we are looking more closely at the Indian variant at the moment, that double mutation which is a variant of concern.”
“A few ideas have come up from bridging the two data sets” he continues. “We are thinking of standardizing this for other infectious diseases, looking at the impact of mutations in, for example, cancer or other viruses. We’ve been seeing a big technological boost in machine learning and artificial intelligence, so hopefully we can make use of these as well to extend this model to other diseases.”
As our conversation draws to a close, I find myself wondering how their four-year-old daughter has coped with such a science-heavy lockdown.
“Well, talking of coronavirus all the time, and hearing it on the news she’s now got herself a coronavirus cookie cutter” laughs Sony, “and yesterday we made some cookies.”
“She also keeps making up songs. The other day we had: ‘If you don’t wash your hands you’re going to get coronas in your house!’”
Surely a future coronavirus scientist in the making.