Mathematician Gurjeet Singh Uses 'Big Math' To Tame Big Data

PETE CAREY

Sikh-American Gurjeet Singh is Chief Executive Officer (CEO) and co-founder of a company that reflects an emerging trend in Silicon Valley -- the intersection of computer science and mathematics to tackle real-world problems.

Born and brought up in Punjab, he earned his Ph.D. in computational mathematics from Stanford University's Institute for Computational and Mathematical Engineering, which prides itself on doing "Big Math."

Building on research in data analysis by his adviser, Stanford math professor Gunnar Carlsson, Gurjeet developed "Mapper," a kind of software robot that automatically ferrets out interesting information from complex masses of data.

Sensing a potentially vast opportunity for use in the real world, he secured venture funding and, with Carlsson and Stanford co-researcher Harlan Sexton, formed Ayasdi in 2008.

The pioneering technique has been applied to problems as diverse as fraud detection and traumatic brain injury in football players.

Ayasdi's stated goal is to make the analysis of complex data "as simple as online shopping or searching the web." The company's name is the Cherokee word for "seek."

Q:  One of your co-founders is a Stanford math professor, you have a degree in computational mathematics and your other co-founder is a Stanford math grad as well. We hear about startups from Stanford's computer science department all the time, but isn't it unusual for three mathematicians to start a company?

GURJEET SINGH (GS): The attitude in a pure mathematics department is very ivory-towerish, in general, so it is extremely unusual for people to start companies out of the math department. The attitude toward going to the "dark side" is very strong.

Q:  You all seem to have weathered that well enough. So let's talk about mining big data, the reason a company like yours exists. What are the burning issues in big data analysis?

GS: The term "big data analysis" is a very weird term. It acknowledges one problem we have with data, which is that it is very big. But that's not the main problem we have with data. The problem we have is complexity. People believe the best way to learn from the data is to have a hypothesis and then go check it, but the data is so complex that someone who is working with a data set will not know the most significant things to ask. That's a huge problem.

Q:  The solution?

GS: Technologies like Ayasdi's exist now to automatically discover information from data without having someone making guesses up front.

Q: So you just send it off, let it prowl around the data for a while and then come back and tell you what it found?

GS: Correct. It churns for a while, tells you things that are significant, that you need to know and then we're done.

Q:  How do you know they are significant?

GS: Because we do statistical tests. We run statistical validity tests on everything we find, so we are actually able to guarantee that whatever we find is present in the data.

Q:  Give me an example of something where your company put a computer in charge of ferreting out interesting information and bringing it back for you to interpret.

GS: Everything we do in this company is of that nature. One of my favorite things, one of the first things we did at Ayasdi, was a study of a very old breast cancer data set. A decade and a half ago, the Netherlands Cancer Institute collected genetic samples from breast cancer tumors from a few hundred people. They believed that if they analyzed that data they would be able to discover types of breast cancer and they might be able to discover treatments for these types of breast cancers that they just discovered.

By and large it came true. But there was a population with a certain type of breast cancer that clinicians saw in the field, but that researchers could never pull out. They spent a decade and a half trying to do it. We threw it into our software and within minutes we were able to discover it without looking for it. It just popped out. Think about it. A few minutes, versus a decade and a half. We published it in Nature. There are so many examples like that.

Q: Another example?

GS: Mt. Sinai Hospital in New York is one of our customers. They collected roughly 20,000 genetic samples from people with Type 2 diabetes along with their clinical histories. They wanted to figure out, is Type 2 diabetes a disease or is it a symptom? Are there underlying molecular diseases that display the same outward symptoms but are actually distinct, and thus require very different treatment regimes? In fact, using our Topological Data Analysis system, they were able to discover multiple types of Type 2 diabetes. That obviously has a huge impact on all the hundreds of millions of people who have Type 2 diabetes in the world because they don't actually have Type 2 diabetes. They have Type 2 diabetes, type 1, or Type 2 diabetes, type 6.

Q:  What's good about that?

GS: If you know people with Type 2 diabetes, there's a high likelihood they will have different medication regimes and different lifestyle options. When we label all these various types as the same thing, we treat them the same way and they should not be treated the same way.

Q: In ceding the search to computers, aren't you losing control of your research?

GS: The point is not to cede control, go home and the computer will do your work for you. The point is you will be able to do much more work and you will be much more productive than you have ever been in the past. There is a vast under-appreciation of what machines and algorithms are capable of today. We certainly have the means to change the fundamental way we do things in society. That's the stuff I'm most excited about.


BIO

Birthplace: Ludhiana, Punjab
Age: 33
Education: B.S. in Instrument and Control Engineering, Netaji Subhas Institute of Technology, 2002; Ph.D., Institute for Computational and Mathematical Engineering, Stanford University, 2008
Work: CEO, Ayasdi, 2008-present; research scientist, Stanford University, 2008; intern, Google, 2005; software design engineer, Texas Instruments, 2002-2003.
Residence: Palo Alto, California, USA
Family: Married, one child.


Five things about Gurjeet Singh you ought to know:

1  He began coding at the age of 6, with a ZX Spectrum game computer his father bought for him.
2  He builds robots in his spare time.
3  He's a huge science fiction fan.
4  He still writes software even though he's a CEO.
5  He was accepted to Stanford's Ph.D. program and came to the U.S. with only enough savings to pay for one quarter.


[Courtesy: San Jose Mercury News. Edited for sikhchic.com]
December 27, 2014
 

Conversation about this article

1: Baldev Singh (Bradford, United Kingdom), December 27, 2014, 9:14 AM.

This is awesome stuff ... the stuff of science fiction!

2: Harinder Singh (Punjab), December 27, 2014, 11:30 AM.

Wow! The Type 2 diabetes technique looks like a Nobel Prize-winning discovery. Keep up your genius in maths; we eagerly await even bigger things from you and your team. I am sure you will find many types of Alzheimer's, Parkinson's, hypertension, cancers, etc., which in turn may lead to more focused treatment. All power to you!

3: Kaala Singh (Punjab), December 28, 2014, 1:11 AM.

Great going. People like you are our future!
