Finding the Needle in the Data Stack: Advice from a Facebook Data Scientist

Alumna Delia Mocanu is a Double Husky and 2014 PhD recipient in Physics. During her time at Northeastern, she developed a passion for network science, working on data projects of incredible scale. Now at Facebook, she works on one of the largest data systems in history: News Feed.

She took part in a written Q&A with the Northeastern COS on going into industry, why epidemiology works better in the dark, and the most important skill for succeeding in data science.

Growing up, did you find that you were interested in physics? Or did that interest develop later in life?

I grew up in Romania and I did not like Physics much at the time because it was too formulaic. In a weird twist of events, as a freshman in college here in the US, I actually switched from Chemical Engineering to a Physics/Math double major.

At some point I realized that Physics made a lot of sense for me and I just enjoyed pure science more than engineering. I liked the rush of solving problems from scratch.

Why Northeastern?

I jumped around a bit until I landed on what I wanted to do. I was originally curious about Astroparticle Physics and I wanted to know everything about how the universe worked. During my first year at NEU, it became clear that I was seeking something more fast-paced. [The irony is not lost on me: this was the opposite of why I switched from Engineering to Physics four years earlier.]

At Northeastern, seeing the kind of work Barabasi and Vespignani were doing, I immediately recognized that this interdisciplinary field (Network Science/Complex Systems) was more aligned with my existing interests and personal values.

Is there anything that stands out about your graduate/PhD experience?

My advisor really emphasized the idea of ownership, and I liked that we were held to very high standards. Looking back at it now, I always felt like what I was putting my time into truly mattered. Prof. Vespignani was very good at instilling energy in the group.

Did your experience working in Professor Vespignani’s MOBS Lab and other research facilities shape your career decisions?

Absolutely. What I cherish about this PhD experience is that we felt very plugged in; we had well-funded projects that were designed to solve real problems, in real time.

We loved doing something that mattered. I was simultaneously learning something and solving a problem involving millions of people. Little did I know I was going to reach billions later.

You’re currently working at Facebook as a Data Scientist. What does your current role entail?

I work in News Feed, and I’ve been here since I started at Facebook. What makes this role especially stimulating is building solutions that work at scale. I still very much rely on the thought processes and models of the world that I adopted during my PhD. Right now, I couldn’t imagine a better place to apply these. 

However, my favorite part of my job is actually identifying opportunities. When I find something worth investing in, I put all my energy into making it a reality. That last part is the most rewarding and it is really more about general problem solving than it is about any specific math/engineering skill. 

Have you spoken to friends or colleagues about Professor Vespignani’s COVID-19 models and what they are trying to accomplish? Has it been a source of pride, frustration, or a little of both?

To my friends mostly, yes. I touched epidemiology models a bit during grad school. At the time I remember thinking ‘I hope this software makes a difference someday,’ but I never thought we’d see something like this. Healthy skepticism is good, but I’m quite surprised to see the amount of pushback against these predictions at large, so I would say this has caused a bit of frustration. 

Frankly, epidemiology is best when you don’t know that it exists and when the predictions don’t come true. Otherwise, it is not too dissimilar from weather forecasting; every modeling exercise comes with error bars, but one can tell the difference between a major hurricane and a light summer rain. However, in epidemiology you ‘can’ actually turn the would-be hurricane into a light summer rain. Taking action invalidates the original prediction; that’s really the goal. Whereas if your predictions come true, you have failed; that’s the curse of this field.

Computational epidemiology has advanced so much in the past two decades that it’s quite challenging to establish a common language even with other highly technical folks.

What do you find most rewarding about data science?

It’s the most rewarding job you can possibly imagine. It’s always changing and you are constantly learning new things or building new tools so that you can iterate faster. 

It’s not just about the act of doing the analysis, but more about where the data fits. A lot of data science is problem solving, which is what I liked about Physics in the first place. You don’t have a solution and no one has ever solved this problem before. There are no instructions and every single day feels like a journey. That dynamic aspect is very important to me.

What was the most important thing you learned at Northeastern?

Professor Vespignani wanted nothing short of perfection. He would sometimes ask you to iterate on the same chart a dozen times before it felt right. It’s about communicating this data in the best way possible. If I have to repeat the same steps several times before I get it right, then I do it, and I think it’s worth it. As a result, I do notice when others take shortcuts.

I can’t stress this enough: the analysis is not an end in itself.

Is there advice you would give to students who are interested in this field or the type of work you’re doing now?

Don’t be afraid of change; your interests will continue to evolve over time. Look at your PhD program as a time in your life to discover what you like doing, and work with your advisor through that process, as they should guide you in making the most of your career.

Your goal in academia is to publish papers and advance knowledge; you may not necessarily implement your ideas right away, and that’s OK. If you choose the industry path, your focus will be on the application itself. The optimal mathematical solution may need a 20-fold simplification so that you can enable the rest of the team to be part of it.

Anything else you’d like to add?

I do want to acknowledge that Northeastern did an incredible job bringing in professors from other universities and building great research programs, and not just in network science.

It’s incredible. I do think that I was very lucky to be part of Northeastern, because an environment so focused on research is very, very important. I really loved it.
