Hi everybody. I'm so lucky, we're so lucky, to have one of my favorite people on our podcast today. She is a renowned AI researcher and AI expert, someone I like to call for advice and to learn more about current trends in AI. She's such a distinguished researcher and leader in this space. So happy to have you here today, Aditri. Thank you so much for having me, Melanie. Pleasure to be here. Can you tell us a little bit about yourself, the highlight reel
of accomplishments, because I won't be able to do it justice? So my journey in machine learning research began as an undergrad. I was lucky enough to work in the lab of Tom Mitchell, one of the fathers of machine learning as we know it today, and was working on this very interesting predictive model trying to figure out future areas of brain interaction based off of past areas of interaction. So imagine being able to predict someone's thoughts based on what they were thinking a few moments ago.
And that's what got me really interested, I would say, in pursuing ML and AI more broadly and comprehensively. After graduating from Carnegie Mellon, I worked at Meta for about four years. And part of that work involved figuring out how we can use predictive signals to understand what high quality learning content looks like. So imagine, you know, having so much context in a company and not knowing why certain decisions were made. And, you know, having all of these sort of disjoint
resources, not knowing where to find important documents. So distilling important signals and building models to flag information that's really important, and building that into a system that was used by thousands of employees internally. Yeah, after that, I decided to take a little break from industry and go back to school. So I went to Stanford, took lots of AI classes, and also had the opportunity to do a lot of research in generating educational content with AI that was standards aligned and fit for
actual high school instruction. And then also working on personalization. So imagine creating a Melanie ChatGPT that gives responses exactly tailored to the style that you would like to hear. So figuring out, you know, how can we understand what users like without just very surface level demographic information, like Melanie is a half Japanese, half Chinese female in her 20s. So not considering that, but rather more nuanced signals, like how does the user interact with ChatGPT today, observing their
interactions and trying to understand how we can use that. And right now I'm a machine learning research engineer at this company called Anton, which is a one stop shop for essentially shopping with AI. So trying to understand, you know, what you're trying to look for, if you enter a query, like find the perfect couch for my Brooklyn apartment under $3,000. That's awesome. Obviously, it's such an incredible list of different
accomplishments. Can you tell us about what your degrees were in specifically? Yeah, so as an undergrad, I studied computer science and also creative writing. And at Stanford, I did a master's in computer science focusing on AI. Yeah, so I grew up in India in New Delhi. Both my parents are dentists, so not much of a tech background. They tried very hard to convince me to also become a dentist and take over the family practice. But it wasn't something that I particularly enjoyed. I really loved math as a child and doing
puzzles and brain teasers, but I'd never written a single line of code. But I had also visited the US a couple of times, since my family lives here, and always knew that I wanted to move at some point. It just seemed much easier to bring one's dreams into reality here. And so that was something that, since age 12, I knew I wanted to work towards. So having that in mind, I did a lot of research myself, looked into all of the tests that I would need to take to get here, like the SATs. In conjunction, I was preparing
for med school in India to satisfy my parents. But then also making sure that I was fulfilling all the requirements I would need to get into school here. I remember being very intimidated at first when I got to Carnegie Mellon, because everyone else had been like programming since age 11, writing their own operating systems. And here, I barely knew what a for loop was. So it was definitely a learning curve at first. But I'm so glad that my journey, you know, got me to the place that it did and that I was
persistent and stuck with it. So can you talk about the early education and schooling? Were you integrated into like a sort of AP or IB system at like an international school? Were you at a local school? What are the hurdles that you have to overcome as someone who's not American preparing for American university? So we didn't really have AP or IB in our school. But luckily, Indian STEM education is extremely comprehensive. So it was far above the level, I would say, that would be required for APs here. So
luckily, there wasn't too much extra prep that I needed to do to take like AP tests and subject SATs. Yeah, and I was able to bring my love of creative writing to write all of those college essays. That's amazing. And then tell us a little bit about the research that you've been working on. It's been so groundbreaking and interesting. Yeah, so we recently published a paper in ACL with my collaborators at Stanford. And we were trying to understand how we can build language models that are more personalized
to specific users and actually respond in ways that would appeal to them. So in order to do this, we studied pairwise preferences of users. So if you've ever seen ChatGPT give you two alternative responses and ask you which one you like more than the other, that's sort of the data that we were working with. So we collected a lot of these examples of preferences from users. And then we were trying to use LLMs to understand why a given user might prefer one response over another. And a lot of prior research has
focused on very toy examples of preferences like, oh, this user likes the color red over green, for instance. But here, since we're focusing on very nuanced responses, a lot of them around people's values, people's interests,
kind of like data sets in the wild, we were able to get a lot more interesting signals around what people actually prefer. And it's sometimes hard to understand why a user would prefer one response over another, like, is it the content of it? Or is it the presentation? Like, do they enjoy the fact that it's shorter, longer, has more humor in it versus the actual factual information? So we were trying to kind of disambiguate and understand these things. And as a result of our research, we were actually able to
predict users' future preferences from their past preferences with significantly more accuracy than the prior state of the art. So that was something very cool that came out of this. And this has a lot of downstream applications as well in terms of being able to create personas for users in general and using that to inform personalization in pretty much any setting. So if you think about an e-commerce setting, if we have interaction data from users, we can use that to
inform how the homepage should look for them or what items we should surface for them. And this sort of framework is pretty much endlessly generalizable. Hmm. Thank you for breaking it down, especially for those who may not be very active in the world of AI. It's still very easy to understand. I would love it if you could break down some of the key AI concepts for the layperson, let's say in middle America, to understand. Being in San Francisco,
sometimes we live in a bit of a bubble where 18 year olds just have their tech startups and have been using AI forever, even though AI in the form that we know it only came out a couple of years ago. But I would say one of the core trends that I've noticed is around using AI coding tools, where I think I've noticed a lot of people still being very hesitant or thinking that they have to be a software engineer to understand how to use these tools. And the statistics around this are pretty surprising, like
Claude Code, which is so great at scaffolding code, building entire systems, only 300,000 people worldwide are using it right now. And I think that more people should try to get their hands dirty and play around with these things. Because there's really endless possibility in terms of what you could build. I have a friend who actually built an app hands-free while running a marathon just by being able to talk into his phone essentially and modify whatever app was being created during his breaks on the
marathon, like being able to look at the screen and see what the current state was. So I think a lot of people have this misconception that you have to be super technical to build something, but I don't really think that's the case. The second really interesting trend that I've noticed is world models. So a world model in the context of AI is just when we're building a model essentially to simulate real world interactions, be it like physics, understanding how to seamlessly show motions in video, in generative
video, or even being able to predict the stock market. And I think there's so much interesting work there around being able to use datasets more effectively, gathering better multimodal data, enriching datasets that already exist out there, and collecting data in more interesting ways. There are a lot of robotics companies out there now, and before, robotics data was kind of hard to come by, but now a lot of people are utilizing haptic gloves so they can actually model human movements much
more effectively than before. And so I think this new rich data is really enabling a lot of developments in building world models. And, you know, another thing that I like to talk about is how AI is very polarizing. There's one group that thinks it's all doom and gloom, and we're going to wake up in some sort of Terminator reality, and then another group that thinks that, oh, like,
I don't know if you've ever seen that meme where there's an Excel spreadsheet and you type in January, and instead of February, March, April, it's like marguary, or like apruary. So like the predictive completions are very inaccurate. So there's another camp of people that thinks we can't really get to something super crazy or usable. And I like to think that it's sort of somewhere in the middle. I do think that we've gotten to AGI loosely in the sense that AI is now smart enough to do a lot of
tasks that many humans are not capable of. It's self-learning in the sense that it can improve with feedback loops. Agents can, you know, go off and spin and decide based off of the outputs of their current tasks and states. But yeah, I think that to get to true AGI, I still think we're maybe a couple steps away, both in terms of compute and in terms of architecture. So I think that more developments in quantum computing and getting to stable quantum systems faster is really going to
be able to unlock the next generation of models. And then also a couple architectural improvements over what we have right now. I'm really bullish on companies like Liquid AI that are building small language models that can work, you know, on device and are using more novel architectures, liquid foundation models that are actually inspired by earthworm brains. So I think trying to mimic biological systems a bit more with architecture could lead to some really interesting results in my opinion.
Hmm. Three really exciting and engaging topics to keep top of mind. If we take a step back and think about for the average person who may not know anything about AI, maybe we can break down, you can give us like a quick, you know, AI 101 course. So artificial intelligence has been around for a long time. I know it's mostly entered the broad conversations only in the past couple of years since ChatGPT came about, but it's been
around since the 50s and the actual term artificial intelligence was actually coined sometime in 1955, I believe. And the way that I like to think of it is artificial intelligence is just any general concept of machines learning to mimic humans in whatever sense that may be. And machine learning is really a subset of artificial intelligence, where systems are learning primarily from different data sources. So in AI, like we could just have,
you know, rules and algorithms that don't necessarily use data out there, but machine learning is much more around using like large data sets to understand patterns and apply that to generalize. AI is artificial intelligence and AGI is artificial general intelligence, the concept that AIs can eventually get to the point where they're as quote unquote good as humans or, you know, can mimic human processes of thinking. And honestly, I'm not sure how far we are away from that point. People have varying
definitions of what AGI even means. Like some people might think we're already here because like AI coding tools have gotten so good and coding is something that's quite complex in terms of, you know, having to manage large contexts, understanding how different things connect together, building systems that are reusable at scale. But other people say that no, unless, you know, we develop, unless machines can develop some sense of a soul or,
you know, feelings, then we aren't really at AGI. But that's also interesting to think of because like, what is a soul or a feeling, really? Like, is it just neurochemical components and neurons firing in our brains producing what we think of as a soul? Is it purely physical? Or is there another component to it? So I think it's almost an existential question in some sense. Like what is AGI? Will we ever get to it? People have varying opinions on this.
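The distinction drawn earlier in this answer, between AI built from hand-written rules and machine learning that picks up patterns from data, can be made concrete with a toy sketch. Everything in it is an assumption made for illustration and does not come from the conversation: the spam-filter setting, the made-up messages, and the function names `rule_is_spam` and `learn_cutoff`.

```python
# Rule-based approach: a person writes the rule by hand; no data is involved.
def rule_is_spam(message):
    return "free money" in message.lower()

# Machine learning approach: the rule's parameter is chosen from labeled examples.
def learn_cutoff(examples):
    """Pick the exclamation-mark count that misclassifies the fewest examples."""
    best_cutoff, best_errors = 0, len(examples) + 1
    for cutoff in range(0, 11):
        errors = sum((msg.count("!") >= cutoff) != is_spam
                     for msg, is_spam in examples)
        if errors < best_errors:
            best_cutoff, best_errors = cutoff, errors
    return best_cutoff

# Made-up training data: (message, is_spam).
TRAINING = [
    ("Lunch at noon?", False),
    ("Great job on the launch!", False),
    ("WIN NOW!!! Click here!!!", True),
    ("Limited offer!!!! Act fast!!", True),
    ("See you tomorrow!", False),
    ("You won!!! Claim prize!!!", True),
]

cutoff = learn_cutoff(TRAINING)

def learned_is_spam(message):
    return message.count("!") >= cutoff

# A message neither approach has seen before.
new_message = "Huge discount!!! Today only!!"
print("hand-written rule says spam:", rule_is_spam(new_message))
print("learned rule says spam:", learned_is_spam(new_message))
```

The hand-written rule only catches the phrase its author thought of, while the learned cutoff generalizes from the examples it was given. Real systems learn far richer patterns than a single threshold, but the division of labor is the same.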
I would say for someone who's not super well versed in AI to actually build their chops, just try building ideas that you have. If there's an app that you've always wanted to see, but it's not on the app store, just ask ChatGPT or, you know, Claude, which is Anthropic's AI, how you can go about building it. And you'd be shocked and surprised to see how quickly you can create something of your own. Another really effective way I found is going to hackathons. Now outside of the Bay Area, this isn't really that much of a thing. But hackathons have really helped me see how people are using tools out there and building with the latest models.
Like whenever Gemini comes up with a new video generation model, for instance, they'll host a lot of hackathons and give users credits to see what they can come up with, like innovative projects they can build. And I think that's a really cool space through those hackathon demos. Even if you don't participate, you can see what's possible to create. And there's so many free courses and resources out there. There's OpenAI Academy, which is completely free and managed by OpenAI.
They publish lots of cool tutorials around how you can create your own models, fine tune them, play around with different models and embed them in your system. So I think just working with those and even basic prompt engineering, or using agents, can get you so far. Any function, from growth to marketing to software engineering, can benefit so deeply from simple processes and hacks that they can build into their workflows.
And I think that AI should be thought of as a tool and a thought partner, not necessarily just to completely offload your brain to and rely on cognitively as the primary source of ideas and inspiration. But rather, once you have ideas and inspiration, learning to direct and put systems together in a way that makes sense. And for software engineers of the future, I think systems engineering will become even more important, just trying to understand the broad structure of how basic components fit together and understanding how to orchestrate them.
And in terms of any advice that you could give, for example, for founders who are technical, working on interesting new ideas, how they can be thinking about the future of AI tools, what you're interested to play around and explore that you can share with others. Yeah, first of all, please use AI coding tools. Don't write too much code by yourself. Now it's just not scalable. And tools are now smart enough that they will make smart engineering decisions and set up your system the way that you would like.
But I would say don't rely on them for, you know, languages that are super obscure, and don't trust them to write tests that actually make sense. So I would say trust, but always verify: have really good evals in place and make sure that you're monitoring system inputs and outputs, making sure that they make sense. I would encourage founders and CTOs, whether they're technical or non-technical, to build as much as they can.
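As a rough illustration of the "trust, but always verify" advice, here is a minimal sketch of an eval harness: a handful of human-written input and expected-output pairs run against a piece of code. The `slugify` function stands in for something an AI coding tool might have produced; it, the cases, and the `run_evals` helper are all hypothetical and chosen only for this example.

```python
import re

def slugify(title):
    """Stand-in for a function an AI coding tool might have written (hypothetical)."""
    lowered = title.strip().lower()
    return re.sub(r"[^a-z0-9]+", "-", lowered).strip("-")

# Hand-written eval cases: each input is paired with the output a human expects.
EVAL_CASES = [
    ("Hello World", "hello-world"),
    ("  Trust, but verify!  ", "trust-but-verify"),
    ("AI 101", "ai-101"),
    ("", ""),
]

def run_evals(fn, cases):
    """Run fn on every case and collect all failures instead of stopping at the first."""
    failures = []
    for given, expected in cases:
        actual = fn(given)
        if actual != expected:
            failures.append((given, expected, actual))
    pass_rate = (len(cases) - len(failures)) / len(cases)
    return pass_rate, failures

pass_rate, failures = run_evals(slugify, EVAL_CASES)
print(f"pass rate: {pass_rate:.0%}")
for given, expected, actual in failures:
    print(f"FAILED {given!r}: expected {expected!r}, got {actual!r}")
```

The point of keeping the expected outputs human-written is that the check stays independent of the tool that produced the code. Logging each input alongside the actual output, as the failure report does here, is a small version of the input and output monitoring mentioned above.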
And if there's a problem that they're stuck on, like try to create a solution for themselves instead of seeing what exists out there. Because I think we're also at the point where there are no good norms around AI systems. And we're at such a disruptive point where we can like reimagine flows, reimagine UI. Even if you think about AI coding tools right now, it's just this little text box that you type into and then code shows up on a couple of different panes and then you can modify.
But I think that the field is so ripe in terms of reimagining what that would look like. Or, you know, even moving beyond this chatbot experience that so many websites are using, like what could the next thing look like? And I think we still don't know. What are you looking forward to in the future? I am really excited about where we're going with robotics and like RL systems getting more sophisticated and getting to the point where all of us might have at-home robots to take over mundane tasks.
Sounds great. Yeah. I don't know if we necessarily need humanoid robots. Like if you think about it, it doesn't quite make sense to have a full humanoid folding laundry. Like I can imagine there being a better system to do that because if you have a humanoid, you have to coordinate all the limbs. Like there are very fine-grained motions and movements that we need to mimic with our hands. So I'm excited to see what people come up with in the robotics space for mundane day to day chores, because I think that would free up so much time and energy.
And I'm also really excited about developments in quantum computing. I know you almost talked about putting, you know, quantum computing clusters in space and stuff like that, which kind of makes sense because of the cooling requirements and things like that. If you think about traditional computing, we have bits that store and process information. So bits are zeros or ones. But if you think about quantum computing, you can think about a bit being in two states at once.
So imagine being able to be both zero and one at the same time. And without getting too technical about it, this just enables us to store and compute and simulate things a lot faster than we would be able to with just plain old bits. And there are a lot of nuances, though, around quantum systems and how they need to be free of interference. And that essentially means that we need to be as close to absolute zero as possible, which is very hard to achieve in practical conditions.
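For readers who want to see the "two states at once" idea in miniature, here is a small sketch that simulates a single quantum bit on an ordinary computer. It is an illustration only, under simplifying assumptions: real amplitudes instead of complex ones, one qubit, and helper names (`hadamard`, `probabilities`, `measure`) invented for this example. It shows superposition and measurement, not the speedups discussed above.

```python
import math
import random

# A single qubit is described by two amplitudes (a, b) for the states 0 and 1.
# Real amplitudes are enough for this illustration; in general they are complex.

def hadamard(state):
    """Apply the Hadamard gate, which turns a definite 0 or 1 into an equal superposition."""
    a, b = state
    s = 1 / math.sqrt(2)
    return (s * (a + b), s * (a - b))

def probabilities(state):
    """Return (P(measure 0), P(measure 1)): the squared magnitudes of the amplitudes."""
    a, b = state
    return (abs(a) ** 2, abs(b) ** 2)

def measure(state, rng):
    """Simulate one measurement: the superposition collapses to a plain 0 or 1."""
    p0, _ = probabilities(state)
    return 0 if rng.random() < p0 else 1

zero = (1.0, 0.0)            # behaves like a classical bit: definitely 0
superposed = hadamard(zero)  # equal weight on 0 and 1 until it is measured

rng = random.Random(0)       # fixed seed so the run is repeatable
samples = [measure(superposed, rng) for _ in range(1000)]
print("probabilities:", probabilities(superposed))
print("fraction of 1s measured:", sum(samples) / len(samples))
```

Each measurement still yields an ordinary 0 or 1; the superposition shows up only in the statistics over many runs. That fragility under observation is part of why the interference concerns mentioned above matter so much.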
And that's why it makes sense to have, you know, like quantum computing systems in space where the cooling requirements wouldn't really be as much of a problem. And quantum computing also then just enables a lot more speed of computing beyond what traditional GPUs could ever allow us. So I think it just will unlock a lot more ability. And if we develop model architectures that are able to make use of quantum computing concepts, I think that would be really cool.
I don't know off the top of my head what that would look like, but I'm excited. Amazing. Anything else you want to leave the audience with? Things that they should be excited about in the space, how they can be more actively engaged in learning and using AI? I'd say don't limit yourself and never think that you have to be technical to use AI. I have so many friends, doesn't matter what they're doing, lawyers, marketing experts, even doctors that have been able to 10x their abilities with AI.
And all it takes is a willingness, you know, sometimes just to play around with GPT and ask questions. And you'd be shocked at what you can build and what it can help you with without even going super technical and writing any code at all. So I'd say just be willing to explore and try things out. And anything you want to promote, talk about in the last couple of minutes. Yeah, I would say that I am just so excited about creating art with AI now.
And there's just so much that we are unlocking in that space. So creatives have traditionally been quite wary of AI because they don't want to offload their artistic integrity to a machine and have it produce sloppy content that just goes toward the norm of what we're used to in the world and doesn't really think about things that could be truly creative. But I'm really excited about training models to essentially replicate the soul of an artist and see like, hey, can we capture the essence of what Pablo Neruda might create and use that to generate more art in his artistic voice?
And I think this goes across modalities from audio to, you know, written text to video. And so that's something that I'm really excited about and have been dabbling with in my spare time. It sounds awesome. Where can people find you? Keep in touch with you. Yeah, I do need to create a Twitter, but I haven't. But other than that, I do use LinkedIn a lot. So Aditri Pagirath is my name and you'll be able to find me, very uncommon first and last name.
And also through email at aditribe at gmail.com. And we'll post the links to that. Awesome. We'll include that. Well, thank you so much for being here and for breaking down some of the common and important things that we need to know in AI. It's been such a pleasure getting to share your story and your vision with our audience. Please stay in touch with everything that she's working on and we'll see you on the next episode.
Thank you so much. Thank you so much, Melanie. Thank you.