Posts by Tags

aws

Sharing AWS data with collaborators

I’ve recently become our lab’s AWS sysadmin, and my first task is to give our collaborators access to some of our data. In this post, I’ll briefly go over how I set that up and explain the different options that our collaborator has to access the data. ... Read more

coding

The 64 bus

This morning, my roommates and I were discussing our bus-taking strategies: it was around 9 am, and one of them was about to go catch the 64 bus going to Kendall/MIT whereas the other one was planning to wait for the next bus, which goes to University Park. This got us talking about which route was faster: the Kendall/MIT route, which gets you closer to campus but seems to take a longer route there, or University Park, which drops you off farther from campus but gets there more directly. I had actually meant to look into this question in my previous commute blog post, so I felt this was a great opportunity to do so! ... Read more

Commuting from Allston

About a year ago, I moved from my lovely Beacon Hill apartment (300 yards from the subway) to a house full of my friends (a 20-25 minute walk from the nearest subway stop). I’m super happy in my new house (we have chickens!) and it was totally the right decision, but at the time my new commute felt daunting - and many of my friends told me I’d regret giving up the convenience of my amazing Beacon Hill location. So, I did what any aspiring data scientist would do and started gathering data to prove them wrong. (See a theme in my data collection posts yet? XD) ... Read more

Hinge online dating experiment

A couple of months ago, I was having dinner with a friend who was trying to convince me to start online dating - he’s a hopeless romantic, and perhaps the only person on this earth who genuinely enjoys it. I really dislike online dating for many reasons and we’d had this conversation many times before, so I wasn’t interested in his arguments. But as he was telling me about the new app he was using, an idea started to form… Because of the way the app is set up, I realized I could test one of my longtime hypotheses, and in the process get some much-needed validation for why online dating sucks and definitively win our debate about whether or not I should sign up. ... Read more

Gendered experiences at a male-dominated conference

Last week, I attended a workshop focused on developing software for a popular bioinformatics platform in my field, which is a space that is much more skewed toward men than I’m used to (as a bio*engineer, I’ve been mostly spared from situations with extreme gender imbalance). It was an interesting experience, and overall incredibly positive. However, we live in an imperfect world and I had an interesting gendered experience that I want to reflect on here. ... Read more

Developing a qiime2 plugin for non-developers

As a side project from the meta-analysis, we developed a method to correct for batch effects in microbiome case-control studies. When we posted the preprint on biorxiv, Greg Caporaso emailed Sean and asked him if he’d like to put our method into qiime2. I happily volunteered - I’d heard a presentation about qiime2 and was super pumped about their plugin setup, where anyone can incorporate their method into qiime’s suite of tools, and I was excited to see how doable it was. The learning curve was a little steep at first, but not as bad as I expected! Here, I’ve cleaned up my notes into a guide through my development process. I hope this is helpful to others like me, who aren’t trained computer scientists/developers, but who are keen and able to learn the programming stuff to make their tools more useful to more people. ... Read more

Slopegraphs in python

Slopegraphs are always described as having been introduced in this Edward Tufte post, though this page is my top Google hit for “slopegraph.” I’m not sure if the kind of plot I’m talking about is technically a slopegraph, but in my academic circles that’s usually the term we end up settling on after a conversation that almost always sounds like, “you know, those plots which are kind of like boxplots except the paired points are connected with lines.” ... Read more

Blogging with jupyter notebooks and jekyll

One of the last parts before my full-fledged transition to github pages from wordpress was figuring out how to post nicely formatted jupyter notebooks. This was actually the reason I wanted to switch in the first place, but it turns out it wasn’t as straightforward as I’d hoped! I think I’ve found an acceptable, though imperfect, way to do this: here’s the general process I’ve settled on. ... Read more

Sharing AWS data with collaborators

I’ve recently become our lab’s AWS sysadmin, and my first task is to give our collaborators access to some of our data. In this post, I’ll briefly go over how I set that up and explain the different options that our collaborator has to access the data. ... Read more

Scatter plotting in python

In the past year or so, I’ve become a full-fledged tidy data convert. I use pandas and seaborn for almost everything that I do, and any time I figure out a new cool groupby trick I feel like I’ve PhD-leveled up. ... Read more

conda

conflict-management

data

A year in the work life of a computational PhD student

I started tracking the time I spend on various “work-related” activities near the beginning of my third year of grad school. When I started, I hadn’t yet discovered the magic of tidy data so I kept putting off analyzing the data. Now that it’s been over a year since I converted to tidy data, it’s time to dig in and see how I really use my time! ... Read more

data-science

The 64 bus

This morning, my roommates and I were discussing our bus-taking strategies: it was around 9 am, and one of them was about to go catch the 64 bus going to Kendall/MIT whereas the other one was planning to wait for the next bus, which goes to University Park. This got us talking about which route was faster: the Kendall/MIT route, which gets you closer to campus but seems to take a longer route there, or University Park, which drops you off farther from campus but gets there more directly. I had actually meant to look into this question in my previous commute blog post, so I felt this was a great opportunity to do so! ... Read more

Commuting from Allston

About a year ago, I moved from my lovely Beacon Hill apartment (300 yards from the subway) to a house full of my friends (a 20-25 minute walk from the nearest subway stop). I’m super happy in my new house (we have chickens!) and it was totally the right decision, but at the time my new commute felt daunting - and many of my friends told me I’d regret giving up the convenience of my amazing Beacon Hill location. So, I did what any aspiring data scientist would do and started gathering data to prove them wrong. (See a theme in my data collection posts yet? XD) ... Read more

Hinge online dating experiment

A couple of months ago, I was having dinner with a friend who was trying to convince me to start online dating - he’s a hopeless romantic, and perhaps the only person on this earth who genuinely enjoys it. I really dislike online dating for many reasons and we’d had this conversation many times before, so I wasn’t interested in his arguments. But as he was telling me about the new app he was using, an idea started to form… Because of the way the app is set up, I realized I could test one of my longtime hypotheses, and in the process get some much-needed validation for why online dating sucks and definitively win our debate about whether or not I should sign up. ... Read more

Replication non-crises in science

I came across this great blog post again today while doing some literature search for one of my projects. I remember really enjoying this post when I first encountered it, and it was as much of a joy to read the second time around! What I appreciate about this article is that it doesn’t try to refute the contentious claim that “most [biomedical] research findings are false” but instead puts a “yes and…” spin on it. ... Read more

Women in Data Science Conference 2016

Yesterday, I attended the Women in Data Science Conference in Cambridge. I went in hoping to learn more about data science as a field, to identify career opportunities in data science for computational biologists interested in public impact, and to feel inspired by being in a room full of women doing science. I’d say the conference wasn’t well-structured enough (i.e. tied together by a common theme) for the first goal and not varied enough in topics for the second one. That third goal, though - nailed it. ... Read more

Human-centered data

I just read two articles from my DataScienceWeekly email (so good! You should subscribe!) which do a really good job of humanizing data, talking so respectfully about its potential downfalls while also recognizing its tremendous opportunities for impact. ... Read more

diversity

Talk about my science

I had a tense week with the internet a few weeks ago. I posted a tweet calling out a conference I was invited to for only having male speakers: ... Read more

Gendered experiences at a male-dominated conference

Last week, I attended a workshop focused on developing software for a popular bioinformatics platform in my field, which is a space that is much more skewed toward men than I’m used to (as a bio*engineer, I’ve been mostly spared from situations with extreme gender imbalance). It was an interesting experience, and overall incredibly positive. However, we live in an imperfect world and I had an interesting gendered experience that I want to reflect on here. ... Read more

Hitting the diversity wall

Some reactions to a recent Inside Higher Ed article on “Hitting the [Diversity] Wall”. The tl;dr of my thoughts: (1) Yep, the wall is real. Finding other students working to remind ourselves that we’re not alone in this fight is critical. (2) The fight is for transformation beyond “diversity and inclusion” - it’s about transformation of power structures. That’s what makes it hard and inspiring. (3) The strategies we take are important: when do we work with our departments and when do we demand change? ... Read more

Quotas work

In conversations about improving diversity in STEM, I tend to run into “well-meaning” faculty who are resolutely against quotas for fear that they will only exacerbate impostor syndrome and other negative perceptions (e.g. “you only got in because you’re black”). This is such a frustrating position and although I haven’t quite found a time, place, or way to push back against it yet, in my deepest heart of hearts I know it’s fundamentally foolish. ... Read more

Professor superlatives

Our department is hosting an event called “Profscars” (like the Oscars, but for profs). The social chairs organizing this event emailed our entire student body asking for nominations for superlatives for each professor. A friend of mine pointed out that basically all of the women got superlatives based on their clothes/looks or mom status. I took a closer look and felt similarly appalled/taken aback, and then I did what any aspiring data scientist would do and decided to do some stats. ... Read more

Thinx again

Saw this on my Facebook today, which links to this longer article about turmoil at Thinx. https://www.facebook.com/teenvogue/posts/10154553768466312?match=dGVlbiB2b2d1ZSx0aGlueA%3D%3D Some comments: ... Read more

Reflecting mansplaining

Active listening is a hallmark of conflict management. One of the most important parts of active listening is reflecting, which means that you basically say what the other person just said. Sometimes you can also reframe what they said to a positive, future-focused message (e.g. reframing a complaint into what they would want to be the case). Reframing is actually quite difficult to sustain, but reflecting is really easy - and transformative! ... Read more

Women in Data Science Conference 2016

Yesterday, I attended the Women in Data Science Conference in Cambridge. I went in hoping to learn more about data science as a field, to identify career opportunities in data science for computational biologists interested in public impact, and to feel inspired by being in a room full of women doing science. I’d say the conference wasn’t well-structured enough (i.e. tied together by a common theme) for the first goal and not varied enough in topics for the second one. That third goal, though - nailed it. ... Read more

gender

Gendered experiences at a male-dominated conference

Last week, I attended a workshop focused on developing software for a popular bioinformatics platform in my field, which is a space that is much more skewed toward men than I’m used to (as a bio*engineer, I’ve been mostly spared from situations with extreme gender imbalance). It was an interesting experience, and overall incredibly positive. However, we live in an imperfect world and I had an interesting gendered experience that I want to reflect on here. ... Read more

grad-school

Professor superlatives

Our department is hosting an event called “Profscars” (like the Oscars, but for profs). The social chairs organizing this event emailed our entire student body asking for nominations for superlatives for each professor. A friend of mine pointed out that basically all of the women got superlatives based on their clothes/looks or mom status. I took a closer look and felt similarly appalled/taken aback, and then I did what any aspiring data scientist would do and decided to do some stats. ... Read more

Impostor syndrome

My thesis proposal is on Tuesday, which of course means that I’ve been thinking a lot about impostor syndrome. The way I process difficult emotions is by talking about them with my friends, and in this process it’s crystallized for me that impostor syndrome comes in so many different flavors, some of which are much harder to address than others. ... Read more

grad-schools

jekyll

Blogging with jupyter notebooks and jekyll

One of the last parts before my full-fledged transition to github pages from wordpress was figuring out how to post nicely formatted jupyter notebooks. This was actually the reason I wanted to switch in the first place, but it turns out it wasn’t as straightforward as I’d hoped! I think I’ve found an acceptable, though imperfect, way to do this: here’s the general process I’ve settled on. ... Read more

microbiome

Developing a qiime2 plugin for non-developers

As a side project from the meta-analysis, we developed a method to correct for batch effects in microbiome case-control studies. When we posted the preprint on biorxiv, Greg Caporaso emailed Sean and asked him if he’d like to put our method into qiime2. I happily volunteered - I’d heard a presentation about qiime2 and was super pumped about their plugin setup, where anyone can incorporate their method into qiime’s suite of tools, and I was excited to see how doable it was. The learning curve was a little steep at first, but not as bad as I expected! Here, I’ve cleaned up my notes into a guide through my development process. I hope this is helpful to others like me, who aren’t trained computer scientists/developers, but who are keen and able to learn the programming stuff to make their tools more useful to more people. ... Read more

ml-health

phd

A year in the work life of a computational PhD student

I started tracking the time I spend on various “work-related” activities near the beginning of my third year of grad school. When I started, I hadn’t yet discovered the magic of tidy data so I kept putting off analyzing the data. Now that it’s been over a year since I converted to tidy data, it’s time to dig in and see how I really use my time! ... Read more

plotting

Slopegraphs in python

Slopegraphs are always described as having been introduced in this Edward Tufte post, though this page is my top Google hit for “slopegraph.” I’m not sure if the kind of plot I’m talking about is technically a slopegraph, but in my academic circles that’s usually the term we end up settling on after a conversation that almost always sounds like, “you know, those plots which are kind of like boxplots except the paired points are connected with lines.” ... Read more

Scatter plotting in python

In the past year or so, I’ve become a full-fledged tidy data convert. I use pandas and seaborn for almost everything that I do, and any time I figure out a new cool groupby trick I feel like I’ve PhD-leveled up. ... Read more

publications

python

A year in the work life of a computational PhD student

I started tracking the time I spend on various “work-related” activities near the beginning of my third year of grad school. When I started, I hadn’t yet discovered the magic of tidy data so I kept putting off analyzing the data. Now that it’s been over a year since I converted to tidy data, it’s time to dig in and see how I really use my time! ... Read more

Developing a qiime2 plugin for non-developers

As a side project from the meta-analysis, we developed a method to correct for batch effects in microbiome case-control studies. When we posted the preprint on biorxiv, Greg Caporaso emailed Sean and asked him if he’d like to put our method into qiime2. I happily volunteered - I’d heard a presentation about qiime2 and was super pumped about their plugin setup, where anyone can incorporate their method into qiime’s suite of tools, and I was excited to see how doable it was. The learning curve was a little steep at first, but not as bad as I expected! Here, I’ve cleaned up my notes into a guide through my development process. I hope this is helpful to others like me, who aren’t trained computer scientists/developers, but who are keen and able to learn the programming stuff to make their tools more useful to more people. ... Read more

Slopegraphs in python

Slopegraphs are always described as having been introduced in this Edward Tufte post, though this page is my top Google hit for “slopegraph.” I’m not sure if the kind of plot I’m talking about is technically a slopegraph, but in my academic circles that’s usually the term we end up settling on after a conversation that almost always sounds like, “you know, those plots which are kind of like boxplots except the paired points are connected with lines.” ... Read more

Scatter plotting in python

In the past year or so, I’ve become a full-fledged tidy data convert. I use pandas and seaborn for almost everything that I do, and any time I figure out a new cool groupby trick I feel like I’ve PhD-leveled up. ... Read more

qiime2

Developing a qiime2 plugin for non-developers

As a side project from the meta-analysis, we developed a method to correct for batch effects in microbiome case-control studies. When we posted the preprint on biorxiv, Greg Caporaso emailed Sean and asked him if he’d like to put our method into qiime2. I happily volunteered - I’d heard a presentation about qiime2 and was super pumped about their plugin setup, where anyone can incorporate their method into qiime’s suite of tools, and I was excited to see how doable it was. The learning curve was a little steep at first, but not as bad as I expected! Here, I’ve cleaned up my notes into a guide through my development process. I hope this is helpful to others like me, who aren’t trained computer scientists/developers, but who are keen and able to learn the programming stuff to make their tools more useful to more people. ... Read more

reproducible-science