CS109b: Advanced Topics in Data Science, Harvard Extension School
A review (the Good)
The good
Variety!
Good solid progression of work
High standard for presentation
TAs who go the extra mile
Comedy and music in lectures
Good tour of strong ML applications
Collaboration between students encouraged
The bad
Keras! C’mon…
Substandard sound and video
Ambiguous instructions regarding collaboration
“Busywork” overload
Poor timing of instructions which created assignment roadblocks.
Intro
I took CSCI E-109B in Spring 2021, without taking the prerequisite subject CSCI E-109A. That’s because I’m impulsive and get my credit card out too quickly when I’m online shopping, even for education.
Technically, I wasn’t allowed to take this without CSCI E-109A. So I had to email the professor to request an exemption.
Hello Professor Mark!
Quick summary: Please don't kick me out of CSCI E-109B because I have not done CSCI E-109A
Full story:
Hi Mark,
Michael Woodburn here, junior doctor from Australia taking classes in HES data science, hoping to write the radiology software of the future.
I was so excited to register for a 'core' subject in the MLA Data Science in early 2021 that I didn't notice that...
A grade of B- or higher in CSCI E-109a [is required]. Students who have not completed CSCI E-109a should contact the instructors before registering.
Oh dear. My bad.
Before I get booted, let me make my case as to why I believe I should be able to take this class.
(ya-da, ya-da, talking about my accomplishments).
I received a response:
“We have a vetting process that is handled by our head TF Chris Gumb, cc-ed here. He will work with you to establish whether your background is sufficient for taking 109b.”
A background check, great. I definitely come up on a few lists. The good news is that they’re all the right lists.
So we were on.
I soon found that this subject was all about work.
A lotta work.
Drilling, almost. The professor said that he had designed the subject so that we would build, build, build neural networks all day long, until it was second nature to us.
I liked that!
It reminded me of a scathing comment from a machine learning engineer I’d been having drinks with. Of the applicants for a machine learning job he was hiring for, he said:
“They do one bootcamp, train a ResNet on MNIST, and call themselves machine learning engineers!”
Well, nobody’s going to say that about me now. Because I’ve also trained a ResNet on cats and dogs.
Was this class hard?
No.
Definitely not compared to Advanced Python for Data Science.
In fact, this class was a fairly smooth ride. I really enjoyed watching the lectures, since the professors were clearly having a good time. It seemed to go by really fast… perhaps because I wasn’t expecting to fail every week (like I did in Advanced Python).
So, here’s all the good.
The good
Variety!
Data is fun when you’re working with the data that you really like.
I like time series. Time on the x-axis, and however many variables you want on the y-axis. It’s elegant, it’s objective, it’s sensible. It tends to pop up a lot.
I don’t like graph theory. No nodes, please. No edges. Graph theory is my potions class.
This class had a great setup. Towards the end, in the lead-up to the final project, you have many options for which data to work with. If none of them interests you, you can propose your own project.
It’s not easy running a class like this. It certainly angles the class towards a collaborative, explorative theme rather than a competitive, ranked schema (i.e. it’s hard to say what your mark really means when everyone has done a different project).
But in the end, it affirms that you do have a say in what direction you take your degree and you do have an opportunity to develop niche skills.
Good solid progression of work
I really appreciated the slow build into advanced ML, starting with Convolutional Neural Networks, and building into Autoencoders, RNNs, LSTMs, and then state-of-the-art stuff like BERT.
As I was going through it, I was thinking this is what you don’t get with a bootcamp.
The course introduced more advanced elements one at a time, with a key exercise for each. I had plenty of time to sleep on the significance of each step up. Since the pace of the course wasn’t frantic, and I wasn’t wasting time doing web builds or continuous integration, I really felt like I was getting the feel of each tool.
Good stuff. I imagine that this course has been moulded over time by the helpful comments of Harvard students on the Q magazine subject evaluations.
High standard for presentation
One thing I loved about this class was that they took Jupyter Notebook seriously. As in, your notebook better look professional, or you will lose marks.
So my notebooks had everything labelled, font size perfect, colours right, comments where they should be. The last hour of each assignment was spent making sure the work was done right, not just ‘finished’.
TAs who go the extra mile
The TA I had was a great guy.
As always, doing this course from an Australian timezone wore me down a little, because I never felt like I was really a part of the live sessions.
I eventually reached out to the TA, Hayden Joy, and he gave me plenty of good advice and encouragement.
I wish that I’d reached out sooner!
Getting a written message every now and again is very sustaining. If it was good enough for people in the 1800s, it’s good enough for me now.
Comedy and music in lectures
Finally, some comedy in the lectures!
Mark Glickman was a legend on the guitar. The only problem is that hearing a live performance through a webcam is like listening to a song on the bus on a Nokia flip-phone.
The song can be fire - but I wouldn’t know.
Someone please get Mark into a recording studio and do a Bo Burnham ‘Outside’ production!
Chris Gumb, clearly a good-natured guy, made me feel great about coming to the lectures.
Pavlos Protopapas, probably my favourite. It’s so refreshing to have a Harvard Professor who just comes out and says ‘Yeah, I’m amazing, that’s why you’re here.’ (not his exact words).
Yeah, that is why I’m here!
Good tour of strong ML applications
When I think of ML, I think of bullshit. Sorry to say that, but it’s true.
The field of ML is full of bullshit. Bullshit research, bullshit projects, bullshit promises. I brace myself for disappointment every time I have to listen to a new ML idea.
Data Science, on the other hand: pure unadulterated nectar of the Gods. When I hear someone say ‘Well, clearly we need to do a thorough exploratory data analysis before we decide on a task’, I practically pass out with satisfaction.
So it’s important that a class shows strong applications of ML, where actual good results can be achieved. This class achieved that.
I’m talking image classification.
I’m talking text classification.
I’m talking… time series classification.
Yeah, love me some classification!
I’ll also tolerate a little image generation if I’m feeling really adventurous.
But please, no prediction of the next note in a piece of music. I’m tone deaf anyway.
Collaboration between students encouraged
Finally, some good collaboration!
I’ve been out here… in fucking Oceania... in a social desert! The government has me under house arrest. Too much time hammering away at my keyboard alone.
Do I dare seek out a homework partner?
Getting close to someone is the first step to getting hurt.
I got over my shyness and tentatively put some feelers out with this post on the class forums:
G'day everyone!
Sorry to sound like a hardarse in the below message but I don't have time to mess around.
LOOKING FOR A HW PARTNER
This is my life for the next 6 weeks: I'm a trainee doctor covering haematology, oncology, and palliative care at night. Those are the three sickest units in the hospital. I don't have direct supervision - only phone calls I can make when I need them. I'm not telling you this because it's cool. I'm telling you this because I want anyone I'm partnering up with to understand that every day I go to work with a huge amount of dread and work under a huge amount of stress.
So I don't want any more stress about HW and projects.
It's actually not hard to achieve that:
Full communication, direct and to the point. I can only work with people who are upfront and assertive, because I just don't have time for either of us to worry about hurt feelings.
Routine (daily) contribution to the HW/project. I want to start on day 1 and go right through. Occasionally I have a day off where I can sit down and do a huge amount of work but I'm not going to rely on that.
High standards. Self-explanatory. We're here to get an A.
If you're also looking for a HW partner who will not fuck around, get in touch with me at miw205@g.harvard.edu
Send me a picture of your grades or some other proof that you're up to it.
“send grades”.
Hmmm, I must have been in a bad, bad mood.
In my defense, I was really stressed because I was around dead bodies a lot of the night.
Fortunately, a true hyperproductive beast of a partner did reach out to me and made the subject a pleasure.
Working side by side, elbow to elbow, yeah that’s the good stuff!
I’ll post more about where the subject could do better when I’m feeling up to it.
Sorry, just a bit of a softie at the moment.