I am going to update this as I progress through the fellowship starting at week 0. The fellowship officially begins on September 14th, 2020. I am in the Data Science program cohort 2020C.
Week 0: The week before the program we needed to ensure that our development space was ready for action on day 1. I had Anaconda on my home machine, but it needed to be updated. I created a virtual environment especially for Insight :). Insight fellows have the opportunity to sign up for AWS resources. I know that I will need this (I ran into this issue during my dissertation when I lacked computational resources for parts of the project), so I signed up for credits fast. If you ever have an opportunity to use AWS, do it! I have met some of the other fellows and I see that many of us have pets and like board games! We communicate frequently on a Slack-like app, which is incredibly convenient.
Week 1: We began the week with introductions, a coaching and development workshop, and serious thought about what project we will move forward with. The expectation is that we will have decided on a scalable data science project by the end of the week. Each day promptly begins at 10am CT (9am MT for me), and ends roughly at 6pm. What I am enjoying about this fellowship is that there is a coaching and development component to prepare us for the job market. We learned that a resume is a marketing tool designed to highlight relevant transferable skills. When I began my career, I obviously did not know this and had some pretty embarrassing resumes. We spoke about giving and receiving feedback in a professional environment. The question I posed to the coaching and development team: what about receiving objectively poor feedback? They shared that we can be selective in what feedback we choose to accept. We do not need to act on every piece of feedback. So, if and when I receive poor feedback again (feedback based on something I cannot change, such as my appearance, my voice, or my hair, rather than something professional), I will 100% reject it. When I have received terrible feedback in the past that I did not ask for, I was deeply hurt. One piece of advice given to all fellows is that we should be specific when asking for feedback, as a way to steer it toward what we are looking for. For instance, if you want feedback specific to your presentation skills, you can say, “I would like feedback related to my presentation abilities in areas of x and y.” If you leave it open-ended, you may receive feedback related to things that are not important to you.
The first day of the program was exhausting. I was not expecting the entire day to be accounted for. Let me repeat: the entire day. The day is filled with a combination of presentations and project deliverables. This is an intense program, but the quality is quite impressive. I will add this caveat: not all of the sessions are winners, in my opinion. But as with life, you can’t win ’em all. On our first day we met the founder of Insight, Jake Klamka. This was particularly impressive to me because I have rarely met the founder of any company before. Jake shared his background and why he founded Insight. He commented on my LinkedIn post when I was accepted as a fellow, and gave a hearty congratulations! Jake immediately struck me as passionate, caring and thoughtful. It is no surprise that every staff member I have interacted with reflects those same values.
The program is focused on doing as a way of learning. We will move and build at an industry pace. There were no classes, but we had several breakout activities where we needed to think quickly on our toes.
My group scraped data from Google Trends, and we had some issues with the setup, lol. Toward the end of our allotted 30 minutes, I presented and sort of winged it, but I recall needing to do that in industry as well.
We learned about what constitutes a data product (the thing we are building in this fellowship). A data product’s primary objective is to use data to facilitate an end goal. This article does a wonderful job of explaining the concept of a data product in detail. Examples of data products: Zillow Zestimate, Fitbit, coupons recommended based upon purchase history, and Instagram suggestions.
After the first two days of the program, our focus has been mainly on project ideation. The project ideas that I thought were excellent are actually not as feasible as I hoped. I have the habit of first determining if the data set exists, then trying to identify a problem and subsequent solution. I have been advised to think in this order:
- User (who will be the user of this product?).
- What problem will this product solve (the solution needs to be actionable).
- There needs to be data available for this product!
So, I can think of an amazing data product only to discover I cannot locate a dataset. Or I can locate a dataset, but not be able to think of an actionable solution. I am hoping that I can remedy this issue soon. Luckily, I am pitching some of my project ideas to other fellows and they are incredibly helpful in offering feedback. Looking forward to week 2 of the program.
Week 2: After the first week of the program, I was beyond exhausted. I have never been so exhausted in my life. I remember being so exhausted I was just staring at the computer screen unable to think anymore, lol. Insight program directors recommend taking a full 24-hour day off on the weekend (no work, not thinking about the project), and let me tell you-it is vital. I needed every bit of a full day off. As I have mentioned before, the program is intense, with sessions scheduled the entire day with few breaks. I had some issues in finding a feasible project to work on in week 1, but at the start of week 2, I finally have something! So, my project will be an image classification project with anime/manga characters! The tool will allow users to upload photo(s) of a character, then the tool will identify the character, and subsequently recommend websites to purchase a figurine. How fun is that!? This is completely out of my comfort zone (I don’t have image recognition experience at all), but I will learn. I can use some of my own figurines and box sets to test the ability of the model!
Here are things that I am thinking about for my project during this week:
- Where is the data coming from?
- What do you want the model to do?
- What information do you need from the user?
- What output will the user receive?
- How will you validate the model?
My first step with my project is to scrape images and place them into a PostgreSQL database. I will need to narrow the scope down because the anime universe is massive. I have been in contact with a very nice former fellow who shared her GitHub code with me and even directed me to the GitHub repo of the code she used as inspiration! This was beyond helpful! My attitude this week is taking things one step at a time. I am relieved to see that other fellows are a bit lost just like I am, so we are all stumbling together.
There was a coaching and development career clarity workshop. What I appreciated about this is that we learned how to articulate what we are looking for in our next working environment. When I was starting my career, I never gave much thought to this. I only discovered what I needed in a workplace after working in several environments (and experiencing what I didn’t want). I think it is important to think about your values and align yourself with the right organization. We have started this week off with several interesting company visits. I enjoy attending the company sessions just to learn about a domain that I would not have been exposed to otherwise. Data science is being utilized in every business space imaginable, with competent data scientists from a wide variety of diverse backgrounds.
This week we are focusing more attention on resumes. We are assigned an advisor to assist us in refining our resumes and giving helpful feedback. It appears that I use vague terms to describe the impact that I have had in past organizations. I now understand that it is best to be specific, and to articulate why the reviewer should care. As in: give the reviewer a sense of the extent or breadth of your project to better put your experience and expertise into context. What was the point of your project? How large was the data set(s)? How much did you improve a process (efficiency, performance, reduced error rate, etc).
We had a social hour last week that I failed at. We used this platform where you can chat through your avatar in a 2D virtual environment. Because I was so tired (mentally and emotionally), by the end of the day all I could focus on was putting glasses on my avatar, changing the shirt color and aimlessly walking around. I wandered to the campfire, crossed the river, found a bookcase, and then stared at the computer screen once I did those things. It reminded me of the old school NES Legend of Zelda, and so I kept thinking I could walk up to another avatar and have a 2D conversation, ha. I told my husband, and he laughed…because it’s funny.
Towards the end of week 2 I was able to build out a very basic version of a product. I have named my product findmyfigurine and chose a VGG16 convolutional neural network to classify the character images. Some of the challenges I have faced include not having enough images to train on, and some of the product’s functionality was not there at the close of week 2. In order to get started on the product, I tested the model on a subset of just two characters: Kaneda Shotaro from Akira and Uzumaki Naruto from Naruto and Naruto Shippuden. I actually ran out of viable images on the second epoch! I have located a large database of images, but some of those images include fanart and cosplay, so I am worried I may not have a streamlined result. I will need to move forward at this point. Next steps are to ensure that I receive an output of the predicted character label and to include the 5 recommended websites where the user can purchase a figurine. This last part is quite necessary and is how my application will distinguish itself from other similar tools. I plan to deploy the application on Streamlit as a way to test the functionality as a user, and my final product will likely be deployed via Flask and AWS. So much work has happened in just a few short days. By the end of week 3 I should have a fully functional product that I can demo!
We have had some “interesting” company visits during week 2. As I have mentioned before, not all of them are going to be winners. I am keeping a tight lip on this one, but there was one company visit in particular that was surprising. This is the part of the fellowship that does not feel as fun. On Thursday I did not sleep, attempting to have something to demo during week 2. I could not just say, “my model failed, but I am working on it”; I needed to show that I tried. I was up all night training the model and testing various images. I tested on flowers, cats, dogs, rock paper scissors and lastly anime characters. I am using Google Colab because my computer continuously dies. I can build simple applications on it, but deep learning is a no. My husband asked if I wanted a new computer, and that is a YES PLEASE. This computer has served me well for a few years, but now I think it’s old hat. This is the part about data science people aren’t so forthcoming with: Iteration. Reiteration. Failed models. No direction and needing to figure out this work sometimes completely on your own. I know that I will have an interesting product in the end, but it will not be as epic as I am imagining in my mind and I will need to make peace with the imperfection. There is no time to continue to “tweak” models and results. At this point, I need to just get it to a point of “good enough” and move on. It has been a long week, and week 3 will likely not be any more fun. But the light at the end of the tunnel quickly approaches. Onwards to week 3. Cheers.
Week 3: The emphasis during week 3 is to have a completed project ready to demo. The staff made it very clear that we needed to have an actual product to demo on Friday, so make it happen. The stress was over 9000. I can say that I do have a project that can be demo’ed although it does not work perfectly. I have made some slight tweaks to the model and chose an InceptionV3 architecture over the VGG16 architecture I used before. This model was chosen because it is far more lightweight and it runs lightning fast. The pretrained model needed to run through 80 epochs in order to get a decent accuracy and loss. I will need far more images than the 12,830 that I started with. The dataset included cropped illustrated faces from the now defunct site moeimouto. This dataset contained illustrations for 172 characters. Resolution was not always good. The illustration styles vary greatly and most of the images were pornographic. I suppose it does not matter since the user will not be seeing what data was fed into the model. But, even with the cropping it was clear what was happening, lol.
I deployed my app on Streamlit and I am now a fan of this deployment tool. I was able to get a sense of what my front end looked like in just a few minutes. I deployed from Google Colab as well. Overall I can say this: I have created a product from start to finish and did it in a VERY short amount of time. Now I know the power of prototyping. Fail fast and then move on. So, at week 3, I am now at peace with the imperfections and looking to improve the performance. But, now I no longer feel the deep sense of dread that I had before when the results were not what I expected them to be. So, is this what being an industry data scientist is about?
As I mentioned before, the next step is to deploy via Flask on AWS and that looks like quite the task. I will stumble through that part somehow, lol. I tried to do some of it last week and could not even with it. There were so many steps and it was not a straightforward process. I imagine maybe I was just sleep deprived. We have spent a great deal of time on our resumes, and I can now see that my resume is far more concise than it once was. Before, I had some jargon on there that made little sense to others. Now I see why most recruiters in interviews always asked, “please help me to understand x”, and it was literally always the same job. It was revealed to me that my resume did not convey what I actually did or what problems I solved. I tested my new resume format out and it has been well received by several people. This particular week we have had several interesting visits from companies that produce tangible products. I am excited about working towards an actual product. Or something I can see in the end. In some data science environments, some of the work never makes it into production.
I will say weeks 2 and 3 were a nightmare and I am really hoping the rest of the fellowship will be slightly more enjoyable. Learning about data science while you are delivering a product is challenging. Let’s see what week 4 is about. Onward.
Week 4: Week 4 was not better than 3. It was actually harder for me than 3. This week, we focused on demo’ing our finished product and having a polished presentation. The target time is 4 minutes. In those 4 minutes we give the motivation for how/why our app is useful, a brief demo of the product, how our model works, validation and a summary. I received some valuable input from different folks, so I began to rework my presentation, add more graphs and charts, and simply make it an enjoyable 4 minutes. On my first pass at my presentation I was told it was very academic in nature. I suppose boring. Ouch. Somewhere around Tuesday, I lost steam. Every time I opened my presentation I seemed to go blank. Stress? Loss of interest? Not sure. But, I do know that creating a succinct 4-minute presentation was incredibly challenging for me and it was like I no longer had the will to continue. The fact I felt so lost was the most frustrating part.
I presented the product on Friday (today), but had to skip some unfinished parts, which I was a bit embarrassed about. I still presented even though I had an uncontrollable nosebleed (dry New Mexico air). I appreciated how supportive everyone was while I showed up on camera with tissue in my nose and bloodshot eyes. During this week, we focused on marketing ourselves further, and submitting our completed resumes to Coaching and Development. I am happy with my resume now. It looks good.
As I am typing this, I am retraining my model and am going to add how I validated my results to my presentation. Basically, my model appears to be overfitting because I simply do not have enough data points for each character. The loss increases with each epoch as well. I did take some time and scrape Google Images for roughly 15 characters, and before data augmentation, the model is performing at 60% on the validation set (unseen data). This actually falls in line with what I have noticed when I feed a new photo into the classifier…it misses quite a bit. Boo.
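When the loss climbs as training continues, one common guard is to keep the weights from the epoch with the lowest validation loss and stop once it has not improved for a while. Below is a minimal, framework-free sketch of that idea. The function name `best_epoch`, the `patience` parameter, and the loss values are all hypothetical and are not numbers from this project.

```python
def best_epoch(val_losses, patience=3):
    """Return (index, loss) of the epoch to keep under simple early stopping.

    Training is treated as stopped once `patience` consecutive epochs
    fail to improve on the best validation loss seen so far.
    """
    best_idx, best_loss = 0, float("inf")
    epochs_without_improvement = 0
    for idx, loss in enumerate(val_losses):
        if loss < best_loss:
            # New best epoch: remember it and reset the counter.
            best_idx, best_loss = idx, loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break  # Stop early; later epochs are never looked at.
    return best_idx, best_loss


# Hypothetical validation losses: improving at first, then rising (overfitting).
history = [1.90, 1.42, 1.31, 1.38, 1.47, 1.60, 1.25]
print(best_epoch(history, patience=3))
```

A larger `patience` tolerates more noisy epochs before giving up, at the cost of more training time.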
So, besides these issues with the app, the concept of the app is kinda cool. I like anime figurines and having them on my desk makes me happy. I showed my husband a photo of figurines from DC and Marvel and we had fun naming all the characters. I am proud I identified Winter Soldier :). Us 90’s kids are grown up so we have money to buy merchandise. I chose to take a hike in Bandelier last weekend instead of working, but this weekend I am ready to finalize my work. I feel as though I am far behind everyone else (I can confirm my insecurity is true, lol), but at this point I am going to focus on doing my best. I need to have a more presentable answer for how this model works, and why accuracy was the best validation metric to use in this case.
There was a point in the week where everything seemed to be due at once: our resume, a company interest form (it was long), and the locations of our Google Slides, GitHub repo and LinkedIn profile. I will be explicit in stating this: those who are flexible about relocating will likely be able to find a company sooner. That is something you want to keep in mind if you are interested in Insight. Not all companies are open to remote employees even though we are in the middle of a pandemic. So the expectation from most companies in our cohort is that the candidate will relocate.
Next week, we will begin the process of getting interview ready, with “Question of the Day” sessions that help us to answer that pesky “Tell me about yourself” or “Walk me through your resume.” So in short, this week was HARD. I had a hard time. I am still here though. Still pushin’ through. May you have a pleasant weekend and stay tuned for week 5.
Week 5: This was an extremely challenging week. I was able to retrain my model, but because I am inexperienced with computer vision I was not sure how to overcome some incredibly common and frustrating issues:
- Extreme overfitting (the more I trained the model, the worse the overfitting became, and the validation accuracy did not improve).
- I realized my classes were imbalanced (I had the same number of classes in the training and validation sets, but not the same number of photos per class).
- And I CANNOT CONFIRM IF I AM DOING ANYTHING CORRECTLY, LOL
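For the class imbalance issue in the list above, one widely used remedy is to weight each class inversely to how often it appears, so that characters with few photos count for more in the loss. The sketch below uses the "balanced" heuristic `n_samples / (n_classes * class_count)`. The helper name `class_weights` and the image counts are hypothetical, chosen only to illustrate the arithmetic.

```python
from collections import Counter


def class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical label list: one character has far more images than the other.
labels = ["naruto"] * 90 + ["kaneda"] * 10
weights = class_weights(labels)
for character, weight in sorted(weights.items()):
    print(f"{character}: {weight:.2f}")
```

With these made-up counts the rarer character gets a weight nine times larger than the common one. Weights like these are usually passed to the training routine rather than applied by hand, and they do not replace collecting more images.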
I gave my demo to Insight staff and was removed from the schedule for week 6 demo’ing my project to companies. The staff felt that my demo was not a good display of my skills. I ran into several unexpected issues that I could not remedy in the time frame the program needed me to. So, I have decided that perhaps, I will continue working on my project in my free time, but I will not demo to companies through Insight, period. I have reached a ‘diminishing returns’ scenario, where I was not making enough progress all while maintaining pace with other responsibilities. And as I have mentioned in another week, the stress level was OVER 9000. Kamehameha.
This week was an incredible blow to my ego and to the work that I put in. I did indeed try, but the feedback evolved, and at the last demo I had only incorporated feedback from the most recent session. I felt humiliated on more than one occasion when staff members gave feedback. Feedback is given in groups, and usually not one on one. So when you suck, everyone will know.
In short, I am planning to focus on packaging myself as a professional and utilizing interview prep. Despite my current project being too challenging for me to complete in 2-3 weeks, I know with more time and experience I can indeed devise a great computer vision project. I just spoke to a more experienced computer vision engineer at Apple, and he mentioned that this was too challenging for a first computer vision project and I should have reduced the classes. I had 173. A good starting project would have been flowers, MNIST, dogs vs. cats or something of the like, to become familiar with CNNs. But, I took a challenging problem and took a full swing at it, which is something I do not regret. But, I have learned some things in hindsight. There are just two more weeks of the program left. I am going to finish as strongly as I can. Despite the setbacks and unexpected turns, I am going to do my best. I have met incredible fellows in this program whom I would consider my friends (hope they feel the same way?, lol), and I have received great advice from the Coaching and Development team on how to create a pretty killer resume. The network of top companies Insight has to offer cannot be beat. I have been exposed to companies I would have never known about otherwise.
But, in all honesty there were aspects of this fellowship I was not expecting. There is very little technical instruction. I thought I was going to be receiving expert instruction so that I was not walking into a project as blindly as I did. I made a lot of noob mistakes. You devise a problem to solve, and it is up to you to bring it to fruition. You can ask your fellow fellows, but most of the time they were just as stressed and lost as yourself. And the expectation is you execute your project correctly, and sell yourself well. Your feedback may conflict as well. One week, you will work on a particular set of feedback, and the next it will change, completely blindsiding you. The feedback you receive may be harsh and it is done in a group setting. As I have mentioned before, if you suck, everyone in the group will know!
I also was unaware that you can be pulled from the demo schedule if the staff has an issue with your demo…like I was. Lastly, do you like Google Docs? Insight uses them A LOT. There are Google Sheets and Docs for everything.
Is this a great program? Yes and no, and it fully depends on where you are in your career. If you are looking to start a career in data science and would like to be considered by top companies, this is a great program. But, be prepared to work and be under pretty intense pressure. If you have experience in data science and can likely get hired by a company, I would recommend looking into other options. I did get some value out of this program, but not the value I was expecting. So, I hope this blog post sets some realistic expectations of the program. Do not walk into this blindly. Understand what you are getting into. Peace out, see you in week 6.
Week 6: The majority of fellows began their demos for companies. Some fellows received callbacks for interviews this week! Pretty exciting. Sadly, I cannot partake in this joy, so I am feeling a bit displaced (see week 5 for an explanation). In the Coaching and Development workshop, we discussed how to develop our interview strategy. One piece of advice that I am a firm believer in is to conduct mock interviews. It was during a recorded mock interview in 2014 that I discovered I made faces when I was stumped. I had no idea I was doing that! So since then, I practice when possible. It also helps me with anxiety before interviews.
We had a diversity summit towards the end of the week to chat about tech diversity issues. As I have mentioned in a post on LinkedIn, there is a serious retention issue regarding people of color within tech. Tech has A LOT OF WORK TO DO.
In addition to company visits, we began studying data science concepts starting in week 5, and the area I would still like to keep working on is SQL. This is a staple in any data scientist’s toolkit, so the sooner proficiency is gained here the better! Some other topics we worked on are: Python, behavioral interview questions, machine learning and statistics/probability. I had a mock interview with our assigned technical advisor and I was given a scenario: your machine learning model alienates 100 customers, but creates additional revenue of $100,000. Is this model worth pursuing? The feedback I received pointed out that I failed to consider the overall revenue for the company/business line. This company may not be able to afford to lose 100 customers. Or, this company earns so much in revenue that $100,000 is not worth pursuing.
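One way to frame the scenario above is as a break-even question: how much would each lost customer have to be worth before the extra $100,000 stops paying for itself? The sketch below is only back-of-the-envelope arithmetic. The function names and the per-customer values are hypothetical, and the only figures taken from the scenario are the 100 customers and the $100,000.

```python
def net_impact(added_revenue, customers_lost, value_per_customer):
    """Added revenue minus the revenue given up with the lost customers."""
    return added_revenue - customers_lost * value_per_customer


def break_even_customer_value(added_revenue, customers_lost):
    """Per-customer value at which the model neither gains nor loses money."""
    return added_revenue / customers_lost


# Figures from the scenario: 100 customers alienated, $100,000 gained.
print(break_even_customer_value(100_000, 100))

# Hypothetical customer values on either side of the break-even point.
print(net_impact(100_000, 100, 250))
print(net_impact(100_000, 100, 4_000))
```

This still leaves out the other half of the feedback: whether $100,000 is meaningful at all depends on the overall revenue of the business line, which the scenario does not give.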
To be honest, this week I took it easier than before. For the past 5 weeks I pushed myself far past my limit. I was so mentally exhausted by Wednesday, that I decided to rest. Next week is the last week of the program.
Week 7: In Coaching and Development we worked on how to properly negotiate an offer. The art in doing this is to state why you are deserving of an increase above the original offer. I plan to use these tips going forward. Here are the key points to remember:
- Express enthusiasm for the offer and specific excitement for the role/opportunity.
- Outline your top skills/experiences that you’re bringing to the company/role.
- Specify the increased base salary that you feel is more in line with what you can contribute.
- Incentivize them to take action. What are you willing to do if they can meet you on the salary? Will you sign the offer right away? Etc.
Before Insight, I failed to view negotiating in this light. I just thought I needed to do it, but in hindsight I was probably pretty dry in my approach. Using the tips above can make a more persuasive case to a company for why you are deserving.
Towards the end of the week, I watched a fellow so effortlessly walk through a case study. A case study is a method in which companies want to understand how a candidate may approach problems, as well as assessing if that is someone they can see themselves working with.
I have been asked case study questions in interviews before, and I made them feel less like a conversation and more like I was being grilled. I really appreciated how comfortable he was in the case study, and he made it seem more like a casual conversation about data science. I plan to do this!
Lastly, this was the last official day of Insight. The program will continue beyond this day, but I plan to only interact in a limited capacity. There are pros and cons to Insight, and the information below is simply my opinion:
The other fellows in the program are amazing
- I have yet to meet a nicer group of people. They are all incredibly smart and pleasant.
Insight has an amazing network
- I have had great exposure to companies that are solving interesting data science problems. It was neat to hear how they are leveraging data.
Some daily sessions are incredibly useful and insightful, while others are not
- There were some sessions that I gained great value from, while others should have been an email.
There is NO technical instruction, and depending on your PD (project director), there may also be little guidance
- I went into this thinking technical instruction was a feature of this program. I was surprised to find I was incorrect. As I have mentioned before, I struggled on my own and still needed to meet very demanding deadlines. If I was going to struggle on my own, I could have done that without committing to a 7 week program. Because this was a remote session, perhaps the program operates far more efficiently in person?
I felt humiliated and condescended to
- I did not find feedback sessions helpful; just incredibly humiliating experiences. I have watched fellows absolutely be destroyed during times in which they were vulnerable and really needed empathetic feedback and guidance. I have left feedback sessions early because I found them uncomfortable (I have been uncomfortable before, this was next level). In order to ensure we were on track, we received constant direct messages or requests to jump on a zoom call. If we admitted we were struggling, we were again spoken down to. This was highly unexpected from Insight. I originally thought the goal of Insight was peer collaboration, but instead it felt more like “helicopter parents” ensuring their children do something.
I was pulled off of the week 6 demo, because it was determined my project was not up to par
- I had no idea that this was an option, nor was it made clear to me this could happen. The day it happened, I was incredibly hurt and angry that my hard work was disregarded.
The sessions are tightly packed onto the schedule
- You will be hard pressed to find time to use the restroom or eat some days. There should be breaks built into the schedule.
Far too many disparate spreadsheets, Google docs, and links
- Instead of information being located in a central place, you will need to keep track of a multitude of resources. I recommended that Insight work on this as soon as possible. It was incredibly confusing.
As I’ve mentioned in a prior week, I would only recommend Insight to those that are looking to get a start in Data Science and have very little experience. This may be an opportunity to be hired by a company. Your chances for being hired improve if you are mobile and able to relocate.
But, keep in mind you will not get technical instruction, and/or technical guidance. There are tight deadlines to meet.
If you have experience and can find a data science role on your own: I would not recommend Insight.
I will say this: Jake Klamka reached out to me personally when he learned that my experience with Insight was less than ideal. The feedback I provided seemed to be taken to heart and hopefully will be incorporated into future sessions. For those that are interested in pursuing Insight, I wish you luck and much success in the program. Now that you are aware of some of the cons with the program, you can proceed more informed than myself. I like to tell people the truth and let them decide!
Thank you to everyone who has kept up with my updates during the past 7 weeks. May you all have a fantastic weekend and a spooky Halloween! I leave you with a Zoom photo of our cohort!