Projects are a great way to learn Data Science. They provide a meaningful, self-guided way to improve your skills and possibly solve real problems for you and others. Projects are also a great way to showcase your skills in a portfolio. Some small projects you can just “do” in a day or two to get familiar with certain skills, libraries or topics, but others are a little larger and require more upfront planning.
I have made the mistake of just starting a project and then losing track of all the little tasks I wanted to accomplish and ideas for solutions I had. In the end I created a lot of extra work for myself by not creating a plan beforehand. To save you the hassle I want to provide three frameworks you can use to plan your project and guide yourself from start to finish. The steps presented for each framework are of course not set in stone and you should adapt them to your specific situation. …
You’ve analyzed a data set and found interesting insights, or you have built a machine learning pipeline, but so far your results just live within your Jupyter notebook. If others want to view your work, they have to read through your notebook and view every output. That’s only ideal in a few cases. It’s time to take your work and showcase it interactively.
This does not have to be hard and you don’t require the help of a front-end developer. You, as a data scientist or someone on the path to becoming one, can deploy and host your project or application.
All you need is streamlit. …
Google Sheets is a common alternative to Microsoft Excel and is especially useful if you wish to work collaboratively on your spreadsheet or share your data. I want to quickly share two ways of reading Google Sheets data into a pandas DataFrame through the read_csv() function without having to save the sheet locally first or use the Google API.
All you need is a Google Sheets file with one or more sheets and of course some data. The file needs to be set to the sharing option which allows everyone with the link to view the data.
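As a rough sketch of how this can look in practice: a sheet that is shared with "anyone with the link" can usually be fetched as CSV through a URL built from the spreadsheet ID, addressing a tab either by its name or by its numeric gid. The two URL patterns, the helper names and the placeholder ID below are assumptions made for illustration, and they may or may not be the two ways the full article describes.

```python
from urllib.parse import quote

BASE = "https://docs.google.com/spreadsheets/d"


def csv_url_by_name(sheet_id: str, sheet_name: str) -> str:
    """Build a CSV URL that addresses a tab by its name (hypothetical helper)."""
    # quote() escapes spaces and other special characters in the tab name
    return f"{BASE}/{sheet_id}/gviz/tq?tqx=out:csv&sheet={quote(sheet_name)}"


def csv_url_by_gid(sheet_id: str, gid: int = 0) -> str:
    """Build a CSV URL that addresses a tab by its numeric gid (hypothetical helper)."""
    return f"{BASE}/{sheet_id}/export?format=csv&gid={gid}"


def read_sheet(url: str):
    """Read the CSV behind the URL into a pandas DataFrame."""
    # Imported here so the URL helpers above work without pandas installed
    import pandas as pd

    return pd.read_csv(url)


# Usage (needs network access and a sheet shared via link);
# "YOUR_SHEET_ID" is a placeholder, not a real spreadsheet:
# df = read_sheet(csv_url_by_name("YOUR_SHEET_ID", "Sheet1"))
# print(df.head())
```

The spreadsheet ID is the long string between /d/ and /edit in the sheet's address, and the gid appears at the end of the address when a tab is selected.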
Music plays a large role in many people’s lives; it’s hard to imagine a day without it. I, like many others, mainly consume it through streaming on Spotify. If you have ever scrolled down to the end of your playlist, you have met Spotify’s recommendations. Once in a while they seem spot on; at other times I wonder whether the recommender system took a wrong turn somewhere. But maybe that’s just me and my diverse taste in music.
If you are satisfied with your recommendations you can still use this article to learn how to use the Spotify API with spotipy and how to successfully go through the authentication process. …
There is a vast and growing number of Data Science resources. It can be hard to find the best ones for you. It may even be hard to find the right “Roadmap for Data Science” or “Top Skills to Learn for Data Science”.
I don’t claim to have the best resources or the correct path to a career in Data Science. What I have is a list of useful resources and if even one of them furthers your learning my goal is accomplished. …
The Data Science application process can seem very daunting, especially if you are just trying to break into the field and don’t come from a “traditional” background like computer science or math. It certainly sometimes does for me.
There are many factors that play into that feeling. One of them certainly is the application process itself. Even if you have heard back from companies and are invited to interviews, tests, etc., you might feel uncomfortable and lack confidence in your acquired skills.
Now, before I go on: this is not an actual guide on how to technically fail an interview or a code challenge. I think we can agree that everyone could find ways to do that without any outside guidance. What I want to do is talk about taking “failures” or “kinda went well” experiences in the application process and turning them into something useful. …
You have probably already created some data visualizations when you learned how to use a library such as matplotlib or plotly. Maybe you even frequently use visualizations for exploratory data analysis (EDA) or dashboards. Or you are just a beginner and still in the process of figuring everything out. Wherever you are in your journey, I want to invite you to look past the code and discover the basic qualities of great data visualizations, so that you can use them as well.
“Most of us need to listen to the music to understand how beautiful it is. But often that’s how we present statistics: we just show the notes, we don’t play the music.” …
Have you ever wanted to use deep learning in a project and show your results in an interactive way? Do you want to start your deep learning journey and quickly see your results as a working application? Let me show you how.
One of the easiest ways to deploy a small data-driven or machine learning app with Python is Streamlit. Streamlit allows you to create web apps with a lot of functionality using just a few lines of code, as I will demonstrate in a minute.
Now that covers the deployment part, but what about the machine or deep learning? If you want to dive into deep learning, fast.ai is a great place to start. Their MOOC, now also their book, and especially the fastai library provide you with everything you need to start your deep learning journey. …