The Data Science Handbook by Carl Shan Henry Wang William Chen & Max Song

Author: Carl Shan, Henry Wang, William Chen & Max Song
Language: eng
Format: epub
Publisher: leanpub.com
Published: 2015-05-03T16:00:00+00:00


My fellow students and I are very fortunate to be in an environment where our only duty is to learn. But there are many data scientists out there who feel like they’re missing some knowledge and are trying hard to fill the gap. My question is with those data scientists in mind: what’s the best way to keep on learning after university?

I noticed that’s a trap that people fall into, thinking, “I’m perpetually feeling unprepared.” It’s a dangerous way of thinking: that until you know X, Y, Z and W, you’re not going to be able to do data science. Once you start learning one thing, you realize there are four other things you need to learn. Then, you try to learn those things, and you realize you don’t have this, this, and this.

You do need some basic foundation in statistics and CS skills, but both statistics and computer science are enormous fields that are also rapidly evolving. So, you need durable concepts. Right now, for people that want to do data science, I highly recommend learning R and Python. But in 10 or 20 years, who knows what the main languages will be?

It’s a mistake to think, “Why am I learning R now? R won’t be used in 20 years.” Well, first of all, R might still be used in 20 years, but even if it isn’t, there’s going to be a need for the thinking that produced R. The people who create the successors to R will probably have grown up using R. So, they’re still going to have that frame of reference.

You want the skills that are language-independent. You need fundamental ways of thinking about uncertainty and communicating those thoughts in a way that is not that dependent on any particular programming language. It’s definitely important to have that kind of foundation, but keep in mind that it’s hopeless for anyone to actually know all the relevant parts of statistics and CS, even for some small portion of data science. It’s not feasible for anyone, but it doesn’t mean that you can’t make useful contributions.

In fact, I think it’s a good idea to continue learning something new every day. The way you can learn something, and really remember it, is by using it in your work. Instead of saying, “I need to study these five books so that I will know enough to become a data scientist,” it should be about getting a basic level and foundation. Then, start immersing yourself in a real, applied problem. You will realize what types of methods you need. Then, go and study the books and papers that are relevant for that. You will understand them so much better because they’re in the context of a problem that you care about.

You have to be energetic and work really hard, but not get discouraged just because you don’t know everything. And just because you don’t know everything, it doesn’t mean you can’t contribute useful things while gradually expanding your understanding and knowledge.


