How to start learning data science - EliteDataScience

There is a plethora of resources for learning data science & analytics (DSA), and many of them come at a great cost. I stumbled on a website called EliteDataScience and thought I would share its incredible, well-thought-out plan for learning DSA. The various guides are organized around your goals, and each is designed to cover a specific (or broad) topic. It was a great way for me to brush up on several topics I had grown rusty on. The best part is that all of these guides are FREE.

Dataviz Color Palettes

Whenever I create an analytical visual or dashboard, I build the entire visual with absolutely no colors first. This is something I picked up from one of Cole Nussbaumer’s podcasts, which I highly recommend to anyone looking to learn or improve their data visualization skills. Once the visual is finished, you can look at it as a whole to determine which values or items need to stand out. But first consider: can you use shading to make them stand out? If colors are appropriate, do they relate to what is being presented?

I came across this website by canva.com and have used it often when I needed eye-appealing color combinations. Bookmark it; I think it could be very handy as you build out your visuals.

Clear Difference: Data Scientist vs Data Analyst

Data Scientist, the true unicorn.
Data Analyst, show me my business.

I often browse my LinkedIn feed to learn how my network is keeping up with their projects, employment status, accomplishments, and other related topics. Something that stood out to me was the high number of connections who are now “data scientists” after 2 - 3 years (or less) in the career path. I always thought this was quite odd considering my expectations of what a data scientist represents within the workforce: a highly seasoned data analyst with experience at all levels of data analytics (data viz, modeling, statistics, business acumen, and a high level of programming).

The Data Scientist Venn Diagram

Here are some bullet points (Source: Glassdoor) to sum up the diagram above:


Data Scientist

  • Master’s or Ph.D. in statistics, mathematics, or computer science

  • Experience using statistical computer languages such as R, Python, SQL, etc.

  • Experience in statistical and data mining techniques, including generalized linear model/regression, random forest, boosting, trees, text mining, social network analysis

  • Experience working with and creating data architectures

  • Knowledge of machine learning techniques such as clustering, decision tree learning, and artificial neural networks

  • Knowledge of advanced statistical techniques and concepts, including regression, properties of distributions, and statistical tests

  • 5-7 years of experience manipulating data sets and building statistical models

  • Experience using web services: Redshift, S3, Spark, DigitalOcean, etc.

  • Experience analyzing data from third-party providers, including Google Analytics, Site Catalyst, Coremetrics, AdWords, Crimson Hexagon, Facebook Insights, etc.

  • Experience with distributed data/computing tools: Map/Reduce, Hadoop, Hive, Spark, Gurobi, MySQL, etc.

  • Experience visualizing/presenting data for stakeholders using: Periscope, Business Objects, D3, ggplot, etc.

Take note of the number of times “experience” is mentioned. Now let’s take a look at a reasonable Data Analyst (Springboard Blog) description:


Data Analyst

  • Degree in mathematics, statistics, or business, with an analytics focus

  • Experience working with languages such as SQL/CQL, R, Python

  • A strong combination of analytical skills, intellectual curiosity, and reporting acumen

  • A solid understanding of data mining techniques, emerging technologies (MapReduce, Spark, large-scale data frameworks, machine learning, neural networks), and a proactive approach, with an ability to manage multiple priorities simultaneously

  • Familiarity with agile development methodology

  • Exceptional facility with Excel and Office

  • Strong written and verbal communication skills


In these job descriptions, the word “experience” appears eight times for a data scientist and only once for a data analyst. So how is it possible that individuals who have only recently started down the path of data science become data scientists within five years? Or even better, that they become data scientists right after graduation (undergrads or grads)? I do see the Ph.D. level as a stepping stone toward becoming a data scientist due to the amount of time, research, and application that is put into earning a Ph.D. However, I cannot say the same for those with an undergraduate degree in CS or a one-year master's degree in a similar topic (unless previous work experience says otherwise).
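The tally of “experience” in the two job descriptions above can be checked with a few lines of Python. This is only a rough sketch: the bullet text is copied from the lists in this post, and the names `DATA_SCIENTIST`, `DATA_ANALYST`, and `count_word` are my own, made up for illustration.

```python
import re

# Bullet text copied from the two job descriptions quoted in this post.
DATA_SCIENTIST = [
    "Master's or Ph.D. in statistics, mathematics, or computer science",
    "Experience using statistical computer languages such as R, Python, SQL, etc.",
    "Experience in statistical and data mining techniques, including generalized "
    "linear model/regression, random forest, boosting, trees, text mining, "
    "social network analysis",
    "Experience working with and creating data architectures",
    "Knowledge of machine learning techniques such as clustering, decision tree "
    "learning, and artificial neural networks",
    "Knowledge of advanced statistical techniques and concepts, including "
    "regression, properties of distributions, and statistical tests",
    "5-7 years of experience manipulating data sets and building statistical models",
    "Experience using web services: Redshift, S3, Spark, DigitalOcean, etc.",
    "Experience analyzing data from third-party providers, including Google "
    "Analytics, Site Catalyst, Coremetrics, AdWords, Crimson Hexagon, "
    "Facebook Insights, etc.",
    "Experience with distributed data/computing tools: Map/Reduce, Hadoop, Hive, "
    "Spark, Gurobi, MySQL, etc.",
    "Experience visualizing/presenting data for stakeholders using: Periscope, "
    "Business Objects, D3, ggplot, etc.",
]

DATA_ANALYST = [
    "Degree in mathematics, statistics, or business, with an analytics focus",
    "Experience working with languages such as SQL/CQL, R, Python",
    "A strong combination of analytical skills, intellectual curiosity, and "
    "reporting acumen",
    "A solid understanding of data mining techniques, emerging technologies "
    "(MapReduce, Spark, large-scale data frameworks, machine learning, neural "
    "networks), and a proactive approach, with an ability to manage multiple "
    "priorities simultaneously",
    "Familiarity with agile development methodology",
    "Exceptional facility with Excel and Office",
    "Strong written and verbal communication skills",
]


def count_word(bullets, word="experience"):
    """Count whole-word, case-insensitive occurrences of `word` across bullets."""
    pattern = re.compile(rf"\b{re.escape(word)}\b", re.IGNORECASE)
    return sum(len(pattern.findall(bullet)) for bullet in bullets)


ds_count = count_word(DATA_SCIENTIST)
da_count = count_word(DATA_ANALYST)
print(f"Data Scientist: {ds_count}, Data Analyst: {da_count}")
```

The `\b` word boundaries mean only the exact word “experience” is counted, so a variant such as “experienced” would not inflate the tally.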

Ryan Thorpe @ Towards Data Science wrote a great post on the topic and here is a snippet:

Data Scientist, the true unicorn.

For those that haven’t heard this, it’s a common description for the role used in the field. A true data scientist possesses these skills:

  1. STRONG — Business acumen

  2. STRONG — Math/Statistics

  3. STRONG — Computer Science/Ability to sling code

The unicorn is someone who’s perfect at all three. This is seldom the case. The most likely scenario is someone who lacks, or is weaker in one of the three.

Data Analyst, show me my business.

Data Analytics is very similar, they possess these skills:

  1. STRONG — Business acumen

  2. MODERATE — Math/Statistics

  3. MODERATE — Computer Science/Ability to sling code

As you can see, Data Science requires a strong skill-set in all three categories. However, both roles require the same skills. The biggest difference is how they apply these skills. Let's clear up the misconception.

What exactly does “acumen” mean in business acumen? The definition according to English Oxford Living Dictionaries: “The ability to make good judgements and take quick decisions.” Business experience is not required to have this ability, but there is certainly a strong correlation between business experience and strong business acumen. It almost seems to me that there is an additional stepping stone prior to even becoming a Data Analyst: a Business Analyst? This is me thinking on the fly, but I think it is worth pondering.

This is purely my opinion, but when I come across a “Data Scientist” on LinkedIn, I am more than likely looking at a Data Analyst (maybe even a Business Analyst) who has aspirations of someday becoming a Data Scientist. There is nothing wrong with this; as a matter of fact, these are my current aspirations. But businesses should become increasingly aware of the differences between the titles and match their expectations accordingly. After all, you do not want to hire several Data Scientists at six figures each and come to find out that the value they deliver is really that of Data Analysts (or Business Intelligence Analysts). This would set up everyone involved for failure and undermine long-term value.

My Simple Data Scientist Path

Details are always changing, but here is a high-level idea. Using the descriptions above, here is my current plan (a living document, of course!) toward becoming a Data Scientist:

  1. Business Analyst (1+ years)

  2. Data Analyst (5-10 years + candidate in either MS Data Science or PhD Data Science)

  3. Data Scientist (MS Data Science or PhD Data Science)

Of course, these high-level variables depend on the specific tasks given within each role, but they do reflect a reasonable ballpark range for the development of a UNICORN (Data Scientist).

Additional Reads on the topic:

Data Analyst vs Data Scientist — What’s the difference?

Career Comparison: Data Analyst vs Data Scientist—who does what?

Difference Between Data Scientist and Data Analyst

Blurred Lines: Data Analyst vs Data Science

"It builds up, like compound interest."

“Read 500 pages every day. That’s how knowledge works. It builds up, like compound interest. All of you can do it, but I guarantee not many of you will do it.”

— Warren Buffett

“I read a lot of things I don’t agree with. I need to understand their point of view.”

— Charles Koch

It was during my internship at Koch Industries that I discovered my recent love for reading. I witnessed for the first time a company culture built on basic economic principles that encourage a free society. Koch Industries identifies this guiding framework as Market-Based Management (MBM). After weeks of continuous self-study on MBM, I became highly impressed with the framework and wanted to know how it was first developed.

I later learned that Charles Koch developed a hobby for reading in high school and eventually used this hobby to learn and apply principles in his professional environment. The numerous concepts Charles would read, learn, and apply became overwhelming for employees to follow, and so the MBM framework was created.

I became inspired not only to learn and apply the principles and concepts of Market-Based Management, but also to read the books that inspired those who contribute to its development (there is now a dedicated internal team for MBM at Koch Industries). This is precisely when (May 2018) I started to read consistently and thoroughly, with every intention of creating value for those I interact with throughout my life. Good profit.