If you are an absolute beginner, DataQuest just might be the best way to learn data science. That’s the conclusion I reached after months of using their online platform.
I have to admit, it’s quite a claim.
There are obvious alternatives: DataCamp (the online market leader), MOOCs offered by major universities, and the conventional tech programs those universities run.
After all, you really aren’t going to rate an average DataCamp certificate holder against a Stanford Ph.D. in Applied Mathematics and Data Engineering. But then again, whether you are a data science enthusiast or an employer, you really don’t have to – the demand is diversified and growing at a massive scale.
It’s easy to get confused about exactly what data science means.
For outsiders, the term encapsulates topics such as artificial intelligence, machine learning, and big data. With every modern philosopher and social scientist focusing on AI, you would do well to guess that this is where all the action is.
If you have been hiding in Costa Rica for the last ten years, you might be surprised to learn that these have been among the top trending topics on the tech aggregation website Hacker News for a couple of years now.
So What Is Data Science, Exactly?
Berkeley’s School of Information describes data scientists as ‘well-rounded, data-driven individuals with high-level technical skills who are capable of building complex quantitative algorithms to organize and synthesize large amounts of information used to answer questions and drive strategy in their organization.’
I’ll offer a more functional definition. Data science involves the capture, processing, analysis, and communication of, you guessed it, data.
This gets a bit tricky, however. Just like the daily tasks and responsibilities of a marketing executive may differ completely from one company to the next, there is a lot of variation here as well. There are significant differences in the roles of data analysts, data scientists, and data engineers, but in real-world scenarios you can expect to see a lot of overlap.
Here is a simplified description of how these roles generally differ.
Data Analyst
A data analyst is someone who collects, processes and performs statistical analyses of data. He or she can translate numbers and data into plain English in order to help organizations and companies understand how to make better business decisions.
Source: careerexplorer.com
Data Scientist
A data scientist is a professional responsible for collecting, analyzing and interpreting extremely large amounts of data. The data scientist role is an offshoot of several traditional technical roles, including mathematician, scientist, statistician and computer professional. This job requires the use of advanced analytics technologies, including machine learning and predictive modeling.
Source: TechTarget.com
Data Engineer
Data engineers implement methods to improve data reliability and quality. They combine raw information (data) from different sources (such as log files, APIs, scraped content, and sensors) to create consistent and machine-readable formats. They also develop and test architectures that enable data extraction and transformation for predictive or prescriptive modeling.
Source: workable.com
The DataQuest Learning Model
From the very beginning, DataQuest takes an intuitive approach to teaching its subject matter. You begin by choosing your learning path, which for most people is the Data Science path. Conveniently enough, the first half of the Data Science path consists of the entire Data Analyst path.
DataQuest famously skips the video-based teaching method adopted by so many MOOCs. And this is where it surpasses most other online courseware for this subject.
Let’s face it, videos are great, but there is an issue with learning a fairly difficult subject through videos, no matter how tightly-knit they are.
The Interface
Paths are broken down into modules, which are further decomposed into courses and missions. A mission typically consists of a few paragraphs of teaching text followed by instructions to complete a related exercise to practice what you’ve just learned.
This reinforcement mechanism does wonders for making you feel that you are making real progress, which is crucial in a subject with a learning curve as steep as that of data science.
Practice exercises are done in a notebook interface embedded in the right panel of the student’s browser window. Students complete the micro-task and click submit, upon which DataQuest’s AWS-hosted app quickly checks whether the answer is correct.
Each module finishes with a challenge: a small project with minimal instructions that requires you to show a bit more initiative than the practice exercises do. These challenges are done in a Jupyter notebook embedded in the browser, which makes sense because the application is widely used in practical data science work.
Why It Works
Let’s say DataQuest was a video-based platform. This is what a typical mission would look like: you watch a 3-5 minute video on a technical subject, e.g. list comprehensions. The video contains a theoretical overview followed by practical implementations of the code.
If you’re watching it on multiple monitors, you’re free to follow along in an IDE set up on an adjacent screen.
However, you will probably need to watch the video multiple times to truly digest its message. Also, you will just be replicating the coding behavior on the screen rather than solving anything of your own. That second part is left for homework assignments and challenges, which students tackle on their own or in groups.
This might work for many, but it certainly isn’t the optimal experience for me – and as far as I know, many others like me.
DataQuest forces you to tackle each new topic with an immediate exercise, which you must complete before moving forward. Thankfully, these exercises aren’t composed of riddles and guesswork. The process for solving these small exercises is available on the same screen, which takes out a lot of the frustration for the student.
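To make this concrete, here is a minimal sketch of the kind of micro-exercise a mission on list comprehensions might pose. The data and variable names are made up for illustration; they are not taken from DataQuest’s actual material.

```python
# Hypothetical exercise data: app ratings on a 5-point scale.
ratings = [4.5, 3.0, 5.0, 2.5, 4.0]

# The loop version a beginner would typically write first.
high_loop = []
for r in ratings:
    if r >= 4.0:
        high_loop.append(r)

# The equivalent list comprehension the exercise asks for.
high = [r for r in ratings if r >= 4.0]

print(high)               # [4.5, 5.0, 4.0]
print(high == high_loop)  # True
```

An automatic checker only has to compare the final value of a variable like high against the expected answer, which is presumably why feedback on such small tasks can be immediate.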
Platforms like DataQuest and DataCamp have been criticized for overt ‘hand-holding’. This practice is seen as having adverse effects on the development of students as well as their viability for real-world jobs. I disagree.
Every student, except the most mathematically inclined, requires some hand-holding at the very beginning of their data science journey. This is particularly important when fundamental concepts pertaining to functional programming, memory management, and database management are being taught.
DataQuest avoids a common mistake that most MOOCs make: skipping that early hand-holding. To be fair, you really cannot blame the instructors here: they have to teach a subject with a finite set of resources (time, primarily) to a diverse population of students. As a course moderator or instructor, you make a set of assumptions about how much a student currently knows and how much they can absorb during one incremental unit of teaching.
Understandably, the prerequisites for these courses are usually an array of ‘must-have, ideally have, should be’ qualifying statements with a lot of gray areas.
DataQuest takes no such chances. It takes in all comers. Anyone with a high school diploma who can use a web browser is pretty much qualified to start their data science journey. How far they get is a whole other story.
A Journey of Learning
The only reason for me to claim DataQuest is perfect would be if I were being paid a shitload of money to say so. Since I am not, I will make it obvious: a lot can still be improved for DataQuest to reach its potential.
Quality Control
The Data Science and Data Engineer paths start off strong but really falter toward the end. The teaching text and instructions are readable but do not meet the quality standards expected of a growing IT company nestled in the heart of the tech Mecca. There is no reason for basic grammatical mistakes to be repeated mission after mission.
To their credit, the staff at DataQuest replied positively to my complaints. A lot of the text at the latter end of the courses has been rewritten or improved, with most grammatical mistakes amended.
Transparent Pricing
Okay, so here’s the deal: DataQuest sucks when it comes to their pricing. No, it isn’t very expensive. Paying $25 a month for the Data Analyst path or $45 a month for the Data Science path shouldn’t be a big deal for a course that can potentially earn you exponentially more in the future.
But it’s tough to take a company seriously when it runs the same ‘limited-time’ offer for the better part of a year. DataQuest has been offering its 50% discount on the annual subscription for at least 10 months now, and advertises it everywhere as a limited-time, must-try-now offer.
Development Inertia
There is a sense of inertia when it comes to DataQuest. Sure, their educational pipeline seems to work and does churn out a fair number of ‘graduates’. I have used the DataQuest platform on and off for almost 9 months, and aside from a few missions here and there, the platform has seen few changes. The poster of the Indian student on the login screen is the same as before, and so is pretty much everything else.
On a visual platform, you want to keep changing things around to keep students on their toes. Perhaps add in a video after a particularly daunting mission.
Syllabus Gaps
There are two major follies that afflict pretty much every online course out there.
The first is breaking the student’s trust by making abrupt leaps in the difficulty curve. You don’t want to spend 3 hours telling the student how to write a function that multiplies two integers and then follow it up with a challenging project that asks them to build a binary search tree from scratch. Although this example is far-fetched, you get the idea.
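To picture the size of that leap, compare the two tasks side by side. This is only an illustrative Python sketch with made-up names, not an exercise from any particular course.

```python
# Task 1: the "3 hours of teaching" end of the curve.
def multiply(a: int, b: int) -> int:
    return a * b


# Task 2: the "from scratch" project, a bare-bones binary search tree.
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Insert key into the tree rooted at root and return the root."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root  # duplicate keys are ignored


def in_order(root):
    """Return the keys in ascending order."""
    if root is None:
        return []
    return in_order(root.left) + [root.key] + in_order(root.right)


root = None
for k in [8, 3, 10, 1, 6]:
    root = insert(root, k)

print(multiply(3, 4))  # 12
print(in_order(root))  # [1, 3, 6, 8, 10]
```

The first task needs nothing beyond function syntax. The second already assumes classes, recursion, and references, none of which the first task prepares a student for.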
DataQuest is infinitely more intuitive than most platforms out there. But it does falter sometimes, particularly during its final laps.
The second, and the one DataQuest can be deemed guilty of, is leaving critical components of any data science skillset, such as Apache Spark, for the tail end of the relevant path, and even then covering them in only a few brief introductory lessons.
The Verdict
Yes, DataQuest is the best online platform out there for everyone who does not have $20,000 to burn on a bootcamp or $80,000 to spend on a legitimate data science degree. In a field where more and more Ph.D.s are joining the ranks, it may seem a bit counterintuitive to make an entry with an online certificate.
But then again, if most IT unicorn founders had the same mentality, they’d be Ph.D.s and not billionaires. I’d reckon most people would prefer to be billionaires who hire Ph.D.s rather than the other way around.
The first few courses in DataQuest are completely free. Give it a try. Who knows, you might find yourself vying for a second career or a skillset that will add greatly to your existing one.