Main insights (with lecture notes) from Course 1 of Machine Learning Engineering for Production (by DeepLearning.AI & Andrew Ng)

For all the hype around machine learning models, they are not useful unless deployed into production to deliver business value.

Andrew Ng and DeepLearning.AI deeply understood this, and created the Machine Learning Engineering for Production (MLOps) specialization to share their practical experiences on productionized ML systems.

In this article, I summarize the lessons so that you can skip the hours of online videos while still being able to glean the key insights.

This summary article covers Course 1 of the 4-course MLOps specialization.


Find out how to create mock and dummy data for your data science projects

We all know that data is important. The problem is that we often do not have it, or do not have enough of it. As we develop data applications or pipelines, we need to test them with data that resembles what might be seen in production.

It is difficult to manually create realistic datasets that are of sufficient volume and variety (e.g. different data types, characteristics). Furthermore, hand-created data is prone to subconscious and systematic biases.

Fortunately, there are free online resources that can generate realistic fake data to test with. Let’s take a look at some of them:

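Alongside such resources, small amounts of mock data can also be produced directly in Python. The snippet below is a minimal sketch using only the standard library, not one of the online tools referred to above. The field names, value lists and the `make_mock_records` helper are hypothetical and invented purely for illustration.

```python
import random

# Hypothetical value pools, made up for this illustration only.
FIRST_NAMES = ["Alice", "Bob", "Chen", "Divya", "Emeka"]
CITIES = ["Singapore", "London", "Lagos", "Toronto"]


def make_mock_records(n, seed=0):
    """Return n fake customer records as a list of dictionaries."""
    rng = random.Random(seed)  # seeded so that the same data is regenerated
    records = []
    for i in range(n):
        records.append({
            "customer_id": i + 1,                       # sequential integer ID
            "name": rng.choice(FIRST_NAMES),            # categorical text
            "city": rng.choice(CITIES),                 # categorical text
            "age": rng.randint(18, 90),                 # integer in [18, 90]
            "monthly_spend": round(rng.uniform(10.0, 500.0), 2),  # float
        })
    return records


if __name__ == "__main__":
    for row in make_mock_records(3):
        print(row)
```

Seeding the generator keeps test runs reproducible, which tends to matter more for pipeline testing than the realism of any single value.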


Automatically review the readability and quality of your Python scripts based on PEP-8 style conventions

Programming is an indispensable part of a data practitioner’s toolkit. While it is easy to create a script that executes basic functions, writing good, readable code at scale requires more work and thought.

Given Python’s popularity in data science, I will be delving into the use of pycodestyle for style guide checking to improve the quality and readability of Python code.

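To give a feel for what a style checker does, here is a minimal sketch that flags two common issues: lines longer than PEP 8's 79-character limit and trailing whitespace. It is not how pycodestyle is implemented. The `check_style` helper is hypothetical, and it only borrows the labels E501 and W291 that pycodestyle uses for these two issues. The real tool is normally run from the command line, for example `pycodestyle your_script.py`.

```python
MAX_LINE_LENGTH = 79  # PEP 8's recommended maximum for lines of code


def check_style(source):
    """Return a list of (line_number, code, message) tuples for a source string."""
    issues = []
    for number, line in enumerate(source.splitlines(), start=1):
        # Flag lines that exceed the recommended length.
        if len(line) > MAX_LINE_LENGTH:
            issues.append((number, "E501", "line too long"))
        # Flag trailing whitespace on non-blank lines only.
        if line.strip() and line != line.rstrip():
            issues.append((number, "W291", "trailing whitespace"))
    return issues


if __name__ == "__main__":
    sample = "x = 1 \ny = 2\n" + "z = " + "1 + " * 30 + "1\n"
    for number, code, message in check_style(sample):
        print(f"line {number}: {code} {message}")
```

A real checker covers far more rules (indentation, blank lines, import layout and so on), so this sketch is only meant to show the general shape of the checks.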

About PEP-8

The pycodestyle checker provides code recommendations based on the PEP-8 style conventions. So what exactly is PEP-8?

PEP stands for Python Enhancement Proposal, and


Key concepts, examples and Python implementation of measuring Optical Character Recognition output quality

Importance of Evaluation Metrics

Great job on successfully generating output from your OCR model! You have done the hard work of labelling and pre-processing the images, setting up and running your neural network, and applying post-processing to the output.

The final step now is to assess how well your model has performed. Even if it gave high confidence scores, we need to measure performance with objective metrics. …
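One widely used objective metric for OCR output is the character error rate (CER): the edit distance between the reference text and the OCR output, divided by the length of the reference. The excerpt above does not name a specific metric, so treat the choice of CER here as an assumption. The sketch below is a plain-Python illustration with hypothetical function names, not the implementation of any particular library.

```python
def levenshtein(reference, hypothesis):
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn `hypothesis` into `reference`."""
    # prev[j] holds the distance between the processed reference prefix
    # and the first j characters of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            curr.append(min(
                prev[j] + 1,         # reference char missing from hypothesis
                curr[j - 1] + 1,     # extra char in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def character_error_rate(reference, hypothesis):
    """CER = edit distance / number of characters in the reference."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Hypothetical ground truth and OCR output with one wrong character.
    print(character_error_rate("abcd", "abxd"))  # 0.25
```

Because the distance is normalized by the reference length, CER can exceed 1.0 when the OCR output contains many extra characters.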


Reviewing the controversial implementation of Video Assistant Referees in English football using Python

Match Highlights




TL;DR

  • Son Heung-Min was most frequently involved in VAR overturn incidents
  • VAR incidents tend to spike in the middle of each half and peak at the end of each half
  • No apparent bias of VAR decisions in favor of the Big 6 teams
  • Amongst teams present in both EPL seasons where VAR was implemented, Brighton had the highest proportion of overturn decisions in their favor (67.9%), while West Bromwich Albion had the lowest (25.0%)
  • Link to GitHub repo of this project


Thoughts and Theory

Keep your neural network alive by understanding the downsides of ReLU

Activation functions are mathematical equations that define how the weighted sum of the input of a neural node is transformed into an output, and they are key parts of an artificial neural network (ANN) architecture.

Activation functions add non-linearity to a neural network, thereby allowing the network to learn complex patterns in the data. …
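To make this concrete, here is a minimal plain-Python sketch of a single neuron with a ReLU activation. It shows the weighted sum being transformed into an output, and illustrates the kind of downside the title hints at: when the weighted sum is negative, both the output and the gradient are zero, so the neuron stops passing any learning signal. The function names and numbers are hypothetical and chosen only for illustration.

```python
def relu(z):
    """ReLU activation: pass positive values through, clamp the rest to 0."""
    return max(0.0, z)


def relu_grad(z):
    """Derivative of ReLU with respect to z (taken as 0 at z == 0)."""
    return 1.0 if z > 0 else 0.0


def neuron(inputs, weights, bias):
    """Return (weighted sum, activation, local gradient) for one neuron."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return z, relu(z), relu_grad(z)


if __name__ == "__main__":
    inputs = [1.0, 2.0]
    weights = [0.5, -0.25]

    # Positive weighted sum: the neuron is active and has a gradient.
    print(neuron(inputs, weights, bias=1.0))   # (1.0, 1.0, 1.0)

    # Negative weighted sum: output and gradient are both zero.
    print(neuron(inputs, weights, bias=-3.0))  # (-3.0, 0.0, 0.0)
```

If the weighted sum stays negative for every training example, the zero gradient means the weights feeding this neuron receive no updates through it.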


Using Python and Flourish to visualize rank and revenue trends of the world’s largest companies

Designed by Vectorarte

Companies rise and fall amid intense and ruthless global competition, so it would be fascinating to visualize the progress of the top global firms over the past few decades.

The Fortune Global 500 is an annual ranking of the top 500 corporations worldwide as measured by revenue, and it serves as a good source of data for visual analysis. I figured it would also be an enriching experience to generate bar chart race animations using both code (Python) and no-code (Flourish) solutions. Let’s get started!


Obtain a Tableau Specialist certification to showcase your data visualization skills and product knowledge

Introduction to Tableau Certifications

The Tableau software is one of the most popular visual analytics platforms on the market. With its focus on business intelligence, Tableau makes it easy for users to explore and manage data, and to quickly discover and share insights. Given how commonly it is used across various industries, securing a Tableau certification will certainly help you differentiate yourself from the crowd.

Front-end users will most likely be using Tableau…


Simple trick to create a dynamic table of contents to allow easy scroll navigation for your readers

Introduction

By now you should be aware that the Medium platform does not allow writers to automatically generate a dynamic hyperlinked table of contents. This is an issue because writers frequently use sections to organize their stories, and what we see, more often than not, are static tables of contents that are just lists of text.

Having a dynamic table of contents significantly improves the user experience for readers, making it easier for them to scroll to sections…


Step-by-step sentiment analysis with NLP (Stanza, NLTK Vader and TextBlob) on COVID-19 vaccine tweets

The COVID-19 pandemic has presented itself as one of the gravest global threats, and is still very much an ongoing menace. In equal measure, we are in the midst of the biggest vaccination campaign in human history. According to , a staggering 68.1 million doses in 56 countries have been administered so far (as of 26 Jan 2021).

While the vaccine has offered renewed hope in the fight against COVID-19, it has also ignited aggressive anti-vaccine movements. It would thus be interesting to gauge the public’s perception of the COVID-19 vaccine with sentiment analysis (in Python) on recent Twitter data.

TL;DR

Kenneth Leung

Data Scientist @ AXA | Pharmacist | Master of Science (Business Analytics)
