Kenneth Leung

2.6K Followers

Published in DataDrivenInvestor

·Jan 2

How to Swap Day and Month of Incorrectly Formatted Excel Dates

A simple Excel trick to switch and flip the day and month parts of dates — The Problem: Working with time series data in Excel can quickly become a nightmare when the date columns are formatted incorrectly. One common problem is that the day, month, and year components are stored in one order but parsed in a different order when the file is opened in Excel. For example, dates…

Excel

3 min read


Published in Towards Data Science

·Dec 27, 2022

Practical Guide to Transfer Learning in TensorFlow for Multiclass Image Classification

Clearly-explained step-by-step tutorial for implementing transfer learning in image classification — Often we do not have access to a wealth of labeled data or computing power to build image classification deep learning models from scratch. Fortunately, transfer learning empowers us to develop robust image classifiers for our specific classification tasks, even if we have limited resources. In this easy-to-follow walkthrough, we…

Transfer Learning

14 min read


Published in Towards Data Science

·Sep 13, 2022

PyMySQL — Connecting Python and SQL for Data Science

Easily access MySQL databases and execute SQL queries in Python — SQL and Python are indispensable tools for data practitioners to work effectively with data. A common use case would be the initial retrieval of data from relational databases using SQL queries, followed by subsequent manipulation and analysis of the data in Python with libraries such as pandas. But did you…

MySQL

6 min read


Published in Towards Data Science

·Aug 24, 2022

Imputation of Missing Data in Tables with DataWig

Implementing Amazon's DataWig in Python to impute missing values in tabular data — Missing values in real-world datasets are a common phenomenon that poses a key challenge for all data practitioners. This issue is even more challenging when the dataset contains heterogeneous data types. In this article, we look at how DataWig can help us perform the imputation of missing values in tabular…

Data Science

8 min read


Published in Geek Culture

·Aug 23, 2022

Quick Primer on Types of Missing Data and Imputation Techniques

Get up to speed with the various data missingness types and methods for imputation — Contents: (1) Types of Missing Data, (2) Imputation Techniques, (3) Python Packages for Imputation. (1) Types of Missing Data: There are three general types of missing data, best explained with examples. (i) Missing completely at random (MCAR): The likelihood of missing values in a feature is unrelated to any other data features (observed or unobserved). …

Data Science

4 min read


Published in Towards Data Science

·Aug 9, 2022

Feature Selection with Simulated Annealing in Python, Clearly Explained

Concept and implementation of the global search algorithm to select the best features for machine learning — Feature selection is vital in machine learning as it boosts computational efficiency and predictive performance by keeping only the most relevant predictors. Beyond popular feature selection classes like filter and wrapper methods, global search methods like simulated annealing are powerful techniques at our disposal.

Simulated Annealing

9 min read


Published in DataDrivenInvestor

·Jun 22, 2022

Real-World Data Science Use Cases in the Insurance Industry

Exploring examples of data science applications across the insurance value chain — The insurance sector is one of the world’s largest industries based on the value of gross premiums, the scale of investments, and its ubiquitous societal role in covering personal and commercial risks. The sheer size of the industry brings a wealth of data and business opportunities, paving the way for…

Data Science

9 min read


Published in Towards Data Science

·Jun 7, 2022

How to Dockerize Machine Learning Applications Built with H2O, MLflow, FastAPI, and Streamlit

An easy-to-follow guide to containerizing multi-service ML applications with Docker — Given Docker’s impressive capabilities of building, shipping, and running machine learning (ML) applications reliably, it is no surprise that its adoption has exploded and continues to surge within the data science field. This article explains how to utilize Docker to containerize a multi-service ML application built with H2O AutoML, MLflow…

Docker

9 min read


Published in Towards Data Science

·May 23, 2022

Key Learning Points from MLOps Specialization — Course 4

Final insights (with lecture notes) from the Machine Learning Engineering for Production (MLOps) Course by DeepLearning.AI & Andrew Ng — Realizing the potential of machine learning (ML) in the real world goes beyond model training. By leveraging the best practices of MLOps, teams can better operationalize and manage the end-to-end lifecycles of ML models in a sustainable manner. In this final article of the 4-part MLOps Specialization series, I summarize…

Data Science

9 min read


Published in Towards Data Science

·Apr 18, 2022

Build an Anomaly Detection Pipeline with Isolation Forest and Kedro

Developing and managing a data science pipeline for detecting fraudulent credit card transactions — Data science promises to generate immense business value across all industries through the compelling capabilities of machine learning. However, a recent report by Gartner revealed that most data science projects fail to progress beyond experimentation despite having sufficient data and intent. To unlock the full potential of data science, machine…

Kedro

13 min read


Data Scientist at BCG | Tech Writer | Pharmacist | linkedin.com/in/kennethleungty
