Kenneth Leung

8.6K Followers

Published in

Towards Data Science

·Pinned

Running Llama 2 on CPU Inference Locally for Document Q&A

Clearly explained guide for running quantized open-source LLM applications on CPUs using Llama 2, C Transformers, GGML, and LangChain. Third-party commercial large language model (LLM) providers like OpenAI’s GPT-4 have democratized LLM use via simple API calls. However, teams may still require self-managed or private deployment for model inference within enterprise perimeters for reasons of data privacy and compliance.

Large Language Models

11 min read


Published in

Towards Data Science

·Aug 19

When AI Goes Astray: High-Profile Machine Learning Mishaps in the Real World

A tour of infamous machine learning blunders and failures that caught the world’s attention — The transformative potential of artificial intelligence (AI) and machine learning has often made headlines in the news, with plenty of reports on its positive impact in diverse fields ranging from healthcare to finance. Yet, no technology is immune to missteps. While the success stories paint a picture of machine learning's…

Artificial Intelligence

6 min read


Published in

Towards Data Science

·Apr 18

arXiv Keyword Extraction and Analysis Pipeline with KeyBERT and Taipy

Build a keyword analysis Python application comprising a frontend user interface and backend pipeline — As the amount of textual data from sources like social media, customer reviews, and online platforms grows exponentially, we must be able to make sense of this unstructured data. Keyword extraction and analysis are powerful natural language processing (NLP) techniques that enable us to achieve that.

Data Science

12 min read


Published in

DataDrivenInvestor

·Jan 2

How to Swap Day and Month of Incorrectly Formatted Excel Dates

A simple Excel trick to swap the day and month parts of dates. The Problem: Working with time series data in Excel can quickly become a nightmare when the date columns are incorrectly formatted. One common problem is that the day, month, and year components are stored in one format but parsed differently when opened in Excel. For example, dates…

Excel

3 min read


Published in

Towards Data Science

·Dec 27, 2022

Practical Guide to Transfer Learning in TensorFlow for Multiclass Image Classification

Clearly explained, step-by-step tutorial for implementing transfer learning in image classification. Often we do not have access to a wealth of labeled data or computing power to build image classification deep learning models from scratch. Fortunately, transfer learning empowers us to develop robust image classifiers for our specific classification tasks, even if we have limited resources. In this easy-to-follow walkthrough, we…

Transfer Learning

14 min read


Published in

Towards Data Science

·Sep 13, 2022

PyMySQL — Connecting Python and SQL for Data Science

Easily access MySQL databases and execute SQL queries in Python — SQL and Python are indispensable tools for data practitioners to work effectively with data. A common use case would be the initial retrieval of data from relational databases using SQL queries, followed by subsequent manipulation and analysis of the data in Python with libraries such as pandas. But did you…

MySQL

6 min read


Published in

Towards Data Science

·Aug 24, 2022

Imputation of Missing Data in Tables with DataWig

Implementing Amazon's DataWig in Python to impute missing values in tabular data. Missing values are common in real-world datasets and pose a key challenge for all data practitioners. This issue is even more challenging when the dataset contains heterogeneous data types. In this article, we look at how DataWig can help us perform the imputation of missing values in tabular…

Data Science

8 min read


Published in

Geek Culture

·Aug 23, 2022

Quick Primer on Types of Missing Data and Imputation Techniques

Get up to speed with the various data missingness types and methods for imputation. Contents: (1) Types of Missing Data; (2) Imputation Techniques; (3) Python Packages for Imputation. (1) Types of Missing Data: There are three general types of missing data, best explained with examples. (i) Missing completely at random (MCAR): The likelihood of missing values in a feature is unrelated to any other data features (observed or unobserved). …

Data Science

4 min read


Published in

Towards Data Science

·Aug 9, 2022

Feature Selection with Simulated Annealing in Python, Clearly Explained

Concept and implementation of the global search algorithm to select the best features for machine learning — Feature selection is vital in machine learning as it boosts computational efficiency and predictive performance by keeping only the most relevant predictors. Beyond popular feature selection classes like filter and wrapper methods, global search methods like simulated annealing are powerful techniques at our disposal.

Simulated Annealing

9 min read


Published in

DataDrivenInvestor

·Jun 22, 2022

Real-World Data Science Use Cases in the Insurance Industry

Exploring examples of data science applications across the insurance value chain — The insurance sector is one of the world’s largest industries based on the value of gross premiums, the scale of investments, and its ubiquitous societal role in covering personal and commercial risks. The sheer size of the industry brings a wealth of data and business opportunities, paving the way for…

Data Science

9 min read

Data Scientist at Boston Consulting Group (BCG) | Tech Writer | 1.5M+ reads on Medium | linkedin.com/in/kennethleungty | github.com/kennethleungty
