Research organization patterns, research process patterns, and my preferences

When I started my PhD, I knew nothing about academia and thus spent a lot of time and effort uncovering its unspoken rules. I wish someone had lent me a hand rather than leaving me to wander in the dark. This painstaking experience has inspired me to help those who come after me, so that they can have smoother sailing on their intellectual journeys. In this post, I will try something similar but more profound.

Use my package TSCV for nested cross-validation

Recently, a reader asked me whether my time series cross-validation package, TSCV, can be used for nested cross-validation. I mulled it over and found that the answer is yes. I planned to share the good news with him, but my reply quickly became lengthy. Therefore, I decided to turn it into a standalone post. In the following, I will explain the concept of nested cross-validation and its advantages, as well as how to use TSCV or any similar package for it. The same content is also hosted on GitHub. If you have any questions, you can ask in either place (preferably in both).
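As a rough illustration of the idea, here is a minimal, standard-library-only sketch of nested cross-validation on a time series. It does not use the TSCV API. The helper names (`walk_forward_splits`, `mse_of_window`, `nested_cv`), the toy moving-average forecaster, and the synthetic series are all assumptions made for this example. The inner loop picks a hyperparameter using only the outer training data, and the outer loop scores that choice on data the selection never saw.

```python
def walk_forward_splits(n_samples, n_splits, gap=0):
    """Yield (train_idx, test_idx) pairs; `gap` samples are dropped before each test block."""
    test_size = n_samples // (n_splits + 1)
    for k in range(1, n_splits + 1):
        test_start = k * test_size
        train_end = test_start - gap
        if train_end <= 0:
            raise ValueError("not enough samples for this number of splits and gap")
        yield list(range(train_end)), list(range(test_start, test_start + test_size))


def mse_of_window(series, train_idx, test_idx, window):
    """Forecast every test point with the mean of the last `window` training values."""
    history = [series[i] for i in train_idx][-window:]
    forecast = sum(history) / len(history)
    return sum((series[i] - forecast) ** 2 for i in test_idx) / len(test_idx)


def nested_cv(series, windows, outer_splits=3, inner_splits=2, gap=1):
    """Return one out-of-sample score per outer split."""
    outer_scores = []
    for outer_train, outer_test in walk_forward_splits(len(series), outer_splits, gap):
        # Inner loop: model selection sees the outer training data only.
        inner_series = [series[i] for i in outer_train]

        def inner_score(window):
            scores = [
                mse_of_window(inner_series, train, test, window)
                for train, test in walk_forward_splits(len(inner_series), inner_splits, gap)
            ]
            return sum(scores) / len(scores)

        best_window = min(windows, key=inner_score)
        # Outer loop: evaluate the selected hyperparameter on untouched data.
        outer_scores.append(mse_of_window(series, outer_train, outer_test, best_window))
    return outer_scores


# Synthetic, purely illustrative series.
series = [float(i % 7) for i in range(40)]
outer_scores = nested_cv(series, windows=[1, 3, 5])
print(outer_scores)
```

With a real package, the two splitter calls would presumably be replaced by that package's cross-validator objects, while the two-level structure stays the same.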

The 100th anniversary of the Moore-Penrose inverse and its role in statistics and machine learning

All men are equal, but not all matrices have inverses. For instance, rectangular matrices do not have inverses, and neither do square matrices without full rank. The matrix rights activists among mathematicians (namely E. H. Moore, 1920; Arne Bjerhammar, 1951; and Roger Penrose, 1955) thus stood up and spoke for these computationally unfavorable matrices. Thanks to their continual efforts, every matrix finally got an inverse, dubbed the Moore-Penrose (pseudo)inverse. These previously unfavorable matrices have since contributed to academia and revolutionized statistics and machine learning. To mark its 100th anniversary, let me talk in this post about the Moore-Penrose inverse and its applications.
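For reference, the standard definition can be stated compactly. The notation below ($A$ for a real $m \times n$ matrix, $A^{+}$ for its pseudoinverse) is chosen for this summary and may differ from the notation in the full post. The pseudoinverse is the unique $n \times m$ matrix satisfying the four Penrose conditions; for complex matrices, the transpose is replaced by the conjugate transpose.

```latex
\begin{aligned}
A A^{+} A &= A, \\
A^{+} A A^{+} &= A^{+}, \\
(A A^{+})^{\top} &= A A^{+}, \\
(A^{+} A)^{\top} &= A^{+} A.
\end{aligned}

% Special case: if A has full column rank, then
A^{+} = (A^{\top} A)^{-1} A^{\top},
% and x = A^{+} b is the least-squares solution of A x = b.
```

When $A$ is square and invertible, these conditions force $A^{+} = A^{-1}$, so the pseudoinverse is a genuine generalization of the ordinary inverse.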

Yet another guide to deploying Plotly Dash on AWS Elastic Beanstalk

In August, I became interested in Amazon Web Services (AWS) and spent some time earning an AWS Cloud Practitioner certificate. To put into practice what I had learned during the training, I asked myself: why not develop a web application? Thus, I decided to create a Plotly Dash dashboard and deploy it on AWS. The service I chose is AWS Elastic Beanstalk. On the Internet, you can find several guides written by amateurs that teach you how to deploy Dash on AWS. However, something is lacking in all of them. Therefore, I, also an amateur, decided to write a guide myself. In the following, I will show you how to achieve this “feat” step by step. To follow this guide, you need to know how to develop a Dash application and what AWS Elastic Beanstalk is.

A walk-through of Hao Huang's solution to the sensitivity conjecture

Earlier this month (July 2019), mathematician Hao Huang posted a proof of the Sensitivity Conjecture, which had troubled mathematicians for 30 years. To people’s surprise, the proof is only two pages long and involves only undergraduate-level math. On the Internet, you can find reports written for the general public about the background story and the interpretation of the Sensitivity Conjecture. Several experts, such as Terence Tao, are also elaborating on it. Here, writing for students and non-experts, I will summarize the key steps in Hao Huang’s proof in an attempt to help them quickly grasp the essentials.

Using the Nash Equilibrium to build a minimalist PvE list for Pokémon GO

Abstract. This post introduces the concept of Nash Equilibrium into the Pokémon GO meta-game, with the aim of building a minimalist PvE list. The idea is to build an all-round team for gym battles or boss raids with few Pokémon. This approach allows players to concentrate their resources on a small team of strong Pokémon instead of a large group of mildly strong Pokémon. To demonstrate the use of the Nash Equilibrium, I use Timeout as the win condition for gym battles and total damage output for boss raids. The resulting minimalist lists are for reference; at the end of the post, I provide the dataset and the code that readers need to build their own minimalist lists.

Why are Elo ratings less efficient for Yugioh Duel Links than for chess?

Yugioh Duel Links is a digital collectible card game (CCG) that can be played on mobile devices. Like many other CCGs, it has an in-game Ladder system, where players compete with each other to prove themselves the best duelist in the world. However, many players complain about the mechanism of this Ladder and suggest replacing it with the Elo system. In this post, I will show you, thanks to cutting-edge research by DeepMind, that the Elo system, or any other system that relies on averaging, is unavoidably inefficient for Duel Links.
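To make the claim concrete, here is a toy sketch of how a single averaged rating behaves when matchups are cyclic. The rock-paper-scissors setup, the deck names, and the K-factor of 32 are assumptions made for this illustration and are not taken from the post or from Duel Links data. Only the logistic expected-score formula is the standard Elo one.

```python
def expected_score(rating_a, rating_b):
    """Elo's predicted probability that A beats B (standard logistic formula)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def update(winner, loser, ratings, k=32.0):
    """Apply one Elo update in place after `winner` beats `loser`."""
    delta = k * (1.0 - expected_score(ratings[winner], ratings[loser]))
    ratings[winner] += delta
    ratings[loser] -= delta


# Hypothetical deck archetypes with a strict cycle: each pair is (winner, loser).
MATCHUPS = [("paper", "rock"), ("scissors", "paper"), ("rock", "scissors")]

ratings = {"rock": 1500.0, "paper": 1500.0, "scissors": 1500.0}
for _ in range(1000):
    for winner, loser in MATCHUPS:
        update(winner, loser, ratings)

# In this toy setup paper always beats rock, so the true win probability is 1.
predicted = expected_score(ratings["paper"], ratings["rock"])
print(ratings)
print(f"Elo's prediction for paper beating rock: {predicted:.2f} (true value: 1.00)")
```

Because every deck wins and loses equally often, the three ratings stay close together, and Elo predicts something near a coin flip for a matchup that is in fact completely one-sided. A single number per player cannot encode a cycle.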

Important inequalities in convex optimization, proofs and intuition

Many talk about data science and machine learning with enthusiasm, but few know about one of the most important building blocks behind them: convex optimization. Indeed, nowadays nearly every data science problem is first transformed into an optimization problem and then solved by standard methods. Convex optimization, albeit basic, is the most important concept in optimization and the starting point of all understanding. If you are an aspiring data scientist, convex optimization is an unavoidable subject that you had better learn sooner rather than later.
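As a pointer to the kind of inequality involved, here are the two most basic characterizations of a convex function. This selection is an assumption; the full post may cover a different or longer list. The first holds for any convex $f$, and the second additionally assumes that $f$ is differentiable.

```latex
% Definition of convexity (Jensen's inequality for two points):
f\bigl(\theta x + (1 - \theta) y\bigr) \le \theta f(x) + (1 - \theta) f(y),
\qquad \forall\, x, y \in \operatorname{dom} f,\ \theta \in [0, 1].

% First-order condition (the tangent plane lies below the graph):
f(y) \ge f(x) + \nabla f(x)^{\top} (y - x),
\qquad \forall\, x, y \in \operatorname{dom} f.
```

The second inequality is what makes convex problems tractable: if $\nabla f(x) = 0$, then $f(y) \ge f(x)$ for every $y$, so any stationary point is a global minimum.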

A simple model to understand free-to-play games

Free-to-play (F2P) and pay-to-play are two different business models. Pay-to-play requires players to make a fixed-amount purchase to play the game. Examples include World of Warcraft, Eve Online and so forth. Free-to-play usually adopts a freemium model and does not require any advance payment to access the game. Instead, players can make in-game micropayments to enhance their gaming experience, which is also how the company makes money. Examples include League of Legends, Dota 2, Hearthstone, Clash Royale, Pokémon GO and so forth. F2P has been a great commercial success, and more and more online games follow this model.
