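Weekly Leetcode Ramblings: I joined a training program in which I write up one Leetcode problem every week. Unless there is a topic I specifically want to summarize on its own, I usually dump everything here.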
Intuition Between Monotonic Stack and Deque
Easy and Almost-Bug-Free Way to Solve Parsing Problems on Leetcode
What it solves for
Parsing problems on Leetcode are usually hard to write, hard to debug, and vary from one situation to the next, which makes them time-consuming. The calculator series is a typical example. If you are looking for a general, easy way to solve this kind of problem (with almost no annoying bugs once you finish, too!), then you should read this.
Basically, this post introduces simple BNF and an easy-to-write recursive descent parsing template for implementing a BNF grammar.
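As a rough illustration of the idea, here is a minimal sketch of a recursive descent parser for a basic calculator. The grammar in the comment, the `Parser` class, and the `calculate` helper are assumptions made for this example, not necessarily the template used in the post. Each grammar rule maps to one method, which is what keeps the code easy to write and debug.

```python
import re

# Assumed BNF for a basic calculator (illustrative only):
#   expr   ::= term (('+' | '-') term)*
#   term   ::= factor (('*' | '/') factor)*
#   factor ::= NUMBER | '(' expr ')' | '-' factor


def tokenize(text):
    """Split the input into integer literals, operators, and parentheses."""
    return re.findall(r"\d+|[-+*/()]", text)


class Parser:
    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    def peek(self):
        """Return the next token without consuming it, or None at the end."""
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        """Consume and return the next token."""
        tok = self.tokens[self.pos]
        self.pos += 1
        return tok

    def expr(self):
        # expr ::= term (('+' | '-') term)*
        value = self.term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            rhs = self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def term(self):
        # term ::= factor (('*' | '/') factor)*
        value = self.factor()
        while self.peek() in ("*", "/"):
            op = self.advance()
            rhs = self.factor()
            # Division truncates toward zero, as in the Leetcode calculator problems.
            value = value * rhs if op == "*" else int(value / rhs)
        return value

    def factor(self):
        # factor ::= NUMBER | '(' expr ')' | '-' factor
        tok = self.advance()
        if tok == "(":
            value = self.expr()
            self.advance()  # consume the closing ')'
            return value
        if tok == "-":
            return -self.factor()
        return int(tok)


def calculate(text):
    return Parser(text).expr()
```

For example, `calculate("2*(5+5*2)/3+(6/2+8)")` evaluates to 21. This sketch assumes well-formed input and does no error reporting.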
Dynamic Programming Over Digits
Introduction
There are many kinds of dynamic programming. DP over digits, as its name suggests, is dynamic programming over the digits of a number. In this post, I will write a general template for this kind of problem.
Problem definition
First of all, we need to figure out what kind of problem it solves. The problem statement usually looks like this:
Given an interval [lower, upper], count the numbers $i$ in it that satisfy $f(i)$.
Here, the condition $f(i)$ usually does not depend on the magnitude of the number, but on its composition, that is, its digits.
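To make the problem shape concrete, here is a minimal sketch of a digit DP, assuming the example condition $f(i)$ = "the digit sum of $i$ equals `target`". The condition, the function names `count_up_to` and `count_in_range`, and the state layout are all choices made for this illustration, not necessarily the post's template. The usual trick is to count over $[0, n]$ and subtract, so that the interval becomes a difference of two prefix counts.

```python
from functools import lru_cache


def count_up_to(n, target):
    """Count integers i in [0, n] whose digit sum equals target."""
    if n < 0:
        return 0
    digits = [int(c) for c in str(n)]

    @lru_cache(maxsize=None)
    def dp(pos, total, tight):
        # pos:   index of the digit being chosen (most significant first)
        # total: digit sum of the prefix chosen so far
        # tight: True while the prefix equals the prefix of n, which caps
        #        the next digit at digits[pos]
        if total > target:
            return 0
        if pos == len(digits):
            return 1 if total == target else 0
        limit = digits[pos] if tight else 9
        return sum(
            dp(pos + 1, total + d, tight and d == limit)
            for d in range(limit + 1)
        )

    return dp(0, 0, True)


def count_in_range(lower, upper, target):
    """Count integers in [lower, upper] whose digit sum equals target."""
    return count_up_to(upper, target) - count_up_to(lower - 1, target)
```

For example, `count_in_range(1, 100, 5)` gives 6 (the numbers 5, 14, 23, 32, 41, 50). Leading zeros are harmless for this particular condition; conditions that care about them would need one more state flag.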
Sample Problems On Leetcode
Linear Regression

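Continuing the previous post on prediction theory and ML, this post aims to understand and derive linear regression from a mathematical point of view.
The main contents are:
- Statistical models in regression problems: from models with input to models without input
- Modeling and evaluating risk (risk function, loss function, etc.)
- Making the optimal prediction
- The linear regression model
- What linear regression is
- Methods for fitting parameters to data
- Maximum likelihood estimation (MLE)
- Empirical risk minimization (ERM)
- How to solve ERM
- Evaluating ERM performance (overfitting, etc.)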
All You Need to Know About Binary Search
Binary search is easy to write, but it often hides annoying bugs, such as a wrong condition in the while statement (which easily leads to an endless loop). Besides, there are variants such as upper_bound and lower_bound search, which make it harder. However, if you stick to one consistent definition of the search interval, things become much easier.
In a nutshell, if you are struggling to write a CORRECT & BUG-FREE binary search, this might be what you need.
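One way to stick to a single definition is to treat the search interval as half-open, `[lo, hi)`, in every variant. The sketch below uses that convention; it is one common convention, assumed here for illustration, and the post may define its interval differently. With it, the loop condition is always `lo < hi`, and the two variants differ only in the comparison operator.

```python
def lower_bound(a, target):
    """Return the first index i with a[i] >= target, or len(a) if there is none."""
    lo, hi = 0, len(a)  # the answer always lies in [lo, hi]
    while lo < hi:
        mid = lo + (hi - lo) // 2
        if a[mid] < target:
            lo = mid + 1  # a[mid] is too small, so the answer is right of mid
        else:
            hi = mid      # a[mid] may be the answer, so keep it in range
    return lo


def upper_bound(a, target):
    """Return the first index i with a[i] > target, or len(a) if there is none."""
    lo, hi = 0, len(a)
    while lo < hi:
        mid = lo + (hi - lo) // 2
        if a[mid] <= target:
            lo = mid + 1
        else:
            hi = mid
    return lo
```

Because `mid` is always strictly less than `hi`, each iteration shrinks the interval, so the loop cannot run forever. For a sorted list `a`, the number of occurrences of `target` is `upper_bound(a, target) - lower_bound(a, target)`.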
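Basics of Prediction Theory 2
The previous post covered how, in general, people model events without input, make predictions, and evaluate them. This post generalizes to modeling, predicting, and evaluating labeled data. Following the logical structure of the previous post, I first describe how to make and evaluate the optimal prediction in the ideal case, where the parameters of the model's true distribution are known, and then extend the analysis to the realistic case, where the parameters are unknown.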
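Basics of Prediction Theory 1
Artificial intelligence is booming today, mainly thanks to the rapid development of machine learning and neural networks in recent years, and all of these advances rest on mathematical theory. Within that mathematics, probability theory and linear algebra matter most. Prediction theory is part of probability and mathematical statistics, and a solid grasp of its foundations is very important for learning and understanding later concepts.
Here I briefly record some of the machine-learning-related prediction theory I have studied. Specifically, this post is about how to model, predict, and evaluate events that have no input and only two possible outcomes. The next post generalizes to labeled data, where each data point consists of an input x and an output label y.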
Kernel-Tricks
Part of the reason why ML algorithms can be so versatile is perhaps the use of kernelization.
The following shows how to apply kernelization to ridge regression and how it can be incorporated into other algorithms.
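As a brief sketch of what kernelized ridge regression looks like, the standard derivation runs as follows. The notation is assumed for this illustration and may differ from the post's: $X$ is the data matrix with one example $x_i$ per row, $y$ is the label vector, and $\lambda > 0$ is the regularization strength.

$$
\hat{w} = \arg\min_w \|Xw - y\|^2 + \lambda \|w\|^2 = (X^\top X + \lambda I)^{-1} X^\top y = X^\top (X X^\top + \lambda I)^{-1} y
$$

Writing $K = X X^\top$, so that $K_{ij} = \langle x_i, x_j \rangle$, and $\alpha = (K + \lambda I)^{-1} y$, the prediction at a new point $x$ is

$$
f(x) = \hat{w}^\top x = \sum_i \alpha_i \langle x_i, x \rangle
$$

Both training and prediction touch the data only through inner products, so each $\langle x_i, x_j \rangle$ can be replaced by a kernel $k(x_i, x_j)$. The same substitution is what lets kernels be used in other algorithms whose solutions depend on the data only through inner products.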
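Building a Hadoop HA Cluster

(Image from https://techvidvan.com/tutorials/hadoop-high-availability/)
This summer's practical training is to build a highly available (HA) Hadoop cluster. Last year, in a summer program at the University of Missouri, I set up Spark and HDFS, but only on three machines on EC2, and without HA.
This time there are a dozen or so lab machines to use, and some new concepts come in, such as standby nodes and ZooKeeper for high availability. Because of the configuration work involved, I also came to understand much more deeply what each item in each configuration file means.
This post records the configuration, the problems I ran into, and possible solutions.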