pydata

Keep Looking, Don't Settle

Data Engineering and Modeling 01: predict defaults with imbalanced data

This is a real question with a sample dataset from the internet. We want to predict defaults from imbalanced data (the default rate is about 0.08%). All the variables are hidden, so we need to explore the variables' attributes to find some relations. Also, since the data is imbalanced, we need to apply some sampling methods to balance it.
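As a minimal sketch of one such sampling method, the snippet below randomly oversamples the minority (default) class until the two classes are balanced. The toy data here is made up for illustration (the post's real data has a far rarer default rate), and the post may use a different resampling scheme:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 1000 rows, 3 features, 10 defaults (label 1).
X = rng.normal(size=(1000, 3))
y = np.zeros(1000, dtype=int)
y[:10] = 1

def oversample_minority(X, y, rng):
    """Randomly duplicate minority-class rows until classes are balanced."""
    minority = np.flatnonzero(y == 1)
    majority = np.flatnonzero(y == 0)
    # Draw (with replacement) enough extra minority rows to match the majority.
    extra = rng.choice(minority, size=len(majority) - len(minority), replace=True)
    idx = np.concatenate([majority, minority, extra])
    rng.shuffle(idx)
    return X[idx], y[idx]

X_bal, y_bal = oversample_minority(X, y, rng)
```

Undersampling the majority class works the same way in reverse; oversampling keeps all of the (scarce) majority information at the cost of duplicated minority rows.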


Build Recurrent Neural Network from Scratch

The previous blog shows how to build a neural network manually from scratch in numpy with matrix/vector multiplication and addition. Although many packages can do this easily and quickly with a few lines of script, it is still a good idea to understand the logic behind them. This part follows a good blog that uses the example of predicting the words in a sentence to explain how to build an RNN manually. An RNN is a little more complicated than the neural network in the previous blog because the current hidden state and output depend on the state at the previous time step, so the backpropagation part is more complicated. I try to give the mathematical details of how to get the gradients recursively through the partial derivatives.
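The recurrence described above can be sketched as a vanilla RNN forward pass in numpy. The dimensions and weights here are hypothetical placeholders, not the post's actual setup; the point is that each hidden state feeds into the next, which is exactly why backpropagation through time must accumulate gradients over earlier steps:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical sizes for illustration: 8-word vocabulary, 4 hidden units.
vocab, hidden = 8, 4
W_xh = rng.normal(scale=0.1, size=(hidden, vocab))   # input -> hidden
W_hh = rng.normal(scale=0.1, size=(hidden, hidden))  # hidden -> hidden (recurrence)
W_hy = rng.normal(scale=0.1, size=(vocab, hidden))   # hidden -> output

def forward(word_indices):
    """Forward pass over a sequence of word indices.
    h_t = tanh(W_xh x_t + W_hh h_{t-1}); output is a softmax over the vocabulary."""
    h = np.zeros(hidden)
    states, outputs = [], []
    for t in word_indices:
        x = np.zeros(vocab)
        x[t] = 1.0                                   # one-hot encode word t
        h = np.tanh(W_xh @ x + W_hh @ h)             # current state depends on previous state
        z = W_hy @ h
        p = np.exp(z - z.max())
        p /= p.sum()                                 # softmax probabilities for the next word
        states.append(h)
        outputs.append(p)
    return states, outputs

states, outputs = forward([2, 5, 1])
```

In the backward pass, the gradient of the loss at step t with respect to W_hh flows through every earlier h via this same recurrence, which is what makes BPTT recursive.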

Build Neural Network from Scratch

Use two examples to show how to build a neural network from scratch: define the activation function on each layer, define the loss function, and calculate the partial derivatives using the chain rule.
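Those three steps can be sketched in a few lines of numpy. This is a minimal hypothetical example (a tiny 2-3-1 sigmoid network trained with squared error; the post's own examples may use different sizes and losses), showing the forward activations, the loss, and the chain-rule backward pass:

```python
import numpy as np

rng = np.random.default_rng(2)

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# Hypothetical toy data: 5 samples, 2 features, binary target.
X = rng.normal(size=(5, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float).reshape(-1, 1)
W1 = rng.normal(size=(2, 3))   # input -> hidden weights
W2 = rng.normal(size=(3, 1))   # hidden -> output weights

def loss_of(W1, W2):
    """Mean squared error of the forward pass."""
    out = sigmoid(sigmoid(X @ W1) @ W2)
    return np.mean((out - y) ** 2)

loss_before = loss_of(W1, W2)
for _ in range(500):
    # forward: activation on each layer
    h = sigmoid(X @ W1)
    out = sigmoid(h @ W2)
    # backward: chain rule, layer by layer (sigmoid' = s * (1 - s))
    d_out = (out - y) * out * (1 - out)     # dL/dz2, up to a constant factor
    d_h = (d_out @ W2.T) * h * (1 - h)      # dL/dz1 via the chain rule
    W2 -= 0.5 * h.T @ d_out                 # gradient-descent updates
    W1 -= 0.5 * X.T @ d_h
loss_after = loss_of(W1, W2)
```

Each layer's gradient is just the next layer's gradient pushed back through that layer's weights and activation derivative, which is all the chain rule amounts to here.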

Introduction to tensorflow -- 07

These are notes I took a while ago while studying convolutional neural networks. I really like an alternative way of learning: instead of starting from understanding what a CNN is, or focusing on the math behind it, or on how to use stochastic gradient descent to minimize the cross-entropy function and solve for the parameters, just use tensorflow directly to build a simple convolutional neural network, observe how it works, and see what the final result is. Once you understand that, go back and work through the theory and the math. Most of the content of this blog comes from the lecture notes of Stanford's convolutional neural network course. I organized it in Chinese, since Chinese is still faster for me to read.