To avoid conflicts, it is better to install TensorFlow/Theano in a Python virtualenv. Another way is to install them in a conda environment. These are notes on how to do that.
Exploratory analysis of Two Sigma Financial Modeling Challenge
Two Sigma provides interesting data: y is a collection of capped and floored time series that converge over time. The explanatory variables are of three types: fundamental, derived, and technical. The data also has a lot of missing values. Together, these make the prediction interesting.
Exploratory analysis of Two Sigma Connect: Rental Listing Inquiries
This is an exploratory analysis of the data in the Kaggle competition Two Sigma Connect: Rental Listing Inquiries. The data itself is very easy to understand. The focus here is on figuring out the relation between the explanatory variables and the dependent variable. Exploring the relation between x and y is very important in building a powerful predictive model. This is step one.
install docker, dl package, ssh access for ubuntu
Install Docker and the deep learning packages (TensorFlow, Theano, Keras, and Caffe) on Docker for a DL class, and set up port forwarding for remote SSH access on a newly installed Ubuntu 16.10.
increase disk space on vmware for ubuntu
Ubuntu was installed on VMware. More space is needed because a lot of data gets downloaded. This is a note on how to increase the disk space: add a new partition, set up a file system, and mount it to a directory.
linear regression in python, Chapter 3 - Regression with Categorical Predictors
This chapter covers linear regression with categorical variables: how to create dummy variables, how to run a regression on categorical variables directly, how to interpret the categorical variables, and what it means when there are interactions, especially interactions with continuous variables. For the original document, please refer to UCLA ATS.
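As a rough illustration of the dummy-variable idea mentioned above, here is a minimal sketch using plain NumPy rather than the code from the chapter itself. The toy data, the variable names, and the choice of the first level as the reference group are all assumptions made for this example.

```python
import numpy as np

# Hypothetical toy data: an outcome y and a three-level categorical predictor.
groups = np.array(["a", "a", "b", "b", "c", "c"])
y = np.array([1.0, 3.0, 4.0, 6.0, 9.0, 11.0])

# np.unique returns the sorted levels; the first one is used as the
# reference group (an assumption of this sketch).
levels = np.unique(groups)

# One 0/1 dummy column for every non-reference level.
dummies = np.column_stack(
    [(groups == level).astype(float) for level in levels[1:]]
)

# Design matrix: intercept column followed by the dummy columns.
X = np.column_stack([np.ones(len(y)), dummies])

# Ordinary least squares fit.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
intercept, effects = coef[0], coef[1:]

# The intercept is the mean of the reference group, and each dummy
# coefficient is the difference between that level's mean and the
# reference group's mean.
print("reference mean:", intercept)
print("differences from reference:", dict(zip(levels[1:], effects)))
```

With only a categorical predictor in the model, the fitted coefficients reduce to group means and differences of group means, which is one way to read what a categorical coefficient means.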
Happy New Year
Happy New Year
linear regression in python, Chapter 2
This chapter covers the diagnostics of linear regression. For the original document, please refer to UCLA ATS.
linear regression in python, Chapter 1
UCLA ATS has a very good introduction to applied statistics, including hands-on projects in R, SAS, and Stata. Here I am trying to provide a Python version of their web book about linear regression. I will try to cover at least the first 3 to 4 chapters, depending on my schedule. Chapter 2 will focus on linear regression diagnostics. In Chapter 1 I will introduce how to run linear regression in Python with statsmodels to get the same results as R or SAS, and how to do data analysis and data visualization in Python. In the future, I will try to introduce machine learning in sklearn and deep learning in Theano and TensorFlow.
linear regression in python, outliers / leverage detect
In this section I will introduce how to detect outliers and high-leverage points in linear regression. I also show graphically how outliers affect the regression fit. More details on detection using Cook's distance, DFFITS, and DFBETA will be in section 2, regression diagnostics.
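To make leverage and Cook's distance a bit more concrete, here is a minimal sketch with plain NumPy, not the code from the post itself. The toy data are made up for illustration, with the last point deliberately placed far from the others in x and off the trend of the rest.

```python
import numpy as np

# Hypothetical toy data: the last observation is far from the rest in x
# and does not follow the roughly y = x trend of the first five points.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 20.0])
y = np.array([1.1, 1.9, 3.2, 3.9, 5.1, 8.0])

# Design matrix with an intercept column.
X = np.column_stack([np.ones_like(x), x])
n, p = X.shape

# Hat matrix H = X (X'X)^-1 X'; its diagonal holds the leverage values.
H = X @ np.linalg.inv(X.T @ X) @ X.T
leverage = np.diag(H)

# Ordinary least squares fit and residuals.
beta = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ beta
mse = resid @ resid / (n - p)

# Cook's distance: D_i = e_i^2 / (p * MSE) * h_i / (1 - h_i)^2
cooks_d = resid**2 / (p * mse) * leverage / (1.0 - leverage) ** 2

for i in range(n):
    print(f"obs {i}: leverage={leverage[i]:.3f}, cooks_d={cooks_d[i]:.3f}")
```

In this toy data set the last observation has both the highest leverage and the largest Cook's distance, even though its raw residual is small, because it pulls the fitted line toward itself.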