pydata

Keep Looking, Don't Settle

2019-05-25 Week 21

Python regular expressions to clean the RMP data.

import re

strtest = """  3602433631519" />                                </td>
                <td> 7 </td>
                <td>< = "> </a></td>
                <td>HRB_HighClaim_Sideline</td>

                <!-- Align Rule condition to variable expression -->
                    <td></td>

                <td>MVEL</td>
                <td>
                        <pre class="code">get(&quot;$BFS.hrb_claims_by_customer_us.n_claim_count&quot;)!\
=empty &amp;&amp; 
get(&quot;$var_001&quot ...
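
One way to approach this kind of cleanup is sketched here under stated assumptions: pull the text of each `<td>` cell with a non-greedy regular expression, strip any nested tags, then decode HTML entities such as `&quot;` and `&amp;` with `html.unescape`. The `sample` string and the `extract_cells` helper are hypothetical stand-ins, not the original RMP export or the post's actual code.

```python
import html
import re

# Hypothetical, shortened stand-in for an HTML fragment like the one above.
sample = """
<td> 7 </td>
<td>HRB_HighClaim_Sideline</td>
<td></td>
<td>MVEL</td>
<td><pre class="code">get(&quot;$a.b&quot;)!=empty &amp;&amp; x</pre></td>
"""

# Non-greedy match between <td> and </td>; DOTALL lets a cell span several lines.
TD_PATTERN = re.compile(r"<td>(.*?)</td>", re.DOTALL)
# Any remaining tag inside a cell, such as <pre class="code">.
TAG_PATTERN = re.compile(r"<[^>]+>")


def extract_cells(fragment):
    """Return the cleaned text of every <td> cell in the fragment."""
    cells = []
    for raw in TD_PATTERN.findall(fragment):
        text = TAG_PATTERN.sub("", raw)  # drop nested tags first
        text = html.unescape(text)       # then decode &quot; and &amp;
        cells.append(text.strip())
    return cells


cells = extract_cells(sample)
print(cells)
```

Regular expressions are brittle on arbitrary HTML (attributes on `<td>`, nested tables), so for anything beyond a quick cleanup a real HTML parser is probably the safer choice.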

2019-05-18 Week 20 -- awk

awk -F, '{OFS="\t";print $3,$4}' mo_orders_weekly_2019-04-27.txt   ==>   awk '{print $4","$5}' mo_orders_weekly_2019-04-27.txt

cat mo_orders_weekly_2019-04-27.txt | cut -d ',' -f3    ==>    cat mo_orders_weekly_2019-04-27.txt | cut  -f3-4

awk -F, '{OFS=",";print $1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$20,$21}' infile.csv > outfile.csv
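
When fields may contain quoted commas, `awk -F,` splits in the wrong place. A minimal Python sketch of the same column selection with the standard `csv` module follows. The `select_columns` helper and the in-memory sample are assumptions for illustration; column numbers are 1-based to match awk.

```python
import csv
import io


def select_columns(src, dst, columns):
    """Copy the given 1-based columns from src to dst, like awk's $1,$2,..."""
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    for row in reader:
        # Missing fields become empty strings, as awk does for absent fields.
        writer.writerow([row[i - 1] if i <= len(row) else "" for i in columns])


# Hypothetical sample; the second field contains a quoted comma.
src = io.StringIO('a,"b,c",d,e\n1,2,3,4\n')
dst = io.StringIO()
select_columns(src, dst, [1, 2, 4])
result = dst.getvalue()
print(result)
```

For real files, open both with `newline=""` and pass the file objects in place of the `io.StringIO` buffers.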

something else to consider ...

2019-04-20 Week 16

Shares from the Internet

  1. code-of-learn-deep-learning-with-pytorch

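  2. Imbalanced data classification through a credit card fraud model: the author summarizes the different sampling methods, models, and outlier detection approaches that various people have used on Kaggle. I have worked on this project before, and it is quite an interesting one.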

  3. How to build a credit card fraud detection model with a Deep Autoencoder: another approach, which uses an autoencoder to compress the data.

  4. The Syncthing Project: open-source software for backing up folders.

  5. PyTorch Chinese documentation