purpose: in sql, try to get first row group by a group variable order by the value of a numeric variable.
for example, the data is like: for each business_month, for each borrower, there are multiple date(fac_auth_eff_dt) and different rating.
What I want: for each borrower, I want to get the rating at the first date.
How:
1: for each borrower, add a column called row_number(like order) by their date [row_number() over (partition by borrower_uen order by fac_auth_eff_dt) rn]
![png](/figures/sql_first_row_by_group_1.png)
2: now for each borrower, row_number = 1, means the earliest fac_auth_eff_dt, row_number = 1 means the second earliest fac_auth_eff_dt and so on.
3: finally pick the data with row_number = 1 to get the first one
![png](/figures/sql_first_row_by_group_2.png)
with testtable as(
select borrower_uen, fac_auth_eff_dt, row_number() over (partition by borrower_uen order by fac_auth_eff_dt) rn, risk_rating as origination_risk_rating
from bslh.bcs_lending_airb_m
where fac_commitment_amt > 1000 and borrower_rrs = 'CM1' and fac_auth_eff_dt is not null
order by borrower_uen
)
select *
from testtable
where rn = 1