Vintage计算


1. 前言

  1. 本文来自Vintage分析表计算过程详解 - 知乎 (zhihu.com)

2. 基础表

还款计划表

/***
    表名:还款计划表
    码值:
        1. 缺失值,必需为空
***/
create table if not exists dm_risk.lilso_plan_table (
    plan_no            STRING comment   "计划号(主键)",
    loan_no            STRING comment   "借据号",
    term_no            STRING comment   "期序",
    due_date           STRING comment   "应还日",
    repay_date         STRING comment   "实还日",
    overdue_days       STRING comment   "逾期天数",
    loan_amt           STRING comment   "借据金额",
    remain_amt         STRING comment   "剩余本金",
    prin_amt           STRING comment   "应还本金",
    intr_amt           STRING comment   "应还利息",
    penalty            STRING comment   "应还罚息",
    act_amt            STRING comment   "实还金额",
    act_prin_amt       STRING comment   "实还本金",
    act_intr_amt       STRING comment   "实还利息",
    act_penalty        STRING comment   "实还罚息",
    repay_sts          STRING comment   "结清状态"
) comment '还款计划表';

月末时点表

  1. SQL思路:基于还款计划表获取还款月份对应月末
create table dm_risk.lilso_month_end_table stored as orc as 
select 
    date_sub(concat(substr(t1.repay_date, 1, 7), '-01'), 1) as month_end    -- 月末
from dm_risk.lilso_plan_table as t1            -- 还款计划表
group by 1;
  1. Python思路:直接构造需要时间段的月末
month_end_list = pd.date_range('2022-07-01', periods=13, freq='M').astype('str')

3. Vintage底表

逾期天数计算

SQL思路:使用还款计划表与月末时点表进行笛卡尔积,计算逾期天数

逾期天数计算逻辑如下, 步骤一:

create table dm_risk.lilso_mob_overduedays_stat as
select a.plan_no, a.term_no, a.due_date, a.repay_date, a.prin_amt, a.act_prin_amt
     , a.loan_no, a.loan_term, a.loan_date, a.loan_amt
     , b.mob_date
     , months_between(mob_date, last_day(loan_date)) as mob
     , case 
          when due_date >= mob_date then 0
          when due_date <  mob_date and repay_sts = '未结清' then datediff(mob_date, due_date)
          when due_date <  mob_date and repay_sts = '已结清' and repay_date >= mob_date then datediff(mob_date, due_date) 
          when due_date <  mob_date and repay_sts = '已结清' and repay_date <  mob_date then datediff(repay_date, due_date) 
      else 0 end as ever_overdue_days
     , case 
          when due_date >= mob_date then 0
          when due_date <  mob_date and repay_sts = '未结清' then datediff(mob_date, due_date)
          when due_date <  mob_date and repay_sts = '已结清' and repay_date >= mob_date then datediff(mob_date, due_date) 
          when due_date <  mob_date and repay_sts = '已结清' and repay_date <  mob_date then 0 
      else 0 end as curr_overdue_days
from dm_risk.lilso_vintage_loan_plan as a  -- 还款计划表
cross join dm_risk.lilso_mob_month_end as b   -- 月末时点
where mob_date <= '2023-10-25';

Python思路:取放款日期小于月末时点的数据(不做笛卡尔积),逾期计算逻辑同上,算逾期天数时,可一并计算剩余本金

剩余本金计算,步骤二

create table if not exists dm_risk.lilso_loan_ever_m1_stat as
select a.loan_no         -- 借据号
      ,a.loan_term       -- 借款期限
      ,a.loan_amt        -- 借款本金
      ,a.loan_date       -- 借款日期
      ,a.mob             -- 账龄
      ,a.mob_date        -- 观察点
      
      ,sum(if(due_date < mob_date, prin_amt, 0)) as due_prin_amt -- 到期应还本金
      ,sum(if(due_date < mob_date and ever_overdue_days > 30, prin_amt, 0)) as ovd_prin_amt_ever  -- 逾期本金

      ,max(if(due_date < mob_date and ever_overdue_days > 30, 1, 0)) as ovd_flag_ever
      ,if(max(if(due_date < mob_date and ever_overdue_days > 30, 1, 0)) = 1 -- 在观察点前,只要任意一期满足m1+标记,则该借据为m1+
         ,loan_amt - sum(if(repay_date <= mob_date, act_prin_amt, 0)) -- 借款本金 - 已还本金
         ,0
       ) as ovd_loan_bal_ever -- 剩余本金
     
from dm_risk.lilso_mob_overduedays_stat as a
group by a.loan_no  
      ,a.loan_term
      ,a.loan_amt 
      ,a.loan_date
      ,a.mob      
      ,a.mob_date;

vintage透视表呈现

本金余额计算口径:放款后在每个自然月的月末,当月总放款额中有多少的本金余额是处于逾期M1+的状态,并观测其变化趋势。

create table if not exists dm_risk.lilso_vintage_m1_plus_stat as 
select 
    substr(loan_date, 1 ,7) as loan_month -- 放款月份,即vintage
    ,mob, loan_term -- MOB、期限(之后作为透视表筛选维度)

    ,count(loan_no) as cnt  -- 放款订单量#
    ,sum(loan_amt) as amt   -- 放款金额量¥

    ,sum(ovd_flag_ever) as m1_plus_cnt      -- 逾期M1+放款订单量#
    ,sum(ovd_loan_bal_ever) as m1_plus_amt  -- 逾期M1+剩余金额量¥

from dm_risk.lilso_loan_ever_m1_stat
where mob_date <= '2020-10-25'
group by substr(loan_date, 1, 7), mob, loan_term;

文章作者: lilso
版权声明: 本博客所有文章除特別声明外,均采用 CC BY 4.0 许可协议。转载请注明来源 lilso !
  目录