后续系统构建方案
管理
后续系统构建方案
最后更新: 2026-06-07
版本: v1.0
基于:瑞美云数据分析体系迁移指南 + 光合盐现有架构审计 + 三层数据架构设计
总体路线图
Phase 1: 基础加固(已完成 ✅)
├── ✅ 每日 SQLite 自动备份(3:00)
├── ✅ 全 Profile Holographic 记忆
├── ✅ Web UI 升级 v0.6.11
├── ✅ 光合盐BI管理手册(9个核心文档)
└── ✅ Dify Plugin Daemon 运行 + 知识库导入(8个文档)
Phase 2: 三层数据仓库建设(当前阶段 ⏳)
├── 2A: PostgreSQL 建库 + ETL + DWD层
├── 2B: DWS + ADS 层(21个物化视图)
├── 2C: 展示层配置(Dify + BI)
└── 2D: 知识库持续补充
Phase 3: 企业级增强(规划中)
├── SQLite → PostgreSQL 迁移评估
├── AI 客服训练管线
├── 操作审计日志
└── 高可用/灾备
Phase 2A: 基础设施 + DWD 明细层(预计 1 天)
任务 2A-1: PostgreSQL 建数据仓库
sql
-- 创建数据仓库数据库
CREATE DATABASE ruimeiyun_dw;
-- 创建 analytics schema(物化视图统一放在此 schema)
\c ruimeiyun_dw
CREATE SCHEMA analytics;
任务 2A-2: MySQL → PostgreSQL ETL 脚本
编写 Python 脚本,将 MySQL ruimeiyun 的 8 张基表同步到 PostgreSQL:
| 目标表 | 来源表 | 同步方式 |
|---|---|---|
stg_出纳结算单 | MySQL.出纳结算单 | 全量/增量 |
stg_到院明细 | MySQL.到院明细 | 全量/增量 |
stg_客户盘点 | MySQL.客户盘点 | 全量/增量 |
stg_回访记录 | MySQL.回访记录 | 全量/增量 |
stg_712执行业绩 | MySQL.712执行业绩 | 全量/增量 |
stg_预约记录 | MySQL.预约记录 | 全量 |
stg_入库记录 | MySQL.入库记录 | 全量 |
stg_白鲸门店日报 | MySQL.白鲸门店日报 | 全量 |
ETL 逻辑(已踩坑经验):
- 字段类型映射(MySQL VARCHAR → PostgreSQL TEXT)
- 日期格式标准化(统一为 TIMESTAMP)
- 空值处理(空字符串 → NULL)
- 增量同步依据:
修改时间字段 - 初次部署全量,后续增量
任务 2A-3: 创建 DWD 层物化视图(2个)
sql
-- DWD 1: 客户到院归一
CREATE MATERIALIZED VIEW analytics.dwd_customer_visit AS
SELECT
vip_num,
DATE(biz_time) AS visit_date,
store_name,
visit_type,
-- 同一客户同一天 = 1次
ROW_NUMBER() OVER (PARTITION BY vip_num, DATE(biz_time) ORDER BY biz_time) AS visit_seq
FROM stg_到院明细;
-- DWD 2: 客户咨询师归属
CREATE MATERIALIZED VIEW analytics.dwd_customer_consultant AS
SELECT
c.vip_num,
c.customer_name,
COALESCE(c.current_consultant, c.consultant_at_order) AS effective_consultant,
-- ...其他字段
FROM stg_客户盘点 c;
任务 2A-4: 每日刷新 crontab
crontab:
0 4 * /home/ubuntu/scripts/etl_mysql_to_pg.sh # ETL同步
30 4 * /home/ubuntu/scripts/refresh_dwd.sh # 刷新DWD
Phase 2B: DWS + KPI + APP 层(预计 1 天)
任务 2B-1: DWS 岗位事实层(4个物化视图)
sql
-- 咨询师日表
CREATE MATERIALIZED VIEW analytics.dws_consultant_daily AS
SELECT
consultant_name,
biz_date,
SUM(actual_money) AS 收款,
SUM(project_performance) AS 开单,
SUM(settlement_amount) AS 划扣,
COUNT(DISTINCT vip_num) AS 服务客户数,
COUNT(*) AS 操作次数
FROM stg_出纳结算单
GROUP BY consultant_name, biz_date;
-- 医生日表
CREATE MATERIALIZED VIEW analytics.dws_doctor_daily AS
SELECT
doctor_name,
biz_date,
SUM(settlement_amount) AS 执行业绩,
COUNT(DISTINCT vip_num) AS 治疗人数,
COUNT(*) AS 治疗次数
FROM stg_712执行业绩
GROUP BY doctor_name, biz_date;
-- 护士日表(仅医美科室)
CREATE MATERIALIZED VIEW analytics.dws_nurse_daily AS
SELECT
deduct_operator AS nurse_name,
biz_date,
SUM(settlement_amount) AS 执行业绩,
COUNT(DISTINCT vip_num) AS 服务人数,
COUNT(*) AS 操作次数
FROM stg_712执行业绩
WHERE dept_l1 IN ('微创中心', '皮肤中心', '外科中心')
GROUP BY deduct_operator, biz_date;
-- 手艺人日表(仅生美科室)
CREATE MATERIALIZED VIEW analytics.dws_craftsman_daily AS
SELECT
deduct_operator AS craftsman_name,
biz_date,
SUM(settlement_amount) AS 执行业绩,
COUNT(DISTINCT vip_num) AS 服务人数,
COUNT(*) AS 操作次数
FROM stg_712执行业绩
WHERE dept_l1 IN ('洗脸', '修丽可', '医萃托兰', '白鲸洗脸')
GROUP BY deduct_operator, biz_date;
任务 2B-2: KPI 管理指标层(9个物化视图)
关键视图实现:
客户分层(kpi_customer_segment):
- RFM 评分(参考各行业周期差异:生美30天、医美60-90天)
- 客户分层(高价值VIP/重点发展/潜力保持/需唤醒/已流失)
医生复购率(kpi_doctor_monthly):
- ⚠️ 此处有已知坑:必须按月限定范围
- 正确:子查询按月 GROUP BY → HAVING COUNT(*) >= 2
回访考核(kpi_recall_per[[f]]ormance):
- 完成率 = 已回访 / 总计划
- 平均延迟天数 = AVG(实际回访日期 - 计划回访日期)
任务 2B-3: APP 查询接口层(6个物化视图)
各视图面向不同用户角色:
| 视图 | 数据来源 | 刷新依赖 |
| app_customer_summary | DWD + KPI层 | DWD→DWS→KPI全刷新后 |
| app_customer_detail | DWD层 | DWD刷新后 |
| app_customer_monthly | DWS层 | DWS刷新后 |
| app_sta[[f]]f_summary | DWS+KPI层 | DWS→KPI刷新后 |
| app_store_dashboard | KPI层 | KPI刷新后 |
| app_consultant_customer_status | DWD+DWS层 | DWD→DWS刷新后 |
任务 2B-4: 刷新脚本
bash
#!/bin/bash
# refresh_all.sh — 按依赖顺序刷新所有物化视图
export PGHOST=localhost
export PGDATABASE=ruimeiyun_dw
export PGUSER=dify
export PGPASSWORD=***
echo "$(date): 刷新 DWD..."
psql -c "REFRESH MATERIALIZED VIEW CONCURRENTLY analytics.dwd_customer_visit;"
psql -c "REFRESH MATERIALIZED VIEW CONCURRENTLY analytics.dwd_customer_consultant;"
echo "$(date): 刷新 DWS..."
psql -c "REFRESH MATERIALIZED VIEW CONCURRENTLY analytics.dws_consultant_daily;"
psql -c "REFRESH MATERIALIZED VIEW CONCURRENTLY analytics.dws_doctor_daily;"
psql -c "REFRESH MATERIALIZED VIEW CONCURRENTLY analytics.dws_nurse_daily;"
psql -c "REFRESH MATERIALIZED VIEW CONCURRENTLY analytics.dws_craftsman_daily;"
echo "$(date): 刷新 KPI..."
# ... 9个KPI视图
echo "$(date): 刷新 APP..."
# ... 6个APP视图
echo "$(date): 全部刷新完成"