这个 SQL 问题绝对能让你对 MySQL 的理解更进一步!
属性表 (product_props) 结构如下#
数据量 800W 以上
字段名 | 类型 | 说明 |
---|---|---|
id | int | id |
pn_id | int | 属性类型 |
pv_id | int | 属性值 |
product_id | int | 产品 ID |
其中 product_id 与 pn_id,pv_id 是一对多的关系。
数据类似这样:
product_id | pn_id | pv_id |
---|---|---|
10970 | 5 (型号) | 135 (苹果 9) |
10970 | 11 (内存) | 23 (512G) |
10970 | 10 (颜色) | 17 (土豪金) |
10970 | 8 (网络) | 6(5G) |
10980 | 5 | 135 |
10980 | 11 | 24 (1024G) |
10980 | 10 | 16 (极光蓝) |
产品表 (product) 结构如下#
数据量 40W 以上
字段名 | 类型 | 说明 |
---|---|---|
product_id | int | product_id |
type_id | int | 类型 id |
brand_id | int | 品牌 id |
model_id | int | 型号 id |
status | tinyint | 状态 |
数据类似以下:
product_id | type_id | brand_id | model_id | status |
---|---|---|---|---|
10970 | 1(手机) | 1 (苹果) | 1(Iphone8) | 1 (正常) |
10980 | 1(手机) | 1 (苹果) | 1(Iphone8X) | 3 (已售) |
10981 | 1(手机) | 1 (苹果) | 1(Iphone8XP) | 1 (正常) |
问题#
找出型号为苹果 9 同时内存为 512G,颜色为土豪金,状态为正常的产品总数,
ps : 属性条件可能会有超过 10 组。
要求#
性能第一,杜绝聚合函数等
原问题的解决方案性能排行#
- 来自 @Kamicloud 的 exist 方案
SELECT
sql_no_cache `product_id`
FROM
`zx_tests` AS a
WHERE
`pn_id` = 101 AND `pv_id` = 59
AND EXISTS( SELECT
sql_no_cache *
FROM
`zx_tests`
WHERE
a.product_id = product_id and
`pn_id` = 101 AND `pv_id` = 171);
2 组条件下 0.657,3 组 0.695,4 组 0.759,5 组 0.743 (单独查属性表)
- 来自 @Elijah_Wang 的子查询方案
SELECT `product_id` FROM `product` WHERE `pn_id` = 5 AND `pv_id` = 135 AND `product_id` IN (SELECT `product_id` FROM `product` WHERE `pn_id` = 11 AND `pv_id` = 23);
2 组条件下 0.729,3 组 0.75,4 组 0.730,5 组 0.757 (新问题之前)
新问题之后的性能排行#
- 来自 @Elijah_Wang 的子查询方案
select SQL_NO_CACHE count(1) from pdi_product a join (
SELECT
distinct product_id
FROM
`product_props`
WHERE
`pn_id` = 5
AND `pv_id` = 127
AND `product_id` IN ( SELECT `product_id` FROM `product_props` WHERE `pn_id` = 11 AND `pv_id` = 22 )
AND `product_id` IN ( SELECT `product_id` FROM `product_props` WHERE `pn_id` = 10 AND `pv_id` = 18 )
AND `product_id` IN ( SELECT `product_id` FROM `product_props` WHERE `pn_id` = 8 AND `pv_id` = 6 )
AND `product_id` IN ( SELECT `product_id` FROM `product_props` WHERE `pn_id` = 9 AND `pv_id` = 1 )
) b on a.product_id = b.product_id
where a.status = 1;
耗时 1.5-1.56 (执行 10 次的范围)
- expain 分析:
select SQL_NO_CACHE count(1) from pdi_product a
where a.status = 1 and a.product_id in (SELECT
distinct product_id
FROM
`product_props`
WHERE
`pn_id` = 5
AND `pv_id` = 127
AND `product_id` IN ( SELECT `product_id` FROM `product_props` WHERE `pn_id` = 11 AND `pv_id` = 22 )
AND `product_id` IN ( SELECT `product_id` FROM `product_props` WHERE `pn_id` = 10 AND `pv_id` = 18 )
AND `product_id` IN ( SELECT `product_id` FROM `product_props` WHERE `pn_id` = 8 AND `pv_id` = 6 )
AND `product_id` IN ( SELECT `product_id` FROM `product_props` WHERE `pn_id` = 9 AND `pv_id` = 1 ))
耗时 0.69-0.72(执行 10 次的范围)
- explain 分析:
- 来自 @Kamicloud 的 exist 方案
SELECT SQL_NO_CACHE
count(1)
FROM
product a
WHERE
a.STATUS = 1
AND a.product_id IN (
SELECT DISTINCT
`product_id`
FROM
`product_props` AS a
WHERE
a.`pn_id` = 5
AND a.`pv_id` = 127
AND EXISTS ( SELECT product_id FROM `product_props` WHERE a.product_id = product_id AND `pn_id` = 11 AND `pv_id` = 22 )
AND EXISTS ( SELECT product_id FROM `product_props` WHERE a.product_id = product_id AND `pn_id` = 10 AND `pv_id` = 18 )
AND EXISTS ( SELECT product_id FROM `product_props` WHERE a.product_id = product_id AND `pn_id` = 9 AND `pv_id` = 1 )
AND EXISTS ( SELECT product_id FROM `product_props` WHERE a.product_id = product_id AND `pn_id` = 8 AND `pv_id` = 6 )
);
耗时 5.7-5.85 (执行 10 次的范围)
- explain 分析:
SELECT SQL_NO_CACHE
count(1)
FROM
pdi_product a
join (SELECT DISTINCT
`product_id`
FROM
`product_props` AS a
WHERE
a.`pn_id` = 5
AND a.`pv_id` = 127
AND EXISTS ( SELECT product_id FROM `product_props` WHERE a.product_id = product_id AND `pn_id` = 11 AND `pv_id` = 22 )
AND EXISTS ( SELECT product_id FROM `product_props` WHERE a.product_id = product_id AND `pn_id` = 10 AND `pv_id` = 18 )
AND EXISTS ( SELECT product_id FROM `product_props` WHERE a.product_id = product_id AND `pn_id` = 9 AND `pv_id` = 1 )
AND EXISTS ( SELECT product_id FROM `product_props` WHERE a.product_id = product_id AND `pn_id` = 8 AND `pv_id` = 6 ) ) b
on a.product_id = b.product_id
WHERE
a.STATUS = 1
耗时 5.7-6.0(执行 10 次的范围)
- explain 分析:
可以看到如果单纯查属性表,第一位的速度是最快的,可要查产品状态后,速度反而不如子查询。
经 explain 分析,第一个子查询速度之所以快是因为它的 sql 简单,select_type 皆为 simple。
而不管是 join 还是 exists 的方式,select_type 大多为 DERIVED,DEPENDENT SUBQUERY。
结果#
经讨论后,决定迁移到 ES。
PS:你的赞是我创作的动力!
为什么同是 9 年义务教育别人就那么优秀?
想知道请关注订阅号:Buger(关注送 laravel,linux,nginx 等学习资料!!!)
回复 ' 学习 ',推荐你 2 本书。
本作品采用《CC 协议》,转载必须注明作者和本文链接
推荐文章: