数据库简单的实在用 ②

章讲述了对于数据量较少的拍卖。包括新建数据库、使用SQL语言建表、向数据库被导入数据、简单的数目查询操作等。下面介绍一个对立上一个较为复杂的数据库。其实也尽管是数据量增大了碰,数据处理要求多矣把,运用基础之数据库知识和网上的各种教学都足以控制这些处理办法。

例2

 下面列举某宝一些数量处理。

Table 1: Online user behavior before Dec. 2015. (ijcai2016_taobao)

Field

Description

User_id

unique user id

Seller_id

unique online seller id

Item_id

unique item id

Category_id

unique category id

Online_Action_id

“0” denotes “click” while “1” for “buy”

Time_Stamp

date of the format “yyyymmdd”

 

统计内容:

  • 用户计数特征

  • 用户总交互次数,用户点击次数,用户买次数

  • 用户交互、点击、购买之差商品种类数
  • 用户交互、点击、购买之不同商品数

  • 用户时间层级(预测时间吗20151201)

  • 不久前1天/3天/10天用户对货物的彼此/点击/购买数

  • 近些年1天/3天/10龙用交互的差商品种类总数
  • 用户周末(周五六日)对货物之相互、点击、购买次数

  • 用户比值特征

  • 用户点击到贾商品之转化率

  • 用户周末点击、购买占总点击、购买之比值

  • 货计数特征

  • 受点击、被打次数最多的前面5货物

  • 货物被点击、购买的平分时间距离

提交结果:

  1. 用户表

1

user_id

2

用户交互次数

3

点击次数

4

购买次数

5

交互不同商品数

6

点击不同商品数

7

购买不同商品数

8

交互不同商品种类数

9

点击不同商品种类数

10

购买不同商品种类数

11

最近1天用户对商品的交互数

12

最近1天用户对商品的点击数

13

最近1天用户对商品的购买数

14

最近3天用户对商品的交互数

15

最近3天用户对商品的点击数

16

最近3天用户对商品的购买数

17

最近10天用户对商品的交互数

18

最近10天用户对商品的点击数

19

最近10天用户对商品的购买数

20

周末用户对商品的交互数

21

周末用户对商品的点击数

22

周末用户对商品的购买数

23

用户点击到购买商品的转化率

24

用户周末点击、购买占总点击、购买的比值

 

  1. 商品表

 

1

item_id

2

商品被点击平均时间间隔

3

商品被购买平均时间间隔

 

实现工具:SQL Server 2014
、基本的数额

1.这边我们得以同例1不同,可以先行拿淘宝的数目导入到数据库中,但是若可能会咨询:表都没有起数量导入到哪里吗?其实如果在例1中的导入数据时认真观察的话,可以窥见当我们将excel导入时候,会起一个默认的表明,并且这表底讳跟我们导入的excel的表名字相同(通常状态下,为Sheet),所以我们这里就直接导入数据了哦。

 SQL Server 1

SQL Server 2

SQL Server 3

SQL Server 4

SQL Server 5

SQL Server 6

SQL Server 7

SQL Server 8

SQL Server 9

2.修改表的列名

SQL Server 10

3.视察数据是否成导入

 

use 淘宝

SELECT *
FROM dbo.taobao

 

SQL Server 11

4.完事前的4起要求(注意**Online_Action_id=1 要么 0
代表不与含义**)

  1. 用户表

1

user_id

2

用户交互次数

3

点击次数

4

购买次数

use 淘宝

select A.User_id, click as 点击次数,buy as 购买次数, click+buy as 用户交互次数
from(
    select user_id,count(Online_Action_id)  click
    from taobao
    where Online_Action_id=0
    group by user_id) as A
join
    (select user_id,count(Online_Action_id)  buy
    from taobao
    where Online_Action_id=1
    group by user_id)as B
on A.user_id=B.user_id

SQL Server 12

5.就要求5~7

5

交互不同商品数

6

点击不同商品数

7

购买不同商品数

use 淘宝

select A.user_id,click as 点击不同商品数 ,buy as 购买不同商品数,click+buy as 交互不同商品数
from(
    select user_id,count(Item_id)  click
    from taobao
    where Online_Action_id=0
    group by user_id) as A
join
    (select user_id,count(Item_id)  buy
    from taobao
    where Online_Action_id=1
    group by user_id)as B
on A.user_id=B.user_id

 SQL Server 13

6.就要求8~10(注意要字DISTINCT
的以)

8

交互不同商品种类数

9

点击不同商品种类数

10

购买不同商品种类数

 

use 淘宝

select A.user_id,click as 点击不同商品种类数 ,buy as 购买不同商品种类数,click+buy as 交互不同商品种类数
from(
    select user_id,count(DISTINCT Item_id)  click
    from taobao
    where Online_Action_id=0
    group by user_id) as A
join
    (select user_id,count(DISTINCT Item_id)  buy
    from taobao
    where Online_Action_id=1
    group by user_id)as B
on A.user_id=B.user_id

SQL Server 14

 7.就要求11~19(只是运气的改变从而一定给仅生三件职责
这里用<2 还是<4还是<11
是坐<的操作效率比<=要高!)

关于详细讲解见博客http://blog.chinaunix.net/uid-20586655-id-3406139.html

 DATEDIFF(D,CONVERT(date,time_stamp,110),convert(date,’20151201′,110))<2 

有关此函数的企图:详情见http://www.w3school.com.cn/sql/func\_datediff.asp

10

购买不同商品种类数

11

最近1天用户对商品的交互数

12

最近1天用户对商品的点击数

13

最近1天用户对商品的购买数

14

最近3天用户对商品的交互数

15

最近3天用户对商品的点击数

16

最近3天用户对商品的购买数

17

最近10天用户对商品的交互数

18

最近10天用户对商品的点击数

19

最近10天用户对商品的购买数

use 淘宝

SELECT A.user_id ,click as 最近1天用户对商品的点击数 ,buy as 最近1天用户对商品的购买数,click+buy as 最近1天用户对商品的交互数
FROM 
    (SELECT User_id ,count(Online_Action_id)  click
    fROM dbo.taobao
    WHERE  Online_Action_id=0 AND DATEDIFF(D,CONVERT(date,time_stamp,110),convert(date,'20151201',110))<2
    GROUP BY User_id)A
join
    (SELECT User_id ,count(Online_Action_id)  buy
    FROM dbo.taobao
    WHERE Online_Action_id=1 AND DATEDIFF(D,CONVERT(date,time_stamp,110),convert(date,'20151201',110))<2
    GROUP BY User_id)B
ON A.user_id=B.user_id

SQL Server 15

为有整体的11~19询问语句

use 淘宝
SELECT A.user_id ,click as 最近1天用户对商品的点击数 ,buy as 最近1天用户对商品的购买数,click+buy as 最近1天用户对商品的交互数
from 
(SELECT User_id ,count(Online_Action_id)  click
from dbo.taobao
Where  Online_Action_id=0 AND DATEDIFF(D,CONVERT(date,time_stamp,110),convert(date,'20151201',110))<2
Group By User_id)A
Join
(SELECT User_id ,count(Online_Action_id)  buy
from dbo.taobao
Where  Online_Action_id=1 AND DATEDIFF(D,CONVERT(date,time_stamp,110),convert(date,'20151201',110))<2
Group By User_id)B
on A.user_id=B.user_id

SELECT A.User_id ,click as 最近3天用户对商品的点击数 ,buy as 最近3天用户对商品的购买数,click+buy as 最近3天用户对商品的交互数
from 
(SELECT User_id ,count(Online_Action_id)  click
from dbo.taobao
Where  Online_Action_id=0 AND DATEDIFF(D,CONVERT(date,time_stamp,110),convert(date,'20151201',110))<4
Group By User_id)A
Join
(SELECT User_id ,count(Online_Action_id)  buy
from dbo.taobao
Where  Online_Action_id=1 AND DATEDIFF(D,CONVERT(date,time_stamp,110),convert(date,'20151201',110))<4
Group By User_id)B
on A.user_id=B.user_id


SELECT A.User_id ,click as 最近10天用户对商品的点击数 ,buy as 最近10天用户对商品的购买数,click+buy as 最近10天用户对商品的交互数
from 
(SELECT User_id ,count(Online_Action_id)  click
from dbo.taobao
Where  Online_Action_id=0 AND DATEDIFF(D,CONVERT(date,time_stamp,110),convert(date,'20151201',110))<11
Group By User_id)A
Join
(SELECT User_id ,count(Online_Action_id)  buy
from dbo.taobao
Where  Online_Action_id=1 AND DATEDIFF(D,CONVERT(date,time_stamp,110),convert(date,'20151201',110))<11
Group By User_id)B
on A.user_id=B.user_id

SQL Server 16

8.好要求20~22(注意我们的每周第一上和美国之每周第一龙是殊的,

为此开如用语句SET DATEFIRST
1设置第一上为周一)

每当这边的DATEPART(dw,CONVERT(date,time_stamp,110))

参看http://www.w3school.com.cn/sql/func\_datepart.asp讲解

 

20

周末用户对商品的交互数

21

周末用户对商品的点击数

22

周末用户对商品的购买数

use 淘宝
set DATEFIRST 1
SELECT A.User_id ,click as 周末用户对商品的点击数 ,buy as 周末用户对商品的购买数,click+buy as 周末用户对商品的交互数
FROM
    (SELECT USER_id,Count(online_action_id) click
    FROM taobao
    WHERE online_action_id =0
    AND DATEPART(dw,CONVERT(date,time_stamp,110))in(5,6,7)
    GROUP BY User_id)A
join
    (SELECT USER_id,Count(online_action_id) buy
    FROM taobao
    WHERE online_action_id =1
    AND DATEPART(dw,CONVERT(date,time_stamp,110))in(5,6,7)
    GROUP BY User_id)B
ON A.user_id = B.user_id

SQL Server 17

 9.就要求23

函数ltrim(Convert(numeric(9,2),buy*100.0/(click+buy)))+’%’as
用户点击到买入商品之转速率 

详情见https://msdn.microsoft.com/zh-cn/library/ms177827.aspx

23

用户点击到购买商品的转化率

 

use 淘宝

SELECT A.user_id,buy,click ,ltrim(Convert(numeric(9,2),buy*100.0/(click+buy)))+'%' as 用户点击到购买商品的转化率
FROM 
    (SELECT USER_id,Count(online_action_id) click
    FROM taobao
    WHERE online_action_id =0
    AND DATEPART(dw,CONVERT(date,time_stamp,110))in(5,6,7)
    GROUP BY User_id)A
join
    (SELECT USER_id,Count(online_action_id) buy
    FROM taobao
    WHERE online_action_id =1
    AND DATEPART(dw,CONVERT(date,time_stamp,110))in(5,6,7)
    GROUP BY User_id)B
ON A.user_id = B.user_id

 

 

SQL Server 18

10.成就要求24(对于这要求得应用前面的星期点击和周末市除以总点击、总购进)

24

用户周末点击、购买占总点击、购买的比值

 

那关键问题是咱该怎么将之前分开的查询操作统一在共同好一个询问操作也。

今昔出狱我们的大招

使用with
可以拿我们的查询语句写以一块儿,并且可以同堆的查询语句进行更命名,之后才待利用这更命名就好了。具体操作见下的事例

use 淘宝;
set datefirst 1;//加上分号
with cnt_action as(
select a.User_id, click as 点击次数,buy as 购买次数, click+buy as 用户交互次数
from(
select user_id,count(Online_Action_id)  click
from taobao
where Online_Action_id=0
group by user_id)A
join
(select user_id,count(Online_Action_id)  buy
from taobao
where Online_Action_id=1
group by user_id)B
on A.user_id=B.user_id),//注意加上逗号

cnt_click_wk as(
SELECT A.User_id ,click as 周末用户对商品的点击数 ,buy as 周末用户对商品的购买数,click+buy as 周末用户对商品的交互数
from 
(select USER_id,Count(online_action_id) click
from taobao
where online_action_id =0
and DATEPART(dw,CONVERT(date,time_stamp,110))in(5,6,7)
group by User_id)A
join
(select USER_id,Count(online_action_id) buy
from taobao
where online_action_id =1
and DATEPART(dw,CONVERT(date,time_stamp,110))in(5,6,7)
group by User_id)B
on A.user_id = B.user_id)//这里不加逗号

Select cnt_action.user_id,cnt_action.用户交互次数,cnt_action.点击次数,cnt_action.购买次数,

ltrim(Convert(numeric(9,2),cnt_click_wk.周末用户对商品的点击数*100.0/(cnt_action.点击次数)))+'%' as 用户周末点击占总点击的比值,
ltrim(Convert(numeric(9,2),cnt_click_wk.周末用户对商品的购买数*100.0/(cnt_action.购买次数)))+'%' as 用户周末购买占总购买的比值

from cnt_action join cnt_click_wk on cnt_action.user_id=cnt_click_wk.user_id //直接使用重命名就可以

每当此处要注意以下几点:

1.每个重命名都需要不同;

2.每个查询以后的右括号及逗号(,) 最后一个请勿加以;

3.对于ltrim(Convert(numeric(9,2),cnt_click_wk.周末用户对商品之点击数*100.0/(cnt_action.点击次数)))+’%’as
用户周末点击占总点击的比率,这些函数的使用需要经过网上的追寻来学会运用;

4.还有就是是以with之前的十分语句末尾要抬高分号(;)

SQL Server 19

 

11.地方的with使用被了咱新的技巧,那么我们好为此with放出一个大幅度的招。这无异于造成可于我们把上面有查询得到的表全部齐心协力化一摆表,这将凡一个巨大无比之阐发,在没有完全控制with的当儿,千万不要尝试这项操作。因为实际是极端怪了,而且这招数会使得程序员十分沉,有无比多之再次代码。

为此于with大招之前,我们还是将各个一个询问都亲写一普,这样在把它组合在一起的时刻就是非会见深感陌生。

现今见证大招的威力:

use 淘宝;
set datefirst 1;
with cnt_action as(
select a.User_id, click as 点击次数,buy as 购买次数, click+buy as 用户交互次数
from(
select user_id,count(Online_Action_id)  click
from taobao
where Online_Action_id=0
group by user_id)A
join
(select user_id,count(Online_Action_id)  buy
from taobao
where Online_Action_id=1
group by user_id)B
on A.user_id=B.user_id),


cnt_item as(
select A.user_id,click as 点击不同商品数 ,buy as 购买不同商品数,click+buy as 交互不同商品数
from
(select USER_id,Count(item_id) click
from dbo.taobao
where Online_Action_id = 0 
group by user_id) A
join
(select USER_id,Count(item_id) buy
from dbo.taobao
where Online_Action_id = 1 
group by user_id) B
on A.user_id=B.user_id),

cnt_item_type as(
select A.user_id,click as 点击不同商品种类数 ,buy as 购买不同商品种类数,click+buy as 交互不同商品种类数
from
(select USER_id,Count(DISTINCT item_id) click
from dbo.taobao
where Online_Action_id = 0 
group by user_id) A
join
(select USER_id,Count(DISTINCT item_id) buy
from dbo.taobao
where Online_Action_id = 1 
group by user_id) B
on A.user_id=B.user_id),

cnt_oneday as(
SELECT A.user_id ,click as 最近1天用户对商品的点击数 ,buy as 最近1天用户对商品的购买数,click+buy as 最近1天用户对商品的交互数
from 
(SELECT User_id ,count(Online_Action_id)  click
from dbo.taobao
Where  Online_Action_id=0 AND DATEDIFF(D,CONVERT(date,time_stamp,110),convert(date,'20151201',110))<2
Group By User_id)A
Join
(SELECT User_id ,count(Online_Action_id)  buy
from dbo.taobao
Where  Online_Action_id=1 AND DATEDIFF(D,CONVERT(date,time_stamp,110),convert(date,'20151201',110))<2
Group By User_id)B
on A.user_id=B.user_id),

cnt_threeday as(
SELECT A.User_id ,click as 最近3天用户对商品的点击数 ,buy as 最近3天用户对商品的购买数,click+buy as 最近3天用户对商品的交互数
from 
(SELECT User_id ,count(Online_Action_id)  click
from dbo.taobao
Where  Online_Action_id=0 AND DATEDIFF(D,CONVERT(date,time_stamp,110),convert(date,'20151201',110))<4
Group By User_id)A
Join
(SELECT User_id ,count(Online_Action_id)  buy
from dbo.taobao
Where  Online_Action_id=1 AND DATEDIFF(D,CONVERT(date,time_stamp,110),convert(date,'20151201',110))<4
Group By User_id)B
on A.user_id=B.user_id),

cnt_tenday as(
SELECT A.User_id ,click as 最近10天用户对商品的点击数 ,buy as 最近10天用户对商品的购买数,click+buy as 最近10天用户对商品的交互数
from 
(SELECT User_id ,count(Online_Action_id)  click
from dbo.taobao
Where  Online_Action_id=0 AND DATEDIFF(D,CONVERT(date,time_stamp,110),convert(date,'20151201',110))<11
Group By User_id)A
Join
(SELECT User_id ,count(Online_Action_id)  buy
from dbo.taobao
Where  Online_Action_id=1 AND DATEDIFF(D,CONVERT(date,time_stamp,110),convert(date,'20151201',110))<11
Group By User_id)B
on A.user_id=B.user_id),

cnt_click_wk as(
SELECT A.User_id ,click as 周末用户对商品的点击数 ,buy as 周末用户对商品的购买数,click+buy as 周末用户对商品的交互数
from 
(select USER_id,Count(online_action_id) click
from taobao
where online_action_id =0
and DATEPART(dw,CONVERT(date,time_stamp,110))in(5,6,7)
group by User_id)A
join
(select USER_id,Count(online_action_id) buy
from taobao
where online_action_id =1
and DATEPART(dw,CONVERT(date,time_stamp,110))in(5,6,7)
group by User_id)B
on A.user_id = B.user_id),

cnt_percent as(
SELECT A.user_id,buy,click ,ltrim(Convert(numeric(9,2),buy*100.0/(click+buy)))+'%' as 用户点击到购买商品的转化率
from 
(select USER_id,Count(online_action_id) click
from taobao
where online_action_id =0
and DATEPART(dw,CONVERT(date,time_stamp,110))in(5,6,7)
group by User_id)A
join
(select USER_id,Count(online_action_id) buy
from taobao
where online_action_id =1
and DATEPART(dw,CONVERT(date,time_stamp,110))in(5,6,7)
group by User_id)B
on A.user_id = B.user_id)

Select cnt_item.user_id,cnt_action.用户交互次数,cnt_action.点击次数,cnt_action.购买次数,
cnt_item.交互不同商品数,cnt_item.点击不同商品数,cnt_item.购买不同商品数,
cnt_item_type.交互不同商品种类数,cnt_item_type.点击不同商品种类数,cnt_item_type.购买不同商品种类数,
cnt_oneday.最近1天用户对商品的交互数,cnt_oneday.最近1天用户对商品的点击数,cnt_oneday.最近1天用户对商品的购买数,
cnt_threeday.最近3天用户对商品的交互数,cnt_threeday.最近3天用户对商品的点击数,cnt_threeday.最近3天用户对商品的购买数,
cnt_tenday.最近10天用户对商品的交互数,cnt_tenday.最近10天用户对商品的点击数,cnt_tenday.最近10天用户对商品的购买数,
cnt_click_wk.周末用户对商品的交互数,cnt_click_wk.周末用户对商品的点击数,cnt_click_wk.周末用户对商品的购买数,
cnt_percent.用户点击到购买商品的转化率,
ltrim(Convert(numeric(9,2),cnt_click_wk.周末用户对商品的点击数*100.0/(cnt_action.点击次数)))+'%' as 用户周末点击占总点击的比值,
ltrim(Convert(numeric(9,2),cnt_click_wk.周末用户对商品的购买数*100.0/(cnt_action.购买次数)))+'%' as 用户周末购买占总购买的比值

from cnt_action join cnt_item on cnt_action.user_id=cnt_item.user_id 
left join cnt_item_type on cnt_action.user_id =cnt_item_type.user_id 
left join cnt_oneday on cnt_item.user_id=cnt_oneday.user_id
left join cnt_threeday on cnt_item.user_id=cnt_threeday.user_id
left join cnt_tenday on cnt_item.user_id=cnt_tenday.user_id
left join cnt_click_wk on cnt_item.user_id=cnt_click_wk.user_id
left join cnt_percent on cnt_item.user_id=cnt_percent.user_id

SQL Server 20

 12.通过前的练,后面的商品表只是有些函数的应用

商品表

1

item_id

2

商品被点击平均时间间隔

3

商品被购买平均时间间隔

use 淘宝;

Select A.item_id,click_time as 商品被点击平均时间间隔,buy_time as 商品被购买平均时间间隔
from(
select A.item_id, item_rank, max_date, min_date, cnt_click, 0 - cnt_click/DATEDIFF(D, max_date, min_date) as click_time
from  
(select item_id, dense_rank() over(order by count(online_action_id) desc) item_rank 
from taobao 
where online_action_id = 0 
group by item_id) A 
inner join  
(select item_id, max(CONVERT(date,time_stamp,101)) max_date,min(CONVERT(date,time_stamp,101)) min_date,count(online_action_id)cnt_click 
from taobao 
where online_action_id = 0 
group by item_id 
)B on A.item_id = B.item_id 
where item_rank < 6)A
left join
(select A.item_id, item_rank, max_date, min_date, cnt_click, 0 - cnt_click/DATEDIFF(D, max_date, min_date) as buy_time
from  
(select item_id, dense_rank() over(order by count(online_action_id) desc) item_rank 
from taobao 
where online_action_id = 1 
group by item_id) A 
join  
(select item_id, max(CONVERT(date,time_stamp,101)) max_date,min(CONVERT(date,time_stamp,101)) min_date,count(online_action_id)cnt_click 
from taobao 
where online_action_id = 1 
group by item_id 
)B on A.item_id = B.item_id 
where item_rank < 6)B
on A.item_id=B.item_id

SQL Server 21

 

相关文章