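Plotly started out as an online platform for data analysis and visualization. APIs were later developed for several languages (Python, R, MATLAB, and others); in R, that is the plotly package. plotly is available on CRAN, so installing it only takes: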
install.packages('plotly')
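This post looks at the sunburst chart. A sunburst chart extends the pie chart (leaving aside the debate over whether pie charts should be abandoned). It shows the proportions of a categorical variable that has been organized into levels, so the share of the whole taken by each level is visible at a glance. R has a dedicated package for sunburst charts, sunburstR, but as far as I can tell the charts it draws are static. plotly, as a powerful interactive visualization package, naturally supports sunburst charts too. Clicking a parent shows only that parent and its children, which makes it easy to explore the share of each level as well as the share each child takes of its parent (the examples below all come from the official documentation):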
library(plotly)
fig <- plot_ly(
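# Labels for every category at every level
labels = c("Eve", "Cain", "Seth", "Enos", "Noam", "Abel", "Awan", "Enoch", "Azura"),
# Parent of each category, matched one-to-one with the labels above ("" marks the root)
parents = c("", "Eve", "Eve", "Seth", "Seth", "Eve", "Eve", "Awan", "Eve"),
# Value of each category (again matched one-to-one)
values = c(10, 14, 12, 10, 2, 6, 6, 4, 4),
# Chart type: sunburst
type = 'sunburst'
)
# Display the figure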
fig
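To make the sector sizes in this first chart easier to reason about, here is a minimal base-R sketch of how I understand the default setting (branchvalues = 'remainder') to work: each sector's size is the node's own value plus the totals of all its descendants. The helper name subtree_total is hypothetical and is not part of plotly.

```r
# Data copied from the example above
labels  <- c("Eve", "Cain", "Seth", "Enos", "Noam", "Abel", "Awan", "Enoch", "Azura")
parents <- c("", "Eve", "Eve", "Seth", "Seth", "Eve", "Eve", "Awan", "Eve")
values  <- c(10, 14, 12, 10, 2, 6, 6, 4, 4)

# Hypothetical helper: a node's own value plus the totals of its children (recursive)
subtree_total <- function(node) {
  children <- labels[parents == node]
  values[labels == node] + sum(vapply(children, subtree_total, numeric(1)))
}

# Named vector with one total per label
totals <- vapply(labels, subtree_total, numeric(1))

# Share of the full circle taken by each sector
round(totals / totals[["Eve"]], 3)
```

Under this reading, "Seth" is drawn with a size of 24 (its own 12 plus 10 for "Enos" and 2 for "Noam"), so its two children together cover only half of its arc.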
library(plotly)
fig <- plot_ly(
labels = c("Eve", "Cain", "Seth", "Enos", "Noam", "Abel", "Awan", "Enoch", "Azura"),
parents = c("", "Eve", "Eve", "Seth", "Seth", "Eve", "Eve", "Awan", "Eve"),
values = c(65, 14, 12, 10, 2, 6, 6, 4, 4),
type = 'sunburst',
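# Set branchvalues to 'total': each parent's value is treated as the total for its whole branch,
# so a child's arc length is proportional to (child value / parent value).
# The previous chart did not set this (the default is 'remainder'), so there the parent 'Seth'
# and its children 'Noam' and 'Enos' do not follow this rule.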
branchvalues = 'total'
)
fig
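With branchvalues = 'total', a parent's value has to be at least as large as the sum of its children's values, which is presumably why the value of "Eve" was changed from 10 to 65 in this example. The following base-R sketch checks that condition before plotting; the names child_sum and ok are chosen for illustration only.

```r
# Data copied from the example above
labels  <- c("Eve", "Cain", "Seth", "Enos", "Noam", "Abel", "Awan", "Enoch", "Azura")
parents <- c("", "Eve", "Eve", "Seth", "Seth", "Eve", "Eve", "Awan", "Eve")
values  <- c(65, 14, 12, 10, 2, 6, 6, 4, 4)

# Sum of the direct children's values for each node (0 for leaves)
child_sum <- vapply(labels, function(node) sum(values[parents == node]), numeric(1))

# Under 'total', every node's value should be >= the sum of its children
ok <- values >= child_sum

# One row per node, so any inconsistent parent is easy to spot
data.frame(labels, values, child_sum, ok)
```

Running the same check with the first example's values (where "Eve" is 10) flags "Eve" as inconsistent, since its children sum to 42.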
library(plotly)
d <- data.frame(
# Define the ids; think of them as a primary key (they must be unique)
ids = c(
"North America", "Europe", "Australia", "North America - Football", "Soccer",
"North America - Rugby", "Europe - Football", "Rugby",
"Europe - American Football","Australia - Football", "Association",
"Australian Rules", "Australia - American Football", "Australia - Rugby",
"Rugby League", "Rugby Union"
),
# Define the labels, i.e. the text that is displayed. <br> is HTML syntax for a line break
labels = c(
"North<br>America", "Europe", "Australia", "Football", "Soccer", "Rugby",
"Football", "Rugby", "American<br>Football", "Football", "Association",
"Australian<br>Rules", "American<br>Football", "Rugby", "Rugby<br>League",
"Rugby<br>Union"
),
# Define the parents (each entry refers to an id, not a label)
parents = c(
"", "", "", "North America", "North America", "North America", "Europe",
"Europe", "Europe","Australia", "Australia - Football", "Australia - Football",
"Australia - Football", "Australia - Football", "Australia - Rugby",
"Australia - Rugby"
),
stringsAsFactors = FALSE
)
fig <- plot_ly(d, ids = ~ids, labels = ~labels, parents = ~parents, type = 'sunburst')
fig
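The reason for separate ids in this example is that labels such as "Football" repeat across branches, so they cannot identify a sector on their own. Below is a small base-R sketch of the sanity checks one might run on such a table before plotting. It uses a shortened, four-row version of the data above to stay brief.

```r
# A shortened version of the data above (two continents, one sport each)
ids     <- c("North America", "Europe", "North America - Football", "Europe - Football")
labels  <- c("North<br>America", "Europe", "Football", "Football")
parents <- c("", "", "North America", "Europe")

# Labels may repeat ...
labels_repeat <- anyDuplicated(labels) > 0

# ... but ids must be unique
ids_unique <- anyDuplicated(ids) == 0

# Every non-root parent must refer to an existing id
parents_known <- all(parents == "" | parents %in% ids)

c(labels_repeat = labels_repeat, ids_unique = ids_unique, parents_known = parents_known)
```

If parents_known comes out FALSE, a typo in an id or a parent is the first thing to look for.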
The GIF failed to upload, so here is an MP4 instead: https://www.zhihu.com/video/1220318292180885504
Switching back and forth between the levels and categories is rather fun!
library(plotly)
df <- read.csv('https://raw.githubusercontent.com/plotly/datasets/718417069ead87650b90472464c7565dc8c2cb1c/coffee-flavors.csv')
fig <- plot_ly()
fig <- fig %>% add_trace(
type = 'sunburst',
ids = df$ids,
labels = df$labels,
parents = df$parents,
# Set the maximum depth to 2, i.e. show at most two levels at a time
maxdepth = 2,
# Control the orientation of the text inside the sectors; here it is 'radial'
insidetextorientation = 'radial'
)
fig
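As you can see, with the maximum depth set to 2, at most two levels are displayed. The data actually has three levels, though, and clicking a second-level sector reveals the third. To display every level at once, remove the maxdepth argument from the code above, but the chart then looks cluttered and may even become sluggish.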
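To see how many levels a data set has before choosing a maxdepth, one option is to compute each node's depth from the ids and parents columns. The sketch below does this in base R on a small made-up hierarchy; the ids "A", "A-1", and so on are hypothetical and are not taken from the coffee-flavors file, and node_depth is an illustrative helper rather than a plotly function.

```r
# A made-up three-level hierarchy (hypothetical ids, for illustration only)
ids     <- c("A", "B", "A-1", "A-2", "A-1-x")
parents <- c("",  "",  "A",   "A",   "A-1")

# Hypothetical helper: root nodes have depth 1, every other node is one deeper than its parent
node_depth <- function(id) {
  p <- parents[ids == id]
  if (p == "") 1 else 1 + node_depth(p)
}

depths <- vapply(ids, node_depth, numeric(1))

# Number of levels in the data
max(depths)

# Nodes that fall within the first two levels
ids[depths <= 2]
```

In this toy data, "A-1-x" sits on the third level, so with maxdepth = 2 it would only appear after clicking "A" or "A-1".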
library(plotly)
d1 <- read.csv('https://raw.githubusercontent.com/plotly/datasets/master/coffee-flavors.csv')
d2 <- read.csv('https://raw.githubusercontent.com/plotly/datasets/718417069ead87650b90472464c7565dc8c2cb1c/sunburst-coffee-flavors-complete.csv')
fig <- plot_ly()
fig <- fig %>%
# Add a trace, roughly the counterpart of a geom layer in ggplot2. This one defines subplot 1
add_trace(
ids = d1$ids,
labels = d1$labels,
parents = d1$parents,
# Trace type: sunburst
type = 'sunburst',
maxdepth = 2,
# Put subplot 1 in the first column (plotly counts from 0)
domain = list(column = 0)
)
fig <- fig %>%
# Subplot 2
add_trace(
ids = d2$ids,
labels = d2$labels,
parents = d2$parents,
type = 'sunburst',
maxdepth = 3,
# Put subplot 2 in the second column
domain = list(column = 1)
)
fig <- fig %>%
# Define the layout
layout(
# Grid: 2 columns, 1 row
grid = list(columns = 2, rows = 1),
margin = list(l = 0, r = 0, b = 0, t = 0),
# Colors
sunburstcolorway = c(
"#636efa","#EF553B","#00cc96","#ab63fa","#19d3f3",
"#e763fa", "#FECB52","#FFA15A","#FF6692","#B6E880"
),
extendsunburstcolors = TRUE)
fig
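Fun, isn't it? Visualization in R is more than ggplot2 and its 65 extension packages (as of this writing), and there is plenty more worth exploring. That's all for this post!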