Today in the group chat we briefly discussed the R package wordcloud, a package for drawing tag clouds.
CRAN: http://cran.r-project.org/
1. Load the packages:
library(Rcpp) #bridges R and C++
library(RColorBrewer) #the color palette package every R user knows
library(wordcloud) #load the wordcloud package
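2. Functions in wordcloud:
commonality.cloud--commonality cloud (words shared across documents)
comparison.cloud--comparison cloud (words that distinguish documents)
textplot--non-overlapping text plot based on x, y coordinates
wordcloud--the regular word cloud
3. The main functions in detail:
First, the regular tag cloud, wordcloud. Because R's support for Chinese is limited, the demos below use English wherever possible.
Usage: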
wordcloud(words,freq,scale=c(4,0.5),min.freq=3,max.words=Inf,random.order=TRUE,random.color=FALSE,
rot.per=.1,colors="black",ordered.colors=FALSE,use.r.layout=FALSE,...)
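Arguments:
words: the words
freq: their frequencies
scale: c(largest font size, smallest font size)
min.freq: minimum frequency; words below this frequency are not plotted
max.words: maximum number of words to plot; the least frequent words are dropped first
random.order: TRUE: plot words in random order; FALSE: plot in decreasing order of frequency
random.color: TRUE: pick colors at random from colors; FALSE: pick colors based on frequency
rot.per: proportion of words rotated by 90 degrees (0: all horizontal, 1: all rotated)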
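To make the scale argument concrete, here is a minimal sketch of how a frequency could be turned into a font size. The linear formula is an assumption based on my reading of the package source, not something stated in the help page, and the frequencies are made up.

```r
# Made-up frequencies, for illustration only
freq  <- c(40, 10, 4)
scale <- c(4, 0.5)            # c(largest size, smallest size)

# Assumed mapping: sizes are interpolated linearly between scale[2] and
# scale[1] according to each frequency relative to the largest one
normed <- freq / max(freq)
size   <- (scale[1] - scale[2]) * normed + scale[2]
print(size)
```

Under this assumption the most frequent word is always drawn at scale[1], and rarer words shrink toward scale[2].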
Example:
test1=read.csv("d:/R/wf2.csv")
rc=brewer.pal(9,"Set1") #use the Set1 palette
wordcloud(test1$words,test1$fre,scale=c(5,0.5),min.freq=-Inf,max.words=Inf,colors=rc)
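The example above depends on a local CSV file, so here is a self-contained sketch of the same call. The data frame demo and its values are invented for illustration; the column names are an assumption, not the layout of wf2.csv.

```r
# Made-up words and frequencies, for illustration only
demo <- data.frame(
  words = c("data", "model", "plot", "matrix", "vector", "factor", "loop", "rare"),
  freq  = c(40, 28, 21, 15, 9, 6, 4, 1),
  stringsAsFactors = FALSE
)

# With min.freq = 3, words whose frequency is below 3 are not plotted;
# this subset shows which rows survive that rule
shown <- demo[demo$freq >= 3, ]

if (requireNamespace("wordcloud", quietly = TRUE)) {
  set.seed(42)  # the layout is randomized, so fix the seed for repeatable output
  wordcloud::wordcloud(demo$words, demo$freq,
                       scale = c(4, 0.5), min.freq = 3,
                       random.order = FALSE,   # most frequent words placed first
                       rot.per = 0.25,         # about a quarter of words rotated
                       colors = RColorBrewer::brewer.pal(8, "Dark2"))
}
```

With random.order = FALSE the most frequent words end up near the center, which usually reads better than a random layout.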
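The second function is textplot, which places words at x, y coordinates without overlap:
rt=read.delim("clipboard") #read data from the clipboard, handy for data copied from Excel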
head(rt,3)
textplot(rt$x.lab,rt$y.lab,rt$city,cex=.67,col=brewer.pal(9,"Set1"))
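Since the clipboard contents are not reproducible, the following sketch builds a small data frame by hand. The column names follow the snippet above; rt_demo and its rough coordinate values are assumptions for illustration only.

```r
# Hypothetical input: city names with rough longitude (x.lab) and
# latitude (y.lab) values, invented for this sketch
rt_demo <- data.frame(
  city  = c("Beijing", "Tianjin", "Shanghai", "Suzhou", "Guangzhou", "Shenzhen"),
  x.lab = c(116.4, 117.2, 121.5, 120.6, 113.3, 114.1),
  y.lab = c(39.9, 39.1, 31.2, 31.3, 23.1, 22.5),
  stringsAsFactors = FALSE
)

if (requireNamespace("wordcloud", quietly = TRUE)) {
  # textplot() opens a new plot and shifts labels that would otherwise
  # overlap, e.g. the nearby pairs Beijing/Tianjin and Shanghai/Suzhou
  wordcloud::textplot(rt_demo$x.lab, rt_demo$y.lab, rt_demo$city, cex = 0.8)
}
```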
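The package illustrates the commonality and comparison clouds with a term-frequency matrix built from State of the Union addresses.
The example below, taken from the wordcloud package, covers wordcloud() itself and uses the crude corpus that ships with tm: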
if(require(tm)){
  data(crude)
  crude <- tm_map(crude, removePunctuation)
  crude <- tm_map(crude, function(x)removeWords(x,stopwords()))
  tdm <- TermDocumentMatrix(crude)
  m <- as.matrix(tdm)
  v <- sort(rowSums(m),decreasing=TRUE)
  d <- data.frame(word = names(v),freq=v)
  wordcloud(d$word,d$freq)
  #A bigger cloud with a minimum frequency of 2
  wordcloud(d$word,d$freq,c(8,.3),2)
  #Now let's try it with frequent words plotted first
  #(rot.per is named so that .1 is not taken as random.color)
  wordcloud(d$word,d$freq,c(8,.5),2,,FALSE,rot.per=.1)
  ##### with colors #####
  if(require(RColorBrewer)){
    pal <- brewer.pal(9,"BuGn")
    pal <- pal[-(1:4)]
    wordcloud(d$word,d$freq,c(8,.3),2,,FALSE,,.15,pal)
    pal <- brewer.pal(6,"Dark2")
    pal <- pal[-(1)]
    wordcloud(d$word,d$freq,c(8,.3),2,,TRUE,,.15,pal)
    #random colors
    wordcloud(d$word,d$freq,c(8,.3),2,,TRUE,TRUE,.15,pal)
  }
  ##### with font #####
  #pal comes from the RColorBrewer block above
  wordcloud(d$word,d$freq,c(8,.3),2,,TRUE,,.15,pal,
            vfont=c("gothic english","plain"))
  wordcloud(d$word,d$freq,c(8,.3),2,100,TRUE,,.15,pal,vfont=c("script","plain"))
  wordcloud(d$word,d$freq,c(8,.3),2,100,TRUE,,.15,pal,vfont=c("serif","plain"))
}
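The example above only exercises wordcloud(). As a rough sketch of the other two functions from section 2, the code below feeds a tiny hand-built term matrix (rows are terms, columns are documents) to comparison.cloud and commonality.cloud. The matrix tm_demo, its counts, and the document names are all invented for illustration.

```r
# Made-up term-by-document counts, for illustration only
tm_demo <- matrix(
  c(12,  3,  5,
     9,  8,  7,
     1, 10,  2,
     4,  4, 11,
     6,  0,  1,
     0,  7,  0),
  ncol = 3, byrow = TRUE,
  dimnames = list(c("oil", "price", "market", "trade", "barrel", "bank"),
                  c("docA", "docB", "docC"))
)

# Terms that occur at least once in every document
shared <- rownames(tm_demo)[apply(tm_demo, 1, min) > 0]
print(shared)

if (requireNamespace("wordcloud", quietly = TRUE)) {
  set.seed(1)
  # Comparison cloud: each term is grouped with the document in which
  # its relative frequency is furthest above the average
  wordcloud::comparison.cloud(tm_demo, scale = c(3, 0.5), max.words = 50,
                              colors = c("red", "darkgreen", "blue"))
  # Commonality cloud: terms are sized by how common they are across
  # all documents
  wordcloud::commonality.cloud(tm_demo, scale = c(3, 0.5), max.words = 50,
                               colors = "black")
}
```

In this toy matrix, "barrel" and "bank" are missing from at least one document, so they should not show up in the commonality cloud.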
Reference: http://blog.sina.com.cn/s/blog_6934cecb01016ikl.html