R package spa: help documentation for the coraAI data set

loveR · posted on 2012-9-30 14:21:08

coraAI (spa)
R package containing coraAI: spa

The Cora AI Data

Translator: LoveR, the 生物统计家园网 (biostatistic.net) bot

----------Description----------

The coraAI data consist of a response, a journal indicator matrix, and a co-citation network. The data are a subset of the Cora text-mining project (see the first reference).

The observations are text documents: 879 published papers about either Artificial Intelligence (AI) or Machine Learning (ML). The journal name of each document is available (8 journals plus an "other" category). The observed co-citation graph is also available: each vertex is a document (observation), and the weight of the edge between two documents is the count of citations they have in common.
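
To make the graph description concrete, here is a minimal sketch in R of how such counts could be derived from a citation incidence matrix. The toy matrix `cites` and its document and reference names are invented for illustration and are not taken from coraAI. The sketch also assumes that "citations in common" means the number of references two documents both cite, which the help page does not spell out.

```r
# Toy citation incidence matrix (hypothetical data, not from coraAI):
# rows are documents, columns are cited references; 1 = document cites reference.
cites <- matrix(
  c(1, 1, 0, 1,
    1, 0, 1, 1,
    0, 0, 1, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("doc1", "doc2", "doc3"), paste0("ref", 1:4))
)

# Entry (i, j) counts the references that documents i and j both cite.
common <- tcrossprod(cites)

# In this sketch a document is not linked to itself, so clear the diagonal.
diag(common) <- 0

print(common)
```

The result is a symmetric weighted adjacency matrix with one row and one column per document, which is the general shape the cite object is described as having.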

The goal is to combine the text information and the co-citation information to predict each paper's subject (AI or ML). Another interesting problem might be to predict the journal of a paper given the text information and the categorization.

----------Usage----------

data(coraAI)
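
As a hedged usage sketch, the data can be loaded into a scratch environment so that the names of the loaded objects are discovered rather than assumed. The variable names `env`, `loaded`, `nm`, and `obj` are illustrative only, and the block does nothing beyond printing a message if the spa package is not installed.

```r
# Load coraAI into a scratch environment and inspect whatever it provides.
if (requireNamespace("spa", quietly = TRUE)) {
  env <- new.env()
  data("coraAI", package = "spa", envir = env)
  loaded <- ls(env)
  print(loaded)                      # names of the objects that were loaded
  for (nm in loaded) {
    obj <- get(nm, envir = env)
    cat(nm, ":", class(obj)[1], "\n")
    print(dim(obj))                  # NULL for plain vectors and factors
  }
} else {
  message("Package 'spa' is not installed.")
}
```

Loading into a separate environment also avoids masking existing objects in the workspace, which matters here because the Format section names one of the objects class.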

----------Format----------

The coraAI data consist of three objects, each described below.

class: categorization of each document (observation) as either AI or ML. Typically used as the response.

journals: indicator of the journal in which each document was published (other, artificial-intelligence, machine-learning, nueral-computing, ieee-trans-Nnet, ieee-tpami, j-artificial-intelligence-research, ai-magazine, JASA).

cite: the adjacency matrix of the co-citation network for these 879 documents.

----------Details----------

The spa is particularly appealing for this data since it fits a function directly to the graph and a coefficient vector to the journals. Other approaches require converting the journal information into a graph for processing, and it is unclear how to do that when the data are a binary design matrix.

----------Source----------

The data were generated with AWK scripts from the Cora raw suite (first reference). The journal names were normalized to obtain a usable representation (e.g. tpami, ieee tpami, and pami were all mapped to ieee-tpami).
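
The original preprocessing used AWK; the following is only a rough R sketch of the same idea, mapping variant spellings to one canonical journal name and then building a 0/1 indicator matrix. Only the three ieee-tpami variants come from the text above. The other raw strings, the lookup table `canonical`, and the rule that unmatched names fall into "other" are assumptions made for the example.

```r
# Hypothetical raw journal strings; only the three ieee-tpami variants come
# from the help page, the rest are made up for the example.
raw <- c("tpami", "ieee tpami", "pami", "machine learning", "some workshop")

# Lookup table from raw spelling to canonical name (an assumed mapping).
canonical <- c(
  "tpami"            = "ieee-tpami",
  "ieee tpami"       = "ieee-tpami",
  "pami"             = "ieee-tpami",
  "machine learning" = "machine-learning"
)

# Map each raw string; anything not in the table falls into "other".
journal <- unname(canonical[raw])
journal[is.na(journal)] <- "other"

# One 0/1 column per journal, one row per document.
f <- factor(journal, levels = c("other", "machine-learning", "ieee-tpami"))
indicator <- model.matrix(~ f - 1)
colnames(indicator) <- levels(f)

print(journal)
print(indicator)
```

Each row of `indicator` contains exactly one 1, which matches the binary design matrix form that the Details section attributes to the journal information.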

----------References----------

A. McCallum, K. Nigam, J. Rennie, and K. Seymore (2000). Automating the construction of internet portals with machine learning. Information Retrieval Journal, 3.
M. Culp (2011). spa: A Semi-Supervised R Package for Semi-Parametric Graph-Based Estimation. Journal of Statistical Software, 40(10), 1-29. URL http://www.jstatsoft.org/v40/i10/.

When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).
