inpIDMapper(AnnotationDbi)
inpIDMapper() belongs to the R package AnnotationDbi
Convenience functions for mapping IDs through an appropriate set of
Translator: 生物统计家园网 robot LoveR
Description
These convenience functions take a list of IDs together with some additional information: what type of ID they are, what type of ID you would like them to become, which species they come from, and which species you would like them mapped to. They then attempt the simplest possible conversion using the organism annotation packages and, where needed, the inparanoid annotation packages. By default, ambiguous matches are dropped from the results. Please see the Details section for more information about the parameters that affect this. If you need more complex handling of multiple matches, a less convenient approach will likely be necessary.
Usage
inpIDMapper(ids, srcSpecies, destSpecies, srcIDType="UNIPROT",
            destIDType="EG", keepMultGeneMatches=FALSE, keepMultProtMatches=FALSE,
            keepMultDestIDMatches=TRUE)
intraIDMapper(ids, species, srcIDType="UNIPROT", destIDType="EG",
              keepMultGeneMatches=FALSE)
idConverter(ids, srcSpecies, destSpecies, srcIDType="UNIPROT",
            destIDType="EG", keepMultGeneMatches=FALSE, keepMultProtMatches=FALSE,
            keepMultDestIDMatches=TRUE)
Arguments
Argument: ids
A list or vector of original IDs to match.
Argument: srcSpecies
The source species in inparanoid format: the first 3 letters of the genus followed by the first 2 letters of the species, in all caps. For example, 'HOMSA' is Homo sapiens.
Argument: destSpecies
The destination species in inparanoid format.
Argument: species
The species involved.
Argument: srcIDType
The source ID type, written exactly as it would be used in a mapping name for an eg package. For example, the UniProt mappings are always written as 'UNIPROT', so that convention is kept here.
Argument: destIDType
The destination ID type, written the same way as srcIDType. By default this is set to "EG" for Entrez Gene IDs.
Argument: keepMultGeneMatches
Should the 1st ID be kept in those ambiguous cases where more than one gene is suggested? (You probably want to filter these out, hence the default is FALSE.)
Argument: keepMultProtMatches
Should the 1st ID be kept in those ambiguous cases where more than one protein is suggested? (default = FALSE)
Argument: keepMultDestIDMatches
If you have mapped to a destination ID other than an Entrez Gene ID, there may be multiple answers. Do you want to keep all of them or return only the 1st one? (default = TRUE)
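The inparanoid species format described for srcSpecies can be illustrated with a small helper. This is only a sketch of the naming rule as stated above; toInpCode is a hypothetical name and is not part of AnnotationDbi, and it has not been checked against every species code that the inparanoid packages use.

```r
# Hypothetical helper (not an AnnotationDbi function): build a species code
# from a binomial name using the rule "first 3 letters of the genus plus
# first 2 letters of the species, in all caps".
toInpCode <- function(binomial) {
  parts <- strsplit(binomial, " ", fixed = TRUE)[[1]]
  stopifnot(length(parts) == 2)
  toupper(paste0(substr(parts[1], 1, 3), substr(parts[2], 1, 2)))
}

toInpCode("Homo sapiens")              # "HOMSA"
toInpCode("Mus musculus")              # "MUSMU"
toInpCode("Saccharomyces cerevisiae")  # "SACCE"
```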
Details
inpIDMapper - This is a convenience function for mapping an ID from one species to an ID type of your choice in another organism of your choice. The only mappings used are those scored as 100 by the inparanoid algorithm. The function automatically tries to join IDs by using five different mappings in the following sequence:
1) initial IDs -> src organism Entrez Gene IDs
2) src organism Entrez Gene IDs -> src organism Inparanoid ID
3) src organism Inparanoid ID -> dest organism Inparanoid ID
4) dest organism Inparanoid ID -> dest organism Entrez Gene ID
5) dest organism Entrez Gene ID -> final destination organism ID
You can simplify this mapping as a series of steps like this:
srcIDs --(1)--> srcEGs --(2)--> srcInp --(3)--> destInp --(4)--> destEGs --(5)--> destIDs
There are two steps in this process where multiple mappings can really interfere with getting a clear answer. It is no coincidence that these are adjacent to the two places where the identity has to be tied to a single gene for each organism; at those points, any ambiguity is confounding. Preceding step #2, it is critical that there is only ONE Entrez Gene ID per initial ID. The parameter keepMultGeneMatches toggles whether to drop any ambiguous matches (the default) or to keep the 1st one in the hope of getting an additional hit. A similar check is done preceding step #4, where the protein IDs returned must all map to only one gene. The keepMultProtMatches parameter lets you make the same kind of decision as for step #2; again, the default is to drop anything that is ambiguous.
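The chained lookups and the two ambiguity checks can be sketched in base R with toy tables. Everything below is an illustration under stated assumptions: the IDs are invented, toyInpMapper, lookup and resolve are hypothetical names, step 5 is omitted because the destination type is taken to be Entrez Gene, and the score filter of the inparanoid algorithm is not modelled. It is not the actual AnnotationDbi implementation.

```r
# Toy lookup tables with invented IDs: key -> character vector of matches.
srcID2EG   <- list(P1 = "10", P2 = c("20", "21"), P3 = "30")
eg2srcInp  <- list("10" = "inpA", "20" = "inpB", "30" = "inpC")
srcInp2dst <- list(inpA = "inpX", inpB = "inpY", inpC = c("inpZ1", "inpZ2"))
dstInp2EG  <- list(inpX = "100", inpY = "200", inpZ1 = "300", inpZ2 = "301")

# One mapping step: replace each element's keys by their matches in `table`.
lookup <- function(x, table) {
  lapply(x, function(keys) unique(unlist(table[keys], use.names = FALSE)))
}

# Ambiguity check: unmatched entries are always dropped; entries with several
# matches are dropped (default) or reduced to their first match.
resolve <- function(x, keepFirst = FALSE) {
  n <- lengths(x)
  if (keepFirst) lapply(x[n >= 1], `[`, 1) else x[n == 1]
}

toyInpMapper <- function(ids, keepMultGeneMatches = FALSE,
                         keepMultProtMatches = FALSE) {
  x <- setNames(as.list(ids), ids)
  x <- lookup(x, srcID2EG)               # step 1: initial IDs -> src EGs
  x <- resolve(x, keepMultGeneMatches)   # one gene per ID before step 2
  x <- lookup(x, eg2srcInp)              # step 2: src EGs -> src inparanoid
  x <- lookup(x, srcInp2dst)             # step 3: src -> dest inparanoid
  x <- resolve(x, keepMultProtMatches)   # one protein per ID before step 4
  x <- lookup(x, dstInp2EG)              # step 4: dest inparanoid -> dest EGs
  x[lengths(x) > 0]                      # unmatched IDs are not returned
}

ids <- c("P1", "P2", "P3", "P4")
toyInpMapper(ids)                              # only P1 survives
toyInpMapper(ids, keepMultGeneMatches = TRUE)  # P1 and P2
```

In this toy data P2 is ambiguous at the gene level and P3 at the protein level, so each flag rescues a different ID, while P4 has no match at all and is never returned.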
intraIDMapper - This is a convenience function for mapping within a single organism, so it has a much simpler job to do. It maps through either one mapping or two, depending on whether the source ID or the destination ID is the central ID for the relevant organism package. If either is the central ID, one mapping suffices; if neither is, two mappings are needed.
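The one-mapping-or-two rule can be written down as a small planner. This is a sketch only: planIntraMapping is a hypothetical name, and the central ID is assumed to be "EG", which may not hold for every organism package.

```r
# Hypothetical planner: list the lookups needed to get from srcIDType to
# destIDType, assuming `central` is the organism package's central ID type.
planIntraMapping <- function(srcIDType, destIDType, central = "EG") {
  if (srcIDType == destIDType) return(character(0))  # nothing to map
  if (srcIDType == central || destIDType == central) {
    return(paste(srcIDType, "->", destIDType))       # one mapping suffices
  }
  c(paste(srcIDType, "->", central),                 # otherwise go via the
    paste(central, "->", destIDType))                # central ID: two mappings
}

planIntraMapping("UNIPROT", "EG")    # one step
planIntraMapping("UNIPROT", "PATH")  # two steps, through EG
```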
idConverter - This is mostly for convenient use of these functions by developers. It is just a wrapper that passes all the parameters along to the appropriate function (intraIDMapper or inpIDMapper), which it chooses based on the source and destination organisms. The only disadvantage of using this function all the time is that more parameters have to be filled out on each call.
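The wrapper pattern can be sketched with stand-in functions. The dispatch rule used here, a simple comparison of the two species codes, is an assumption about how the choice is made, and toyIntra, toyInter and toyIdConverter are hypothetical names that only report which route would be taken.

```r
# Stand-ins for the real mappers (hypothetical); they do no mapping at all.
toyIntra <- function(ids, species, ...) {
  paste("intra:", species)
}
toyInter <- function(ids, srcSpecies, destSpecies, ...) {
  paste("inter:", srcSpecies, "->", destSpecies)
}

# Wrapper: same species goes to the within-organism mapper, otherwise to the
# cross-species mapper; extra arguments are passed along unchanged.
toyIdConverter <- function(ids, srcSpecies, destSpecies, ...) {
  if (identical(srcSpecies, destSpecies)) {
    toyIntra(ids, species = srcSpecies, ...)
  } else {
    toyInter(ids, srcSpecies, destSpecies, ...)
  }
}

toyIdConverter("P1", "MUSMU", "MUSMU")  # "intra: MUSMU"
toyIdConverter("P1", "SACCE", "HOMSA")  # "inter: SACCE -> HOMSA"
```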
Value
A list where the names of the elements are the elements of the original list you passed in, and the values are the matching results. Elements that do not have a match are not returned. If you want the results to align with the input, you will need to do some bookkeeping.
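One possible form of that bookkeeping is to re-index the result by the original IDs and fill the gaps with NA. The sketch below uses an invented result list in the shape described above; the variable names are illustrative only.

```r
# Invented example: four input IDs, of which only two were mapped.
ids <- c("P1", "P2", "P3", "P4")
res <- list(P1 = "100", P3 = c("300", "301"))

# Rebuild a list in the original order; unmatched IDs get NA.
aligned <- setNames(
  lapply(ids, function(id) {
    if (is.null(res[[id]])) NA_character_ else res[[id]]
  }),
  ids
)

names(aligned)          # same order as ids
ids[is.na(aligned)]     # the IDs that had no match
```

Keeping the result as a list preserves entries with several matches, such as P3 here; flattening to a vector would require choosing one match per ID first.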
Author(s)
Marc Carlson
Examples
## Not run:
## This has to be in a dontrun block because otherwise I would have to
## expand the DEPENDS field for AnnotationDbi
## library("org.Hs.eg.db")
## library("org.Mm.eg.db")
## library("org.Sc.eg.db")
## library("hom.Hs.inp.db")
## library("hom.Mm.inp.db")
## library("hom.Sc.inp.db")

## Some IDs just for the example
library("org.Hs.eg.db")
ids = as.list(org.Hs.egUNIPROT)[10000:10500]  ## get some ragged IDs

## Get entrez gene IDs (default) for uniprot IDs mapping from human to mouse.
MouseEGs = inpIDMapper(ids, "HOMSA", "MUSMU")

## Get yeast uniprot IDs in exchange for uniprot IDs from human
YeastUPs = inpIDMapper(ids, "HOMSA", "SACCE", destIDType="UNIPROT")

## Get yeast uniprot IDs but only return one ID per initial ID
YeastUPSingles = inpIDMapper(ids, "HOMSA", "SACCE", destIDType="UNIPROT",
                             keepMultDestIDMatches=FALSE)

## Test out the intraIDMapper function:
HumanEGs = intraIDMapper(ids, species="HOMSA", srcIDType="UNIPROT",
                         destIDType="EG")
HumanPATHs = intraIDMapper(ids, species="HOMSA", srcIDType="UNIPROT",
                           destIDType="PATH")

## Test out the wrapper function
MousePATHs = idConverter(MouseEGs, srcSpecies="MUSMU", destSpecies="MUSMU",
                         srcIDType="EG", destIDType="PATH")

## Convert from Yeast uniprot IDs to Human entrez gene IDs.
HumanEGs = idConverter(YeastUPSingles, "SACCE", "HOMSA")
## End(Not run)
When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).