How to split a column in R?
I want to split a column in the following way, but it is not working properly.
The code I used:
    t38kbat = read.table("test38kbat.txt", header = FALSE)
    head(t38kbat)
    t38kbat <- separate (t38kbat, v2, c("sp", "id", "gene_organism"), \\"|")
    t38kbat <- separate (t38kbat, gene_organism, c("gene", "organism"), \\"_")
    t38kbat <- unite (t38kbat, sp, sp, id, sep = "|")
When I run the script, I receive this error:
    Error: unexpected input in "t38kbat <- separate (t38kbat, v2, c("sp", "id", "gene_organism"), \"
Can anyone guide me on how to resolve it?
In base R, the strsplit command will operate on a vector of that form, but it produces a list, which you will have to simplify further.
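As a minimal sketch of what "produces a list" means here, the snippet below splits a made-up vector and then simplifies the result into a matrix. The vector `y` and the names `parts` and `m` are illustrative only, and the row-binding step assumes every string splits into the same number of pieces.

```r
# Made-up input vector of pipe-delimited strings.
y <- c("a|b|c", "b|c|d", "d|e|f")

# strsplit() returns a list with one character vector per input element.
parts <- strsplit(y, "\\|")

# Stack the pieces row-wise into a character matrix (one row per input string).
# This assumes every string splits into the same number of pieces.
m <- do.call(rbind, parts)
print(m)
```

The matrix can then be bound onto the original data frame, although the column names still have to be set by hand.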
In the tidyr package, there's a separate function that will preserve the data frame nature of things. That's the preferable approach for this use case.
For example:
    > library(tidyr)
    > a <- data.frame(x=1:3, y=c("a|b|c", "b|c|d", "d|e|f"))
    > a
      x     y
    1 1 a|b|c
    2 2 b|c|d
    3 3 d|e|f
    > separate(a, y, c("a","b","c"), '\\|')
      x a b c
    1 1 a b c
    2 2 b c d
    3 3 d e f
To flesh out the strsplit solution slightly, you have to use an awkward combination of cbind and rbind there, binding the split pieces row-wise so that each row keeps its own values:
    > cbind(a, do.call(rbind, strsplit(as.character(a$y), "\\|")))
      x     y 1 2 3
    1 1 a|b|c a b c
    2 2 b|c|d b c d
    3 3 d|e|f d e f
Edit: I should note that if you use the tidyr approach, you'll have to apply it repeatedly, possibly together with unite, to get the complete behavior. Something like:
    df <- separate(df, col, c("type", "subtype", "rawclass"), "\\|")
    df <- separate(df, rawclass, c("class", "subclass"), "_")
    df <- unite(df, sp, type, subtype, sep="|")
This assumes your original column is called col; the other names are made-up names for the final headers.
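The error in the question appears to come from where the backslashes sit: in `\\"|"` they are outside the quotes, which R's parser rejects as unexpected input, whereas `"\\|"` keeps them inside the string. The sketch below uses base R's strsplit on a hypothetical sample value (the string `x` is made up to resemble the question's column) to show why the pipe needs escaping at all when the separator is treated as a regular expression.

```r
# Hypothetical sample value in the same shape as the question's column.
x <- "sp|P12345|GENE_HUMAN"

# An unescaped "|" is regex alternation between two empty patterns,
# so strsplit() does not split on the pipe; it breaks the string into
# single characters instead.
unescaped <- strsplit(x, "|")[[1]]

# Escaped inside the quotes: "\\|" is the regex \| and matches a literal pipe.
escaped <- strsplit(x, "\\|")[[1]]

# fixed = TRUE treats the pattern as a plain string, so no escaping is needed.
literal <- strsplit(x, "|", fixed = TRUE)[[1]]

print(escaped)
```

The underscore in the second separate call has no special meaning in a regular expression, so a plain `"_"` should be enough there.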