Scala - Format a (K, (V, W)) pair in a Spark RDD
I have an RDD like this:
val custfile = sc.textFile("custinfo.txt").map(line => line.split('|'))
val custprd = custfile.map(a => (a(0), (a(1), (a(2), a(3), a(4), a(5), a(6), a(7), a(8)))))
val custgrp = custprd.groupByKey
custgrp.saveAsTextFile("custinfo2")
That produces this:
(1104,CompactBuffer((s_savg,(1,1,1,1,1,1,1)), (cn_savg,(4,4,1,1,4,1,1))))
How can I use this:
custprdgrp.map{case (k, vals) => {val valsstring = vals.mkString(", "); s"{$k:, {$valsstring}}" }}
to format a (K, (V, W)) pair? I tried the following and got an error:
val custprdrep = custprdgrp.map({case (k, (v, w)) => {val valsstring = v.mkString(", "); val valsprvcy = w.mkString(", "); s"'${k}'| [$valsstring]" }})

<console>:27: error: constructor cannot be instantiated to expected type;
 found   : (T1, T2)
 required: Iterable[(String, (String, String, String, String, String, String, String))]
       val custprdrep = custprdgrp.map({case (k, (v, w)) => {val valsstring = v.mkString(", "); val valsprvcy = w.mkString(", "); s"'${k}'| [$valsstring]" }})
       ^
<console>:27: error: not found: value v
       val custprdrep = custprdgrp.map({case (k, (v, w)) => {val valsstring = v.mkString(", "); val valsprvcy = w.mkString(", "); s"'${k}'| [$valsstring]" }})
       ^
<console>:27: error: not found: value w
       val custprdrep = custprdgrp.map({case (k, (v, w)) => {val valsstring = v.mkString(", "); val valsprvcy = w.mkString(", "); s"'${k}'| [$valsstring]" }})
I'd like to get an array like this:
('1104'|{'s_savg': {a: '1', b: '1', c: '1', d: '1', e: '1', f: '1', g: '1'}, 'cn_savg': {a: '4', b: '4', c: '1', d: '1', e: '4', f: '1', g: '1'}})
Well, there are quite a lot of details here, but this should work:
val keys = List("a", "b", "c", "d", "e", "f", "g")

custgrp.map{case (k, vals) => {
  val valsstring = vals map {
    case (val1, val2) => {
      val pairs = keys
        // create someletter: 'somenumber' pairs
        .zip(val2.productIterator.map{case (x: String) => x}.toSeq)
        .map{case (k, v) => s"$k: '$v'"}
        // join into a single string
        .mkString(", ")

      // add the "key"
      s"'$val1': {$pairs}"
    }
  }
  // combine the above
  val valscomb = valsstring.mkString(", ")

  // create the final string
  s"('$k'|{$valscomb})"
}}
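To try the formatting logic without a Spark cluster, the same steps can be run on a local collection. This is only a minimal sketch: the object name FormatGroupSketch, the Flags type alias, and the formatGroup helper are made up for illustration, and the local Seq stands in for one element of the grouped RDD.

```scala
object FormatGroupSketch {
  // Field labels, as in the answer above
  val keys = List("a", "b", "c", "d", "e", "f", "g")

  // Hypothetical alias for the 7-field tuple built in the question
  type Flags = (String, String, String, String, String, String, String)

  // Format one grouped record: a key plus every (product, flags) pair for it
  def formatGroup(k: String, vals: Iterable[(String, Flags)]): String = {
    val entries = vals.map { case (prod, flags) =>
      val pairs = keys
        // pair each label with the tuple field at the same position
        .zip(flags.productIterator.map(_.toString).toSeq)
        .map { case (label, value) => s"$label: '$value'" }
        .mkString(", ")
      s"'$prod': {$pairs}"
    }
    val body = entries.mkString(", ")
    s"('$k'|{$body})"
  }

  def main(args: Array[String]): Unit = {
    // Local stand-in for the grouped data shown in the question
    val grouped = Seq(
      ("1104", Seq(
        ("s_savg", ("1", "1", "1", "1", "1", "1", "1")),
        ("cn_savg", ("4", "4", "1", "1", "4", "1", "1"))))
    )
    grouped.map { case (k, vals) => formatGroup(k, vals) }.foreach(println)
  }
}
```

The key point is that the value of each grouped pair is an Iterable of (product, tuple) pairs, so the inner tuple has to be matched inside a map over that Iterable, not in the outer pattern.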
You could simplify things by creating the correct data structure in the first place, for example by using maps instead of tuples:
Map("s_savg" -> Map("a" -> "1", "b" -> "1", ...), ...)
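One way the map-based idea might look is sketched below on plain Scala collections. The names MapRecordSketch, parse, render and formatGroup are hypothetical, and the sample lines are an assumed input format inferred from the split on '|' and the nine fields used in the question. Because a Scala Map does not guarantee iteration order, the sketch renders fields by walking the label list.

```scala
object MapRecordSketch {
  val labels = List("a", "b", "c", "d", "e", "f", "g")

  // Parse one '|'-separated line into (customer, (product, label -> value))
  def parse(line: String): (String, (String, Map[String, String])) = {
    val a = line.split('|')
    (a(0), (a(1), labels.zip(a.drop(2).toList).toMap))
  }

  // Render the fields in label order, since Map iteration order is unspecified
  def render(fields: Map[String, String]): String =
    labels.flatMap(l => fields.get(l).map(v => s"$l: '$v'")).mkString(", ")

  // Format every product map belonging to one customer key
  def formatGroup(k: String, prods: Iterable[(String, Map[String, String])]): String = {
    val body = prods
      .map { case (prod, fields) => s"'$prod': {${render(fields)}}" }
      .mkString(", ")
    s"('$k'|{$body})"
  }

  def main(args: Array[String]): Unit = {
    // Assumed sample input, matching the fields used in the question
    val lines = Seq("1104|s_savg|1|1|1|1|1|1|1", "1104|cn_savg|4|4|1|1|4|1|1")
    lines.map(parse)
      .groupBy(_._1)
      .map { case (k, recs) => formatGroup(k, recs.map(_._2)) }
      .foreach(println)
  }
}
```

With Spark, parse could presumably be applied in the first map over the text file and formatGroup after groupByKey, but that wiring is not shown in the original answer.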