scala - Format a (K, (V, W)) pair in a Spark RDD


I have an RDD built like this:

val custfile = sc.textFile("custinfo.txt").map(line => line.split('|'))
val custprd = custfile.map(a => (a(0), ((a(1)), (a(2), a(3), a(4), a(5), a(6), a(7), a(8)))))
val custgrp = custprd.groupByKey
custgrp.saveAsTextFile("custinfo2")

that produces this:

(1104,CompactBuffer((s_savg,(1,1,1,1,1,1,1)), (cn_savg,(4,4,1,1,4,1,1))))

How can I use this:

custprdgrp.map{case (k, vals) => {val valsstring = vals.mkString(", "); s"{$k:, {$valsstring}}" }}

to format a (k, (v, w)) pair? I tried the following and got an error:

val custprdrep = custprdgrp.map({case (k, (v, w)) => {val valsstring = v.mkString(", "); val valsprvcy = w.mkString(", "); s"'${k}'| [$valsstring]" }})

<console>:27: error: constructor cannot be instantiated to expected type;
 found   : (T1, T2)
 required: Iterable[(String, (String, String, String, String, String, String, String))]
       val custprdrep = custprdgrp.map({case (k, (v, w)) => {val valsstring = v.mkString(", "); val valsprvcy = w.mkString(", "); s"'${k}'| [$valsstring]" }})
                                                 ^
<console>:27: error: not found: value v
       val custprdrep = custprdgrp.map({case (k, (v, w)) => {val valsstring = v.mkString(", "); val valsprvcy = w.mkString(", "); s"'${k}'| [$valsstring]" }})
                                                                              ^
<console>:27: error: not found: value w
       val custprdrep = custprdgrp.map({case (k, (v, w)) => {val valsstring = v.mkString(", "); val valsprvcy = w.mkString(", "); s"'${k}'| [$valsstring]" }})
                                                                                                                ^

I'd like each record to come out like this:

('1104'|{'s_savg': {a: '1', b: '1', c: '1', d: '1', e: '1', f: '1', g: '1'}, 'cn_savg': {a: '4', b: '4', c: '1', d: '1', e: '4', f: '1', g: '1'}}) 

Well, there are quite a lot of details here, but this should work:

val keys = List("a", "b", "c", "d", "e", "f", "g")

custgrp.map{case (k, vals) => {
    val valsstring = vals map {
        case (val1, val2) => {
            val pairs = keys
                // create someletter: 'somenumber' pairs
                .zip(val2.productIterator.map{case (x: String) => x}.toSeq)
                .map{case (k, v) => s"$k: '$v'"}
                // join into a single string
                .mkString(", ")
            // add "key"
            s"'$val1': {$pairs}"
        }
    }
    // combine the above
    val valscomb = valsstring.mkString(", ")
    // create the final string
    s"('$k'|{$valscomb})"
}}
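
For context on why the attempt in the question fails: after groupByKey, each element is (key, Iterable[(name, tuple)]), so the pattern (k, (v, w)) cannot match the second component. The pair pattern has to be applied to each element inside the Iterable instead. Here is a minimal sketch of the same formatting with plain Scala collections standing in for the RDD, so it can be run without a cluster. The names FormatGrouped, Flags, formatRecord and grouped are illustrative assumptions, not part of the original post; the sample values are the ones shown in the question.

```scala
object FormatGrouped {
  // Assumed alias for the 7-field tuple built in the question.
  type Flags = (String, String, String, String, String, String, String)

  val keys = List("a", "b", "c", "d", "e", "f", "g")

  // Hypothetical helper: formats one (key, grouped values) record.
  def formatRecord(k: String, vals: Iterable[(String, Flags)]): String = {
    val inner = vals.map { case (name, flags) =>
      // Pair each letter with the tuple field at the same position.
      val pairs = keys
        .zip(flags.productIterator.map(_.toString).toList)
        .map { case (letter, value) => s"$letter: '$value'" }
        .mkString(", ")
      s"'$name': {$pairs}"
    }.mkString(", ")
    s"('$k'|{$inner})"
  }

  // Stand-in for a single element of custgrp.
  val grouped: (String, Iterable[(String, Flags)]) =
    ("1104", Seq(
      ("s_savg", ("1", "1", "1", "1", "1", "1", "1")),
      ("cn_savg", ("4", "4", "1", "1", "4", "1", "1"))))

  def main(args: Array[String]): Unit =
    println(formatRecord(grouped._1, grouped._2))
}
```

With the real RDD, the same helper could presumably be applied as custgrp.map { case (k, vals) => FormatGrouped.formatRecord(k, vals) }.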

You can simplify things by creating the correct data structure in the first place, for example by using maps instead of tuples:

 Map("s_savg" -> Map("a" -> "1", "b" -> "1", ...), ...)
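
A rough sketch of what the map-based variant might look like, again with plain collections instead of an RDD. It assumes each input line has the form customer|product|seven values, which is inferred from the indexing in the question rather than stated there; MapFormat, toRecord and formatRecord are hypothetical names. Values are looked up in the order of keys because a Scala Map does not guarantee iteration order.

```scala
object MapFormat {
  val keys = List("a", "b", "c", "d", "e", "f", "g")

  // Hypothetical helper: one split line becomes
  // (customer, (product, letter -> value)).
  def toRecord(fields: Array[String]): (String, (String, Map[String, String])) =
    (fields(0), (fields(1), keys.zip(fields.drop(2).toList).toMap))

  // Hypothetical helper: formats one customer and its product maps.
  def formatRecord(k: String, vals: Iterable[(String, Map[String, String])]): String = {
    val inner = vals.map { case (name, m) =>
      // Walk `keys` rather than the map itself to keep a stable order.
      val pairs = keys
        .flatMap(letter => m.get(letter).map(value => s"$letter: '$value'"))
        .mkString(", ")
      s"'$name': {$pairs}"
    }.mkString(", ")
    s"('$k'|{$inner})"
  }

  def main(args: Array[String]): Unit = {
    // Assumed sample lines matching the values shown in the question.
    val lines = List("1104|s_savg|1|1|1|1|1|1|1", "1104|cn_savg|4|4|1|1|4|1|1")
    val records = lines.map(line => toRecord(line.split('|')))
    records.groupBy(_._1).foreach { case (k, recs) =>
      println(formatRecord(k, recs.map(_._2)))
    }
  }
}
```

The formatting step no longer needs productIterator or a type pattern, since the letters travel with the values from the moment the line is parsed.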
