I am using Spark's Python API and am finding a few matrix operations challenging. My RDD is a one-dimensional list of length n (a row vector). Is it possible to reshape it into a matrix/multidimensional array of size sqrt(n) x sqrt(n)?
For example, for n = 9 the desired 3 x 3 output is:
[[1,2,3] [4,5,6] [7,8,9]]
Is there a Spark equivalent of NumPy's reshape?
Constraints: n is huge (> 50 million), which rules out using .collect(). Can this process also be made to run in parallel on multiple threads?
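To make the requested operation concrete, here is a minimal sketch of one way such a reshape might be expressed with plain RDD operations, without collecting the data on the driver. It assumes n is a perfect square and known in advance, and that the RDD's element order is the intended row-major order. The helper names (to_row_entry, assemble_row, reshape_rdd, run_local_demo) are made up for illustration and are not part of Spark.

```python
import math


def to_row_entry(pair, side):
    """Map (value, flat_index) to (row_index, (col_index, value))."""
    value, idx = pair
    return idx // side, (idx % side, value)


def assemble_row(entries):
    """Sort one row's (col_index, value) pairs by column and keep the values."""
    return [value for _, value in sorted(entries)]


def reshape_rdd(rdd, n):
    """Return an RDD of (row_index, row_values) for a sqrt(n) x sqrt(n) matrix.

    Assumption: n is a perfect square and equals the number of elements in rdd.
    """
    side = math.isqrt(n)
    if side * side != n:
        raise ValueError("n must be a perfect square")
    return (
        rdd.zipWithIndex()                        # (value, flat_index)
        .map(lambda p: to_row_entry(p, side))     # (row, (col, value))
        .groupByKey()                             # gather each row's entries
        .mapValues(assemble_row)                  # order entries by column
        .sortByKey()                              # order rows by row index
    )


def run_local_demo():
    """Hypothetical local run on the 3 x 3 example; requires pyspark."""
    from pyspark import SparkContext

    sc = SparkContext("local[*]", "reshape-sketch")
    try:
        rdd = sc.parallelize(range(1, 10))
        # Collecting is only reasonable here because the example is tiny.
        print(reshape_rdd(rdd, 9).values().collect())
    finally:
        sc.stop()
```

Calling run_local_demo() runs the sketch on the 3 x 3 example using local threads ("local[*]"). The result stays distributed as an RDD of rows, so for large n it would be consumed row by row rather than collected. Note that groupByKey shuffles the data, so this is a starting point rather than a tuned solution.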