I have a problem reading a CSV (or txt) file with the pandas module. Because numpy's loadtxt function takes too much time, I decided to use pandas read_csv instead.
I want to make a numpy array from a txt file that has four columns separated by spaces and a very large number of rows (like 256^3; in this example it is 4^3 = 64).
The problem is that, for a reason I don't understand, pandas's read_csv always seems to skip the first line (first row) of the csv (txt) file, resulting in one less row of data.
Here is the code.
    from __future__ import division
    import numpy as np
    import pandas as pd

    ngridx = 4
    ngridy = 4
    ngridz = 4
    size = ngridx*ngridy*ngridz

    f = np.zeros((size,4))
    a = np.arange(size)
    f[:, 0] = np.floor_divide(a, ngridy*ngridz)
    f[:, 1] = np.fmod(np.floor_divide(a, ngridz), ngridy)
    f[:, 2] = np.fmod(a, ngridz)
    f[:, 3] = np.random.rand(size)
    print f

    np.savetxt('Testarray.txt',f,fmt='%6.16f')

    g = pd.read_csv('Testarray.txt',delimiter=' ').values
    print g
    print len(g[:,3])
The f and g displayed as output should match, but they don't, which indicates that pandas is skipping the first line of Testarray.txt. Also, the length of the loaded array g is less than the length of the array f.
I need help.
Thanks in advance.
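
A possible explanation, offered as a sketch and not a confirmed diagnosis of the setup above: by default, read_csv uses the first line of the file as column names (header='infer'), so that line never reaches .values. Passing header=None tells pandas that the file has no header line. The short in-memory text below is a made-up stand-in for Testarray.txt, and the names g_default and g_full are only for illustration.

```python
import io

import pandas as pd

# Hypothetical stand-in for Testarray.txt:
# 3 rows, 4 space-separated columns, no header line.
text = u"0.0 0.0 0.0 0.25\n0.0 0.0 1.0 0.50\n0.0 0.0 2.0 0.75\n"

# Default behaviour: the first line is consumed as column names.
g_default = pd.read_csv(io.StringIO(text), delimiter=' ').values

# header=None: every line, including the first, is treated as data.
g_full = pd.read_csv(io.StringIO(text), delimiter=' ', header=None).values

print(g_default.shape)  # (2, 4) -> one row short
print(g_full.shape)     # (3, 4) -> all rows present
```

If this is indeed the cause, the same header=None argument applied to the read_csv call on 'Testarray.txt' should make len(g[:,3]) equal to size.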