pandas
pandas is an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language.
I. Installation
1. pandas
git clone git://github.com/pydata/pandas.git
2. Requirements
CPython
pytz
II. Example:
Data format:
Output of the dstat command
"epoch","1m","5m","15m","read","writ","read","writ","usr","sys","idl","wai","hiq","siq","recv","send","used","buff","cach","free","util","util"1384246698.290,2.730,1.190,0.940,4316699.658,3074071.976,283.904,131.427,1.172,0.464,96.302,1.858,0.022,0.182,0.0,0.0,6959415296.0,1933697024.0,19159023616.0,22583123968.0,0.155,0.2921384246699.290,2.730,1.190,0.940,4579328.0,880640.0,560.0,27.0,14.518,0.876,82.228,1.877,0.0,0.501,363556.0,2888961.0,6973390848.0,1933697024.0,19159031808.0,22569140224.0,0.0,17.6001384246700.290,2.730,1.190,0.940,3678208.0,516096.0,449.0,18.0,13.965,0.623,83.292,1.621,0.125,0.374,140745.0,1662141.0,6983143424.0,1933697024.0,19159306240.0,22559113216.0,0.0,13.8001384246701.290,2.510,1.170,0.930,4292608.0,667648.0,524.0,49.0,7.644,0.752,89.348,1.629,0.0,0.627,139858.0,1056176.0,6971334656.0,1933697024.0,19159306240.0,22570921984.0,0.0,15.01384246702.290,2.510,1.170,0.930,4104192.0,684032.0,502.0,22.0,3.616,0.748,93.267,1.621,0.0,0.748,126114.0,1135695.0,6983221248.0,1933697024.0,19159592960.0,22558748672.0,0.0,14.800
Program:
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import pandas as pd
import numpy as np

index_list = []       # column names taken from the header line
flag = False          # becomes True once the header line has been seen
all_data_list = []    # one list of floats per data row

with open("./ssd_dstat.txt") as f:
    for line in f:
        if "buff" in line:
            # Header line: strip the quotes and make repeated names unique.
            index_str_list = line.strip('\n').split(',')
            index_list = []
            for i in index_str_list:
                column_name = i.strip('"')
                if column_name in index_list:
                    column_name = column_name + '_1'
                index_list.append(column_name)
            print(index_list)
            flag = True
        elif flag:
            # Data line: convert every field to float.
            data_str_list = line.strip('\n').split(',')
            data_list = []
            for i in data_str_list:
                data_list.append(float(i))
            all_data_list.append(data_list)

data = np.array(all_data_list)
df = pd.DataFrame(data, columns=index_list)
print(df)
The above is only a preliminary example and still needs to be optimized.
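One possible direction for that optimization, offered only as a sketch: let `pd.read_csv` parse the quoted header and the numeric fields instead of splitting each line by hand. The helper name `load_dstat` and the extra `time` column are hypothetical, and the sample text holds just the first two rows of the dstat output shown above. Note that `read_csv` renames repeated headers as `read.1`, `writ.1`, `util.1` rather than `read_1`.

```python
import io

import pandas as pd

# Header and first two data rows copied from the dstat output shown above.
SAMPLE = (
    '"epoch","1m","5m","15m","read","writ","read","writ","usr","sys","idl",'
    '"wai","hiq","siq","recv","send","used","buff","cach","free","util","util"\n'
    "1384246698.290,2.730,1.190,0.940,4316699.658,3074071.976,283.904,131.427,"
    "1.172,0.464,96.302,1.858,0.022,0.182,0.0,0.0,6959415296.0,1933697024.0,"
    "19159023616.0,22583123968.0,0.155,0.292\n"
    "1384246699.290,2.730,1.190,0.940,4579328.0,880640.0,560.0,27.0,14.518,"
    "0.876,82.228,1.877,0.0,0.501,363556.0,2888961.0,6973390848.0,1933697024.0,"
    "19159031808.0,22569140224.0,0.0,17.600\n"
)


def load_dstat(source):
    """Read dstat CSV text from a file-like object into a DataFrame.

    Hypothetical helper: it skips everything before the header line,
    which is located by the "buff" column name, as in the program above.
    """
    lines = source.read().splitlines()
    header_at = next(i for i, line in enumerate(lines) if "buff" in line)
    body = "\n".join(lines[header_at:])
    # read_csv strips the quotes, converts the fields to floats, and
    # renames repeated column names (the second "read" becomes "read.1").
    return pd.read_csv(io.StringIO(body))


df = load_dstat(io.StringIO(SAMPLE))
# The epoch column holds Unix timestamps in seconds.
df["time"] = pd.to_datetime(df["epoch"], unit="s")
print(df[["time", "1m", "usr", "idl"]])
```

To read the real file instead of the sample, the same helper could be called as `load_dstat(open("./ssd_dstat.txt"))`, assuming that file has the layout shown above.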