Saturday, August 18, 2007

Python utility functions

Some utility functions

Here are a couple of utility functions. The first, splitname, shows a crude way of splitting a field like "Chapman,Benjamin J" into three fields, one for each component of the name. It handles the case of a missing middle name. The second, load_data, creates a list of dictionaries from a CSV file and takes the dictionary keys from the column headings. Note that you will need to supply the name of a CSV file to use in the function.



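def splitname(name='Chapman,Benjamin J'):
    """ Return a three element tuple composed of
    givenname,middle,sn. If there is no middle name, middle is set
    to an empty string.
    """
    # Raises ValueError if name does not contain exactly one comma.
    (sn, givenname) = name.split(',')
    try:
        (givenname, middle) = givenname.split(' ')
    except ValueError:
        # No single space to split on, so treat it as no middle name.
        middle = ''
    return (givenname, middle, sn)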

def load_data(filename='Name of file', mylist=None):
    """ Reads a csv file with an initial header row and returns a list
    of dicts. Keys are from the initial header row and
    values from the contents of each row. Keys are converted to lower case
    and spaces are replaced with underscores.
    """
    import csv
    # Create a fresh list per call unless the caller passes one in.
    if mylist is None:
        mylist = []
    with open(filename, newline='') as fp:
        csv_reader = csv.reader(fp)
        # The first row holds the column headings.
        fieldlist = next(csv_reader)
        fieldlist = [f.lower().strip().replace(' ', '_') for f in fieldlist]
        for row in csv_reader:
            mylist.append(dict(zip(fieldlist, row)))
    return mylist

if __name__ == "__main__":
    print("Test of splitname function")
    print(splitname())
    print(splitname('Fudd,Elmer'))
    print(splitname('Coyote,Wiley T.'))
    print("Following will raise ValueError (no comma in the name)")
    try:
        print(splitname('Chapman'))
    except ValueError as err:
        print(err)
    print("load_data test")
    # Replace CSVFILENAME_HERE with a CSV file that has a 'name' column
    # and at least 36 data rows.
    zz = load_data('CSVFILENAME_HERE')
    print(zz[1])
    print(zz[35])
    print(zz[-5:-1])
    for i in range(20, 30):
        print(splitname(zz[i]['name']))
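
For comparison, the standard library's csv.DictReader can do the header-to-key mapping on its own. The sketch below is one possible way to get the same kind of list of dicts on Python 3. It is not part of the functions above: normalize_key and load_rows are names made up for this example, and the sample data is invented so that the snippet runs without a real CSV file.

```python
import csv
import io

# Invented sample data, standing in for a real CSV file.
SAMPLE_CSV = '''Name,Home Phone,Office
"Chapman,Benjamin J",555-0100,A1
"Fudd,Elmer",555-0101,B2
'''


def normalize_key(heading):
    """Lower-case a column heading and replace spaces with underscores."""
    return heading.strip().lower().replace(' ', '_')


def load_rows(fp):
    """Return a list of dicts from an open CSV file object with a header row."""
    reader = csv.DictReader(fp)
    # Reading reader.fieldnames consumes the header row; assigning to it
    # replaces the keys used for every following row.
    reader.fieldnames = [normalize_key(h) for h in reader.fieldnames]
    return [dict(row) for row in reader]


rows = load_rows(io.StringIO(SAMPLE_CSV))
for row in rows:
    print(row['name'], '->', row['home_phone'])
```

With a real file, the same function could be called as load_rows(open('CSVFILENAME_HERE', newline='')), ideally inside a with block so the file gets closed.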
