CSV to SHP with Python

Python is a well-established scripting language in the GIS/geodata world. When a Facebook friend asked how to read CSV files with Python, I started thinking about how to convert a CSV to a shapefile (SHP) with Python. Most GPS solutions and many internet tools offer a CSV export, and the format is common in any stats or spreadsheet program, so this can be a handy solution for everyday work. See my solution here…

Reading A CSV With Python

As a first task, we need to read a CSV. This is accomplished with the csv module in Python. We open the file and read it line by line:

    import csv

    # Python 3: open the file in text mode with newline='', as the csv module expects
    with open('/home/ricckli/Desktop/example.tsv', newline='') as csvfile:
        reader = csv.reader(csvfile, delimiter='\t')  # my example uses the tab as delimiter
        for line in reader:
            print('; '.join(line))
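To try the reader without a file on disk, the same call can be pointed at an in-memory string. This is only a minimal sketch: the sample data, the SAMPLE_TSV name and the read_rows helper are made up for illustration and are not part of the original script. It also shows that every cell arrives as a string, which matters later when coordinates are needed as numbers.

```python
import csv
import io

# Hypothetical sample data standing in for a tab-separated file on disk.
SAMPLE_TSV = "NAME\tLAT\tLON\nBerlin\t52.52\t13.405\nHanoi\t21.0285\t105.8542\n"


def read_rows(text, delimiter="\t"):
    """Return every row of the delimited text as a list of strings."""
    # io.StringIO behaves like a file opened in text mode.
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))


rows = read_rows(SAMPLE_TSV)
header, records = rows[0], rows[1:]  # the first row holds the column names
print(header)
for record in records:
    print("; ".join(record))
```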



Reading line by line, we can do whatever we like with the content. A convenient next step is to turn each row into a dictionary with the column names as keys and the “cells” as values. For that we do not use the reader function of the csv module; instead we use DictReader:

    import csv

    with open('/home/ricckli/Desktop/example.tsv', newline='') as csvfile:
        reader = csv.DictReader(csvfile, delimiter='\t')  # my example uses the tab as delimiter
        for row in reader:
            print(row['LAT'], row['LON'])  # these are my geometry columns

The Tricky Part: Designing The Shapefile

The ogr module enables us to build a shapefile from scratch. Yet it is not easy:

    from osgeo import ogr, osr  # ogr builds the shapefile, osr handles the spatial reference

    spatialReference = osr.SpatialReference()  # create a spatial reference object
    spatialReference.ImportFromEPSG(4326)  # EPSG:4326 is geographic WGS84 (latitude/longitude)
    driver = ogr.GetDriverByName('ESRI Shapefile')  # select the driver for our shp-file creation
    shapeData = driver.CreateDataSource('/home/ricckli/Desktop/example_points.shp')  # this is where we store our data
    layer = shapeData.CreateLayer('Example', spatialReference, ogr.wkbPoint)  # a point layer with the given spatial reference
    layer_defn = layer.GetLayerDefn()

As you might have seen, we have already defined the reference system for our coordinates. If your file has coordinates in another system, use the CRS of your source. Furthermore, our shapefile does not have any fields yet. But how do we get the field names in a generic way? For this we analyse the DictReader object:

    import csv

    with open('/home/ricckli/Desktop/example.tsv', newline='') as csvfile:
        readerDict = csv.DictReader(csvfile, delimiter='\t')
        for field in readerDict.fieldnames:
            new_field = ogr.FieldDefn(field, ogr.OFTString)  # create a new field for each header element
            layer.CreateField(new_field)

Yet we do have a problem here: we assume that all the information in the CSV is text. In fact we have some numbers as well. If you would like to take this into account, you need to build each field yourself (is there another, generic way?).
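One possible generic approach, offered as a sketch and not as the method used in this post, is to look at the values of each column and guess a type before creating the fields. The guess_field_types helper, its 'Integer'/'Real'/'String' labels and the sample data below are all made up for illustration. The code uses only the standard library, so it runs without GDAL.

```python
import csv
import io


def guess_field_types(rows, fieldnames):
    """Guess 'Integer', 'Real' or 'String' for each column from its values."""

    def fits(value, caster):
        # True if the text can be converted with int() or float().
        try:
            caster(value)
            return True
        except ValueError:
            return False

    types = {}
    for name in fieldnames:
        # Ignore empty cells; a column with no values at all stays a string.
        values = [row[name] for row in rows if row[name] not in (None, "")]
        if values and all(fits(v, int) for v in values):
            types[name] = "Integer"
        elif values and all(fits(v, float) for v in values):
            types[name] = "Real"
        else:
            types[name] = "String"
    return types


# Hypothetical sample data with one text, two decimal and one integer column.
SAMPLE = (
    "NAME\tLAT\tLON\tPOP\n"
    "Berlin\t52.52\t13.405\t3645000\n"
    "Hanoi\t21.0285\t105.8542\t8054000\n"
)
reader = csv.DictReader(io.StringIO(SAMPLE), delimiter="\t")
rows = list(reader)
field_types = guess_field_types(rows, reader.fieldnames)
print(field_types)
```

The labels could then be mapped to ogr.OFTInteger, ogr.OFTReal and ogr.OFTString when calling ogr.FieldDefn, and the cell values converted with int() or float() before they are written. Note that the whole file has to be read once before the fields are created, and that Python's float() also accepts text such as "nan" or "inf", so the guess is only a heuristic.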

Bringing It All Together

Now back to the lines (points) of the delimited file. For each line in the CSV we need to add a feature with the coordinates given in the LAT and LON columns and copy the attributes into the fields. Furthermore, let’s call this script with four input parameters (input CSV file, EPSG code, delimiter and output shapefile):

    import csv
    from sys import argv

    from osgeo import ogr, osr  # ogr builds the shapefile, osr handles the spatial reference

    script, input_file, EPSG_code, delimiter, export_shp = argv

    spatialReference = osr.SpatialReference()  # create a spatial reference object
    spatialReference.ImportFromEPSG(int(EPSG_code))  # the reference is defined by the EPSG code
    driver = ogr.GetDriverByName('ESRI Shapefile')  # select the driver for our shp-file creation
    shapeData = driver.CreateDataSource(export_shp)  # this is where we store our data
    layer = shapeData.CreateLayer('layer', spatialReference, ogr.wkbPoint)  # a point layer with the given spatial reference
    layer_defn = layer.GetLayerDefn()  # the layer definition, i.e. the field schema
    index = 0

    with open(input_file, newline='') as csvfile:
        readerDict = csv.DictReader(csvfile, delimiter=delimiter)
        for field in readerDict.fieldnames:
            # one text field per header element; shapefile field names are limited to 10 characters
            new_field = ogr.FieldDefn(field, ogr.OFTString)
            layer.CreateField(new_field)
        for row in readerDict:
            print(row['LAT'], row['LON'])
            point = ogr.Geometry(ogr.wkbPoint)
            point.AddPoint(float(row['LON']), float(row['LAT']))  # LAT and LON are strings, so we convert them
            feature = ogr.Feature(layer_defn)
            feature.SetGeometry(point)  # set the coordinates
            feature.SetFID(index)
            for field in readerDict.fieldnames:
                i = feature.GetFieldIndex(field)
                feature.SetField(i, row[field])
            layer.CreateFeature(feature)
            index += 1

    shapeData.Destroy()  # close the shapefile so everything is written to disk

Figure: Attributes in QGIS
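The positional argv unpacking in the script fails with a bare ValueError when an argument is missing. As a hedged alternative, the same four parameters could be read with argparse, which also makes it easy to pass the coordinate column names instead of hard-coding LAT and LON. The --lat-field and --lon-field options and both helper functions below are hypothetical additions, not part of the original script.

```python
import argparse


def parse_args(argv=None):
    """Parse the four positional parameters plus optional column names."""
    parser = argparse.ArgumentParser(
        description="Convert a delimited text file to a point shapefile."
    )
    parser.add_argument("input_file", help="delimited text file to read")
    parser.add_argument("epsg_code", type=int, help="EPSG code of the coordinates")
    parser.add_argument("delimiter", help="column delimiter, e.g. a tab or ';'")
    parser.add_argument("export_shp", help="path of the shapefile to create")
    # Hypothetical options: the original script always uses LAT and LON.
    parser.add_argument("--lat-field", default="LAT")
    parser.add_argument("--lon-field", default="LON")
    return parser.parse_args(argv)


def row_to_lonlat(row, lon_field, lat_field):
    """Return (lon, lat) as floats, or None if the row has no usable coordinates."""
    try:
        return float(row[lon_field]), float(row[lat_field])
    except (KeyError, TypeError, ValueError):
        return None


# Example call with an explicit argument list in place of sys.argv.
args = parse_args(
    ["example.tsv", "4326", "\t", "out.shp", "--lat-field", "Y", "--lon-field", "X"]
)
print(args.epsg_code, args.lat_field, args.lon_field)
print(row_to_lonlat({"X": "13.405", "Y": "52.52"}, args.lon_field, args.lat_field))
```

Inside the row loop, a None result from row_to_lonlat could be used to skip a line with empty or malformed coordinates so that the conversion does not abort.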


In the end you can call the whole script like this from your terminal/cmd console. You can enhance it further and make the LAT/LON names generic. The first line works for tab-separated files, the second for “;”-separated files (the $'\t' syntax is specific to Bash-like shells):

    python /home/ricckli/Desktop/csv_to_shp.py /home/ricckli/Desktop/example2.tsv 4326 $'\t' /home/ricckli/Desktop/test2.shp
    python /home/ricckli/Desktop/csv_to_shp.py /home/ricckli/Desktop/example.tsv 4326 ";" /home/ricckli/Desktop/test.shp

You can download the Python script here and also the example files. I’ll appreciate any comment!

The post CSV to SHP with Python appeared first on Digital Geography.
 