I need to calculate the distance between points of two different datasets, one with locations of towns, the other with locations of dams. The points are spread across different countries in Africa.
The more I read, the more confused I get. I'm using the Python ecosystem (Geopandas, Shapely, Fiona), but my question is basic enough that a general answer would be helpful (if you provide some code, that would be an added benefit!)
A) The first dataset is a .shp file with locations of towns as Points. This one is nice and provides the crs as epsg=4326 (which I understand is the code for WGS 84):

    import geopandas as gpd

    towns = gpd.read_file('towns.shp')  # uses Fiona to load
    print(towns.crs)
    print(towns.geometry.head(3))
    # {'init': 'epsg:4326'}
    # 0    POINT (8.877318824300001 9.93427297769)
    # 1    POINT (9.163896418389999 9.47532028526)

B) The second one has locations of dams. It came in an Excel file with latitude and longitude in decimal degrees. It didn't give any information on the crs.

    # Omitting the Excel loading of `dams`
    print(dams.geometry.head(2))
    print(dams.crs)  # empty
    # 487    POINT (8.97333333333 9.76472222222)
    # 488    POINT (4.55305555556 8.44277777778)

Here are my questions:
1) If the dataset comes in latitude/longitude in decimals without any extra info, can I assume that the crs is WGS 84? I guess I can't be 100% sure, but I just want to know if this is more or less standard and a reasonable guess.
More importantly:
2) What's the right way to measure the distance between these two points? (Let's assume for now they are both in WGS 84.)

I can calculate the distance between all towns and the first element of the dams:

    # Distance function uses Shapely; iloc[0] selects the first dam (index 487)
    print(towns.geometry.distance(dams.geometry.iloc[0]))
    # 0    0.194849
    # 1    0.346508
    # 2    1.046174

Is this just calculating the Euclidean distance in degrees, and hence very inaccurate?

What do y'all do for this workflow? Should I transform the crs of the points to something that works for all of Africa and then use the Geopandas/Shapely distance function? Or would it be easier to keep the lat/lon (WGS 84) and use a Haversine formula (or similar)? To me, that would undercut some of the benefit of using Geopandas.
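
To make the comparison concrete, here is a minimal sketch of the Haversine option, using only the standard library and the first town and first dam printed above. The function name `haversine_km` and the mean Earth radius of 6371.0088 km are my own choices, and treating the Earth as a sphere is only an approximation of the WGS 84 ellipsoid. It also assumes the dams really are in WGS 84 lon/lat.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # assumed mean Earth radius (spherical approximation)


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between two lon/lat points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# (lon, lat) pairs copied from the printed geometries: town 0 and dam 487
town = (8.877318824300001, 9.93427297769)
dam = (8.97333333333, 9.76472222222)

# Planar distance in degrees: the same kind of number as the 0.194849 above
planar_deg = math.hypot(town[0] - dam[0], town[1] - dam[1])

# Great-circle distance in kilometres
great_circle_km = haversine_km(*town, *dam)

print(round(planar_deg, 6))
print(round(great_circle_km, 1))
```

The first number is in degrees and has no fixed length on the ground, while the second is in kilometres, which is the unit I actually need.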
Thanks for your time!