I have a few files with location information:
1) User lat/long pings in Parquet format (can be converted to CSV), containing userid, timestamp, lat and lon
2) US County/State Shapefile
3) Store centroid data: centroids of stores such as Walmart
Our goal is to:
1) Add geo attributes such as county and state to the user pings, using the county shapefile
2) Identify users who visited a store
Challenges:
1) Scale: the user data has 30 million rows per day, covering 6 months
Which geospatial libraries would you recommend for building an efficient workflow? I have briefly explored Fiona, Shapely and GeoPandas.
I need to import the CSVs and shapefiles, define projections, create buffers, intersect them with the pings to get attributes, and write out the results. All of this needs to work at scale.
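To make the two operations concrete, here is a minimal, dependency-free sketch of what I mean, not a scalable solution. It swaps in plain Python for the libraries above: a ray-casting point-in-polygon test stands in for the county intersect, and a haversine distance threshold stands in for buffering the store centroids. The county ring, store location, pings, the 100 m radius and all function names are made up for illustration.

```python
import math

def point_in_polygon(lon, lat, ring):
    """Ray-casting test; ring is a list of (lon, lat) vertices."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres on a spherical Earth."""
    r = 6371000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

# Hypothetical toy inputs standing in for the shapefile, store file and pings.
counties = {
    "Example County, XX": [(-100.0, 40.0), (-99.0, 40.0), (-99.0, 41.0), (-100.0, 41.0)],
}
stores = {"store_1": (40.5, -99.5)}  # store id -> (lat, lon) centroid
pings = [
    ("u1", "2020-01-01T12:00:00", 40.5003, -99.5002),
    ("u2", "2020-01-01T12:05:00", 42.0, -99.5),
]

def tag_ping(ping, radius_m=100.0):
    """Attach the containing county and any stores within radius_m (assumed value)."""
    userid, ts, lat, lon = ping
    county = next(
        (name for name, ring in counties.items() if point_in_polygon(lon, lat, ring)),
        None,
    )
    visited = [
        sid for sid, (slat, slon) in stores.items()
        if haversine_m(lat, lon, slat, slon) <= radius_m
    ]
    return {"userid": userid, "timestamp": ts, "county": county, "visited": visited}

results = [tag_ping(p) for p in pings]
```

This loops over every county and store for every ping, which is exactly the part that will not hold up at 30 million rows per day, so I assume a spatial index or a library-level spatial join is needed in place of these loops.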
Please let me know if my question is too vague