GeoBuiz 2016: High volume geospatial imagery processing requires new architecture

المشرف العام

Administrator
طاقم الإدارة
DigitalGlobe owns and operates a constellation of commercial earth observation satellites.* The latest is Worldview-3 which is capable of about 30 cm precision.* Digital Globe's satellites collect 3,000,000 square kilometers of Earth imagery every day, that's about 70 terabytes of data per day.* Digital Globe also provides online access to 15 years of earth imagery, about 90 petabytes.* Of all high-resolution commercial imagery collected since 2010, Digital Globe has collected approximately 80% of it.* Clearly as a user you can't download all of this data, so how can users do things like search through this huge data set and generate a product such as a vegetation map for Tanzania ?

At the GeoBuiz 2016 Summit in Bethesda this week, Walter Scott, CTO and Founder, outlined how* Digital Globe is making this vast amount of high resolution data accessible to and usable by customers.* To make it possible for customers to search and process this data efficiently, Digital Globe has put all of the data on the public cloud (Amazon) and has provided a platform that supports efficient search, processing and workflows. GBDX is an online environment for efficiently running advanced algorithms for information extraction from imagery datasets at scale.* It is a set of RESTful APIs, and algorithms that have already been implemented by Digital Globe or its partners include car counting, orthorectification, land use, land cover, and atmospheric compensation.*

One of the most important efficiencies of the GBDX architecture is that it avoids having to move huge volumes of data over the internet.* It does this by bringing the application to the data, rather than the data to the application.* For example, to search through the data, GBDX provides a catalog API that uses 39 different image attributes such as geographic location, cloud cover percentage, sensor type, sun angle and others.* The search runs locally on Amazon, avoiding pushing data over the internet.

The GBDX API includes support for workflows. * A workflow is a sequence of tasks executed against sets of data. A typical workflow is initiated with a search for the data you want to process.* The selected data is assigned to a working set, and then processed* through a sequence of steps to generate your desired product. * For example, a typical product would be an orthorectified vegetation map.* Again the workflow runs locally on Amazon, avoiding pushing data over the internet during processing.* Only the final product resulting from the workflow, such as a vegetation map, would be transferred over the internet to the customer.

GBDX uses the cloud infrastructure of Amazon Web Service (AWS).* Data is stored in S3 ‘buckets’ that are accessed through GBDX RESTful web service APIs. GBDX includes built in tools supporting remote sensing algorithms from DigitalGlobe and ENVI.* GBDX is python-friendly and allows you*create your own tools.

One of the application areas that the platform makes possible is deep learning.* The basic concept is that software can simulate the brain's large array of neurons as an artificial “neural network".* Large volumes of input data and some way of quantitatively assessing the results are required to train the system.* With the greater depth enabled by new computers and algorithms, remarkable advances in speech and image recognition are being achieved.* For example, a Google deep-learning system that had been shown 10 million images from YouTube videos proved almost twice as good as any previous image recognition effort at identifying objects such as cats.* Microsoft has demonstrated speech software that transcribes spoken words into English text with an error rate of only 7 percent.

But deep learning requires a lot of compute power and a lot of training.* GBDX provides the computational power to apply deep learning to earth observation and Digital Globe's data library can provide the training data sets.* What is required is a scalable way of assessing and quantifying the results. Walter Scott suggested crowd sourcing as a way of achieving the scale required for earth observation training.




أكثر...
 
أعلى