1. Image data for building detection
We trained two models for building detection: one for street view (SV) and one for bird's eye view (BV). All related data are under the folder "RCNN".
RCNN/SV/Annotations: building annotations
RCNN/SV/ImageSets: train-test split
RCNN/SV/JPEGImages: all street view building images
RCNN/SV/results: building detection results on the test set
The same directory structure applies to bird's eye view under RCNN/BV.
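The layout above follows the PASCAL VOC convention (Annotations / ImageSets / JPEGImages). Below is a minimal sketch for iterating over a split; the split-file name (test.txt), the .jpg extension, and the VOC-style .xml annotation format are assumptions, not documented facts:

import os

ROOT = "RCNN/SV"

def load_split(split="test"):
    # Return the image IDs listed in an ImageSets split file
    # (file name assumed to be <split>.txt).
    path = os.path.join(ROOT, "ImageSets", split + ".txt")
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]

for image_id in load_split("test"):
    image_path = os.path.join(ROOT, "JPEGImages", image_id + ".jpg")   # assumed extension
    annot_path = os.path.join(ROOT, "Annotations", image_id + ".xml")  # assumed VOC XML
    print(image_id, os.path.exists(image_path), os.path.exists(annot_path))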
2. Image data for building matching
a) GPS
KML files (can be viewed in Google Earth).
GPS*.mat files contain the GPS locations.
The number of GPS locations per city is as follows:
Pittsburgh - 2000 GPS locations (1586 unique)
Orlando - 1324 GPS locations
Manhattan - 5941 GPS locations (train: 2330, test: 3611; we do not use the Manhattan training data in our experiments)
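A quick way to inspect the contents of a GPS*.mat file from Python; the variable names stored inside the .mat files are not documented here, so this sketch just lists whatever keys are present (requires scipy):

from scipy.io import loadmat

mat = loadmat("GPS1.mat")  # hypothetical file name matching GPS*.mat
for key, value in mat.items():
    if not key.startswith("__"):  # skip MATLAB header entries
        print(key, getattr(value, "shape", None))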
b) All image data
Folders data1 - data22 contain all the original images. Each image is a side-by-side composite of a street view image and a bird's eye view image (see the splitting sketch after the image counts below). data*.xgtf are the annotation files corresponding to each data folder, and data*.txt are the building annotations.
The image counts per folder are:
Pittsburgh:
data1 - 2000 images
data2 - 1802 images
data3 - 2000 images
data4 - 1816 images
data5 - 2000 images
data6 - 1864 images
data7 - 1000 images
data8 - 1000 images
data9 - 1584 images
Orlando:
data10 - 1503 images
data11 - 1440 images
data12 - 1416 images
data13 - 2005 images
data14 - 2160 images
data15 - 1103 images
data16 - 1104 images
Manhattan_train (we do not use the Manhattan training data in our experiments):
data17 - 2000 images
data18 - 2000 images
data19 - 2000 images
data20 - 2000 images
data21 - 1320 images
Manhattan_test:
data22 - 14444 images
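Since each stored image is a side-by-side composite, the two views must be separated before use. A minimal sketch, assuming equal-width halves with street view on the left and bird's eye view on the right (the actual layout is an assumption; requires Pillow):

from PIL import Image

def split_composite(path):
    # Split a side-by-side composite into its two views.
    img = Image.open(path)
    w, h = img.size
    sv = img.crop((0, 0, w // 2, h))   # assumed: street view on the left
    bv = img.crop((w // 2, 0, w, h))   # assumed: bird's eye view on the right
    return sv, bv

sv, bv = split_composite("data1/000001.jpg")  # hypothetical file name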
In data*.txt (e.g., data1.txt), the columns are as follows. Column 1 is the starting frame and column 2 is the ending frame; the tool we used for annotation was designed for videos, hence the starting and ending frames, but in our case column 1 should always equal column 2 (i.e., the same image). Column 3 is the index of the building in the current frame/image. The remaining four columns are the bounding box coordinates (x1, y1, x2, y2) of the building.
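A hedged parsing sketch for the column layout described above; whitespace-separated columns are assumed:

def load_annotations(path):
    # Map each frame/image index to its list of building boxes.
    boxes = {}  # frame -> list of (building_index, x1, y1, x2, y2)
    with open(path) as f:
        for line in f:
            cols = line.split()
            if len(cols) < 7:
                continue  # skip malformed or empty lines
            start, end, idx = int(cols[0]), int(cols[1]), int(cols[2])
            assert start == end, "column 1 should always equal column 2"
            x1, y1, x2, y2 = map(float, cols[3:7])
            boxes.setdefault(start, []).append((idx, x1, y1, x2, y2))
    return boxes

annotations = load_annotations("data1.txt")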
c) For the building matching experiments, we trained and tested the model on images from Pittsburgh and Orlando. The data used are under the folders image_matching/image_SV and image_matching/image_BV. Some related statistics (a sketch of the threshold-based split follows the list):
Pittsburgh:
Positive pairs: data1-data9
Number of positive pairs: [1594, 2006, 2099, 1388, 1330, 865, 590, 452, 1021] (total: 11345)
Train-test split boundary: 7th St & Liberty Ave
lat_thr = 40.442420;
log_thr = -79.999999;
Train: 8955 positive pairs
Test: 2390 positive pairs
Orlando:
Positive pairs: data10-data16
Number of positive pairs: [941, 928, 757, 1847, 1805, 608, 928] (total: 7814)
Train-test split boundary: E Concord St & N Orange Ave
lat_thr = 28.551170;
Train: 6719 positive pairs
Test: 1095 positive pairs
Pittsburgh + Orlando:
Train: 15674 positive pairs, 313480 negative pairs
Test: 3485 positive pairs, 69700 negative pairs (20 negative pairs per positive pair)
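How the lat_thr / log_thr values above are applied is not spelled out here; the following is only a plausible sketch of a threshold-based split over GPS locations, using the Pittsburgh values, where the direction of the comparison is an assumption:

lat_thr = 40.442420
log_thr = -79.999999  # presumably a longitude threshold, despite the name

def assign_split(lat, lon):
    # Assign one GPS location to 'train' or 'test' by the thresholds.
    # Which side of the boundary maps to which split is an assumption.
    if lat < lat_thr and lon < log_thr:
        return "train"
    return "test"

print(assign_split(40.44, -80.01))  # hypothetical location -> 'train'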
For the testing experiments on the Manhattan data, the images should be placed in the folders BV_Manhattan and SV_Manhattan.
3. Source code
The source code for building matching and the dominant set method is located in the folder "Code". The parameter settings are specified in our paper.