1. Image data for building detection

We trained two models for building detection: one for street view and one for bird’s eye view. All related data are under the folder “RCNN”.

RCNN/SV/Annotations: building annotations

RCNN/SV/ImageSets: train-test split

RCNN/SV/JPEGImages: all image data for street view buildings

RCNN/SV/results: building detection results on test set

The same structure applies to the bird’s eye view data under RCNN/BV.
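
The Annotations/ImageSets/JPEGImages layout follows the PASCAL VOC convention, so the split files can be read as plain lists of image IDs. Below is a minimal Python sketch; the file names train.txt and test.txt inside ImageSets and the .jpg extension are assumptions, so check the actual folder contents first.

    import os

    def read_split(split_file):
        # One image ID per line, VOC style.
        with open(split_file) as f:
            return [line.strip() for line in f if line.strip()]

    root = "RCNN/SV"  # use "RCNN/BV" for the bird's eye view model
    # ASSUMPTION: the split files are named train.txt / test.txt.
    train_ids = read_split(os.path.join(root, "ImageSets", "train.txt"))
    test_ids = read_split(os.path.join(root, "ImageSets", "test.txt"))
    # ASSUMPTION: images are stored as <id>.jpg in JPEGImages.
    first_image = os.path.join(root, "JPEGImages", train_ids[0] + ".jpg")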

2. Image data for building matching

a) GPS

kml files: these can be viewed in Google Earth.

GPS*.mat files contain GPS locations.
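
A minimal sketch for reading these files in Python (scipy can load MATLAB .mat files); the file name below is hypothetical, and the variable names inside the file are not documented here, so inspect the keys first.

    import scipy.io

    # HYPOTHETICAL file name; substitute one of the actual GPS*.mat files.
    mat = scipy.io.loadmat("GPS_pittsburgh.mat")
    # List the variables stored in the file (MATLAB metadata keys excluded).
    print([k for k in mat if not k.startswith("__")])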

The number of GPS locations in the dataset for each city is as follows:

Pittsburgh - 2000 GPS locations (1586 unique)

Orlando - 1324 GPS locations

Manhattan - 5941 GPS locations (train: 2330, test: 3611; we do not use the Manhattan training data in our experiments)

b) All image data

Folders data1 - data22 contain all the original images. Each image places the street view and the bird's eye view side by side. The data*.xgtf files are annotation files corresponding to each data folder, and the data*.txt files are the building annotations.
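
Since each image stores the two views side by side, they can be separated by splitting the image down the middle. This is a sketch under the assumptions that the two views occupy equal halves and that the street view is on the left; verify both against the actual images.

    from PIL import Image

    # HYPOTHETICAL file name; substitute an actual image from a data* folder.
    img = Image.open("data1/000001.jpg")
    w, h = img.size
    # ASSUMPTION: street view is the left half, bird's eye view the right.
    street_view = img.crop((0, 0, w // 2, h))
    birds_eye_view = img.crop((w // 2, 0, w, h))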

The image counts for each folder are listed below:

Pittsburgh:

data1 - 2000 images

data2 - 1802 images

data3 - 2000 images

data4 - 1816 images

data5 - 2000 images

data6 - 1864 images

data7 - 1000 images

data8 - 1000 images

data9 - 1584 images

Orlando:

data10 - 1503 images

data11 - 1440 images

data12 - 1416 images

data13 - 2005 images

data14 - 2160 images

data15 - 1103 images

data16 - 1104 images

Manhattan_train (we do not use the Manhattan training data in our experiments):

data17 - 2000 images

data18 - 2000 images

data19 - 2000 images

data20 - 2000 images

data21 - 1320 images

Manhattan_test:

data22 - 14444 images

In the data*.txt files (for example, data1.txt), each column represents the following.

Column 1 is the starting frame and column 2 is the ending frame. The tool we used for annotation was designed for videos, which is why there are starting and ending frames; in our case, column 1 should always equal column 2 (i.e., the same image). Column 3 is the index of the building in the current frame/image. The remaining four columns are the bounding-box coordinates (x1, y1, x2, y2) of the building.
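
A minimal Python sketch for parsing one of these files, directly following the column layout above; the assumption that columns are whitespace-separated should be checked against the actual files.

    def parse_annotations(path):
        # Each row: start_frame end_frame building_index x1 y1 x2 y2.
        # ASSUMPTION: columns are whitespace-separated.
        boxes = []
        with open(path) as f:
            for line in f:
                cols = line.split()
                if len(cols) != 7:
                    continue  # skip empty or malformed lines
                start, end, idx = int(cols[0]), int(cols[1]), int(cols[2])
                assert start == end  # one image, not a video segment
                x1, y1, x2, y2 = map(float, cols[3:7])
                boxes.append((start, idx, (x1, y1, x2, y2)))
        return boxes

    annotations = parse_annotations("data1.txt")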

c) For the experiments on building matching, we trained and tested the model on images from Pittsburgh and Orlando. The data used are under the folders image_matching/image_SV and image_matching/image_BV. Some related statistics:

Pittsburgh:

Positive pairs: data1-data9

Number of positive pairs per folder: [1594, 2006, 2099, 1388, 1330, 865, 590, 452, 1021] (total: 11345)

Train-test split boundary: 7th St & Liberty Ave

lat_thr = 40.442420;

log_thr = -79.999999;

Train: 8955 positive pairs

Test: 2390 positive pairs

Orlando:

Positive pairs: data10-data16

Number of positive pairs per folder: [941, 928, 757, 1847, 1805, 608, 928] (total: 7814)

Train-test split boundary: E Concord St & N Orange Ave

lat_thr = 28.551170;

Train: 6719 positive pairs

Test: 1095 positive pairs

Pittsburgh + Orlando:

Train: 15674 positive pairs, 313480 negative pairs

Test: 3485 positive pairs, 69700 negative pairs

For the testing experiments on the Manhattan data, the images should be placed in the folders BV_Manhattan and SV_Manhattan.
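
For reference, the train-test split above is defined by GPS thresholds at the named intersections. The sketch below illustrates the rule; which side of each threshold maps to train versus test (and how the two Pittsburgh thresholds are combined) is an assumption and should be verified against the released split.

    # ASSUMPTION: the comparison directions below (and the AND for
    # Pittsburgh) are illustrative only; verify against the actual split.
    def split_pittsburgh(lat, lon, lat_thr=40.442420, log_thr=-79.999999):
        # Threshold at 7th St & Liberty Ave.
        return "test" if (lat > lat_thr and lon > log_thr) else "train"

    def split_orlando(lat, lat_thr=28.551170):
        # Orlando uses only a latitude threshold (E Concord St & N Orange Ave).
        return "test" if lat > lat_thr else "train"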

3. Source code

The source code for building matching and for the dominant set method is located in the folder “Code”. The parameters are specified in our paper.