What if Google Maps was Art?

Ranging from the highly realistic to the immensely abstract, the great artists (da Vinci, Monet, Picasso, van Gogh) had an almost magical ability to represent the world around them. But how might these artists have imagined the world from the point of view of a satellite, looking down from the heavens above? With curiosity as one of our core values at Consilium Technology, a team of our data scientists got together for a one-day hack to find out.

Obtaining Satellite Imagery

As part of our work on GAIA, we partnered with DigitalGlobe to utilise their database of satellite imagery, as well as their cloud-based image processing platform, known as GBDX (the Geospatial Big Data platform). This provided us with plenty of data to supplement the ‘community’ satellite imagery, such as the Landsat 8 or Sentinel-2 datasets (also available via Earth Explorer). We searched for and processed the images in Jupyter Notebooks running on GBDX, and implemented machine learning training directly on GBDX too. We also exported the data to our own computers to view in QGIS and use our own GPUs.

We will now step you through the process of creating your own satellite imagery artwork. To get started, you’ll need to obtain some satellite images. If you’re interested, we also recommend the GBDX tutorials and the gbdxtools package documentation.

Choose an area of interest

The area of interest (AOI) can be a bounding box or other polygon composed of latitude and longitude coordinates. The easiest approach is to use the interactive map on the Imagery tab, available in all GBDX notebooks.

bounding_box = [138.434, -34.951, 138.762, -34.892] # Adelaide CBD and surrounds


© Mapbox, © DigitalGlobe, © Stamen Design & OpenStreetMap.

Find images in that area

Search the catalogue for satellite photos that cover your AOI and do not have too much cloud cover. Clouds are an issue in many applications, and there are methods to identify and mask them; however, it is preferable to find a cloud-free source. GBDX’s graphical interface makes this easy, but it can also be done in code. We will only search for freely available Landsat 8 imagery.

from shapely.geometry import box
from gbdxtools import Interface
gbdx = Interface()
def search_cloudcover(bbox, max_results=100, cloudcover=4):
    aoi = box(*bbox).wkt
    query = 'item_type:Landsat8 AND attributes.cloudCover_int:<{}'.format(cloudcover)
    return gbdx.vectors.query(aoi, query, count=max_results)

results = search_cloudcover(bounding_box)
catalogue_id = results[8]['properties']['id'] # 'LC80970842018124LGN00'

Pull data into your notebook

The next step involves importing the segment of the catalogue image into the notebook using the AOI and catalogue ID found above. GBDX has an easy way to pan-sharpen the image, which will be explained below.

from gbdxtools import CatalogImage
image = CatalogImage(catalogue_id, pansharpen=True, bbox=bounding_box)
image.plot(w=10, h=10)


Adelaide from above, © DigitalGlobe.

Save to GeoTIFF

There is also the option to download the image in GeoTIFF format.

from gbdxtools.s3 import S3
filename = 'Adelaide.tiff'
image.geotiff(path=filename)
bucket = S3()  # connects to your associated GBDX S3 bucket
bucket.upload(filename, 'exported/{}'.format(filename))
# Then download it from your associated S3 bucket.

Pan-sharpen for more resolution

Unlike a normal photograph, satellite images often include many spectral bands beyond the typical red, green, and blue (RGB) bands, including ‘deep blue’ and various infrared bands. Of particular interest is the panchromatic band, which collects light across many of the other bands. As such, the panchromatic band often has a higher resolution, since its sensor pixels do not have to be as large to collect enough light. This can be exploited to increase the resolution of other bands through a process called pan-sharpening, which essentially combines the brightness from the panchromatic band with the colour from the three RGB bands to produce the pan-sharpened image. To achieve this, simply set the pansharpen parameter in CatalogImage to True.

image_multispectral = CatalogImage('104001003A118500', band_type="MS", bbox=[138.605, -34.926, 138.607, -34.923])
image_panchromatic = CatalogImage('104001003A118500', band_type="pan", bbox=[138.605, -34.926, 138.607, -34.923])
image_pansharp = CatalogImage('104001003A118500', pansharpen=True, bbox=[138.605, -34.926, 138.607, -34.923])


World View 3 Imagery showing a circus tent in downtown Adelaide © DigitalGlobe.
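The pan-sharpening idea itself is simple enough to sketch in plain numpy. The following is an illustrative Brovey-style transform (a common pan-sharpening scheme), not DigitalGlobe's actual algorithm; the `brovey_pansharpen` helper and its toy inputs are our own:

```python
import numpy as np

def brovey_pansharpen(ms, pan):
    """Brovey-transform pan-sharpen (illustrative sketch only).
    ms  -- multispectral image, shape (H, W, 3), floats in [0, 1]
    pan -- panchromatic band already resampled to (H, W), floats in [0, 1]
    Each band is rescaled so the per-pixel intensity matches the pan band."""
    intensity = ms.sum(axis=2, keepdims=True) + 1e-7
    return ms * pan[..., np.newaxis] / intensity

# Toy example: a dim grey 2x2 image brightened by a bright pan band
ms = np.full((2, 2, 3), 0.2)
pan = np.full((2, 2), 0.9)
sharp = brovey_pansharpen(ms, pan)
```

After the transform, the per-pixel sum of the three bands matches the pan band, so the pan band's fine spatial detail carries through while the band ratios (the colour) are preserved.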

Load GeoTIFF in Python

If you downloaded the GeoTIFF, you will now need to load the file into Python for processing. Generally, these images contain many colour channels, or bands, of 16-bit precision integers. We used the popular gdal library to do this.

from osgeo import gdal
import numpy as np


def geotiff2numpy(filename):
    '''Extract raster data from GeoTIFF file.
    filename -- geotiff file path and name
    Returns: numpy ndarray with each geotiff band stacked along the 3rd dim.
    '''
    gdal.UseExceptions()  # make gdal.Open raise instead of returning None
    try:
        tiff = gdal.Open(filename)
    except RuntimeError:
        print('Unable to open {}'.format(filename))
        return None
    band_arrays = []
    for idx in range(tiff.RasterCount):
        band = tiff.GetRasterBand(idx + 1)
        band_array = band.ReadAsArray()
        band_arrays.append(band_array)
    return np.stack(band_arrays, axis=2)

if __name__ == "__main__":
    tiff = geotiff2numpy('Adelaide.tiff')                               # Load geotiff
    rgb = tiff[:,:,[3,2,1]]                                             # RGB in 4th, 3rd, 2nd bands for Landsat8
    rgbmin, rgbmax = np.percentile(rgb, [1,99])                         # Ignore outliers
    scaledrgb = (rgb.clip(rgbmin, rgbmax)-rgbmin) / (rgbmax-rgbmin)     # Rescale to brighten the image

We are now ready to create some artistic neural networks!

The Neural Style Transfer Method

Neural style transfer (NST) was invented by Gatys and colleagues in 2015, who used Convolutional Neural Networks to create models of both the content and style of arbitrary images. Artistic images can then be produced by recombining new content with a different style. This work quickly gained widespread attention and led to a series of publications; Neural Style Transfer: A Review provides a comprehensive review of them.
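The heart of the method can be sketched in plain numpy. In the actual method the feature maps come from a pre-trained CNN such as VGG; here any arrays stand in for them, and the function names and weights are ours:

```python
import numpy as np

def gram_matrix(feats):
    # feats: (H, W, C) feature map from one CNN layer
    h, w, c = feats.shape
    f = feats.reshape(h * w, c)
    return f.T @ f / (h * w * c)

def style_loss(gen_feats, style_feats):
    # Style is compared via correlations between channels (Gram matrices)
    return np.sum((gram_matrix(gen_feats) - gram_matrix(style_feats)) ** 2)

def content_loss(gen_feats, content_feats):
    # Content is compared feature-by-feature at the same layer
    return 0.5 * np.sum((gen_feats - content_feats) ** 2)

def total_loss(gen, content, style, alpha=1.0, beta=1e3):
    # The generated image is optimised to trade off content against style
    return alpha * content_loss(gen, content) + beta * style_loss(gen, style)
```

Because the Gram matrix discards spatial layout and keeps only channel correlations, the style term matches textures and brush-stroke statistics rather than the scene itself.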

An example style transfer (Neural Style Transfer: A Review).

Artistic Maps

To create our artistic maps, we used a satellite image as the input content, while the style model was created from another image (e.g., a painting). We applied a pre-trained stroke-controllable fast style transfer neural network to some satellite images taken during ‘Mad March’, when Adelaide hosts a variety of festivals and events.

Below are the five images from which the style models have been learned.


Pre-trained style source images.

The image below shows a portion of the Adelaide 500 Supercars street circuit, with the five different style models applied.


Adelaide 500, Satellite image (top left) © DigitalGlobe.

This second image is of the Adelaide Fringe Festival, a month-long performing arts extravaganza. Now we are applying art to art!


Adelaide Fringe Festival satellite image (top left) © DigitalGlobe.

Stroke size

Stroke Controllable Fast Style Transfer (2018) can generate artistic images with various stroke sizes. The TensorFlow-based neural network consists of three parts: (1) the StrokePyramid, (2) the VGG pre-encoder, and (3) the stroke decoder. The content and style of arbitrary images are fed into the network and encoded by VGG, which extracts a number of feature maps that the loss function uses to measure the semantic loss and stroke loss. To control the stroke size, the StrokePyramid creates receptive fields with adaptive size. Finally, the stroke decoder generates the stylised output with the desired stroke size.


Adelaide 500 supercars satellite image (left) © DigitalGlobe.

The CycleGAN Method

Another approach to style transfer is through the use of Generative Adversarial Networks, commonly known as GANs.

GANs are a form of generative unsupervised machine learning in which a system learns to map from a latent space to a particular distribution of interest. The original GAN was introduced by Ian Goodfellow and colleagues (2014) and involved training two neural networks in the setting of a zero-sum game. In the case of generating artificial imagery, a generator network is tasked with creating ‘fake’ images, with the express purpose of fooling a discriminator network, which is tasked with correctly identifying whether an image of interest is real or fake. By training these networks together with significant care, the two networks learn and improve together: the generator must learn to create more realistic ‘fakes’ to fool the discriminator, while the discriminator must learn more complex features to distinguish a real image from a fake. The basic GAN concept has subsequently been extended to a wide range of image processing applications, including synthesis, object detection, super resolution, and style transfer. For more detail, see the original paper on GANs.
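The zero-sum game can be made concrete with a minimal numpy sketch of the two objectives from the original paper (we use the non-saturating generator loss; the helper names are ours):

```python
import numpy as np

def sigmoid(x):
    # Map raw discriminator scores to probabilities in (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def discriminator_loss(d_real, d_fake):
    # D wants real images scored near 1 and fakes near 0
    return -np.mean(np.log(d_real + 1e-9) + np.log(1.0 - d_fake + 1e-9))

def generator_loss(d_fake):
    # Non-saturating generator objective: G wants its fakes scored near 1
    return -np.mean(np.log(d_fake + 1e-9))

# A discriminator that confidently separates real from fake has low loss...
d_real, d_fake = sigmoid(np.array([6.0])), sigmoid(np.array([-6.0]))
print(discriminator_loss(d_real, d_fake))   # close to 0
# ...and leaves the generator with a large loss to push against
print(generator_loss(d_fake))               # large
```

Training alternates between descending the discriminator loss and the generator loss, which is what makes the pairing a game rather than a single optimisation.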

CycleGAN is an extension of the GAN concept that focuses on the cross-domain transfer problem. Although many image-to-image translation systems can learn a mapping by training over a set of aligned image pairs (i.e., input to output), collecting such pairs can often be prohibitive. Consider wanting to learn a mapping from a horse to a zebra: it would be extremely difficult to obtain lots of photos of horses and zebras in the same pose, and in the same environment. What is exciting about CycleGANs is that we don’t need such photos. Instead, we can collect a lot of horse and zebra images in different environments and train the system on the un-paired imagery.


An illustration of the CycleGAN method for Horse ↔ Zebra (Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks, 2017).

In essence, a CycleGAN is composed of two generator networks learning complementary mappings between the two image types (e.g., A→B and B→A), and two discriminators which learn to identify ‘real’ and ‘fake’ images from each of the image types. Images from both classes are used as inputs to each of the generators and discriminators, and they are trained in a manner relatively similar to a standard GAN. However, an additional constraint is applied to the generators by way of cycle consistency. Put simply, this means that if a horse is transformed into a zebra, then transformed back into a horse, it should look very similar to how it originally started. Without this, all of the horse images would tend to transform into the same picture of a zebra. Unlike standard GANs, cycle consistency means we do not require paired images for CycleGANs. More information is provided in the original paper and the CycleGAN website.
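The cycle-consistency constraint itself is just an L1 penalty on the round trip. A minimal numpy sketch, with plain functions standing in for the two generators:

```python
import numpy as np

def cycle_consistency_loss(x, G, F, lam=10.0):
    # x -> G(x) -> F(G(x)) should land back on x (L1 penalty, weight lam)
    return lam * np.mean(np.abs(F(G(x)) - x))

# A perfectly invertible generator pair has (numerically) zero cycle loss
x = np.linspace(0, 1, 12).reshape(2, 2, 3)
G = lambda a: a + 0.5          # toy 'horse -> zebra' mapping
F = lambda a: a - 0.5          # toy 'zebra -> horse' mapping
print(cycle_consistency_loss(x, G, F))   # effectively zero
```

In the full CycleGAN objective this term is added (in both directions) to the usual adversarial losses, and it is what stops every horse collapsing onto the same zebra.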


The Adelaide Art Museum recently hosted the “Colours of Impressionism” exhibit, showcasing more than 65 impressionist masterpieces from the renowned collection of the Musée d’Orsay in Paris, including works by Cézanne, Monet, Manet, Renoir, Pissarro and Morisot. These artists helped reshape our understanding of art and kicked off what is now known as the modern art movement. It seemed appropriate to pay homage to these great impressionists with our styling of Adelaide satellite imagery. Like any budding artist, we looked to the masters to guide us, specifically Claude Monet, one of the most prolific impressionists of all time. Monet painted many great works, which provided a fantastic training set. The benefit of the CycleGAN approach was that we could train our system on multiple paint styles – we weren’t limited to just one.

We trained our CycleGAN on two sets of images (satellite and Monet) using an open-source TensorFlow implementation that can be found here. There is no point re-inventing the paint brush!


CycleGAN Output: Transforming Satellite Imagery into Monet Paintings. Worldview 3 images © DigitalGlobe.

As illustrated above, the system captured elements of Monet’s work and impressionism in general, where freely brushed colours take precedence over distinct lines. Despite this, the overall structure of the cityscape remained present in our imagery. It is interesting to note some of the relationships that the CycleGAN learned. Specifically, in the second satellite image, it appears the parkland has been mapped to blue water or sky. Because the two generators are trained simultaneously, we could also visualise reconstructions of the satellite imagery, produced by mapping our ‘fake’ Monet images back to the style of satellite imagery (i.e., the third image shown above). The reconstructions look almost like greyscale copies of our original satellite imagery (i.e., the first image), probably because the system has learnt to focus on the differences in colour and edging between the two image styles.

The CycleGAN was trained on a catalogue of Monet works comprising a range of styles painted over his long career. As a result, there was quite a diverse variety of outputs, ranging from vibrant colours to more dull-looking browns.


 An illustration of the output diversity generated by the CycleGAN.

At times the network linked image elements in fascinating ways. Despite the top-down perspective of satellite photography, which is a very different viewpoint from Monet’s landscapes and portraits, the system managed to find some shadows and bushes that resemble a tree. This resulted in an image that resembles a landscape scene, rather than the satellite view from which it was transformed.


The CycleGAN occasionally generated images from different perspectives.

Finally, as we trained networks for mapping in both directions, we mapped original Monet paintings to completely imagined cityscapes. Although there is almost no relation between the two images in terms of overall structure, the satellite image itself seems to contain elements of structure, such as connecting roads and parklands.


Mapping original Monet painting to imagined cityscapes.

This work illustrates how the great artists may have imagined the world from the heavens above, and demonstrates that Google Maps can indeed be considered a form of art.


  • Gatys, L. A., Ecker, A. S. and Bethge, M. (2015) ‘A neural algorithm of artistic style’, ArXiv e-prints. arXiv:1508.06576.
  • Jing, Y., Yang, Y., Feng, Z., Ye, J., Yu, Y. and Song, M. (2018) ‘Neural Style Transfer: A Review’.  arXiv:1705.04058.
  • Jing, Y., Liu, Y., Yang, Y., Feng, Z., Yu, Y., Tao, D. and Song, M. (2018) ‘Stroke controllable fast style transfer with adaptive receptive fields’, arXiv preprint. arXiv:1802.07101.
  • Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A. and Bengio, Y. (2014) ‘Generative adversarial nets’, Advances in neural information processing systems, pp. 2672-2680.
  • Zhu, J.-Y., Park, T., Isola, P. and Efros, A. A. (2017) ‘Unpaired image-to-image translation using cycle-consistent adversarial networks’, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2223–2232.

Data Augmentation: Part 2

continued from part 1

Data Augmentation with SMOTE

What about the case when we don’t know how to perturb the data to ensure that label information is preserved? Well, the Synthetic Minority Over-sampling Technique (SMOTE) can be used. Imagine every sample is a point in a multi-dimensional graph, where each dimension of the graph is one of the features. This is commonly referred to as feature space. Select two random samples from the same class. Now imagine drawing a line between them in feature space, and then creating a new synthetic sample at some random distance along that line. It’s easiest to visualise this in two dimensions!
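That description translates almost directly into numpy. The sketch below follows the simplified random-pair description above (the original SMOTE restricts the partner sample to one of the k nearest neighbours); the function name is ours:

```python
import numpy as np

def smote_like_oversample(X, n_new, rng=None):
    """Create n_new synthetic samples from X, an (n, d) array of samples
    belonging to a single class, by interpolating between random pairs."""
    rng = np.random.default_rng(0) if rng is None else rng
    n = len(X)
    i = rng.integers(n, size=n_new)    # first sample of each pair
    j = rng.integers(n, size=n_new)    # second sample of each pair
    t = rng.random((n_new, 1))         # random position along the joining line
    return X[i] + t * (X[j] - X[i])

# Oversample a 50-point class up by 200 synthetic samples
X = np.random.default_rng(1).normal(size=(50, 2))
X_new = smote_like_oversample(X, 200)
```

Because every synthetic point is a convex combination of two real points, the new samples always lie inside the region spanned by the class, which is what keeps the label information intact.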

There are some existing implementations of SMOTE freely available on the internet. Here are a couple of sources:

  • MATLAB – SMOTE by Manohar
  • R – SMOTE is part of the DMwR package

Instead of just using SMOTE to create additional samples of a single minority class, we are going to increase the abundance of every class. Furthermore, instead of perturbing the raw input data, we will first transform the data into features (using convolutional filters) [1], and then apply SMOTE to create the additional samples.


Let’s perform our experiment again, but this time we will use the synthetically created samples to augment the training data set. Starting with 500 real samples per class, we will use Data Warping and SMOTE to iteratively increase the number of samples all the way up to 5,000 samples per class. Then we can compare the improvement in classifier performance of Data Warping (using elastic distortions) with that of SMOTE.

Figure 3 shows the results for the CNN classifier. We get a good improvement in classifier performance using Data Warping, though not quite as good as using real samples. We get a modest improvement in classifier performance using SMOTE.

Figure 3. CNN Error % vs Number of Training Samples.

Figure 4 shows the results for the SVM classifier. Here we get some improvement in performance using Elastic Data Warping, but nowhere near as good as using real samples. We see no improvement, and even a slight degradation, in performance using SMOTE.

Figure 4. SVM Error % vs Number of Training Samples.

Figure 5 shows the results for the ELM classifier. Here the results are mixed. A large number of synthetic Elastic Data Warping samples are required to provide a modest improvement in test set error, whereas a small number of SMOTE samples provides a small improvement in test set error. However, increasing the number of SMOTE samples further degrades performance.


Figure 5. ELM, Error % vs the Number of Training Samples.


For problems where the classifier is overfitting the data, the best way to improve classifier performance is to collect more data. However, Data Augmentation using synthetic samples is possible and can give good results. Data Warping (such as elastic deformation) will give good results if label-preserving transformations of the data are known. Otherwise, the SMOTE algorithm can be used to generate synthetic samples.

Convolutional Neural Networks (CNNs) are very amenable to data-augmentation techniques. They are my first choice for classification problems with spatial data.


[1] Wong, Sebastien C., Adam Gatt, Victor Stamatescu, and Mark D. McDonnell. (2016) “Understanding data augmentation for classification: when to warp?.” arXiv preprint arXiv:1609.08764 .

Data Augmentation: Part 1

One of the key components of a successful machine learning product is having sufficient good-quality data to train the classifier. The data samples should be representative of the entire population distribution. Increasing the number of samples reduces the risk of your model overfitting the data – that is, being too complex for the data set. The best way to get more samples is to simply go out and collect them. This might mean expensive and time-consuming experimental data collection, along with manually labelling thousands or even millions of samples with the correct class label.

However, in many instances the cost and/or time required to collect the additional samples is prohibitive. In this article, I will outline two methods to synthetically increase the number of samples available for your machine learning task.

How much is enough?

The first question to ask is: do you have enough samples? The simplest way to answer this is to divide your entire data set into two groups: a training set and a test set. (A better approach is to create three groups: training, validation, and test sets.) You then train the system using only part of the training data, but test the model using the complete test set, then incrementally increase the amount of training data used and retrain the system. By graphing the error rate (of both the training and test sets) against the number of training samples, you will be able to evaluate whether you have sufficient data. The plot you create should show the gap in error rate between the training set and the test set decreasing. A large gap between the training error and the test error indicates overfitting of the model, which can usually be remedied by training with more data.
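The procedure above can be sketched end-to-end with a toy stand-in for the real experiment: a nearest-centroid classifier on synthetic two-class Gaussian data. All names and numbers below are illustrative, not the MNIST results discussed next:

```python
import numpy as np

rng = np.random.default_rng(42)

def make_data(n_per_class):
    # Two overlapping Gaussian clusters, one per class
    X0 = rng.normal(0.0, 1.0, size=(n_per_class, 2))
    X1 = rng.normal(1.5, 1.0, size=(n_per_class, 2))
    return np.vstack([X0, X1]), np.array([0] * n_per_class + [1] * n_per_class)

def nearest_centroid_error(Xtr, ytr, Xev, yev):
    # Classify each point by the closer class centroid; return error rate
    c0, c1 = Xtr[ytr == 0].mean(0), Xtr[ytr == 1].mean(0)
    pred = (((Xev - c1) ** 2).sum(1) < ((Xev - c0) ** 2).sum(1)).astype(int)
    return float((pred != yev).mean())

X_train, y_train = make_data(500)
X_test, y_test = make_data(1000)

# Train on increasing subsets, always evaluating on the full test set
for m in (10, 50, 250, 500):
    idx = np.r_[0:m, 500:500 + m]           # m samples of each class
    tr = nearest_centroid_error(X_train[idx], y_train[idx],
                                X_train[idx], y_train[idx])
    te = nearest_centroid_error(X_train[idx], y_train[idx], X_test, y_test)
    print('n={:4d}  train error {:.3f}  test error {:.3f}'.format(2 * m, tr, te))
```

Plotting the two error columns against n gives exactly the kind of learning curve described above: with small n the train and test errors are far apart, and the gap narrows as training data is added.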

Let’s do this for the Mixed National Institute of Standards and Technology (MNIST) handwritten digit dataset, which has 10 classes: the digits 0 to 9. The data is split into a 50,000-sample training set (i.e., 5,000 per class) and a 10,000-sample test set. We will use three different classifiers: (i) a convolutional neural network (CNN), (ii) a convolutional support vector machine (CSVM), and (iii) a convolutional extreme learning machine (CELM). As we increase the number of samples, the training error percentage will generally increase, but the test error percentage will decrease. A rough rule of thumb to prevent overfitting the model is to ensure that the gap between the training error and test error is within 0.5%. It is also important to confirm that the test error percentage is good enough for your application!

Figure 1. Baseline Results. Error % vs Number of Training Samples.


Great! So 50,000 samples (i.e., 5,000 per class) provides enough data to prevent our three classifiers from overfitting.

But what if the gap was bigger?

Let’s say we only had a total of 5,000 handwritten digits in the training set (500 per class). Here (far left of the plot), the gap between training and testing is over 1%. We could reduce the gap by making our classifiers simpler (e.g., fewer neurons), but this would also increase the overall test error.

Instead, we are going to artificially increase the number of samples through:

  1. Data Warping
  2. Synthetic Over Sampling (SMOTE). This will be covered in a subsequent blog post.

Data Warping with Elastic Deformations

The basic idea with data warping is that we are going to transform the images of the handwritten digits, while still preserving the label information. This means, warp it a bit, but make sure it still looks like the original number!

To do this we are going to create a random displacement field: a matrix that causes the pixel values in each digit to be moved (a little) to new locations. We can use a 2D matrix of uniformly distributed random numbers, but we also want the movement to be smooth, so we convolve the matrix with a Gaussian.

The code to do this looks like:

function [X_warped, morelabels] = DataWarpingDiffusion(X, labels, K, N, alpha)
 % [X_warped, morelabels] = DataWarpingDiffusion(X, labels, K, N, alpha)
 % A function to increase the number of training data vectors,
 % by creating N warped duplicates of each of the K vectors.
 % Uses pseudo-elastic warping, see
 % (Simard, 2003) Best Practices in Convolutional Neural Networks
 % X - the data in row vector form
 % labels - the labels for each of the vectors
 % K - the number of vectors
 % N - the number of duplicates
 % alpha - warp-strength [in pixels]
 % Sebastien Wong, 5 Jan 2014
 Xim = reshape(X',28,28,K); % assuming 28 by 28 input image
 L = K*N;
 Y = zeros(28,28,L);
 l = 1;
 for n = 1:N
   for k = 1:K
     I = Xim(:,:,k);
     C = rand(28,28,2)*2 - 1;          % random displacement field in [-1,1]
     blur = d2gauss(28,20,28,20,0);    % 2D Gaussian kernel (helper function)
     C(:,:,1) = conv2(C(:,:,1),blur,'same');
     C(:,:,2) = conv2(C(:,:,2),blur,'same');
     E = sqrt( C(:,:,1).^2 + C(:,:,2).^2 ) + 1e-7; % normalise to unit length
     C(:,:,1) = C(:,:,1) ./ E;
     C(:,:,2) = C(:,:,2) ./ E;
     Y(:,:,l) = imwarp(I, C * alpha);  % apply the scaled displacement field
     l = l+1;
   end
 end
 X_warped = reshape(Y,28*28,[])';
 morelabels = repmat(labels,[N,1]);
end

So what do these warped digits look like (see below)? Importantly, how strong should the displacement alpha be? I found that alpha = 1.2 pixels worked well for this data set, and I was pretty sure that all the warped digits still looked like digits. If we make alpha large, say alpha = 8, we can cause some of the digits to look like other numbers!


Figure 2. Warped Digits from the MNIST Database, with alpha = 1.2 (left) and alpha = 8 (right).


continue to part 2