I am interested on generating GeoTiff images given a set of scattered points, the dataset is normally between 500K points up to 2M points. I have been using datashader and geoviews to generate static images of the data and that works fine. However, when I want to generate the geotiff, there are two options:
- To generate an RGB geotiff with the coloring as provided by datashader Image color map
- To use the values inside the geotiff to create a colormap on the geotiff server.
We are interested on going for the second solution but I do not understand the values which I extract from the shade function:
import pandas as pd
import datashader as ds
import datashader.transfer_functions as tf
df = pd.DataFrame({"x": [0, 0.6, 1, 1.2, 2, 5], "y":[0, 0.6, 0.8, 1, 5, 6], "value":[1, 1, 1, 1, 0.5, 0.5]})
canvas = ds.Canvas(plot_width=100, plot_height=100, x_range=(0, 5), y_range=(0, 5), x_axis_type="linear", y_axis_type="linear")
shaded = tf.shade(canvas.points(df, "x", "y", ds.mean("value")))
As far as I know, I extract the aggregated values from shade as numpy and can use this array to save a GeoTiff with one band but the values into that array make no sense for me.
shaded.to_numpy().max()
4293318829
It is always huge numbers which have no relationship with the aggregation process.
So the questions are, what do those values actually mean? Is it possible to get the actual aggregated values ? Is it possible to extract the color mapping from an Image object so I can pass it to the server and they color the geotiff as I expect it to be colored?
So datashader uses a couple of steps to produce the results we get. The more important, there is a step in which the values are mapped to RGB using the color_map input. After that the colourize function inside datashader stack the 4 RGBA uint8 arrays of size (100, 100) into one uint32 (100, 100) array using this:
From datashader:
That is why the values are so strange. To get the RGBA, one can access the data like this: