Binning functions
SciKit-GStat implements a large number of binning functions, which can be used to spatially aggregate the distance matrix into lag classes, or bins. There are a number of functions available, which usually accept more than one method identifier:
-
skgstat.binning.even_width_lags(distances, n, maxlag)
Even lag edges
Calculate the lag edges for a given number of bins using the same lag step width for all bins.
Changed in version 0.3.8: Function returns None as second value to indicate that the number of lag classes was not changed.
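The idea of equal step widths can be sketched with plain NumPy. This is a minimal illustration, not the library's implementation: the helper name even_width_edges is hypothetical, and the assumption is that the upper edges are equally spaced between 0 and maxlag.

```python
import numpy as np


def even_width_edges(distances, n, maxlag):
    """Hypothetical stand-in: n upper bin edges with an equal step width.

    `distances` is accepted only to mirror the documented signature;
    this sketch does not use it.
    """
    # n + 1 equally spaced points from 0 to maxlag; dropping the leading 0
    # leaves only the upper edge of each lag class
    return np.linspace(0, maxlag, n + 1)[1:]


distances = np.array([1.0, 2.5, 3.0, 4.2, 7.7, 9.1])
edges = even_width_edges(distances, n=5, maxlag=10)
print(edges)  # upper edges 2, 4, 6, 8, 10
```

With this scheme the number of point pairs per lag class can vary strongly, since only the widths are controlled.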
- Parameters
distances (numpy.array) – Flat numpy array representing the upper triangle of the distance matrix.
n (integer) – Number of lag classes to find
maxlag (integer, float) – Limit the last lag class to this separating distance.
- Returns
bin_edges – The upper bin edges of the lag classes
- Return type
-
skgstat.binning.uniform_count_lags(distances, n, maxlag)
Uniform lag counts
Calculate the lag edges for a given number of bins with the same number of observations in each lag class. The lag step width will be variable.
Changed in version 0.3.8: Function returns None as second value to indicate that the number of lag classes was not changed.
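One way to obtain lag classes with equal counts is to place the upper edges at evenly spaced percentiles of the distances. The sketch below assumes exactly that; the helper name uniform_count_edges and the filtering of distances above maxlag are assumptions made for illustration, not confirmed library behavior.

```python
import numpy as np


def uniform_count_edges(distances, n, maxlag):
    """Hypothetical stand-in: upper edges at evenly spaced percentiles."""
    # assumption: only distances up to maxlag take part in the binning
    d = distances[distances <= maxlag]
    # the upper edge of class i is the (i / n * 100)-th percentile
    return np.percentile(d, np.arange(1, n + 1) / n * 100)


rng = np.random.default_rng(0)
distances = rng.gamma(2.0, 10.0, size=1000)  # synthetic, skewed distances
maxlag = 50.0

edges = uniform_count_edges(distances, n=4, maxlag=maxlag)

# count the distances per lag class to check that the classes are balanced
within = distances[distances <= maxlag]
counts, _ = np.histogram(within, bins=np.concatenate(([0.0], edges)))
print(edges)
print(counts)
```

The printed counts should be nearly identical, while the differences between consecutive edges are not.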
- Parameters
distances (numpy.array) – Flat numpy array representing the upper triangle of the distance matrix.
n (integer) – Number of lag classes to find
maxlag (integer, float) – Limit the last lag class to this separating distance.
- Returns
bin_edges – The upper bin edges of the lag classes
- Return type
-
skgstat.binning.auto_derived_lags(distances, method_name, maxlag)
Derive bins automatically
New in version 0.3.8.
Uses numpy.histogram_bin_edges to derive the lag classes automatically. Supports any method supported by numpy.histogram_bin_edges. It is recommended to use ‘sturges’, ‘doane’ or ‘fd’.
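To get a feeling for how the recommended methods differ, numpy.histogram_bin_edges can be called directly. The sketch below uses synthetic distances and assumes that distances above maxlag are dropped first and that the leading (lowest) edge is discarded to obtain upper edges; both steps are assumptions, not a description of the library's code.

```python
import numpy as np

rng = np.random.default_rng(42)
distances = rng.uniform(0.0, 100.0, size=500)  # synthetic pairwise distances
maxlag = 60.0

# assumption: only distances up to maxlag are considered
d = distances[distances <= maxlag]

results = {}
for method in ("sturges", "doane", "fd"):
    # NumPy decides on the number of bins itself for these estimators
    edges = np.histogram_bin_edges(d, bins=method)
    # drop the lowest edge so that only upper bin edges remain
    results[method] = edges[1:]
    print(method, len(results[method]))
```

Unlike the functions that take n, the number of lag classes here is a result of the chosen method and not an input.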
- Parameters
- Returns
bin_edges – The upper bin edges of the lag classes
- Return type
See also
-
skgstat.binning.kmeans(distances, n, maxlag, binning_random_state=42, **kwargs)
New in version 0.3.9.
Clustering of pairwise separating distances between locations up to maxlag. The lag class edges are formed equidistant from each cluster center. Note: this does not necessarily result in equidistant lag classes.
- Parameters
distances (numpy.array) – Flat numpy array representing the upper triangle of the distance matrix.
n (integer) – Number of lag classes to find
maxlag (integer, float) – Limit the last lag class to this separating distance.
- Returns
bin_edges – The upper bin edges of the lag classes
- Return type
See also
sklearn.cluster.KMeans()
Note
The KMeans that is used under the hood is not a deterministic algorithm, as the starting cluster centroids are seeded randomly. This can yield slightly different results on each run. Thus, for this application, the random_state on KMeans is fixed to a specific value. You can change the seed by passing another seed to Variogram as binning_random_state.
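The effect of a fixed seed can be illustrated without scikit-learn. In the sketch below, a small one-dimensional Lloyd iteration written in NumPy stands in for sklearn.cluster.KMeans; the helper kmeans_1d_edges, its seed argument and the rule used for the edges (midpoints between neighboring cluster centers, closed by maxlag) are assumptions made for illustration only.

```python
import numpy as np


def kmeans_1d_edges(distances, n, maxlag, seed=42, n_iter=50):
    """Hypothetical stand-in for a seeded k-means binning in one dimension."""
    d = np.sort(distances[distances <= maxlag])
    rng = np.random.default_rng(seed)
    # random starting centers drawn from the data: this is the step that
    # makes the result depend on the seed
    centers = np.sort(rng.choice(d, size=n, replace=False))
    for _ in range(n_iter):
        # assign every distance to its nearest center
        labels = np.argmin(np.abs(d[:, None] - centers[None, :]), axis=1)
        # move each center to the mean of its members; an empty cluster
        # keeps its previous center
        new = np.sort(np.array([
            d[labels == k].mean() if np.any(labels == k) else centers[k]
            for k in range(n)
        ]))
        if np.allclose(new, centers):
            break
        centers = new
    # assumed edge rule: midpoints between neighboring centers, with maxlag
    # closing the last lag class
    return np.append((centers[:-1] + centers[1:]) / 2, maxlag)


rng = np.random.default_rng(0)
distances = rng.uniform(0.0, 100.0, size=300)

first = kmeans_1d_edges(distances, n=5, maxlag=60.0, seed=42)
second = kmeans_1d_edges(distances, n=5, maxlag=60.0, seed=42)
print(first)
print(np.array_equal(first, second))  # same seed, same edges
```

Repeating the call with the same seed reproduces the edges exactly, which is the behavior the fixed random_state is meant to guarantee.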
-
skgstat.binning.ward(distances, n, maxlag, **kwargs)
New in version 0.3.9.
Clustering of pairwise separating distances between locations up to maxlag. The lag class edges are formed equidistant from each cluster center. Note: this does not necessarily result in equidistant lag classes.
The clustering is done by merging pairs of clusters that minimize the variance for the merged clusters, until n clusters are found.
- Parameters
distances (numpy.array) – Flat numpy array representing the upper triangle of the distance matrix.
n (integer) – Number of lag classes to find
maxlag (integer, float) – Limit the last lag class to this separating distance.
- Returns
bin_edges – The upper bin edges of the lag classes
- Return type
See also
sklearn.cluster.AgglomerativeClustering()
-
skgstat.binning.stable_entropy_lags(distances, n, maxlag, **kwargs)
Optimizes the lag class edges for n lag classes. The algorithm minimizes the differences between the Shannon entropies of the lag classes. Consequently, the final lag classes should be of comparable uncertainty.
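The quantity being balanced can be made concrete with a small sketch that evaluates, but does not optimize, the Shannon entropy per lag class for a given set of edges. The helper bin_entropies, the sub-histogram with n_sub cells used to estimate each entropy, and the summed absolute difference used as a spread measure are all assumptions chosen for illustration; the library may estimate and compare entropies differently.

```python
import numpy as np


def bin_entropies(distances, edges, n_sub=10):
    """Hypothetical helper: Shannon entropy (in bits) per lag class."""
    edges = np.asarray(edges, dtype=float)
    lower = np.concatenate(([0.0], edges[:-1]))
    entropies = []
    for lo, up in zip(lower, edges):
        members = distances[(distances > lo) & (distances <= up)]
        # assumption: estimate the distribution inside a lag class from a
        # histogram with n_sub equally wide cells
        counts, _ = np.histogram(members, bins=n_sub, range=(lo, up))
        p = counts[counts > 0] / counts.sum()
        entropies.append(-np.sum(p * np.log2(p)))
    return np.array(entropies)


rng = np.random.default_rng(1)
distances = rng.gamma(2.0, 10.0, size=2000)  # synthetic, skewed distances

edges = np.array([10.0, 20.0, 30.0, 40.0])
h = bin_entropies(distances, edges)
# a simple spread measure: how much neighboring lag classes differ
spread = np.sum(np.abs(np.diff(h)))
print(h)
print(spread)
```

An optimizer would shift the edges to make a spread measure of this kind as small as possible.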
- Parameters
distances (numpy.array) – Flat numpy array representing the upper triangle of the distance matrix.
n (integer) – Number of lag classes to find
maxlag (integer, float) – Limit the last lag class to this separating distance.
- Keyword Arguments
- Returns
bin_edges – The upper bin edges of the lag classes
- Return type