Skip navigation
All People > Dan_Patterson > Py... blog > 2016 > March
2016

Data can be viewed in a variety of ways.  This is example of how to view data.  The examples I am going to use are arrays, but they can equally be replaced with small maps or other forms of data.

 

Picture this... a 3D list of data, like observations for 3 people.  A simple printout of the data is pretty easy to digest if the set is small and the dimensions are few.  Here are a couple of options.  The first column (simple 3D array) is the standard presentation.  The second column adds a bit more information and puts the dimension number in place of the blank row used previously, which would be handy if the list were long.  The last column reshuffles the data so that each entry into the thrid dimension is a column block and it consolidates the number of rows needed to show data that are smaller in their columns with respect to the number of rows.

 

Simple 3D array3D with a tad more information3D by rows each dimension as column

The array... (a_3d) ...

[[[ 7  2 11]
  [ 2  2  5]
  [ 7  0  1]
  [ 8  5  4]
  [ 8  8  6]
  [10  4  0]]

 [[ 7 13  5]
  [10 11  4]
  [11  0  2]
  [ 8  5 11]
  [ 7  2  0]
  [ 2 11  6]]

 [[ 9  6  9]
  [ 5  7  0]
  [12  9  6]
  [ 3  4 10]
  [ 9 14  1]
  [ 7  3 11]]]

de_line(a_3d) with extra info...

array...
shape (3, 6, 3) ndim 3 size 54
a[0]...
[[[ 7  2 11]
  [ 2  2  5]
  [ 7  0  1]
  [ 8  5  4]
  [ 8  8  6]
  [10  4  0]]
a[1]....
 [[ 7 13  5]
  [10 11  4]
  [11  0  2]
  [ 8  5 11]
  [ 7  2  0]
  [ 2 11  6]]
a[2]....
 [[ 9  6  9]
  [ 5  7  0]
  [12  9  6]
  [ 3  4 10]
  [ 9 14  1]
  [ 7  3 11]]]

to_row(a_3d)

 

Array...  3D   r  c
  shape: (3, 6, 3) size: 54 
a[0]......  a[1]......  a[2]......  
[ 7  2 11]  [ 7 13  5]  [ 9  6  9]
[ 2  2  5]  [10 11  4]  [ 5  7  0]
[ 7  0  1]  [11  0  2]  [12  9  6]
[ 8  5  4]  [ 8  5 11]  [ 3  4 10]
[ 8  8  6]  [ 7  2  0]  [ 9 14  1]
[10  4  0]  [ 2 11  6]  [ 7  3 11]

 

The tabular data shown in the last column can be presented in graphical form if the arrays are fairly simple.

 

image.jpeg

Color would be cool, with or without labels, or you can skip the fill and simply use the labels.

 

Of course this can be applied to higher or lower dimensions as in the case of the 4D array below, which consists of 5 3D arrays.

 

4D shaping

 

Array... 4D  shape: (5, 3, 6, 3) size: 270
3D subarrays: 5

Array...  3D   r  c
  shape: (3, 6, 3) size: 54
a[0]......  a[1]......  a[2]......
[ 7  2 11]  [ 7 13  5]  [ 9  6  9]
[ 2  2  5]  [10 11  4]  [ 5  7  0]
[ 7  0  1]  [11  0  2]  [12  9  6]
[ 8  5  4]  [ 8  5 11]  [ 3  4 10]
[ 8  8  6]  [ 7  2  0]  [ 9 14  1]
[10  4  0]  [ 2 11  6]  [ 7  3 11]

Array...  3D   r  c
 shape: (3, 6, 3) size: 54
a[0]......  a[1]......  a[2]......
...
SNIP
...
[16 10  6]  [ 8 17 12]  [13  9 17]

Array...  3D   r  c</p>
  shape: (3, 6, 3) size: 54
a[0]......  a[1]......  a[2]......
[15 10 19]  [15 21 13]  [17 14 17]
[10 10 13]  [18 19 12]  [13 15  8]
[15  8  9]  [19  8 10]  [20 17 14]
[16 13 12]  [16 13 19]  [11 12 18]
[16 16 14]  [15 10  8]  [17 22  9]
[18 12  8]  [10 19 14]  [15 11 19]

 

I suppose it is a matter of perspective which format you prefer.  Comments welcome prior to posting my code in case people have views they would like to see.

Like is says.

 

Modified:  2016-12-05

Simplified aspect calculation and added the ability of how to classify 'flat' during aspect calculations.  this should clean up those pesky random seemingly random aspect values in areas of small slope.

 

This is simply an implementation of the 3rd order finite difference method of determining slope and aspect (Horton 1981, Burrough et al. various) and used by the Spatial Analyst extension.  It is also the base for a variety of terrain derivatives.  I have improved the base algorithm to facilitate using larger arrays/rasters in a shorter period of time.  This implementation doesn't account for nodata values... I will leave that for the Code Sharing site.  I will leae the code documentation and discussion attached, but bear in mind it is much slower than the code below... but the fundamentals are the same.

 

This approach provides an estimate of slope which attempts to account for the cell values (ie elevations) in the 8 neighbours surrounding a core cell.  A measure of the slope in both the X and Y directions is made and averaged.

 

Alternate measures of slope can be simply implemented using variants of the fundamental algorithms documented in the attachment.  The author has also worked with implementations of simple hillshading and related neighbour functions which employ the same or similar methods of parsing 2D arrays into a 4D array.

The block or tiling process that is used to produce the subarrays has been documented in previous threads and here.

 

image.png - image.png

 

Sample surface...

Slopes form a pyramid. Edges trimmed...

>>> Dem...
[[0 1 2 3 2 1 0]
[1 2 3 4 3 2 1]
[2 3 4 5 4 3 2]
[3 4 5 5 5 4 3]
[2 3 4 5 4 3 2]
[1 2 3 4 3 2 1]
[0 1 2 3 2 1 0]]

Slope...
[[ 15.79  15.79  11.31  15.79  15.79]
[ 15.79  13.9    8.53  13.9   15.79]
[ 11.31   8.53   0.     8.53  11.31]
[ 15.79  13.9    8.53  13.9   15.79]
[ 15.79  15.79  11.31  15.79  15.79]]

Aspect...
[[ 315.  315.    0.   45.   45.]
[ 315.  315.    0.   45.   45.]
[ 270.  270.  270.   90.   90.]
[ 225.  225.  180.  135.  135.]
[ 225.  225.  180.  135.  135.]]

 

 

Slope calculations... Aspect calculations

 

>>> # ---- Slope calculation using
>>> # finite difference ----
>>>
>>>Slope face...
[[1 1 1]
[3 3 3]
[5 5 5]]
N     dx    deg.    %
(1 )   0.5   71.6  400.0
(2 )   1.0   56.3  200.0
(3 )   2.0   36.9  100.0
(4 )   4.0   20.6   50.0
(5 )   6.0   14.0   33.3
(6 )   8.0   10.6   25.0
(7 )  10.0    8.5   20.0
(8 )  12.5    6.8   16.0
(9 )  15.0    5.7   13.3
(10)  20.0    4.3   10.0
(11)  25.0    3.4    8.0
(12)  40.0    2.1    5.0
(13)  80.0    1.1    2.5
(14) 100.0    0.9    2.0
>>> # ---- dx is the cell width ----
>>> # It is a north facing slope,
>>> # with dx=4 and dz=2, you get a
>>> # 20.6 deg. or 50% slope
>>>

 

>>> # ---- Some windows with
>>> #   aspect determination ----
>>>
(0) Array aspect...0.0
[[1 0 1]
[2 2 2]
[3 4 3]]
(1) Array aspect...315.0
[[0 1 2]
[1 2 3]
[2 3 4]]
(2) Array aspect...270.0
... snip ...
(4) Array aspect...180.0
[[3 4 3]
[2 2 2]
[1 0 1]]
(5) Array aspect...135.0
[[4 3 2]
[3 2 1]
[2 1 0]]
(6) Array aspect...90.0
[[3 2 1]
[4 2 0]
[3 2 1]]
(7) Array aspect...45.0
[[2 1 0] etc... etc ...

 

 

Now for some functions

 

The code below provides a check on the arrays then a few helper functions before the slope_a and aspect_a functions.  A demo function is used to show the results for a 3xN window with know properties.  I have used this easily on arrays in the 1000's  in both x and y.

import sys
import numpy as np
from numpy.lib.stride_tricks import as_strided
from textwrap import dedent, indent
import matplotlib.pyplot as plt

ft = {'bool': lambda x: repr(x.astype('int32')),
      'float': '{: 0.1f}'.format}
np.set_printoptions(edgeitems=10, linewidth=100, precision=2,
                    suppress=True, threshold=200,
                    formatter=ft)
np.ma.masked_print_option.set_display('-')
def _check(a, r_c, subok=False):
    """Performs the array checks necessary for stride and block.
    : see arr_tools
    """

    if isinstance(r_c, (int, float)):
        r_c = (1, int(r_c))
    r, c = r_c
    if a.ndim == 1:
        a = np.atleast_2d(a)
    r, c = r_c = (min(r, a.shape[0]), min(c, a.shape[1]))
    a = np.array(a, copy=False, subok=subok)
    return a, r, c, tuple(r_c)

def stride(a, r_c=(3, 3)):
    """Provide a 2D sliding/moving view of an array. 
    :  see arr_tools
    """

    a, r, c, r_c = _check(a, r_c)
    shape = (a.shape[0] - r + 1, a.shape[1] - c + 1) + r_c
    strides = a.strides * 2
    a_s = (as_strided(a, shape=shape, strides=strides)) #.squeeze()
    return a_s


def filter_a(a_s, filter=None, cell_size=1):
    """Used by aspect, slope and hillshade to filter a raster/array
    :Requires:
    :--------
    : a_s - a r*c*3x3 strided array
    : filter - a 3x3 filter to apply to a_s
    : cell_size - for slope (actual size*8, aspect (8 is required)
    """

    f_dxyz = filter
    cell_size= cell_size*8.0
    if filter is None:
        f_dxyz = np.array([[1, 2, 1], [2, 0, 2], [1, 2, 1]], dtype="float64")
    a_s = a_s * f_dxyz
    dz_dx = (a_s[...,:,2] - a_s[...,:,0]).sum(axis=-1)/cell_size
    dz_dy = (a_s[...,2,:] - a_s[...,0,:]).sum(axis=-1)/cell_size
    return dz_dx, dz_dy


def slope_a(a, cell_size=1, degrees=True, verbose=False, keepdims=False):
    """Return slope in degrees for an input array using 3rd order
    :  finite difference method for a 3x3 moing window view into the array.
    :
    :Requires:
    :--------
    : a - an input 2d array. X and Y represent coordinates of the Z values
    : cell_size - must be in the same units as X and Y
    : degrees - True, returns degrees otherwise radians
    : verbose - True, to print results
    : keepdims - False, to remove/squeeze extra dimensions
    : filter - np.array([[1, 2, 1], [2, 0, 2], [1, 2, 1]]) **current default
    :
    :Notes:    dzdx: sum(col2 - col0)/8*cellsize
    :-----     dzdy: sum(row2 - row0)/8*celsize
    : Assert the array is ndim=4 even if (1,z,y,x)
    :   general         dzdx      +    dzdy     =    dxyz
    :   [[a, b, c],  [[1, 0, 1],   [[1, 2, 1]       [[1, 2, 1]
    :    [d, e, f]    [2, 0, 2], +  [0, 0, 0]   =    [2, 0, 2],
    :    [g, h, i]    [1, 0, 1]]    [1, 2, 1]]       [1, 2, 1]]
    :
    """

    frmt = """\n    :----------------------------------------:
    :{}
    :input array...\n    {}
    :slope values...\n    {!r:}
    :----------------------------------------:
    """

    # ---- stride the data and calculate slope for 3x3 sliding windows
    a_s = stride(a, r_c=(3,3))
    if a_s.ndim < 4:
        new_shape = (1,) * (4-len(a_s.shape)) + a_s.shape
        a_s = a_s.reshape(new_shape)
    r = a_s.shape[0]
    c = a_s.shape[1]
    f_dxyz = np.array([[1, 2, 1], [2, 0, 2], [1, 2, 1]], dtype="float64")
    # ---- default filter, apply the filter to the array ----
    #
    dz_dx, dz_dy = filter_a(a_s, filter=f_dxyz, cell_size=cell_size)
    #
    s = np.sqrt(dz_dx**2 + dz_dy**2)
    if degrees:
        s = np.rad2deg(np.arctan(s))
    if not keepdims:
        s = np.squeeze(s)
    if verbose:
        from textwrap import indent, dedent  # if not imported
        p = "    "
        args = ["Results for slope_a... ",
                indent(str(a), p), indent(str(s), p)]
        print(dedent(frmt).format(*args))
    return s

   
def aspect_a(a, cell_size=1, flat=0.1):
    """Return the aspect of a slope in degrees from North.
    :Requires:
    :--------
    : a - an input 2d array. X and Y represent coordinates of the Z values
    : flat - degree value, e.g. flat surface <= 0.05 deg
    :        0.05 deg => 8.7e-04    0.10 deg => 1.7e-02           
    :
    """

    if not isinstance(flat, (int, float)):
        flat = 0.1
    a_s = stride(a, r_c=(3,3))
    if a_s.ndim < 4:
        new_shape = (1,) * (4-len(a_s.shape)) + a_s.shape
        a_s = a_s.reshape(new_shape)
    r = a_s.shape[0]
    c = a_s.shape[1]
    f_dxyz = np.array([[1, 2, 1], [2, 0, 2], [1, 2, 1]], dtype="float64")
    a_s = a_s * f_dxyz
    # filter the array, using default filter f_dxyz above
    #
    dz_dx, dz_dy = filter_a(a_s, filter=f_dxyz, cell_size=1)
    #
    asp = np.arctan2(dz_dy, -dz_dx)     # relative to East
    s = np.sqrt((dz_dx*cell_size)**2 + (dz_dy*cell_size)**2)    # get the slope
    asp = np.rad2deg(asp)
    asp = np.mod((450.0 - asp), 360.)   # simplest way to get azimuth
    asp = np.where(s <= flat, -1, asp)
    return asp


def _demo():
    """Demonstration of calculations
    :
    """

    frmt = """
    :------------------------------------------------------------------
    :{}
    :input array
    {}
    :slope values
    {}
    :zero?
    {}
    :aspect values
    {}
    :
    :------------------------------------------------------------------
    """

    p = "    "
    t = 1.0e-12
    d = [[0, 0, 0, 0, t, t, t, 0, 0, 0, 0, 0, 0, 4, 4, 4, 2, 2, 2],
         [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, t, t, t, 4, 4, 4, 2, 2, 2],
         [0, 0, 0, 0, 0, 0, 0, t, t, t, 0, 0, 0, 4, 4, 4, 2, 2, 2]]
    a = np.array(d, dtype='float64')
    sl = slope_a(a, cell_size=5, degrees=True, verbose=False, keepdims=False)
    sboo = np.where(sl==0.0, " T", " F")
    asp = aspect_a(a)
    z = np.ma.asarray(a)
    z.mask = (z==t)
    args = ["Surface properties",
            z,
            indent(str(sl), p),
            indent(str(sboo), p),
            indent(str(asp), p)]
    for i in d: print(i)
    print(dedent(frmt).format(*args))
    return a, sl, asp, d
# ---------------------------------------------------------------------
if __name__ == "__main__":
    """Main section...   """
    #print("Script... {}".format(script))
    a, sl, asp, d = _demo()
 

That's all for now.

Free Frequency...   

 

Announced  2017-04-15  ***Frequency is freed... version for ArcGIS Pro

Created        2016-03-02

Modified       2016-12-20  *** the shackles remain in 10.5 make it your resolution in 2017 to join the cause

References   many

 

Why is this tool only available at the Advanced license level???

Come on. That is lame. This type of question comes up time and again.

Select your options from the following request.

------------------------------

I have ...

    • two   - several   - many

columns containing class information with

    • the same   - different data types

and I want to combine them to produce a unique classification scheme and

    • count the number in each class
    • calculate some statistical value,
    • graph it.

------------------------------

Get the picture?  Sound familiar?

I will show the results of doing this type of analysis and provide an attachment for you to do it.

In ArcGIS, the Frequency tool produces these results

    Frequency_01.png

Now these results look similar...in fact identical, but created without using the frequency tool... but using tools provided within ArcMap itself.

Input arraySorted arraySummary

    A     B

   'a',   'c'

   'b',   'a'

   'c',   'b'

   'c',   'b'

   'b',   'a'

   'a',   'c'

   'b',   'a'

   'c',   'b'

   'b',   'a'

   'a',   'c'

   'a',   'a'

   'c',   'b'

   A     B    class row

   'a',   'a'     0       0

   'a',   'c'     1       1

   'a',   'c'     1

   'a',   'c'     1

   'b',   'a'     2       4

   'b',   'a'     2

   'b',   'a'     2

   'b',   'a'     2

   'c',   'b'     3       8

   'c',   'b'     3

   'c',   'b'     3

   'c',   'b'     3

combinations    class

  ( 'a',   'a' )              0

  ( 'a',   'c' )              1

  ( 'b',   'a' )              2

  ( 'c',   'b' )              3

 

first occurence  0, 1, 4, 8

 

reclassed list

  0 1 1 1 2 2 2 2 3 3 3 3

 

class     0   1   2   3   4

count    1    3   4   4   -

 

Now to produce the results above, you could

  • build an SQL style query using the unique combinations of inputs for each field   (ie if A =='a' and B == 'a': then...)
  • concatenate the values in each field (accounting for differences in data types), get the unique combinations

These can be determined using:

  • built in tools in arcmap ( field calculator, the Calculate Field, Frequency and Summarize tools).
  • scripting, using python with searchcursors or arcobjects.  You could even make a fancy interface to collect and present the results.

You can even produce graphs.

 

So... how was it done?  ... it is in the attachment.

 

Join the cause... frequently used tools should be part of the standard arsenal.

Basic fancy formats...

 

Created:   2016-02-28

Modified: 2016-08-13

Attachments:

    Script attached in zip file below.

References:

 

This is a work in progress.

I am tired of reinventing the wheel every time I want to format output so it looks appealing and conveys the information I want.  To this end, I am going to present what I have... but I will do this in increments.

 

I will not provide commentary... I don't want to cloud the issue with rules and specifics.  I use a monospaced font to ensure that the output is what you would expect in any python IDE, and to control vertical alignment within a pure text environment.  Asample script is attached so you can experiment with your own ideas.

 

Consult the script header for further information.  The script also contains some format tips that arise from formatting script... which I have posted on previously.

 

Sample output (courier new 10pt)

This is the only font that will maintain monospacing from the available font family.  I formatted using the "Plain" syntax highlighter so that line numbers were created for reference purposes.

 

Basic formatting....

    :  |_{ [name/num] [!s, !r, none] [: format_spec]_|}
    :  |_{ vertical bar to denote start, _ optional spaces
    :  }_| vertical bar to denote end, _ optional spaces
    : ------------------------------------

    :   [name/num]  = empty, name, number
    :   [!s or !r]  = str or repr format or nothing
    :   [format]    = see examples in output

1   |{0:_>15.0f}|   - lead _, right, integer           |_________123456|

2   |{0:_>15.3f}|   - lead _, right, float             |_____123456.012|

3   |{0:_>+15.3f}|  - lead _, right, req. sign, float  |____+123456.012|

4   |{0:_> 15.3f}|  - lead _, right, opt. sign, float  |____ 123456.012|

5   |{0:> 15.3f}|   - no lead, right, opt. sign, float |     123456.012|

6   |{0:>15.3f}|    - whatever, float                  |     123456.012|

7   |{0: <15.3f}|   - left, float, trailing space(s)   |123456.012     |

8   |{0: < 15.3f}|  - left, space(s), float, spaces    | 123456.012    |

9   |{0:_< 15.3f}|  - left, space(s), float, __        | 123456.012____|

10  |{0!s:<15}|     - left, string                     |123456.01234   |

11  |{0!s:*<15}|    - left, string space(s)            |123456.01234***|

12  |{0!s:*^15}|    - center, lead/trail               |*123456.01234**|

13  |_{0!s: ^13}_|  - lead, center, trail              |_123456.01234 _|

14  |_{0!s:*<}_|    - no size, lead ignored, left      |_123456.01234_|

15  |_{0: >5f}_|    - float, undersized                |_123456.012340_|


Fancy formatting
....

Format elements are generated from field widths and content.
Some examples for ... 1234.5 ...

widths.... [6, 8, 10, 12]
building..

    f = [ "|_{{0!s:^{}}}_|".format(width) for width in widths ]
    txt = "".join(i for i in f)
    print(txt.format(1234.5))

formats... ['|_{0!s:^6}_|', '|_{0!s:^8}_|', '|_{0!s:^10}_|', '|_{0!s:^12}_|']

result.... |_1234.5_||_ 1234.5 _||1234.5  _||_   1234.5   _|


 

Comments welcome.

 

More to come....

Created:   2016-03-01

Modified:  2016-08-13

Attachments:

    Script attached in zip file below.

References:

 

 

This is a work in progress

This example shows how to generate column formats from the data itself and how to generate a header from the tabular information.  The simplest way to obtain a useful table for manipulation in other software is to use the FeatureclassToNumPyArray or TableToNumPyArray methods in arcpy.  A sample output for such a table is shown in the example.

 

In the example below, the following procedure was used:

  • the first row of the table is used to obtain the column widths and data type
  • an appropriate format string was used to create the column structure... NOTE. as in the previous example a vertical bar, |, is used to denote the column positional start and the underscore, _, to denote preceding or trailing spaces.
  • the format string and the number of columns in the table creates the line format,
  • combining that information with column width gives a resultant column format
  • an option to use the maximum size between the column name and data size is used to modify the values of column width.
  • finally... a column header...generated from the input data... is produced and the line is printed.

 

The next example will delve into how to do this all in one swoop giving the option to use column widths based upon data size with the columns and/or whether to maintain column size, alignment, record delimiters and whether to include ancillary information related to the table.

 

Format demo

 

Input array....

:.... [(0, [300000.0, 5025000.0], 12345, 1.2345, b'string', 'unicode')]

format string: |_{{!s:<{}}}_

column widths: [2, 21, 5, 6, 9, 7]

column type:   ['int32', 'ndarray', 'int32', 'float64', 'bytes_', 'str_']

data dtype:    ['i', 'f', 'i', 'f', 'S', 'U']

line format.

...  |_{{!s:<{}}}_|_{{!s:<{}}}_|_{{!s:<{}}}_|_{{!s:<{}}}_|_{{!s:<{}}}_|_{{!s:<{}}}_

resultant:..|_{!s:<2}_|_{!s:<21}_|_{!s:<5}_|_{!s:<6}_|_{!s:<9}_|_{!s:<7}_


Column header... |.ID.|..........XY...........|..C_0..|..C_1...|....C_2....|...C_3...

Output line..... |_0 _|_[  300000.  5025000.]_|_12345_|_1.2345_|_b'string'_|_unicode_


 

More later...