Ok, thanks for following up, Allison. I use 10.4, too. I sort of follow what Gerry said, but I'm struggling to translate it into automated Python code.
In essence, I believe he is getting each element's extent from the layout view using the arcpy.mapping classes (i.e., GraphicElement, DataFrame, TextElement, etc.). That may take some math in Python to calculate the extent values (x/y min/max) if you are using any layout element other than DataFrame (LegendElement, TextElement, etc.). DataFrame has an "extent" property from which you can get the XMin, XMax, YMin, and YMax values. For other elements, you would have to calculate those values from the elementHeight, elementWidth, elementPositionX, and elementPositionY properties.
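To make the math concrete, here is a rough sketch of how the min/max values could be derived from an element's position and size. The function name element_page_extent is made up for illustration, and it assumes the element's position refers to its lower-left corner; if the anchor point has been changed in ArcMap, the arithmetic would need adjusting. I haven't confirmed this is what Gerry did.

```python
def element_page_extent(x, y, width, height):
    """Return (xmin, ymin, xmax, ymax) for a layout element, in page units.

    Assumption: (x, y) is the element's lower-left corner. If the element's
    anchor point is set elsewhere, shift x and y before calling this.
    """
    return (x, y, x + width, y + height)


# With arcpy.mapping (ArcMap 10.x), the inputs would come from something like:
#   mxd = arcpy.mapping.MapDocument("CURRENT")
#   elm = arcpy.mapping.ListLayoutElements(mxd, "LEGEND_ELEMENT")[0]
#   extent = element_page_extent(elm.elementPositionX, elm.elementPositionY,
#                                elm.elementWidth, elm.elementHeight)

# Stand-in numbers (page units) so the sketch runs without ArcGIS installed.
extent = element_page_extent(1.0, 2.0, 3.5, 4.0)
print(extent)
```

The result is in page units (whatever the layout is set to, e.g. inches), not map units and not pixels.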
Then, I suppose he is cropping the original exported image using the "crop" method from PIL (Python Imaging Library), which requires four pixel coordinate values: left, upper, right, lower. I've never used PIL, so I'm not sure which properties (xmin, xmax, ymin, ymax) from the mapping classes correspond to which pixel coordinates in PIL's crop method. Also, I'm not sure whether those properties, which are in map units, can be passed to the crop method without first being converted to pixel coordinates. Plus, I'm not sure what units the original exported image (in my case, a JPEG with unwanted white space) uses! If they aren't pixel coordinates, you'd have to convert them for the crop method to work properly.
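For what it's worth, here is one way the conversion might look, as an untested-against-ArcGIS sketch. It assumes the element bounds are in page units of inches, that the exported image covers the whole layout page, and that you know the export resolution in DPI. PIL measures pixels from the top-left corner, so the y values have to be flipped. The function names page_box_to_pixel_box and crop_export are made up for illustration.

```python
def page_box_to_pixel_box(xmin, ymin, xmax, ymax, page_height, dpi):
    """Convert page-unit bounds (inches, origin at lower left) into a PIL-style
    crop box (left, upper, right, lower) in pixels (origin at upper left).

    Assumptions: the image covers the full page and was exported at `dpi`.
    """
    left = int(round(xmin * dpi))
    right = int(round(xmax * dpi))
    # Flip y: the top of the element is (page_height - ymax) down from the top.
    upper = int(round((page_height - ymax) * dpi))
    lower = int(round((page_height - ymin) * dpi))
    return (left, upper, right, lower)


def crop_export(in_path, out_path, box):
    """Crop an exported image to `box` with PIL/Pillow and save the result."""
    from PIL import Image  # imported here so the math above runs without PIL
    img = Image.open(in_path)
    img.crop(box).save(out_path)


# Example: an element spanning (1.0, 2.0) to (4.5, 6.0) inches on an
# 11-inch-tall page, exported at 100 DPI.
box = page_box_to_pixel_box(1.0, 2.0, 4.5, 6.0, page_height=11.0, dpi=100)
print(box)
```

You would then call something like crop_export("map.jpg", "map_cropped.jpg", box), where both file names are placeholders. If the layout's page units are centimeters or the export does not cover the full page, the conversion would need to change.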
That's probably more confusing than helpful, but I was basically thinking off the top of my head in hopes it would be useful for you. It would be nice to have Gerry's input, but I don't think he's logged into GeoNet in quite some time...