Using generative AI to write a map description

TomBaker · ‎07-24-2023

I recently upgraded my free ChatGPT account to a paid version. Paid versions include beta features like plugins and the Code Interpreter. Using the GPT-4 model, I added a plugin designed to describe images in text. This is a bit like “reverse engineering” the prompt that might have been given to an AI image generator. The plugin I used for this post is called SceneXplain. At the time of this post, the plugin was free to use.

Prompt:

In ChatGPT (using the GPT-4 model), I provided this prompt:

explain this image: http://images.mapsofworld.com/usa/thematic-maps/usa-population-map.jpg

Results:

ChatGPT and SceneXplain returned the following response:

"The image shows a map of the United States, which is displaying the population estimate for each state. At the top of the map, there is a caption that reads "USA POPULATION ESTIMATE BY STATE" in large letters. Below this caption, there is additional text that says "(As of July 1, 2016)".

The map is divided into various sections, each representing a different state. The states and their corresponding abbreviations are labeled with text positioned within their boundaries. Some of the state abbreviations include "WA" for Washington, "MT" for Montana, "ME" for Maine, "ND" for North Dakota, and "OR" for Oregon. These abbreviations are scattered across the map, with each one appearing near its respective state's boundary.

The map also includes labels for population ranges, symbolized by different colors. These labels are placed at the bottom of the map. They indicate the population ranges from "10,000,001-40,000,000" to "1,000,001-5,000,000" to "100,000-1,000,000". The corresponding scale for the range values is displayed as "1 82.5 125" on the right side."

Reactions?

At this time, the tools don’t describe the spatial pattern in the map, such as which states fall into which population range.
It’s not clear that the tools understand the linkage between the legend and the map.
The tools do a reasonably good job of describing the content of the image.
The narrative could be helpful as context or for framing a map that a student is to interpret. This may be especially useful for students with visual or cognitive impairments.
The tools don’t work well today for complex JPG/PNG maps, and they do not work on interactive maps.
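
To make the legend-to-map linkage concrete, here is a minimal Python sketch of what that step might involve: taking the legend labels exactly as quoted in the response above and looking up which class a given population falls into. The function names and the sample population values are hypothetical, chosen for illustration only; none of the values were read from the map, and this is not how ChatGPT or SceneXplain work internally.

```python
# Legend labels exactly as quoted in the tool's response above.
LEGEND_LABELS = [
    "10,000,001-40,000,000",
    "1,000,001-5,000,000",
    "100,000-1,000,000",
]


def parse_range(label):
    """Turn a label like '100,000-1,000,000' into an inclusive (low, high) tuple."""
    low, high = (int(part.replace(",", "")) for part in label.split("-"))
    return low, high


def legend_class(population, labels=LEGEND_LABELS):
    """Return the legend label whose range contains the population, or None."""
    for label in labels:
        low, high = parse_range(label)
        if low <= population <= high:
            return label
    return None


# Hypothetical population values, for illustration only (not read from the map).
for value in (750_000, 3_000_000, 7_000_000, 20_000_000):
    print(value, "->", legend_class(value))
```

In this sketch, a value of 7,000,000 matches none of the quoted ranges, because the response lists nothing between 5,000,000 and 10,000,001. That may mean the description skipped a legend class, or that the ranges it quoted were only examples; either way, a reader could not rebuild the map's classes from the narrative alone.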

This is a first-generation technology. What could this look like in 1 year, 5 years, or 10 years?