Artificial Intelligence (AI) has begun to transform the fields of cartography and geographic information systems (GIS). AI can process large volumes of spatial data quickly and automatically, reducing manual work in map analysis and geovisualization and supporting more accurate results. ChatGPT, an AI-powered large language model, can be used as an alternative tool for creating maps. However, to combine the AI tool with data effectively, it is necessary to understand how to formulate prompts that generate the most useful results.
In the domain of Geoinformatics, AI technology has been integrated into various research areas. Machine learning and deep learning are used to interpret complex geographic information, predict spatial trends, and provide insights with high accuracy. However, using AI to create maps automatically from natural-language commands is still relatively new. This raises the question of how well AI can create maps that follow cartographic rules, and how the quality of AI-generated maps compares with that of human-generated maps.
This master’s thesis mainly aims to utilize AI for creating maps by applying different prompt patterns. AI-generated maps are compared to maps created through a conventional cartographic method. The maps are produced from Python scripts generated by the AI according to prompt engineering techniques. The case study focuses on wildfire events in Portugal between 2002 and 2022. The study sets the following specific objectives to guide the research:
• To evaluate the functional capability and learning ability of the AI in producing both static and interactive maps.
• To analyze and evaluate different prompt patterns that influence map outputs.
• To compare the quality of maps generated by the AI with those produced through a traditional method, aiming to identify strengths and limitations.
GWIS provides data on wildfire trends, the geographic distribution of fires, and burned areas at country and sub-national level for all countries globally. The choropleth map case study uses average burned area data from 2002 to 2022 from GWIS. The data visualized on the maps, such as Yearly Burned Area, Average Monthly Burned Area by Landcover, Fire Size, and Carbon Monoxide emission, are also downloaded directly from the GWIS Country Profile application in CSV format. Burned area values are based on the MODIS MCD64A1 product. Average Monthly Fire Size indicates the monthly fire size per administrative area and year, showing the total burned area per fire size class for each month. Fire size (ha) comes from the GlobFire database, and the CO emission is derived from the Global Fire Emissions Database (GFED).
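As an illustration of this input, the GWIS CSV exports can be loaded with pandas before being passed to the AI. The sketch below is minimal and the file and column names are assumptions, not the actual GWIS export schema.

```python
import pandas as pd

# Hypothetical file and column names; the actual GWIS Country Profile export may differ.
burned = pd.read_csv("gwis_portugal_burned_area_2002_2022.csv")

# Average yearly burned area over the study period.
yearly_total = burned.groupby("year")["burned_area_ha"].sum()
print(f"Average yearly burned area 2002-2022: {yearly_total.mean():,.0f} ha")
```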
The Fire Information for Resource Management System (FIRMS) provides Near Real-Time (NRT) active fire data. The Moderate Resolution Imaging Spectroradiometer (MODIS) aboard the Terra and Aqua satellites and the Visible Infrared Imaging Radiometer Suite (VIIRS) are the foundational satellite data sources for detecting active fires and thermal anomalies (FIRMS, 2024). The graduated symbol and dot density maps in this study visualize active fire information from MODIS for 2022. The data were acquired from NASA’s Fire Information for Resource Management System (FIRMS) in shapefile format.
The data are preprocessed in ArcGIS Pro in shapefile format and used as the input for ChatGPT-4. The preliminary outputs of the three case studies are designed and tested in ChatGPT-4 at the beginning of the research to ensure the feasibility of the AI model and the Python libraries. Each thematic method is iterated five times with the same or similar prompt, and the outputs of the five maps are then evaluated in the next step. The interactive maps are executed within Visual Studio Code.
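A minimal sketch of how the preprocessed layers can be checked with GeoPandas before they are handed to ChatGPT-4; the file names are illustrative assumptions.

```python
import geopandas as gpd

# Illustrative file names for the preprocessed ArcGIS Pro outputs.
fires = gpd.read_file("modis_active_fires_portugal_2022.shp")
districts = gpd.read_file("portugal_districts.shp")

# Basic sanity checks: matching coordinate reference systems and no empty geometries.
assert fires.crs == districts.crs, "Reproject one layer before mapping"
print(len(fires), "fire detections;", fires.geometry.is_empty.sum(), "empty geometries")
```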
For the mapping process with the basic prompt pattern, each map element is developed by one prompt at a time. With the advanced prompts, the Cognitive verifier can generate more than one map element at a time because it provides three additional questions related to the original requirements. The Question refinement pattern is used in the last step to adjust specific details.
The basic prompt, or direct instruction pattern, is a method that directly instructs the model to follow an instruction without providing any examples. The instruction should be explicit and clear in order to obtain precise and accurate results. This pattern is also known as zero-shot prompting in prompt engineering.
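For illustration, a basic (zero-shot) prompt in this setting might look like the following; the wording is an assumption and not the exact prompt used in the thesis.

```python
# Illustrative zero-shot prompt; not the thesis's exact wording.
basic_prompt = (
    "Create a static choropleth map of the average burned area per Portuguese "
    "district from the attached shapefile using GeoPandas, and add a title, "
    "a legend, and a north arrow. Return the complete Python code."
)
```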
This study applies the Cognitive verifier and Question refinement patterns to enhance the quality of the outputs and the level of detail in the prompts, with the aim of reducing the user’s effort in creating a map. The Cognitive verifier divides the user’s command into related sub-questions; the LLM is then able to combine the user’s answers and process them into the final outputs. The Question refinement pattern is used for refining map details such as color, placement, and text, including generating map compositions. The advanced capabilities of these prompt patterns enable them to provide refined prompts beyond simple text.
The initial prompt assigns contextual statements to the advanced prompts; a contextual statement describes how the user and the LLM will communicate within a prompt. For the Cognitive verifier, the LLM is asked to generate three additional questions and, once it receives the answers, to combine them to produce a map. The contextual refinement of the Question refinement pattern states that whenever the LLM is asked to adjust a map, it should suggest a better version of the prompt based on the original prompt.
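The two contextual statements might be phrased roughly as follows; these are paraphrases based on the descriptions above, not the exact prompts used in the experiments.

```python
# Paraphrased contextual statements for the two advanced prompt patterns.
cognitive_verifier = (
    "When I ask you to create a map, first ask me three additional questions "
    "that would help you produce a better map. After I answer them, combine my "
    "answers with the original request and generate the Python code for the map."
)

question_refinement = (
    "Whenever I ask you to adjust the map, first suggest a better, more specific "
    "version of my prompt (covering colour, placement, and text) and ask whether "
    "you should use it."
)
```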
The maps from the traditional method in the last stage are generated using ArcGIS Pro, and Flourish is used for data visualization. The human-generated maps serve as the reference for the suitability criteria. The reference consists of the most appropriate map specifications according to cartographic rules. By comparing the maps against these specifications, their strengths and weaknesses can be assessed. The more a map complies with the specifications, or benchmark, the better its quality (Liesbeth & Philippe, n.d.). The outputs of the AI and traditional methods are then evaluated according to three suitability levels.
Using the advanced prompt patterns to create static choropleth maps achieves more elements than for the interactive maps. The common issue is map labels: with advanced prompts, 3 out of 5 map labels were completed on the interactive maps, whereas the static version had only one unsuccessful attempt. Conversely, using the basic prompt for static maps produces more errors or incomplete outputs than the advanced prompt pattern.
The main issues with the graduated symbols map are the legend and the map field: the basic prompt could not create a correct legend or map field in any of the five static maps. In contrast, the advanced prompts created most of the map fields and legends successfully, with only one legend and one map field incomplete in the static maps. Likewise, the basic prompt returned 4 out of 5 incomplete legends in the interactive maps, while the advanced prompt had only one incorrect legend.
The fundamental map element that could not be achieved in either the static or interactive maps is the legend. In both map versions, the AI failed to generate this element three times with basic prompts and twice with advanced prompts. Incomplete layer controls are also commonly found across all three thematic methods, leading to unsuccessful interactive outputs; this can be solved by using the advanced prompts.
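Since the thesis builds the interactive maps with Folium, the missing layer control usually comes down to a single omitted call. The sketch below shows the pattern; the layer names, coordinates, and output file are illustrative assumptions.

```python
import folium

# Minimal layer-control sketch in Folium; names and coordinates are illustrative.
m = folium.Map(location=[39.5, -8.0], zoom_start=6)
fires_layer = folium.FeatureGroup(name="Active fires 2022").add_to(m)
folium.CircleMarker(location=[39.7, -8.1], radius=4).add_to(fires_layer)
folium.LayerControl().add_to(m)  # without this call the layer toggle never appears
m.save("interactive_map.html")
```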
To summarize, the number of incomplete map compositions for the dot density maps is still lower than for the other thematic methods, with either the basic or the advanced prompt.
For the choropleth maps, the basic prompt pattern requires more attempts across all five iterations. All map compositions created with the basic prompt show a wider range of attempts, indicating high variability. However, the total number of basic prompts needed to create the map fields over the five sessions is the same as for the advanced prompts.
For creating the map field of the graduated symbols maps, the total number of attempts required by the basic and advanced prompts is similar. A graduated symbols map has several complex elements to consider, such as proper graduated circle sizes and reclassification. Therefore, using complex instructions may not satisfy all requirements at once.
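A sketch of the two steps mentioned above, reclassifying the fire attribute and mapping each class to a circle size with GeoPandas; the file name, attribute name, class breaks, and symbol sizes are all assumptions.

```python
import pandas as pd
import geopandas as gpd
import matplotlib.pyplot as plt

# Assumed file, attribute, class breaks, and symbol sizes.
fires = gpd.read_file("modis_active_fires_portugal_2022.shp")
bins = [0, 10, 100, 1000, float("inf")]      # illustrative size classes (ha)
sizes = {0: 10, 1: 40, 2: 90, 3: 160}        # marker areas in points^2 per class

fires["size_class"] = pd.cut(fires["fire_size_ha"], bins=bins, labels=False)
ax = fires.plot(markersize=fires["size_class"].map(sizes), alpha=0.6, color="orangered")
ax.set_title("Graduated symbols by fire size class (sketch)")
plt.show()
```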
In total, the advanced prompt requires fewer attempts than the basic prompt, which means the advanced prompt facilitates creating a dot density map and reduces the number of iterations needed to achieve the desired outputs.
Advanced prompts generally result in fewer attempts across all map types, but the distributions are more varied and less consistent than for the static maps. For the interactive choropleth map, the map field of the basic prompt has a similar range to the advanced pattern, with a slightly higher distribution.
The range of attempts for creating the interactive graduated symbols maps is larger than for the choropleth and dot density maps. The map field and legend created with advanced prompts require fewer attempts than with the basic prompt in every iteration.
The dot density map has a similar range in the number of attempts between the basic and advanced prompts.
The charts reveal patterns in how the two types of prompts lead to incorrect results generated by the AI. Such results indicate that the model’s limited cartographic knowledge leads to failures in creating map outputs.
The advanced prompt pattern produces fewer incorrect results in total than the basic one. However, across all five iterations, the choropleth map field shows the same total number of errors for both prompt types. Legends are more accurate with the advanced prompts and contain fewer mistakes than with the basic prompt.
Similarly, the graduated symbols legend produces significantly fewer errors with the advanced prompts than with the basic prompt. For the map field of the graduated symbols map, the number of code errors is about the same for the two prompt types.
For the dot density map, both prompt types have a similar number of incorrect results, with the basic prompts having slightly more.
In the interactive choropleth maps, advanced prompts lead to more incorrect results across all five iterations. The map field generated by both prompt types shows minimal differences, with even slightly more errors from the advanced prompts.
Conversely, the advanced prompts for the graduated symbols maps produce fewer incorrect results in the map field and legend components.
For the interactive dot density map, the distribution of incorrect results is similar to the static version, but the errors in the map fields are not reduced by the advanced prompt.
In conclusion, ChatGPT-4 produces more errors or hallucinated results in complex components; both static and interactive maps show a similar distribution across the three thematic maps. The use of advanced prompts consistently leads to fewer incorrect results for some map elements, particularly data visualization and legends.
The main error that regularly occurs in the experiments is the Error Analyzing issue. The error is potentially caused by model bias in the training data, complex datasets, and limited data handling capability. Assessing this technical issue helps to understand the limitations of the ChatGPT-4 model in processing spatial data and its performance in creating maps.
The Error Analyzing issue consistently occurs when creating the map field and data visualization; it appears when the AI cannot process, plot, or load the given files.
When comparing map fields to data visualizations, errors are more prominent in the data visualizations across both static and interactive maps. The advanced prompts generally reduce the errors, but not uniformly: there are more errors in the data visualization of the interactive choropleth map and the static graduated symbols maps.
ChatGPT-4 performs well in generating maps but occasionally produces intermediate or unsuitable results. Among the ten maps from the two prompt patterns, the choropleth maps are mostly at the intermediate level, especially the legends, maps, and charts. Some elements also reach the least appropriate level, such as scales, labels, legends, and map fields. Moreover, the legend colors sometimes differ slightly from the map, causing those legends to be assessed as intermediate. For the map field, the data are incorrectly classified into specific ranges; when the map cannot be classified into the intended colors or ranges, the map field is assessed as less suitable.
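One way to avoid the classification mismatch described above is to pin the class breaks explicitly so the legend and the polygons share the same classes. Below is a sketch with GeoPandas and mapclassify; the file, column name, and class breaks are assumptions.

```python
import geopandas as gpd

# Assumed file, column name, and class breaks; requires the mapclassify package.
districts = gpd.read_file("portugal_districts_burned_area.shp")
ax = districts.plot(
    column="avg_burned_ha",
    scheme="UserDefined",
    classification_kwds={"bins": [500, 2000, 5000, 10000]},
    cmap="OrRd",
    edgecolor="grey",
    legend=True,
)
ax.set_title("Average burned area 2002-2022 (sketch)")
```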
The experiment reveals mixed map quality across all three levels, especially for the legends and map fields, which are mostly evaluated as least suitable. This is because the initial results from the AI often return a choropleth or proportional symbol map instead. Moreover, the legends hardly correspond with the symbols on the map. According to the completeness results, the AI could not successfully generate 5 out of the 10 maps to specification. The low quality of this thematic map points to the limitations of the training data in ChatGPT-4’s model.
For the dot density map, ChatGPT-4 performs well in creating the map elements; the map fields are generated successfully and reach the most suitable level in all 10 maps for both basic and advanced prompts. However, some elements are at the intermediate level, namely the legend, scale bar, and subtitle. The legend of a dot density map is not as complicated to create as for the other maps; the problem is that most legends are evaluated at the intermediate level because the dots in the legend have sizes, colors, and opacity inconsistent with the map symbols. The AI code customizes the point size according to pixel width and height.
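A sketch of one way to keep the dot symbology and the legend consistent is to define the style once and reuse it for both; the file name, dot ratio, and styling values are assumptions. Note that GeoPandas markersize is an area in points squared while a legend marker size is a diameter in points, which is exactly the kind of mismatch described above.

```python
import geopandas as gpd
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D

# Assumed file name and styling; one style dict drives both the map dots and the legend.
dot_style = {"color": "firebrick", "markersize": 4, "alpha": 0.7}

fires = gpd.read_file("modis_active_fires_portugal_2022.shp")
ax = fires.plot(**dot_style)

# markersize above is an area (points^2); Line2D markersize is a diameter (points),
# hence the square root when building the matching legend handle.
handle = Line2D([], [], linestyle="", marker="o",
                color=dot_style["color"], alpha=dot_style["alpha"],
                markersize=dot_style["markersize"] ** 0.5,
                label="1 dot = 1 fire detection")
ax.legend(handles=[handle], loc="lower right")
plt.show()
```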
The aims of this thesis are to evaluate the capability and learning ability of ChatGPT-4 in producing both static and interactive maps, to evaluate different prompt patterns that influence map outputs, and to assess the quality of maps generated by the AI against those produced through a traditional method. Making a map with GIS software can be complicated for non-experts, and the software can be costly, which motivates the exploration of alternative approaches. AI has increasingly been applied in the field of cartography; however, its accuracy remains to be evaluated and improved. This leads to the assessment of the capability and accuracy of AI in creating maps, as well as the map quality compared to a map created by the traditional method. This study uses ChatGPT-4 to create thematic maps for the case study of wildfire events in Portugal; the AI model generates a choropleth, a graduated symbols, and a dot density map.
A prompt is the main tool for communicating with ChatGPT-4, and the maps are visualized from the AI-generated code. This leads to the first evaluation: the completeness of the map compositions between the basic and advanced prompt patterns. The study reveals that ChatGPT-4 can achieve all the map compositions according to cartographic rules. Regarding the prompt patterns, the advanced prompts generally create more of the map compositions successfully than the basic prompts, particularly for complex elements such as legends, map fields, and data visualizations. However, both prompt types still face challenges, particularly in maintaining consistency and achieving complete map compositions across all map types.
The number of attempts is a factor in evaluating how map results are affected by the different types of prompts. The advanced prompts generally reduce the number of attempts for most elements, and their effectiveness is more pronounced in complex scenarios such as creating interactive graduated symbols maps. However, when considering only the map fields of the choropleth and dot density maps, the basic and advanced patterns do not differ greatly in the average number of attempts.
AI-generated maps using ChatGPT-4 can contain hallucinated or incorrect outputs due to limited cartographic knowledge. The advanced prompts reduce the number of incorrect results for certain map elements, but they do not consistently improve all components. Considering the map fields over five iterations, the advanced prompt returns more inaccurate results for the interactive maps, while for the static maps there are no significant differences.
Another factor in evaluating the capability of ChatGPT-4 is the "Error Analyzing" issue, which highlights the limitations of the AI in processing spatial data, since the error often appears when manipulating the given shapefiles and CSV files. The "Error Analyzing" issue in ChatGPT-4 significantly impacts two elements: the map field and the data visualization. For static maps, the advanced prompt generally reduces the number of errors, but it does not eliminate the errors in the interactive maps, which means the prompts for the interactive version need further refinement to solve the data processing issues.
The last stage of the thesis aims to evaluate the map quality of the AI and traditional methods. The AI-generated maps were assessed against the suitability criteria, categorized as most suitable, intermediate, and least suitable. ChatGPT-4 shows potential in map generation but still requires improvement to match the quality and flexibility of traditional methods, such as those provided by GIS software like ArcGIS Pro. The results of the case study indicate that ChatGPT-4 needs further refinement in thematic mapping and visual representation to achieve cartographic standards comparable to human-generated maps. The choropleth and dot density maps are the most suitable, and the graduated symbols map is the least suitable compared to the reference criteria. ChatGPT-4 produces maps with inconsistent quality in complex map elements such as legends, map fields, and data visualization. Therefore, the capability of ChatGPT-4 at the time of this study requires more cartographic knowledge; the traditional mapping method with ArcGIS still outperforms it, being more accurate and consistent, owing to predefined functionalities and better handling of symbology and visualization.
In conclusion, this study reveals the potential of ChatGPT-4 in the field of cartography and GIS but also highlights several limitations. ChatGPT-4 is useful for a basic map that does not contain many elements, such as plotting an overview visualization of the data. The results can be improved using the prompt patterns presented in this thesis. The thesis can serve as a guideline for further studies on ChatGPT-4’s functionality in map creation, and the results provide insights into the strengths and weaknesses of AI in cartography. In addition, the map outputs based on GeoPandas and Folium pave the way for further visual and mapping improvements in the future development of these libraries.