
Summary


The main goal of this diploma thesis was to apply geoinformatics approaches to research on the diversity of the wild precursors of chickpeas, lentils and peas; in other words, to use spatial data to determine the influence of environmental factors on the occurrence of the studied plants. The theoretical part covers the studied plants, the Fertile Crescent and the current state of the problem. The practical part covers data acquisition and processing, the statistical analyses used to determine the influence of environmental factors on the occurrence of the studied plants, and the interpretation of the results. Two types of data were used. The first is plant data provided by the Department of Botany, UP, in tabular form with the location of each plant. These data had to be filtered and processed into a form suitable for further analysis. The second is environmental data: data sets from the Google Earth Engine platform, bioclimatic variables from the WorldClim data set and SRTM30 altitude data. A significant portion of the practical part deals with obtaining spatial data from Google Earth Engine, and two scripts were created to simplify this task.
After the data had been processed into a suitable form, the analytical part followed. It had two main goals: to find the environmental factors that most influence the occurrence of the studied plants, and to create plant clusters based on the selected factors. Statistical methods were used to achieve both goals. The environmental factors were identified separately for each plant species, using principal component analysis in combination with a correlation matrix.
For each plant species, the factors selected were those that are not correlated with one another and at the same time have the greatest influence on the data. Finding these factors was necessary to ensure the correctness of the next step, because clustering on many mutually dependent factors can significantly skew the results. The k-means method was chosen for clustering, and for each plant species clusters of plant occurrence points were created from the selected environmental factors.
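The factor selection and clustering steps described above can be sketched roughly as follows. This is a minimal illustration, not the thesis's implementation. The site values are invented, the `ranking` list is assumed to come from PCA loadings (which are not computed here), the 0.7 correlation threshold and the deterministic farthest-point initialisation of k-means are choices made for this sketch, and only the Python standard library is used to keep it self-contained.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def select_uncorrelated(columns, ranking, threshold=0.7):
    """Walk the factors in ranking order and keep a factor only if its
    absolute correlation with every already kept factor is below threshold."""
    kept = []
    for name in ranking:
        if all(abs(pearson(columns[name], columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept

def standardize(values):
    """Rescale to zero mean and unit variance so units do not dominate."""
    n = len(values)
    m = sum(values) / n
    s = sqrt(sum((v - m) ** 2 for v in values) / n)
    return [(v - m) / s for v in values]

def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def nearest(p, centers):
    """Index of the center closest to point p."""
    return min(range(len(centers)), key=lambda c: dist2(p, centers[c]))

def kmeans(points, k, max_iter=100):
    """Lloyd's k-means with a deterministic farthest-point initialisation
    (an assumption of this sketch, chosen for reproducibility)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist2(p, c) for c in centers)))
    for _ in range(max_iter):
        labels = [nearest(p, centers) for p in points]
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return [nearest(p, centers) for p in points], centers

# Invented values for nine hypothetical sites (not data from the thesis).
columns = {
    "bio1": [15.0, 15.2, 14.8, 10.0, 10.3, 9.7, 20.0, 20.2, 19.8],        # annual mean temperature
    "bio12": [900, 910, 890, 400, 410, 390, 405, 395, 400],              # annual precipitation
    "elevation": [1000, 985, 1015, 1500, 1465, 1540, 500, 490, 515],     # metres
}
ranking = ["bio1", "elevation", "bio12"]  # assumed order of PCA importance

selected = select_uncorrelated(columns, ranking)
points = list(zip(*(standardize(columns[name]) for name in selected)))
labels, centers = kmeans(points, k=3)
print(selected)
print(labels)
```

In this toy data, elevation is almost perfectly correlated with `bio1`, so it is dropped before clustering. This is the kind of redundancy the selection step is meant to remove.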
Based on these clusters, it is possible to state for each plant which factors most influence its occurrence. The clusters were visualized in a biplot, which was used to determine how the clusters depend on the given factors, and they were also projected onto a map to show their spatial distribution.
The last step of the practical part evaluates the influence of the positional accuracy of the plant data on the analyses. The evaluation compares highly accurate data with points generated randomly within the error region of less accurate data. For data with a spatial resolution of 1 km, lower positional accuracy (an error of 1.11 km) was found to have almost no effect on the results. For data with a resolution of 30 meters, the effect of the inaccuracy is much more noticeable. The results of this work, together with the spatial environmental data downloaded for it, will serve the staff of the Department of Botany, UP, in their research on the wild precursors of cultivated legumes.
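Generating random points within an error region can be illustrated with the short sketch below. It assumes, purely for illustration, that the error region is a circle of radius 1.11 km around the recorded coordinate and that a flat-Earth conversion of about 111,320 m per degree of latitude is adequate at this scale. The site coordinate, the function names and the number of points are hypothetical.

```python
import math
import random

METERS_PER_DEGREE = 111_320.0  # approximate length of one degree of latitude

def random_points_in_error_disc(lat, lon, radius_m, n, seed=None):
    """Return n (lat, lon) pairs drawn uniformly from a disc of radius_m
    metres around (lat, lon), using a local flat-Earth approximation."""
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        # The square root makes the points uniform over the disc's area
        # rather than concentrated near its centre.
        r = radius_m * math.sqrt(rng.random())
        theta = rng.uniform(0.0, 2.0 * math.pi)
        dy, dx = r * math.sin(theta), r * math.cos(theta)
        new_lat = lat + dy / METERS_PER_DEGREE
        new_lon = lon + dx / (METERS_PER_DEGREE * math.cos(math.radians(lat)))
        points.append((new_lat, new_lon))
    return points

def haversine_m(lat1, lon1, lat2, lon2, earth_radius_m=6_371_000.0):
    """Great-circle distance in metres between two coordinates."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_m * math.asin(math.sqrt(a))

# Hypothetical low-accuracy record; 1110 m corresponds to the 1.11 km error.
site_lat, site_lon = 37.0, 40.0
candidates = random_points_in_error_disc(site_lat, site_lon, 1110.0, n=100, seed=42)
max_offset = max(haversine_m(site_lat, site_lon, la, lo) for la, lo in candidates)
print(f"{len(candidates)} points, largest offset {max_offset:.0f} m")
```

Environmental values sampled at such points could then be compared with the values at the accurately located record, to see how much a raster of a given resolution varies within the error region.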
The clustering results, together with a description of how the resulting clusters depend on specific environmental factors, may assist research on wild plant precursors that aims to increase the genetic diversity of crops through crossing with these wild precursors.