Supervised learning methods for distinction of cattle production type (#204)
Machine learning comprises a set of methods for building and evaluating algorithms that perform pattern recognition, classification, and prediction based on models derived from existing data. Such data can contain characteristic patterns that are used to classify observations into groups. The genetic structure of two Pinzgau cattle populations was analysed and used as a model for supervised learning with different statistical methods. Supervised learning is a suitable tool for hypothesis testing using genetic data. Using supervised learning allowed us to clearly distinguish between animals of the milk and beef production types. The results of this study show that unknown samples can be classified according to their genetic data. At the DNA level, it is still possible to assign animals to separate populations. The model is also useful for classification at many other levels, such as country of origin, breeding system, herd, and others. The result of the analysis is a pattern that can be used to identify new data sets without needing the input data from which the pattern was created. An important requirement in this process is careful data preparation, validation of the model used, and its appropriate interpretation. For breeders, it is important to know the origin of animals with respect to genetic diversity. Where pedigree information is missing, other methods can be used to trace an animal's origin. The genetic diversity recorded in genetic data holds relatively useful information for identifying the production type of individual animals. It can be concluded that applying data mining to molecular genetic data using supervised learning is an appropriate tool for hypothesis testing and for identifying individual animals.
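
The workflow described above (prepare genotype data, train a classifier on animals of known production type, validate it on held-out animals) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the genotypes are simulated rather than taken from the Pinzgau data, the marker count and allele frequencies are hypothetical, and a nearest-centroid classifier with leave-one-out validation stands in for the statistical methods used in the study, which the abstract does not name.

```python
import random

def simulate_genotypes(n_animals, allele_freqs, rng):
    """Draw genotypes coded 0/1/2 (copies of one allele) for each marker."""
    return [
        [sum(rng.random() < p for _ in range(2)) for p in allele_freqs]
        for _ in range(n_animals)
    ]

def centroid(rows):
    """Mean genotype per marker across a group of animals."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def predict(sample, centroids):
    """Return the label whose centroid is nearest (squared Euclidean distance)."""
    def dist(label):
        return sum((a - b) ** 2 for a, b in zip(sample, centroids[label]))
    return min(centroids, key=dist)

def leave_one_out_accuracy(samples, labels):
    """Hold out each animal in turn, fit centroids on the rest, and score it."""
    correct = 0
    for i, (sample, label) in enumerate(zip(samples, labels)):
        train = [(s, l) for j, (s, l) in enumerate(zip(samples, labels)) if j != i]
        centroids = {
            lab: centroid([s for s, l in train if l == lab])
            for lab in set(l for _, l in train)
        }
        correct += predict(sample, centroids) == label
    return correct / len(samples)

rng = random.Random(0)  # fixed seed so the simulation is repeatable
n_markers = 50          # hypothetical number of markers

# Hypothetical allele frequencies that differ between the two production types.
freq_milk = [rng.uniform(0.1, 0.9) for _ in range(n_markers)]
freq_beef = [min(0.95, max(0.05, p + rng.choice([-0.3, 0.3]))) for p in freq_milk]

# Simulated animals of known production type (the supervised training labels).
samples = simulate_genotypes(30, freq_milk, rng) + simulate_genotypes(30, freq_beef, rng)
labels = ["milk"] * 30 + ["beef"] * 30

accuracy = leave_one_out_accuracy(samples, labels)
print(f"Leave-one-out accuracy: {accuracy:.2f}")
```

Because each animal is scored by a model that never saw it, the reported accuracy estimates how well an unknown sample would be classified, which is the validation step the text calls for. With real data, the simulated genotypes would be replaced by the observed marker data.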