Generative models have become an area of great importance in recent years, owing to their ability to learn a probabilistic distribution from an input data set. To date, these models have been explored mainly for image generation and much less in the musical domain, where they are a natural fit, since music is rich in structured information that such models can learn. In this paper we present an analysis of two case studies of generative models based on deep convolutional networks. We study their ability to generate symbolic music for one or more instruments in the pianoroll format, whether it is possible to condition the output to display characteristics of different composers or genres, and how controllable the generated results are. We evaluate both models using the Fréchet Inception Distance (FID), a metric for generative image models, in addition to musical metrics that we define. One of the case studies uses the recently developed StyleGAN 2 model. Applying this type of architecture in a non-visual domain is novel, and we present interesting results both in terms of FID and in qualitative musical terms. Although this model was designed for a visual domain, to generate high-quality images, it can be adapted to a completely different context. Moreover, it has properties that are of interest to the area of musical composition, such as a disentangled latent space, in which it is easy to explore different musical ideas, and a conditional input that gives further control over the model's output. We believe that the results shown in this work are a step forward in understanding how to build better generative models in the symbolic music domain, taking into account conditionality and controllability to develop better tools for end users.
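For readers unfamiliar with the FID metric mentioned above: it measures the Fréchet distance between two multivariate Gaussians fitted to feature statistics of real and generated samples, FID = ||μ₁ − μ₂||² + Tr(Σ₁ + Σ₂ − 2(Σ₁Σ₂)^½). The sketch below is a minimal NumPy illustration of that formula (function names are ours, not from the thesis; in practice the statistics come from an Inception-style feature extractor):

```python
import numpy as np

def _sqrtm_psd(m):
    """Matrix square root of a symmetric positive semi-definite matrix
    via eigendecomposition (clipping tiny negative eigenvalues)."""
    vals, vecs = np.linalg.eigh(m)
    vals = np.clip(vals, 0.0, None)
    return (vecs * np.sqrt(vals)) @ vecs.T

def fid(mu1, sigma1, mu2, sigma2):
    """Fréchet distance between N(mu1, sigma1) and N(mu2, sigma2).

    Uses Tr(sqrtm(S1 @ S2)) == Tr(sqrtm(S1^1/2 @ S2 @ S1^1/2)),
    which keeps the computation in symmetric PSD matrices.
    """
    s1_half = _sqrtm_psd(sigma1)
    covmean = _sqrtm_psd(s1_half @ sigma2 @ s1_half)
    diff = mu1 - mu2
    return float(diff @ diff + np.trace(sigma1 + sigma2) - 2.0 * np.trace(covmean))

# Two 1-D Gaussians: FID = (0-1)^2 + 1 + 4 - 2*sqrt(1*4) = 2.0
print(fid(np.array([0.0]), np.array([[1.0]]),
          np.array([1.0]), np.array([[4.0]])))
```

Identical distributions give a distance of (numerically) zero, which is a useful sanity check before comparing real and generated pianoroll statistics.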
How to join?
The defense will be in Spanish. If you want to attend, please send an email to Manuel (firstname.lastname@example.org) before the defense.