A recent novel extension of multioutput Gaussian processes (GPs) handles heterogeneous outputs, assuming that each output has its own likelihood function. It uses a vector-valued GP prior to jointly model all likelihoods' parameters as latent functions drawn from a GP with a linear model of coregionalization (LMC) covariance. By means of an inducing points' framework, the model is able to obtain tractable variational bounds amenable to stochastic variational inference (SVI). Nonetheless, the strong conditioning between the variational parameters and the hyperparameters burdens the adaptive gradient optimization methods used in the original approach. To overcome this issue, we borrow ideas from variational optimization introducing an exploratory distribution over the hyperparameters, allowing inference together with the posterior's variational parameters through a fully natural gradient (NG) optimization scheme. Furthermore, in this work, we introduce an extension of the heterogeneous multioutput model, where its latent functions are drawn from convolution processes. We show that our optimization scheme can achieve better local optima solutions with higher test performance rates than adaptive gradient methods for both the LMC and the convolution process model. We also show how to make the convolutional model scalable by means of SVI and how to optimize it through a fully NG scheme. We compare the performance of the different methods over the toy and real databases.