Poster
Learning Extrapolative Sequence Transformations from Markov Chains
Sophia Hager · Aleem Khan · Andrew Wang · Nicholas Andrews
Most successful applications of deep learning involve similar training and test conditions. However, tasks such as biological sequence design involve searching for sequences that improve desirable properties beyond previously known values, which requires novel hypotheses that extrapolate beyond the training data. In these settings, extrapolation may be achieved with random search methods such as Markov chain Monte Carlo (MCMC), which, given an initial state, sample local transformations from a proposal distribution. However, even with a well-designed proposal, MCMC may struggle to explore structured state spaces efficiently. Rather than relying on stochastic search, it would be desirable to have a model that greedily optimizes the properties of interest, successfully extrapolating in as few steps as possible. We propose to learn such a model from the Markov chains produced by MCMC search. Specifically, our approach uses selected states from these chains as training data for an autoregressive inference network, which can then efficiently generate novel sequences at test time that extrapolate along the sequence-level properties of interest. The proposed approach is validated on three problems: protein sequence design, text sentiment control, and text anonymization. We find that the learned inference network confers many of the same generalization benefits as the slow sampling process (and sometimes better ones), with the added benefit of high sample efficiency.
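The abstract describes a two-stage pipeline: run MCMC chains that apply local edits guided by a sequence-level property, then use selected improved states from those chains as supervision for a fast autoregressive inference network. The sketch below illustrates only the data-generation stage on a toy problem; the alphabet, property function, proposal, acceptance rule, and all names (property_score, propose_edit, run_chain, harvest_pairs) are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch: harvest MCMC trajectories as training pairs for an
# extrapolative sequence model. Everything here is a toy stand-in.
import math
import random

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # e.g., amino acids for a toy protein task

def property_score(seq: str) -> float:
    """Toy sequence-level property to be improved (stand-in for a real oracle)."""
    return sum(1.0 for a, b in zip(seq, seq[1:]) if a == b)

def propose_edit(seq: str) -> str:
    """Local transformation: substitute a single randomly chosen position."""
    i = random.randrange(len(seq))
    return seq[:i] + random.choice(ALPHABET) + seq[i + 1:]

def run_chain(init: str, steps: int = 500, temperature: float = 0.5) -> list[str]:
    """Metropolis-Hastings over sequences, biased toward higher property values."""
    current, trajectory = init, [init]
    for _ in range(steps):
        candidate = propose_edit(current)
        delta = property_score(candidate) - property_score(current)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta >= 0 or random.random() < math.exp(delta / temperature):
            current = candidate
        trajectory.append(current)
    return trajectory

def harvest_pairs(trajectory: list[str], margin: float = 2.0) -> list[tuple[str, str]]:
    """Select (source, improved) pairs from a chain; these would supervise an
    autoregressive inference network mapping a sequence to a better one."""
    src = trajectory[0]
    return [(src, s) for s in trajectory
            if property_score(s) >= property_score(src) + margin]

if __name__ == "__main__":
    random.seed(0)
    init = "".join(random.choice(ALPHABET) for _ in range(30))
    pairs = harvest_pairs(run_chain(init))
    if pairs:
        best = max(property_score(t) for _, t in pairs)
        print(f"{len(pairs)} training pairs; best property {best:.1f} "
              f"vs. initial {property_score(init):.1f}")
    else:
        print("no sufficiently improved states; run a longer chain or lower the margin")
```

In the setting the abstract describes, pairs harvested in this way would train a sequence model so that, at test time, it proposes property-improving rewrites directly instead of relying on a long chain of stochastic edits.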