We introduce the concept of machine-learning minimal-residual (ML-MRes) finite element discretizations of partial differential equations (PDEs), which resolve quantities of interest with striking accuracy, regardless of the underlying mesh size. The methods are obtained within a machine-learning framework in which the parameters defining the method are tuned against available training data. In particular, we use a provably stable parametric Petrov–Galerkin method that is equivalent to a minimal-residual formulation in a weighted norm. While the trial space is a standard finite element space, the test space has parameters that are tuned in an off-line stage. Finding the optimal test space therefore amounts to obtaining a goal-oriented discretization that is completely tailored towards the quantity of interest. We use an artificial neural network to define the parametric family of test spaces. Using numerical examples for the Laplacian and the advection equation in one and two dimensions, we demonstrate that the ML-MRes finite element method approximates quantities of interest accurately even on very coarse meshes.
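To make the role of the weighted norm concrete, the following is a minimal sketch of a discrete minimal-residual solve. It is an illustration under stated assumptions, not the implementation used in this work. It takes the 1D problem -u'' = 1 on (0, 1) with homogeneous Dirichlet conditions, a coarse P1 trial space, a finer nested P1 test space, and a weighted inner product (v, w) = ∫ ω v' w' dx with a piecewise-constant weight ω. The function names (`stiffness`, `prolongation`, `minres_solve`) and the hand-picked weight are hypothetical. In the ML-MRes setting the weight would instead be produced by a neural network and tuned against training data.

```python
import numpy as np

def stiffness(n, weight=None):
    """Weighted P1 stiffness matrix on a uniform mesh of (0, 1) with n elements.
    Only interior nodes are kept (homogeneous Dirichlet conditions)."""
    h = 1.0 / n
    w = np.ones(n) if weight is None else np.asarray(weight, dtype=float)
    A = np.zeros((n + 1, n + 1))
    local = np.array([[1.0, -1.0], [-1.0, 1.0]])
    for e in range(n):
        A[e:e + 2, e:e + 2] += (w[e] / h) * local
    return A[1:-1, 1:-1]

def prolongation(n_coarse, n_fine):
    """Values of the coarse hat functions at the interior fine nodes.
    Assumes nested meshes, i.e. n_fine is a multiple of n_coarse."""
    x_fine = np.linspace(0.0, 1.0, n_fine + 1)[1:-1]
    x_coarse = np.linspace(0.0, 1.0, n_coarse + 1)
    P = np.zeros((n_fine - 1, n_coarse - 1))
    for j in range(1, n_coarse):
        nodal = np.zeros(n_coarse + 1)
        nodal[j] = 1.0
        P[:, j - 1] = np.interp(x_fine, x_coarse, nodal)
    return P

def minres_solve(n_coarse, n_fine, weight, f=1.0):
    """Minimize the residual of -u'' = f over the coarse trial space,
    measured in the dual of the weighted test-space norm."""
    P = prolongation(n_coarse, n_fine)
    B = stiffness(n_fine) @ P            # B[i, j] = b(trial_j, test_i)
    G = stiffness(n_fine, weight)        # Gram matrix of the weighted inner product
    load = np.full(n_fine - 1, f / n_fine)  # exact load for constant f
    GinvB = np.linalg.solve(G, B)
    # Normal equations: (B^T G^{-1} B) u = B^T G^{-1} load
    return np.linalg.solve(B.T @ GinvB, GinvB.T @ load)

# With a unit weight the method reduces to standard Galerkin on the coarse mesh.
u_unit = minres_solve(2, 8, np.ones(8))
# A non-uniform weight (hand-picked here) changes the coarse solution.
u_weighted = minres_solve(2, 8, np.linspace(1.0, 10.0, 8))
```

With the unit weight, the coarse solution coincides with the standard Galerkin solution. Changing the weight shifts the coefficient at the single interior node, and it is this degree of freedom that an off-line training stage could tune so that a chosen quantity of interest is matched on the coarse mesh.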