We propose and demonstrate a compressive temporal imaging system based on pulsed illumination to encode temporal dynamics into the signal received by the imaging sensor during exposure time. Our approach enables >10x increase in effective frame rate without increasing camera complexity. To mitigate the complexity of the inverse problem during reconstruction, we introduce two keyframes: one before and one after the coded frame. We also craft what we believe to be a novel deep learning architecture for improved reconstruction of the high-speed scenes, combining specialized convolutional and transformer architectures. Simulation and experimental results clearly demonstrate the reconstruction of high-quality, high-speed videos from the compressed data.