All or almost all modern generative neural networks, collected in one picture. It’s a pity, there aren’t that many image-to-text networks, so there won’t be a lot of text.