Add an image: Pop open device keyboard and focus on the caption input when arriving on the step to add a caption
This requirement was descoped from iteration 1 on T290781 and carried over to this task:

Focus on input: Unless caption onboarding is shown, focus should be set on the caption input when the user arrives on the caption step, so that the device keyboard opens.
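
A minimal sketch of the intended behaviour, for illustration only. The names `Focusable`, `CaptionStepState`, `onboardingVisible`, and `focusCaptionInput` are hypothetical and do not come from the existing codebase. It assumes the caption step knows whether onboarding is currently shown, and that the caption input exposes a standard `focus()` method, as DOM input elements do.

```typescript
// Hypothetical minimal shape of the caption input.
// A real HTMLInputElement or HTMLTextAreaElement satisfies this interface.
interface Focusable {
  focus(): void;
}

// Hypothetical state of the caption step (assumption, not the real data model).
interface CaptionStepState {
  onboardingVisible: boolean;
}

// Focus the caption input on arrival at the caption step,
// except while caption onboarding is being shown.
// Returns true if focus was requested, false if it was skipped.
function focusCaptionInput(input: Focusable, state: CaptionStepState): boolean {
  if (state.onboardingVisible) {
    // Onboarding takes priority; leave the keyboard closed.
    return false;
  }
  input.focus();
  return true;
}
```

Note that on some mobile browsers a programmatic `focus()` call may only open the on-screen keyboard when it runs as part of a user-initiated event (for example, the tap that advances to the caption step), so the call site may matter as much as the condition.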