The model does not follow my instructions when I send an image together with a system prompt. The system prompt tells the model to output only JSON and nothing else, but it still returns a written summary of the image instead of the JSON.
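
To make the failure concrete, here is a minimal sketch of how the "JSON only" requirement could be checked, assuming the model's reply is available as a plain string. `is_json_only` is a hypothetical helper for illustration, not part of any SDK, and it says nothing about why the model ignores the system prompt.

```python
import json


def is_json_only(reply: str) -> bool:
    """Return True if the entire reply parses as one JSON object or array.

    Hypothetical helper: assumes the reply has already been extracted
    from the API response as a string.
    """
    try:
        parsed = json.loads(reply.strip())
    except json.JSONDecodeError:
        # Prose, or JSON with extra text around it, fails to parse.
        return False
    # A bare string or number is valid JSON, but not the structured
    # output the system prompt asks for.
    return isinstance(parsed, (dict, list))


if __name__ == "__main__":
    print(is_json_only('{"objects": ["cat", "sofa"]}'))
    print(is_json_only("The image shows a cat sitting on a sofa."))
```

A check like this separates a reply that is pure JSON from one that is a written summary or JSON wrapped in extra text, which is the behaviour described above.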