I don't think there is any single way to do things when it comes to working with language models or image models. Here's what I thought worked for me. I'd love to hear your experiences in the comment section at the bottom.
I used Google’s latest image generation model, Nano Banana, to create images for my revamped personal website. I learned some lessons along the way and figured I’d share them with you.
Use GPT 5 to brainstorm.
Have it write a super detailed prompt. You could probably use Claude or Gemini too, but I didn’t try them.
Then, stick that prompt into Gemini and use Nano Banana in tools.
Make a couple of attempts, till you get something that needs fine-tuning, not big changes. If you’ve done the earlier steps well, this shouldn’t take long. The fine-tuning’s the hard part.
Now, once you’ve reached the fine-tuning stage, DO NOT add a new message to the conversation. In my very brief experience, it just kept giving me the same image or odd results. Instead, first download the image if it’s almost what you wanted. Do this because it’s going to go away once you do what I suggest next.
Now. Go BACK to your old prompt and edit that one. DO NOT add another message to the thread. If there was a detail you didn’t want, say you don’t want that. Then, have it rerun the edited prompt with your newly added text.
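If it helps to picture the difference, here's a rough sketch in Python. It is not how Gemini works under the hood, just a toy model that treats a chat thread as a list of messages. The function names `append_followup` and `edit_and_rerun` are made up for illustration.

```python
# Toy model of a chat thread: a list of (role, text) tuples.
# Nothing here calls Gemini; every name is hypothetical.

def append_followup(thread, followup):
    """What NOT to do: pile a new message onto the existing thread."""
    return thread + [("user", followup)]

def edit_and_rerun(thread, index, extra_text):
    """Edit the user prompt at `index` and drop everything after it."""
    role, text = thread[index]
    if role != "user":
        raise ValueError("can only edit a user prompt")
    edited = (role, text + " " + extra_text)
    return thread[:index] + [edited]

thread = [
    ("user", "Photorealistic portrait of a laptop with dashboards on it."),
    ("model", "<image 1>"),
]

appended = append_followup(thread, "Move me to the center.")
edited = edit_and_rerun(thread, 0, "Place the person in the center.")

print(len(appended))  # 3: the old prompt and old image are still in the thread
print(len(edited))    # 1: only the edited prompt remains, ready to rerun
```

Notice that the edited thread no longer contains the old image reply, which is why you want to download anything worth keeping before you edit.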
Here is the prompt I got from GPT 5.
"Photorealistic portrait of a laptop With dashboards on it. The laptop displays a SaaS analytics dashboard. Lighting: soft natural key light from camera-left, subtle cool rim light from behind to separate subject from background. Lens feel: 85mm equivalent, shallow depth of field (f/1.8–2.2) — crisp, gently blurred background, with me in the background. Scene specifics: modern office background, out-of-focus person (me) slightly to the subject's right (background). Expression: confident, composed, slight closed-mouth smile, direct eye contact. hands folded in front of chest. Keep the hair mostly the same, just less messy a bit more professional. High detail, photoreal finish as above. " Negative / do-not: "No logos on clothing or background, no exaggerated smile, no cartoonish or painterly effects, no heavy makeup, no dramatic film noir lighting. Don't change hairstyle drastically"
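The prompt above follows a loose pattern: a subject line, a few labelled sections (lighting, lens, scene), and a do-not list at the end. Here's a small Python sketch that assembles a prompt in that shape. The function `build_prompt` and its parameter names are assumptions for illustration, not something GPT 5 or Gemini requires.

```python
def build_prompt(subject, sections, negatives):
    """Join a subject, labelled sections, and a do-not list into one prompt."""
    parts = [subject.strip().rstrip(".") + "."]
    for label, detail in sections.items():
        parts.append(f"{label}: {detail.strip().rstrip('.')}.")
    prompt = " ".join(parts)
    if negatives:
        prompt += " Negative / do-not: " + ", ".join(negatives) + "."
    return prompt

prompt = build_prompt(
    subject="Photorealistic portrait of a laptop with dashboards on it",
    sections={
        "Lighting": "soft natural key light from camera-left",
        "Lens feel": "85mm equivalent, shallow depth of field",
        "Scene specifics": "modern office background",
    },
    negatives=[
        "no logos on clothing or background",
        "no cartoonish or painterly effects",
    ],
)
print(prompt)
```

Keeping the sections separate like this makes it easier to change one detail, say the lighting, without rewriting the whole prompt.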
The less-than-photogenic original photo. I suggest using a more... well-kept look.
Results
Attempt number 1.
Attempt number 2, having edited the same prompt. See how close to each other they are. I just moved myself to the center.
Now that it had my face down, I continued the thread by drafting a new message to make images of me doing something else. It did an excellent job of retaining my face and, unfortunately, my hair.
Here’s what not to do
Do not upload a lot of images of yourself in one message. Like below.
And, do not be lazy with prompting.
Create a photorealistic image of me (in the attached photos) as if I were mid-speech. with multiple speech bubbles originating from my mouth. Each speech bubble has the icons of a different ai software. Icons are attached here.
Because, here is the output.
And don’t double down by having it make incremental changes to the same bad output.
You made no change to the image! Change the angle so that I am facing the camera bit more. Not fully front facing but not fully side ways either. Put my face to the bottom corner. A little bit of my neck and shoulder need to be visible but no need for the rest of the body.
And don’t get mad at it, lol.
ANd for heaven's sake. Just use the icons exactly the way I've given you.
I sent a new message with a photo. That improved it a bit, but the next three prompts with incremental changes yielded the exact same photo.