What we learned
- Directional cues (“thin out”, “bring in”, “hold”, “suppress”) beat literal timestamps.
- Instrument suppression is as important as instrument addition.
- Thinning before impact makes later arrivals more likely to register.
- Blank-line silence is a valid pacing tool: it “resets” attention.
Practical rule set
- Write instructions like you’re guiding a mix engineer.
- Use “REMOVE/HOLD” to create space before “ADD”.
- Call out one instrument as the anchor per section (bass OR drums OR lead), not everything at once.
- Use short lines. Avoid huge paragraph prompts that lock the model into inertia.
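The rules above can be checked mechanically before a prompt is pasted in. What follows is a minimal sketch, assuming directives are written one per line as `[VERB: target]`. The helper name `lint_directives`, the regular expression, and the 60-character limit are illustrative assumptions, not requirements of the model or of any tool.

```python
import re

# Hypothetical helper: checks one section of directives against the rule set.
# The [VERB: target] line format and the 60-character limit are assumptions.
DIRECTIVE = re.compile(r"^\[([A-Z]+)(?::\s*(.+))?\]$")
SPACE_MAKERS = {"REMOVE", "HOLD"}  # verbs that create space before an ADD


def lint_directives(text, max_len=60):
    """Return a list of warnings for one section of directives."""
    warnings = []
    seen_space_maker = False
    for n, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # blank lines are allowed as a pacing tool
        if len(line) > max_len:
            warnings.append(f"line {n}: longer than {max_len} characters")
        match = DIRECTIVE.match(line)
        if not match:
            warnings.append(f"line {n}: not a [VERB: target] directive")
            continue
        verb = match.group(1)
        if verb in SPACE_MAKERS:
            seen_space_maker = True
        elif verb == "ADD" and not seen_space_maker:
            warnings.append(f"line {n}: ADD before any REMOVE/HOLD")
    return warnings
```

For example, `lint_directives("[ADD: lead]\n[HOLD: drums]")` returns `["line 1: ADD before any REMOVE/HOLD"]`, while the same two lines in the opposite order produce no warnings.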
Example architecture (copy template)
[SECTION — DIRECTIONAL ARRANGEMENT]
[HOLD: drums (tight, minimal)]
[REMOVE: lead synth]
[KEEP: bass (present, steady)]
[THIN: mids (leave air)]
[SUPPRESS: sax, vocals (delayed induction)]
[BUILD]
[ADD: percussion (ghost hits)]
[ADD: texture layer (tape flutter / noise halo)]
[HOLD: kick (do not overdrive)]
[IMPACT]
[ADD: lead instrument (single identity)]
[WIDEN: stereo image]
[RESTORE: mids]
[SUSTAIN: groove]
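To avoid retyping brackets by hand, a template like the one above can be generated from structured data. This is only a sketch: the function names `render_section` and `render_prompt`, the tuple layout, and the blank line placed between sections are all assumptions made for illustration, not part of the original workflow.

```python
# Hypothetical skeleton: names and data layout are illustrative assumptions.
def render_section(title, directives):
    """Render one section as bracketed lines, one directive per line.

    directives is a list of (verb, target, note) tuples; note may be None.
    """
    lines = [f"[{title}]"]
    for verb, target, note in directives:
        detail = f"{target} ({note})" if note else target
        lines.append(f"[{verb.upper()}: {detail}]")
    return "\n".join(lines)


def render_prompt(sections):
    """Join sections with a blank line between them (an assumed pacing choice)."""
    return "\n\n".join(render_section(t, d) for t, d in sections)


prompt = render_prompt([
    ("BUILD", [
        ("add", "percussion", "ghost hits"),
        ("hold", "kick", "do not overdrive"),
    ]),
    ("IMPACT", [
        ("add", "lead instrument", "single identity"),
        ("widen", "stereo image", None),
    ]),
])
print(prompt)
```

Keeping the plan as data makes it easy to reorder directives or swap the anchor instrument without rewriting the whole prompt.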
Why it works
The model appears to treat the lyrics box as a high-level “scene plan”. Given literal timestamps, it often ignores them. Given a sequence of intent, it behaves as if it were following a mix script.
Next iteration
- Turn these directives into a reusable “Prompt Lab Skeleton” file.
- Add a small vocabulary list of reliable verbs: hold, suppress, thin, restore, induce.
- Test if “delayed induction” improves instrument persistence across rerolls.