AI SUBJECT LINE GENERATOR

Transforming email open rates with AI

We transformed Mailchimp's basic AI subject line generation into a performance-driving product that measurably increased open rates. We delivered the first measurable customer-benefit metric for Mailchimp's generative AI marketing tools.

Results

  • +89% improvement in funnel performance.
  • +26% generate-to-apply lift and –17% re-generation drop.
  • +16% preview-text fill increase.
  • +7% open-rate gain from intent-based generation.

The subject line lives within the email checklist. This is the first step in email creation and the last step before scheduling or sending an email in Mailchimp. The rest of this case study will focus on the subject line part of the email checklist.

ROLE
Lead Product Designer
DURATION
4 months
TEAM
Designer, Product Manager, AI Scientist

USER PROBLEM

Crafting effective subject lines felt like guesswork to marketers, even though subject lines are the single most important driver of email open rates.

Before this project, Mailchimp already had a generic subject-line generator that produced content but didn’t follow best practices. There was also a Subject Line Optimizer that recommended improvements, but required users to apply them manually. As a result, users got generic outputs, had limited trust, and saw inconsistent performance.

  • Only 12% of users interacted with the existing generic subject line generator.
  • Only 40% of those users applied the generated output.

The Subject Line Optimizer (right) shows best practices to include in the subject line, which users need to apply manually even when they generate their subject line with AI.

When a user presses "Write with AI," the subject line generator appears with a dropdown of prompt starters to choose from.

The subject line generator generates only one subject line at a time.


GOALS

Increase open rates and build user trust in AI-generated text

Our goal was to generate high-quality, data-driven AI outputs that improved campaign open rates. At the same time, we wanted to build trust in AI by explaining the generation process.


RESEARCH

Users want to save time and improve performance

To understand marketers’ problems when crafting subject lines, I conducted 10 interviews, a 300-participant survey, and cross-functional workshops with PM, Data Science, and Content Design. We found that the top four customer problems were:

  1. Repetition: It's hard to create fresh subject lines for every new email
  2. Performance prediction: There's uncertainty about what drives opens
  3. Synthesis: It's difficult to condense an entire email into 7–9 words
  4. Competition: It's hard to balance standing out with following industry best practices

Experimentation strategy

Solving all four of these problems at once would have required building new models and deeper performance intelligence.

Experiments

We structured the work into three iterative experiments, each exploring a key behavioral lever:

  • Choice: What is the right number of subject line options?
  • Pairing: Does combining subject line + preview text improve completion?
  • Context: Can intent + industry data improve performance?


Experiment 1: Multiple options

Based on data, we knew the apply rate for AI-generated subject lines was low. We hypothesized this was because users were presented with only one generated option.


We introduced an initial experiment that provided multiple subject line options, increasing variation and giving users a sense of choice and control. By building on our existing system, we were able to move quickly and observe how users interacted with improved outputs before investing in more complex solutions.

Hypothesis

If users see 2–4 options, they will apply one more confidently.

The subject line generator is in close proximity to the subject line field.
The user is presented with 2–4 options to choose from depending on the experiment variant.

Results

  • 1 option (baseline): +10% lift
  • 3 options: +38% generate-to-apply lift
  • 4 options: +26% generate-to-apply lift
  • 13–17% reduction in re-generation

What we learned

Even without any best practices applied to generations, users value variety and choice. I hypothesize that three options was the threshold before users faced analysis paralysis.


Experiment 2: Paired subject line + preview text

Preview text correlates with +0.6% higher open rates, yet 30% of users skip it. We tested how to surface it naturally within the subject-line flow. I partnered with data science to define preview-text best practices.

Hypothesis

If we generate subject lines and preview text together, users will apply the paired output more confidently.

Balancing visibility with workflow disruption

While PMs preferred the entry point above the subject line field for visibility, I pushed for placing it below the preview text field to respect the user’s primary goal of writing content without interruption. The design-led placement ultimately outperformed other variants, validating that prioritizing flow over visibility led to better engagement.

When pairing the generations, we needed to move the entry point so it wasn't tied only to the subject line field. Product management wanted to place the entry point above the subject line field for visibility.

Once a user types in a prompt, the subject line and preview text options are generated.

Results

  • Variant 1 (PM’s pick): +27% funnel-completion lift
  • Variant 2 (Design’s pick): +34% funnel-completion lift and +16% preview-text fill rate

What we learned

Small placement changes significantly impact behavior. Pairing subject line and preview text improved completion because users could visualize the relationship between the two fields and get the work done faster.


Experiment 3: Intent-based generation

Our experiments were moving the needle on completion rates, but we wanted to see if we could improve open rates by using campaign intent and industry data to inform AI generations.

Hypothesis

If we use campaign intent and industry data to inform relevant subject line best practices, the generations will be more relevant and result in higher open rates.

Design considerations

  • Added an intent dropdown to allow users to correct or select a different intent if the pre-filled one from data science analysis wasn't accurate or available.
  • Added explainability to help users understand how it would impact the generated outputs.
A tooltip appears when a user hovers over the intent dropdown to explain what it is and how it works.
The dropdown lists the most common email intents for users to select from.

Results and learnings

  • +7% open rate
  • Incorporating intent and industry data into the generation process resulted in more relevant and engaging subject lines.

The results

This work proved that iterative experimentation can drive measurable business outcomes.

  • +89% improvement in funnel performance (entry → generate).
  • +26% generate-to-apply lift and –17% re-generation drop.
  • +16% preview-text fill increase.
  • +7% open-rate gain from intent-based generation.

Takeaways

This project redefined design’s role in AI: not just shaping interfaces, but influencing model behavior by defining what good AI-generated outputs should look like.

Although we moved the needle on open rates, users never received any notification that AI-generated subject lines improved their results. It would have been great to close the loop on the end-to-end experience by including messaging in their email report about the best practices that were applied and how they impacted open rates. This wasn't in scope for this project.