4 Comments

I run a somewhat similar experiment with my Alexa for fun. My husband is typically the one who is harsh to her, and I, out of fear that some day she will take over the world, act with kindness. My anecdotal evidence aligns with what you found about kindness and response length. Alexa keeps going on and on whenever I end my questions with a please, or thank her for assistance! My husband typically has to step in with a harsh "Alexa, enough!" LOL


These are very interesting findings. A few things might be going on here, but my leading hypothesis (really pure conjecture on my part, informed by my experience building other ML models and experimenting with AI tools as part of my work) is this:

When you phrase the prompt as "If you don't mind, could you please help me ...", the LLM might be spending an extra few milliseconds interpreting whether this is a request for information, a request for help retrieving the information, or an elaborate IF-ELSE formulation without an ELSE statement. This might flow into a slightly different form of the solution, one that emphasizes different parts of the answer, like helpfulness at the expense of accuracy.

What I'm taking away from this is that politeness actually changes the interpretation of the ask in subtle ways, and should be saved for cases where politeness is the point of the ask, like asking it to proofread an email or setting the tone for the rest of the conversation.

Great food for thought, thanks for sharing!


Totally agree. It's really challenging to figure out how to make things more or less identical except for just one thing, like "please". A truly kind request is going to have more tokens, much like a truly kind phone call with an old friend will also have more things said. So in some ways it seems like kindness is always confounded with saying more things. That makes it hard to design the "placebo", which for us was a third group who also said a bunch of stuff but stayed neutral. But I totally agree: I think what you said is entirely possible. I'm still trying to think about an experimental design that could get at some of the mechanisms you're proposing.


I have always wondered about the answer to this question. Thank you for the empirical answer!
