Let's say we have a character holding a cup. You might ask: don't people always look at the face first? Then why should I spend extra time polishing the hand, which nobody is going to see? I'd say yes and no. It is true that most of the audience will pay attention to the face, but we "feel" the hand holding the cup. If a finger is penetrating the geometry, we will feel it even if we don't consciously see it. "Fingers shouldn't go through cups" is plain common sense. Once we become aware of Kitty-Pryde-ish fingers, which go against the library we have built, that is a deal breaker for believability.
Sometimes, even after we polish every contact, it still isn't strong enough for the audience to feel it, so we need to emphasize the contact. It's the same concept as the classic principle of Exaggeration. If your rig allows it, try scaling the geometry for 1 or 2 frames, depending on how exaggerated the style is. You might be surprised that our eyes can't catch this while the video is playing. Instead of seeing it, we feel it. I usually push this until it is too much, and then pull it back.
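As a rough illustration of the idea, here is a minimal Python sketch that builds the scale keys for such a contact "pop". It is not tied to any animation package: the function names `contact_pop_keys` and `pull_back`, their parameters, and the example values are all hypothetical, and you would still need to apply the resulting (frame, scale) pairs to your rig's scale control in whatever software you use.

```python
def contact_pop_keys(contact_frame, peak_scale=2.0, hold_frames=1, rest_scale=1.0):
    """Return (frame, scale) keys that pop the scale at the contact.

    Hypothetical helper: the scale sits at rest_scale one frame before the
    contact, jumps to peak_scale for hold_frames frames, then returns to rest.
    """
    if hold_frames < 1:
        raise ValueError("hold_frames must be at least 1")
    keys = [(contact_frame - 1, rest_scale)]           # normal size just before contact
    for offset in range(hold_frames):
        keys.append((contact_frame + offset, peak_scale))  # exaggerated frames
    keys.append((contact_frame + hold_frames, rest_scale))  # back to normal
    return keys


def pull_back(peak_scale, amount, rest_scale=1.0):
    """Blend an over-pushed peak back toward rest.

    amount=0.0 keeps the pushed value, amount=1.0 removes the pop entirely.
    """
    return peak_scale + (rest_scale - peak_scale) * amount


# Push to 3x first, decide it is too much, then pull it halfway back.
keys = contact_pop_keys(contact_frame=24, peak_scale=pull_back(3.0, 0.5), hold_frames=2)
for frame, scale in keys:
    print(frame, scale)
```

Keeping the push and the pull-back as separate steps mirrors the workflow above: go too far on purpose, then dial it back until the scale can no longer be seen but the contact can still be felt.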
Here are a few examples from two of my favorite movies, Ratatouille and Horton Hears a Who!.
For the Remy clap, you can tell that his hands are scaled up to almost 2 or 3 times their normal size. Because this goes by very fast, we can't catch the scale, but we cannot miss the clapping hands.
BEFORE

AFTER

The same thing happens when the Mayor slams his arm into the wall to wake it up: we can't see the scale, but the contact clearly registers in our minds.
BEFORE

AFTER

Article created by Erik Lee, refined by Joseph Taylor.