I think it might help to understand why this happens.
Imagine a ladder. The top step of the ladder represents pure white, the bottom step represents pure black, and all the steps in between represent shades of grey that get darker as you go down. Each step is one "stop" darker than the one above it. A "stop" of light is a unit of measure used in calculating exposure. Digital camera sensors can only discern a finite number of "stops" of light in any given scene. This is known as "dynamic range". Let's say that your digital camera has a dynamic range of 4 stops. Going back to the ladder analogy, the camera could only clearly discern a section of the ladder that is 4 steps tall. If you chose the top 4 steps of the ladder, your camera would clearly show the steps from pure white down to light grey. If you chose the bottom 4 steps, your camera would clearly show the steps from dark grey down to pure black. If you chose the middle part of the ladder, your camera would clearly show the steps from light grey down to dark grey. So, what happens to all the other steps of the ladder that are outside the 4-step range? The camera would show them as pure white or pure black (no detail), depending on whether they are above or below the 4-step range you chose.
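If it helps, the ladder can be sketched in a few lines of Python. This is only an illustration of the analogy, not how a real sensor works. The 10-step ladder, the numbering (0 for the bottom step, 9 for the top step), and the `capture` function are all made up for the example.

```python
def capture(scene_steps, window_bottom, window_height=4):
    """Describe how each ladder step is recorded by a hypothetical camera
    that can only tell apart `window_height` steps, starting at
    `window_bottom` and counting upward."""
    window_top = window_bottom + window_height - 1
    recorded = []
    for step in scene_steps:
        if step > window_top:
            # Brighter than the chosen section: no detail, blown to white.
            recorded.append("pure white")
        elif step < window_bottom:
            # Darker than the chosen section: no detail, crushed to black.
            recorded.append("pure black")
        else:
            recorded.append(f"detail (step {step})")
    return recorded


ladder = list(range(10))  # assumed 10-step ladder, 0 = bottom, 9 = top

# Choose the bottom 4 steps: the 6 steps above them lose all detail.
shadows = capture(ladder, window_bottom=0)

# Choose the top 4 steps: the 6 steps below them lose all detail.
highlights = capture(ladder, window_bottom=6)
```

Moving `window_bottom` up or down is the equivalent of changing the exposure: the section with detail moves, but it never gets any taller.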
Now, let's apply this to the picture of Devine. The range of light-to-dark in this scene was too wide for the camera to capture it all. Most of this scene was in shade and shadows, represented by the bottom steps of the metaphorical ladder. The sky and clouds visible through the trees (in front of Devine's face) are brighter, and would be represented by the upper part of the metaphorical ladder. In order to show the details in the face, leaves, and tree, the camera had to expose for the bottom steps of the metaphorical ladder, and everything above those steps was converted to pure white. This is also known as "blown highlights". If, instead, the camera had exposed for the sky, you would clearly see the blue sky and clouds, and some highlight details, but almost everything else would be too dark or converted to black.
When you're trying to capture a scene with a range of values that is wider than what your camera is capable of capturing, your immediate solution is to compress the values in the scene by either lightening the darker areas or darkening the brighter areas. Fill flash adds just enough light to a dark subject in the foreground so that its level of brightness is closer to that of the background. Graduated neutral density filters darken one portion of the scene, such as a bright sky, bringing the entire scene to within the dynamic range of the camera. However, these filters usually help only in situations where there is a clear horizon and nothing of importance crosses the horizon line.
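To put rough numbers on that, here is a small Python sketch. The brightness values (in stops), the 4-stop camera, and the helper names `scene_range` and `fits` are all assumptions I picked for illustration, and it treats a scene as fitting when its brightest and darkest areas are no more than 4 stops apart.

```python
def scene_range(brightness_by_area):
    """Spread, in stops, between the brightest and darkest area."""
    values = brightness_by_area.values()
    return max(values) - min(values)


def fits(brightness_by_area, camera_range):
    """True if the whole scene falls within the camera's dynamic range."""
    return scene_range(brightness_by_area) <= camera_range


camera_range = 4  # illustrative figure, same as in the ladder example

# Made-up scene: a shaded subject 6 stops darker than the sky.
scene = {"subject": 2, "sky": 8}

# Fill flash: lift the dark subject by 2 stops, leave the sky alone.
with_fill_flash = dict(scene)
with_fill_flash["subject"] += 2

# Graduated ND filter: darken the sky by 2 stops, leave the subject alone.
with_grad_nd = dict(scene)
with_grad_nd["sky"] -= 2
```

Here `fits(scene, camera_range)` is false, while both adjusted scenes fit. Either way, the scene is being squeezed to match the camera rather than the other way around.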
In the explanation above I used a dynamic range of 4 stops for illustrative purposes. Dynamic range varies by camera and media. If memory serves me correctly, the human eye can discern something like 20 stops of light once it has adapted to the different parts of a scene. That wide dynamic range is the reason we don't walk around seeing pure white skies all the time. Our eyes can see the details in the clouds, shades of blue in the sky, the middle values in the grass and the leaves, and even details in the shadows under the trees. Black-and-white film has a dynamic range of about 9 stops, quite a drop from what our eyes can see. Color negative film has a range of 5-7 stops. Slide film has an even narrower range. Further complicating things, that range is not divided evenly between over- and underexposure latitude. Digital camera sensors have a dynamic range similar to or narrower than that of slide film, but they're getting better, especially in high-end DSLR cameras.
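One thing that makes these numbers easier to compare: each stop is a doubling (or halving) of the amount of light, so a range in stops translates into a contrast ratio of 2 raised to that number. The two helper functions below are just my own names for that arithmetic.

```python
import math


def stops_to_contrast_ratio(stops):
    """Brightest-to-darkest ratio covered by a range of `stops` stops."""
    return 2 ** stops


def contrast_ratio_to_stops(ratio):
    """Number of stops needed to cover a brightest-to-darkest `ratio`."""
    return math.log2(ratio)


illustrative_camera = stops_to_contrast_ratio(4)  # 16, i.e. 16:1
nine_stop_film = stops_to_contrast_ratio(9)       # 512, i.e. 512:1
```

So the gap between 4 stops and 9 stops is much bigger than it sounds: 16:1 against 512:1.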
A more recent solution to the issue of low dynamic range lies in post-processing. Some photographers take multiple images of the same scene, exposing each image for a different part of the scene. The photographer then blends the images together in Photoshop, often using layer masks, to create a final image where everything appears properly exposed. Because of the amount of information that can be recovered from RAW files, photographers shooting in RAW format can, to an extent, get a similar result from just a single exposure. Blending exposures this way is preferred over copying and pasting a sky from another image, which can often look artificial.
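For anyone curious what the blending amounts to, here is a bare-bones Python sketch of the idea behind a layer mask. It is not what Photoshop actually does internally. The pixel values (0 to 255), the `blown_threshold` cutoff, and the `blend_exposures` function are all assumptions for illustration, and a real mask would be painted or feathered rather than a hard on/off switch.

```python
def blend_exposures(for_shadows, for_sky, blown_threshold=250):
    """Blend two exposures of the same scene, pixel by pixel.

    Wherever the shadow exposure is blown out (at or above the assumed
    `blown_threshold`), the mask lets the sky exposure show through;
    everywhere else the shadow exposure is kept.
    """
    blended = []
    for bright_px, dark_px in zip(for_shadows, for_sky):
        mask = 1.0 if bright_px >= blown_threshold else 0.0
        blended.append(round(mask * dark_px + (1.0 - mask) * bright_px))
    return blended


# Made-up row of four pixels: two in the shade, two of open sky.
for_shadows = [120, 180, 255, 255]  # shade looks good, sky is blown
for_sky = [5, 20, 140, 200]         # sky looks good, shade is nearly black

result = blend_exposures(for_shadows, for_sky)
```

The result keeps the shade pixels from the first exposure and the sky pixels from the second, so neither end of the row is left without detail.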