David G. Stork: “Rigorous Technical Image Analysis of Fine Art: Toward a Computer Connoisseurship”

– Thank you to Emily for that wonderful introduction to our symposium. I’m going to quickly
introduce Dr. David Stork, who is a Rambus Fellow at Rambus Labs and leads research in its computational sensing and imaging group. He’s a graduate in physics from MIT and the University of Maryland and studied art history
at Wellesley College. For nearly 20 years, he and his colleagues have pioneered the application of rigorous computer image analysis to problems in the history and
interpretation of fine art. He has taught courses on the subject in both the computer science and art history departments at Stanford, and published over 45 technical papers. He’s lectured in 20
countries at major museums. He’s a fellow of the
International Association for Pattern Recognition, of
the International Academic, Research and Industry
Association, of SPIE and IEEE, and of the Optical Society of America. He’s published over
200 technical articles and eight books, including the
leading textbook on optics in the arts, and Pattern Classification, the world’s all-time best
selling textbook in the field. (audience laughs) Please welcome David Stork. (applause) – Thank you, thank you very much. I’d like to show you what
several of my colleagues and I have been pioneering starting 20 years ago on applying computer vision to problems in the history and
interpretation of fine art including these paintings and many others, and I’d like to start with
Vermeer’s masterpiece, Girl with a Pearl Earring,
and Vermeer of course was a master in rendering
the effects of light. Look at the glint off
her eyes and her lips and the shadows and so forth, but if we were to ask the
simplest technical question about this illumination it would be what is the direction of illumination? And if two art historians disagreed, how would we adjudicate? How would we tell who’s right? Well let me show you six
methods from computer vision that solve this problem absolutely clearly and then tell you how it changes our understanding of this painting and it can be used elsewhere
in the history of art. The methods fall into two categories. The first are model independent methods where we don’t need to
make any assumptions about the 3D form in the tableau. We’ll look at the shadow cast by her nose and what’s called the
occluding contour algorithm and then model dependent methods where we do need to make some assumptions about the three dimensional
form in the tableau. We’ll look at the 3D model
of the pearl, of the eye, of her face and the entire tableau. So the simplest method
is cast-shadow analysis. You take a point on an occluder, its corresponding cast-shadow point, draw a straight line with error bars and you find for this painting the light is coming from a direction of
150 plus or minus two degrees where throughout I’m going
to measure directions with respect to the horizontal like that. The occluding contour theory or algorithm comes from a branch of
computer vision called shape from shading. It tells you how to interpret the three-dimensional form of an object and the brightness of it
depending upon the normal, sort of an arrow off the surface, the unknown direction of the
illumination and so forth and I won’t go through the math, but if you knew what the normal was, if you knew what the
shape of the face was, the computer algorithms could then go through and say
what direction of illumination is most consistent with the entire face? Unfortunately we don’t have
a model of her face for this. Two computer vision
experts, Nillius and Eklundh, came up with a very
clever solution to this. They said all right, we
don’t know what the normal is through her face in the middle, but we do know what it is
along the outer boundary or occluding contour. The normal is perpendicular
to this contour and it’s in the plane of the painting. Not towards us, not away. It’s like a flagpole on the horizon and that makes the mathematics simpler. I won’t go through it,
but it allows us to then set up a system of equations
that allow us to then infer from the pattern of light
around the outside contour the direction of illumination. Here’s one way to think about it. Here’s an object, and if
we have a dark patch here that alone would not allow you to tell what the direction of illumination is, but if I measured the brightness along the entire contour,
look at this patch. It’s bright. It’s probably facing towards the light. This patch is dark, it’s
probably facing away from it and the mathematics I just went through takes all these measurements and says what direction best explains this entire pattern of
lightness over the contour? Well, this technique was actually invented not for paintings, but
for forensic photography. There was a photograph
of Brad and Angelina that sold for a half million
dollars, a half million dollars, and not to be outdone, the
Star published this painting, or this photograph.
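A minimal sketch of the occluding-contour estimate described above, for readers who want to see the fit spelled out. It assumes a matte (Lambertian) surface of constant reflectance whose sampled contour points all face the light, plus a constant ambient term. Under those assumptions the brightness at a contour point is n_x L_x + n_y L_y + A, and the light direction falls out of an ordinary least-squares fit. The function names, the synthetic contour, and the convention that angles are measured counterclockwise from the horizontal with y pointing up are illustrative assumptions, not details taken from the talk or from the published algorithm.

```python
import math


def solve3(m, v):
    """Solve the 3x3 linear system m x = v by Cramer's rule."""
    def det(a):
        return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
                - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
                + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

    d = det(m)
    if abs(d) < 1e-12:
        raise ValueError("contour normals do not constrain the light direction")
    solution = []
    for col in range(3):
        replaced = [row[:] for row in m]
        for r in range(3):
            replaced[r][col] = v[r]
        solution.append(det(replaced) / d)
    return solution


def estimate_light_angle(normals, brightness):
    """Fit brightness = nx*Lx + ny*Ly + A over contour samples.

    normals: (nx, ny) unit normals lying in the picture plane.
    brightness: measured brightness at the same contour points.
    Returns the light direction in degrees, in [0, 360).
    """
    rows = [(nx, ny, 1.0) for nx, ny in normals]
    # Normal equations of the least-squares problem: (A^T A) x = A^T b.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atb = [sum(r[i] * b for r, b in zip(rows, brightness)) for i in range(3)]
    lx, ly, _ambient = solve3(ata, atb)
    return math.degrees(math.atan2(ly, lx)) % 360.0


# Synthetic contour lit from 150 degrees (hypothetical test data, no noise).
true_angle = math.radians(150.0)
light = (math.cos(true_angle), math.sin(true_angle))
normals = [(math.cos(math.radians(a)), math.sin(math.radians(a)))
           for a in range(70, 231, 4)]  # every sampled normal faces the light
brightness = [0.8 * (nx * light[0] + ny * light[1]) + 0.1 for nx, ny in normals]

estimate = estimate_light_angle(normals, brightness)
print(f"estimated light direction: {estimate:.1f} degrees")
```

On real paintings or photographs the brightness samples are noisy and some contour points face away from the light, so a practical version would need to select the lit points and report an uncertainty along with the angle.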
(audience laughs) You can see where my head is. On this photograph, using the
occluding contour algorithm I just showed you, Hany Farid
showed the light on Brad was coming from this direction, and on Angelina it was
coming from this direction. This was photoshopped. And it was exactly this
technique that showed it, but this also shows how bad
humans are at detecting light. That’s why there are so
many fakes that get by us, but these computer techniques find them absolutely perfectly. And we were the first ever
to apply that to a painting. So you apply it to the Vermeer, taking this contour here. It says ah, this is the
direction to the illumination. 149 plus or minus four degrees. Now for the model dependent methods. I took a high resolution
photograph of the pearl in that painting, and we
can confidently assume that it’s cylindrically symmetric and so I built a 3D
computer graphics model and you now so-called texture map it, put the white on it and a pearl has a specular or mirror-like reflection so you have to adjust by
eye the relative components of the diffuse reflection
and the highlight reflection, and now in our computer
model we can put the light wherever we want, in
order to render the pearl, the digital pearl so that it matches that in the painting as closely as possible. And then we have an estimate to the direction of the illumination. And that turns out to be 155
plus or minus four degrees. What about the glint on her eye? If her eyes were spheres, if, then the position of the glint on her eye is related to the unknown
direction to the light by a very simple formula. It’s really just the law of reflection. The angle of incidence equals
the angle of reflection. But of course an eye is not a sphere. It has a corneal bump that pushes out, and so we have to find which
way that bump is pointing, we fit an oval to the iris
or colored part of the eye, and in this painting they turned out to be perfect circles. She’s looking directly at us or really directly at Vermeer, and you
can see the center of mass of the reflection, so
in the computer model we know the direction of the eye. We have a model of it, it’s
high school physics optics in order to then calculate the direction to the illumination is 150
plus or minus two degrees. The most sophisticated technique we used comes again from this notion
of shape from shading, this algorithm of shape from shading. If we had a model of her face,
the computer could answer: what direction of
illumination on this model would lead to the pattern of lightness that we find in the painting? The problem is we don’t
have a model of her face, so we did the next best thing. We used a generic model
that’s actually a man’s face, and adjusted the pose to match her face as closely as possible, and we temporarily assumed that was her face and then the computer
gets an estimate that says for this face, this
direction to the illumination will best explain the pattern of lightness we get in the face. But of course it’s not gonna be right because this isn’t the proper face. So then we assume that
the light is correct and we fix the face, or
the computer fixes the face to get a better model of her face, and then we have a
better model of her face and then we can re-estimate the direction to the illumination. The computer scientists in
here will recognize this as the EM or expectation
maximization algorithm. And at the end of the day you have a better model of her face and an estimate to the
direction of illumination. Turns out to be 160 plus
or minus five degrees. The most fun part was making a computer graphics
model of her face, of the entire tableau and
now we can turn it around.
When I showed this at the Mauritshuis, the
friends of the Mauritshuis went crazy ’cause they’d never seen the back of their painting. (audience laughs) You have to make assumptions; this is technically called
an ill posed problem. You have a 2D image and you
have to make a 3D model. You can do this, there
are a number of techniques from computer science. All you need to know is
that it doesn’t add a bias to the estimate of the
direction of illumination. It doesn’t mean the light’s
gonna be too high or too low. There’s gonna be uncertainty,
but it doesn’t bias it and we can incorporate
that into the models. So now the model is in the
pose, and we can move the light around wherever we want. Then we’re going to stop
that portion of the film in the configuration
that best matches the painting, and now we’re going to fly
around Vermeer’s studio and look at her from
many possible directions and then I’m going to stop
it in the configuration that matches the painting and do what computer graphics
people call an alpha blend. Basically a blurring
from the computer model to the actual painting, and you’ll see that it matches extremely well, and that now gives us yet another estimate to the direction of illumination. So here we go. And so here are three stills
from that movie if you will. One
where the light’s too high, and one where the light’s
too low and so forth. But here are results from
all those techniques. Here are the angles and the average turns out to be about 155 degrees. That’s actually not very important. What is important is the very
small standard deviation. The incredible agreement among these six different techniques
that are very very different, like the pearl and the
occluding contour algorithm. One way of looking at that or showing this is the red line is in that 155 degrees. That’s the direction to the illumination and the yellow line is called a cardioid. It tells us in this case
the relative probability of finding the light in a given direction. So it’s most likely the
light’s in this direction. It’s about I don’t know half as likely off in this direction. The fact that the yellow
cardioid is so long and stretched out shows
the incredible agreement among these different methods for estimating the direction
to the illumination. Okay, so what? Why would an art historian
be interested in this? Well this is not a portrait. We don’t know who this person was. It’s called a tronie, it’s
sort of a character sketch and some art historians actually debated whether there was a person
present during this. Maybe he just did it
out of his imagination. Well the fact that these
methods agree so well, almost certainly shows that
there was someone there present so that he could observe the position of the glints and so forth. And this technique can be used elsewhere, especially in Renaissance
art when you have maybe four or five different figures. Maybe they’re done by one
master and several assistants and if the lighting is
very different on them they might have been
done at different times, different studios, and so forth. That’s evidence to tell whether
they’re by the same hand or at least the same studio. And so this can be used elsewhere. Now I want to use one of the
examples from art history that sort of got me into this
debate, or into this field, and it came from the artist David Hockney. Maybe some of you remember that when you stand back and
look at the grand sweep of the development of western painting you find there’s this
transition around 1430. Before that time, images were somewhat awkward and schematic. If you look at this Giotto fresco portrait you look at the ear, it
looks like a cartoon. This anonymous Austrian
portrait of a king. And this lovely Masolino almost porcelain doll like portrait. But after this time, you
get paintings like this. Robert Campin’s A Man, 1430 in the National Gallery in London. It looks so realistic,
almost photographic. This portrait has a
personality, an individuality, a psychological depth lacking
in these earlier works. This guy looks like us, what happened? Well in order to explain the
emergence of this new art, David Hockney came up with a bold and very controversial theory. He believed that these later paintings look optical, as he put
it, because they are optical: that Renaissance painters
secretly used optical devices during the execution of their works. And here’s Hockney showing
how he believes it was done. Here he is in Santa Monica,
California, my home state. Very important, a lot of sunlight. The figure, the subject is outside. There’s a window here, and
the key element in all this is a concave mirror which
anyone in physics can tell you can project an image just as does a lens. The image is on the other
side of the wall there. It’s upside down, so-called inverted. The artist would trace
it and then bring it down and then turn it around and paint over it while he’s tracing. He would then paint
over it with oil paints or acrylics later or whatever like that. And so that’s his theory of how this realism came around 1430. Well, one of the paintings
he adduces as evidence is Georges de la Tour’s Christ in the Carpenter’s
Studio in the Louvre. Incredible masterpiece, and he says that in France, the most famous
of Caravaggio’s followers was Georges de la Tour. Could that candle really
produce all that light? Once again, the source
of light seems to be outside the picture, as it
must be if you use optics. If the source of light were in the setting it would cause flare in the lens. Joseph and the girl, it’s
not a girl, it’s Christ, were probably painted separately, each lit by a shielded light source in place of the other figure. So in his book Secret Knowledge, he showed this. He says ah when Christ was being painted, the light source was
in place of St. Joseph, and when St. Joseph was being painted, the light source was in place of Christ, or outside the picture frame. So you have to look at
this and try and figure out where the light source was in this. And humans aren’t that good. Computers are superb. So we approached that and said all right, first thing to do is cast shadow analysis. You take for instance a shadow cast by the left thumb of St. Joseph, draw a line from the shadow point
right through the occluder, bingo. Goes straight to the candle. The candle flame is behind Christ’s hand. This is smoke if you know this painting. And then if you do a full
set of such cast-shadows, indeed there is one with a
very small sort of moment arm that’s consistent with the
light being in place of Christ, but the vast majority
including the cast-shadow from St. Joseph’s nose, all
meet right at the candle. And you can see these contours, these are contours of equal a posteriori probability density. It’s just a lot of words;
it just says the light’s most likely here, less likely here, less likely here, less likely here. Like
a high pressure region on a weather map. And I won’t go through the mathematics, except graphically to say
ah if one cast shadow says oh the light’s in this
direction, with this falloff, and the other says oh
it’s in this direction, basically you multiply these together. Think of these as
sprinklers shooting water and where is it gonna be the wettest? That’s the most likely place
given all this evidence. And that’s all I showed
on that previous slide. But then we could also use the
occluding contour algorithm and you have contours
along you know the arms and the hands and the wrists and so forth and so the computer puts
in all of these red lines. The computer doesn’t know
anything about baroque art. It doesn’t know about Christ, it doesn’t even know
three dimensions really. It just has a contour and
then measures the light along that contour and then infers the direction that the
light’s coming from. Look at Christ’s shin. This is unbelievable to me, that the computer just gets this curve and the pattern of
light and says you know? The light’s right along that direction. Bingo, right to the candle. Everyone has admired the lighting, the ability of de la Tour to render light, but I don’t think people
really appreciated how amazingly well he was able to render the shin, so perfectly that the computer knows that the light must have come from this direction. And then you can do the
same probability estimate for all of this evidence,
and again, the most likely position of the light for
this evidence is right here, less likely here, less
likely here, and so forth. Right at the candle. We’ve done it on other
paintings here at LACMA, and one of the benefits of using
this probabilistic approach, this machine learning
approach is that you can now integrate the evidence
from disparate methods. So here’s the evidence I showed
you for the cast-shadows, here’s the evidence for the
occluding contour algorithm and to put them together you
basically just multiply them. And so putting both of those together, it says ah it’s most likely
the light’s right here. Less likely here, less likely here. Certainly not outside
the frame of the painting as Hockney believes, and is
necessary for his theory. But, there’s other lighting
evidence in this painting that we have not yet used. Look at the pattern of
light over the floor. It’s bright in the center,
and dark off to the side. Where would a light source be so as to make the center bright
and the sides dark? Well it’s not gonna be over here. It’s not gonna be over here. It’s gonna be somewhere in the center, and we make a physics model of it. It’s high school physics. I don’t expect art historians to do it, but I do expect any computer scientist to be able to do this kind of stuff, and what we want to do is find
the position of the candle or of the light source,
and other parameters like the tip of the
floor that best explain the pattern of light
that we find on the floor and this is all the math. And then you, so this is
the change on the floor of changing the height of the candle. Here is the tip angle of
the floor and so forth. So the computer is
trying to search around, what’s the tip of the floor? Where’s that candle? In order to explain the pattern of light that we find in the painting. And when you go through and do that, you find that the position is here. Now it’s somewhat low
from the actual candle, and we think we understand that. A grad student at Stanford
and I figured this out: the range
of brightnesses you get in the real world might
be a factor of 100,000, maybe a million to one. Paint has only a limited albedo ratio: the whitest white to the blackest black is only about a factor of 100, and that shifts the position. Nevertheless, it’s consistent with the light being in the middle, not outside the frame of
the picture as Hockney says. So he’s just simply wrong on this one. But there’s yet more lighting information that we have not yet used. All the pattern of light
over Christ’s chest, over the entire objects that are not along the outer boundary, and are not casting a shadow just as we had for the Girl
with the Pearl Earring. So we have to make a
computer graphics model, and this can be done semi-automatically, and there are increasingly ways to do this completely automatically with certain assumptions. So here’s
the digital tableau and we’re going to put a
light in place of St. Joseph, and then one in place of the candle, illuminate the Christ
model and then decide which one best explains the pattern that we find in the actual painting. Problem is we don’t know
the depth into the tableau with the light here, here, here, here so we did 19 and chose the best one for the light in place of St. Joseph. The best one for the light
in place of the candle, and rendered Christ, but
before I get to the results here’s the full computer graphics model. Now if you study art history,
St. Joseph’s the carpenter so he has the tools. Christ is the light of the
world, he has a candle. So I hope you can see that the cast-shadow is changing here on the floor
as we adjust the position of the illuminant inside the tableau. And now I’m going to stop it and look at the back of the painting. We’re going to look at the
tableau from all the way around, and then stop it in the configuration that matches the actual painting, and do another alpha blend
to see how well it matches. And here are results. So here’s the actual painting. Here’s the best image you can get with the light in place of St. Joseph. Here’s the best one you can get with the light in place of the candle, and you say which one actually matches the actual painting best? And there’s no question, none whatsoever that the light in place of the candle matches the actual
painting much much better. Look at the pattern of
light over Christ’s chest. Look at the cast-shadow
of the shin and knee. They’re not exactly the same, but it’s much much
better here and so forth. No question that the light was in place of the candle and Hockney was wrong. Another painting that
came up in the debate is this van Eyck’s incredible masterpiece in the National Gallery in London, the Arnolfini portrait. And in art history you’ll
learn immense amounts about the symbolism and the
importance of this painting, but we’re going to look at Hockney’s claim that this was done using
some sort of projector. Well, if it was done some
sort of optical system we can infer what focal
length would have been used. Here’s an example of how
you do it with photography. This is a photograph of me many years ago as a beginning assistant
professor at Swarthmore, taken with a wide angle lens. 24 millimeter short focal length lens. Notice the size of me, and
the size of the pillars here. Then the photographer put on
a normal lens, 50 millimeter, magnifies everything by a factor of two but he moved back from me twice as far so as to ensure that I’m the same size in these two photographs. Look at the pillars. Pillars appear larger. Then you put on a telephoto lens and move back the appropriate distance to make me the same size,
but now look at the pillars. These are all in perfect perspective. They’re photographs, after all but it’s just where the center of, where the camera is, the
center of projection. Well, if you have such a photograph or a tracing of a projection and you know the relative
sizes of the objects, you can work backwards and infer
what focal length was used. Really, everyone in this room can do it. I teach my undergraduate
art history majors at Stanford how to do it. Everyone can do this. But you need to know,
or make some assumptions about the 3D form in the tableau. So we made the computer graphics model of the Arnolfini tableau, and here it is from the side. And here is the painting to scale in this digital realm. And then you ask where would I put a lens (it turns out to be the same
with a lens or a mirror; I won’t go through the optics too closely), where would I put a lens such that the sizes of the projected images match those of the real objects? Of course they’re not
real, it’s all digital but if you put the lens here, then the size of the
image of the chandelier corresponds to the size
of the real chandelier. The size of Arnolfini matches Arnolfini, and so on for 13 object and image pairs. And once you have the lens
here, not here, not here, but here you have what’s
called the image distance, object distance, it’s high school optics to calculate the focal
length of this lens. Turns out to be about 61 plus
or minus eight centimeters. Well, Hockney and his colleague Falco wrote that van Eyck placed a convex mirror at the very center of this
Arnolfini masterpiece. The very mirror, the very
mirror which turned around he may well have used
to construct this image. And you’re all familiar,
everyone in this audience is familiar with this incredibly detailed reflection off the curved mirror at the center of the
Arnolfini masterpiece. You see the back of Arnolfini,
you see the back of his wife, you see the two witnesses to the wedding coming through, and
it’s clearly distorted. You can see the bending of the lines and the rectangular mirror
and the bed and so forth. And so we published four
papers on methods to de-warp this image and
thereby estimate its curvature which then tells us its focal length. And one of them involved,
well here’s a picture of one of my Japanese
colleagues taking a photograph of a curved mirror, security
mirror at the top of his lab and you notice that the lines
that we know are straight. The hallway lines are in
fact distorted and warped, and so mathematically we can undo that and in doing so we get a
measure of the curvature of the, of the curved mirror at the top. And so when you do that, you can un-warp the image or de-warp the image, get every line that’s
supposed to be straight appear indeed straight and then we get the focal length of the mirror, and we can undo that distortion. And so here’s the undistorted view in that little piece of
the Arnolfini portrait. This is an amazing image. This is the view you would get if you were to walk into Arnolfini’s room, stand at that mirror,
turn around, and look back revealed nearly 500 years
after this painting was made, and you notice that the window
is much more rectangular and the perspective is much
much better and so forth as we’ve quantified. But this gives us the
curvature of the mirror, and that means the focal length. Well, and so here’s the mirror on the back and if Hockney and Falco are right, the focal length of this
should be pretty close to the focal length of the putative projection mirror or lens. But when you go through the math, turns out the focal length is 18 plus or minus four centimeters. Much much much too short. It simply could not have
been used as claimed. There are a number of other
reasons it would have been distorted so you wouldn’t
have gotten a good image. They coated the back side with tar, so as to keep the metal from
tarnishing and so forth. There are plenty of reasons why they’re just simply wrong on that. So here from my Scientific
American article is a way to visualize this. So here’s the putative projection mirror; its focal length is one half the radius of this sphere as represented here. Here’s the sphere that would
be for the mirror on the back. Much much much too short to
have been used as they claimed. Well okay well maybe
this mirror wasn’t used, but what about other mirrors? Could another mirror or lens have been used? No. Why? If you do a perspective analysis you find that for instance on the
floor these two lines, perspective lines meet here. This one doesn’t meet it at all. These are parallel lines,
they never meet at all. The perspective is just not coherent, as it would be if it were done under a single projection. This does not mean it’s a bad painting. Some art historians say oh, Stork thinks that if something’s not in
proper perspective it’s a bad painting. I’m not saying this at all. This is a sophisticated painting; I
don’t have to explain that. But okay, so not for that, but what about that chandelier? David Hockney got on CBS 60 Minutes and told eight million people quote “that chandelier is in
perfect perspective” and then he goes on to
build a projector and say ah he would have traced
it this way and so forth. I thought I’d check. (audience laughs) Here is the six-armed chandelier in plan, the bird’s-eye view looking down; here are the arms of the chandelier,
one two three four five six. Consider two points on corresponding arms like those decorative crockets
that hang down beneath. Those define a line in Arnolfini’s room. Think of my two arms as chandelier arms. This elbow and this elbow
define a line in this room. That line is parallel to
the floor because see, my elbows are at the same height. That line is also perpendicular
to the vertical plane bisecting these two arms, okay? Think of other crockets. Those also define a
line, like my two thumbs. Those define a line that’s
parallel to the floor ’cause these are at the same elevation, and perpendicular to this plane. So those two lines are
parallel in Arnolfini’s room. Likewise, here. All of these. So here’s what I’ve just
shown you, those lines. If you think about symmetry it’s saying for lines defined by these two arms that they’re all parallel and if you really think about it, lines defined by corresponding
points on arms two and five are also parallel. All of these are parallel
lines in Arnolfini’s room. So we built a symmetric
computer graphics model of the chandelier, and if
it’s in proper perspective, of course it is, it’s computer graphics, then if we take the images of these lines. For instance, a line from this crocket to this crocket, extend it. This crocket to this crocket, extend it and all of them, they’d better meet at a vanishing point down here, and when you do it well, sure enough. It’s computer graphics, of
course it’s going to work. What about the Arnolfini chandelier? If it’s in proper perspective we should do the exact same construction. Draw a line from here to here, extend it. Here to here extend it, and we should get a vanishing point somewhere down here. But when you do it, here’s what you get. Kaboom, it’s a mess. An absolute mess, not even close to being proper perspective. Not even close. Even famous artists and art historians are really bad at being able to tell whether something’s in proper perspective but the computer techniques can find it really really simply. But here’s another way of looking at it. So here’s one of those lines extended. Here’s another line extended, and here are 15, just from the left-hand side. Not close to having
a vanishing point down here, not even close. So how far off? When I published this they came back and said
well maybe the chandelier was really poorly made. (audience laughs) And so I calculated how bad
would the chandelier have to be to be consistent with their theory? Well, you take a point down here, it goes through this point
and it should go through these but what you find are a whopping eight to 10 centimeter variation. Well is that what you would have gotten? Absolutely not. Antonio, my colleague
from Microsoft research and I did photometric
analyses of dinanderie, the decorative metalwork
this time in the Renaissance and found that they were all
highly, highly symmetric. Then we spoke to experts
in dinanderie who said that oh no, these arms would all have been made from the same mold. You’d make six identical arms and then array them around the post. And I even
went up the street to the Met and measured one of their
chandeliers of the same time, symmetric to about a
millimeter, not 10 centimeters. So that explanation simply cannot hold. Another way of looking at this is doing what computer scientists call a homography, mapping a plane with one arm to a plane with the other arm so that you can compare the arms completely. If they are perfectly the same shape, it’s simple mathematics. But when they have different shapes, you can always make a homography where the thumbs will overlap,
but then the elbows don’t. Or you can make another one
where the elbows overlap, but then the thumbs don’t. You want to get the best overall mapping and my colleague Antonio
Criminisi did the mathematics on this and figured out how to do it. I won’t go through the math, but it’s basically the best mapping, there’s the math, from
one arm onto another. And so there are two arms being shown here, and again you get this whopping eight to 10 centimeter
disagreement between the arms. Much, much, much too large to be consistent with Hockney’s theory. Well, one of his motivations was that, oh, no artist could have
really drawn that chandelier in such good perspective entirely by eye. He must have used some
sort of optical device. Is that true? Well, I had a realist
artist paint two chandeliers entirely by eye so that
I can analyze those. But first, here’s the
Arnolfini chandelier of course and it turns out one small
portion of the chandelier is in good perspective, the
bobeche or the candle holders. If you take a hexagon,
you can do the homography and find that it matches
pretty well, a little bit off but pretty well. Well, here’s this
chandelier with five arms, by Nicholas Williams, who lives in southern England. You do a pentagon, and it turns out it’s in slightly better perspective than van Eyck’s. In fact the whole thing
is in better perspective, and then he painted this
really beautiful chandelier too entirely by eye, no projections, no rulers, nothing just by eye and it’s in superb perspective, as I’ll show you on the next slide. Superb except, except, you can’t see it, but the
computer finds it immediately that one of those arms
is a little bit off. Statistically significantly off, and when I found this out
I emailed him: what’s going on? And he said, oh yeah, I
accidentally dropped the chandelier and one of the arms bent off at the bottom and I calculated 1.5
centimeters, 1.3, close enough. So now let me show you just how good the perspective is on
this eyeballed chandelier. Take one of the arms and
segment the background. So here’s one of the arms
and here’s another arm. We just erase the background so to speak, and then do the mathematics
I just went through and take one of those arms
and overlap it onto the other, so that takes arm one and transforms it to be as similar to the other as possible. Now the homography maps
a plane to a plane, so things that stick out of the plane like the pan of the
candle holder won’t match, so don’t worry about this,
just look at the shape. And when you overlap them, they overlap superbly, incredibly well. And he did this entirely by eye, and he admits fully that he’s
not as great as van Eyck, so you don’t need any sort of optical help in order to draw the Arnolfini chandelier. Well, now I could talk for
weeks on this kind of stuff, but I now just want to
present to you my vision for where this field should be going. Here’s a painting; I’ve actually published three or four papers
on the carpet in it. I don’t have time to talk about that, but imagine the following. Imagine we had all the art
historical knowledge that we have, properly organized using
computer techniques and so forth, and then we found this, but had nothing about this painting. It’s not in our database
in any way, shape, or form and someone found this painting. We would want our computer to do the following kinds of things. We want it to tell us
that it’s an oil painting, not an etching or watercolor
or something like this. That it’s a double portrait,
that there’s a man there and there’s a woman there. Judging from their clothes and the jewelry, they’re fairly wealthy. That in the back it
would have to figure out, it’s easy for us to see it’s a window but is that a painting on the back wall? But we want our computer to
figure out that it’s a window and that there’s a storm outside and that that means that
they are safe from the storm. They are together, reinforcing each other and loving each other and so forth. And of course there’s a
dog here, and at this time of course everything’s period contextual. At that time it means Fido, fidelity: it means she’s always going to
remain faithful to him. It’s very difficult to see, but
there’s a squirrel down here and the man’s pointing to it and holding the words “Homo nunquam,” man never. Well, the squirrel signifies
skittish wandering or lusting after other women, and so he’s pointing to it and saying, I will always remain
faithful, “Homo nunquam.” But very unusual for art of this time, her head is higher than his. You almost never get
this in the Renaissance. Well, records show that this painting was painted several years
after this woman died. She’s in heaven, and notice
the style of her face. It’s much more ethereal
and smooth and so forth whereas the man’s much
more earthy and ruddy and you know in the here and now. And if you really look closely, you can tell that his eyes are red and that he’s been crying. So this is a painting sort of documenting that he’s going
to remain faithful to his wife even after she’s dead. Now, somewhere in that
Art 101 mini lecture, there are things that computers
can do right now no problem, and then the stuff at the very end that is very very difficult
and will take us a long time. But there’s nothing in principle against getting a machine to
do what I just went through. Not going to a database and finding what some art scholar
has published on this, that’s a very simple problem. You know, indexing and
finding background information but actually doing the evaluation that an art historian would do. It’s years away, but I think
with the kinds of techniques that I and my colleagues here and elsewhere around the
world are working on, it’s a doable process. And I think we are now in an era that’s like that of Morelli, who really brought the sort of scientific notion of
connoisseurship to art history, looking, identifying painters
by how they rendered hands and ears and so forth. Let’s not forget, Morelli
was a medical doctor and so it was his power of observation, and coming from outside art history and saying here’s some
techniques that have helped me understand to diagnose diseases. Maybe these can be used in art history. So too those of us
working in computer vision have lots of techniques that we’d love to work with art historians for, for revolutionizing how we study art. Thank you very much. (applause)
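
The arm-to-arm comparison described in the talk, fitting the best overall homography from one chandelier arm onto another and then looking at how far the points still disagree, can be sketched roughly as follows. This is a minimal illustration, not the method Criminisi and Stork actually used: it assumes a plain least-squares direct linear transform, and the function names, point coordinates, and the matrix `true_H` are all made up for the example.

```python
import numpy as np

def fit_homography(src, dst):
    """Least-squares homography (direct linear transform) taking src onto dst.

    Assumption: a plain algebraic least-squares fit, chosen for brevity.
    """
    rows = []
    for (x, y), (u, v) in zip(np.asarray(src, float), np.asarray(dst, float)):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The right singular vector with the smallest singular value is the fit.
    _, _, vt = np.linalg.svd(np.asarray(rows))
    return vt[-1].reshape(3, 3)

def apply_homography(H, pts):
    """Map 2-D points through H using homogeneous coordinates."""
    pts = np.asarray(pts, float)
    mapped = np.hstack([pts, np.ones((len(pts), 1))]) @ H.T
    return mapped[:, :2] / mapped[:, 2:3]

def arm_residuals(src, dst):
    """Per-point distance left over after the best overall mapping."""
    H = fit_homography(src, dst)
    return np.linalg.norm(apply_homography(H, src) - np.asarray(dst, float), axis=1)

# Hypothetical outline points on one arm (arbitrary units).
arm_a = np.array([[0.0, 0.0], [1.0, 0.2], [2.0, 1.0], [2.5, 2.2],
                  [2.0, 3.5], [1.0, 4.0], [0.3, 3.0], [0.5, 1.5]])

# A second arm that is the same shape seen in a different plane.
true_H = np.array([[1.10, 0.10, 0.5],
                   [-0.05, 0.90, 0.2],
                   [0.02, 0.01, 1.0]])
arm_b = apply_homography(true_H, arm_a)

# The same second arm, but with one point displaced, like a bent arm.
arm_b_bent = arm_b.copy()
arm_b_bent[4] += [0.5, 0.0]

exact = arm_residuals(arm_a, arm_b)
bent = arm_residuals(arm_a, arm_b_bent)
print("identical shape, max residual:", exact.max())
print("one point bent, max residual:", bent.max())
```

When the two arms really are the same shape, the leftover distances are essentially zero; when one point is displaced, no single homography can absorb it and the residuals grow. A least-squares fit like this spreads the error over all points, which is the "best overall mapping" idea from the talk, as opposed to forcing the thumbs or the elbows to line up exactly.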