Hi Elijah, I'll try with an example of Inverse Distance Weighting with five points: p1, p2, p3, p4, and p5. Each of these points has a location and a measured value.
Cross validation would start by removing p1. It would then use p2, p3, p4, and p5 to predict the value of p1. In IDW, this means taking the weighted average of the values of p2 to p5 (weighted by inverse distance). This will result in some prediction (called the cross validation prediction) that can be compared to the measured value of p1.
Next, p2 would be removed, and p1, p3, p4, and p5 would be used to predict the value at the location of p2 (note that p1 is added back to the dataset after being cross validated). The same is done for p3, p4, and p5, each using the other four points. This would produce five cross validation errors (predicted minus measured) that would be used to calculate, among other things, the root mean square error of the IDW model.
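If it helps to see the leave-one-out steps above written out, here is a rough Python sketch. The coordinates, the measured values, the power of 2, and the function names idw_predict and loo_cross_validate are all made up for illustration; this is not how any particular software implements it.

```python
import math

def idw_predict(x, y, points, power=2.0):
    """Inverse-distance-weighted average of (px, py, value) tuples at (x, y).

    power=2.0 is an assumed default chosen only for this illustration.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for px, py, value in points:
        d = math.hypot(x - px, y - py)
        if d == 0.0:
            # Prediction location coincides with an input point.
            return value
        w = 1.0 / d ** power
        weighted_sum += w * value
        weight_total += w
    return weighted_sum / weight_total

def loo_cross_validate(points, power=2.0):
    """Remove each point in turn, predict it from the rest, and collect errors."""
    errors = []
    for i, (x, y, measured) in enumerate(points):
        others = points[:i] + points[i + 1:]  # every point except the i-th
        predicted = idw_predict(x, y, others, power)
        errors.append(predicted - measured)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return errors, rmse

# Hypothetical data for p1..p5: (x, y, measured value)
points = [
    (0.0, 0.0, 10.0),
    (1.0, 0.0, 12.0),
    (0.0, 1.0, 14.0),
    (1.0, 1.0, 16.0),
    (0.5, 0.5, 13.0),
]

errors, rmse = loo_cross_validate(points)
print("cross validation errors:", errors)
print("root mean square error:", rmse)
```

Each pass through the loop predicts one point from the other four, so the point being validated never contributes to its own prediction.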
But when actually making the prediction surface (after cross validation), all points are used to make the predictions. The surface also predicts values everywhere, including at the input point locations. So, what will it predict at, say, the location of p3? The prediction is the weighted average of all the points p1, p2, p3, p4, and p5, weighted by the inverse distance to p3. But the distance from p3 to itself is zero, which gives the value of p3 a weight of infinity. Once the weights are scaled to sum to one, p3 effectively gets all of the weight and the other four points get none. This forces the predicted value to be exactly equal to the measured value at p3. This is what makes IDW an "exact" interpolation method.
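To see the "weight of infinity" idea numerically, here is a small Python sketch that moves the prediction location closer and closer to p3 and prints the share of the total weight that p3 receives. The locations, the values, the power of 2, and the function name idw_weights are assumptions for illustration only. Since dividing by zero is not possible in code, the zero-distance case is handled explicitly by giving the coincident point all of the weight, which is the limit the other cases approach.

```python
import math

def idw_weights(x, y, locations, power=2.0):
    """Normalized IDW weights (summing to 1) for a prediction at (x, y).

    power=2.0 is an assumed value chosen only for this illustration.
    """
    dists = [math.hypot(x - px, y - py) for px, py in locations]
    if 0.0 in dists:
        # Zero distance: put all weight on the coincident point(s).
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        total = sum(hits)
        return [h / total for h in hits]
    raw = [1.0 / d ** power for d in dists]
    total = sum(raw)
    return [w / total for w in raw]

# Hypothetical locations and measured values for p1..p5 (p3 is index 2).
locations = [(0.0, 0.0), (1.0, 0.0), (0.5, 0.5), (0.0, 1.0), (1.0, 1.0)]
values = [10.0, 12.0, 20.0, 14.0, 16.0]

# Step toward p3 from the right; an offset of 0.0 is exactly at p3.
for offset in (0.1, 0.01, 0.001, 0.0):
    weights = idw_weights(0.5 + offset, 0.5, locations)
    prediction = sum(w * v for w, v in zip(weights, values))
    print(f"offset={offset}: weight on p3={weights[2]:.6f}, prediction={prediction:.4f}")
```

As the offset shrinks, the weight on p3 climbs toward 1 and the prediction converges to the measured value at p3.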
Please let me know if that is still not clear.
-Eric