I'll take a theoretical approach here. Good question! I would love to see real-world examples that prove or disprove my theory, so if you find any flaws, feel free to comment.
For any current encoder (let's take MPEG-4 AVC/H.264 as an example), frame rate does not matter as much as you might think. Let's assume there is no rate control and every picture is encoded with the same base QP (quantization parameter).
You are right about the following: the motion difference (as defined in ITU-T P.910, a good read) between two consecutive frames of a 48 fps video will be lower than for the same video at 24 fps, because consecutive frames won't differ as much from each other. Note that the total motion over time doesn't increase. In the end, an object moves from point A to B, so the total displacement is the same no matter how many frames per second you have; it is just split into more, shorter steps.
Since the encoder looks at the difference between two (or more) frames and only encodes the residual, it has less residual to code per picture. On average, this will be about half the residual. So you're right about that. (Keep in mind that half the residual does not mean half the data needed to store it. That depends on the coding algorithm implemented.)
Then again, you have twice as many pictures per second, which means that, on average, the amount of encoded information doubles again.
To summarize, not much changes on that side. The encoder does its best to encode all the motion in the video, which in total is the same (just in smaller steps). The only thing we have to add is the overhead from small residuals that can't be arithmetically coded efficiently.
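To make that argument concrete, here is a minimal sketch of a toy model in Python. It is not how a real encoder behaves: the function name, the even split of residual across frames, and all the numbers are assumptions picked purely for illustration.

```python
def inter_bits_per_second(fps,
                          motion_bits_per_second=1_000_000,
                          overhead_bits_per_frame=2_000):
    """Toy model of P-/B-picture cost (all values are hypothetical).

    The residual needed to describe one second of motion is assumed to be
    split evenly across the frames, and every frame adds a fixed overhead
    that stands in for small residuals that code inefficiently.
    """
    residual_per_frame = motion_bits_per_second / fps
    return fps * (residual_per_frame + overhead_bits_per_frame)


rate_24 = inter_bits_per_second(24)
rate_48 = inter_bits_per_second(48)

# Doubling the frame rate halves the residual per frame, so in this model
# the total only grows by the extra per-frame overhead.
print(f"24 fps: {rate_24:.0f} bits/s")
print(f"48 fps: {rate_48:.0f} bits/s")
print(f"ratio:  {rate_48 / rate_24:.3f}")
```

In this model the ratio stays close to 1 rather than 2, and how close depends entirely on the assumed per-frame overhead. Real content will deviate from it.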
The above only applies to B- or P-pictures, which depend on other pictures. However, every once in a while we have to insert an intra-coded picture, which doesn't depend on any other picture. If the rate of intra-coded pictures doesn't increase, we could assume linear growth in file size, maybe a bit more.
However, if you decrease the distance between intra-coded pictures to compensate for possible packet loss or bitstream errors, you carry more overhead, so the increase is more than linear: not by much, but probably noticeably.
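The effect of the intra-picture distance can be sketched the same way. Again, this is only a toy model under assumed numbers: the function name, the fixed intra-picture size and the per-frame overhead are hypothetical, not measurements from any encoder.

```python
def bits_per_second(fps,
                    keyframe_interval_s,
                    i_frame_bits=400_000,
                    motion_bits_per_second=1_000_000,
                    overhead_bits_per_frame=2_000):
    """Toy bitrate model with intra-coded pictures (hypothetical values).

    One intra picture is inserted every `keyframe_interval_s` seconds and
    is assumed to cost a fixed number of bits. All other pictures are
    inter-coded and share the per-second motion residual evenly.
    """
    i_frames = 1.0 / keyframe_interval_s   # intra pictures per second
    inter_frames = fps - i_frames          # remaining P-/B-pictures per second
    inter_frame_bits = motion_bits_per_second / fps + overhead_bits_per_frame
    return i_frames * i_frame_bits + inter_frames * inter_frame_bits


long_gop = bits_per_second(48, keyframe_interval_s=2.0)
short_gop = bits_per_second(48, keyframe_interval_s=1.0)

# Halving the distance between intra pictures replaces cheap inter
# pictures with expensive intra pictures, so the bitrate goes up.
print(f"intra picture every 2 s: {long_gop:.0f} bits/s")
print(f"intra picture every 1 s: {short_gop:.0f} bits/s")
```

The shorter interval costs more in this sketch as long as an intra picture is larger than an inter picture, which is the overhead described above.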