Windows Vista SP1 includes a number of enhancements over the original Vista release in the areas of application compatibility, device support, power management, security and reliability. You can see a detailed list of the changes in the Notable Changes in Windows Vista Service Pack 1 whitepaper. One of the improvements highlighted in the document is the increased performance of file copying in multiple scenarios, including local copies on the same disk, copying files from remote non-Windows Vista systems, and copying files between SP1 systems. How were these gains achieved? The answer is a complex one, and lies in the changes made to the file copy engine between Windows XP and Vista, and in the further changes made in SP1.
It’s an interesting article, and the author does a good job of explaining the difficulties that have to be overcome to make file copying fast in different scenarios.
Still, it’s about time they got copying right… Maybe they tried to make it too smart. I really believe in the KISS principle.
I couldn’t agree more. I can’t believe such a basic and core process can be so mangled.
I mean, it’s not as if they don’t have archives of copying routines dating back to the early DOS days to refer to.
When I had Vista, I would dual-boot into another OS just to move files around; that was kinda the trigger for not keeping Vista on the desktop. Don’t flame me – it’s still on the laptop.
Rebooting seems like a lot of effort. Seems like it would’ve been easier to use a different file copy tool, like Total Commander, or even the command line.
Oh, it being an OS problem, I didn’t think any tool could bypass it. Good to know for the laptop.
Vista SP1 removes the ‘Search’ feature from the right-click menu that is so useful when searching individual drives and folders. Does anyone know how to get it back?
I don’t have SP1, but try holding Ctrl, Shift, or Alt while right-clicking; that sometimes gives more options.
Use the search box in the top right corner…it took me a few minutes to “find” this as well since I’m so used to right clicking a folder to search it.
Thanks for the response. It’s nice to have an alternative to right-clicking, but this is another case of Microsoft fixing something that wasn’t broken and making the user relearn something they already know how to do. The OS shouldn’t be the star; it should be written for the convenience of the user.
It is funny how in Vista blogs you often see some of the most basic problems being discussed in pages and pages of writing, and then extreme detail on the convoluted way it was fixed. I remember one where an MS employee explained why the desktop background dialog couldn’t handle a certain file type when it worked fine in IE (or something like that). Then there are pages of detail about how brilliantly they managed to fix the problem.
These posts never show up in open source blogs. Firstly, because having a problem like that is ridiculous to begin with, and if one did pop up, the devs would quickly fix it in embarrassment instead of writing 3 pages trying to convince people that it really was an insurmountably hard problem and was only fixed through pure genius.
“… in pages and pages of writing … instead of writing 3 pages trying to convince people…”
Indeed, such an extremely mindbogglingly long 3 page article!
Don’t know about you, but I’m a developer, and there have been occasions where an apparently simple (probably stupid) bug took me (and my team) weeks to figure out. I won’t tag such problems as embarrassing, and I certainly won’t bury them. Instead, the standard practice of documenting them in detail is followed, and the documentation is made available to everyone internally, and sometimes to customers who were affected.
That ‘so very long’ blog post is not meant to be a matter of fun and enjoyment. Take a look at the comments and you’ll see that there is in fact a certain audience (some of them MS employees) who are interested in finding out what went on, appreciate the transparency, and are even asking for more information!
You’ve clearly never read any of the kernel mailing lists, have you? Read about the controversy surrounding the Completely Fair Scheduler (CFS). You can find arbitrarily arcane posts that go on and on explaining solutions to a problem that appears to be quite well and thoroughly solved in other operating systems.
You’ve also clearly not read Mark’s post. He explains that in XP they took a simplistic approach to file copying that maximized perceived performance and compatibility at the cost of actual performance in some cases. In Vista they attempted to improve file copy performance using some newer techniques, but their methods reduced the predictability of the copy and, most importantly, reduced the *perceived* performance. In SP1 they’ve gone back to a more XP-like solution, but attempted to apply some of the Vista optimizations only in certain situations. The result should be a significant boost in perceived performance, and some real performance boosts in certain specific file copy situations.
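If it helps make that concrete: as I read the post, the XP engine leaned on the cache manager (buffered I/O) while Vista went unbuffered and asynchronous for bigger files. Here’s a minimal Win32 sketch of my own (not Explorer’s or CopyFileEx’s actual code) of what that choice boils down to at the API level – the flag passed at open time decides which tradeoff you get.

/* Hypothetical illustration (not Microsoft's actual copy code) of the two
 * strategies contrasted above. Cached I/O *feels* fast because writes land
 * in RAM and get flushed later; unbuffered I/O reports honest progress and
 * avoids flooding the cache, but requires sector-aligned buffers/offsets. */
#include <windows.h>

/* XP-style: let the cache manager do the work, hint sequential access. */
HANDLE open_source_cached(const char *path)
{
    return CreateFileA(path, GENERIC_READ, FILE_SHARE_READ, NULL,
                       OPEN_EXISTING, FILE_FLAG_SEQUENTIAL_SCAN, NULL);
}

/* Vista-style for large copies: bypass the cache and go asynchronous.
 * Every read must then use a buffer whose size and file offset are
 * multiples of the volume sector size (page-aligned memory from
 * VirtualAlloc satisfies this on typical volumes). */
HANDLE open_source_unbuffered(const char *path)
{
    return CreateFileA(path, GENERIC_READ, FILE_SHARE_READ, NULL,
                       OPEN_EXISTING,
                       FILE_FLAG_NO_BUFFERING | FILE_FLAG_OVERLAPPED, NULL);
}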
It’s really not all that complex to understand. And I guarantee other operating systems face exactly the same sorts of tradeoffs.
Sure I have, I was subscribed for quite a while. And the tone of discourse is completely different. On the kernel mailing list people are much more pragmatic and blunt. There is no sugar-coating of results; if file copy performance had gone down the drain in a kernel release, there would be a storm of emails stating the regression and going on about how it was completely unacceptable. Eventually someone would show up with a patch, or the change would be reverted. I have no problem with technical posts on issues, even if they have been solved before. What I do have a problem with is the tone of these articles. The title is “Inside Vista SP1 File Copy Improvements”, which is accurate, but really it should be “Fixing the file copy snafu”. Then we see “Vista Improvements to File Copy” and “SP1 Improvements” instead of “Changes in Vista” and “What went wrong”, or something along those lines.
You might think I’m being incredibly pedantic here, but I just never get the impression that Microsoft blog posts are really honest. There is always a coat of marketing. Never admit to anything, make everything look good. “We vastly improved file copying but unfortunately there are some tiny problems” instead of “The new file copy engine, while necessary for reasons X, caused some serious, unacceptable performance regressions”.
Whatever you say, Mr. Assumption.
Oh bullshit. That’s the marketing coming out. Vista decreased file copy performance in a lot of cases. It’s not perceived, it’s real.
Right, and you forgot to mention that it could _halve_ performance when copying files from W2k3 over a network, and also slow down copying large files on the same volume. Since the latter is a very common use case for me, I wouldn’t really say it’s a big improvement.
OK. I didn’t read it, but Ctrl+F for “robocopy” does not produce a result.
Initial Vista release: Are you ready to WOW…
after SP1: …now…
after SP2: …?
The what? Duplicating and storing data is what computers do! How can you possibly make it so complicated that you need an entire “engine” to do it?
Please stop the planet, I wish to get off.
Obviously you’re not a golfer.
You’re out of your element, Donny.
I’m glad to hear that this is fixed. At first I thought I must have screwed up my installation somehow, so I decided to reinstall; the problem still persisted. Luckily, I don’t do *too* much copying in Vista, but enough to make me cringe at times.
Maybe the blog doesn’t do a good job of describing how Vista does copies, but it feels to me like they’ve put a lot of effort into it and still fallen short, because they really don’t seem to understand asynchronous I/O. For instance, they don’t seem to grok pipelining. Now, I’m a chip designer, so pipelining is second nature to me, but anyone who’s ever piped multiple commands together in UNIX should understand the concept.
So (and again, I could have misunderstood the blog) they manage to do some async I/O, but the thing that struck me as kinda WTF is that they seem to wait for ALL of the reads to complete before issuing the writes. Then they wait for ALL writes to complete before issuing more reads. Switching between read and write, especially over a network, is going to result in quite long dead-time periods when they’re doing no I/O, wasting all of the benefit of doing the async I/O. Keeping in mind that it’s important to issue writes IN ORDER (see the blog as to why), here’s how I would do it (rough code sketch after the lists below):
– Allocate some number of data buffers (let’s say 8, as an example)
– Begin by issuing an async read for each buffer
– Sleep until any read (or write) completes
– If the completed read is the lowest-numbered block not yet written, issue a write for that block (and for any other higher-numbered blocks that have already completed)
– If the completed read is not the lowest numbered block, go back to sleep
– If you wake up on a completed write, then the corresponding block is now free. Issue the next read request, pointing it to that memory block, and go back to sleep.
Note that there’s no reason why files have to be copied individually! The copy engine could be given multiple files and simply overlap their copy times, eliminating the need to turn on caching for small files. As one file’s reads complete, start requesting reads for the next file, even while the last file’s writes are still outstanding.
This has some nice effects:
– All reads are issued in order
– All writes are issued in order
– Aside from out-of-order completion, the maximum number of requests are always outstanding.
– This absorbs latency and gives the I/O scheduler more requests to reorder according to disk layout.
– While copying to/from the same disk only benefits slightly from the pipelining, copying between different devices (including network) is fully pipelined and asynchronous, making maximum use of available bandwidth.
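To make that less hand-wavy, here’s a rough Win32 C sketch of the buffer-ring idea for a single file. It’s my own code, not what Vista actually does, and it simplifies one step: instead of waking on whichever read finishes first, it always waits for the oldest outstanding read. Reads and writes are still both issued in order, and in steady state up to DEPTH reads and DEPTH writes are in flight at once.

/* pipeline_copy.c – hypothetical sketch of a pipelined overlapped-I/O copy.
 * Simplified variant of the scheme above: wait for the oldest read rather
 * than whichever read completes first.
 * Build: cl pipeline_copy.c (or any C99-capable Windows compiler). */
#include <windows.h>
#include <stdio.h>
#include <stdlib.h>

#define DEPTH   4                     /* reads kept in flight */
#define NBUF    (2 * DEPTH)           /* ring of data buffers */
#define BLKSIZE (1024 * 1024)         /* bytes per request    */

static void die(const char *msg)
{
    fprintf(stderr, "%s (err=%lu)\n", msg, GetLastError());
    exit(1);
}

/* Overlapped calls may also complete immediately; only a non-pending
 * failure is a real error. */
static void start_io(BOOL ok)
{
    if (!ok && GetLastError() != ERROR_IO_PENDING) die("I/O request failed");
}

static void set_offset(OVERLAPPED *ov, unsigned long long off, HANDLE ev)
{
    memset(ov, 0, sizeof *ov);
    ov->Offset     = (DWORD)(off & 0xFFFFFFFF);
    ov->OffsetHigh = (DWORD)(off >> 32);
    ov->hEvent     = ev;
}

int main(int argc, char **argv)
{
    if (argc != 3) { fprintf(stderr, "usage: %s <src> <dst>\n", argv[0]); return 1; }

    HANDLE src = CreateFileA(argv[1], GENERIC_READ, FILE_SHARE_READ, NULL,
                             OPEN_EXISTING, FILE_FLAG_OVERLAPPED, NULL);
    HANDLE dst = CreateFileA(argv[2], GENERIC_WRITE, 0, NULL,
                             CREATE_ALWAYS, FILE_FLAG_OVERLAPPED, NULL);
    if (src == INVALID_HANDLE_VALUE || dst == INVALID_HANDLE_VALUE) die("open failed");

    LARGE_INTEGER sz;
    if (!GetFileSizeEx(src, &sz)) die("GetFileSizeEx failed");
    unsigned long long total = (unsigned long long)sz.QuadPart;
    unsigned long long nblk  = (total + BLKSIZE - 1) / BLKSIZE;

    static char buf[NBUF][BLKSIZE];   /* the ring of data buffers */
    OVERLAPPED rd[NBUF], wr[NBUF];
    HANDLE rdev[NBUF], wrev[NBUF];
    for (int j = 0; j < NBUF; j++) {
        rdev[j] = CreateEvent(NULL, TRUE, FALSE, NULL);
        wrev[j] = CreateEvent(NULL, TRUE, FALSE, NULL);
    }

    /* Prime the pipeline: issue the first DEPTH reads. */
    for (unsigned long long b = 0; b < DEPTH && b < nblk; b++) {
        DWORD len = (DWORD)(b == nblk - 1 ? total - b * (unsigned long long)BLKSIZE : BLKSIZE);
        set_offset(&rd[b % NBUF], b * (unsigned long long)BLKSIZE, rdev[b % NBUF]);
        start_io(ReadFile(src, buf[b % NBUF], len, NULL, &rd[b % NBUF]));
    }

    for (unsigned long long b = 0; b < nblk; b++) {
        int   j = (int)(b % NBUF);
        DWORD got;

        /* Wait for the oldest outstanding read, then queue its write. */
        if (!GetOverlappedResult(src, &rd[j], &got, TRUE)) die("read failed");
        set_offset(&wr[j], b * (unsigned long long)BLKSIZE, wrev[j]);
        start_io(WriteFile(dst, buf[j], got, NULL, &wr[j]));

        /* Refill the pipeline. Block b+DEPTH reuses the buffer that block
         * b-DEPTH wrote from, so that write has to finish first. */
        unsigned long long next = b + DEPTH;
        if (next < nblk) {
            int k = (int)(next % NBUF);
            if (b >= DEPTH) {
                DWORD put;
                if (!GetOverlappedResult(dst, &wr[k], &put, TRUE)) die("write failed");
            }
            DWORD len = (DWORD)(next == nblk - 1 ? total - next * (unsigned long long)BLKSIZE : BLKSIZE);
            set_offset(&rd[k], next * (unsigned long long)BLKSIZE, rdev[k]);
            start_io(ReadFile(src, buf[k], len, NULL, &rd[k]));
        }
    }

    /* Drain the writes that are still in flight. */
    for (unsigned long long b = (nblk > NBUF ? nblk - NBUF : 0); b < nblk; b++) {
        DWORD put;
        if (!GetOverlappedResult(dst, &wr[b % NBUF], &put, TRUE)) die("write failed");
    }

    CloseHandle(src);
    CloseHandle(dst);
    return 0;
}

A real implementation would presumably also want FILE_FLAG_NO_BUFFERING (and the sector alignment that comes with it) for big files, per the blog, but that’s orthogonal to the pipelining itself.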
This all seems very basic to me. But I’m probably being arrogant, and I’m sure that there are practical issues involved, not the least of which is that this copy engine is running as a user process and has to, to an extent, work around “optimizations” employed by the I/O scheduler designed to make regular file access more efficient.
The only way to do better would be to move my above algorithm into the kernel so that it can employ zero-copy methods to reduce CPU overhead and bypass the VM entirely. This would be faster, but it’s probably not smarter. Among other reasons, it’s generally better to put infrequently-used (who spends all day doing nothing but copying?) things outside of the kernel. This keeps the kernel smaller and less prone to crash due to bugs.
Things are obviously pipelined over the network. It’s only important to do the batching of reads and writes when it’s the same drive.
Does anyone here update their machines with Microsoft hotfixes? Most of the complaints I had, and the problems described in this article, went away when I downloaded the hotfixes from support.microsoft.com a very LONG TIME AGO. Good article, about seven months late however…
Do people really wait for a service pack to fix something like this? If you’re doing anything serious with your computer, I doubt you would…
You folks in the press really promote bad perceptions and have let the Vista “haters” rule. I haven’t had this problem in months. Instead of uninstalling I just got the fix.
Uninstalling isn’t what I mean by fixing, either…
Given Microsoft’s history of OS releases, many folks wait until SP1 or SP2 to even consider installing a “new” MS OS. That way, all the early adopters can ride the bumpy road of MS making the OS usable, while the SP waiters can start with a (generally) working product.
$ cp /usr/local/src/WINDOWS_XP/filecopy.c /usr/local/src/WINDOWS_VISTA/filecopy.c
$ cd /usr/local/src/WINDOWS_VISTA
$ make
you forgot
$ make install
Searching files is really slow in XP; copying files is really slow with Vista, especially over the network. (I did not try file-searching-with-the-dog-which-sends-the-queries-to-Microsoft in Vista, since I don’t own a copy myself.)
But why can a product written by a small team, like Total Commander, accurately show the time remaining while copying, and search files really fast? This seems like a case of over-engineering. Microsoft, just cut the crap: track the bytes copied, copy the files as fast as possible, and use a linear function for the time remaining.
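For what it’s worth, the “linear function” I have in mind is nothing fancier than this back-of-the-envelope sketch (my own, not Total Commander’s actual code):

/* Hypothetical linear time-remaining estimate: assume the throughput seen
 * so far continues for the rest of the copy. Per-file overhead is ignored,
 * which is exactly what breaks this kind of estimate for lots of tiny files. */
double eta_seconds(unsigned long long bytes_done,
                   unsigned long long bytes_total,
                   double elapsed_seconds)
{
    if (bytes_done == 0 || elapsed_seconds <= 0.0)
        return -1.0;                                    /* no data yet, no estimate */
    double rate = (double)bytes_done / elapsed_seconds; /* bytes per second so far  */
    return (double)(bytes_total - bytes_done) / rate;
}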
I’ve found copying one 500MB file to be considerably faster than copying 50,000 10KB files, despite the size/bytes being the same for both cases. So size/bytes isn’t the sole factor in the speed of copying files, not by a long shot.