Author Topic: Relaxing Max Read/Write Chunk Size  (Read 22233 times)

Stephen

  • Newbie
  • *
  • Posts: 5
    • View Profile
Relaxing Max Read/Write Chunk Size
« on: March 19, 2014, 21:22:15 »
I searched around and know that the chunk size that gives the highest write performance is 32 KB to 1 MB, but the other fact is that 32 KB gives me 33 fragments for a 7 MB file, while 1 MB gives me only 2 fragments, in both synchronous and asynchronous mode.
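(For reference, fragment counts like these can be read back on Windows through the documented FSCTL_GET_RETRIEVAL_POINTERS ioctl, which returns a file's extent list. Below is a minimal sketch of such a counter, with most error handling omitted; it is only an illustration, not the exact tool used for the numbers above.)

Code: [Select]
// fragments.cpp - count the physical fragments (extents) of a file on NTFS
// via the documented FSCTL_GET_RETRIEVAL_POINTERS ioctl. Minimal sketch,
// most error handling omitted.
#include <windows.h>
#include <winioctl.h>
#include <cstdio>

int main(int argc, char** argv)
{
    if (argc < 2) { printf("usage: fragments <file>\n"); return 1; }

    HANDLE h = CreateFileA(argv[1], GENERIC_READ,
                           FILE_SHARE_READ | FILE_SHARE_WRITE, nullptr,
                           OPEN_EXISTING, 0, nullptr);
    if (h == INVALID_HANDLE_VALUE) { printf("cannot open file\n"); return 1; }

    STARTING_VCN_INPUT_BUFFER in = {};   // start at the first virtual cluster
    BYTE buf[4096];                      // holds a batch of extents per call
    int fragments = 0;
    LONGLONG expectedLcn = -1;           // where a contiguous run would continue

    for (;;) {
        DWORD bytes = 0;
        BOOL ok = DeviceIoControl(h, FSCTL_GET_RETRIEVAL_POINTERS,
                                  &in, sizeof(in), buf, sizeof(buf),
                                  &bytes, nullptr);
        if (!ok && GetLastError() != ERROR_MORE_DATA) break;

        RETRIEVAL_POINTERS_BUFFER* rp = (RETRIEVAL_POINTERS_BUFFER*)buf;
        LONGLONG vcn = rp->StartingVcn.QuadPart;
        for (DWORD i = 0; i < rp->ExtentCount; ++i) {
            LONGLONG lcn = rp->Extents[i].Lcn.QuadPart;
            LONGLONG next = rp->Extents[i].NextVcn.QuadPart;
            if (lcn != -1) {                         // -1 = sparse/compressed hole
                if (lcn != expectedLcn) ++fragments; // a new physical run starts
                expectedLcn = lcn + (next - vcn);
            }
            vcn = next;
        }
        if (!ok) { in.StartingVcn.QuadPart = vcn; continue; }  // more extents
        break;
    }
    printf("%d fragment(s)\n", fragments);
    CloseHandle(h);
    return 0;
}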

I try using synchronous mode to read and then write the whole buffer, even when the files are on different disks, hoping that this can reduce file fragmentation.

It helps most when the files being copied are written once and read many times, like program files.

Thank you.

Mathias (Author)

  • Administrator
  • VIP Member
  • *****
  • Posts: 4491
    • View Profile
    • Multi Commander
Re: Relaxing Max Read/Write Chunk Size
« Reply #1 on: March 20, 2014, 06:59:52 »
MC has no control over where the filesystem writes the data. MC just writes the data and the filesystem finds somewhere to put it.
If you get 33 fragments for a 7 MB file, then I guess your drive is very fragmented or very full. If you have an SSD, the fragments do not matter.

Quote from: Stephen on March 19, 2014, 21:22:15
I searched around and know that the chunk size that gives the highest write performance is 32 KB to 1 MB, but the other fact is that 32 KB gives me 33 fragments for a 7 MB file, while 1 MB gives me only 2 fragments, in both synchronous and asynchronous mode.

I try using synchronous mode to read and then write the whole buffer, even when the files are on different disks, hoping that this can reduce file fragmentation.
There is no difference between synchronous and asynchronous in how the system writes the file. The difference is that in asynchronous mode the file is copied by reading and writing at the same time, and that should only be used if the source and target are completely different drives.
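To illustrate what "reading and writing at the same time" means, here is a simplified sketch of an asynchronous copy loop: one thread reads chunks from the source into a bounded in-memory queue while another thread writes them to the target. This is only a conceptual illustration, not MC's actual implementation.

Code: [Select]
// async_copy.cpp - conceptual sketch of an "asynchronous" copy: one thread
// reads chunks into a bounded queue while another writes them out, so the
// source and target drives work at the same time. Not MC's actual code.
#include <condition_variable>
#include <fstream>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

static const size_t kChunkSize = 1 << 20;  // 1 MB per chunk
static const size_t kMaxQueued = 8;        // room for X chunks in memory

int main(int argc, char** argv)
{
    if (argc < 3) return 1;
    std::ifstream src(argv[1], std::ios::binary);
    std::ofstream dst(argv[2], std::ios::binary);
    if (!src || !dst) return 1;

    std::queue<std::vector<char>> queue;
    std::mutex m;
    std::condition_variable cv;
    bool done = false;

    // Reader thread: keep pulling chunks from the source while there is
    // room in the queue; signal "done" once the source is exhausted.
    std::thread reader([&] {
        for (;;) {
            std::vector<char> chunk(kChunkSize);
            src.read(chunk.data(), (std::streamsize)chunk.size());
            chunk.resize((size_t)src.gcount());
            bool last = chunk.empty();
            {
                std::unique_lock<std::mutex> lock(m);
                cv.wait(lock, [&] { return queue.size() < kMaxQueued; });
                if (last) done = true;
                else queue.push(std::move(chunk));
            }
            cv.notify_all();
            if (last) return;
        }
    });

    // Writer (main thread): drain the queue until the reader is done.
    for (;;) {
        std::vector<char> chunk;
        {
            std::unique_lock<std::mutex> lock(m);
            cv.wait(lock, [&] { return !queue.empty() || done; });
            if (queue.empty()) break;   // empty and done -> finished
            chunk = std::move(queue.front());
            queue.pop();
        }
        cv.notify_all();
        dst.write(chunk.data(), (std::streamsize)chunk.size());
    }
    reader.join();
    return 0;
}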


Quote from: Stephen on March 19, 2014, 21:22:15
It helps most when the files being copied are written once and read many times, like program files.
Not sure what you mean by that, but how many fragments a file gets has nothing to do with how many times you read it.

Not sure what you are asking or suggesting?

Stephen

  • Newbie
  • *
  • Posts: 5
    • View Profile
Re: Relaxing Max Read/Write Chunk Size
« Reply #2 on: March 20, 2014, 09:33:49 »
Quote from: Mathias on March 20, 2014, 06:59:52
MC has no control over where the filesystem writes the data. MC just writes the data and the filesystem finds somewhere to put it.
If you get 33 fragments for a 7 MB file, then I guess your drive is very fragmented or very full. If you have an SSD, the fragments do not matter.

What I am talking about is a mechanical hard disk, not an SSD; fragmentation has almost no effect on an SSD. Also, I had 20 GB of free space when my 160 GB hard disk wrote that 7 MB file.

Quote from: Mathias on March 20, 2014, 06:59:52
There is no difference between synchronous and asynchronous in how the system writes the file. The difference is that in asynchronous mode the file is copied by reading and writing at the same time, and that should only be used if the source and target are completely different drives.

Chunk size plus the sync/async write mode does affect fragmentation. In async mode, data is read and written on different devices at the same time. But while a piece of data is still being read from one hard disk, that piece cannot yet be written to the other hard disk. On a fast-rotating hard disk, the write head therefore misses its consecutive write position, because it has to wait until that piece of data has finished reading. The filesystem can then only find some other empty place to write that piece of data, and that causes fragmentation.

In sync mode, reading the whole file, or a big chunk of data, into memory and then writing it back in one go can help reduce fragmentation, because the filesystem then writes one big chunk of data without interleaving reads and writes.
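(As an illustration, here is a minimal sketch of this kind of synchronous copy: each large chunk is read completely before it is written, so reads and writes never interleave within a chunk. The 16 MB chunk size is just an example value.)

Code: [Select]
// sync_copy.cpp - minimal sketch of a "synchronous" copy: read one large
// chunk completely, then write it completely, so reads and writes never
// interleave within a chunk. 16 MB is just an example chunk size.
#include <fstream>
#include <vector>

int main(int argc, char** argv)
{
    if (argc < 3) return 1;
    std::ifstream src(argv[1], std::ios::binary);
    std::ofstream dst(argv[2], std::ios::binary);
    if (!src || !dst) return 1;

    const size_t kChunkSize = 16 << 20;           // 16 MB per pass
    std::vector<char> buf(kChunkSize);
    while (src) {
        src.read(buf.data(), (std::streamsize)buf.size());   // read phase
        std::streamsize got = src.gcount();
        if (got > 0) dst.write(buf.data(), got);             // write phase
    }
    return 0;
}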

Quote from: Mathias on March 20, 2014, 06:59:52
Not sure what you mean by that, but how many fragments a file gets has nothing to do with how many times you read it.

I mean that, on a mechanical hard disk, reducing fragmentation gives faster booting and faster file loading, because program files are usually read many times during system uptime. If you copy program files from one place to another, you want to keep them as close to 1 fragment per file as possible.

In "total commander" and "fastcopy", I can set both the buffer and chunk (I/O) size to 16MB or more instead of only 1MB maximum and I can always get 1 fragment per file with them. I hope I can also set maximum chunk size to 16MB or more in MC.

Hope you understand my poor English.
« Last Edit: March 20, 2014, 09:41:13 by Stephen »

Mathias (Author)

  • Administrator
  • VIP Member
  • *****
  • Posts: 4491
    • View Profile
    • Multi Commander
Re: Relaxing Max Read/Write Chunk Size
« Reply #3 on: March 20, 2014, 10:59:08 »
Quote from: Stephen on March 20, 2014, 09:33:49
I do not think sync/async really matters for fragmentation, since the filesystem does not know the location of the drive head.
The filesystem does not search for free blocks on the disk while writing (that would be very, very slow). Simplified: it keeps an in-memory database of which filesystem blocks are free. The filesystem might also be located on a RAID system, where you can have 10 drive heads depending on the configuration.
The filesystem does not know about the drive layout. It only knows that it can write data from block x to block y.

However, you are correct that chunk size matters, because with larger chunks there is a better chance for the filesystem to find a contiguous range that the whole chunk of data fits into. The filesystem will try to keep the data contiguous, but if free space is low or the drive is fragmented, it can be hard for it to find a block right next to the previous one. Fragmentation does not need to be super bad; it depends on what order the fragments are in, but fewer fragments are always good.
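To make the point concrete, here is a toy model of such an allocator working from an in-memory free-block bitmap, a deliberately simplified sketch and not how any real filesystem is implemented. On the same artificially fragmented "disk", allocating a 7 MB file in 1 MB requests typically ends up in far fewer extents than allocating it in 32 KB requests (the exact counts depend on the random layout):

Code: [Select]
// toy_alloc.cpp - toy first-fit block allocator over a free-block bitmap,
// to show why larger allocation requests tend to produce fewer fragments.
// Purely illustrative; real filesystem allocators are far smarter.
#include <cstdio>
#include <cstdlib>
#include <vector>

struct Run { size_t start, len; };

// First-fit: take `need` blocks from the first free run big enough to hold
// them all; if no run is big enough, take the largest run and retry with
// the remainder. Each run taken is recorded in `runs`.
static void allocate(std::vector<bool>& freeMap, size_t need, std::vector<Run>& runs)
{
    while (need > 0) {
        size_t bestStart = 0, bestLen = 0;
        for (size_t i = 0; i < freeMap.size();) {       // scan the free runs
            if (!freeMap[i]) { ++i; continue; }
            size_t j = i;
            while (j < freeMap.size() && freeMap[j]) ++j;
            size_t len = j - i;
            if (len >= need) { bestStart = i; bestLen = need; break; }
            if (len > bestLen) { bestStart = i; bestLen = len; }
            i = j;
        }
        if (bestLen == 0) return;                       // "disk" is full
        for (size_t k = 0; k < bestLen; ++k) freeMap[bestStart + k] = false;
        runs.push_back({bestStart, bestLen});
        need -= bestLen;
    }
}

// Count fragments: runs that continue exactly where the previous one ended
// merge into a single extent.
static int fragments(const std::vector<Run>& runs)
{
    int count = 0;
    size_t prevEnd = (size_t)-1;
    for (const Run& r : runs) {
        if (r.start != prevEnd) ++count;
        prevEnd = r.start + r.len;
    }
    return count;
}

int main()
{
    // 50000 blocks of 4 KB. Fragment the first half of two identical "disks"
    // by marking random blocks used, leaving scattered small holes there and
    // one large free region in the second half.
    std::vector<bool> diskA(50000, true);
    srand(42);
    for (int i = 0; i < 15000; ++i) diskA[rand() % 25000] = false;
    std::vector<bool> diskB = diskA;

    const size_t fileBlocks = 7 * 256;                  // a 7 MB file
    std::vector<Run> runsA, runsB;
    for (size_t b = 0; b < fileBlocks; b += 8)          // 32 KB chunks
        allocate(diskA, 8, runsA);
    for (size_t b = 0; b < fileBlocks; b += 256)        // 1 MB chunks
        allocate(diskB, 256, runsB);

    // Small requests fit into the scattered small holes and spread out;
    // large requests skip them and land in the big contiguous region.
    printf("32 KB chunks -> %d fragment(s)\n", fragments(runsA));
    printf("1 MB chunks  -> %d fragment(s)\n", fragments(runsB));
    return 0;
}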
Windows will try to auto-defragment data when the computer is idle and make sure files are consistent (not on XP).

However, a chunk size that is too large is not good for write performance (or I/O performance): if you write chunks that are larger than the drive cache, the OS driver for the hard drive will have to stall and wait before the data can be sent to it.

In MC you can tweak the chunk size. There is no max limit for the chunk size in MC itself, but there is one in the settings UI (a bug), so for now one has to hand-edit the config file to get around that.
But that will be fixed.

But one thing that will make more of a difference is the option to preallocate the target file. Most filesystems support that now, and that is an option that is coming. It will tell the filesystem, before the data is written, that the file is for example 100 MB, and the filesystem will create an empty 100 MB file directly that the data is then written into. (Since it gets a hint of the final size, the filesystem can find a large enough block range for the entire file right away.)
However, there are some issues with this on some filesystems and network locations, so this option will not be on by default.
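(For the curious, preallocation on Windows is typically done by setting the end-of-file position to the final size before writing. The sketch below shows that standard Win32 pattern; it is not MC's actual code, and error handling is trimmed.)

Code: [Select]
// prealloc_copy.cpp - standard Win32 preallocation pattern: set the target
// file's size before writing, so the filesystem can reserve a contiguous
// block range up front. A sketch, not MC's actual code; errors are ignored.
#include <windows.h>

bool CopyWithPreallocation(const wchar_t* srcPath, const wchar_t* dstPath)
{
    HANDLE src = CreateFileW(srcPath, GENERIC_READ, FILE_SHARE_READ, nullptr,
                             OPEN_EXISTING, FILE_FLAG_SEQUENTIAL_SCAN, nullptr);
    HANDLE dst = CreateFileW(dstPath, GENERIC_READ | GENERIC_WRITE, 0, nullptr,
                             CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, nullptr);
    if (src == INVALID_HANDLE_VALUE || dst == INVALID_HANDLE_VALUE) return false;

    LARGE_INTEGER size = {};
    GetFileSizeEx(src, &size);

    // Preallocate: move the file pointer to the final size and set EOF there.
    // The filesystem now knows the full size and can pick one contiguous range.
    SetFilePointerEx(dst, size, nullptr, FILE_BEGIN);
    SetEndOfFile(dst);
    LARGE_INTEGER zero = {};
    SetFilePointerEx(dst, zero, nullptr, FILE_BEGIN);   // rewind for writing

    // Then copy as usual, in large chunks.
    const DWORD kChunk = 1 << 20;                       // 1 MB per pass
    BYTE* buf = (BYTE*)VirtualAlloc(nullptr, kChunk, MEM_COMMIT, PAGE_READWRITE);
    DWORD got = 0;
    while (ReadFile(src, buf, kChunk, &got, nullptr) && got > 0) {
        DWORD put = 0;
        WriteFile(dst, buf, got, &put, nullptr);
    }
    VirtualFree(buf, 0, MEM_RELEASE);
    CloseHandle(src);
    CloseHandle(dst);
    return true;
}

int wmain(int argc, wchar_t** argv)
{
    return (argc >= 3 && CopyWithPreallocation(argv[1], argv[2])) ? 0 : 1;
}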

Sync/async is mostly a performance option. When copying between 2 physical drives, async is faster, since reads do not have to wait until writes have finished.
There is room for X number of chunks in memory, and if the buffer is large enough and the writes are slow, the buffer can end up holding the entire file.

Stephen

  • Newbie
  • *
  • Posts: 5
    • View Profile
Re: Relaxing Max Read/Write Chunk Size
« Reply #4 on: March 20, 2014, 15:35:57 »
Quote from: Mathias on March 20, 2014, 10:59:08
I do not think sync/async really matters for fragmentation, since the filesystem does not know the location of the drive head.
The filesystem does not search for free blocks on the disk while writing (that would be very, very slow). Simplified: it keeps an in-memory database of which filesystem blocks are free. The filesystem might also be located on a RAID system, where you can have 10 drive heads depending on the configuration.
The filesystem does not know about the drive layout. It only knows that it can write data from block x to block y.

On a high level you are correct: the filesystem only knows read -> buffer -> write. In a RAID system, except for mirroring, data is striped across different physical hard disks, which are treated as one single logical drive from the filesystem's point of view, so each physical drive only has to handle part of the whole data. That is what a high-performance system does. But when all you have is one fast 4+ core desktop, one not very decent Core 2 Duo laptop with a somewhat slow south bridge, and a very slow first-generation Intel Atom netbook, you will understand why I put so much stress on fragmentation.

Quote from: Mathias on March 20, 2014, 10:59:08
However, you are correct that chunk size matters, because with larger chunks there is a better chance for the filesystem to find a contiguous range that the whole chunk of data fits into. The filesystem will try to keep the data contiguous, but if free space is low or the drive is fragmented, it can be hard for it to find a block right next to the previous one. Fragmentation does not need to be super bad; it depends on what order the fragments are in, but fewer fragments are always good.

I rarely run low on free space, and I try to avoid it.

Quote from: Mathias on March 20, 2014, 10:59:08
Windows will try to auto-defragment data when the computer is idle and make sure files are consistent (not on XP).

I have both XP and Windows 7 running, and I turned the automatic defragmentation task off on Windows 7 because the hard disk becomes unresponsive while files are being defragged, when the computer "thinks" it is idle but I am still there working (just not touching the mouse and keyboard while reading and thinking). I manually defrag every file and folder using WinContig. Also, I think you are talking about CRC checking during defragmentation to make sure files are consistent.

Quote from: Mathias on March 20, 2014, 10:59:08
However, a chunk size that is too large is not good for write performance (or I/O performance): if you write chunks that are larger than the drive cache, the OS driver for the hard drive will have to stall and wait before the data can be sent to it.

I can wait several seconds or even minutes more. I do other things (which access other physical drives) while files are being written, in order to make sure they are not fragmented.

Quote from: Mathias on March 20, 2014, 10:59:08
In MC you can tweak the chunk size. There is no max limit for the chunk size in MC itself, but there is one in the settings UI (a bug), so for now one has to hand-edit the config file to get around that.
But that will be fixed.

This is what I need. I'll check that out. :)

Quote from: Mathias on March 20, 2014, 10:59:08
But one thing that will make more of a difference is the option to preallocate the target file. Most filesystems support that now, and that is an option that is coming. It will tell the filesystem, before the data is written, that the file is for example 100 MB, and the filesystem will create an empty 100 MB file directly that the data is then written into. (Since it gets a hint of the final size, the filesystem can find a large enough block range for the entire file right away.)
However, there are some issues with this on some filesystems and network locations, so this option will not be on by default.

That is good news!

Thank you for letting me know that the MC team is so energetic and advancing on all fronts. I am glad I chose MC as my file manager.