Cache behavior in dd command

I am running some dd writes and watching vmstat in parallel:



With no direct write (i.e. going through the page cache):



$ dd if=/dev/urandom of=somefile.txt bs=1M count=200 
200+0 records in
200+0 records out
209715200 bytes (210 MB, 200 MiB) copied, 1,31332 s, 160 MB/s



$ vmstat 1 1000
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
0 0 64 2659604 383244 3596884 0 0 3 47 34 14 27 7 66 0 0
1 0 64 2509432 383244 3746080 0 0 0 0 1005 2278 5 20 75 0 0
0 0 64 2452560 383248 3807932 0 0 4 204880 1175 2321 4 12 75 9 0
0 0 64 2453144 383248 3807548 0 0 0 0 814 2677 5 2 93 0 0
1 0 64 2444868 383248 3814516 0 0 0 244 529 1746 4 2 94 0 0
0 0 64 2445756 383248 3814516 0 0 0 0 495 1957 3 1 96 0 0


I see that it performed more or less a single bulk write.
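
A minimal way to watch this more directly (just a sketch; the 0.2 s interval is arbitrary, and Dirty/Writeback are the standard /proc/meminfo counters) is to follow the dirty-page and writeback counters while the buffered dd runs in another terminal:

$ # Dirty grows while dd fills the page cache; Writeback spikes when it is flushed
$ while sleep 0.2; do grep -E '^(Dirty|Writeback):' /proc/meminfo; echo ---; done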



With direct write (oflag=direct):



$ dd if=/dev/urandom of=somefile.txt bs=1M count=200 oflag=direct
200+0 records in
200+0 records out
209715200 bytes (210 MB, 200 MiB) copied, 1,6902 s, 124 MB/s



$ vmstat 1 1000
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
0 0 64 2623556 383248 3603572 0 0 3 47 35 14 27 7 66 0 0
1 0 64 2613784 383248 3611220 0 0 0 88064 1001 2573 5 15 79 1 0
0 0 64 2612236 383256 3611804 0 0 8 116736 912 2033 1 18 78 3 0
4 0 64 2621076 383256 3604232 0 0 0 96 1086 3250 8 3 89 0 0


My questions are the following:



How does this fragmentation of the direct write operation occur?
What is the (kernel?) parameter that determines that the file should be broken (when not going through the cache) into 2 chunks of 88 and 116 MB (and not, say, into 4 chunks of 50 MB)?
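
For context, these are the knobs I am aware of (a sketch; the sysctls govern the buffered, page-cache writeback path, and max_sectors_kb caps the size of a single block request, but I don't know whether either one explains the split above; sda is just a placeholder device name):

$ # Dirty-page / writeback tunables for the buffered path
$ sysctl vm.dirty_background_ratio vm.dirty_ratio vm.dirty_expire_centisecs vm.dirty_writeback_centisecs
$ # Per-device cap on the size of a single block request
$ cat /sys/block/sda/queue/max_sectors_kb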



Would the results be different if I had used oflag=sync instead of oflag=direct?
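
For reference, these are the variants I have in mind (a sketch; conv=fsync is added only for comparison, and the output file name is the same placeholder as above):

$ # O_DIRECT: bypass the page cache entirely
$ dd if=/dev/urandom of=somefile.txt bs=1M count=200 oflag=direct
$ # O_SYNC: still goes through the page cache, but each write() returns only after the data has reached the device
$ dd if=/dev/urandom of=somefile.txt bs=1M count=200 oflag=sync
$ # Buffered writes followed by a single fsync() before dd exits
$ dd if=/dev/urandom of=somefile.txt bs=1M count=200 conv=fsync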

memory dd io cache

asked Dec 15 at 7:50, edited Dec 15 at 8:06 – pkaramol

  • You are using vmstat to get the stats output once every second (for 1000 seconds); why do you give significance to the fact that in one run of dd all the data was written more or less during one such interval, while in the other run the interval ended partway through the operation?
    – Kusalananda
    Dec 15 at 8:42










  • Because after multiple experiments this behavior seems consistent. After repeating it with a 1G output, using the direct flag seems to break it into more or less 10 writes of roughly 100 MB each, while not using it caused 3-4 writes (albeit uneven in size). What is more, not using the direct flag causes the writes to be done immediately (in terms of the bo values appearing right away in the vmstat output).
    – pkaramol
    Dec 15 at 8:47












  • If you are really interested in why the kernel does things this way, I'd (1) read the kernel code, (2) use kernel tracing, e.g. perf or ftrace. I wouldn't expect the high-level behaviour you observe to have a simple explanation or a single parameter (but I don't know).
    – dirkt
    Dec 15 at 9:28
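
A minimal tracing sketch along those lines (assuming blktrace and perf are installed; /dev/sda and the 5-second window are placeholders): both the blktrace output and the block:block_rq_issue tracepoint show the size of each request actually handed to the device, which is finer-grained than the 1-second vmstat totals.

$ # Watch individual block requests while dd runs in another terminal
$ sudo blktrace -d /dev/sda -o - | blkparse -i -
$ # Or sample the block-layer tracepoint system-wide with perf
$ sudo perf record -e block:block_rq_issue -a -- sleep 5
$ sudo perf script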