Cache behavior in dd command
I am performing some dd writes and running vmstat in parallel:
With no direct write (i.e. going through the page cache):
$ dd if=/dev/urandom of=somefile.txt bs=1M count=200
200+0 records in
200+0 records out
209715200 bytes (210 MB, 200 MiB) copied, 1,31332 s, 160 MB/s
$ vmstat 1 1000
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
0 0 64 2659604 383244 3596884 0 0 3 47 34 14 27 7 66 0 0
1 0 64 2509432 383244 3746080 0 0 0 0 1005 2278 5 20 75 0 0
0 0 64 2452560 383248 3807932 0 0 4 204880 1175 2321 4 12 75 9 0
0 0 64 2453144 383248 3807548 0 0 0 0 814 2677 5 2 93 0 0
1 0 64 2444868 383248 3814516 0 0 0 244 529 1746 4 2 94 0 0
0 0 64 2445756 383248 3814516 0 0 0 0 495 1957 3 1 96 0 0
I see that it performed more or less one bulk write.
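That single bulk flush is the page cache being written back by the kernel rather than by dd itself. As a rough way to watch this (a minimal sketch; the Dirty/Writeback fields of /proc/meminfo are standard on Linux, but watch may need to be installed), one can sample the dirty-page counters in a second terminal while the cached dd run is in progress:
$ watch -n 0.5 "grep -E '^(Dirty|Writeback):' /proc/meminfo"
During the cached run the Dirty value should grow by roughly 200 MB and then drop when writeback kicks in; with oflag=direct it should stay close to its baseline.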
With direct write:
$ dd if=/dev/urandom of=somefile.txt bs=1M count=200 oflag=direct
200+0 records in
200+0 records out
209715200 bytes (210 MB, 200 MiB) copied, 1,6902 s, 124 MB/s
$ vmstat 1 1000
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
0 0 64 2623556 383248 3603572 0 0 3 47 35 14 27 7 66 0 0
1 0 64 2613784 383248 3611220 0 0 0 88064 1001 2573 5 15 79 1 0
0 0 64 2612236 383256 3611804 0 0 8 116736 912 2033 1 18 78 3 0
4 0 64 2621076 383256 3604232 0 0 0 96 1086 3250 8 3 89 0 0
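For reference, one way to see how dd itself submits the data in the direct case is to count its write() calls; with bs=1M it should issue roughly one write per 1 MiB block, and vmstat's bo column then just reports whatever completed within each 1-second sample. A sketch (assumes strace is installed; syscall names may vary slightly by platform):
$ strace -c -e trace=write dd if=/dev/urandom of=somefile.txt bs=1M count=200 oflag=direct
The summary should show on the order of 200 write calls of 1 MiB each, plus a few small writes for dd's own status messages.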
My questions are the following:
How does this fragmentation of the direct write operation occur?
What (kernel?) parameter determines that the data, when not passing through the cache, is written out in 2 chunks of roughly 88 and 116 MB (and not, say, in 4 chunks of 50 MB)?
Would the results be different if I had used oflag=sync instead of oflag=direct?
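For reference, flushing of cached data is governed by the vm.dirty_* sysctls, and dd exposes a few different flush-related options; a sketch of what one might inspect and compare (the sysctl names and dd flags are the standard Linux/GNU coreutils ones, but the values and resulting timings are system-specific):
$ sysctl vm.dirty_ratio vm.dirty_background_ratio vm.dirty_expire_centisecs vm.dirty_writeback_centisecs
# write through the page cache, flushed later by the kernel's writeback threads
$ dd if=/dev/urandom of=somefile.txt bs=1M count=200
# O_DIRECT: bypass the page cache entirely
$ dd if=/dev/urandom of=somefile.txt bs=1M count=200 oflag=direct
# O_SYNC: still uses the cache, but each write() returns only once the data has reached the device
$ dd if=/dev/urandom of=somefile.txt bs=1M count=200 oflag=sync
# cached write with a single fsync() at the end
$ dd if=/dev/urandom of=somefile.txt bs=1M count=200 conv=fsync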
Tags: memory, dd, io, cache
You are using vmstat to get the stats output once every second (for 1000 seconds), so why do you give significance to the fact that in one run of dd all data was written more or less during one such interval, while in the other run the interval ended partway through the operation? – Kusalananda, Dec 15 at 8:42
Because after multiple experiments this behavior seems consistent: after repeating it with a 1G output, using the direct flag seems to break it into more or less 10 writes of 100 MB each, while not using it caused 3-4 writes (albeit uneven in size); what is more, not using the direct flag causes the writes to appear immediately (in terms of the bo column showing up right away in the vmstat output). – pkaramol, Dec 15 at 8:47
If you are really interested in why the kernel does things this way, I'd (1) read kernel code, (2) use kernel tracing, e.g. perf or ftrace. I wouldn't expect the high-level behaviour you observe to have a simple explanation or a single parameter (but I don't know). – dirkt, Dec 15 at 9:28
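Following up on the tracing suggestion: a minimal sketch of how one could watch the actual block requests reaching the device, either with perf's block tracepoints or with blktrace (assumes perf and blktrace are installed, and /dev/sdX is a placeholder for the device backing the filesystem):
$ sudo perf record -e 'block:block_rq_issue' -a -- dd if=/dev/urandom of=somefile.txt bs=1M count=200 oflag=direct
$ sudo perf script | head
# or, capture and decode the block-layer trace directly
$ sudo blktrace -d /dev/sdX -o - | blkparse -i -
The request sizes and timestamps in the trace show how the 1 MiB writes are split or merged on their way to the device, which is more precise than the 1-second granularity of vmstat.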