Add lines to files to make them equal length

I have a bunch of .csv files with N columns and different numbers of rows (lines). I would like to append as many empty lines ;...; (N-1 semicolons, as in the example below) as needed to make them all the same length. I can get the length of the longest file manually, but it would also be good to get this done automatically.

For example:

I have,

file1.csv
128; pep; 93; 22:22:10; 3; 11
127; qep; 93; 12:52:10; 3; 15
171; pep; 73; 22:26:10; 3; 72

file2.csv
128; pep; 93; 22:22:10; 3; 11
127; qep; 93; 12:52:10; 3; 15
121; fng; 96; 09:42:10; 3; 52
141; gep; 53; 21:22:10; 3; 62
171; pep; 73; 22:26:10; 3; 72
221; ahp; 93; 23:52:10; 3; 892

file3.csv
121; fng; 96; 09:42:10; 3; 52
171; pep; 73; 22:26:10; 3; 72
221; ahp; 93; 23:52:10; 3; 892
141; gep; 53; 21:22:10; 3; 62

I need,

file1.csv
128; pep; 93; 22:22:10; 3; 11
127; qep; 93; 12:52:10; 3; 15
171; pep; 73; 22:26:10; 3; 72
;;;;;
;;;;;
;;;;;

file2.csv
128; pep; 93; 22:22:10; 3; 11
127; qep; 93; 12:52:10; 3; 15
121; fng; 96; 09:42:10; 3; 52
141; gep; 53; 21:22:10; 3; 62
171; pep; 73; 22:26:10; 3; 72
221; ahp; 93; 23:52:10; 3; 892

file3.csv
121; fng; 96; 09:42:10; 3; 52
171; pep; 73; 22:26:10; 3; 72
221; ahp; 93; 23:52:10; 3; 892
141; gep; 53; 21:22:10; 3; 62
;;;;;
;;;;;

asked yesterday by myradio, edited yesterday by Jeff Schaller

1

A simple (but probably not optimal) way to do it would be to use wc to count the lines of each file and find the max. You can then echo ";;;;" >> file for each file until its line count reaches the max.
– Bear'sBeard
yesterday

1

Why do you want the files to have the same number of lines? Maybe there is a good method, where you can use the files as they are (with their different number of lines).
– sudodus
yesterday

@Bear'sBeard Yep, something like that did it, I was looking for a more compact way.
– myradio
yesterday

@sudodus Well, there're people before and after me in the pipeline, things must match certain formats...
– myradio
yesterday



shell-script text-processing awk files csv


3 Answers


Thanks @Sparhawk for the suggestions in the comments; I have updated the script based on them:

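#!/bin/bash

# Padding row; quoted so the shell does not treat ';' as a command separator.
emptyLine=';;;;;;;'
# Line count of every matching file (the last line of the wc output is the total).
rr=($(wc -l files*pattern.txt | awk '{print $1}' | sed '$ d'))
# Print one count per line so that sort can pick the largest.
max=$(printf '%s\n' "${rr[@]}" | sort -nr | head -n1)
for name in files*pattern.txt; do
    lineNumber=$(wc -l < "$name")
    let missing=max-lineNumber
    for ((i=0; i<missing; i++)); do
        echo "$emptyLine" >> "$name"
    done
done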

Well, it is neither elegant nor efficient. In fact, it takes a couple of seconds, which sounds like an eternity given the small size of the data. Nevertheless, it works:

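#!/bin/bash

# Padding row; quoted so the shell does not treat ';' as a command separator.
emptyLine=';;;;;;;'
rr=($(wc -l files*pattern.txt | awk '{print $1}' | sed '$ d'))
# Print one count per line so that sort can pick the largest.
max=$(printf '%s\n' "${rr[@]}" | sort -nr | head -n1)
# Parsing ls breaks on file names containing whitespace.
for name in $(ls files*pattern.txt); do
    lineNumber=$(cat "$name" | wc -l)
    let missing=max-lineNumber
    for ((i=0; i<missing; i++)); do
        echo "$emptyLine" >> "$name"
    done
done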

I just put this script in the directory where I have the files, provided that there is a pattern I can use to list them, here files*pattern.txt.
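As a rough sketch of the same idea without arrays or a separate sort step, the maximum can be tracked while looping over the files, and all missing rows for a file can be appended through a single redirection. The function name pad_files and its argument order are assumptions made for this example only; they are not part of the script above.

```shell
# pad_files PAD FILE...
# Append PAD rows to each FILE until all of them are as long as the longest.
# (pad_files is a hypothetical helper; its variables are global, not local.)
pad_files() {
    pad=$1
    shift
    max=0
    # First pass: find the largest line count.
    for f in "$@"; do
        n=$(( $(wc -l < "$f") ))   # arithmetic expansion strips any padding blanks
        [ "$n" -gt "$max" ] && max=$n
    done
    # Second pass: append the missing rows, opening each file only once.
    for f in "$@"; do
        n=$(( $(wc -l < "$f") ))
        while [ "$n" -lt "$max" ]; do
            printf '%s\n' "$pad"
            n=$((n + 1))
        done >> "$f"
    done
}
```

It could then be called as pad_files ';;;;;;;' files*pattern.txt. Because the files are passed as arguments from a glob, no command output containing file names has to be parsed.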

answered yesterday by myradio, edited 16 hours ago

1

Nice one (+1)! Thanks for posting up the solution. One small note, don't parse ls, just use for name in files*pattern.txt; do instead.
– Sparhawk
yesterday

1

And while I'm nit-picking, there's a "useless use of cat" there too. Just do wc -l $name
– Sparhawk
yesterday

2

@Sparhawk: I think you meant wc -l < $name
– Thor
yesterday

@Thor No? wc [OPTION]... [FILE]... works too, as per the man page. In fact, this script uses this construction in an earlier line.
– Sparhawk
yesterday

@Sparhawk: Sure, but if you want output equivalent to cat file | wc -l, redirection is the way to go.
– Thor
17 hours ago
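The difference discussed in these comments can be seen with a small sketch. It assumes a scratch file created with mktemp purely for the demonstration; the variable names are made up for this example.

```shell
# Scratch file with three lines, used only for this demonstration.
tmp=$(mktemp)
printf 'a\nb\nc\n' > "$tmp"

with_name=$(wc -l "$tmp")      # count followed by the file name
count_only=$(wc -l < "$tmp")   # count only: wc reads stdin and never sees a name

# Some wc implementations pad the count with leading blanks;
# arithmetic expansion normalises it to a plain number.
count_only=$((count_only))

echo "$with_name"
echo "$count_only"
rm -f "$tmp"
```

With a file name argument the result still has to be cut down to its first field before it can be used in arithmetic, which is why the redirection form is the closer replacement for cat file | wc -l.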



An improvement of @myradio's answer.

The part inside the loop is written in awk, which should be much faster.

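max=$(wc -l file*.csv | sed '$ d' | sort -n | tail -n1 | awk '{print $1}')
for f in file*.csv; do
    awk -F';' -v max="$max" \
      'END {
         # NF here is the field count of the last record read.
         # Build a row of NF-1 separators: pad with blanks, then swap in FS.
         s = sprintf("%*s", NF-1, "");
         gsub(/ /, FS, s);
         for (i = NR; i < max; i++)
           print s;
       }' "$f" >> "$f"
done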

With -F you set the correct field separator of your files (here -F';').

The s=sprintf();gsub(); part dynamically builds a row with the right number of FS (= field separator) characters; the technique comes from a linked answer.

You could simply replace that with print ";;;;;" or other static content if you like.
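If the awk at hand does not accept the %*s dynamic width, a plain loop is one way to build the same padding row. This is only a sketch: make_pad is a hypothetical helper name, and it assumes that the first line of the file has the full number of fields.

```shell
# make_pad FILE
# Print a row made of one separator per field boundary in the first line of FILE.
# (make_pad is a hypothetical helper; ';' is assumed as the separator.)
make_pad() {
    awk -F';' 'NR == 1 {
        s = ""
        for (i = 1; i < NF; i++)   # NF fields need NF-1 separators
            s = s FS
        print s
        exit
    }' "$1"
}
```

For the six-column sample rows from the question, make_pad file1.csv prints ;;;;; and the result can be stored in a shell variable and appended as often as needed.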

answered yesterday by RoVo, edited 15 hours ago

I like this solution. It's certainly harder to read, but it is good that it has a dynamic FS. Nevertheless, two things: 1. About the efficiency, I don't know what your time results mean, because this depends on the number and size of the files. 2. I actually wanted to try this to compare the time results with my version, but I ran into a problem: awk complains about gensub being undefined. Is gensub maybe only in gawk?
– myradio
16 hours ago

yeah that seems to be GNU Awk. I replaced it with the gsub solution from the linked answer.
– RoVo
15 hours ago

About the time results, I think I misunderstood the statement "it takes a couple of seconds" from your answer to be the time needed to process the example files from your question ... I removed that.
– RoVo
15 hours ago



In order to count the lines in each file only once:

wc -l *csv | sort -nr | sed 1d | {
    read -r max file
    pad=$(sed q "$file" | tr -cd ";")  # extract separators from first record
    while read -r lines file; do
        while [ $((lines+=1)) -le "$max" ]; do
            echo "$pad" >> "$file"
        done
    done
}

Note that any newlines in the filenames will cause problems for both sort and the while read loop, but they can handle filenames containing normal spaces.
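Whichever variant is used, a quick check that the files really ended up with the same number of lines might look like the sketch below. The helper name all_same_length is an assumption made for this example; it takes the files as arguments, so unusual file names are not an issue here.

```shell
# all_same_length FILE...
# Succeed only if every FILE has the same number of lines as the first one.
# (all_same_length is a hypothetical helper; its variables are global.)
all_same_length() {
    first=$(( $(wc -l < "$1") ))
    for f in "$@"; do
        [ "$(( $(wc -l < "$f") ))" -eq "$first" ] || return 1
    done
    return 0
}
```

It could be used as all_same_length *csv && echo "all files padded".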

answered yesterday by JigglyNaga
