Loop a list through awk
I have two files: data.csv and list.txt. Here's an example of what they look like:
data.csv:
"John","red","4"
"Basketball","orange","2"
"The Mike","blue","94"
"Lizard","purple","3"
"Johnny","pink","32"
list.txt:
Mike
John
purple
32
Now, I am trying to figure out how I can make a loop around
awk -F "\"*,\"*" '/**LIST ITEM**/ {print $1}' data.csv > output.txt
where the command runs once for each line of list.txt, with that line replacing **LIST ITEM**. How can this be accomplished?
I'm running this through Terminal on Mac OS X 10.5.7.
EDIT:
The desired output for the above example would be
The Mike
John
Johnny
Lizard
Johnny
EDIT2:
To be more clear, I am trying to avoid doing this:
awk -F "\"*,\"*" '/Mike/ {print $1}' data.csv
awk -F "\"*,\"*" '/John/ {print $1}' data.csv
awk -F "\"*,\"*" '/purple/ {print $1}' data.csv
awk -F "\"*,\"*" '/32/ {print $1}' data.csv
And instead, run it in one command, somehow looping through all the lines of list.txt.
awk
edited Dec 16 at 4:18 by Rui F Ribeiro
asked Mar 9 '11 at 5:29 by Julien
The command should run replacing **LIST ITEM** with 'Mike', then 'John', then 'purple', then '32'.
– Julien
Mar 9 '11 at 5:41
It would be quite helpful if your sample input included LIST ITEM somewhere (assuming it's literal), as well as providing the desired output.
– SiegeX
Mar 9 '11 at 5:41
@Julien When you say **LIST ITEM**, it appears you mean the first field of your CSV, yes? Also, I believe your desired output is wrong; there is an extra Johnny line, yes?
– SiegeX
Mar 9 '11 at 5:58
When I say **LIST ITEM** I mean a line from list.txt. Hence an item from the list, or list item.
– Julien
Mar 9 '11 at 6:03
awk -F "\"*,\"*" '/32/ {print $1}' data.csv would yield Johnny, unless I am mistaken.
– Julien
Mar 9 '11 at 6:04
3 Answers
This meets the order of your desired output:
$ awk -F, '
    NR == FNR {field1[$0] = $1; next}
    {
        for (line in field1)
            if (line ~ $0)
                print field1[line]
    }
' data.csv list.txt
"The Mike"
"John"
"Johnny"
"Lizard"
"Johnny"
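This reads the data.csv file into memory, storing each whole line as a key of the field1 array with that line's first field as the value. Then each line of the list.txt file is used as a regular expression and tested against every key of the field1 array. The output is grouped by list item, but within one item the order of the matching rows follows awk's for (... in ...) iteration order, which is unspecified.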
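If a predictable order matters, one possible variation is to keep the data rows in numerically indexed arrays and walk them with a counting loop. The following is only a sketch under assumptions that are not part of the answer: the array names row and name, the temporary directory, and the decision to strip the double quotes from the printed field are all illustrative.

```shell
#!/bin/sh
# Sketch: same idea as above (data.csv held in memory, list.txt streamed),
# but with numerically indexed arrays so rows come out in file order.
workdir=$(mktemp -d)          # illustrative scratch directory

cat > "$workdir/data.csv" <<'EOF'
"John","red","4"
"Basketball","orange","2"
"The Mike","blue","94"
"Lizard","purple","3"
"Johnny","pink","32"
EOF

cat > "$workdir/list.txt" <<'EOF'
Mike
John
purple
32
EOF

result=$(awk -F, '
    NR == FNR {                    # first file: data.csv
        n++
        row[n] = $0                # whole line, used for matching
        name[n] = $1               # first field, used for printing
        gsub(/"/, "", name[n])     # assumption: quotes are not wanted in the output
        next
    }
    {                              # second file: list.txt, one regex per line
        for (i = 1; i <= n; i++)
            if (row[i] ~ $0)
                print name[i]
    }
' "$workdir/data.csv" "$workdir/list.txt")

printf '%s\n' "$result"
rm -rf "$workdir"
```

Each list item is still treated as a regular expression, exactly as in the command above; only the storage and the loop differ.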
If the data file is much larger than the list file, then it would make more sense to hold the smaller file in memory and loop over the larger file a line at a time:
$ awk -F, '
    NR == FNR {list[$1]; next}
    {
        for (item in list)
            if ($0 ~ item)
                print $1
    }
' list.txt data.csv
"John"
"The Mike"
"Lizard"
"Johnny"
"Johnny"
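Both commands above use each list item as a regular expression, so an item containing characters such as . or * would be interpreted rather than matched literally. If plain substring matching is what is wanted, awk's index() function can stand in for the ~ operator. This is a sketch, not the answer's own method: the function name literal_match, the temporary directory and the quote stripping are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: list.txt held in memory, data.csv streamed, literal substring test.
workdir=$(mktemp -d)          # illustrative scratch directory

cat > "$workdir/data.csv" <<'EOF'
"John","red","4"
"Basketball","orange","2"
"The Mike","blue","94"
"Lizard","purple","3"
"Johnny","pink","32"
EOF

cat > "$workdir/list.txt" <<'EOF'
Mike
John
purple
32
EOF

# literal_match LISTFILE DATAFILE  (hypothetical helper name)
literal_match() {
    awk -F, '
        NR == FNR { items[++n] = $0; next }    # keep the list in file order
        {
            name = $1
            gsub(/"/, "", name)                # assumption: drop the quotes
            for (i = 1; i <= n; i++)
                if (index($0, items[i]) > 0)   # substring test, no regex
                    print name
        }
    ' "$1" "$2"
}

result=$(literal_match "$workdir/list.txt" "$workdir/data.csv")
printf '%s\n' "$result"
rm -rf "$workdir"
```

Because the list is stored in a numerically indexed array, the output follows the row order of data.csv and, within a row, the order of list.txt. Note that the NR == FNR idiom assumes the first file is not empty.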
edited Mar 9 '11 at 6:28
answered Mar 9 '11 at 6:20 by glenn jackman
What if both files are very large? Is it just whichever is bigger?
– Julien
Mar 9 '11 at 6:34
This is great, now what if I only wanted it to match the beginning of the lines in data.csv?
– Julien
Mar 9 '11 at 6:52
@Julien, this might work for you too:
grep -f list.txt data.csv | cut -d, -f1
but you only get "Johnny" once. It's a more lightweight pipeline, but I can't tell if it would meet your requirements.
– glenn jackman
Mar 9 '11 at 17:35
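The grep pipeline from the comment above can be tried on the sample data as follows. This is a minimal sketch; the temporary directory is an illustrative assumption, and the files are the ones from the question.

```shell
#!/bin/sh
# Sketch: grep -f reads one pattern per line from list.txt and prints every
# row of data.csv that matches at least one of them, once per row.
workdir=$(mktemp -d)          # illustrative scratch directory

cat > "$workdir/data.csv" <<'EOF'
"John","red","4"
"Basketball","orange","2"
"The Mike","blue","94"
"Lizard","purple","3"
"Johnny","pink","32"
EOF

cat > "$workdir/list.txt" <<'EOF'
Mike
John
purple
32
EOF

# cut keeps the first comma-separated field, quotes included
result=$(grep -f "$workdir/list.txt" "$workdir/data.csv" | cut -d, -f1)

printf '%s\n' "$result"
rm -rf "$workdir"
```

The rows come out in data.csv order and each matching row appears once, which is why "Johnny" is printed a single time even though both John and 32 match it. Adding grep's -F option would treat the list items as fixed strings instead of regular expressions.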
#!/bin/bash
while read -r line; do
    awk -F '^"|","|"$' '$0 ~ line{print $2}' line="$line" data.csv
done < list.txt
Proof of Concept
$ while read -r line; do awk -F '^"|","|"$' '$0 ~ line{print $2}' line="$line" data.csv; done < list.txt
The Mike
John
Johnny
Lizard
Johnny
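This field separator deals with embedded quotes and/or commas, as long as every field is quoted and no field contains the sequence "," itself. The leading quote counts as a separator, so $1 is empty and the first data field is $2.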
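To see what that field separator does with awkward input, it can be run against a couple of made-up rows. The file name tricky.csv and its contents are hypothetical and exist only for this sketch; they are not part of the question's data.

```shell
#!/bin/sh
# Sketch: apply the separator '^"|","|"$' to rows with an embedded comma
# and with doubled quotes, then print the first data field ($2).
workdir=$(mktemp -d)          # illustrative scratch directory

# hypothetical sample rows, every field quoted
cat > "$workdir/tricky.csv" <<'EOF'
"Smith, John","red","4"
"The ""Big"" Mike","blue","94"
EOF

result=$(awk -F '^"|","|"$' '{ print $2 }' "$workdir/tricky.csv")

printf '%s\n' "$result"
rm -rf "$workdir"
```

The comma inside the first field is not followed by a quote on both sides, so it does not split the field. The doubled quotes are passed through unchanged rather than being collapsed to single quotes, which a real CSV parser would do.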
answered Mar 9 '11 at 6:10 by SiegeX
It's not printing the second Johnny on my system... I'm not sure why.
– Julien
Mar 9 '11 at 6:26
I'm not entirely clear on what you're trying to do: replace LIST ITEM with what? Just looking for a match anywhere and outputting the first field? Also, your example list.txt appears to match anywhere in the line, which could potentially be problematic: what if list.txt at some point contains the line e? That would match all but the last line of your sample data.csv.
awk -F '^"?|"?,"?|"$?' 'BEGIN {
    # read list.txt into an array; the "> 0" test ends the loop
    # at end of file and also if list.txt cannot be read
    while ((getline pat < "list.txt") > 0) {
        pats[pat] = 1
    }
    close("list.txt")
}
{
    # skip empty field before leading "
    if ($1 == "") {
        res = $2
    } else {
        res = $1
    }
    # scan record for patterns stored earlier,
    # output the first real data field (res) if
    # found
    for (pat in pats) {
        if ($0 ~ pat) {
            print res
        }
    }
}' data.csv
This is a bit more complex than it could be; your field separator doesn't deal with the optional leading quotation mark on the first field or the optional trailing one on the last field. Mine does, but at the price that if it's there the first field will be empty (the empty string before ^"?). It also doesn't try to deal with embedded quotes. A dedicated CSV parser would be a better idea if you need to support random generalized CSV.
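If matching anywhere in the line turns out to be too loose (the "e" case mentioned above), one alternative is to require that a list item equal a whole field. This changes the semantics the question asked for, so it is only a sketch of the trade-off: the array name want, the temporary directory and the naive comma split (which would break on commas embedded in a field) are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: print the first field of every row in which some field,
# with its quotes removed, is exactly equal to a list item.
workdir=$(mktemp -d)          # illustrative scratch directory

cat > "$workdir/data.csv" <<'EOF'
"John","red","4"
"Basketball","orange","2"
"The Mike","blue","94"
"Lizard","purple","3"
"Johnny","pink","32"
EOF

cat > "$workdir/list.txt" <<'EOF'
Mike
John
purple
32
EOF

result=$(awk -F, '
    NR == FNR { want[$0]; next }       # list items become array keys
    {
        first = $1
        gsub(/"/, "", first)
        for (i = 1; i <= NF; i++) {
            field = $i
            gsub(/"/, "", field)       # compare without the quotes
            if (field in want)         # exact lookup, no regex involved
                print first
        }
    }
' "$workdir/list.txt" "$workdir/data.csv")

printf '%s\n' "$result"
rm -rf "$workdir"
```

With the sample files, Mike no longer selects "The Mike" and John no longer selects "Johnny", because neither is equal to a complete field. A list item e would select nothing.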
answered Mar 9 '11 at 6:08 by geekosaur
I see why, now, after reading glenn jackman's answer
– Julien
Mar 9 '11 at 6:35
@Julien: Your proposed edit should have been a comment. Post it as a comment here if you still have a question about geekosaur's answer.
– Gilles
Mar 9 '11 at 20:42