Loop a list through awk

I have two files: data.csv and list.txt. Here's an example of what they look like:



data.csv:



"John","red","4"
"Basketball","orange","2"
"The Mike","blue","94"
"Lizard","purple","3"
"Johnny","pink","32"


list.txt:



Mike
John
purple
32


Now, I am trying to figure out how I can make a loop



awk -F "\"*,\"*" '/**LIST ITEM**/ {print $1}' data.csv > output.txt


where the command runs for each line of list.txt, replacing **LIST ITEM**. How can this be accomplished?



I'm running this through Terminal on Mac OS X 10.5.7.



EDIT:



The desired output for the above example would be:



The Mike
John
Johnny
Lizard
Johnny


EDIT2:



To be more clear, I am trying to avoid doing this:



awk -F "\"*,\"*" '/Mike/ {print $1}' data.csv
awk -F "\"*,\"*" '/John/ {print $1}' data.csv
awk -F "\"*,\"*" '/purple/ {print $1}' data.csv
awk -F "\"*,\"*" '/32/ {print $1}' data.csv


And instead, run it in one command, somehow looping through all the lines of list.txt.

asked Mar 9 '11 at 5:29 by Julien

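For reference, a minimal sketch of the kind of single-command loop being asked about, assuming bash: each line of list.txt is handed to awk as a variable rather than spliced into the pattern.

# assumes bash; one awk invocation per list item
while IFS= read -r item; do
    # the field separator strips the quote-comma-quote delimiters, so the
    # first data value ends up in $2
    # note: -v interprets backslash escapes in the value; the sample items are plain text
    awk -F '^"|","|"$' -v pat="$item" '$0 ~ pat { print $2 }' data.csv
done < list.txt > output.txt
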
  • The command should run replacing **LIST ITEM** with 'Mike', then 'John', then 'purple', then '32'.
    – Julien
    Mar 9 '11 at 5:41

  • It would be quite helpful if your sample input included LIST ITEM somewhere (assuming it's literal) as well as providing desired output
    – SiegeX
    Mar 9 '11 at 5:41

  • @Julien When you say **LIST ITEM**, it appears you mean the first field of your CSV, yes? Also, I believe your desired output is wrong, there is an extra Johnny line, yes?
    – SiegeX
    Mar 9 '11 at 5:58

  • When I say **LIST ITEM** I mean a line from list.txt. Hence an item from the list, or list item.
    – Julien
    Mar 9 '11 at 6:03

  • awk -F "\"*,\"*" '/32/ {print $1}' data.csv would yield Johnny, unless I am mistaken.
    – Julien
    Mar 9 '11 at 6:04

3 Answers

This meets the order of your desired output:



$ awk -F, '
    NR == FNR {field1[$0] = $1; next}
    {
        for (line in field1)
            if (line ~ $0)
                print field1[line]
    }
' data.csv list.txt
"The Mike"
"John"
"Johnny"
"Lizard"
"Johnny"


This reads the data.csv file into memory, mapping the whole line to field1. Then, each line of the list.txt file is checked against each element of the field1 array.



If the data file is much larger than the list file, then it would make more sense to hold the smaller file in memory and loop over the larger file a line at a time:



$ awk -F, '
    NR == FNR {list[$1]; next}
    {
        for (item in list)
            if ($0 ~ item)
                print $1
    }
' list.txt data.csv
"John"
"The Mike"
"Lizard"
"Johnny"
"Johnny"





answered Mar 9 '11 at 6:20 by glenn jackman

  • What if both files are very large? Is it just whichever is bigger?
    – Julien
    Mar 9 '11 at 6:34

  • This is great, now what if I only wanted it to match the beginning of the lines in data.csv?
    – Julien
    Mar 9 '11 at 6:52

  • @Julien, this might work for you too: grep -f list.txt data.csv | cut -d, -f1, but you only get "Johnny" once. It's a more lightweight pipeline but I can't tell if it would meet your requirements.
    – glenn jackman
    Mar 9 '11 at 17:35
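Regarding matching only at the beginning of the lines: a minimal sketch, assuming the same sample files, is to anchor the test in the second program above with ^ while tolerating the leading quote; with the sample data this would print only "John" and "Johnny".

$ awk -F, '
    NR == FNR {list[$1]; next}
    {
        for (item in list)
            # anchor the pattern to the start of the record,
            # allowing for the optional leading double quote
            if ($0 ~ ("^\"?" item))
                print $1
    }
' list.txt data.csv
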

#!/bin/bash

while read -r line; do
    awk -F '^"|","|"$' '$0 ~ line{print $2}' line="$line" data.csv
done < list.txt


Proof of Concept



$ while read -r line; do awk -F '^"|","|"$' '$0 ~ line{print $2}' line="$line" data.csv; done < list.txt
The Mike
John
Johnny
Lizard
Johnny


This field separator treats the surrounding quotes and the quote-comma-quote sequences as delimiters, so the printed fields come out without the quotes.
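To see why $2 rather than $1 holds the first value, note that the leading quote is itself matched as a field separator, so the field in front of it is empty; a quick check (a sketch, assuming a POSIX-style awk):

$ printf '%s\n' '"The Mike","blue","94"' | awk -F '^"|","|"$' '{ printf "[%s][%s][%s][%s]\n", $1, $2, $3, $4 }'
[][The Mike][blue][94]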






answered Mar 9 '11 at 6:10 by SiegeX

  • It's not printing the second Johnny on my system... I'm not sure why.
    – Julien
    Mar 9 '11 at 6:26
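One possible cause, offered here only as an assumption: if the last line of list.txt lacks a trailing newline, the while read loop stops before processing it, so the final item (32) never runs and the second Johnny is never printed. A sketch of the same loop that also handles a missing final newline:

# || [ -n "$line" ] keeps the body running for a final line with no newline
while IFS= read -r line || [ -n "$line" ]; do
    awk -F '^"|","|"$' '$0 ~ line{print $2}' line="$line" data.csv
done < list.txt
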

I'm not entirely clear on what you're trying to do: replace LIST ITEM with what? Just looking for a match anywhere and outputting the first field? Also, your example list.txt appears to match anywhere in the line, which could potentially be problematic: what if list.txt at some point contains the line e? That would match all but the last line of your sample data.csv.



awk -F '^"?|"?,"?|"$?' 'BEGIN {
    # read list.txt into an array
    while (getline pat < "list.txt") {
        pats[pat] = 1
    }
    close("list.txt")
}
{
    # skip empty field before leading "
    if ($1 == "") {
        res = $2
    } else {
        res = $1
    }
    # scan record for patterns stored earlier,
    # output the first real data field (res) if
    # found
    for (pat in pats) {
        if ($0 ~ pat) {
            print res
        }
    }
}' data.csv


This is a bit more complex than it could be; your field separator doesn't deal with the optional leading quotation mark on the first field or the optional trailing one on the last field. Mine does, but at the price that, when the leading quote is present, the first field will be empty (the empty string before ^"?). It also doesn't try to deal with embedded quotes. A dedicated CSV parser would be a better idea if you need to support arbitrary, general CSV.






answered Mar 9 '11 at 6:08 by geekosaur

  • I see why, now, after reading glenn jackman's answer
    – Julien
    Mar 9 '11 at 6:35

  • @Julien: Your proposed edit should have been a comment. Post it as a comment here if you still have a question about geekosaur's answer.
    – Gilles
    Mar 9 '11 at 20:42










