How can I delete all characters falling under /* … */ including /* & */?























I tried sed and awk, but it is not working because the pattern contains "/", which the command already uses as its delimiter.



Please let me know how I can achieve this.



Below is a sample. We want to remove the commented sections, i.e. /*.....*/



/*This is to print the output
data*/
proc print data=sashelp.cars;
run;
/*Creating dataset*/
data abc;
set xyz;
run;

































  • -bash-4.1$ sed 's,/\*.*\*/,,g' test.sas Below is the output I get; the first comment is still there. /*This is to print the output data*/ proc print data=sashelp.cars; run; data abc; set xyz; run;
    – Sharique Alam
    Jul 21 '16 at 11:18








  • 1




    Thanks for the edit. It would be even better if you included your desired output as well. Also include what you tried and how it failed in the question not in the comments.
    – terdon
    Jul 21 '16 at 11:33






  • 2




    What should happen to string literals containing comments or comment delimiters? (e.g. INSERT INTO string_table VALUES('/*'), ('*/'), ('/**/'); )
    – zwol
    Jul 21 '16 at 17:20






  • 1




    Related (sorry I can't resist!): codegolf.stackexchange.com/questions/48326/…
    – ilkkachu
    Jul 21 '16 at 21:27










  • I updated my post with more solutions; please check whether it works for you now.
    – Luciano Andress Martini
    Jun 6 at 14:50















text-processing














edited Jul 21 '16 at 11:07









don_crissti











asked Jul 21 '16 at 10:53









Sharique Alam























7 Answers

















up vote
21
down vote













I think I found an easy solution!



cpp -P yourcommentedfile.txt 


SOME UPDATES:



Quote from user ilkkachu (original text from the comments):



I played a bit with the options for gcc: -fpreprocessed will disable most directives and macro expansions (except #define and #undef apparently). Adding -dD will leave defines in too; and std=c89 can be used to ignore new style // comments. Even with them, cpp replaces comments with spaces (instead of removing them), and collapses spaces and empty lines.



But I think it is still a reasonable and easy solution for most cases. If you disable macro expansion and the other preprocessing steps, I think you will get good results, and you can combine it with a shell script for further cleanup.



I have personally used cpp -P (without any other parameter) to remove comments from PHP files for years without any problem, but you may not be as lucky, so consider your own case with care.



If that is not enough for you, try stripcmt - a comment remover:
http://www.bdc.cx/software/stripcmt/
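
As a rough sketch of how the flags quoted above might be combined (assuming GNU cpp is installed; the function name strip_with_cpp is only an illustration, not part of any tool), the input can be fed on standard input so that cpp treats it as C regardless of the file extension:

```shell
# Hypothetical helper: strip C-style comments with GNU cpp.
#   -P              do not emit linemarkers
#   -fpreprocessed  suppress macro expansion and most directives
#   -dD             keep #define / #undef lines in the output
strip_with_cpp() {
    cpp -P -fpreprocessed -dD
}

# Usage: read the file on stdin, write the result to stdout, e.g.
#   strip_with_cpp < test.sas > test.nocomments.sas
```

As the quote says, comments are replaced by spaces rather than removed, and runs of spaces and empty lines may be collapsed, so the result is not a byte-for-byte copy of the remaining code.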

























  • 1




    Using the C preprocessor is likely the most robust solution. Since the preprocessor is likely the most robust parser of C comments. Clever.
    – grochmal
    Jul 21 '16 at 13:33






  • 14




    But cpp will do a lot more than removing comments (process #include, expand macros, including builtin ones...)
    – Stéphane Chazelas
    Jul 21 '16 at 14:17






  • 3




    @LucianoAndressMartini, no, tail -n +7 will just remove the first 7 lines, it will not prevent the #include processing or macro expansions. Try echo __LINE__ | cpp for instance. Or echo '#include /dev/zero' | cpp
    – Stéphane Chazelas
    Jul 21 '16 at 15:02






  • 2




    You probably want to use -P mode if you do this. (This may eliminate the need to use tail.)
    – zwol
    Jul 21 '16 at 17:14






  • 3




    I played a bit with the options for gcc: -fpreprocessed will disable most directives and macro expansions (except #define and #undef apparently). Adding -dD will leave defines in too; and std=c89 can be used to ignore new style // comments. Even with them, cpp replaces comments with spaces (instead of removing them), and collapses spaces and empty lines.
    – ilkkachu
    Jul 21 '16 at 21:51


















up vote
10
down vote













I once came up with this which we can refine to:



perl -0777 -pe '
  BEGIN{
    $bs=qr{(?:\\|\?\?/)};
    $lc=qr{(?:$bs\n|$bs\r\n?)}
  }
  s{
    /$lc*\*.*?\*$lc*/
    | /$lc*/(?:$lc|[^\r\n])*
    | (
        "(?:$bs$lc*.|.)*?"
      | '\''$lc*(?:$bs$lc*(?:\?\?.|.))?(?:\?\?.|.)*?'\''
      | \?\?'\''
      | .[^'\''"/?]*
      )
  }{$1 eq "" ? " " : "$1"}exsg'


to handle a few more corner cases.



Note that if you remove a comment, you could change the meaning of the code (1-/* comment */-1 is parsed like 1 - -1 while 1--1 (which you'd obtain if you removed the comment) would give you an error). It's better to replace the comment with a space character (as we do here) instead of completely removing it.



The above should work properly on this valid ANSI C code for instance that tries to include a few corner cases:




#include <stdio.h>
int main()
{
printf("%d %s %c%c%c%c%c %s %s %d\n",
1-/* comment */-1,
/\
* comment */
"/* not a comment */",
/* multiline
comment */
'"' /* comment */ , '"',
'\'','"'/* comment */,
'\
\
"', /* comment */
"\\
" /* not a comment */ ",
"??/" /* not a comment */ ",
'??''+'"' /* "comment" */);
return 0;
}


Which gives this output:




#include <stdio.h>
int main()
{
printf("%d %s %c%c%c%c%c %s %s %d\n",
1- -1,

"/* not a comment */",

'"' , '"',
'\'','"' ,
'\
\
"',
"\\
" /* not a comment */ ",
"??/" /* not a comment */ ",
'??''+'"' );
return 0;
}


Both versions print the same output when compiled and run.



You can compare with the output of gcc -ansi -E to see what the pre-processor would do on it. That code is also valid C99 or C11 code; however, gcc disables trigraph support by default, so it won't work with gcc unless you specify the standard (like gcc -std=c99 or gcc -std=c11) or add the -trigraphs option.



It also works on this C99/C11 (non-ANSI/C90) code:




// comment
/\
/ comment
// multiline\
comment
"// not a comment"


(compare with gcc -E/gcc -std=c99 -E/gcc -std=c11 -E)



ANSI C didn't support the // form of comment. // is not otherwise valid in ANSI C so wouldn't appear there. One contrived case where // may genuinely appear in ANSI C (as noted there, and you may find the rest of the discussion interesting) is when the stringify operator is in use.



This is valid ANSI C code:



#define s(x) #x
s(//not a comment)


And at the time of the discussion in 2004, gcc -ansi -E did indeed expand it to "//not a comment". However today, gcc-5.4 returns an error on it, so I'd doubt we'll find a lot of C code using this kind of construct.



The GNU sed equivalent could be something like:



lc='([\\%]\n|[\\%]\r\n?)'
sed -zE "
s/_/_u/g;s/!/_b/g;s/</_l/g;s/>/_r/g;s/:/_c/g;s/;/_s/g;s/@/_a/g;s/%/_p/g;
s@\?\?/@%@g;s@/$lc*\*@:&@g;s@\*$lc*/@;&@g
s:/$lc*/:@&:g;s/\?\?'/!/g
s#:/$lc*\*[^;]*;\*$lc*/|@/$lc*/$lc*|(\"([\\\\%]$lc*.|[^\\\\%\"])*\"|'$lc*([\\\\%]$lc*.)?[^\\\\%']*'|[^'\"@;:]+)#<\5>#g
s/<>/ /g;s/!/??'/g;s@%@??/@g;s/[<>@:;]//g
s/_p/%/g;s/_a/@/g;s/_s/;/g;s/_c/:/g;s/_r/>/g;s/_l/</g;s/_b/!/g;s/_u/_/g"


If your GNU sed is too old to support -E or -z, you can replace the first line with:



sed -r ":1;\$!{N;b1}




























  • perl solution have problem with multi line: test it with this output => echo -e "BEGIN/*comment*/ COMMAND /*comnment*/END"
    – بارپابابا
    Jul 21 '16 at 14:18










  • @Babby, works for me. I've added a multi-line comment and the resulting output in my test case.
    – Stéphane Chazelas
    Jul 21 '16 at 14:28










  • The best thing to compare to nowadays would be gcc -std=c11 -E -P (-ansi is just another name for -std=c90).
    – zwol
    Jul 21 '16 at 17:16










  • @zwol, the idea is to be able to handle code written for any C/C++ standard (c90, c11 or other). Strictly speaking, it's not possible (see my 2nd contrived example). The code still tries to handle C90 constructs (like ??'), hence we compare with cpp -ansi for those and C99/C11... one (like // xxx), hence we compare with cpp (or cpp -std=c11...)
    – Stéphane Chazelas
    Jul 21 '16 at 17:29










  • @zwol, I've split the test case in an attempt to clarify a bit. It looks like trigraphs are still in C11, so my second test case is not standard C anyway.
    – Stéphane Chazelas
    Jul 21 '16 at 17:47


















up vote
6
down vote













with sed:



UPDATE



/\/\*/ {
    /\*\// {
        s/\/\*.*\*\///g;
        b next
    };

    :loop;
    /\*\//! {
        N;
        b loop
    };
    /\*\// {
        s/\/\*.*\*\//\n/g
    }
    :next
}


This supports all of the following cases (single-line and multi-line comments, with data before and/or after the comment):



 e1/*comment*/
-------------------
e1/*comment*/e2
-------------------
/*comment*/e2
-------------------
e1/*com
ment*/
-------------------
e1/*com
ment*/e2
-------------------
/*com
ment*/e2
-------------------
e1/*com
1
2
ment*/
-------------------
e1/*com
1
2
ment*/e2
-------------------
/*com
1
2
ment*/e2
-------------------


run:

$ sed -f command.sed FILENAME

e1
-------------------
e1e2
-------------------
e2
-------------------
e1

-------------------
e1
e2
-------------------

e2
-------------------
e1

-------------------
e1
e2
-------------------

e2
-------------------
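
As a sketch of how the same script might be wrapped for direct use in a shell (GNU sed is assumed because of the \n in the replacement; the function name strip_block_comments is hypothetical):

```shell
# Hypothetical wrapper around the sed script above (GNU sed assumed).
# Reads the named files, or stdin when no file is given.
strip_block_comments() {
    sed '
/\/\*/ {
    # Comment opens and closes on this line: delete it in place.
    /\*\// {
        s/\/\*.*\*\///g
        b next
    }
    # Otherwise keep appending lines until the closing */ shows up.
    :loop
    /\*\//! {
        N
        b loop
    }
    # Replace the whole multi-line comment with a single newline.
    /\*\// {
        s/\/\*.*\*\//\n/g
    }
    :next
}' "$@"
}

printf 'e1/*com\n1\n2\nment*/e2\n' | strip_block_comments
```

Because .* is greedy, two comments on the same line are treated as one: a/*x*/b/*y*/c comes out as ac, so the b between them is lost as well.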




























  • won't work for a comment starting after data, like proc print data 2nd /*another comment is here*/
    – mazs
    Jul 21 '16 at 12:44












  • @mazs updated, check it
    – بارپابابا
    Jul 21 '16 at 13:19










  • This does not handle comments inside string literals, which may actually matter, depending on what the SQL does
    – zwol
    Jul 21 '16 at 17:18


















up vote
4
down vote













 $ cat file | perl -pe 'BEGIN{$/=undef}s!/\*.+?\*/!!sg'

proc print data=sashelp.cars;
run;

data abc;
set xyz;
run;


Remove blank lines if any:



 $ cat file | perl -pe 'BEGIN{$/=undef}s!/\*.+?\*/\n?!!sg'


Edit - the shorter version by Stephane:



 $ cat file | perl -0777 -pe 's!/\*.*?\*/!!sg'




























  • well, I agree with terdon: Lets see the expected output.
    – Hans Schou
    Jul 21 '16 at 12:20










  • BTW: What should happen to a single line containing: "/*foo*/run;/*bar*/" ? Should that just be "run;" ?
    – Hans Schou
    Jul 21 '16 at 12:29










  • Great! Then my solution works. Note I use non-greedy: ".+?"
    – Hans Schou
    Jul 21 '16 at 12:32






  • 2




    See -0777 as a shorter way to do BEGIN{$/=undef}
    – Stéphane Chazelas
    Jul 21 '16 at 13:30






  • 1




    Perhaps .*? instead of .+? if /**/ is a valid comment too.
    – ilkkachu
    Jul 21 '16 at 20:57


















up vote
2
down vote













Solution using a sed command and no script



Here you are:



sed 's/\*\//\n&/g' test | sed '/\/\*/,/\*\//d'



N.B. This doesn't work on OS X, unless you install gnu-sed. But it works on Linux Distros.

























  • 1




    you can use -i option to edit file in-place instead of redirecting output to new file. or much safer -i.bak to backup file
    – Rahul
    Jul 21 '16 at 12:18






  • 1




    It does not work in all cases either; try putting a comment on the same line as code and watch what happens, for example set xy; /*test*/. I think we will need perl to solve this in an easy way.
    – Luciano Andress Martini
    Jul 21 '16 at 12:19












  • @Rahul exactly, thanks for mentioning. I just wanted to keep it more simple.
    – FarazX
    Jul 21 '16 at 12:21










  • I'm very sorry to say that it is not working for comments on the same line.
    – Luciano Andress Martini
    Jul 21 '16 at 12:38












  • @LucianoAndressMartini Now it does!
    – FarazX
    Jul 21 '16 at 18:28


















up vote
1
down vote













sed operates on one line at a time, but some of the comments in the input span multiple lines. As per https://unix.stackexchange.com/a/152389/90751 , you can first use tr to turn the line-breaks into some other character. Then sed can process the input as a single line, and you use tr again to restore the line-breaks.



tr '\n' '\0' | sed ... | tr '\0' '\n'


I've used null bytes, but you can pick any character that doesn't appear in your input file.



* has a special meaning in regular expressions, so it will need escaping as \* to match a literal *.



.* is greedy -- it will match the longest possible text, including more */ and /*. That means the first comment, the last comment, and everything in between. To restrict this, replace .* with a stricter pattern: comments can contain anything that's not a "*", and also "*" followed by anything that's not a "/". Runs of multiple *s also have to be accounted for:



tr '\n' '\0' | sed -e 's,/\*\([^*]\|\*\+[^*/]\)*\*\+/,,g' | tr '\0' '\n'


This will remove any linebreaks in the multiline comments, ie.



data1 /* multiline
comment */ data2


will become



data1  data2


If this isn't what was wanted, sed can be told to keep one of the linebreaks. This means picking a linebreak replacement character that can be matched.



tr '\n' '\f' | sed -e 's,/\*\(\(\f\)\|[^*]\|\*\+[^*/]\)*\*\+/,\2,g' | tr '\f' '\n'


The special character \f, and the use of a back-reference that may not have matched anything, aren't guaranteed to work as intended in all sed implementations. (I confirmed it works on GNU sed 4.07 and 4.2.2.)
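
To see the effect of the greedy versus the stricter pattern side by side, here is a small sketch. It assumes GNU sed (for NUL bytes in the pattern space) and uses -E so the pattern needs fewer backslashes; the function names are invented for the example.

```shell
# Stricter pattern: a comment body is any run of non-* characters,
# or *s followed by something that is neither * nor /.
strip_comments() {
    tr '\n' '\0' |
        sed -E 's,/\*([^*]|\*+[^*/])*\*+/,,g' |
        tr '\0' '\n'
}

# Greedy variant, for comparison only: .* runs to the LAST */ in the input.
strip_comments_greedy() {
    tr '\n' '\0' | sed -E 's,/\*.*\*/,,g' | tr '\0' '\n'
}

printf 'a /* x */ b /* y */ c\n' | strip_comments         # keeps "b"
printf 'a /* x */ b /* y */ c\n' | strip_comments_greedy  # drops "b"
```

As in the pipeline above, the file has to be fed on standard input (for example strip_comments < test.sas); naming it as an argument to sed would bypass the first tr.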





























  • Could you please let me know how it will work? I tried the following: tr '\n' '\0' | sed -e 's,/\*\([^*]\|\*\+[^*/]\)*\*\+/,,g' test.sas | tr '\0' '\n' and got this: /*This is to print the output data*/data abcdf; set cfgtr; run; proc print data=sashelp.cars; run; data abc; set xyz; run;
    – Sharique Alam
    Aug 5 '16 at 13:25












  • @ShariqueAlam You've put test.sas in the middle of the pipeline there, so sed reads from it directly, and the first tr has no effect. You need to use cat test.sas | tr ...
    – JigglyNaga
    Aug 6 '16 at 14:49


















up vote
0
down vote













Using a one-line sed command to remove comments:



sed '/\/\*/d;/\*\//d' file

proc print data=sashelp.cars;
run;
data abc;
set xyz;
run;


























    edited Nov 19 at 18:35

    answered Jul 21 '16 at 13:10

    Luciano Andress Martini (author of the cpp -P answer)









    • 1




      Using the C preprocessor is likely the most robust solution. Since the preprocessor is likely the most robust parser of C comments. Clever.
      – grochmal
      Jul 21 '16 at 13:33






    • 14




      But cpp will do a lot more than removing comments (process #include, expand macros, including builtin ones...)
      – Stéphane Chazelas
      Jul 21 '16 at 14:17






    • 3




      @LucianoAndressMartini, no, tail -n +7 will just remove the first 7 lines, it will not prevent the #include processing or macro expansions. Try echo __LINE__ | cpp for instance. Or echo '#include /dev/zero' | cpp
      – Stéphane Chazelas
      Jul 21 '16 at 15:02






    • 2




      You probably want to use -P mode if you do this. (This may eliminate the need to use tail.)
      – zwol
      Jul 21 '16 at 17:14






    • 3




      I played a bit with the options for gcc: -fpreprocessed will disable most directives and macro expansions (except #define and #undef apparently). Adding -dD will leave defines in too; and std=c89 can be used to ignore new style // comments. Even with them, cpp replaces comments with spaces (instead of removing them), and collapses spaces and empty lines.
      – ilkkachu
      Jul 21 '16 at 21:51


























    up vote
    10
    down vote













    I once came up with this which we can refine to:



    perl -0777 -pe '
      BEGIN{
        $bs=qr{(?:\\|\?\?/)};
        $lc=qr{(?:$bs\n|$bs\r\n?)}
      }
      s{
        /$lc*\*.*?\*$lc*/
        | /$lc*/(?:$lc|[^\r\n])*
        | (
             "(?:$bs$lc*.|.)*?"
           | '\''$lc*(?:$bs$lc*(?:\?\?.|.))?(?:\?\?.|.)*?'\''
           | \?\?'\''
           | .[^'\''"/?]*
          )
      }{$1 eq "" ? " " : "$1"}exsg'


    to handle a few more corner cases.



    Note that if you remove a comment, you could change the meaning of the code (1-/* comment */-1 is parsed like 1 - -1 while 1--1 (which you'd obtain if you removed the comment) would give you an error). It's better to replace the comment with a space character (as we do here) instead of completely removing it.



    The above should work properly on this valid ANSI C code for instance that tries to include a few corner cases:




    #include <stdio.h>
    int main()
    {
    printf("%d %s %c%c%c%c%c %s %s %d\n",
    1-/* comment */-1,
    /\
    * comment */
    "/* not a comment */",
    /* multiline
    comment */
    '"' /* comment */ , '"',
    '\'','"'/* comment */,
    '\
    \
    "', /* comment */
    "\\
    " /* not a comment */ ",
    "??/" /* not a comment */ ",
    '??''+'"' /* "comment" */);
    return 0;
    }


    Which gives this output:




    #include <stdio.h>
    int main()
    {
    printf("%d %s %c%c%c%c%c %s %s %d\n",
    1- -1,

    "/* not a comment */",

    '"' , '"',
    '\'','"' ,
    '\
    \
    "',
    "\\
    " /* not a comment */ ",
    "??/" /* not a comment */ ",
    '??''+'"' );
    return 0;
    }


    Both versions print the same output when compiled and run.



    You can compare with the output of gcc -ansi -E to see what the pre-processor would do on it. That code is also valid C99 or C11 code; however, gcc disables trigraph support by default, so it won't work with gcc unless you specify the standard (gcc -std=c99 or gcc -std=c11) or add the -trigraphs option.



    It also works on this C99/C11 (non-ANSI/C90) code:




    // comment
    /\
    / comment
    // multiline\
    comment
    "// not a comment"


    (compare with gcc -E/gcc -std=c99 -E/gcc -std=c11 -E)



    ANSI C didn't support the // form of comment. // is not otherwise valid in ANSI C so wouldn't appear there. One contrived case where // may genuinely appear in ANSI C (as noted there, and you may find the rest of the discussion interesting) is when the stringify operator is in use.



    This is valid ANSI C code:



    #define s(x) #x
    s(//not a comment)


    And at the time of the discussion in 2004, gcc -ansi -E did indeed expand it to "//not a comment". However today, gcc-5.4 returns an error on it, so I'd doubt we'll find a lot of C code using this kind of construct.



    The GNU sed equivalent could be something like:



    lc='([\\%]\n|[\\%]\r\n?)'
    sed -zE "
      s/_/_u/g;s/!/_b/g;s/</_l/g;s/>/_r/g;s/:/_c/g;s/;/_s/g;s/@/_a/g;s/%/_p/g;
      s@\?\?/@%@g;s@/$lc*\*@:&@g;s@\*$lc*/@;&@g
      s:/$lc*/:@&:g;s/\?\?'/!/g
      s#:/$lc*\*[^;]*;\*$lc*/|@/$lc*/$lc*|(\"([\\\\%]$lc*.|[^\\\\%\"])*\"|'$lc*([\\\\%]$lc*.)?[^\\\\%']*'|[^'\"@;:]+)#<\5>#g
      s/<>/ /g;s/!/??'/g;s@%@??/@g;s/[<>@:;]//g
      s/_p/%/g;s/_a/@/g;s/_s/;/g;s/_c/:/g;s/_r/>/g;s/_l/</g;s/_b/!/g;s/_u/_/g"


    If your GNU sed is too old to support -E or -z, you can replace the first line with:



    sed -r ":1;\$!{N;b1}
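
    As a rough illustration of what that replacement line does: :1;$!{N;b1} is a loop that appends every remaining input line to the pattern space, so the substitutions that follow see the whole file at once, much like -z does for input without NUL bytes. The sketch below spells the same loop out with separate -e expressions, which should also work in non-GNU sed; the function name join_lines and the comma substitution are only there for demonstration:

```shell
#!/bin/sh
# Hypothetical demo function: slurp all of stdin into the pattern space,
# then replace the embedded newlines with commas.
join_lines() {
  sed -e ':1' -e '$!{' -e 'N' -e 'b1' -e '}' -e 's/\n/,/g'
}

printf 'a\nb\nc\n' | join_lines   # prints a,b,c
```

    Because the final substitution only runs once the last line has been read, it can match patterns that span line boundaries, which is what a multi-line comment needs.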




























    • The perl solution has a problem with multi-line input: test it with the output of => echo -e "BEGIN/*comment*/ COMMAND /*comnment*/END"
      – بارپابابا
      Jul 21 '16 at 14:18










    • @Babby, works for me. I've added a multi-line comment and the resulting output in my test case.
      – Stéphane Chazelas
      Jul 21 '16 at 14:28










    • The best thing to compare to nowadays would be gcc -std=c11 -E -P (-ansi is just another name for -std=c90).
      – zwol
      Jul 21 '16 at 17:16










    • @zwol, the idea is to be able to handle code written for any C/C++ standard (c90, c11 or other). Strictly speaking, it's not possible (see my 2nd contrived example). The code still tries to handle C90 constructs (like ??'), hence we compare with cpp -ansi for those, and C99/C11 ones (like // xxx), hence we compare with cpp (or cpp -std=c11...) for those.
      – Stéphane Chazelas
      Jul 21 '16 at 17:29










    • @zwol, I've split the test case in an attempt to clarify a bit. It looks like trigraphs are still in C11, so my second test case is not standard C anyway.
      – Stéphane Chazelas
      Jul 21 '16 at 17:47























    edited Jul 23 '16 at 7:13

























    answered Jul 21 '16 at 13:16









    Stéphane Chazelas











    up vote
    6
    down vote













    with sed:



    UPDATE



    /\/\*/ {
        /\*\// {
            s/\/\*.*\*\///g;
            b next
        };

        :loop;
        /\*\//! {
            N;
            b loop
        };
        /\*\// {
            s/\/\*.*\*\//\n/g
        }
        :next
    }


    This supports all of the cases below (single-line and multi-line comments, with data before and/or after the comment):



     e1/*comment*/
    -------------------
    e1/*comment*/e2
    -------------------
    /*comment*/e2
    -------------------
    e1/*com
    ment*/
    -------------------
    e1/*com
    ment*/e2
    -------------------
    /*com
    ment*/e2
    -------------------
    e1/*com
    1
    2
    ment*/
    -------------------
    e1/*com
    1
    2
    ment*/e2
    -------------------
    /*com
    1
    2
    ment*/e2
    -------------------


    run:

    $ sed -f command.sed FILENAME

    e1
    -------------------
    e1e2
    -------------------
    e2
    -------------------
    e1

    -------------------
    e1
    e2
    -------------------

    e2
    -------------------
    e1

    -------------------
    e1
    e2
    -------------------

    e2
    -------------------




























    • won't work for a comment starting after data, like proc print data 2nd /*another comment is here*/
      – mazs
      Jul 21 '16 at 12:44












    • @mazs updated, check it
      – بارپابابا
      Jul 21 '16 at 13:19










    • This does not handle comments inside string literals, which may actually matter, depending on what the SQL does
      – zwol
      Jul 21 '16 at 17:18























    edited Jul 21 '16 at 13:17

























    answered Jul 21 '16 at 12:28









    بارپابابا











    up vote
    4
    down vote













     $ cat file | perl -pe 'BEGIN{$/=undef}s!/\*.+?\*/!!sg'

    proc print data=sashelp.cars;
    run;

    data abc;
    set xyz;
    run;


    Remove blank lines if any:



     $ cat file | perl -pe 'BEGIN{$/=undef}s!/\*.+?\*/\n?!!sg'


    Edit - the shorter version by Stephane:



     $ cat file | perl -0777 -pe 's!/\*.*?\*/!!sg'




























    • Well, I agree with terdon: let's see the expected output.
      – Hans Schou
      Jul 21 '16 at 12:20










    • BTW: What should happen to a single line containing: "/*foo*/run;/*bar*/" ? Should that just be "run;" ?
      – Hans Schou
      Jul 21 '16 at 12:29










    • Great! Then my solution works. Note I use non-greedy: ".+?"
      – Hans Schou
      Jul 21 '16 at 12:32






    • 2




      See -0777 as a shorter way to do BEGIN{$/=undef}
      – Stéphane Chazelas
      Jul 21 '16 at 13:30






    • 1




      Perhaps .*? instead of .+? if /**/ is a valid comment too.
      – ilkkachu
      Jul 21 '16 at 20:57















    up vote
    4
    down vote













     $ cat file | perl -pe 'BEGIN{$/=undef}s!/*.+?*/!!sg'

    proc print data=sashelp.cars;
    run;

    data abc;
    set xyz;
    run;


    Remove blank lines if any:



     $ cat file | perl -pe 'BEGIN{$/=undef}s!/*.+?*/n?!!sg'


    Edit - the shorter version by Stephane:



     $ cat file | perl -0777 -pe 's!/*.*?*/!!sg'





    share|improve this answer























    • well, I agree with terdon: Lets see the expected output.
      – Hans Schou
      Jul 21 '16 at 12:20










    • BTW: What should happen to a single line containing: "/*foo*/run;/*bar*/" ? Should that just be "run;" ?
      – Hans Schou
      Jul 21 '16 at 12:29










    • Great! Then my solution works. Note I use non-greedy: ".+?"
      – Hans Schou
      Jul 21 '16 at 12:32






    • 2




      See -0777 as a shorter way to do BEGIN{$/=undef}
      – Stéphane Chazelas
      Jul 21 '16 at 13:30






    • 1




      Perhaps .*? instead of .+? if /**/ is a valid comment too.
      – ilkkachu
      Jul 21 '16 at 20:57













    up vote
    4
    down vote










    up vote
    4
    down vote























     $ cat file | perl -pe 'BEGIN{$/=undef}s!/\*.+?\*/!!sg'

    proc print data=sashelp.cars;
    run;

    data abc;
    set xyz;
    run;


    To also remove the newline that follows each comment, if any:

     $ cat file | perl -pe 'BEGIN{$/=undef}s!/\*.+?\*/\n?!!sg'


    Edit - the shorter version by Stéphane:

     $ cat file | perl -0777 -pe 's!/\*.*?\*/!!sg'














    edited Jul 25 '16 at 11:17

























    answered Jul 21 '16 at 12:06









    Hans Schou













    • Well, I agree with terdon: let's see the expected output.
      – Hans Schou
      Jul 21 '16 at 12:20










    • BTW: What should happen to a single line containing: "/*foo*/run;/*bar*/" ? Should that just be "run;" ?
      – Hans Schou
      Jul 21 '16 at 12:29










    • Great! Then my solution works. Note I use non-greedy: ".+?"
      – Hans Schou
      Jul 21 '16 at 12:32






    • 2




      See -0777 as a shorter way to do BEGIN{$/=undef}
      – Stéphane Chazelas
      Jul 21 '16 at 13:30






    • 1




      Perhaps .*? instead of .+? if /**/ is a valid comment too.
      – ilkkachu
      Jul 21 '16 at 20:57






































    up vote
    2
    down vote























    Solution using only sed commands, no script

    Here you are:

    sed 's/\*\//\n&/g' test | sed '/\/\*/,/\*\//d'

    N.B. This doesn't work on OS X unless you install gnu-sed, but it works on Linux distros.
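
    A small sketch of what the two stages do, assuming GNU sed (for \n in the replacement text). The wrapper name strip_comment_lines and the inline sample are hypothetical. The first sed moves every closing */ to the start of its own line, so that the second sed's range from a line containing /* to a line containing */ always ends, even when both delimiters began on the same line.

```shell
#!/bin/sh
# Hypothetical wrapper around the two-stage pipeline (GNU sed assumed).
strip_comment_lines() {
    # Stage 1: insert a newline before every closing */.
    # Stage 2: delete each range of lines running from a line that
    #          contains /* to the next line that contains */.
    sed 's/\*\//\n&/g' | sed '/\/\*/,/\*\//d'
}

# Sample input modelled on the question's test.sas (an assumption).
printf '%s\n' '/*This is to print the output' 'data*/' \
    'proc print data=sashelp.cars;' 'run;' \
    '/*Creating dataset*/' 'data abc;' 'set xyz;' 'run;' |
    strip_comment_lines
```

    The deletion works on whole lines, so as far as I can tell any code that shares a line with a comment delimiter is removed together with the comment. The sketch is therefore best suited to input where comments sit on lines of their own, as in the question's sample.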















    edited Jul 21 '16 at 14:32

























    answered Jul 21 '16 at 11:36









    FarazX









    • 1




      You can use the -i option to edit the file in place instead of redirecting the output to a new file, or the much safer -i.bak to keep a backup of the file.
      – Rahul
      Jul 21 '16 at 12:18






    • 1




      It does not work for all cases either; try putting a comment on the same line as code and watch what happens, for example set xy; /*test*/. I think we will need perl to solve this in an easy way.
      – Luciano Andress Martini
      Jul 21 '16 at 12:19












    • @Rahul exactly, thanks for mentioning it. I just wanted to keep it simpler.
      – FarazX
      Jul 21 '16 at 12:21










    • I'm very sorry to say that it is not working for comments on the same line.
      – Luciano Andress Martini
      Jul 21 '16 at 12:38












    • @LucianoAndressMartini Now it does!
      – FarazX
      Jul 21 '16 at 18:28


































    up vote
    1
    down vote























    sed operates on one line at a time, but some of the comments in the input span multiple lines. As per https://unix.stackexchange.com/a/152389/90751 , you can first use tr to turn the line-breaks into some other character. Then sed can process the input as a single line, and you use tr again to restore the line-breaks.

    tr '\n' '\0' | sed ... | tr '\0' '\n'

    I've used null bytes, but you can pick any character that doesn't appear in your input file.

    * has a special meaning in regular expressions, so it will need escaping as \* to match a literal *.

    .* is greedy -- it will match the longest possible text, including more */ and /*. That means the first comment, the last comment, and everything in between. To restrict this, replace .* with a stricter pattern: comments can contain anything that's not a "*", and also "*" followed by anything that's not a "/". Runs of multiple *s also have to be accounted for:

    tr '\n' '\0' | sed -e 's,/\*\([^*]\|\*\+[^*/]\)*\*\+/,,g' | tr '\0' '\n'

    This will remove any linebreaks in the multiline comments, i.e.

    data1 /* multiline
    comment */ data2

    will become

    data1  data2

    If this isn't what was wanted, sed can be told to keep one of the linebreaks. This means picking a linebreak replacement character that can be matched.

    tr '\n' '\f' | sed -e 's,/\*\(\(\f\)\|[^*]\|\*\+[^*/]\)*\*\+/,\2,g' | tr '\f' '\n'

    The special character \f, and the use of a back-reference that may not have matched anything, aren't guaranteed to work as intended in all sed implementations. (I confirmed it works on GNU sed 4.07 and 4.2.2.)
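
    The same idea can be sketched as a small filter. Two details here are assumptions of this sketch and not the answer's exact command: it uses extended regular expressions (sed -E) so that the grouping and alternation need no backslashes, and it uses the control character \001 as the line-break placeholder, on the assumption that this character never occurs in the input. The function name strip_comments_sed is hypothetical.

```shell
#!/bin/sh
# Hypothetical filter: join lines, strip /* ... */ comments, split lines again.
strip_comments_sed() {
    # 1. tr turns every newline into \001 so sed sees one long line.
    # 2. The pattern matches "/*", then any run of either a non-"*"
    #    character or "*"s followed by something other than "*" or "/",
    #    then the closing "*"s and "/".
    # 3. tr turns the placeholders back into newlines.
    tr '\n' '\001' |
        sed -E 's,/\*([^*]|\*+[^*/])*\*+/,,g' |
        tr '\001' '\n'
}

# The multiline example from the answer: the line-break inside the
# comment disappears together with the comment.
printf '%s\n' 'data1 /* multiline' 'comment */ data2' | strip_comments_sed

# Adjacent comments and runs of "*" are each matched separately.
printf '%s\n' '/*a*/x/**b**/' | strip_comments_sed
```

    Since the pattern can never consume a "*/" pair inside the comment body, each match ends at the first closing delimiter, which is what keeps the code between two comments intact.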















    edited Jul 21 '16 at 12:48

























    answered Jul 21 '16 at 12:06









    JigglyNaga













    • Could you please let me know how it will work? I tried the following: tr '\n' '\0' | sed -e 's,/\*\([^*]\|\*\+[^*/]\)*\*\+/,,g' test.sas | tr '\0' '\n' and I got the following: /*This is to print the output data*/data abcdf; set cfgtr; run; proc print data=sashelp.cars; run; data abc; set xyz; run;
      – Sharique Alam
      Aug 5 '16 at 13:25












    • @ShariqueAlam You've put test.sas in the middle of the pipeline there, so sed reads from it directly, and the first tr has no effect. You need to use cat test.sas | tr ...
      – JigglyNaga
      Aug 6 '16 at 14:49






































        up vote
        0
        down vote





















        Using a one-line sed command to remove comments:

        sed '/\/\*/d;/\*\//d' file

        proc print data=sashelp.cars;
        run;
        data abc;
        set xyz;
        run;
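
        A short sketch of what this one-liner does and where it stops working. The wrapper name delete_delimiter_lines and the inline inputs are hypothetical. The command deletes every line that contains /* or */, so it covers comments of one or two lines, as in the question's sample, but not the inner lines of longer comments.

```shell
#!/bin/sh
# Hypothetical wrapper around the one-liner from the answer.
delete_delimiter_lines() {
    # Delete every line that contains /* and every line that contains */.
    sed '/\/\*/d;/\*\//d'
}

# One-line comment: the whole comment line is deleted.
printf '%s\n' '/*Creating dataset*/' 'data abc;' | delete_delimiter_lines

# Three-line comment: the middle line holds no delimiter, so it is kept.
printf '%s\n' '/* first' 'second' 'third */' 'run;' | delete_delimiter_lines
```

        The second call shows the limitation: the line "second" survives because nothing on it marks it as part of a comment.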
















        answered Jan 19 '17 at 15:37









        user5337995































             
