Finding all “Non-Binary” files











Is it possible to use the find command to find all the "non-binary" files in a directory? Here's the problem I'm trying to solve.



I've received an archive of files from a Windows user. This archive contains source code and image files. Our build system doesn't play nice with files that have Windows line endings. I have a command-line program (flip -u) that will flip line endings between *nix and Windows. So, I'd like to do something like this:



find . -type f | xargs flip -u


However, if this command is run against an image file, or other binary media file, it will corrupt the file. I realize I could build a list of file extensions and filter with that, but I'd rather have something that's not reliant on me keeping that list up to date.



So, is there a way to find all the non-binary files in a directory tree? Or is there an alternate solution I should consider?










  • You could use the file utility somewhere in your script/pipeline to identify whether the file is data or text
    – lk-
    Aug 24 '12 at 18:59








  • What do you mean by non-binary (everything on a modern computer is binary)? I am guessing you are using the distinction from the old CP/M operating system, which had text and binary files. Text files could be of any length but had to end with a Ctrl-Z, and binary files had to be a multiple of a 512-byte block. If so, you mean text files. (I also note that you write about line endings in non-binary files; this also suggests that they are text files.) Is this correct?
    – ctrl-alt-delor
    Jan 6 '17 at 17:05










  • All files are binary, it is just a matter of interpretation. Are you asking for how to find text files?
    – ctrl-alt-delor
    May 17 '17 at 20:21










  • @richard I come from an era where we called files meant to be interpreted as plain text "plain text", and all other files (images, word processing docs, etc.) "binary". I know it's all just ones and zeros under the hood :)
    – Alan Storm
    May 17 '17 at 20:28






  • Ah, I see what you mean about my terms -- I'll use binary/text in the future to avoid confusion. Re: the \r\n thing -- it's my understanding those are the ASCII characters for a typewriter's carriage return (move to the beginning of the line) and line feed (move down one line). So \r\n is a "more accurate" model of the real-world physical thing an end-of-line character was for. Pre OS X, Macs used just a \r for this. I usually write the whole thing off as "arbitrary choices made in a rush that we're still dealing with"
    – Alan Storm
    May 17 '17 at 22:29















9 Answers























18 votes (accepted)










I'd use file and pipe the output into grep or awk to find text files, then extract just the filename portion of file's output and pipe that into xargs.



something like:



file * | awk -F: '/ASCII text/ {print $1}' | xargs -d'\n' -r flip -u


Note that the pattern searches for 'ASCII text' rather than just any 'text' - you probably don't want to mess with Rich Text documents or Unicode text files etc.



You can also use find (or whatever) to generate a list of files to examine with file:



find /path/to/files -type f -exec file {} + | 
awk -F: '/ASCII text/ {print $1}' | xargs -d'\n' -r flip -u


The -d'\n' argument to xargs makes xargs treat each input line as a separate argument, thus catering for filenames with spaces and other problematic characters. In other words, it's an alternative to xargs -0 when the input source doesn't or can't generate NUL-separated output (such as find's -print0 option). According to the changelog, xargs got the -d/--delimiter option in Sep 2005, so it should be in any non-ancient Linux distro (I wasn't sure, which is why I checked - I just vaguely remembered it was a "recent" addition).



Note that a linefeed is a valid character in filenames, so this will break if any filenames have linefeeds in them. For typical unix users, this is pathologically insane, but isn't unheard of if the files originated on Mac or Windows machines.
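
A linefeed-proof variant is sketched below. This is not part of the original answer; it assumes bash (for read -d '') and a file whose description for the files you care about starts with "ASCII text":

find . -type f -print0 |
while IFS= read -r -d '' f; do
    # file -b prints only the description, so filenames never pass through the text pipeline
    case $(file -b -- "$f") in
        "ASCII text"*) flip -u "$f" ;;
    esac
done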



Also note that file is not perfect. It's very good at detecting the type of data in a file but can occasionally get confused.



I have used numerous variations of this method many times in the past with success.

























  • Thanks for this solution! For some reason file displays English text rather than ASCII text on my Solaris system, so I modified that portion accordingly. Also, I replaced awk -F: '{print $1}' with the equivalent cut -f1 -d:.
    – Andrew Cheong
    Dec 10 '13 at 18:12






  • worth saying grep -I filters binaries
    – xenoterracide
    Aug 10 '16 at 17:31












  • Looking for the word text should be sufficient. This will also pick up file descriptions like ASCII Java program text or HTML document text or troff or preprocessor input text.
    – user1024
    Nov 1 '16 at 23:02










  • My answer is partially a response/improvement upon this answer. Very good point about grepping for ASCII text to avoid messing up RTFs.
    – Wildcard
    Nov 5 '16 at 16:03






  • xenoterracide: You saved my life man! Just a flag -I and BINGO
    – Sergio Abreu
    Jan 4 '17 at 21:38


















9 votes













No. There is nothing special about a binary or non-binary file. You can use heuristics like 'contains only characters in 0x01–0x7F', but that'll call text files with non-ASCII characters binary files, and unlucky binary files text files.
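
For illustration only (this is not from the answer), such a heuristic might look like the sketch below; it flags anything containing bytes outside printable ASCII and ordinary whitespace, so it will misclassify UTF-8 text exactly as described. "somefile" is a placeholder name:

# Rough heuristic sketch: grep exits 0 if a non-printable, non-whitespace byte is found
if LC_ALL=C grep -q '[^[:print:][:space:]]' somefile; then
    echo "looks binary"
else
    echo "looks like plain ASCII text"
fi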



Now, once you've ignored that...



zip files



If it's coming from your Windows user as a zip file, the zip format supports marking files as either binary or text in the archive itself. You can use unzip's -a option to pay attention to this and convert. Of course, see the first paragraph for why this may not be a good idea (the zip program may have guessed wrong when it made the archive).



zipinfo will tell you which files are binary (b) or text (t) in its zipfile listing.
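
As a rough illustration (the archive name here is a placeholder):

# Let unzip convert the entries the archive itself marks as text
unzip -a source-from-windows.zip -d src

# Or just inspect the listing: each entry is marked t (text) or b (binary)
zipinfo source-from-windows.zip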



other files



The file command will look at a file and try to identify it. In particular, you'll probably find its -i (output MIME type) option useful; only convert files with type text/*
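
A minimal sketch of that idea (not from the answer; it assumes a file where -i prints the MIME type, as described above):

# Convert only files whose detected MIME type is text/*
find . -type f -exec sh -c '
    for f do
        case $(file -b -i -- "$f") in
            text/*) flip -u "$f" ;;
        esac
    done
' sh {} +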


































    6 votes















    A general solution to only process non-binary files in bash using file -b --mime-encoding:



    while IFS= read -d '' -r file; do
        [[ "$(file -b --mime-encoding "$file")" = binary ]] &&
            { echo "Skipping $file."; continue; }

        echo "Processing $file."

        # ...

    done < <(find . -type f -print0)


    I contacted the author of the file utility and he added a nifty -00 parameter in version 5.26 (released 2016-04-16; it is e.g. in current Arch and Ubuntu 16.10) which prints file\0result\0 for multiple files fed to it at once. This way you can do e.g.:



    find . -type f -exec file -00 --mime-encoding {} + |
        awk 'BEGIN{ORS=RS="\0"}{if(NR%2)f=$0;else if(!/binary/)print f}' | …


    (The awk part is there to filter out every binary file. ORS is the output record separator.)



    It can also be used in a loop, of course:



    while IFS= read -d '' -r file; do

        echo "Processing $file."

        # ...

    done < <(find . -type f -exec file -00 --mime-encoding {} + |
        awk 'BEGIN{ORS=RS="\0"}{if(NR%2)f=$0;else if(!/binary/)print f}')


    Based on this and the previous method, I created a little bash script for filtering out binary files. It uses the new method with the -00 parameter of file where available and falls back to the previous method on older versions:



    #!/bin/bash

    # Expects files as arguments and returns the ones that do
    # not appear to be binary files as a zero-separated list.
    #
    # USAGE:
    # filter_binary_files.sh [FILES...]
    #
    # EXAMPLE:
    # find . -type f -mtime +5 -exec ./filter_binary_files.sh {} + | xargs -0 ...
    #

    [[ $# -eq 0 ]] && exit

    if [[ "$(file -v)" =~ file-([1-9][0-9]|[6-9]|5\.([3-9][0-9]|2[6-9])) ]]; then
        file -00 --mime-encoding -- "$@" |
            awk 'BEGIN{ORS=RS="\0"}{if(NR%2)f=$0;else if(!/binary/)print f}'
    else
        for f do
            [[ "$(file -b --mime-encoding -- "$f")" != binary ]] &&
                printf '%s\0' "$f"
        done
    fi


    Or here is a more POSIX-y one, but it requires support for sort -V:



    #!/bin/sh

    # Expects files as arguments and returns the ones that do
    # not appear to be binary files as a zero-separated list.
    #
    # USAGE:
    # filter_binary_files.sh [FILES...]
    #
    # EXAMPLE:
    # find . -type f -mtime +5 -exec ./filter_binary_files.sh {} + | xargs -0 ...
    #

    [ $# -eq 0 ] && exit

    if [ "$(printf '%s\n' 'file-5.26' "$(file -v | head -1)" | sort -V | head -1)" = 'file-5.26' ]; then
        file -00 --mime-encoding -- "$@" |
            awk 'BEGIN{ORS=RS="\0"}{if(NR%2)f=$0;else if(!/binary/)print f}'
    else
        for f do
            [ "$(file -b --mime-encoding -- "$f")" != binary ] &&
                printf '%s\0' "$f"
        done
    fi



































      4 votes













      Cas's answer is good, but it assumes sane filenames; in particular it is assumed that filenames will not contain newlines.



      There's no good reason to make this assumption here, since it is quite simple (and actually cleaner in my opinion) to handle that case correctly as well:



      find . -type f -exec sh -c 'file "$1" | grep -q "ASCII text"' sh {} \; -exec flip -u {} \;


      The find command only makes use of POSIX-specified features. Using -exec to run arbitrary commands as boolean tests is simple, robust (handles odd filenames correctly), and more portable than -print0.



      In fact, all parts of the command are specified by POSIX except for flip.



      Note that file doesn't guarantee accuracy of the results it returns. However, in practice grepping for "ASCII text" in its output is quite reliable.



      (It might miss some text files perhaps, but is very very unlikely to incorrectly identify a binary file as "ASCII text" and mangle it—so we are erring on the side of caution.)





























      • Argument-less file calls can be quite slow, e.g. for videos it will tell you everything about the encoding.
        – phk
        Nov 6 '16 at 17:27










      • Also you are assuming no file starts with -.
        – phk
        Nov 6 '16 at 17:29










      • And I see no reason why you wouldn't just do a single call to file, it can take multiple files as arguments.
        – phk
        Nov 6 '16 at 17:45










      • @phk, to address your comments: (1) it's good to know the potential slowness, but I see no POSIX way to prevent that; (2) I make zero assumptions about file names, as the find command will prefix ./ to any filename passed to the shell command; (3) Using grep as a test on a single file command output at a time is the only POSIX way I can see to guarantee correct handling of filenames that may contain newlines.
        – Wildcard
        Nov 6 '16 at 18:59












      • I looked over your final "POSIX-y" solution and I think it's clever—but you assume that file supports the --mime-encoding flag and the -- separator, neither of which is guaranteed by POSIX.
        – Wildcard
        Nov 6 '16 at 19:02




















      4 votes













      The accepted answer didn't find all of them for me. Here is an example using grep's -I to ignore binaries, and ignoring all hidden files...



      find . -type f -not -path '*/.*' -exec grep -Il '.' {} \; | xargs -L 1 echo


      Here it is in use in a practical application: dos2unix



      https://unix.stackexchange.com/a/365679/112190



      Hope that helps.


































        2 votes













        find . -type f -exec grep -I -q . {} \; -print


        This will find all regular files (-type f) in the current directory (or below) that grep thinks are non-empty and non-binary.



        It uses grep -I to distinguish between binary and non-binary files. The -I flag will cause grep to exit with a non-zero exit status when it detects that a file is binary. A "binary" file is, according to grep, a file that contains characters outside the printable ASCII range.



        The -q option to grep will cause it to quit with a zero exit status if the given pattern is found, without emitting any data. The pattern that we use is a single dot, which will match any character.



        If the file is found to be non-binary and if it contains at least one character, the name of the file is printed.



        If you feel brave, you can plug your flip -u into it as well:



        find . -type f -exec grep -I -q . {} \; -print -exec flip -u {} \;



































          1 vote













          Try this:



          find . -type f -print0 | xargs -0 -r grep -Z -L -U '[^         -~]' | xargs -0 -r flip -u


          Where the argument of grep '[^ -~]' is '[^<tab><space>-~]'.



          If you type it on a shell command line, type Ctrl+V before Tab.
          In an editor, there should be no problem.





          • '[^<tab><space>-~]' will match any character which is not ASCII text (carriage returns are ignored by grep).


          • -L will print only the names of files that do not match


          • -Z will output filenames separated with a null character (for xargs -0)
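
          If typing the literal tab is awkward, a variant is sketched below. It is not from the answer; it assumes bash or zsh, where $'\t' expands to a tab character:

          tab=$'\t'
          find . -type f -print0 |
            xargs -0 -r grep -Z -L -U "[^$tab -~]" |
            xargs -0 -r flip -u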





























          • It's worth noting that with Perl-like Regex grep -P (if available) \t is available. Alternatively, using locale translation if the shell supports it: $'\t' (bash and zsh do).
            – phk
            Jan 6 '17 at 19:51




















          1 vote













          Alternate solution:



          The dos2unix command will convert line endings from Windows CRLF to Unix LF, and automatically skip binary files. I apply it recursively using:



          find . -type f -exec dos2unix {} \;


























          • Since dos2unix can take multiple filenames as argument, it is much more efficient to do find . -type f -exec dos2unix {} +
            – Anthon
            Sep 21 '17 at 20:41


















          0 votes













          sudo find / \( -type f -and -path '*/git/*' -iname 'README' \) -exec grep -liI '100644\|100755' {} \; -exec flip -u {} \;



          i. \( -type f -and -path '*/git/*' -iname 'README' \): searches for files within a path containing the name git and with the name README. If you know a specific folder and filename to search for, this will be useful.



          ii. -exec runs a command on the file names generated by find



          iii. \; indicates the end of the command



          iv. {} is the file/folder name found by the previous find search



          v. Multiple commands can be run subsequently by appending -exec "command" \;, such as -exec flip -u {} \;



          vi. grep

          1. -l lists the name of the file
          2. -I searches only non-binary files
          3. -q quiet output
          4. '100644\|100755' searches for either 100644 or 100755 within the file found. If found, it then runs flip -u. \| is the "or" operator for grep.


          you can clone this test directory and try it out: https://github.com/alphaCTzo7G/stackexchange/tree/master/linux/findSolution204092017



          more detailed answer here: https://github.com/alphaCTzo7G/stackexchange/blob/master/linux/findSolution204092017/README.md




























            9 Answers
            9






            active

            oldest

            votes








            9 Answers
            9






            active

            oldest

            votes









            active

            oldest

            votes






            active

            oldest

            votes








            up vote
            18
            down vote



            accepted










            I'd use file and pipe the output into grep or awk to find text files, then extract just the filename portion of file's output and pipe that into xargs.



            something like:



            file * | awk -F: '/ASCII text/ {print $1}' | xargs -d'n' -r flip -u


            Note that the grep searches for 'ASCII text' rather than any just 'text' - you probably don't want to mess with Rich Text documents or unicode text files etc.



            You can also use find (or whatever) to generate a list of files to examine with file:



            find /path/to/files -type f -exec file {} + | 
            awk -F: '/ASCII text/ {print $1}' | xargs -d'n' -r flip -u


            The -d'n' argument to xargs makes xargs treat each input line as a separate argument, thus catering for filenames with spaces and other problematic characters. i.e. it's an alternative to xargs -0 when the input source doesn't or can't generate NULL-separated output (such as find's -print0 option). According to the changelog, xargs got the -d/--delimiter option in Sep 2005 so should be in any non-ancient linux distro (I wasn't sure, which is why I checked - I just vaguely remembered it was a "recent" addition).



            Note that a linefeed is a valid character in filenames, so this will break if any filenames have linefeeds in them. For typical unix users, this is pathologically insane, but isn't unheard of if the files originated on Mac or Windows machines.



            Also note that file is not perfect. It's very good at detecting the type of data in a file but can occasionally get confused.



            I have used numerous variations of this method many times in the past with success.






            share|improve this answer



















            • 1




              Thanks for this solution! For some reason file displays English text rather than ASCII text on my Solaris system, so I modified that portion accordingly. Also, I replaced awk -F: '{print $1}' with the equivalent cut -f1 -d:.
              – Andrew Cheong
              Dec 10 '13 at 18:12






            • 2




              worth saying grep -I filters binaries
              – xenoterracide
              Aug 10 '16 at 17:31












            • Looking for the word text should be sufficient. This will also pick up file descriptions like ASCII Java program text or HTML document text or troff or preprocessor input text.
              – user1024
              Nov 1 '16 at 23:02










            • My answer is partially a response/improvement upon this answer. Very good point about grepping for ASCII text to avoid messing up RTFs.
              – Wildcard
              Nov 5 '16 at 16:03






            • 1




              xenoterracide: You saved my life man ! Just a flag -I and BINGO
              – Sergio Abreu
              Jan 4 '17 at 21:38















            up vote
            18
            down vote



            accepted










            I'd use file and pipe the output into grep or awk to find text files, then extract just the filename portion of file's output and pipe that into xargs.



            something like:



            file * | awk -F: '/ASCII text/ {print $1}' | xargs -d'n' -r flip -u


            Note that the grep searches for 'ASCII text' rather than any just 'text' - you probably don't want to mess with Rich Text documents or unicode text files etc.



            You can also use find (or whatever) to generate a list of files to examine with file:



            find /path/to/files -type f -exec file {} + | 
            awk -F: '/ASCII text/ {print $1}' | xargs -d'n' -r flip -u


            The -d'n' argument to xargs makes xargs treat each input line as a separate argument, thus catering for filenames with spaces and other problematic characters. i.e. it's an alternative to xargs -0 when the input source doesn't or can't generate NULL-separated output (such as find's -print0 option). According to the changelog, xargs got the -d/--delimiter option in Sep 2005 so should be in any non-ancient linux distro (I wasn't sure, which is why I checked - I just vaguely remembered it was a "recent" addition).



            Note that a linefeed is a valid character in filenames, so this will break if any filenames have linefeeds in them. For typical unix users, this is pathologically insane, but isn't unheard of if the files originated on Mac or Windows machines.



            Also note that file is not perfect. It's very good at detecting the type of data in a file but can occasionally get confused.



            I have used numerous variations of this method many times in the past with success.






            share|improve this answer



















            • 1




              Thanks for this solution! For some reason file displays English text rather than ASCII text on my Solaris system, so I modified that portion accordingly. Also, I replaced awk -F: '{print $1}' with the equivalent cut -f1 -d:.
              – Andrew Cheong
              Dec 10 '13 at 18:12






            • 2




              worth saying grep -I filters binaries
              – xenoterracide
              Aug 10 '16 at 17:31












            • Looking for the word text should be sufficient. This will also pick up file descriptions like ASCII Java program text or HTML document text or troff or preprocessor input text.
              – user1024
              Nov 1 '16 at 23:02










            • My answer is partially a response/improvement upon this answer. Very good point about grepping for ASCII text to avoid messing up RTFs.
              – Wildcard
              Nov 5 '16 at 16:03






            • 1




              xenoterracide: You saved my life man ! Just a flag -I and BINGO
              – Sergio Abreu
              Jan 4 '17 at 21:38













            up vote
            18
            down vote



            accepted







            up vote
            18
            down vote



            accepted






            I'd use file and pipe the output into grep or awk to find text files, then extract just the filename portion of file's output and pipe that into xargs.



            something like:



            file * | awk -F: '/ASCII text/ {print $1}' | xargs -d'n' -r flip -u


            Note that the grep searches for 'ASCII text' rather than any just 'text' - you probably don't want to mess with Rich Text documents or unicode text files etc.



            You can also use find (or whatever) to generate a list of files to examine with file:



            find /path/to/files -type f -exec file {} + | 
            awk -F: '/ASCII text/ {print $1}' | xargs -d'n' -r flip -u


            The -d'n' argument to xargs makes xargs treat each input line as a separate argument, thus catering for filenames with spaces and other problematic characters. i.e. it's an alternative to xargs -0 when the input source doesn't or can't generate NULL-separated output (such as find's -print0 option). According to the changelog, xargs got the -d/--delimiter option in Sep 2005 so should be in any non-ancient linux distro (I wasn't sure, which is why I checked - I just vaguely remembered it was a "recent" addition).



            Note that a linefeed is a valid character in filenames, so this will break if any filenames have linefeeds in them. For typical unix users, this is pathologically insane, but isn't unheard of if the files originated on Mac or Windows machines.



            Also note that file is not perfect. It's very good at detecting the type of data in a file but can occasionally get confused.



            I have used numerous variations of this method many times in the past with success.






            share|improve this answer














            I'd use file and pipe the output into grep or awk to find text files, then extract just the filename portion of file's output and pipe that into xargs.



            something like:



            file * | awk -F: '/ASCII text/ {print $1}' | xargs -d'n' -r flip -u


            Note that the grep searches for 'ASCII text' rather than any just 'text' - you probably don't want to mess with Rich Text documents or unicode text files etc.



            You can also use find (or whatever) to generate a list of files to examine with file:



            find /path/to/files -type f -exec file {} + | 
            awk -F: '/ASCII text/ {print $1}' | xargs -d'n' -r flip -u


            The -d'n' argument to xargs makes xargs treat each input line as a separate argument, thus catering for filenames with spaces and other problematic characters. i.e. it's an alternative to xargs -0 when the input source doesn't or can't generate NULL-separated output (such as find's -print0 option). According to the changelog, xargs got the -d/--delimiter option in Sep 2005 so should be in any non-ancient linux distro (I wasn't sure, which is why I checked - I just vaguely remembered it was a "recent" addition).



            Note that a linefeed is a valid character in filenames, so this will break if any filenames have linefeeds in them. For typical unix users, this is pathologically insane, but isn't unheard of if the files originated on Mac or Windows machines.



            Also note that file is not perfect. It's very good at detecting the type of data in a file but can occasionally get confused.



            I have used numerous variations of this method many times in the past with success.







            share|improve this answer














            share|improve this answer



            share|improve this answer








            edited Mar 21 '17 at 10:24

























            answered Aug 25 '12 at 1:15









            cas

            38.5k450100




            38.5k450100








            • 1




              Thanks for this solution! For some reason file displays English text rather than ASCII text on my Solaris system, so I modified that portion accordingly. Also, I replaced awk -F: '{print $1}' with the equivalent cut -f1 -d:.
              – Andrew Cheong
              Dec 10 '13 at 18:12






            • 2




              worth saying grep -I filters binaries
              – xenoterracide
              Aug 10 '16 at 17:31












            • Looking for the word text should be sufficient. This will also pick up file descriptions like ASCII Java program text or HTML document text or troff or preprocessor input text.
              – user1024
              Nov 1 '16 at 23:02










            • My answer is partially a response/improvement upon this answer. Very good point about grepping for ASCII text to avoid messing up RTFs.
              – Wildcard
              Nov 5 '16 at 16:03






            • 1




              xenoterracide: You saved my life man ! Just a flag -I and BINGO
              – Sergio Abreu
              Jan 4 '17 at 21:38














            • 1




              Thanks for this solution! For some reason file displays English text rather than ASCII text on my Solaris system, so I modified that portion accordingly. Also, I replaced awk -F: '{print $1}' with the equivalent cut -f1 -d:.
              – Andrew Cheong
              Dec 10 '13 at 18:12






            • 2




              worth saying grep -I filters binaries
              – xenoterracide
              Aug 10 '16 at 17:31












            • Looking for the word text should be sufficient. This will also pick up file descriptions like ASCII Java program text or HTML document text or troff or preprocessor input text.
              – user1024
              Nov 1 '16 at 23:02










            • My answer is partially a response/improvement upon this answer. Very good point about grepping for ASCII text to avoid messing up RTFs.
              – Wildcard
              Nov 5 '16 at 16:03






            • 1




              xenoterracide: You saved my life man ! Just a flag -I and BINGO
              – Sergio Abreu
              Jan 4 '17 at 21:38








            1




            1




            Thanks for this solution! For some reason file displays English text rather than ASCII text on my Solaris system, so I modified that portion accordingly. Also, I replaced awk -F: '{print $1}' with the equivalent cut -f1 -d:.
            – Andrew Cheong
            Dec 10 '13 at 18:12




            Thanks for this solution! For some reason file displays English text rather than ASCII text on my Solaris system, so I modified that portion accordingly. Also, I replaced awk -F: '{print $1}' with the equivalent cut -f1 -d:.
            – Andrew Cheong
            Dec 10 '13 at 18:12




            2




            2




            worth saying grep -I filters binaries
            – xenoterracide
            Aug 10 '16 at 17:31






            worth saying grep -I filters binaries
            – xenoterracide
            Aug 10 '16 at 17:31














            Looking for the word text should be sufficient. This will also pick up file descriptions like ASCII Java program text or HTML document text or troff or preprocessor input text.
            – user1024
            Nov 1 '16 at 23:02




            Looking for the word text should be sufficient. This will also pick up file descriptions like ASCII Java program text or HTML document text or troff or preprocessor input text.
            – user1024
            Nov 1 '16 at 23:02












            My answer is partially a response/improvement upon this answer. Very good point about grepping for ASCII text to avoid messing up RTFs.
            – Wildcard
            Nov 5 '16 at 16:03




            My answer is partially a response/improvement upon this answer. Very good point about grepping for ASCII text to avoid messing up RTFs.
            – Wildcard
            Nov 5 '16 at 16:03




            1




            1




            xenoterracide: You saved my life man ! Just a flag -I and BINGO
            – Sergio Abreu
            Jan 4 '17 at 21:38




            xenoterracide: You saved my life man ! Just a flag -I and BINGO
            – Sergio Abreu
            Jan 4 '17 at 21:38












            up vote
            9
            down vote













            No. There is nothing special about a binary or non-binary file. You can use heuristics like 'contains only characters in 0x01–0x7F', but that'll call text files with non-ASCII characters binary files, and unlucky binary files text files.



            Now, once you've ignored that...



            zip files



            If its coming from your Windows user as a zip file, the zip format supports marking files as either binary or text in the archive itself. You can use unzip's -a option to pay attention to this and convert. Of course, see the first paragraph for why this may not be a good idea (the zip program may have guessed wrong when it made the archive).



            zipinfo will tell you which files are binary (b) or text (t) in its zipfile listing.



            other files



            The file command will look at a file and try to identify it. In particular, you'll probably find its -i (output MIME type) option useful; only convert files with type text/*






            share|improve this answer

























              up vote
              9
              down vote













              No. There is nothing special about a binary or non-binary file. You can use heuristics like 'contains only characters in 0x01–0x7F', but that'll call text files with non-ASCII characters binary files, and unlucky binary files text files.



              Now, once you've ignored that...



              zip files



              If its coming from your Windows user as a zip file, the zip format supports marking files as either binary or text in the archive itself. You can use unzip's -a option to pay attention to this and convert. Of course, see the first paragraph for why this may not be a good idea (the zip program may have guessed wrong when it made the archive).



              zipinfo will tell you which files are binary (b) or text (t) in its zipfile listing.



              other files



              The file command will look at a file and try to identify it. In particular, you'll probably find its -i (output MIME type) option useful; only convert files with type text/*






              share|improve this answer























                up vote
                9
                down vote










                up vote
                9
                down vote









                No. There is nothing special about a binary or non-binary file. You can use heuristics like 'contains only characters in 0x01–0x7F', but that'll call text files with non-ASCII characters binary files, and unlucky binary files text files.



                Now, once you've ignored that...



                zip files



                If its coming from your Windows user as a zip file, the zip format supports marking files as either binary or text in the archive itself. You can use unzip's -a option to pay attention to this and convert. Of course, see the first paragraph for why this may not be a good idea (the zip program may have guessed wrong when it made the archive).



                zipinfo will tell you which files are binary (b) or text (t) in its zipfile listing.



                other files



                The file command will look at a file and try to identify it. In particular, you'll probably find its -i (output MIME type) option useful; only convert files with type text/*






                share|improve this answer












                No. There is nothing special about a binary or non-binary file. You can use heuristics like 'contains only characters in 0x01–0x7F', but that'll call text files with non-ASCII characters binary files, and unlucky binary files text files.



                Now, once you've ignored that...



                zip files



                If its coming from your Windows user as a zip file, the zip format supports marking files as either binary or text in the archive itself. You can use unzip's -a option to pay attention to this and convert. Of course, see the first paragraph for why this may not be a good idea (the zip program may have guessed wrong when it made the archive).



                zipinfo will tell you which files are binary (b) or text (t) in its zipfile listing.



                other files



                The file command will look at a file and try to identify it. In particular, you'll probably find its -i (output MIME type) option useful; only convert files with type text/*







                share|improve this answer












                share|improve this answer



                share|improve this answer










                answered Aug 24 '12 at 19:00









                derobert

                71.5k8152210




                71.5k8152210






















                    up vote
                    6
                    down vote















                    A general solution to only process non-binary files in bash using file -b --mime-encoding:



                    while IFS= read -d '' -r file; do
                    [[ "$(file -b --mime-encoding "$file")" = binary ]] &&
                    { echo "Skipping $file."; continue; }

                    echo "Processing $file."

                    # ...

                    done < <(find . -type f -print0)


                    I contacted the author of the file utility and he added a nifty -00 paramter in version 5.26 (released 2016-04-16, is e.g. in current Arch and Ubuntu 16.10) which prints fileresult for multiple files fed to it at once, this way you can do e.g.:



                    find . -type f -exec file -00 --mime-encoding {} + |
                    awk 'BEGIN{ORS=RS=""}{if(NR%2)f=$0;else if(!/binary/)print f}' | …


                    (The awk part is to filter out every file that isn't non-binary. ORS is the output separator.)



                    Can be also used in a loop of course:



                    while IFS= read -d '' -r file; do

                    echo "Processing $file."

                    # ...

                    done < <(find . -type f -exec file -00 --mime-encoding {} + |
                    awk 'BEGIN{ORS=RS=""}{if(NR%2)f=$0;else if(!/binary/)print f}')


                    Based of this and the previous I created a little bash script for filtering out binary files which utilizes the new method using the -00 parameter of file in newer versions of it and falls back to the previous method on older versions:



                    #!/bin/bash

                    # Expects files as arguments and returns the ones that do
                    # not appear to be binary files as a zero-separated list.
                    #
                    # USAGE:
                    # filter_binary_files.sh [FILES...]
                    #
                    # EXAMPLE:
                    # find . -type f -mtime +5 -exec ./filter_binary_files.sh {} + | xargs -0 ...
                    #

                    [[ $# -eq 0 ]] && exit

                    if [[ "$(file -v)" =~ file-([1-9][0-9]|[6-9]|5.([3-9][0-9]|2[6-9])) ]]; then
                    file -00 --mime-encoding -- "$@" |
                    awk 'BEGIN{ORS=RS=""}{if(NR%2)f=$0;else if(!/binary/)print f}'
                    else
                    for f do
                    [[ "$(file -b --mime-encoding -- "$f")" != binary ]] &&
                    printf '%s' "$f"
                    done
                    fi


                    Or here a more POSIX-y one, but it requires support for sort -V:



                    #!/bin/sh

                    # Expects files as arguments and returns the ones that do
                    # not appear to be binary files as a zero-separated list.
                    #
                    # USAGE:
                    # filter_binary_files.sh [FILES...]
                    #
                    # EXAMPLE:
                    # find . -type f -mtime +5 -exec ./filter_binary_files.sh {} + | xargs -0 ...
                    #

                    [ $# -eq 0 ] && exit

                    if [ "$(printf '%sn' 'file-5.26' "$(file -v | head -1)" | sort -V)" =
                    'file-5.26' ]; then
                    file -00 --mime-encoding -- "$@" |
                    awk 'BEGIN{ORS=RS=""}{if(NR%2)f=$0;else if(!/binary/)print f}'
                    else
                    for f do
                    [ "$(file -b --mime-encoding -- "$f")" != binary ] &&
                    printf '%s' "$f"
                    done
                    fi





                    share|improve this answer



























                      up vote
                      6
                      down vote















                      A general solution to only process non-binary files in bash using file -b --mime-encoding:



                      while IFS= read -d '' -r file; do
                      [[ "$(file -b --mime-encoding "$file")" = binary ]] &&
                      { echo "Skipping $file."; continue; }

                      echo "Processing $file."

                      # ...

                      done < <(find . -type f -print0)


                      I contacted the author of the file utility and he added a nifty -00 paramter in version 5.26 (released 2016-04-16, is e.g. in current Arch and Ubuntu 16.10) which prints fileresult for multiple files fed to it at once, this way you can do e.g.:



                      find . -type f -exec file -00 --mime-encoding {} + |
                      awk 'BEGIN{ORS=RS=""}{if(NR%2)f=$0;else if(!/binary/)print f}' | …


                      (The awk part is to filter out every file that isn't non-binary. ORS is the output separator.)



                      Can be also used in a loop of course:



                      while IFS= read -d '' -r file; do

                      echo "Processing $file."

                      # ...

                      done < <(find . -type f -exec file -00 --mime-encoding {} + |
                      awk 'BEGIN{ORS=RS=""}{if(NR%2)f=$0;else if(!/binary/)print f}')


                      Based of this and the previous I created a little bash script for filtering out binary files which utilizes the new method using the -00 parameter of file in newer versions of it and falls back to the previous method on older versions:



                      #!/bin/bash

                      # Expects files as arguments and returns the ones that do
                      # not appear to be binary files as a zero-separated list.
                      #
                      # USAGE:
                      # filter_binary_files.sh [FILES...]
                      #
                      # EXAMPLE:
                      # find . -type f -mtime +5 -exec ./filter_binary_files.sh {} + | xargs -0 ...
                      #

                      [[ $# -eq 0 ]] && exit

                      if [[ "$(file -v)" =~ file-([1-9][0-9]|[6-9]|5.([3-9][0-9]|2[6-9])) ]]; then
                      file -00 --mime-encoding -- "$@" |
                      awk 'BEGIN{ORS=RS=""}{if(NR%2)f=$0;else if(!/binary/)print f}'
                      else
                      for f do
                      [[ "$(file -b --mime-encoding -- "$f")" != binary ]] &&
                      printf '%s' "$f"
                      done
                      fi


                      Or here a more POSIX-y one, but it requires support for sort -V:



#!/bin/sh

# Expects files as arguments and returns the ones that do
# not appear to be binary files as a zero-separated list.
#
# USAGE:
# filter_binary_files.sh [FILES...]
#
# EXAMPLE:
# find . -type f -mtime +5 -exec ./filter_binary_files.sh {} + | xargs -0 ...
#

[ $# -eq 0 ] && exit

if [ "$(printf '%s\n' 'file-5.26' "$(file -v | head -1)" | sort -V | head -1)" = \
     'file-5.26' ]; then
    file -00 --mime-encoding -- "$@" |
        awk 'BEGIN{ORS=RS="\0"}{if(NR%2)f=$0;else if(!/binary/)print f}'
else
    for f do
        [ "$(file -b --mime-encoding -- "$f")" != binary ] &&
            printf '%s\0' "$f"
    done
fi
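
The version check in the POSIX variant is the usual sort -V idiom: sort the required version together with the installed one and see which comes first. A standalone sketch of that idiom (the function name is mine, not part of the answer):

# Succeeds if the locally installed file(1) reports version $1 or newer.
file_is_at_least() {
    req="file-$1"
    cur=$(file -v | head -n 1)
    [ "$(printf '%s\n' "$req" "$cur" | sort -V | head -n 1)" = "$req" ]
}

file_is_at_least 5.26 && echo "can use file -00" || echo "use the per-file fallback"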





answered Mar 2 '16 at 11:10, edited Mar 24 at 18:30
– phk

























                            up vote
                            4
                            down vote













                            Cas's answer is good, but it assumes sane filenames; in particular it is assumed that filenames will not contain newlines.



                            There's no good reason to make this assumption here, since it is quite simple (and actually cleaner in my opinion) to handle that case correctly as well:



find . -type f -exec sh -c 'file "$1" | grep -q "ASCII text"' sh {} \; -exec flip -u {} \;


                            The find command only makes use of POSIX-specified features. Using -exec to run arbitrary commands as boolean tests is simple, robust (handles odd filenames correctly), and more portable than -print0.



                            In fact, all parts of the command are specified by POSIX except for flip.



                            Note that file doesn't guarantee accuracy of the results it returns. However, in practice grepping for "ASCII text" in its output is quite reliable.



                            (It might miss some text files perhaps, but is very very unlikely to incorrectly identify a binary file as "ASCII text" and mangle it—so we are erring on the side of caution.)
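
Before letting it loose with flip -u, the same test can be used for a harmless dry run that only lists the files it would touch:

find . -type f -exec sh -c 'file "$1" | grep -q "ASCII text"' sh {} \; -print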






answered Nov 5 '16 at 16:01, edited Apr 13 '17 at 12:36 by Community
– Wildcard























                            • Argument-less file calls can be quite slow, e.g. for videos it will tell you everything about the encoding.
                              – phk
                              Nov 6 '16 at 17:27










                            • Also you are assuming no file starts with -.
                              – phk
                              Nov 6 '16 at 17:29










                            • And I see no reason why you wouldn't just do a single call to file, it can take multiple files as arguments.
                              – phk
                              Nov 6 '16 at 17:45










                            • @phk, to address your comments: (1) it's good to know the potential slowness, but I see no POSIX way to prevent that; (2) I make zero assumptions about file names, as the find command will prefix ./ to any filename passed to the shell command; (3) Using grep as a test on a single file command output at a time is the only POSIX way I can see to guarantee correct handling of filenames that may contain newlines.
                              – Wildcard
                              Nov 6 '16 at 18:59












                            • I looked over your final "POSIX-y" solution and I think it's clever—but you assume that file supports the --mime-encoding flag and the -- separator, neither of which is guaranteed by POSIX.
                              – Wildcard
                              Nov 6 '16 at 19:02

















                            up vote
                            4
                            down vote













                            The accepted answer didn't find all of them for me. Here is an example using grep's -I to ignore binaries, and ignoring all hidden files...



find . -type f -not -path '*/.*' -exec grep -Il '.' {} \; | xargs -L 1 echo


                            Here it is in use in a practical application: dos2unix



                            https://unix.stackexchange.com/a/365679/112190



                            Hope that helps.






answered May 17 '17 at 17:37
– phyatt

























                                    up vote
                                    2
                                    down vote













find . -type f -exec grep -I -q . {} \; -print


                                    This will find all regular files (-type f) in the current directory (or below) that grep thinks are non-empty and non-binary.



It uses grep -I to distinguish between binary and non-binary files. The -I flag will cause grep to exit with a non-zero exit status when it detects that a file is binary. A "binary" file is, according to grep, a file that contains characters outside the printable ASCII range.



                                    The -q option to grep will cause it to quit with a zero exit status if the given pattern is found, without emitting any data. The pattern that we use is a single dot, which will match any character.



                                    If the file is found to be non-binary and if it contains at least one character, the name of the file is printed.
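
To see the test in isolation, you can run it against a single file and check the exit status (the file names here are just examples):

grep -I -q . README.md && echo 'looks like text' || echo 'binary or empty'
grep -I -q . logo.png  && echo 'looks like text' || echo 'binary or empty'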



                                    If you feel brave, you can plug your flip -u into it as well:



find . -type f -exec grep -I -q . {} \; -print -exec flip -u {} \;





answered May 17 '17 at 20:09, edited Dec 4 at 13:31
– Kusalananda



























                                            up vote
                                            1
                                            down vote













Try this:



                                            find . -type f -print0 | xargs -0 -r grep -Z -L -U '[^         -~]' | xargs -0 -r flip -u


                                            Where the argument of grep '[^ -~]' is '[^<tab><space>-~]'.



                                            If you type it on a shell command line, type Ctrl+V before Tab.
                                            In an editor, there should be no problem.





• '[^<tab><space>-~]' will match any character which is not printable ASCII text (carriage returns are ignored by grep).


• -L will print only the names of the files that do not match


                                            • -Z will output filenames separated with a null character (for xargs -0)
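
If you would rather not type a literal Tab at the prompt, bash and zsh can produce it via $'...' quoting (as also noted in a comment below); a sketch of the same pipeline written that way:

# $'\t' expands to a literal tab character, so no Ctrl+V trick is needed.
find . -type f -print0 |
    xargs -0 -r grep -Z -L -U $'[^\t -~]' |
    xargs -0 -r flip -u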






answered Jan 6 '17 at 15:24, edited Jan 6 '17 at 19:49 by phk
– Vouze























• It's worth noting that with Perl-like regex grep -P (if available) \t is available. Alternatively, using ANSI-C quoting if the shell supports it: $'\t' (bash and zsh do).
                                              – phk
                                              Jan 6 '17 at 19:51

















                                            up vote
                                            1
                                            down vote













                                            Alternate solution:



                                            The dos2unix command will convert line endings from Windows CRLF to Unix LF, and automatically skip binary files. I apply it recursively using:



find . -type f -exec dos2unix {} \;





answered Sep 21 '17 at 20:08
– Spark





















• Since dos2unix can take multiple file names as arguments, it is much more efficient to do find . -type f -exec dos2unix {} +
                                              – Anthon
                                              Sep 21 '17 at 20:41















                                            up vote
                                            0
                                            down vote













sudo find / \( -type f -and -path '*/git/*' -iname 'README' \) -exec grep -liI '100644\|100755' {} \; -exec flip -u {} \;



i. \( -type f -and -path '*/git/*' -iname 'README' \): searches for regular files named README whose path contains a directory called git. This is useful if you know a specific folder and file name to search for.

ii. -exec runs a command on each file name generated by find.

iii. \; indicates the end of that command.

iv. {} stands for the file/folder name found by the preceding find search.

v. Multiple commands can be run subsequently by appending another -exec "command" \;, such as -exec flip -u {} \;.

vi. grep

1. -l lists the name of the matching file
2. -I searches only non-binary files
3. -i makes the match case-insensitive
4. '100644\|100755' searches for either 100644 or 100755 within the file found; if found, flip -u is then run on it. \| is the "or" operator for grep.


You can clone this test directory and try it out: https://github.com/alphaCTzo7G/stackexchange/tree/master/linux/findSolution204092017



A more detailed answer is here: https://github.com/alphaCTzo7G/stackexchange/blob/master/linux/findSolution204092017/README.md






answered Sep 4 '17 at 21:04
– alpha_989
























