Using clamav efficiently when timeshift snapshots are present


























I am looking for a simple way to perform a full system scan using ClamAV on a machine that also has Timeshift-based snapshots enabled.



As suggested by this answer on the Ubuntu site, I was using a command like:



clamscan -r --bell -i --exclude-dir="^/sys" /


(Note: the --exclude-dir="^/sys" parameter was suggested to me by another user, who pointed out that /sys is a virtual directory and is probably best excluded from scans to avoid possible read-access errors.)



The command works as expected: "check all files on the computer, but only display infected files and ring a bell when found".



This has an evident problem: "all files on the computer" includes the /timeshift directory, which is where Timeshift stores its snapshot data.



Now, quoting the official Timeshift page:




In RSYNC mode, snapshots are taken using rsync and hard-links. Common files are shared between snapshots which saves disk space. Each snapshot is a full system backup that can be browsed with a file manager.




To put it simply: Timeshift copies the changed files and uses hard links to reference the unchanged ones. As far as I understand, this means that the first snapshot is probably a full copy of the filesystem (excluding any file or path that Timeshift is configured to ignore), while each following snapshot contains new copies of only the changed files, plus hard links to the unchanged ones.
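A minimal sketch of what that hard-linking means at the filesystem level, assuming GNU coreutils (for `stat -c`). The helper name `inode_and_links` and the file names are made up for illustration; this is not how Timeshift itself is implemented, only a demonstration of the mechanism it relies on.

```shell
#!/bin/sh
# Print "<inode> <link count>" for a path (GNU stat: %i = inode, %h = links).
inode_and_links() {
    stat -c '%i %h' "$1"
}

demo_dir=$(mktemp -d)                      # scratch directory
printf 'hello\n' > "$demo_dir/snapshot1.txt"
# Add a second name for the same data, as a later snapshot would.
ln "$demo_dir/snapshot1.txt" "$demo_dir/snapshot2.txt"

inode_and_links "$demo_dir/snapshot1.txt"
inode_and_links "$demo_dir/snapshot2.txt"  # same inode, link count 2
rm -rf "$demo_dir"
```

Both calls should print the same inode number with a link count of 2: the two names refer to one file on disk, and neither name is "the original".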



The problem: with default settings, clamscan will also scan every file in the /timeshift directory. I am fine with scanning the changed files, which are actual real files, but scanning the links seems a waste: it means scanning the same file multiple times, once for the snapshot in which the file first changed and then once more for each link to the unchanged file in the following snapshots.



I am therefore looking for a simple way to exclude those hard links from the scan. man clamscan lists a --follow-file-symlinks option, but even running



clamscan -r --bell -i --exclude-dir="^/sys" --follow-file-symlinks=0 /


doesn't seem to help. After all, as far as I understand, that option only affects symlinks, while Timeshift uses hard links.
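The difference between the two kinds of link can be seen with `find`. This is a small illustrative sketch; the function name `count_link_kinds` and the file names are hypothetical.

```shell
#!/bin/sh
# Count entries under a directory three ways, to show what find can and
# cannot tell apart. Prints: symlinks=N regular=N multi_named=N
count_link_kinds() {
    symlinks=$(find "$1" -type l | wc -l | tr -d ' ')
    regular=$(find "$1" -type f | wc -l | tr -d ' ')
    multi=$(find "$1" -type f -links +1 | wc -l | tr -d ' ')
    echo "symlinks=$symlinks regular=$regular multi_named=$multi"
}

demo_dir=$(mktemp -d)
printf 'data\n' > "$demo_dir/file"
ln    "$demo_dir/file" "$demo_dir/hard"   # second name for the same inode
ln -s "$demo_dir/file" "$demo_dir/soft"   # separate object that stores a path

count_link_kinds "$demo_dir"   # prints: symlinks=1 regular=2 multi_named=2
rm -rf "$demo_dir"
```

The symlink is a distinct file type, so a tool can skip it. The hard link shows up as an ordinary regular file; the only hint is a link count above 1, and that applies equally to every name of the file.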



So, my question is: is there any way to perform a full system scan that skips the hard-linked files in the /timeshift directory while still scanning the real ones?



(As a bonus side question: would the same be possible using the ClamTk UI too?)










      linux-mint hard-link clamav














      edited Dec 20 '18 at 6:50









      Rui F Ribeiro











      asked Oct 31 '18 at 14:29









      SPArchaeologist























          1 Answer














          Hard links are indistinguishable from what you call "real files": a hard-linked file is actually the same file appearing in two directories. The best solution in your case is simply to add one more --exclude-dir="^/timeshift" parameter to the clamscan command.
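As a sketch of that suggestion, the helper below only prints the resulting command line so it can be checked before running it. The name `build_scan_command` is hypothetical, and the /timeshift path is taken from the question; adjust it if snapshots are stored elsewhere. Directory names are inserted into the pattern as-is, so names containing regex metacharacters would need escaping.

```shell
#!/bin/sh
# Print (not run) a clamscan command line that excludes each directory
# given as an argument. Hypothetical helper, for illustration only.
build_scan_command() {
    cmd='clamscan -r --bell -i'
    for dir in "$@"; do
        # Anchor each pattern at the start of the path, as in the question.
        cmd="$cmd --exclude-dir=\"^$dir\""
    done
    printf '%s /\n' "$cmd"
}

build_scan_command /sys /timeshift
```

The trade-off is that files existing only in a snapshot are never scanned, which may be acceptable if the live system is scanned regularly.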






          answered Dec 24 '18 at 21:25









          Lex-2008













          • OK, I have done some more research and now understand the hard-link concept a little better. I see your point. I guess that achieving this behavior would require rewriting the scan engine to take hard links into consideration, basically by building a sort of "already scanned inodes" table. I guess I will have to live with the exclusion for now.
            – SPArchaeologist
            Dec 27 '18 at 9:26
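An "already scanned inodes" table can be approximated outside the scan engine. The sketch below is one possible workaround, not a ClamAV feature: it assumes GNU find (for `-printf`), and the function name `unique_inode_paths` and the list file path are hypothetical. It lists one path per distinct device:inode pair, and the list can then be handed to clamscan through its `--file-list` option.

```shell
#!/bin/sh
# List one path per distinct file under the given directory.
# "Distinct" means a unique device:inode pair, so extra hard links to a
# file that was already listed are skipped. Requires GNU find (-printf).
# Paths containing newlines are not handled.
unique_inode_paths() {
    find "$1" -type f -printf '%D:%i\t%p\n' |
        awk -F '\t' '!seen[$1]++ { print substr($0, length($1) + 2) }'
}

# Possible usage (run as a user that can read the snapshots):
#   unique_inode_paths /timeshift > /tmp/timeshift-unique.txt
#   clamscan -i --bell --file-list=/tmp/timeshift-unique.txt
#   clamscan -r --bell -i --exclude-dir="^/sys" --exclude-dir="^/timeshift" /
```

A detection is reported only for the first path found for an inode; every other snapshot path sharing that inode holds the same content.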

















