Detecting line separators and line breaks in Python CSV























Here is how I'm detecting a field separator and line break in a CSV file for which I do not already know the format. Does this look sufficient, or are there other things I should be adding here?



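import csv

# Candidate field separators (control characters first, then common printable ones).
SEPARATORS = ['\x00', '\x01', '^', ':', ',', '\t', ';', '|', '~', ' ']
# Checked in order, so the more specific terminators come first.
LINE_TERMINATORS_IN_ORDER = ['\x02\n', '\r\n', '\n', '\r']

def get_csv_info(f):
    # newline='' turns off newline translation, so '\r\n' and '\r' stay visible.
    with open(f, 'r', newline='') as csvfile:
        line = next(csvfile)
    dialect = csv.Sniffer().sniff(line, SEPARATORS)
    terminator = None  # stays None if no known terminator matches
    for _terminator in LINE_TERMINATORS_IN_ORDER:
        if line.endswith(_terminator):
            terminator = _terminator
            break
    return (dialect.delimiter, terminator)

get_csv_info('/Users/david/Desktop/validate_headers/artist')
# ('\x01', '\x02\n')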
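
The terminator test depends on the line still carrying its original ending, which text mode does not guarantee: unless `newline=''` is passed to `open`, Python 3 translates `\r\n` and `\r` to `\n` on read. One way to sidestep that is to run the same check on raw bytes. The following is only a sketch under a few assumptions: the helper names `terminator_of` and `detect_terminator` and the 64 KiB chunk size are made up for illustration, and a `\r\n` pair split across the chunk boundary is not handled.

```python
# Same candidates as in the question, as bytes, most specific first.
LINE_TERMINATORS_IN_ORDER = [b'\x02\n', b'\r\n', b'\n', b'\r']

def terminator_of(chunk):
    """Return the candidate terminator that ends the first line of chunk, or None."""
    # Locate the first CR or LF; that is where the first line ends.
    hits = [i for i in (chunk.find(b'\r'), chunk.find(b'\n')) if i != -1]
    if not hits:
        return None
    first = min(hits)
    # Keep the LF too when the break is a CR LF pair.
    end = first + 2 if chunk[first:first + 2] == b'\r\n' else first + 1
    line = chunk[:end]
    for terminator in LINE_TERMINATORS_IN_ORDER:
        if line.endswith(terminator):
            return terminator
    return None

def detect_terminator(path, chunk_size=64 * 1024):
    """Read the head of the file in binary mode (no newline translation)."""
    with open(path, 'rb') as fh:
        return terminator_of(fh.read(chunk_size))
```

Working on bytes also means the check does not depend on guessing the file's text encoding first.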


Note: calling csv.Sniffer().sniff() without the delimiters argument fails on most of the files I have tried it on.
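
On the Sniffer failures: `sniff` raises `csv.Error` when it cannot settle on a delimiter, and its heuristics look for a character that occurs consistently across lines, so a sample of several lines may give it more to work with than a single line. The `delimiters` argument is documented as a string of candidate characters. A minimal sketch along those lines, where `sniff_delimiter`, `sniff_file`, `CANDIDATE_DELIMITERS`, and the 4096-character sample size are hypothetical names and values chosen for illustration:

```python
import csv

# The question's candidate separators, joined into one string.
CANDIDATE_DELIMITERS = '\x00\x01^:,\t;|~ '

def sniff_delimiter(sample, candidates=CANDIDATE_DELIMITERS):
    """Return the sniffed delimiter, or None when the Sniffer cannot decide."""
    try:
        return csv.Sniffer().sniff(sample, delimiters=candidates).delimiter
    except csv.Error:
        return None

def sniff_file(path, sample_size=4096):
    """Sniff the delimiter from the first sample_size characters of a file."""
    # newline='' keeps the original line endings in the sample.
    with open(path, 'r', newline='') as fh:
        return sniff_delimiter(fh.read(sample_size))
```

Returning `None` instead of letting the exception propagate is a design choice for this sketch; a caller that needs to know why detection failed would want the `csv.Error` itself.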


































  • Your code is broken. Please fix it :)
    – яүυк
    4 hours ago










  • This test yields the error: File "/home/main.py", line 4 def get_csv_info(file) SyntaxError: invalid syntax
    – Sᴀᴍ Onᴇᴌᴀ
    4 hours ago












  • @SᴀᴍOnᴇᴌᴀ fixed.
    – David542
    42 mins ago










  • @яүυк fixed now.
    – David542
    42 mins ago
















python python-3.x csv














edited 18 mins ago by Jamal

asked 4 hours ago by David542














































































