`error_bad_lines` no longer works; how do I read the data successfully? #57257
Labels: Closing Candidate, IO CSV, Needs Info, Usage Question
I am using pandas 2.2, where the `error_bad_lines` parameter has been removed. When I call `pd.read_csv()` on my file, I get this error:
Traceback (most recent call last):
  File "D:\work\email_reply\data_process.py", line 11, in <module>
    df = pd.read_csv('./data/data_0101.csv', on_bad_lines="warn")
  File "D:\miniconda3\envs\py310\lib\site-packages\pandas\io\parsers\readers.py", line 1024, in read_csv
    return _read(filepath_or_buffer, kwds)
  File "D:\miniconda3\envs\py310\lib\site-packages\pandas\io\parsers\readers.py", line 624, in _read
    return parser.read(nrows)
  File "D:\miniconda3\envs\py310\lib\site-packages\pandas\io\parsers\readers.py", line 1921, in read
    ) = self._engine.read(  # type: ignore[attr-defined]
  File "D:\miniconda3\envs\py310\lib\site-packages\pandas\io\parsers\c_parser_wrapper.py", line 234, in read
    chunks = self._reader.read_low_memory(nrows)
  File "parsers.pyx", line 838, in pandas._libs.parsers.TextReader.read_low_memory
  File "parsers.pyx", line 905, in pandas._libs.parsers.TextReader._read_rows
  File "parsers.pyx", line 874, in pandas._libs.parsers.TextReader._tokenize_rows
  File "parsers.pyx", line 891, in pandas._libs.parsers.TextReader._check_tokenize_status
  File "parsers.pyx", line 2061, in pandas._libs.parsers.raise_parser_error
pandas.errors.ParserError: Error tokenizing data. C error: Buffer overflow caught - possible malformed input file.
So how can I read this data successfully now? Is there a better way to deal with this?
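As a sketch of the usual workarounds (not a confirmed fix for this particular file, since the malformed input itself is not shown), the removed `error_bad_lines`/`warn_bad_lines` flags were replaced by the `on_bad_lines` parameter, and the "Buffer overflow caught" error comes from the C parsing engine, which the slower Python engine sometimes tolerates; the in-memory CSV below is an illustrative stand-in:

```python
import csv
import io

import pandas as pd

# Illustrative stand-in for the real file: the "3,4,5" row has an
# extra field and is therefore malformed.
bad_csv = "a,b\n1,2\n3,4,5\n6,7\n"

# on_bad_lines replaces error_bad_lines/warn_bad_lines (pandas >= 1.3):
# "skip" drops malformed rows silently, "warn" drops them with a warning.
df = pd.read_csv(io.StringIO(bad_csv), on_bad_lines="skip")

# The "Buffer overflow caught" error is raised by the C engine; the
# slower but more tolerant Python engine sometimes parses files the
# C engine rejects.
df_py = pd.read_csv(io.StringIO(bad_csv), engine="python", on_bad_lines="skip")

# Unbalanced quote characters are a common cause of this error;
# disabling quote handling can also help.
df_nq = pd.read_csv(io.StringIO(bad_csv), quoting=csv.QUOTE_NONE, on_bad_lines="skip")
```

With these options each call returns a DataFrame containing only the two well-formed data rows; for the real file, trying `engine="python"` first is a reasonable way to find out which line is actually malformed.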