I imported the test.csv file using the mongoimport.exe that ships with MongoDB (the test contents are as follows):
name,pass
test1,ztj"ile0
test2,"audreyhepburn"
test3,Xiaoya”””oge521
test4,""520xiangbin
Question:
After the import, a query with find({name:/^test/}) shows that every pass field is wrong (completely different from the original value in the CSV: displayed as null, or with only half of the text, etc.). How can MongoDB correctly insert text records that contain double quotation marks?
Neither inserting item by item nor batch import can insert records containing double quotation marks, even when they are escaped with "\". Any help from the experts would be much appreciated!
According to the CSV standard (RFC 4180):
    file = [header CRLF] record *(CRLF record) [CRLF]
    header = name *(COMMA name)
    record = field *(COMMA field)
    name = field
    field = (escaped / non-escaped)
    escaped = DQUOTE *(TEXTDATA / COMMA / CR / LF / 2DQUOTE) DQUOTE
    non-escaped = *TEXTDATA
    COMMA = %x2C
    CR = %x0D
    DQUOTE = %x22
    LF = %x0A
    CRLF = CR LF
    TEXTDATA = %x20-21 / %x23-2B / %x2D-7E
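In your example, test1 and test4 are illegal. In test1 a double quotation mark appears inside a non-escaped field, and in test4 the field starts with a double quotation mark, so it is read as an escaped field that closes immediately and is then followed by stray text. Although I have not confirmed that MongoDB strictly follows RFC 4180 when parsing CSV, your file format is definitely a big problem. I therefore recommend normalizing your CSV file with a tool before importing it into the database. I don't know how much data you have, but this is only simple text processing, so the time it takes should be acceptable.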
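To see why those rows are rejected, the grammar can be turned into a quick check. This is only a minimal sketch in Python: the helper name is_valid_field is made up for illustration, and it tests a single field against the `field` rule rather than parsing a whole file.

```python
import re

# TEXTDATA = %x20-21 / %x23-2B / %x2D-7E
# (printable ASCII except the double quote and the comma)
TEXTDATA = r"[\x20-\x21\x23-\x2B\x2D-\x7E]"

# non-escaped = *TEXTDATA
NON_ESCAPED = re.compile(rf"{TEXTDATA}*")

# escaped = DQUOTE *(TEXTDATA / COMMA / CR / LF / 2DQUOTE) DQUOTE
ESCAPED = re.compile(rf'"(?:{TEXTDATA}|,|\r|\n|"")*"')


def is_valid_field(field: str) -> bool:
    """Return True if `field` matches the RFC 4180 `field` rule.

    Hypothetical helper for illustration only.
    """
    return bool(ESCAPED.fullmatch(field) or NON_ESCAPED.fullmatch(field))


# pass values taken from the test1, test2 and test4 rows of the question
samples = {
    "test1": 'ztj"ile0',
    "test2": '"audreyhepburn"',
    "test4": '""520xiangbin',
}
for name, value in samples.items():
    print(name, is_valid_field(value))
# test1 False
# test2 True
# test4 False
```

Only test2 passes, because its value is wrapped in one matching pair of double quotation marks with no unescaped quotation mark inside.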
The following is one approach; it is not perfect, but it should be applicable to most situations:
    fs = require 'fs'

    # Read test.csv and split it into lines
    file = fs.readFileSync('test.csv', 'utf8').split(/\r?\n/)

    # Print the header line unchanged
    console.log file[0]

    # For each line except the first line:
    for line in file[1..]
      # Use the part before the first comma as name and the part after it as pass
      [_, name, pass] = line.match(/^([^,]+),(.*)/) ? []
      # If name and pass exist
      if name and pass
        # If pass does not begin and end with double quotation marks (ignoring
        # leading and trailing spaces), or if there is a lone double quotation
        # mark in the middle of pass, escape it again
        unless pass.trim().match(/^".*"$/) and not pass.match(/[^"]"[^"]/)
          # Double every double quotation mark
          pass = pass.replace(/"/g, '""')
          # Add a double quotation mark before and after
          pass = '"' + pass + '"'
        console.log [name, pass].join ','
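For anyone who would rather do this in Python, the same idea can be sketched roughly as below. The function name normalize_line is made up for illustration. The sketch assumes each record has exactly two columns and sits on a single line, and it uses the standard csv module only to confirm that the repaired lines parse back to the original values.

```python
import csv
import io
import re


def normalize_line(line: str) -> str:
    """Quote the second column of a two-column CSV line if it needs it.

    Hypothetical helper for illustration only.
    """
    # Split at the first comma: the left part is name, the right part is pass
    name, sep, value = line.partition(",")
    if not sep or not name or not value:
        return line  # nothing to fix; leave the line unchanged
    stripped = value.strip()
    # Already wrapped in double quotation marks?
    well_quoted = (
        len(stripped) >= 2 and stripped.startswith('"') and stripped.endswith('"')
    )
    # A lone double quotation mark somewhere in the middle?
    lone_quote = re.search(r'[^"]"[^"]', value) is not None
    if not well_quoted or lone_quote:
        # Double every quotation mark, then wrap the whole value
        value = '"' + value.replace('"', '""') + '"'
    return f"{name},{value}"


raw = [
    'test1,ztj"ile0',
    'test2,"audreyhepburn"',
    'test4,""520xiangbin',
]
fixed = [normalize_line(line) for line in raw]
for line in fixed:
    print(line)
# test1,"ztj""ile0"
# test2,"audreyhepburn"
# test4,"""""520xiangbin"

# Parse the repaired lines back to check the values survive
rows = list(csv.reader(io.StringIO("\n".join(fixed))))
print(rows)
# [['test1', 'ztj"ile0'], ['test2', 'audreyhepburn'], ['test4', '""520xiangbin']]
```

Note that a value already wrapped in one pair of quotation marks, like test2, is treated as CSV quoting, so it parses back without the quotation marks. If those are meant to be part of the data, the value would have to be escaped unconditionally instead.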