string - Apex: parse a CSV that contains double quotes in every single record
public static List<List<String>> parseCSV(String contents, Boolean skipHeaders) {
    List<List<String>> allFields = new List<List<String>>();

    // Replace instances where a double quote begins a field containing a comma.
    // In this case you get a double quote followed by a doubled double quote.
    // Do this for the beginning and end of a field.
    contents = contents.replaceAll(',"""', ',"DBLQT').replaceAll('""",', 'DBLQT",');
    // Now replace all remaining double quotes, so that we can reconstruct
    // fields with commas inside, assuming they begin and end with a double quote.
    contents = contents.replaceAll('""', 'DBLQT');
    // We are not attempting to handle fields with a newline inside of them,
    // so split on newline to get the spreadsheet rows.
    List<String> lines = new List<String>();
    try {
        lines = contents.split('\n');
    } catch (System.ListException e) {
        System.debug('Limits exceeded?' + e.getMessage());
    }
    Integer num = 0;
    for (String line : lines) {
        // Check for blank CSV lines (only commas)
        if (line.replaceAll(',', '').trim().length() == 0) break;

        List<String> fields = line.split(',');
        List<String> cleanFields = new List<String>();
        String compositeField;
        Boolean makeCompositeField = false;
        for (String field : fields) {
            if (field.startsWith('"') && field.endsWith('"')) {
                cleanFields.add(field.replaceAll('DBLQT', '"'));
            } else if (field.startsWith('"')) {
                makeCompositeField = true;
                compositeField = field;
            } else if (field.endsWith('"')) {
                compositeField += ',' + field;
                cleanFields.add(compositeField.replaceAll('DBLQT', '"'));
                makeCompositeField = false;
            } else if (makeCompositeField) {
                compositeField += ',' + field;
            } else {
                cleanFields.add(field.replaceAll('DBLQT', '"'));
            }
        }
        allFields.add(cleanFields);
    }
    if (skipHeaders) allFields.remove(0);
    return allFields;
}
I use this code to parse a CSV file, but I found out that it can't parse the file when every field in the CSV is bounded by double quotes.
For example, I have records like these: "a","b","c","d,e,f","g"
After parsing, I want to get these: a b c d,e,f g
From what I've seen, the first thing you do is split each line of the CSV file by its commas, using this line:
List<String> fields = line.split(',');
When you do this with your own example ("a","b","c","d,e,f","g"), what you get as a list of strings is:
fields = ("a" | "b" | "c" | "d | e | f" | "g"), where the bar is used to separate the list elements
The issue here is that, if you first split by commas, it becomes a little more difficult to differentiate the commas that are part of a field (because they appeared inside quotes) from those that separate fields in your CSV.
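To make the problem concrete, here is a minimal Java sketch of the comma-first split applied to the example line. The class and method names (CommaSplitDemo, splitOnCommas) are made up for illustration and are not part of the original code:

```java
import java.util.Arrays;
import java.util.List;

public class CommaSplitDemo {
    // Splits one CSV line on every comma, without looking at quotes.
    // This is the approach that breaks quoted fields apart.
    static List<String> splitOnCommas(String line) {
        return Arrays.asList(line.split(","));
    }

    public static void main(String[] args) {
        String line = "\"a\",\"b\",\"c\",\"d,e,f\",\"g\"";
        List<String> pieces = splitOnCommas(line);
        // The line has 5 fields, but the split yields 7 pieces,
        // because "d,e,f" is cut at its inner commas.
        System.out.println(pieces.size()); // prints 7
        for (String piece : pieces) {
            System.out.println("[" + piece + "]");
        }
    }
}
```

Note that every piece also still carries its surrounding quote characters, so they would have to be stripped in a later step.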
I suggest trying to split the line by quotes instead, which would give you something like this:
fields = (a | , | b | , | c | , | d, e, f | , | g)
Then filter out the elements of the list that are only commas and/or spaces (plus the empty string that the split produces before the first quote), achieving this:
fields = (a | b | c | d, e, f | g)
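The two steps above could be sketched in Java roughly as follows. This assumes every field on the line is wrapped in double quotes; the names QuoteSplitDemo and splitOnQuotes are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;

public class QuoteSplitDemo {
    // Splits a fully quoted CSV line on the quote character, then drops
    // every piece that consists only of commas and/or whitespace
    // (this also drops the empty string before the first quote).
    static List<String> splitOnQuotes(String line) {
        List<String> fields = new ArrayList<>();
        for (String piece : line.split("\"")) {
            if (piece.replace(",", "").trim().isEmpty()) {
                continue; // separator between two quoted fields
            }
            fields.add(piece);
        }
        return fields;
    }

    public static void main(String[] args) {
        String line = "\"a\",\"b\",\"c\",\"d,e,f\",\"g\"";
        System.out.println(splitOnQuotes(line)); // prints [a, b, c, d,e,f, g]
    }
}
```

One caveat with filtering by content: a legitimately empty field ("") or a field that contains only commas is dropped too. Skipping every other element by position avoids that.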
(edited)
Is it Java you're using? Anyway, here is Java code that does what you're trying to do:
import java.util.*;

public class HelloWorld {
    public static ArrayList<ArrayList<String>> parseCSV(String contents, boolean skipHeaders) {
        ArrayList<ArrayList<String>> allFields = new ArrayList<ArrayList<String>>();

        // Separating the file into lines. The result is copied into an ArrayList
        // because Arrays.asList returns a fixed-size list that cannot remove elements.
        List<String> lines = new ArrayList<String>(Arrays.asList(contents.split("\n")));

        // Ignoring the header, if needed
        if (skipHeaders) lines.remove(0);

        // For each line
        for (String line : lines) {
            List<String> fields = Arrays.asList(line.split("\""));
            ArrayList<String> cleanFields = new ArrayList<String>();
            boolean isComma = false;
            for (String field : fields) {
                // Ignore the elements that don't have useful data
                // (every other element after splitting by quotes,
                // starting with the empty string before the first quote)
                isComma = !isComma;
                if (isComma) continue;
                cleanFields.add(field);
            }
            allFields.add(cleanFields);
        }
        return allFields;
    }

    public static void main(String[] args) {
        // Example of an input file:
        // Line 1: "a","b","c","d,e,f","g"
        // Line 2: "a1","b1","c1","d1,e1,f1","g1"
        ArrayList<ArrayList<String>> strings = HelloWorld.parseCSV(
            "\"a\",\"b\",\"c\",\"d,e,f\",\"g\"\n\"a1\",\"b1\",\"c1\",\"d1,e1,f1\",\"g1\"", false);
        System.out.println("Result:");
        for (ArrayList<String> list : strings) {
            System.out.println("  New list:");
            for (String str : list) {
                System.out.println("    - " + str);
            }
        }
    }
}
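The code above assumes every field is quoted and that no field contains a quote character itself. If your file may also contain unquoted fields or doubled quotes ("" standing for a literal quote, which is what the DBLQT replacement in the question is working around), a character-by-character scan is one possible alternative. This is only a sketch under those assumptions, with hypothetical names (CsvLineParser, parseLine), and like the original it does not handle newlines inside fields:

```java
import java.util.ArrayList;
import java.util.List;

public class CsvLineParser {
    // Parses a single CSV line by walking over its characters and
    // tracking whether the current position is inside a quoted field.
    static List<String> parseLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        current.append('"'); // doubled quote: literal quote
                        i++;                 // skip the second quote
                    } else {
                        inQuotes = false;    // closing quote of the field
                    }
                } else {
                    current.append(c);       // commas are kept while quoted
                }
            } else if (c == '"') {
                inQuotes = true;             // opening quote of a field
            } else if (c == ',') {
                fields.add(current.toString()); // field separator
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString()); // last field on the line
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(parseLine("\"a\",\"b\",\"c\",\"d,e,f\",\"g\""));
        System.out.println(parseLine("\"say \"\"hi\"\"\",x"));
    }
}
```

Because it never splits the string up front, this version keeps empty quoted fields and does not need a placeholder token for doubled quotes.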