Jsan — Sanitizing JSON

Purpose of Jsan

This library function takes pseudo-JSON or nearly-JSON and makes it legal. It was originally created to deal with configuration files that were edited by hand and might contain hard-to-find syntax errors, missing quotes or odd characters. The trouble with libraries that read JSON is that they tend to report something like SYNTAX ERROR without giving details such as the line number. This library does not parse JSON; all it does is put in missing commas and quotes. As an additional feature, it allows double-slash comments to appear in the original text.

// An example of totally illegal JSON   | Comments will be removed
users : {                               | First symbol should be [ or {
fred : {                                | All labels have missing double quotes
uname : "fred"                          | We seem to have forgotten our commas
pword = " loid"; // note tab            | Danger! Programmer at work
'george' : {                            | Illegal singly quoted label
pword : "Méliès",                       | Illegal characters
uname : "George",                       | Illegal comma
tens : [ 10,1,.1,.01]                   | Naked decimals not allowed

Documentation and libraries

PHP version

Browser javascript version

node.js javascript version

The node.js demo needs a server running on //localhost:3000/.

Option flags

Good practice: The best policy is to do the minimum of hacking, in case the replacements introduce odd artifacts into the data.

It is perfectly easy to construct data that will fool the hacking. It's all text replacements, no parsing.

  • Tidy up commas. Add, remove as required.
  • Add double quotes to labels
  • Change singly-quoted labels to double-quoted
  • Convert to utf8
  • Return legal JSON string
  • Allow = as substitute for :
  • Allow ; at end of line; if found, it is replaced by ,
  • Prevent converting naked decimals
  • Return object instead of JSON. (PHP)
  • Allow single line of compacted JSON
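
To illustrate the text-replacement approach, here is a minimal sketch of two of the steps listed above (removing comments and quoting labels). The function names stripComments and quoteLabels are hypothetical and the regular expressions are assumptions made for this example; they are not taken from JSONsan.js. The last call shows how easily this kind of hacking is fooled.

```javascript
// Hypothetical helpers for illustration; they are not taken from JSONsan.js.

// Drop a double-slash comment that starts a line or follows whitespace.
// Requiring whitespace before // leaves text such as http://host alone.
function stripComments(text) {
  return text
    .split(/\r?\n/)
    .map((line) => line.replace(/(^|\s)\/\/.*$/, ""))
    .join("\n");
}

// Put double quotes around bare or singly quoted labels at the start of a line.
function quoteLabels(text) {
  return text.replace(/^(\s*)'?([A-Za-z_]\w*)'?(\s*:)/gm, '$1"$2"$3');
}

const source = [
  "// hand-edited settings",
  "fred : {",
  "'george' : {",
].join("\n");

// The comment line is emptied and both labels end up double quoted.
console.log(quoteLabels(stripComments(source)));

// Fooled: the // inside the string value is treated as a comment.
console.log(stripComments('note : "a // b"'));
```

Because nothing is parsed, a replacement cannot tell a label or comment from text that merely looks like one, which is the reason for keeping the hacking to a minimum.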

SZJ_EQUALS: Allow = to stand in for :, e.g. "label"="value" becomes valid.
SZJ_SEMICOLONEOL: Disregard a semicolon at end of line. It's an easy mistake to make.
SZJ_NOTNAKEDDECIMALS: Switch off adding a 0 before decimals. Normally .1 becomes 0.1.
SZJ_OBJECT: Return the object. Normally returns the JSON text.
SZJ_ONELINE: Allow condensed JSON. Normally we expect the source text to be multiple lines in human-readable form. Fewer than two lines is a probable indication of misunderstood EOLs, so it raises an exception. When this option is used, many of the hacking features are not available. Basically this option is a quick get-you-by and is not recommended.
SZJ_DEBUG: Collects debugging data. The optional third parameter is a file to dump to.
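
As a rough sketch of what three of these options do to the text, the function below applies the equals, semicolon and naked-decimal replacements. The function applyOptions and its option names (equals, semicolonEol, notNakedDecimals) are hypothetical stand-ins that mirror the SZJ_ flags. The library's real flag values, and how they are passed or combined, are not shown here, so this is an assumption-laden illustration rather than the library's code.

```javascript
// Hypothetical option names that mirror the SZJ_ flags; the library's real
// flag values and the way they are combined are not shown in this document.
function applyOptions(text, options = {}) {
  let out = text;
  if (options.equals) {
    // label = value  ->  label : value (label may already be double quoted)
    out = out.replace(/^(\s*"?[A-Za-z_]\w*"?\s*)=/gm, "$1:");
  }
  if (options.semicolonEol) {
    // A ; at end of line is replaced by ,
    out = out.replace(/;[ \t]*$/gm, ",");
  }
  if (!options.notNakedDecimals) {
    // .1 -> 0.1 when the point follows a separator, bracket or minus sign
    out = out.replace(
      /(^|[\s\[,:=-])\.(\d)/g,
      (match, before, digit) => before + "0." + digit
    );
  }
  return out;
}

const sample = 'pword = "x";\ntens : [ 10,1,.1,.01]';
console.log(applyOptions(sample, { equals: true, semicolonEol: true }));
```

Setting notNakedDecimals to true in this sketch leaves .1 and .01 exactly as written, matching the description of SZJ_NOTNAKEDDECIMALS.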

Using the code

SanitizeJsonFile
  • PHP: (filename[,options][,dump_file])
  • Browser javascript: Not available
  • node.js: (filename[,options])

Exceptions?
  • PHP: Yes. Needs catching.
  • Javascript: No. Test the returned object's .ok and .errStr properties.

Returns
  • PHP: String or PHP object, depending on SZJ_OBJECT.
  • Javascript: Object with properties .ok (boolean), .errStr (string), .json (string), .obj (javascript object) and .dump (Object).

Debugging data
  • PHP: Printed to screen, or to the html file specified as 3rd arg.
  • Javascript: The .dump object returned as part of the returned object contains .flags, then the four stages of processing: .rawText, .strippedText, .quotedText and .finalText.

Files
  • PHP: JSONsan.php
  • Browser javascript: JSONsan.js, utf8_encode.js
  • node.js: JSONsanNode.js, utf8_encodeNode.js

Calling
  • Use this page and associated files as examples: index.htm, index.js, demo.php, jsanServer.js
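
Since the javascript versions report failure through the returned object rather than by throwing, calling code has to test .ok itself. The sketch below shows one way to do that. The helpers unwrapResult and dumpStages are hypothetical, and the two result objects are hand-written stand-ins that only copy the property names documented above; no real call into the library is made here.

```javascript
// Hypothetical helper: turn a result object of the documented shape
// (.ok, .errStr, .json, .obj, .dump) into a value or an exception.
function unwrapResult(result) {
  if (!result.ok) {
    throw new Error("JSON sanitising failed: " + result.errStr);
  }
  return result.obj;
}

// Hypothetical helper: list the four processing stages held in .dump, in order.
function dumpStages(result) {
  const names = ["rawText", "strippedText", "quotedText", "finalText"];
  return names.map((name) => [name, result.dump[name]]);
}

// Stand-in results written by hand for illustration only.
const good = {
  ok: true,
  errStr: "",
  json: '{"a":1}',
  obj: { a: 1 },
  dump: { flags: 0, rawText: "a:1", strippedText: "a:1", quotedText: '"a":1', finalText: '{"a":1}' },
};
const bad = { ok: false, errStr: "example failure", json: "", obj: null, dump: {} };

console.log(unwrapResult(good).a);
try {
  unwrapResult(bad);
} catch (e) {
  console.log(e.message);
}
```

Wrapping the check in one place gives javascript callers the same catch-based flow that the PHP version requires.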