Hey folks.

I currently have a large JSON file made up of entries similar to the below. The file contains information relating to photographs:-

  "SourceFile": "C:/Users/Me/Desktop/Photos/5294030805_9497032cd7_o.jpg",
  "ExifToolVersion": 8.63,
  "FileName": "5294030805_9497032cd7_o.jpg",
  "Directory": "C:/Users/Me/Desktop/Photos",
  "FileSize": "1094 kB",
  "FileModifyDate": "2011:09:09 22:40:39+01:00",
  "FilePermissions": "rw-rw-rw-",
  "FileType": "JPEG",
  "MIMEType": "image/jpeg",
  "JFIFVersion": 1.01,
  "ExifByteOrder": "Big-endian (Motorola, MM)",
  "Make": "Apple",
  "Model": "iPhone 4",
  "Orientation": "Horizontal (normal)",
  "XResolution": 72,
  "YResolution": 72,
  "ResolutionUnit": "inches",
  "Software": "Camera+ 2.0",
  "ModifyDate": "2010:12:26 22:27:09",
  "HostComputer": "iPhone (iPhone OS 4.2.1)",
  "YCbCrPositioning": "Centered",
  "ExposureTime": "1/15",
  "FNumber": 2.8,
  "ExposureProgram": "Program AE",
  "ISO": 1000,
  "ExifVersion": "0221",
  "DateTimeOriginal": "2010:12:26 22:27:09",
  "CreateDate": "2010:12:26 22:27:09",
  "ComponentsConfiguration": "Y, Cb, Cr, -",
  "ShutterSpeedValue": "1/15",
  "ApertureValue": 2.8,
  "MeteringMode": "Spot",
  "Flash": "Off, Did not fire",
  "FocalLength": "3.9 mm",
  "SubjectArea": "1295 967 933 696",
  "FlashpixVersion": "0100",
  "ColorSpace": "sRGB",
  "ExifImageWidth": 2592,
  "ExifImageHeight": 1936,
  "SensingMethod": "One-chip color area",
  "ExposureMode": "Auto",
  "WhiteBalance": "Auto",
  "SceneCaptureType": "Standard",
  "Sharpness": "Soft",
  "ImageWidth": 2592,
  "ImageHeight": 1936,
  "EncodingProcess": "Baseline DCT, Huffman coding",
  "BitsPerSample": 8,
  "ColorComponents": 3,
  "YCbCrSubSampling": "YCbCr4:2:0 (2 2)",
  "Aperture": 2.8,
  "ImageSize": "2592x1936",
  "ShutterSpeed": "1/15",
  "FocalLength35efl": "3.9 mm",
  "LightValue": 3.6
  "SourceFile": "C:/Users/Me/Desktop/Photos/BU09-C03-07-09.JPG",
  "ExifToolVersion": 8.63,
  "FileName": "BU09-C03-07-09.JPG",
  "Directory": "C:/Users/Me/Desktop/Photos",
  "FileSize": "88 kB",
  "FileModifyDate": "2011:09:09 22:39:08+01:00",
  "FilePermissions": "rw-rw-rw-",
  "FileType": "JPEG",
  "MIMEType": "image/jpeg",
  "JFIFVersion": 1.01,
  "ResolutionUnit": "None",
  "XResolution": 1,
  "YResolution": 1,
  "Comment": "CREATOR: gd-jpeg v1.0 (using IJG JPEG v62), quality = 95\n",
  "ImageWidth": 600,
  "ImageHeight": 400,
  "EncodingProcess": "Baseline DCT, Huffman coding",
  "BitsPerSample": 8,
  "ColorComponents": 3,
  "YCbCrSubSampling": "YCbCr4:2:0 (2 2)",
  "ImageSize": "600x400"

As you can see, the first entry contains a lot of information whereas the second entry contains less information (no entry for "flash" for example).

What I am looking to do is, through C# (or another programming language should you see it more suitable), scan through the JSON file and then populate a MySQL table with the unique column headers. So, for example, the first entry in the file has c.53 unique headings such as filename, filesize, make, model etc.... I'd like the script to put these as column headers and then move on to image 2. If new headers are found, these should be appended and if any duplicates encountered, ignores. Once the JSON file has been completely sifted through for unique entry types, I'd then like the code to then commence populating the respective values into the columns so for example, for image 1, "Apple" would be put into the "Make" field and "iPhone" into the "Model" field etc....

Any suggestions how to do this programatically? Is there a more efficient way of doing this than I suggest above?


You'll have to make two passes over the JSON file, the first to get all the columns and types, second to populate them.

In the first pass I'd create a dictionary with the key being the column name, and the value being the full type. What I mean by full is, for example, you have a string value I'd store 'varchar(50)'. As you pass through the file any time you encounter that key again I'd make sure that my size was enough to store whatever you found.

Then you'd construct the create table SQL command from the items in the dictionary.

Then, make the second pass, populating the columns with the data.

It appears that you only have 4 data types to deal with: int, double, DateTime and String. Create a routine that figures out which you have (string/DateTime always start with ", int/double do not).

Be a part of the DaniWeb community

We're a friendly, industry-focused community of developers, IT pros, digital marketers, and technology enthusiasts meeting, learning, and sharing knowledge.