I have a JSON file with about 1,200,000 records.
I want to read this file with PySpark like this:
spark.read.option("multiline","true").json('file.json')
But it raises this error:
AnalysisException: Unable to infer schema for JSON. It must be specified manually.
When I create a smaller JSON file from a subset of the records in the main file, the same code reads it without error.
I can read the full JSON file with pandas when I set the encoding to utf-8-sig:
pd.read_json("file.json", encoding = 'utf-8-sig')
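Since pandas needs utf-8-sig here, the file may start with a UTF-8 byte-order mark (BOM). That is an assumption, not something confirmed. A minimal sketch for checking it and writing a BOM-free copy, using only the Python standard library (the helper names has_utf8_bom and strip_utf8_bom and the output name file_nobom.json are hypothetical):

```python
import codecs


def has_utf8_bom(path):
    """Return True if the file starts with the UTF-8 byte-order mark."""
    with open(path, "rb") as f:
        return f.read(3) == codecs.BOM_UTF8


def strip_utf8_bom(src, dst, chunk_size=1 << 20):
    """Copy src to dst in chunks, dropping a leading UTF-8 BOM if present."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        head = fin.read(3)
        if head != codecs.BOM_UTF8:
            # No BOM: these bytes are real content, so keep them.
            fout.write(head)
        while True:
            chunk = fin.read(chunk_size)
            if not chunk:
                break
            fout.write(chunk)


# Example usage (file names are placeholders):
# if has_utf8_bom("file.json"):
#     strip_utf8_bom("file.json", "file_nobom.json")
```

If has_utf8_bom returns True, the same spark.read call could be tried on the cleaned copy. Whether the BOM is actually what triggers the AnalysisException is untested.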
How can I solve this problem?